WorldWideScience

Sample records for medline references sequence

  1. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    Full Text Available The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects, because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between an input sequence and a reference sequence, have gained a lot of interest in this field. Highly similar sequences, e.g., human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for referential compression against multiple references: the factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (a few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
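    The idea of factorizing a sequence against several references can be illustrated with a small sketch. The code below is not the authors' method: it is a naive greedy heuristic, and the function names, the tuple layout of a factor, and the `min_len` threshold are all assumptions made for illustration. It only shows why an extra reference can shrink the factorization.

```python
def longest_match(seq, i, ref):
    """Return (start, length) of the longest prefix of seq[i:] that occurs in ref."""
    start, length = -1, 0
    while i + length < len(seq):
        pos = ref.find(seq[i:i + length + 1])
        if pos == -1:
            break
        start, length = pos, length + 1
    return start, length


def factorize(seq, refs, min_len=4):
    """Greedy factorization (illustrative only, not an optimal factorization).

    Each factor is either ("ref", ref_index, start, length), a copy from one
    of the references, or ("raw", char), a literal character.
    """
    factors = []
    i = 0
    while i < len(seq):
        best = (None, -1, 0)  # (ref_index, start, length)
        for r, ref in enumerate(refs):
            start, length = longest_match(seq, i, ref)
            if length > best[2]:
                best = (r, start, length)
        if best[2] >= min_len:
            factors.append(("ref", best[0], best[1], best[2]))
            i += best[2]
        else:
            # Matches shorter than min_len are stored as literal characters.
            factors.append(("raw", seq[i]))
            i += 1
    return factors


def reconstruct(factors, refs):
    """Rebuild the original sequence from its factors and the references."""
    parts = []
    for f in factors:
        if f[0] == "ref":
            _, r, start, length = f
            parts.append(refs[r][start:start + length])
        else:
            parts.append(f[1])
    return "".join(parts)


# Toy data, invented for this example.
refs = ["ACGTACGTTTGA", "GGGCCCAAATTT"]
seq = "ACGTACGTGGGCCCAX"
multi = factorize(seq, refs)
single = factorize(seq, refs[:1])
```

    On this toy input the two-reference factorization needs fewer factors than the one-reference factorization, and `reconstruct(multi, refs)` returns the input unchanged. A real implementation would use an index over the references rather than repeated `str.find` calls.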

  2. MedlinePlus FAQ: Can you tell me how to cite MedlinePlus pages?

    Science.gov (United States)

    ... MedlinePlus, the National Library of Medicine recommends the citation style below, based upon Citing Medicine . This style, like many other citation styles, requires that for online references you include the ...

  3. MedlinePlus FAQ: MedlinePlus and MEDLINE/PubMed

    Science.gov (United States)

    ... What is the difference between MedlinePlus and MEDLINE/PubMed? To use the sharing features on this page, ... latest health professional articles on your topic. MEDLINE/PubMed: Is a database of professional biomedical literature Is ...

  4. The Release 6 reference sequence of the Drosophila melanogaster genome.

    Science.gov (United States)

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. © 2015 Hoskins et al.; Published by Cold Spring Harbor Laboratory Press.

  5. The medline UK filter: development and validation of a geographic search filter to retrieve research about the UK from OVID medline.

    Science.gov (United States)

    Ayiku, Lynda; Levay, Paul; Hudson, Tom; Craven, Jenny; Barrett, Elizabeth; Finnegan, Amy; Adams, Rachel

    2017-07-13

    A validated geographic search filter for the retrieval of research about the United Kingdom (UK) from bibliographic databases had not previously been published. This study aimed to develop and validate a geographic search filter to retrieve research about the UK from OVID medline with high recall and precision. Three gold standard sets of references were generated using the relative recall method. The sets contained references to studies about the UK which had informed National Institute for Health and Care Excellence (NICE) guidance. The first and second sets were used to develop and refine the medline UK filter. The third set was used to validate the filter. Recall, precision and number-needed-to-read (NNR) were calculated using a case study. The validated medline UK filter demonstrated 87.6% relative recall against the third gold standard set. In the case study, the medline UK filter demonstrated 100% recall, 11.4% precision and an NNR of nine. A validated geographic search filter to retrieve research about the UK with high recall and precision has been developed. The medline UK filter can be applied to systematic literature searches in OVID medline for topics with a UK focus. © 2017 Crown copyright. Health Information and Libraries Journal © 2017 Health Libraries Group. This article is published with the permission of the Controller of HMSO and the Queen's Printer for Scotland.
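    The three measures named in this abstract are related by simple arithmetic, sketched below. The counts are hypothetical and chosen only to reproduce the reported 100% recall and 11.4% precision; the abstract does not give the underlying numbers. The sketch also assumes the common definition of NNR as the reciprocal of precision, rounded up when reported as a whole number of records.

```python
import math


def recall(relevant_retrieved, relevant_total):
    """Fraction of all relevant records that the filter retrieved."""
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved, retrieved_total):
    """Fraction of retrieved records that are relevant."""
    return relevant_retrieved / retrieved_total


def number_needed_to_read(prec):
    """Records a searcher must screen, on average, to find one relevant record."""
    return 1.0 / prec


# Hypothetical counts (not from the paper): 57 relevant records exist,
# the filter retrieves 500 records, and all 57 relevant ones are among them.
r = recall(57, 57)
p = precision(57, 500)
nnr = number_needed_to_read(p)
nnr_reported = math.ceil(nnr)
```

    With these assumed counts, precision is 0.114 and the NNR is about 8.8, which rounds up to the nine reported in the abstract.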

  6. Newborn Screening: MedlinePlus Health Topic

    Science.gov (United States)

    ... more articles Reference Desk Glossary (National Center for Biotechnology Information) Find an Expert Eunice Kennedy Shriver National ... other than English on Newborn Screening NIH MedlinePlus Magazine Hearing Loss: Screening Newborns Screening Newborns' Hearing Now ...

  7. Liver Transplantation: MedlinePlus Health Topic

    Science.gov (United States)

    ... Statistics and Research The SRTR/OPTN Annual Data Report (Scientific Registry of Transplant Recipients) Clinical Trials ClinicalTrials.gov: Liver Transplantation (National Institutes of Health) Journal Articles References and abstracts from MEDLINE/PubMed (National ...

  8. MedlinePlus FAQ: What's the difference between MedlinePlus and MedlinePlus Connect?

    Science.gov (United States)

    ... MedlinePlus Connect is a free service that allows electronic health record (EHR) systems to easily link users to MedlinePlus, ...

  9. Pain Relievers: MedlinePlus Health Topic

    Science.gov (United States)

    ... Health) ClinicalTrials.gov: Narcotics (National Institutes of Health) Journal Articles References and abstracts from MEDLINE/PubMed (National ... Concentration Before Giving Acetaminophen to Infants (Food and Drug Administration) ... Related Health Topics Chronic Pain Medicines Opioid Abuse and Addiction Over-the-Counter Medicines Pain Other ...

  10. A1C: MedlinePlus Health Topic

    Science.gov (United States)

    ... Numbers: Use Them to Manage Your Diabetes (National Diabetes Education Program) Also in Spanish Clinical Trials ClinicalTrials.gov: Hemoglobin A, Glycosylated (National Institutes of Health) Journal Articles References and abstracts from MEDLINE/PubMed ( ...

  11. Reference genome sequence of the model plant Setaria.

    Science.gov (United States)

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-05-13

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  12. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL]; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology]; Wang, Hao [University of Georgia, Athens, GA]; Percifield, Ryan [University of Georgia, Athens, GA]; Hawkins, Jennifer [University of Georgia, Athens, GA]; Pontaroli, Ana C. [University of Georgia, Athens, GA]; Estep, Matt [University of Georgia, Athens, GA]; Feng, Liang [University of Georgia, Athens, GA]; Vaughn, Justin N [ORNL]; Grimwood, Jane [Hudson Alpha Institute of Biotechnology]; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology]; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute]; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute]; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute]; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute]; Wang, Xuewen [University of Georgia, Athens, GA]; Wu, Xiaomei [University of Georgia, Athens, GA]; Mitros, Therese [University of California, Berkeley]; Triplett, Jimmy [University of Missouri, St. Louis]; Yang, Xiaohan [ORNL]; Ye, Chuyu [ORNL]; Mauro-Herrera, Margarita [Oklahoma State University]; Wang, Lin [Cornell University]; Li, Pinghua [Cornell University]; Sharma, Manoj [University of California, Davis]; Sharma, Rita [University of California, Davis]; Ronald, Pamela [University of California, Davis]; Panaud, Olivier [Universite de Perpignan, Perpignan, France]; Kellogg, Elizabeth A. [University of Missouri, St. Louis]; Brutnell, Thomas P. [Cornell University]; Doust, Andrew N. [Oklahoma State University]; Tuskan, Gerald A [ORNL]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Devos, Katrien M [ORNL]

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  13. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL]; Yang, Xiaohan [ORNL]; Ye, Chuyu [ORNL]; Tuskan, Gerald A [ORNL]

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  14. A novel algorithm for analyzing drug-drug interactions from MEDLINE literature.

    Science.gov (United States)

    Lu, Yin; Shen, Dan; Pietsch, Maxwell; Nagar, Chetan; Fadli, Zayd; Huang, Hong; Tu, Yi-Cheng; Cheng, Feng

    2015-11-27

    Drug-drug interaction (DDI) is becoming a serious clinical safety issue as the use of multiple medications becomes more common. Searching the MEDLINE database for journal articles related to DDI produces over 330,000 results. It is impossible to read and summarize these references manually. As the volume of biomedical reference in the MEDLINE database continues to expand at a rapid pace, automatic identification of DDIs from literature is becoming increasingly important. In this article, we present a random-sampling-based statistical algorithm to identify possible DDIs and the underlying mechanism from the substances field of MEDLINE records. The substances terms are essentially carriers of compound (including protein) information in a MEDLINE record. Four case studies on warfarin, ibuprofen, furosemide and sertraline implied that our method was able to rank possible DDIs with high accuracy (90.0% for warfarin, 83.3% for ibuprofen, 70.0% for furosemide and 100% for sertraline in the top 10% of a list of compounds ranked by p-value). A social network analysis of substance terms was also performed to construct networks between proteins and drug pairs to elucidate how the two drugs could interact.

  15. Diversity in non-repetitive human sequences not found in the reference genome.

    Science.gov (United States)

    Kehr, Birte; Helgadottir, Anna; Melsted, Pall; Jonsson, Hakon; Helgason, Hannes; Jonasdottir, Adalbjörg; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Gylfason, Arnaldur; Halldorsson, Gisli H; Kristmundsdottir, Snaedis; Thorgeirsson, Gudmundur; Olafsson, Isleifur; Holm, Hilma; Thorsteinsdottir, Unnur; Sulem, Patrick; Helgason, Agnar; Gudbjartsson, Daniel F; Halldorsson, Bjarni V; Stefansson, Kari

    2017-04-01

    Genomes usually contain some non-repetitive sequences that are missing from the reference genome and occur only in a population subset. Such non-repetitive, non-reference (NRNR) sequences have remained largely unexplored in terms of their characterization and downstream analyses. Here we describe 3,791 breakpoint-resolved NRNR sequence variants called using PopIns from whole-genome sequence data of 15,219 Icelanders. We found that over 95% of the 244 NRNR sequences that are 200 bp or longer are present in chimpanzees, indicating that they are ancestral. Furthermore, 149 variant loci are in linkage disequilibrium (r² > 0.8) with a genome-wide association study (GWAS) catalog marker, suggesting disease relevance. Additionally, we report an association (P = 3.8 × 10⁻⁸, odds ratio (OR) = 0.92) with myocardial infarction (23,360 cases, 300,771 controls) for a 766-bp NRNR sequence variant. Our results underline the importance of including variation of all complexity levels when searching for variants that associate with disease.

  16. Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis.

    Science.gov (United States)

    Guo, Yan; Dai, Yulin; Yu, Hui; Zhao, Shilin; Samuels, David C; Shyr, Yu

    2017-03-01

    Analysis of high throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has been augmented with improved accuracy and completeness. It is presumed that the latest release of the human reference genome, GRCh38, will contribute more to high throughput sequencing data analysis by providing more accuracy. But the amount of improvement has not yet been quantified. We conducted a study to compare the genomic analysis results between the GRCh38 reference and its predecessor GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertions/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 is an improvement over GRCh37 not only in terms of genome assembly; it also yields more reliable genomic analysis results. Copyright © 2017. Published by Elsevier Inc.

  17. A STUDY ON DETERMINING THE REFERENCE SPREADING SEQUENCES FOR A DS/CDMA COMMUNICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    Cebrail ÇİFTLİKLİ

    2002-02-01

    Full Text Available In a direct sequence/code division multiple access (DS/CDMA) system, the role of the spreading sequences (codes) is crucial since the multiple access interference (MAI) is the main performance limitation. In this study, we propose an accurate criterion which enables the determination of the reference spreading codes which yield lower bit error rates (BERs) in a given code set for a DS/CDMA system using despreading sequences weighted by stepping chip waveforms. The numerical results show that the spreading codes determined by the proposed criterion are the most suitable codes for use as references.

  18. Reference genome-independent assessment of mutation density using restriction enzyme-phased sequencing

    Directory of Open Access Journals (Sweden)

    Monson-Miller Jennifer

    2012-02-01

    Full Text Available Background: The availability of low cost sequencing has spurred its application to discovery and typing of variation, including variation induced by mutagenesis. Mutation discovery is challenging as it requires a substantial amount of sequencing and analysis to detect very rare changes and distinguish them from noise. Also challenging are the cases when the organism of interest has not been sequenced or is highly divergent from the reference. Results: We describe the development of a simple method for reduced representation sequencing. Input DNA was digested with a single restriction enzyme and ligated to Y adapters modified to contain a sequence barcode and to provide a compatible overhang for ligation. We demonstrated the efficiency of this method at SNP discovery using rice and arabidopsis. To test its suitability for the discovery of very rare SNPs, one control and three mutagenized rice individuals (1, 5 and 10 mM sodium azide) were used to prepare genomic libraries for Illumina sequencers by ligating barcoded adapters to NlaIII restriction sites. For genome-dependent discovery, 15-30 million 80-base reads per individual were aligned to the reference sequence, achieving individual sequencing coverage from 7 to 15×. We identified high-confidence base changes by comparing sequences across individuals and identified instances consistent with mutations, i.e. changes that were found in a single treated individual and were solely GC to AT transitions. For genome-independent discovery, 70-mers were extracted from the sequence of the control individual and single-copy sequence was identified by comparing the 70-mers across samples to evaluate copy number and variation. This de novo "genome" was used to align the reads and identify mutations as above. Covering approximately 1/5 of the 380 Mb genome of rice, we detected mutation densities ranging from 0.6 to 4 per Mb of diploid DNA depending on the mutagenic treatment. Conclusions: The

  19. Pacemakers and Implantable Defibrillators: MedlinePlus Health Topic

    Science.gov (United States)

    ... ClinicalTrials.gov: Pacemaker, Artificial (National Institutes of Health) Journal Articles References and abstracts from MEDLINE/PubMed (National ... Leadless Cardiac Pacemakers: The Next Evolution in Pacemaker Technology. ... on Pacemakers and Implantable Defibrillators is the National Heart, Lung, and Blood Institute Other Languages Find health information in languages other than English on Pacemakers and ...

  20. Locus Reference Genomic sequences: An improved basis for describing human DNA variants

    KAUST Repository

    Dalgleish, Raymond; Flicek, Paul; Cunningham, Fiona; Astashyn, Alex; Tully, Raymond E; Proctor, Glenn; Chen, Yuan; McLaren, William M; Larsson, Pontus; Vaughan, Brendan W; Béroud, Christophe; Dobson, Glen; Lehväslaiho, Heikki; Taschner, Peter EM; den Dunnen, Johan T; Devereau, Andrew; Birney, Ewan; Brookes, Anthony J; Maglott, Donna R

    2010-01-01

    As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs, which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI), along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site (http://www.lrg-sequence.org). 2010 Dalgleish et al.; licensee BioMed Central Ltd.

  1. Locus Reference Genomic sequences: An improved basis for describing human DNA variants

    KAUST Repository

    Dalgleish, Raymond

    2010-04-15

    As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs, which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI), along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site (http://www.lrg-sequence.org). 2010 Dalgleish et al.; licensee BioMed Central Ltd.

  2. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    Directory of Open Access Journals (Sweden)

    Martin Mascher

    Full Text Available The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity in the number of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new

  3. Linking to MedlinePlus

    Science.gov (United States)

    ... want to link patients or healthcare providers from electronic health record (EHR) systems to relevant MedlinePlus information, use MedlinePlus ...

  4. Articles about MedlinePlus

    Science.gov (United States)

    ... MedlinePlus → Articles about MedlinePlus URL of this page: https://medlineplus.gov/bibliography.html Articles about MedlinePlus To ... Dec 29]; 3(5):256-60. Available from: http://ecp.acponline.org/sepoct00/nlm.htm . Marill JL, ...

  5. Choice of reference sequence and assembler for alignment of Listeria monocytogenes short-read sequence data greatly influences rates of error in SNP analyses.

    Directory of Open Access Journals (Sweden)

    Arthur W Pightling

    Full Text Available The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: (i) depth of sequencing coverage, (ii) choice of reference-guided short-read sequence assembler, (iii) choice of reference genome, and (iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. In total, this study demonstrates that researchers

  6. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences.

    Science.gov (United States)

    Rideout, Jai Ram; He, Yan; Navas-Molina, Jose A; Walters, William A; Ursell, Luke K; Gibbons, Sean M; Chase, John; McDonald, Daniel; Gonzalez, Antonio; Robbins-Pianka, Adam; Clemente, Jose C; Gilbert, Jack A; Huse, Susan M; Zhou, Hong-Wei; Knight, Rob; Caporaso, J Gregory

    2014-01-01

    We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to "classic" open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, "classic" open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 of the time that would be required for "classic" open-reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by "classic" open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME's uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. Finally, we present a comparison of parameter settings in QIIME's OTU picking workflows and

  7. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences

    Directory of Open Access Journals (Sweden)

    Jai Ram Rideout

    2014-08-01

    Full Text Available We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to “classic” open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, “classic” open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 of the time that would be required by “classic” open-reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by “classic” open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME’s uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. Finally, we present a comparison of parameter settings in

  8. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    DEFF Research Database (Denmark)

    Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund

    2017-01-01

    number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post...... pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets....

  9. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads.

    Directory of Open Access Journals (Sweden)

    Jovan Rebolledo-Mendez

    Full Text Available The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight's half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference has accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects' and Twilight's genome or due to errors in the reference. EquCab2 is regarded as "The Twilight Assembly." The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset composed of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies

  10. Death, dying and informatics: misrepresenting religion on MedLine.

    Science.gov (United States)

    Rodríguez Del Pozo, Pablo; Fins, Joseph J

    2005-07-01

    The globalization of medical science carries for doctors worldwide a correlative duty to deepen their understanding of patients' cultural contexts and religious backgrounds, in order to satisfy each as a unique individual. To become better informed, practitioners may turn to MedLine, but it is unclear whether the information found there is an accurate representation of culture and religion. To test MedLine's representation of this field, we chose the topic of death and dying in the three major monotheistic religions. We searched MedLine using PubMed in order to retrieve and thematically analyze full-length scholarly journal papers or case reports dealing with religious traditions and end-of-life care. Our search consisted of a string of words that included the most common denominations of the three religions, the standard heading terms used by the National Reference Center for Bioethics Literature (NRCBL), and the Medical Subject Headings (MeSH) used by the National Library of Medicine. Eligible articles were limited to English-language papers with an abstract. We found that while a bibliographic search in MedLine on this topic produced instant results and some valuable literature, the aggregate reflected a selection bias. American writers were over-represented given the global prevalence of these religious traditions. Denominationally affiliated authors predominated in representing the Christian traditions. The Islamic tradition was under-represented. MedLine's capability to identify the most current, reliable and accurate information about purely scientific topics should not be assumed to be the same case when considering the interface of religion, culture and end-of-life care.

  11. MedlinePlus XML Data Sources

    Science.gov (United States)

    ... on MedlinePlus health topic pages. With the Web service, software developers can build applications that leverage the authoritative, reliable health information in MedlinePlus. The MedlinePlus Web service is free of charge and does not require ...

  12. Mobile MedlinePlus | NIH MedlinePlus Magazine

    Science.gov (United States)

    ... version of this page please turn Javascript on. Mobile MedlinePlus Past Issues / Winter 2010 Table of Contents Trusted medical information on your mobile phone http://m.medlineplus.gov Wondering what the ...

  13. Sequence-based comparative study of classical swine fever virus genogroup 2.2 isolate with pestivirus reference strains.

    Science.gov (United States)

    Kumar, Ravi; Rajak, Kaushal Kishor; Chandra, Tribhuwan; Muthuchelvan, Dhanavelu; Saxena, Arpit; Chaudhary, Dheeraj; Kumar, Ajay; Pandey, Awadh Bihari

    2015-09-01

    This study was undertaken to compare and establish the genetic relatedness between a classical swine fever virus (CSFV) genogroup 2.2 isolate and pestivirus reference strains. The available complete genome sequences of the CSFV/IND/UK/LAL-290 strain and other pestivirus reference strains were retrieved from GenBank. The complete genome sequence, complete open reading frame, and 5' and 3' non-coding region (NCR) sequences were analyzed and compared with reference pestivirus strains. The Clustal W model in the MegAlign program of Lasergene 6.0 software was used for analysis of genetic heterogeneity. Phylogenetic analysis was carried out using the MEGA 6.06 software package. The complete genome sequence alignment of the CSFV/IND/UK/LAL-290 isolate and reference pestivirus strains showed 58.9-72% identities at the nucleotide level and 50.3-76.9% at the amino acid level. Sequence homology of the 5' and 3' NCRs was found to be 64.1-82.3% and 22.9-71.4%, respectively. In phylogenetic analysis, overall tree topology was found to be similar irrespective of the sequences used in this study; however, the whole genome phylogeny of pestivirus formed two main clusters, which were further distinguished into the monophyletic clade of each pestivirus species. The CSFV/IND/UK/LAL-290 isolate was placed with the CSFV Eystrup strain in the same clade, in close proximity to border disease virus and Aydin strains. CSFV/IND/UK/LAL-290 exhibited a genomic organization analogous to that of all reference pestivirus strains. Based on sequence identity and phylogenetic analysis, the isolate showed close homology to Aydin/04-TR virus and was distantly related to Bungowannah virus.

  14. NIH Institutes and MLN MedlinePlus Advisory Board | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  15. Death, dying and informatics: misrepresenting religion on MedLine

    Directory of Open Access Journals (Sweden)

    Fins Joseph J

    2005-07-01

    Full Text Available Abstract Background: The globalization of medical science carries for doctors worldwide a correlative duty to deepen their understanding of patients' cultural contexts and religious backgrounds, in order to satisfy each as a unique individual. To become better informed, practitioners may turn to MedLine, but it is unclear whether the information found there is an accurate representation of culture and religion. To test MedLine's representation of this field, we chose the topic of death and dying in the three major monotheistic religions. Methods: We searched MedLine using PubMed in order to retrieve and thematically analyze full-length scholarly journal papers or case reports dealing with religious traditions and end-of-life care. Our search consisted of a string of words that included the most common denominations of the three religions, the standard heading terms used by the National Reference Center for Bioethics Literature (NRCBL), and the Medical Subject Headings (MeSH) used by the National Library of Medicine. Eligible articles were limited to English-language papers with an abstract. Results: We found that while a bibliographic search in MedLine on this topic produced instant results and some valuable literature, the aggregate reflected a selection bias. American writers were over-represented given the global prevalence of these religious traditions. Denominationally affiliated authors predominated in representing the Christian traditions. The Islamic tradition was under-represented. Conclusion: MedLine's capability to identify the most current, reliable and accurate information about purely scientific topics should not be assumed to be the same case when considering the interface of religion, culture and end-of-life care.

  16. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Full Text Available Abstract Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome including the discovery of thousands of new splice variants.

  17. Mobile MedlinePlus | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... version of this page please turn Javascript on. Mobile MedlinePlus Past Issues / Spring 2013 Table of Contents Trusted medical information on your mobile phone http://m.medlineplus.gov Wondering what the ...

  18. Identifying nurse staffing research in Medline: development and testing of empirically derived search strategies with the PubMed interface.

    Science.gov (United States)

    Simon, Michael; Hausner, Elke; Klaus, Susan F; Dunton, Nancy E

    2010-08-23

    The identification of health services research in databases such as PubMed/Medline is a cumbersome task. This task becomes even more difficult if the field of interest involves the use of diverse methods and data sources, as is the case with nurse staffing research. This type of research investigates the association between nurse staffing parameters and nursing and patient outcomes. A comprehensively developed search strategy may help identify nurse staffing research in PubMed/Medline. A set of relevant references in PubMed/Medline was identified by means of three systematic reviews. This development set was used to detect candidate free-text and MeSH terms. The frequency of these terms was compared to a random sample from PubMed/Medline in order to identify terms specific to nurse staffing research, which were then used to develop a sensitive, precise and balanced search strategy. To determine their precision, the newly developed search strategies were tested against a) the pool of relevant references extracted from the systematic reviews, b) a reference set identified from an electronic journal screening, and c) a sample from PubMed/Medline. Finally, all newly developed strategies were compared to PubMed's Health Services Research Queries (PubMed's HSR Queries). The sensitivities of the newly developed search strategies were almost 100% in all of the three test sets applied; precision ranged from 6.1% to 32.0%. PubMed's HSR queries were less sensitive (83.3% to 88.2%) than the new search strategies. Only minor differences in precision were found (5.0% to 32.0%). As with other literature on health services research, nurse staffing studies are difficult to identify in PubMed/Medline. Depending on the purpose of the search, researchers can choose between high sensitivity, with retrieval of a large number of references, and high precision, with an increased risk of missing relevant references. More standardized terminology (e.g. by consistent use of the

  19. Finding biomedical categories in Medline®

    Directory of Open Access Journals (Sweden)

    Yeganova Lana

    2012-10-01

    Full Text Available Abstract Background: There are several humanly defined ontologies relevant to Medline. However, Medline is a fast growing collection of biomedical documents which creates difficulties in updating and expanding these humanly defined ontologies. Automatically identifying meaningful categories of entities in a large text corpus is useful for information extraction, construction of machine learning features, and development of semantic representations. In this paper we describe and compare two methods for automatically learning meaningful biomedical categories in Medline. The first approach is a simple statistical method that uses part-of-speech and frequency information to extract a list of frequent nouns from Medline. The second method implements an alignment-based technique to learn frequent generic patterns that indicate a hyponymy/hypernymy relationship between a pair of noun phrases. We then apply these patterns to Medline to collect frequent hypernyms as potential biomedical categories. Results: We study and compare these two alternative sets of terms to identify semantic categories in Medline. We find that both approaches produce reasonable terms as potential categories. We also find that there is a significant agreement between the two sets of terms. The overlap between the two methods improves our confidence regarding categories predicted by these independent methods. Conclusions: This study is an initial attempt to extract categories that are discussed in Medline. Rather than imposing external ontologies on Medline, our methods allow categories to emerge from the text.

  20. MedlinePlus

    Data.gov (United States)

    U.S. Department of Health & Human Services — MedlinePlus is the National Institutes of Health's Web site for patients and their families and friends. Produced by the National Library of Medicine, the world’s...

  1. MedlinePlus Connect in Use

    Science.gov (United States)

    ... MedlinePlus Connect in Use URL of this page: https://medlineplus.gov/connect/users.html MedlinePlus Connect in ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  2. MedlinePlus Connect: Web Service

    Science.gov (United States)

    ... MedlinePlus Connect → Web Service URL of this page: https://medlineplus.gov/connect/service.html MedlinePlus Connect: Web ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  3. MedlinePlus Connect: Web Application

    Science.gov (United States)

    ... MedlinePlus Connect → Web Application URL of this page: https://medlineplus.gov/connect/application.html MedlinePlus Connect: Web ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  4. MedlinePlus Connect: Technical Information

    Science.gov (United States)

    ... MedlinePlus Connect → Technical Information URL of this page: https://medlineplus.gov/connect/technical.html MedlinePlus Connect: Technical ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  5. MedlinePlus Connect: Email List

    Science.gov (United States)

    ... MedlinePlus Connect → Email List URL of this page: https://medlineplus.gov/connect/emaillist.html MedlinePlus Connect: Email ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  6. MedlinePlus Milestones: 1998-present

    Science.gov (United States)

    ... Connect , a service linking patients or providers in electronic health record (EHR) systems to related MedlinePlus information on conditions ... updates Subscribe to RSS Follow us Disclaimers Copyright Privacy Accessibility Quality Guidelines Viewers & Players MedlinePlus Connect for ...

  7. MedlinePlus.gov on Twitter

    Science.gov (United States)

    ... page please turn Javascript on. MedlinePlus.gov on Twitter Past Issues / Fall 2009 Table of Contents You can now follow MedlinePlus.gov on Twitter: twitter.com/medlineplus4you The medlineplus4you Twitter feed provides ...

  8. Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database

    NARCIS (Netherlands)

    Ritari, Jarmo; Salojärvi, Jarkko; Lahti, Leo; Vos, de Willem M.

    2015-01-01

    Background: Current sequencing technology enables taxonomic profiling of microbial ecosystems at high resolution and depth by using the 16S rRNA gene as a phylogenetic marker. Taxonomic assignation of newly acquired data is based on sequence comparisons with comprehensive reference databases to

  9. De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units

    Directory of Open Access Journals (Sweden)

    Sarah L. Westcott

    2015-12-01

    Full Text Available Background. 16S rRNA gene sequences are routinely assigned to operational taxonomic units (OTUs) that are then used to analyze complex microbial communities. A number of methods have been employed to carry out the assignment of 16S rRNA gene sequences to OTUs, leading to confusion over which method is optimal. A recent study suggested that a clustering method should be selected based on its ability to generate stable OTU assignments that do not change as additional sequences are added to the dataset. In contrast, we contend that the quality of the OTU assignments, the ability of the method to properly represent the distances between the sequences, is more important. Methods. Our analysis implemented six de novo clustering algorithms, including the single linkage, complete linkage, average linkage, abundance-based greedy clustering, distance-based greedy clustering, and Swarm, and the open and closed-reference methods. Using two previously published datasets we used the Matthews Correlation Coefficient (MCC) to assess the stability and quality of OTU assignments. Results. The stability of OTU assignments did not reflect the quality of the assignments. Depending on the dataset being analyzed, the average linkage and the distance and abundance-based greedy clustering methods generated OTUs that were more likely to represent the actual distances between sequences than the open and closed-reference methods. We also demonstrated that for the greedy algorithms VSEARCH produced assignments that were comparable to those produced by USEARCH, making VSEARCH a viable free and open source alternative to USEARCH. Further interrogation of the reference-based methods indicated that when USEARCH or VSEARCH were used to identify the closest reference, the OTU assignments were sensitive to the order of the reference sequences because the reference sequences can be identical over the region being considered. More troubling was the observation that while both USEARCH and
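
    The Matthews correlation coefficient named in the abstract above can be illustrated with a minimal Python sketch. The pairwise counting scheme is an assumption made for illustration (a pair of sequences counts as a true positive when it lies within the distance threshold and shares an OTU); the abstract does not spell it out, and the function names `mcc` and `pair_confusion` are hypothetical.

```python
import math
from itertools import combinations


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from 2x2 confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0.0 when the coefficient is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def pair_confusion(dist, otu, threshold):
    """Count sequence pairs by (within threshold?, same OTU?).

    Assumed scheme, for illustration only:
      TP: close and clustered together    TN: distant and kept apart
      FP: distant but clustered together  FN: close but kept apart
    """
    tp = tn = fp = fn = 0
    for i, j in combinations(range(len(otu)), 2):
        close = dist[i][j] <= threshold
        same = otu[i] == otu[j]
        if close and same:
            tp += 1
        elif not close and not same:
            tn += 1
        elif same:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


# Toy example: three sequences, a 3% distance threshold, two OTUs.
dist = [[0.00, 0.01, 0.10],
        [0.01, 0.00, 0.10],
        [0.10, 0.10, 0.00]]
otu = [0, 0, 1]
counts = pair_confusion(dist, otu, threshold=0.03)
score = mcc(*counts)
```

    With this toy input every pair is classified consistently with the threshold, so the coefficient takes its maximum value.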

  10. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    Directory of Open Access Journals (Sweden)

    Kodner Robin B

    2010-10-01

    Full Text Available Abstract Background: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results: This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.

  11. Dialysis search filters for PubMed, Ovid MEDLINE, and Embase databases.

    Science.gov (United States)

    Iansavichus, Arthur V; Haynes, R Brian; Lee, Christopher W C; Wilczynski, Nancy L; McKibbon, Ann; Shariff, Salimah Z; Blake, Peter G; Lindsay, Robert M; Garg, Amit X

    2012-10-01

    Physicians frequently search bibliographic databases, such as MEDLINE via PubMed, for best evidence for patient care. The objective of this study was to develop and test search filters to help physicians efficiently retrieve literature related to dialysis (hemodialysis or peritoneal dialysis) from all other articles indexed in PubMed, Ovid MEDLINE, and Embase. A diagnostic test assessment framework was used to develop and test robust dialysis filters. The reference standard was a manual review of the full texts of 22,992 articles from 39 journals to determine whether each article contained dialysis information. Next, 1,623,728 unique search filters were developed, and their ability to retrieve relevant articles was evaluated. The high-performance dialysis filters consisted of up to 65 search terms in combination. These terms included the words "dialy" (truncated), "uremic," "catheters," and "renal transplant wait list." These filters reached peak sensitivities of 98.6% and specificities of 98.5%. The filters' performance remained robust in an independent validation subset of articles. These empirically derived and validated high-performance search filters should enable physicians to effectively retrieve dialysis information from PubMed, Ovid MEDLINE, and Embase.
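
    The sensitivity and specificity figures quoted for these filters follow the usual diagnostic-test definitions. A minimal Python sketch is given below, assuming that a filter's output and the manually reviewed reference standard are both available as sets of article identifiers; the function name `filter_performance` and the toy identifiers are hypothetical and are not taken from the study.

```python
def filter_performance(retrieved, relevant, all_ids):
    """Sensitivity, specificity and precision of a search filter.

    retrieved: IDs returned by the filter
    relevant:  IDs judged relevant in the reference standard
    all_ids:   every ID that was manually reviewed
    """
    retrieved, relevant, all_ids = set(retrieved), set(relevant), set(all_ids)
    tp = len(retrieved & relevant)            # relevant and retrieved
    fn = len(relevant - retrieved)            # relevant but missed
    fp = len(retrieved - relevant)            # retrieved but not relevant
    tn = len(all_ids - retrieved - relevant)  # correctly left out
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
    }


# Toy example: 10 reviewed articles, 4 relevant, filter returns 4.
result = filter_performance(
    retrieved={0, 1, 2, 7},
    relevant={0, 1, 2, 3},
    all_ids=range(10),
)
```

    In the toy example the filter finds three of the four relevant articles and wrongly retrieves one of the six irrelevant ones.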

  12. Subscribe to NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... turn Javascript on. Subscribe to NIH MedlinePlus the magazine NIH MedlinePlus the magazine is published quarterly, in print and on the ... up for a free subscription to NIH MedlinePlus Magazine. Librarians may order this magazine in bulk . Please ...

  13. No Reference Prediction of Quality Metrics for H.264 Compressed Infrared Image Sequences for UAV Applications

    DEFF Research Database (Denmark)

    Hossain, Kabir; Mantel, Claire; Forchhammer, Søren

    2018-01-01

    The framework for this research work is the acquisition of Infrared (IR) images from Unmanned Aerial Vehicles (UAV). In this paper we consider the No-Reference (NR) prediction of Full Reference Quality Metrics for Infrared (IR) video sequences which are compressed and thus distorted by an H.264...

  14. NLM MedlinePlus Magazine Team | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... Home Current issue contents Magazine Team Follow us Magazine Team National Library of Medicine at the National ... MLS, MA TREASURER Dennis Cryer, MD NIH MedlinePlus magazine is published by Friends of the NLM in ...

  15. NLM MedlinePlus Magazine Team | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... Robert George DIRECTOR OF OPERATIONS Carolyn Medeiros DIRECTOR, BUSINESS DEVELOPMENT Michele Tezduyar MANAGING EDITOR Emily Poe SENIOR ... MD 20814 CONNECT WITH US Follow us on Facebook Facebook MedlinePlus www.facebook.com/mplus.gov Facebook ...

  16. Effector-independent motor sequence representations exist in extrinsic and intrinsic reference frames.

    Science.gov (United States)

    Wiestler, Tobias; Waters-Metenier, Sheena; Diedrichsen, Jörn

    2014-04-02

    Many daily activities rely on the ability to produce meaningful sequences of movements. Motor sequences can be learned in an effector-specific fashion (such that benefits of training are restricted to the trained hand) or an effector-independent manner (meaning that learning also facilitates performance with the untrained hand). Effector-independent knowledge can be represented in extrinsic/world-centered or in intrinsic/body-centered coordinates. Here, we used functional magnetic resonance imaging (fMRI) and multivoxel pattern analysis to determine the distribution of intrinsic and extrinsic finger sequence representations across the human neocortex. Participants practiced four sequences with one hand for 4 d, and then performed these sequences during fMRI with both left and right hand. Between hands, these sequences were equivalent in extrinsic or intrinsic space, or were unrelated. In dorsal premotor cortex (PMd), we found that sequence-specific activity patterns correlated higher for extrinsic than for unrelated pairs, providing evidence for an extrinsic sequence representation. In contrast, primary sensory and motor cortices showed effector-independent representations in intrinsic space, with considerable overlap of the two reference frames in caudal PMd. These results suggest that effector-independent representations exist not only in world-centered, but also in body-centered coordinates, and that PMd may be involved in transforming sequential knowledge between the two. Moreover, although effector-independent sequence representations were found bilaterally, they were stronger in the hemisphere contralateral to the trained hand. This indicates that intermanual transfer relies on motor memories that are laid down during training in both hemispheres, but preferentially draws upon sequential knowledge represented in the trained hemisphere.

  17. Iterative normalization technique for reference sequence generation for zero-tail discrete fourier transform spread orthogonal frequency division multiplexing

    DEFF Research Database (Denmark)

    2017-01-01

    Systems, methods, apparatuses, and computer program products for generating sequences for zero-tail discrete fourier transform (DFT)-spread-orthogonal frequency division multiplexing (OFDM) (ZT DFT-s-OFDM) reference signals. One method includes adding a zero vector to an input sequence...... of each of the elements, converting the sequence to time domain, generating a zero-padded sequence by forcing a zero head and tail of the sequence, and repeating the steps until a final sequence with zero-tail and flat frequency response is obtained....

  18. Supplementary searches of PubMed to improve currency of MEDLINE and MEDLINE In-Process searches via Ovid.

    Science.gov (United States)

    Duffy, Steven; de Kock, Shelley; Misso, Kate; Noake, Caro; Ross, Janine; Stirk, Lisa

    2016-10-01

    The research investigated whether conducting a supplementary search of PubMed in addition to the main MEDLINE (Ovid) search for a systematic review is worthwhile, and sought to ascertain whether this PubMed search can be conducted quickly and whether it retrieves unique, recently published, and ahead-of-print studies that are subsequently considered for inclusion in the final systematic review. Searches of PubMed were conducted after MEDLINE (Ovid) and MEDLINE In-Process (Ovid) searches had been completed for seven recent reviews. The searches were limited to records not in MEDLINE or MEDLINE In-Process (Ovid). Additional unique records were identified for all of the investigated reviews. Search strategies were adapted quickly to run in PubMed, and reviewer screening of the results was not time consuming. For each of the investigated reviews, studies were ordered for full screening; in six cases, studies retrieved from the supplementary PubMed searches were included in the final systematic review. Supplementary searching of PubMed for studies unavailable elsewhere is worthwhile and improves the currency of the systematic reviews.

  19. Dry Mouth: MedlinePlus Health Topic

    Science.gov (United States)

    ... MEDLINE/PubMed (National Library of Medicine) Article: Transcription profiling of peripheral B cells in antibody-positive primary ...

  20. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    Full Text Available BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
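
    The two ideas named in this abstract, a cluster 'centroid' that minimizes the distances to the other sequences and simple kNN classification, can be illustrated with a minimal Python sketch. This is not the authors' LM algorithm; the function names, the toy distance matrix DIST, and the labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def centroid_of_cluster(dist, members):
    """Return the member whose summed distance to the other members is smallest."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))

def knn_classify(dist_to_refs, ref_labels, k=3):
    """Label a query by majority vote among its k nearest labelled references."""
    nearest = sorted(range(len(ref_labels)), key=lambda i: dist_to_refs[i])[:k]
    votes = Counter(ref_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical pairwise distances between four sequences (symmetric, zero diagonal).
DIST = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.1, 0.8],
    [0.2, 0.1, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]

# Sequences 0, 1 and 2 form one cluster; its most representative member is picked.
representative = centroid_of_cluster(DIST, [0, 1, 2])

# A query sequence described only by its distances to four labelled references.
label = knn_classify([0.05, 0.30, 0.40, 0.06], ["A", "B", "B", "A"], k=3)
```

    In this toy case the representative is the sequence with the smallest row sum inside the cluster, and the query takes the label held by most of its three nearest references.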

  1. Genetics Home Reference: arginase deficiency

    Science.gov (United States)

    ... belongs to a class of genetic diseases called urea cycle disorders. The urea cycle is a sequence of reactions ... links) Baby's First Test GeneReview: Arginase Deficiency GeneReview: Urea Cycle Disorders Overview MedlinePlus Encyclopedia: Hereditary urea cycle abnormality National ...

  2. MedlinePlus Tour

    Science.gov (United States)

    ... captioning, click the CC button on the lower right-hand corner of the player. Video player keyboard shortcuts Transcript Welcome to MedlinePlus, the consumer health information website from the National Library of ...

  3. Reference voltage calculation method based on zero-sequence component optimisation for a regional compensation DVR

    Science.gov (United States)

    Jian, Le; Cao, Wang; Jintao, Yang; Yinge, Wang

    2018-04-01

    This paper describes the design of a dynamic voltage restorer (DVR) that can simultaneously protect several sensitive loads from voltage sags in a region of an MV distribution network. A novel reference voltage calculation method based on zero-sequence voltage optimisation is proposed for this DVR to optimise cost-effectiveness in compensation of voltage sags with different characteristics in an ungrounded neutral system. Based on a detailed analysis of the characteristics of voltage sags caused by different types of faults and the effect of the wiring mode of the transformer on these characteristics, the optimisation target of the reference voltage calculation is presented with several constraints. The reference voltages under all types of voltage sags are calculated by optimising the zero-sequence component, which can reduce the degree of swell in the phase-to-ground voltage after compensation to the maximum extent and can improve the symmetry degree of the output voltages of the DVR, thereby effectively increasing the compensation ability. The validity and effectiveness of the proposed method are verified by simulation and experimental results.
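
    For readers unfamiliar with the zero-sequence component that this abstract optimises, the standard symmetrical-components definition is V0 = (Va + Vb + Vc) / 3. The Python sketch below only evaluates that definition on hypothetical per-unit phasors; it does not reproduce the paper's optimisation method, and the helper names and the sag example are assumptions.

```python
import cmath
import math

def phasor(magnitude, angle_deg):
    """Build a complex phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))

def zero_sequence(va, vb, vc):
    """Zero-sequence component of three phase voltages: V0 = (Va + Vb + Vc) / 3."""
    return (va + vb + vc) / 3

# Balanced per-unit set: the three phasors cancel, so V0 is zero.
balanced = (phasor(1.0, 0), phasor(1.0, -120), phasor(1.0, 120))
v0_balanced = zero_sequence(*balanced)

# Hypothetical single-phase sag: phase A drops to 0.5 pu, phases B and C are unchanged.
sagged = (phasor(0.5, 0), phasor(1.0, -120), phasor(1.0, 120))
v0_sag = zero_sequence(*sagged)
```

    With the sag on phase A the three phasors no longer cancel, so a nonzero zero-sequence voltage appears; this is the quantity the abstract says is optimised when the reference voltages are calculated.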

  4. The Medline/full-text research project.

    Science.gov (United States)

    McKinin, E J; Sievert, M; Johnson, E D; Mitchell, J A

    1991-05-01

    This project was designed to test the relative efficacy of index terms and full-text for the retrieval of documents in those MEDLINE journals for which full-text searching was also available. The full-text files used were MEDIS from Mead Data Central and CCML from BRS Information Technologies. One hundred clinical medical topics were searched in these two files as well as the MEDLINE file to accumulate the necessary data. It was found that full-text identified significantly more relevant articles than did the indexed file, MEDLINE. The full-text searches, however, lacked the precision of searches done in the indexed file. Most relevant items missed in the full-text files, but identified in MEDLINE, were missed because the searcher failed to account for some aspect of natural language, used a logical or positional operator that was too restrictive, or included a concept which was implied, but not expressed in the natural language. Very few of the unique relevant full-text citations would have been retrieved by title or abstract alone. Finally, as of July, 1990 the more current issue of a journal was just as likely to appear in MEDLINE as in one of the full-text files.

  5. Automatic inference of indexing rules for MEDLINE

    Directory of Open Access Journals (Sweden)

    Shooshan Sonya E

    2008-11-01

    Full Text Available Abstract Background: Indexing is a crucial step in any information retrieval system. In MEDLINE, a widely used database of the biomedical literature, the indexing process involves the selection of Medical Subject Headings in order to describe the subject matter of articles. The need for automatic tools to assist MEDLINE indexers in this task is growing with the increasing number of publications being added to MEDLINE. Methods: In this paper, we describe the use and the customization of Inductive Logic Programming (ILP) to infer indexing rules that may be used to produce automatic indexing recommendations for MEDLINE indexers. Results: Our results show that this original ILP-based approach outperforms manual rules when they exist. In addition, the use of ILP rules also improves the overall performance of the Medical Text Indexer (MTI), a system producing automatic indexing recommendations for MEDLINE. Conclusion: We expect the sets of ILP rules obtained in this experiment to be integrated into MTI.

  6. MedlinePlus FAQ: Will MedlinePlus work on my mobile device?

    Science.gov (United States)

    ... mobile.html Question: Will MedlinePlus work on my mobile device? To use the sharing features on this page, ... Some video content might not play on your mobile device. See our FAQ on playing videos on phones ...

  7. Genetics Home Reference: ornithine transcarbamylase deficiency

    Science.gov (United States)

    ... belongs to a class of genetic diseases called urea cycle disorders. The urea cycle is a sequence of reactions ... Baby's First Test GeneReview: Ornithine Transcarbamylase Deficiency GeneReview: Urea Cycle Disorders Overview MedlinePlus Encyclopedia: Hereditary urea cycle abnormality National ...

  8. The development of rhythmic attending in auditory sequences: attunement, referent period, focal attending.

    Science.gov (United States)

    Drake, C; Jones, M R; Baruch, C

    2000-12-15

    This paper is divided into three sections. The first section is theoretical; it extends Dynamic Attending Theory (Jones, M. R. Psychological Review 83 (1976) 323; Jones, M. R. Perception and Psychophysics 41(6) (1987) 631; Jones, M. R. Psychomusicology 9(2) (1990) 193; Jones, M. R., & Boltz, M. Psychological Review 96(3) (1989) 459) to developmental questions concerning tempo and time hierarchies. Generally Dynamic Attending Theory proposes that, when listening to a complex auditory sequence, listeners spontaneously focus on events occurring at an intermediate rate (the referent level), and they then may shift attention to events occurring over longer or shorter time spans, that is at lower (faster) or higher (slower) hierarchical levels (focal attending). The second section of the paper is experimental. It examines maturational changes of three dynamic attending activities involving referent period and level, attunement, and focal attending. Tasks involve both motor tapping (including spontaneous motor tempo and synchronization with simple sequences and music) and tempo discrimination. We compare performances by 4-, 6-, 8-, and 10-year-old children and adults, with or without musical training. Results indicate three changes with increased age and musical training: (1) a slowing of the mean spontaneous tapping rate (a reflection of the referent period) and mean synchronization rate (a reflection of the referent level), (2) enhanced ability to synchronize tapping and discriminate tempo (improved attunement), and (3) an enlarged range of tapping rates towards slower rates and higher hierarchical levels (improved focal attending). A final section considers results in light of the theory proposed here. It is suggested that growth trends can be expressed in terms of listeners' engagement of slower attending oscillators with age and experience, accompanied by the passage from the initial use of a single oscillator towards the coupling of multiple oscillators.

  9. Using the Genetics Home Reference Website | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... of this page please turn Javascript on. Feature: Genetics 101 Using the Genetics Home Reference Website Past Issues / Summer 2013 Table ... as the GHR website keeps growing. What Is Genetic Counseling? Genetic counseling provides information and support to ...

  10. Remembering Mary Tyler Moore | MedlinePlus Magazine

    Science.gov (United States)

    ... Remembering Mary Tyler Moore Follow us NIH MedlinePlus Magazine Remembers Mary Tyler Moore A little more than ... helped launch the first issue of NIH MedlinePlus magazine on Capitol Hill. The award-winning actress and ...

  11. MedlinePlus FAQ: Listing Your Web Site

    Science.gov (United States)

    ... JavaScript. Answer: MedlinePlus is a selected list of authoritative resources. MedlinePlus uses quality guidelines to evaluate Web ... ensure that the information we link to is authoritative, accurate, up-to-date, educational and available at ...

  12. MedlinePlus FAQ: Disease or Condition Information

    Science.gov (United States)

    ... on the Health Topics button on the MedlinePlus homepage. You can also find the Health Topics button ... MedlinePlus Connect for EHRs For Developers U.S. National Library of Medicine 8600 Rockville Pike, Bethesda, MD 20894 ...

  13. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2 with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  14. Health Videos: MedlinePlus

    Science.gov (United States)

    ... Duplication for commercial use must be authorized in writing by ADAM Health Solutions. ...

  15. Medical Encyclopedia: MedlinePlus

    Science.gov (United States)

    ... Duplication for commercial use must be authorized in writing by ADAM Health Solutions. ...

  16. Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

    Science.gov (United States)

    Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...

  17. Friends of the National Library of Medicine, Welcome to NIH MedlinePlus, the magazine | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... Contents Dear Readers, WELCOME to NIH MedlinePlus , the magazine. The purpose of NIH MedlinePlus , the magazine, is to provide you with a FREE , trusted ... medical information. Published four times a year, the magazine showcases the National Institutes of Health's (NIH) latest ...

  18. Genetics Home Reference: N-acetylglutamate synthase deficiency

    Science.gov (United States)

    ... belongs to a class of genetic diseases called urea cycle disorders. The urea cycle is a sequence of reactions ... Other Diagnosis and Management Resources (3 links) GeneReview: Urea Cycle Disorders Overview MedlinePlus Encyclopedia: Hereditary Urea Cycle Abnormality National ...

  19. MedlinePlus FAQ: Framing

    Science.gov (United States)

    ... URL of this page: https://medlineplus.gov/faq/framing.html I'd like to link to MedlinePlus, ... M. encyclopedia. Our license agreements do not permit framing of their content from our site. For more ...

  20. Search Tips: MedlinePlus

    Science.gov (United States)

    ... of this page: https://medlineplus.gov/searchtips.html Search Tips To use the sharing features on this page, please enable JavaScript. How do I search MedlinePlus? The search box appears at the top ...

  1. Molecular Findings Among Patients Referred for Clinical Whole-Exome Sequencing

    Science.gov (United States)

    Yang, Yaping; Muzny, Donna M.; Xia, Fan; Niu, Zhiyv; Person, Richard; Ding, Yan; Ward, Patricia; Braxton, Alicia; Wang, Min; Buhay, Christian; Veeraraghavan, Narayanan; Hawes, Alicia; Chiang, Theodore; Leduc, Magalie; Beuten, Joke; Zhang, Jing; He, Weimin; Scull, Jennifer; Willis, Alecia; Landsverk, Megan; Craigen, William J.; Bekheirnia, Mir Reza; Stray-Pedersen, Asbjorg; Liu, Pengfei; Wen, Shu; Alcaraz, Wendy; Cui, Hong; Walkiewicz, Magdalena; Reid, Jeffrey; Bainbridge, Matthew; Patel, Ankita; Boerwinkle, Eric; Beaudet, Arthur L.; Lupski, James R.; Plon, Sharon E.; Gibbs, Richard A.; Eng, Christine M.

    2015-01-01

    ), 65 (12.3%) X-linked, and 1 (0.2%) mitochondrial. Of 504 patients with a molecular diagnosis, 23 (4.6%) had blended phenotypes resulting from 2 single gene defects. About 30% of the positive cases harbored mutations in disease genes reported since 2011. There were 95 medically actionable incidental findings in genes unrelated to the phenotype but with immediate implications for management in 92 patients (4.6%), including 59 patients (3%) with mutations in genes recommended for reporting by the American College of Medical Genetics and Genomics. CONCLUSIONS AND RELEVANCE Whole-exome sequencing provided a potential molecular diagnosis for 25% of a large cohort of patients referred for evaluation of suspected genetic conditions, including detection of rare genetic events and new mutations contributing to disease. The yield of whole-exome sequencing may offer advantages over traditional molecular diagnostic approaches in certain patients. PMID:25326635

  2. The zebrafish reference genome sequence and its relationship to the human genome.

    Science.gov (United States)

    Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

    2013-04-25

    Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

  3. Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods.

    Science.gov (United States)

    Martin, Guillaume; Baurens, Franc-Christophe; Droc, Gaëtan; Rouard, Mathieu; Cenci, Alberto; Kilian, Andrzej; Hastie, Alex; Doležel, Jaroslav; Aury, Jean-Marc; Alberti, Adriana; Carreel, Françoise; D'Hont, Angélique

    2016-03-16

    Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata). We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%. The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in
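
    The assembly statistics quoted in this abstract can be made concrete with a short Python sketch. N50 is the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the assembly size; the scaffold counts given in parentheses in the abstract presumably correspond to the number of scaffolds needed to reach that point (often called L50). The function name and the toy lengths below are hypothetical; only the scaffold totals 7513 and 1532 come from the abstract.

```python
def n50(lengths):
    """Return (N50, L50) for a list of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' is how many scaffolds were needed (L50).
            return length, count
    raise ValueError("no scaffold lengths given")

# Toy assembly (arbitrary units): total 290, so half is 145.
toy_n50, toy_l50 = n50([80, 70, 50, 40, 30, 20])

# Relative reduction in scaffold number reported in the abstract (7513 to 1532).
reduction_pct = round((7513 - 1532) / 7513 * 100, 1)
```

    For the toy lengths the running total passes the halfway mark at the second scaffold, and the scaffold-number reduction works out to 79.6%, consistent with the rounded 80% in the abstract.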

  4. Combating HIV/AIDS | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  5. Recovery and Treatment | NIH MedlinePlus the Magazine

    Science.gov (United States)


  6. Exploring Graphic Medicine | NIH MedlinePlus the Magazine

    Science.gov (United States)


  7. The Opioid Crisis | NIH MedlinePlus the Magazine

    Science.gov (United States)


  8. Hope for Aphasia Patients | NIH MedlinePlus Magazine

    Science.gov (United States)


  9. Expanding Hearing Healthcare | NIH MedlinePlus the Magazine

    Science.gov (United States)


  10. NIH Institutes and MLN MedlinePlus Advisory Board

    Science.gov (United States)

    ... main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.9 mb] ... nih.gov (301) 496-7301 National Institute of Mental Health (NIMH) www.nimh.nih.gov 1-866- ...

  11. Beyond MEDLINE for literature searches.

    Science.gov (United States)

    Conn, Vicki S; Isaramalai, Sang-arun; Rath, Sabyasachi; Jantarakupt, Peeranuch; Wadhawan, Rohini; Dash, Yashodhara

    2003-01-01

    To describe strategies for a comprehensive literature search. MEDLINE searches result in limited numbers of studies that are often biased toward statistically significant findings. Diversified search strategies are needed. Empirical evidence about the recall and precision of diverse search strategies is presented. Challenges and strengths of each search strategy are identified. Search strategies vary in recall and precision. Often sensitivity and specificity are inversely related. Valuable search strategies include examination of multiple diverse computerized databases, ancestry searches, citation index searches, examination of research registries, journal hand searching, contact with the "invisible college," examination of abstracts, Internet searches, and contact with sources of synthesized information. Extending searches beyond MEDLINE enables researchers to conduct more systematic comprehensive searches.
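
    Recall and precision, the two measures this abstract uses to compare search strategies, can be computed as in the minimal Python sketch below. Recall is the share of relevant studies that a search retrieves; precision is the share of retrieved records that are relevant. The record identifiers and the two example searches are hypothetical and serve only to show the trade-off.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved records."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical review with four relevant studies (identifiers are made up).
RELEVANT = {1, 2, 3, 4}

# A narrow single-database search: few records, two of them relevant.
narrow = precision_recall({1, 2, 9}, RELEVANT)

# A diversified search: finds every relevant study but adds more noise.
broad = precision_recall({1, 2, 3, 4, 9, 10, 11, 12}, RELEVANT)
```

    In this toy case the broader search raises recall from 0.5 to 1.0 while precision falls from about 0.67 to 0.5, the kind of inverse relationship the abstract describes.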

  12. Unraveling systematic inventory of Echinops (Asteraceae) with special reference to nrDNA ITS sequence-based molecular typing of Echinops abuzinadianus.

    Science.gov (United States)

    Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O

    2015-10-02

    The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.

  13. The zebrafish reference genome sequence and its relationship to the human genome

    Science.gov (United States)

    Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

    2013-01-01

    Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743

  14. A BRCA2 mutation incorrectly mapped in the original BRCA2 reference sequence, is a common West Danish founder mutation disrupting mRNA splicing

    DEFF Research Database (Denmark)

    Thomassen, Mads; Pedersen, Inge Søkilde; Vogel, Ida

    2011-01-01

    Inherited mutations in the tumor suppressor genes BRCA1 and BRCA2 predispose carriers to breast and ovarian cancer. The authors have identified a mutation in BRCA2, 7845+1G>A (c.7617+1G>A), not previously regarded as deleterious because of incorrect mapping of the splice junction in the originally published genomic reference sequence. This reference sequence is generally used in many laboratories and it maps the mutation 16 base pairs inside intron 15. However, according to the recent reference sequences the mutation is located in the consensus donor splice sequence. By reverse transcriptase analysis, loss of exon 15 in the final transcript interrupting the open reading frame was demonstrated. Furthermore, the mutation segregates with a cancer phenotype in 18 Danish families. By genetic analysis of more than 3,500 Danish breast/ovarian cancer risk families, the mutation was identified as the most...

  15. MedlinePlus: Awards and Recognition

    Science.gov (United States)

    ... winner of the 2005 World Summit on the Information Society Awards for e-health. Winner of the Thomson Reuters/Frank Bradway Rogers Information Advancement Award in 2014 for MedlinePlus Connect and ...

  16. Helping others hear better | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  17. Racing Against Lung Cancer | NIH MedlinePlus the Magazine

    Science.gov (United States)


  18. The ABCs of GERD | NIH MedlinePlus the Magazine

    Science.gov (United States)


  19. A Lifelong Asthma Struggle | NIH MedlinePlus the Magazine

    Science.gov (United States)


  20. NIH on the web | NIH MedlinePlus the Magazine

    Science.gov (United States)


  1. Surgery of the Future | NIH MedlinePlus the Magazine

    Science.gov (United States)


  2. 10 NIH Research Highlights | NIH MedlinePlus the Magazine

    Science.gov (United States)


  3. NIH on the web | NIH MedlinePlus the Magazine

    Science.gov (United States)


  4. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  5. Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    DEFF Research Database (Denmark)

    Maretty, Lasse; Jensen, Jacob Malte; Petersen, Bent

    2017-01-01

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set...

  6. Chiropractic: MedlinePlus Health Topic

    Science.gov (United States)

    ... for back pain (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Chiropractic updates by ... ENCYCLOPEDIA Chiropractic care for back pain Related Health Topics Back Pain Complementary and Integrative Medicine National Institutes ...

  7. Diets: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Mediterranean diet (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Diets updates by ... foods Diet-busting foods Mediterranean diet Related Health Topics Child Nutrition DASH Eating Plan Diabetic Diet Nutrition ...

  8. Colonoscopy: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Virtual colonoscopy (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Colonoscopy updates by ... Colonoscopy Colonoscopy discharge Sigmoidoscopy Virtual colonoscopy Related Health Topics Colonic Diseases Colonic Polyps Colorectal Cancer National Institutes ...

  9. Dialysis: MedlinePlus Health Topic

    Science.gov (United States)

    ... access for hemodialysis (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Dialysis updates by ... for hemodialysis Show More Show Less Related Health Topics Creatinine Kidney Cysts Kidney Failure Peritoneal Disorders National ...

  10. Menopause: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish What is Menopause? (National Institute on Aging) Topic Image MedlinePlus Email Updates Get Menopause updates by ... test Menopause Types of hormone therapy Related Health Topics Hormone Replacement Therapy Menstruation Premature Ovarian Failure National ...

  11. Vaginitis: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Vulvovaginitis - overview (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Vaginitis updates by ... Vaginitis test - wet mount Vulvovaginitis - overview Related Health Topics Trichomoniasis Vaginal Diseases Yeast Infections Other Languages Find ...

  12. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  13. Mary Tyler Moore Helps Launch NIH MedlinePlus Magazine

    Science.gov (United States)

    ... Issues Mary Tyler Moore Helps Launch NIH MedlinePlus Magazine Past Issues / Winter 2007 Table of Contents For ... Javascript on. Among those attending the NIH MedlinePlus magazine launch on Capitol Hill were (l-r) NIH ...

  14. NIH MedlinePlus the Magazine: Health, Medical & Wellness Articles

    Science.gov (United States)

    ... to the Web site for NIH MedlinePlus, the magazine. Our purpose is to present you with the ... sponsorship and other charitable donations for NIH MedlinePlus magazine's publication and distribution, many more thousands of Americans ...

  15. Prediabetes: MedlinePlus Health Topic

    Science.gov (United States)

    ... in Spanish Prediabetes (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Prediabetes updates by ... Glucose tolerance test - non-pregnant Prediabetes Related Health Topics A1C Diabetes Diabetes in Children and Teens Diabetes ...

  16. Diabetes: MedlinePlus Health Topic

    Science.gov (United States)

    ... High blood sugar (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Diabetes updates by ... ketones test Show More Show Less Related Health Topics A1C Blood Sugar Diabetes and Pregnancy Diabetes Complications ...

  17. Comparing Medline citations using modified N-grams

    Science.gov (United States)

    Nawab, Rao Muhammad Adeel; Stevenson, Mark; Clough, Paul

    2014-01-01

    Objective: We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information. Materials and methods: Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches which take account of the fact that the document may have been altered. These are: (1) deletion, an item in the n-gram is removed; and (2) substitution, an item in the n-gram is substituted with a similar term obtained from the Unified Medical Language System Metathesaurus. N-grams are also weighted using a score derived from a language model. Evaluation is carried out using a set of 520 Medline citation pairs, including a set of 260 manually verified duplicate pairs obtained from the Deja Vu database. Results: The approach accurately detects duplicate Medline document pairs with an F1 measure score of 0.99. Allowing for word deletions and substitution improves performance. The best results are obtained by combining scores for n-grams of length 1–5 words. Discussion: Results show that the detection of duplicate Medline citations can be improved by modifying n-grams and that high performance can also be obtained using only unigrams (F1=0.959), particularly when allowing for substitutions of alternative phrases. PMID:23715801
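
    The deletion step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, Jaccard overlap is assumed as the similarity score, and the Metathesaurus substitution and language-model weighting are left out.

```python
def word_ngrams(text, n):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def with_deletions(ngrams):
    """Add, for each n-gram, every variant with exactly one item removed."""
    expanded = set(ngrams)
    for gram in ngrams:
        if len(gram) > 1:
            for i in range(len(gram)):
                expanded.add(gram[:i] + gram[i + 1:])
    return expanded


def overlap_score(a, b, n=3, allow_deletion=True):
    """Jaccard overlap of the two n-gram sets (an assumed scoring choice)."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if allow_deletion:
        grams_a, grams_b = with_deletions(grams_a), with_deletions(grams_b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Two near-duplicate sentences that differ by one word.
first = "the cat sat on the mat"
second = "the cat sat on a mat"
strict = overlap_score(first, second, allow_deletion=False)
relaxed = overlap_score(first, second, allow_deletion=True)
```

    In this toy case the deletion variants let the two sentences share more n-grams, so the relaxed score comes out higher than the strict one.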

  18. The Sequence of Acquisition of Personal Pronoun Case and Person Reference among 6 Year Old Children in Two Selected Malaysian Kindergartens

    Directory of Open Access Journals (Sweden)

    Arshad Abd Samad

    2017-03-01

    Full Text Available Pronoun case and person reference refer to the position of the pronoun in the sentence and the person the pronoun refers to, respectively. Examining the acquisition of pronoun case and person reference among young children can be insightful as, besides their obvious relevance to language development, both these constructs can have implications for other aspects of child development. Attention given by children to these various constructs may indicate the importance children place on the concept of ego and self as well as on social relations. The sequence of acquisition of personal pronouns among these children is therefore an important phenomenon to be examined as it can reflect linguistic and socio-cognitive development. This largely descriptive study examines the sequence of acquisition of the English pronouns among forty 6-year-old Malaysian children learning ESL in two kindergartens. The children in the study were presented with 33 drawings to assess their familiarity with case and person reference expressed through English personal pronouns. They were required to select the correct pronoun from three pronouns that were used to describe each drawing. This paper reports on the accuracy rates for each pronoun and assumes that high accuracy rates indicate a more complete acquisition of the pronoun. Error forms produced by the children were also identified and examined. Data obtained were compared to acquisition sequences in the literature, and general implications related to the acquisition of personal pronouns among children in an ESL setting in Malaysia are discussed.

  19. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrate that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  20. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  1. Caregiving: It Takes a Village | NIH MedlinePlus the Magazine

    Science.gov (United States)


  2. Breathtaking: Managing a COPD Diagnosis | NIH MedlinePlus the Magazine

    Science.gov (United States)


  3. Understanding and preventing tick bites | NIH MedlinePlus the Magazine

    Science.gov (United States)


  4. The Future of Asthma Monitoring | NIH MedlinePlus the Magazine

    Science.gov (United States)


  5. Love and Life without Gluten | NIH MedlinePlus the Magazine

    Science.gov (United States)


  6. A Couple’s Caregiving Journey | NIH MedlinePlus the Magazine

    Science.gov (United States)


  7. Surgeon General Outlines Opioid Plan | NIH MedlinePlus the Magazine

    Science.gov (United States)


  8. Exploring the Celiac Disease Mystery | NIH MedlinePlus the Magazine

    Science.gov (United States)


  9. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  10. Annotating patents with Medline MeSH codes via citation mapping.

    Science.gov (United States)

    Griffin, Thomas D; Boyer, Stephen K; Councill, Isaac G

    2010-01-01

    Both patents and Medline are important document collections for discovering new relationships between chemicals and biology, searching for prior art for patent applications and retrieving background knowledge for current research activities. Finding relevance to a topic within patents is often made difficult by poor categorization, badly written descriptions, and even intentional obfuscation. Unlike patents, the Medline corpus has Medical Subject Heading (MeSH) keywords manually added to their articles, giving a medically relevant taxonomy to the 18 million article abstracts. Our work attempts to accurately recognize the citations made in patents to Medline-indexed articles, linking them to their corresponding PubMed ID and exploiting the associated MeSH to enhance patent search by annotating the referencing patents with their Medline citations' MeSH codes. The techniques, system features, and benefits are explained.

  11. Development and testing of a medline search filter for identifying patient and public involvement in health research.

    Science.gov (United States)

    Rogers, Morwenna; Bethel, Alison; Boddy, Kate

    2017-06-01

    Research involving the public as partners often proves difficult to locate due to the variations in terms used to describe public involvement, and inability of medical databases to index this concept effectively. To design a search filter to identify literature where patient and public involvement (PPI) was used in health research. A reference standard of 172 PPI papers was formed. The references were divided into a development set and a test set. Search terms were identified from common words, phrases and synonyms in the development set. These terms were combined as a search strategy for Medline via OvidSP, which was then tested for sensitivity against the test set. The resultant search filter was then assessed for sensitivity, specificity and precision using a previously published systematic review. The search filter was found to be highly sensitive (98.5%) in initial testing. When tested against results generated by a 'real-life' systematic review, the filter had a specificity of 81%. However, sensitivity dropped to 58%. Adjustments to the population group of terms increased the sensitivity to 73%. The PPI filter designed for Medline via OvidSP could aid information specialists and researchers trying to find literature specific to PPI. © 2016 Health Libraries Group.
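
    The three measures named in this abstract are standard ratios over a confusion matrix. The sketch below shows how they could be computed for a search filter; the function name is hypothetical and the counts are invented for illustration only, not taken from the study.

```python
def filter_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity and precision from raw counts.

    tp: relevant records the filter retrieved
    fp: irrelevant records the filter retrieved
    fn: relevant records the filter missed
    tn: irrelevant records the filter correctly excluded
    """
    def ratio(numerator, denominator):
        # Avoid dividing by zero when a category is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of relevant records found
        "specificity": ratio(tn, tn + fp),  # share of irrelevant records excluded
        "precision": ratio(tp, tp + fp),    # share of retrieved records that are relevant
    }


# Invented counts, for illustration only.
metrics = filter_metrics(tp=73, fp=19, fn=27, tn=81)
```

    Sensitivity and precision usually trade off against each other: broadening the search terms tends to find more relevant records but also retrieves more irrelevant ones.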

  12. Eye Wear: MedlinePlus Health Topic

    Science.gov (United States)

    ... When You Exercise (National Institute on Aging) - PDF Topic Image MedlinePlus Email Updates Get Eye Wear updates by email What's this? GO Related Health Topics Refractive Errors National Institutes of Health The primary ...

  13. Female Infertility: MedlinePlus Health Topic

    Science.gov (United States)

    ... Prolactin blood test (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Female Infertility updates ... Serum progesterone Show More Show Less Related Health Topics Assisted Reproductive Technology Infertility Male Infertility National Institutes ...

  14. Mobility Aids: MedlinePlus Health Topic

    Science.gov (United States)

    ... Mobility Problems (AGS Foundation for Health in Aging) Topic Image MedlinePlus Email Updates Get Mobility Aids updates ... standing and walking Using a cane Related Health Topics Assistive Devices Other Languages Find health information in ...

  15. Genetic Testing: MedlinePlus Health Topic

    Science.gov (United States)

    ... Your Family's Health (National Institutes of Health) - PDF Topic Image MedlinePlus Email Updates Get Genetic Testing updates ... testing and your cancer risk Karyotyping Related Health Topics Birth Defects Genetic Counseling Genetic Disorders Newborn Screening ...

  16. Folic Acid: MedlinePlus Health Topic

    Science.gov (United States)

    ... acid in diet (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Folic Acid updates ... acid - test Folic acid in diet Related Health Topics Vitamins National Institutes of Health The primary NIH ...

  17. Pneumococcal Infections: MedlinePlus Health Topic

    Science.gov (United States)

    ... Prevention, Immunization Action Coalition) - PDF Also in Spanish Topic Image MedlinePlus Email Updates Get Pneumococcal Infections updates ... ray Meningitis - pneumococcal Sputum gram stain Related Health Topics Meningitis Pneumonia Sepsis Sinusitis Streptococcal Infections National Institutes ...

  18. Wilms' Tumor: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Wilms tumor (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Wilms Tumor updates ... ENCYCLOPEDIA After chemotherapy - discharge Wilms tumor Related Health Topics Kidney Cancer National Institutes of Health The primary ...

  19. Child Safety: MedlinePlus Health Topic

    Science.gov (United States)

    ... injuries in children (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Child Safety updates ... safety Preventing head injuries in children Related Health Topics Infant and Newborn Care Internet Safety Motor Vehicle ...

  20. Pneumocystis Infections: MedlinePlus Health Topic

    Science.gov (United States)

    ... Pneumocystis jiroveci pneumonia (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Pneumocystis Infections updates ... GO MEDICAL ENCYCLOPEDIA Pneumocystis jiroveci pneumonia Related Health Topics HIV/AIDS HIV/AIDS and Infections Pneumonia National ...

  1. Collapsed Lung: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Pneumothorax - infants (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Collapsed Lung updates ... Lung surgery Pneumothorax - slideshow Pneumothorax - infants Related Health Topics Chest Injuries and Disorders Lung Diseases Pleural Disorders ...

  2. Male Infertility: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Testicular biopsy (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Male Infertility updates ... analysis Sperm release pathway Testicular biopsy Related Health Topics Assisted Reproductive Technology Female Infertility Infertility National Institutes ...

  3. Healthy Aging: MedlinePlus Health Topic

    Science.gov (United States)

    ... Aging National Institute on Aging Also in Spanish Topic Image MedlinePlus Email Updates Get Healthy Aging updates ... 65 Health screening - women - over 65 Related Health Topics Exercise for Seniors Nutrition for Seniors Seniors' Health ...

  4. Psoriatic Arthritis: MedlinePlus Health Topic

    Science.gov (United States)

    ... Handouts Psoriatic arthritis (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Psoriatic Arthritis updates ... this? GO MEDICAL ENCYCLOPEDIA Psoriatic arthritis Related Health Topics Arthritis Psoriasis National Institutes of Health The primary ...

  5. Hip Replacement: MedlinePlus Health Topic

    Science.gov (United States)

    ... invasive hip replacement (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Hip Replacement updates ... replacement - precautions Minimally invasive hip replacement Related Health Topics Hip Injuries and Disorders National Institutes of Health ...

  6. Platelet Disorders: MedlinePlus Health Topic

    Science.gov (United States)

    ... Thrombocytopenia - drug-induced (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Platelet Disorders updates ... Willebrand disease Show More Show Less Related Health Topics Bleeding Disorders Blood Clots Blood Count Tests Blood ...

  7. Cardiac Rehabilitation: MedlinePlus Health Topic

    Science.gov (United States)

    ... in Spanish Electrocardiogram (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Cardiac Rehabilitation updates ... How to take your pulse Pulse Related Health Topics Heart Attack Heart Diseases How to Prevent Heart ...

  8. Cardiac Arrest: MedlinePlus Health Topic

    Science.gov (United States)

    ... Handouts Cardiac arrest (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Cardiac Arrest updates ... this? GO MEDICAL ENCYCLOPEDIA Cardiac arrest Related Health Topics Arrhythmia CPR Pacemakers and Implantable Defibrillators National Institutes ...

  9. Kawasaki Disease: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Kawasaki disease (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Kawasaki Disease updates ... GO MEDICAL ENCYCLOPEDIA Electrocardiogram Kawasaki disease Related Health Topics Vasculitis National Institutes of Health The primary NIH ...

  10. Diabetic Diet: MedlinePlus Health Topic

    Science.gov (United States)

    ... Sweeteners - sugar substitutes (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Diabetic Diet updates ... you have diabetes Sweeteners - sugar substitutes Related Health Topics Blood Sugar Diabetes Diabetes in Children and Teens ...

  11. Infection Control: MedlinePlus Health Topic

    Science.gov (United States)

    ... Staph infections - hospital (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Infection Control updates ... infections when visiting Staph infections - hospital Related Health Topics Hepatitis HIV/AIDS MRSA National Institutes of Health ...

  12. Hearing Aids: MedlinePlus Health Topic

    Science.gov (United States)

    ... for hearing loss (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Hearing Aids updates ... MEDICAL ENCYCLOPEDIA Devices for hearing loss Related Health Topics Cochlear Implants Hearing Disorders and Deafness National Institutes ...

  13. Kidney Tests: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Total protein (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Kidney Tests updates ... hour volume Show More Show Less Related Health Topics Kidney Cancer Kidney Diseases National Institutes of Health ...

  14. Ischemic Stroke: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Thrombolytic therapy (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Ischemic Stroke updates ... cardiogenic embolism Stroke - slideshow Thrombolytic therapy Related Health Topics Hemorrhagic Stroke Stroke Stroke Rehabilitation National Institutes of ...

  15. Pulmonary Rehabilitation: MedlinePlus Health Topic

    Science.gov (United States)

    ... Handouts Postural drainage (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Pulmonary Rehabilitation updates ... this? GO MEDICAL ENCYCLOPEDIA Postural drainage Related Health Topics Lung Diseases National Institutes of Health The primary ...

  16. Back Cover: NIH MedlinePlus Salud

    Science.gov (United States)

    ... Bar Home Current Issue Past Issues NIH MedlinePlus Salud Past Issues / Winter 2009 Table of Contents For ... this page please turn Javascript on. ¡A su salud! Los Institutos Nacionales de la Salud (NIH, por ...

  17. MedlinePlus Connect: How it Works

    Science.gov (United States)

    ... Connect → How it Works URL of this page: https://medlineplus.gov/connect/howitworks.html MedlinePlus Connect: How ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  18. Antibiotic Resistance: MedlinePlus Health Topic

    Science.gov (United States)

    ... GO GO About MedlinePlus Site Map FAQs Customer Support Health Topics Drugs & Supplements Videos & Tools Español You Are Here: Home → Health Topics → Antibiotic Resistance URL of this page: https://medlineplus.gov/antibioticresistance. ...

  19. MeSHmap: a text mining tool for MEDLINE.

    OpenAIRE

    Srinivasan, P.

    2001-01-01

    Our research goal is to explore text mining from the metadata included in MEDLINE documents. We present MeSHmap, our prototype text mining system that exploits the MeSH indexing accompanying MEDLINE records. MeSHmap supports searches via PubMed followed by user-driven exploration of the MeSH terms and subheadings in the retrieved set. The potential of the system goes beyond text retrieval. It may also be used to compare entities of the same type such as pairs of drugs or pairs of procedures et...

  20. Heart Surgery: MedlinePlus Health Topic

    Science.gov (United States)

    ... Living With Related Issues Specifics See, Play and Learn Videos and Tutorials Research Clinical Trials Journal Articles Resources ... Also in Spanish Videos and Tutorials MedlinePlus: Surgery Videos ... strategy is better to reduce postoperative stroke... Article: Ferumoxytol-enhanced MR ...

  1. PubMed alternatives to search MEDLINE: An environmental scan

    Directory of Open Access Journals (Sweden)

    Arun Keepanasseril

    2014-01-01

    Full Text Available The prime objective of this article is to introduce the newer methods to access, search and process MEDLINE citations. It also aims to provide a brief overview of each service's salient features. A targeted search was conducted in MEDLINE through the OVID gateway. This was followed by a search in Google Scholar as well as Google and Bing. Ninety-two web-based services that can be used to search MEDLINE were identified. The list was shortened to 24 by applying a set of relevancy criteria to select those services more relevant to general medical and dental users. Salient features of the selected services are outlined and a use-case-based classification of the system has been proposed to help dental practitioners and researchers select the appropriate service for a given purpose.

  2. We’re All in This Together | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [4.3 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  3. Achoo! Cold, Flu, or Something Else? | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  4. Psoriasis: On the Road to Discovery | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.9 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  5. Nick Jonas on Type 1 Diabetes | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.9 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  6. Asthma: What You Need to Know | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  7. Battling C. Difficile: Don’t Delay | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  8. Palliative Care: A Spectrum of Support | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  9. From the lab - Progress Against Zika | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [4.3 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  10. Understanding Asthma from the Inside Out | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  11. Welcome to NIH MedlinePlus, the magazine

    Science.gov (United States)

    ... from the world's largest medical library, NIH's National Library of Medicine. MedlinePlus has extensive information from the NIH and other trusted sources on more than 700 diseases and conditions. There ...

  12. MedlinePlus Connect: Frequently Asked Questions (FAQs)

    Science.gov (United States)

    ... topic data in XML format. Using the Web service, software developers can build applications that utilize MedlinePlus health topic information. The service accepts keyword searches as requests and returns relevant ...

  13. PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews.

    Science.gov (United States)

    Katchamart, Wanruchada; Faulkner, Amy; Feldman, Brian; Tomlinson, George; Bombardier, Claire

    2011-07-01

    To compare the performance of Ovid-MEDLINE vs. PubMed for identifying randomized controlled trials of methotrexate (MTX) in patients with rheumatoid arthritis (RA). We created search strategies for Ovid-MEDLINE and PubMed for a systematic review of MTX in RA. Their performance was evaluated using sensitivity, precision, and number needed to read (NNR). Comparing searches in Ovid-MEDLINE vs. PubMed, PubMed retrieved more citations overall than Ovid-MEDLINE; however, of the 20 citations that met eligibility criteria for the review, Ovid-MEDLINE retrieved 17 and PubMed 18. The sensitivity was 85% for Ovid-MEDLINE vs. 90% for PubMed, whereas the precision and NNR were comparable (precision: 0.881% for Ovid-MEDLINE vs. 0.884% for PubMed and NNR: 114 for Ovid-MEDLINE vs. 113 for PubMed). In systematic reviews of RA, PubMed has higher sensitivity than Ovid-MEDLINE with comparable precision and NNR. This study highlights the importance of well-designed database-specific search strategies. Copyright © 2010 Elsevier Inc. All rights reserved.
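    The figures in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual definitions (sensitivity = relevant citations retrieved / all relevant citations; NNR = 1 / precision), which the abstract itself does not spell out.

```python
def sensitivity(retrieved_relevant: int, total_relevant: int) -> float:
    """Share of all eligible citations that the search retrieved."""
    return retrieved_relevant / total_relevant


def number_needed_to_read(precision: float) -> float:
    """Citations a reviewer must screen, on average, to find one eligible citation."""
    return 1.0 / precision


# Counts and precision values are taken from the abstract above.
TOTAL_ELIGIBLE = 20
ovid_sensitivity = sensitivity(17, TOTAL_ELIGIBLE)
pubmed_sensitivity = sensitivity(18, TOTAL_ELIGIBLE)

# Precision is reported as a percentage (0.881% and 0.884%), so convert to a fraction.
ovid_nnr = number_needed_to_read(0.881 / 100)
pubmed_nnr = number_needed_to_read(0.884 / 100)

print(f"Ovid-MEDLINE: sensitivity {ovid_sensitivity:.0%}, NNR {round(ovid_nnr)}")
print(f"PubMed:       sensitivity {pubmed_sensitivity:.0%}, NNR {round(pubmed_nnr)}")
```

    Under these assumptions the rounded values agree with the reported 85% vs. 90% sensitivity and NNR of 114 vs. 113.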

  14. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  15. Baby Health Checkup: MedlinePlus Health Topic

    Science.gov (United States)

    ... Know (Centers for Disease Control and Prevention) - PDF Topic Image MedlinePlus Email Updates Get Baby Health Checkup ... GO MEDICAL ENCYCLOPEDIA Well-child visits Related Health Topics Childhood Immunization Common Infant and Newborn Problems Infant ...

  16. Laser Eye Surgery: MedlinePlus Health Topic

    Science.gov (United States)

    ... corneal surgery - discharge (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Laser Eye Surgery ... surgery - what to ask your doctor Related Health Topics Refractive Errors National Institutes of Health The primary ...

  17. Child Mental Health: MedlinePlus Health Topic

    Science.gov (United States)

    ... events and children (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Child Mental Health ... in childhood Traumatic events and children Related Health Topics Bullying Child Behavior Disorders Mental Disorders Mental Health ...

  18. Bone Marrow Transplantation: MedlinePlus Health Topic

    Science.gov (United States)

    ... marrow transplant - discharge (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Bone Marrow Transplantation ... transplant - slideshow Graft-versus-host disease Related Health Topics Bone Marrow Diseases Stem Cells National Institutes of ...

  19. Nutrition for Seniors: MedlinePlus Health Topic

    Science.gov (United States)

    ... America) National Institute on Aging Also in Spanish Topic Image MedlinePlus Email Updates Get Nutrition for Seniors updates by email What's this? GO Related Health Topics Nutrition Seniors' Health National Institutes of Health The ...

  20. Blood Count Tests: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish WBC count (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Blood Count Tests ... WBC count Show More Show Less Related Health Topics Bleeding Disorders Blood Laboratory Tests National Institutes of ...

  1. Hormone Replacement Therapy: MedlinePlus Health Topic

    Science.gov (United States)

    ... of hormone therapy (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Hormone Replacement Therapy ... Estrogen overdose Types of hormone therapy Related Health Topics Menopause National Institutes of Health The primary NIH ...

  2. Overview of errors in the reference sequence and annotation of Mycobacterium tuberculosis H37Rv, and variation amongst its isolates

    KAUST Repository

    Köser, Claudio U.

    2012-06-01

    Since its publication in 1998, the genome sequence of the Mycobacterium tuberculosis H37Rv laboratory strain has acted as the cornerstone for the study of tuberculosis. In this review we address some of the practical aspects that have come to light relating to the use of H37Rv throughout the past decade which are of relevance for the ongoing genomic and laboratory studies of this pathogen. These include errors in the genome reference sequence and its annotation, as well as the recently detected variation amongst isolates of H37Rv from different laboratories. © 2011 Elsevier B.V.

  3. To Your Health: NLM Update—MedlinePlus

    Science.gov (United States)

    ... illustrate health care May 7 2018 Transcript Genetic architecture of mental disorders April 30 2018 Transcript Why ... MedlinePlus Connect for EHRs For Developers U.S. National Library of Medicine 8600 Rockville Pike, Bethesda, MD 20894 ...

  4. Treating Cataracts | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... Claudine Klose, 63, lives on a farm in New York's Hudson Valley. She had successful cataract surgery in 2013 and shared her experience recently with NIH MedlinePlus magazine. What did you notice about your vision that ...

  5. Traumatic Brain Injury: MedlinePlus Health Topic

    Science.gov (United States)

    ... injury - discharge (Medical Encyclopedia) Also in Spanish Chronic subdural hematoma (Medical Encyclopedia) Also in Spanish EEG (Medical Encyclopedia) ... Intracranial pressure monitoring (Medical Encyclopedia) Also in Spanish Subdural hematoma (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus ...

  6. Preventing Pregnancy with a Gel for Men? | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  7. Sickle Cell Disease: What You Should Know | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  8. A Journey with Mid-life Hearing Loss | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  9. Solving the Undiagnosed Disease Puzzle at NIH | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  10. From the lab - Testing Malaria-Resistant Mosquitoes | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  11. Step Inside NIH’s Sickle Cell Branch | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  12. A Closer Look at Cancer Imaging Tools | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  13. Beyond Pain Relief: Total Knee Replacement Surgery | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  14. Keys to Recovery after Knee Replacement Surgery | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  15. Fighting the Flu with a Universal Vaccine | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  16. Routine Whole-Genome Sequencing for Outbreak Investigations of Staphylococcus aureus in a National Reference Center

    Directory of Open Access Journals (Sweden)

    Geraldine Durand

    2018-03-01

    Full Text Available The French National Reference Center for Staphylococci currently uses DNA arrays and spa typing for the initial epidemiological characterization of Staphylococcus aureus strains. We here describe the use of whole-genome sequencing (WGS) to investigate retrospectively four distinct and virulent S. aureus lineages [clonal complexes (CCs): CC1, CC5, CC8, CC30] involved in hospital and community outbreaks or sporadic infections in France. We used a WGS bioinformatics pipeline based on de novo assembly (reference-free approach), single nucleotide polymorphism analysis, and the inclusion of epidemiological markers. We examined the phylogeographic diversity of the French dominant hospital-acquired CC8-MRSA (methicillin-resistant S. aureus) Lyon clone through WGS analysis, which did not demonstrate evidence of large-scale geographic clustering. We analyzed sporadic cases along with two outbreaks of a CC1-MSSA (methicillin-susceptible S. aureus) clone containing the Panton–Valentine leukocidin (PVL), and results showed that two sporadic cases were closely related. We investigated an outbreak of PVL-positive CC30-MSSA in a school environment and were able to reconstruct the transmission history between eight families. We explored different outbreaks among newborns due to the CC5-MRSA Geraldine clone and we found evidence of an unsuspected link between two otherwise distinct outbreaks. Here, WGS provides the resolving power to disprove transmission events indicated by conventional methods (same sequence type, spa type, toxin profile, and antibiotic resistance profile) and, most importantly, WGS can reveal unsuspected transmission events. Therefore, WGS allows outbreaks and (inter)national dissemination of S. aureus lineages to be better described and understood. Our findings underscore the importance of adding WGS for (inter)national surveillance of infections caused by virulent clones of S. aureus but also substantiate the fact that technological optimization at

  17. Automatically identifying gene/protein terms in MEDLINE abstracts.

    Science.gov (United States)

    Yu, Hong; Hatzivassiloglou, Vasileios; Rzhetsky, Andrey; Wilbur, W John

    2002-01-01

    Natural language processing (NLP) techniques are used to extract information automatically from computer-readable literature. In biology, the identification of terms corresponding to biological substances (e.g., genes and proteins) is a necessary step that precedes the application of other NLP systems that extract biological information (e.g., protein-protein interactions, gene regulation events, and biochemical pathways). We have developed GPmarkup (for "gene/protein-full name mark up"), a software system that automatically identifies gene/protein terms (i.e., symbols or full names) in MEDLINE abstracts. As part of the markup process, we also automatically generated a knowledge source of paired gene/protein symbols and full names (e.g., LARD for lymphocyte associated receptor of death) from MEDLINE. We found that many of the pairs in our knowledge source do not appear in the current GenBank database. Therefore, our methods may also be used for automatic lexicon generation. GPmarkup has 73% recall and 93% precision in identifying and marking up gene/protein terms in MEDLINE abstracts. A random sample of gene/protein symbols and full names and a sample set of marked up abstracts can be viewed at http://www.cpmc.columbia.edu/homepages/yuh9001/GPmarkup/. Contact: hy52@columbia.edu. Voice: 212-939-7028; fax: 212-666-0140.

  18. La producción científica española en bioética a través de MEDLINE Scientific production in bioethics in Spain through MEDLINE

    Directory of Open Access Journals (Sweden)

    José Manuel Ramos

    2007-10-01

    Full Text Available Objective: To describe Spain's scientific production in the field of bioethics from 1966 to 2003. Methods: Manuscripts published by Spanish authors between 1966 and 2003 and containing key word references to bioethics, ethics, and 22 other related terms were retrieved from the MEDLINE database. Results: 858 documents were selected: 78 (9.1%) were published between 1966 and 1983, 163 (19%) between 1984 and 1993, and 617 (71.9%) between 1994 and 2003. The main subject areas treated were laws and rights (15.4%) and research and ethics committees (13.1%). The last of these periods witnessed a significant increase in publications on genetics and cloning and a decrease in those treating abortion. Institutional affiliations referred mainly to universities (38.9%) and hospitals (38.5%). Conclusions: There was a progressive increase in the number of scientific publications on bioethics by Spanish authors during the study period.

  19. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes [v2; ref status: indexed, http://f1000r.es/2x3]

    Directory of Open Access Journals (Sweden)

    Ted Kalbfleisch

    2014-02-01

    Full Text Available Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species' reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site, and 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1 with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping could be used for comparative genomics, disease association studies, and ultimately to understand

  20. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
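    The density figures in this abstract are mutually consistent, which a short calculation can confirm. This is a sketch under one assumption that the abstract does not state outright: that the genome-wide density was obtained by dividing the total BAC contig length by the total number of mapped polymorphisms. The variable names are illustrative.

```python
# Values quoted in the abstract above.
total_contig_bp = 495_833_423
total_polymorphisms = 171_792
euchromatin_polymorphisms = 30_930      # reported density: 1 per 3,565 bp
heterochromatin_polymorphisms = 140_862  # reported density: 1 per 2,737 bp

# Assumed definition: base pairs of contig per mapped polymorphism.
bp_per_polymorphism = total_contig_bp / total_polymorphisms

# Back out the region lengths implied by the two per-region densities.
implied_euchromatin_bp = euchromatin_polymorphisms * 3_565
implied_heterochromatin_bp = heterochromatin_polymorphisms * 2_737
implied_total_bp = implied_euchromatin_bp + implied_heterochromatin_bp

print(f"Genome-wide: 1 polymorphism per {bp_per_polymorphism:,.0f} bp")
print(f"Implied total length: {implied_total_bp:,} bp")
```

    The two region counts sum to the reported total, and the region lengths implied by the per-region densities add up to within rounding of the stated contig length.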

  1. The Reference Scenarios for the Swiss Emergency Planning

    International Nuclear Information System (INIS)

    Hanspeter Isaak; Navert, Stephan B.; Ralph Schulz

    2006-01-01

    For the purpose of emergency planning and preparedness, realistic reference scenarios and corresponding accident source terms have been defined on the basis of common plant features. Three types of representative reference scenarios encompass the accident sequences expected to be the most probable. Accident source terms are assumed to be identical for all Swiss nuclear power plants, although the plants differ in reactor type and power. Plant-specific probabilistic safety analyses were used to justify the reference scenarios and the postulated accident source terms. From the full spectrum of release categories available, those categories were selected which would be covered by the releases and time frames assumed in the reference scenarios. For each nuclear power plant, the cumulative frequency of accident sequences not covered by the reference scenarios was determined. It was found that the cumulative frequency for such accident sequences does not exceed about 1 × 10⁻⁶ per year. The Swiss Federal Nuclear Safety Inspectorate concludes that the postulated accident source terms for the reference scenarios are consistent with the current international approach in emergency planning, where one should concentrate on the most probable accident sequences. (N.C.)

  2. From the lab - Can Potassium Help Your Heart? | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  3. Joint Replacement Surgery: What you Need to Know | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [1.5 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  4. Too ‘Stubborn’ to Give in to COPD | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [2.68 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  5. Cholesterol: The Good, the Bad, and the Unhealthy | NIH MedlinePlus the Magazine

    Science.gov (United States)

    Skip to main content NIH MedlinePlus the Magazine NIH MedlinePlus Salud Download the Current Issue PDF [3.1 mb] Trusted Health Information from the National Institutes of Health Home Current Issue ...

  6. Confronting 9/11 Trauma from Childhood into Adulthood | NIH MedlinePlus the Magazine

    Science.gov (United States)

  7. A Path to Hope for Sickle Cell Disease | NIH MedlinePlus the Magazine

    Science.gov (United States)

  8. Advance Care Plan: A Checklist for the Future | NIH MedlinePlus the Magazine

    Science.gov (United States)

  9. Quark enables semi-reference-based compression of RNA-seq data.

    Science.gov (United States)

    Sarkar, Hirak; Patro, Rob

    2017-11-01

    The past decade has seen an exponential increase in biological sequencing capacity, and there has been a simultaneous effort to help organize and archive some of the vast quantities of sequencing data that are being generated. Although these developments are tremendous from the perspective of maximizing the scientific utility of available data, they come with heavy costs. The storage and transmission of such vast amounts of sequencing data is expensive. We present Quark, a semi-reference-based compression tool designed for RNA-seq data. Quark makes use of a reference sequence when encoding reads, but produces a representation that can be decoded independently, without the need for a reference. This allows Quark to achieve markedly better compression rates than existing reference-free schemes, while still relieving the burden of assuming a specific, shared reference sequence between the encoder and decoder. We demonstrate that Quark achieves state-of-the-art compression rates, and that, typically, only a small fraction of the reference sequence must be encoded along with the reads to allow reference-free decompression. Quark is implemented in C++11 and is available under a GPLv3 license at www.github.com/COMBINE-lab/quark. Contact: rob.patro@cs.stonybrook.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  10. Mentoring in Medicine | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... front is Andrew Morrison, MIM vice president of marketing. Giving Students a Vision of Healthcare Careers Research ... federal tax purposes. Web site: www.fnlm.org Mobile MedlinePlus! Trusted medical information on your mobile phone. ...

  11. Sexual Problems in Men: MedlinePlus Health Topic

    Science.gov (United States)

    ... Spanish Retrograde ejaculation (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Sexual Problems in ... Premature ejaculation Reifenstein syndrome Retrograde ejaculation Related Health Topics Erectile Dysfunction Penis Disorders Prostate Diseases Testicular Disorders ...

  12. Cancer--Living with Cancer: MedlinePlus Health Topic

    Science.gov (United States)

    ... during cancer treatment (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get Cancer--Living with ... care plan Show More Show Less Related Health Topics Cancer Cancer Chemotherapy Palliative Care National Institutes of ...

  13. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  14. Do language fluency and other socioeconomic factors influence the use of PubMed and MedlinePlus?

    Science.gov (United States)

    Sheets, L; Gavino, A; Callaghan, F; Fontelo, P

    2013-01-01

    Increased usage of MedlinePlus by Spanish-speakers was observed after introduction of MedlinePlus in Spanish. This probably reflects increased usage of MEDLINE and PubMed by those with greater fluency in the language in which it is presented; but this has never been demonstrated in English speakers. Evidence that lack of English fluency deters international healthcare personnel from using PubMed could support the use of multi-language search tools like Babel-MeSH. This study aims to measure the effects of language fluency and other socioeconomic factors on PubMed MEDLINE and MedlinePlus access by international users. We retrospectively reviewed server pageviews of PubMed and MedlinePlus from various periods of time, and analyzed them against country statistics on language fluency, GDP, literacy rate, Internet usage, medical schools, and physicians per capita, to determine whether they were associated. We found fluency in English to be positively associated with pageviews of PubMed and MedlinePlus in countries with high literacy rates. Spanish was generally found to be positively associated with pageviews of MedlinePlus en Español. The other parameters also showed varying degrees of association with pageviews. After adjusting for the other factors investigated in this study, language fluency was a consistently significant predictor of the use of PubMed, MedlinePlus English and MedlinePlus en Español. This study may support the need for multi-language search tools and may increase access of health information resources from non-English speaking countries.

  15. MedlinePlus Connect: Linking Patient Portals and Electronic Health Records to Health Information

    Science.gov (United States)

    ... Here: Home → MedlinePlus Connect URL of this page: https://medlineplus.gov/connect/overview.html MedlinePlus Connect Linking ... will change.) Old URLs New URLs Web Application https://apps.nlm.nih.gov/medlineplus/services/mpconnect.cfm? ...

  16. Book Catalogs; Selected References.

    Science.gov (United States)

    Brandhorst, Wesley T.

    The 116 citations on book catalogs are divided into the following two main sections: (1) Selected References, in alphabetic sequence by personal or institutional author and (2) Anonymous Entries, in alphabetic sequence by title. One hundred and seven of the citations cover the years 1960 through March 1969. There are five scattered citations in…

  17. New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy

    2013-03-01

    Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology presents "New approaches and technologies to sequence de novo plant reference genomes" at the 8th Annual Genomics of Energy and Environment Meeting on March 27, 2013 in Walnut Creek, CA.

  18. Welcome to MedlinePlus en español

    Science.gov (United States)

    ... para el público de la Biblioteca Nacional de Medicina, la biblioteca médica más grande del mundo. Los ... B. Lindberg, M.D. Director, Biblioteca Nacional de Medicina MedlinePlus.gov/salud Spring 2007 Issue: Volume 2 ...

  19. Evaluating the MEDLINE Core Clinical Journals filter: data-driven evidence assessing clinical utility.

    Science.gov (United States)

    Klein-Fedyshin, Michele; Ketchum, Andrea M; Arnold, Robert M; Fedyshin, Peter J

    2014-12-01

    MEDLINE offers the Core Clinical Journals filter to limit to clinically useful journals. To determine its effectiveness for searching and patient-centric decision making, this study compared literature used for Morning Report in Internal Medicine with journals in the filter. An EndNote library with references answering 327 patient-related questions during Morning Report from 2007 to 2012 was exported to a file listing variables including designated Core Clinical Journal, Impact Factor, date used and medical subject. Bradford's law of scattering was applied, ranking the journals and reflecting their clinical utility. Recall (sensitivity) and precision of the Core Morning Report journals and non-Core set were calculated. This study applied bibliometrics to compare the 628 articles used against these criteria to determine journals impacting decision making. Analysis shows 30% of clinically used articles are from the Core Clinical Journals filter and 16% of the journals represented are Core titles. When Bradford-ranked, 55% of the top 20 journals are Core. Articles sources used. Among the 63 Morning Report subjects, 55 have <50% precision and 41 have <50% recall, including 37 subjects with 0% precision and 0% recall. Low usage of publications within the Core Clinical Journals filter indicates less relevance for hospital-based care. The divergence from high-impact medicine titles suggests clinically valuable journals differ from academically important titles. With few subjects demonstrating high recall or precision, the MEDLINE Core Clinical Journals filter may require a review and update to better align with current clinical needs. © 2014 John Wiley & Sons, Ltd.

  20. Web-Scale Discovery Services Retrieve Relevant Results in Health Sciences Topics Including MEDLINE Content

    Directory of Open Access Journals (Sweden)

    Elizabeth Margaret Stovold

    2017-06-01

    coverage of MEDLINE, they recorded the first 50 results from each of the 6 PubMed searches in a spreadsheet. During data collection at the WSD sites, they searched for these references to discover if the WSD tool at each site indexed these known items. Authors adopted measures to control for any customisation of the product setup at each data collection site. In particular, they excluded local holdings from the results by limiting the searches to scholarly, peer-reviewed articles. Main results – Authors reported results for 5 of the 6 sites. All of the WSD tools retrieved between 50-60% relevant results. EDS retrieved the highest number of relevant records (195/360 and 216/360), while Primo retrieved the lowest (167/328 and 169/325). There was good observer agreement (k=0.725) for the relevance assessment. The duplicate detection rate was similar in EDS and Summon (between 96-97% unique articles), while the Primo searches returned 82.9-84.9% unique articles. All three tools retrieved relevant results that were not indexed in MEDLINE, and retrieved relevant material indexed in MEDLINE that was not retrieved in the PubMed searches. EDS and Summon retrieved more non-MEDLINE material than Primo. EDS performed best in the known-item searches, with 300/300 and 299/300 items retrieved, while Primo performed worst with 230/300 and 267/300 items retrieved. The Summon platform features an “automated query expansion” search function, where user-entered keywords are matched to related search terms and these are automatically searched along with the original keyword. The authors observed that this function resulted in a wholly relevant first page of results for one of the search questions tested in Summon. Conclusion – While EDS performed slightly better overall, the difference was not great enough in this small sample of test sites to recommend EDS over the other tools being tested. The automated query expansion found in Summon is a useful function that is worthy of further

  1. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    Science.gov (United States)

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, the Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype BLASTed query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, BLAST score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses.

  2. From the lab - Exercise Key to Keeping Weight Off | NIH MedlinePlus the Magazine

    Science.gov (United States)

  3. TV Star Jim Parsons Shines Light on NIH Research | NIH MedlinePlus the Magazine

    Science.gov (United States)

  4. Use of the critical incident technique to evaluate the impact of MEDLINE

    International Nuclear Information System (INIS)

    Wilson, S.R.; Starr-Schneidkraut, N.; Cooper, M.D.

    1989-01-01

    The NLM has an ongoing responsibility to assess the extent to which its information products and services support the requirements of its users. This enables the Library to craft ever more responsive systems that capitalize on the latest advances in information and computer technology and, when necessary, to modify existing systems whose performance may no longer be optimal or consistent with the functions intended. The importance of this requirement was underscored in the recent report of the Outreach Planning Panel to the NLM Board of Regents. A fundamental concern is the need to identify the impact of MEDLINE-derived information, i.e., does the use of MEDLINE "make a difference"? In what ways is it used, and with what effect? In particular, is information retrieved from MEDLINE used successfully by health professionals to support medical decision-making and patient care? Previous efforts to address this question have been limited to the collection of available anecdotal reports. Traditional survey methodology, with pre-defined response categories, while used effectively to determine general areas in which MEDLINE information is used, is not well suited to developing a detailed understanding of user motivation, behavior, and resulting consequences. 8 refs., 6 figs., 2 tabs.

  5. High school peer tutors teach MedlinePlus: a model for Hispanic outreach

    Science.gov (United States)

    Warner, Debra G.; Olney, Cynthia A.; Wood, Fred B.; Hansen, Lucille; Bowden, Virginia M.

    2005-01-01

    Objectives: The objective was to introduce the MedlinePlus Website to the predominantly Hispanic residents of the Lower Rio Grande Valley region of Texas by partnering with a health professions magnet high school (known as Med High). Methods: Community assessment was used in the planning stages and included pre-project focus groups with students and teachers. Outreach methods included peer tutor selection, train-the-trainer sessions, school and community outreach, and pre- and posttests of MedlinePlus training sessions. Evaluation methods included Web statistics; end-of-project interviews; focus groups with students, faculty, and librarians; and end-of-project surveys of students and faculty. Results: Four peer tutors reached more than 2,000 people during the project year. Students and faculty found MedlinePlus to be a useful resource. Faculty and librarians developed new or revised teaching methods incorporating MedlinePlus. The project enhanced the role of school librarians as agents of change at Med High. The project continues on a self-sustaining basis. Conclusions: Using peer tutors is an effective way to educate high school students about health information resources and, through the students, to reach families and community members. PMID:15858628

  6. The MedlinePlus public user interface: studies of design challenges and opportunities

    Science.gov (United States)

    Marill, Jennifer L.; Miller, Naomi; Kitendaugh, Paula

    2006-01-01

    Question: What are the challenges involved in designing, modifying, and improving a major health information portal that serves over sixty million page views a month? Setting: MedlinePlus, the National Library of Medicine's (NLM's) consumer health Website, is examined. Method: Challenges are presented as six “studies,” which describe selected design issues and how NLM staff resolved them. Main Result: Improving MedlinePlus is an iterative process. Changes in the public user interface are ongoing, reflecting Web design trends, usability testing recommendations, user survey results, new technical requirements, and the need to grow the site in an orderly way. Conclusion: Testing and analysis should accompany Website design modifications. New technologies may enhance a site but also introduce problems. Further modifications to MedlinePlus will be informed by the experiences described here. PMID:16404467

  7. Analyses of Tissue Culture Adaptation of Human Herpesvirus-6A by Whole Genome Deep Sequencing Redefines the Reference Sequence and Identifies Virus Entry Complex Changes.

    Science.gov (United States)

    Tweedy, Joshua G; Escriva, Eric; Topf, Maya; Gompels, Ursula A

    2017-12-31

    Tissue-culture adaptation of viruses can modulate infection. Laboratory passage and bacterial artificial chromosome (BAC)mid cloning of human cytomegalovirus, HCMV, resulted in genomic deletions and rearrangements altering genes encoding the virus entry complex, which affected cellular tropism, virulence, and vaccine development. Here, we analyse these effects on the reference genome for related betaherpesviruses, Roseolovirus, human herpesvirus 6A (HHV-6A) strain U1102. This virus is also naturally "cloned" by germline subtelomeric chromosomal-integration in approximately 1% of human populations, and accurate references are key to understanding pathological relationships between exogenous and endogenous virus. Using whole genome next-generation deep-sequencing Illumina-based methods, we compared the original isolate to tissue-culture passaged and the BACmid-cloned virus. This re-defined the reference genome showing 32 corrections and 5 polymorphisms. Furthermore, minor variant analyses of passaged and BACmid virus identified emerging populations of a further 32 single nucleotide polymorphisms (SNPs) in 10 loci, half non-synonymous indicating cell-culture selection. Analyses of the BAC-virus genome showed deletion of the BAC cassette via loxP recombination removing green fluorescent protein (GFP)-based selection. As shown for HCMV culture effects, select HHV-6A SNPs mapped to genes encoding mediators of virus cellular entry, including virus envelope glycoprotein genes gB and the gH/gL complex. Comparative models suggest stabilisation of the post-fusion conformation. These SNPs are essential to consider in vaccine-design, antimicrobial-resistance, and pathogenesis.

  8. Kids Create Healthy Comics | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... School Students Using Medline Plus Kids Create Healthy Comics Past Issues / Fall 2015 Table of Contents Fresh, ... use of reliable health information resources." The Four Comic Books Are: The Expert Investigator explores the impact ...

  9. How to Prevent Heart Disease: MedlinePlus Health Topic

    Science.gov (United States)

    ... and your heart (Medical Encyclopedia) Also in Spanish Topic Image MedlinePlus Email Updates Get How to Prevent ... your heart Stress and your heart Related Health Topics Blood Thinners Cholesterol Heart Diseases Heart Health Tests ...

  10. Tetanus, Diphtheria, and Pertussis Vaccines: MedlinePlus Health Topic

    Science.gov (United States)

    ... Know (Centers for Disease Control and Prevention) - PDF Topic Image MedlinePlus Email Updates Get Tetanus, Diphtheria, and ... updates by email What's this? GO Related Health Topics Childhood Immunization Diphtheria Immunization Tetanus Whooping Cough National ...

  11. High-performance information search filters for CKD content in PubMed, Ovid MEDLINE, and EMBASE.

    Science.gov (United States)

    Iansavichus, Arthur V; Hildebrand, Ainslie M; Haynes, R Brian; Wilczynski, Nancy L; Levin, Adeera; Hemmelgarn, Brenda R; Tu, Karen; Nesrallah, Gihad E; Nash, Danielle M; Garg, Amit X

    2015-01-01

    Finding relevant articles in large bibliographic databases such as PubMed, Ovid MEDLINE, and EMBASE to inform care and future research is challenging. Articles relevant to chronic kidney disease (CKD) are particularly difficult to find because they are often published under different terminology and are found across a wide range of journal types. We used computer automation within a diagnostic test assessment framework to develop and validate information search filters to identify CKD articles in large bibliographic databases. 22,992 full-text articles in PubMed, Ovid MEDLINE, or EMBASE. 1,374,148 unique search filters. We established the reference standard of article relevance to CKD by manual review of all full-text articles using prespecified criteria to determine whether each article contained CKD content or not. We then assessed filter performance by calculating sensitivity, specificity, and positive predictive value for the retrieval of CKD articles. Filters with high sensitivity and specificity for the identification of CKD articles in the development phase (two-thirds of the sample) were then retested in the validation phase (remaining one-third of the sample). We developed and validated high-performance CKD search filters for each bibliographic database. Filters optimized for sensitivity reached at least 99% sensitivity, and filters optimized for specificity reached at least 97% specificity. The filters were complex; for example, one PubMed filter included more than 89 terms used in combination, including "chronic kidney disease," "renal insufficiency," and "renal fibrosis." In proof-of-concept searches, physicians found more articles relevant to the topic of CKD with the use of these filters. As knowledge of the pathogenesis of CKD grows and definitions change, these filters will need to be updated to incorporate new terminology used to index relevant articles. PubMed, Ovid MEDLINE, and EMBASE can be filtered reliably for articles relevant to CKD. 
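    The three performance measures named in the abstract above (sensitivity, specificity, and positive predictive value) can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only for illustration; they are not taken from the study, and the class and function names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class FilterCounts:
    """Confusion-matrix counts for one search filter against a reference standard."""
    tp: int  # relevant articles retrieved by the filter
    fp: int  # non-relevant articles retrieved by the filter
    fn: int  # relevant articles missed by the filter
    tn: int  # non-relevant articles correctly not retrieved


def sensitivity(c: FilterCounts) -> float:
    # Share of relevant articles that the filter retrieves.
    return c.tp / (c.tp + c.fn)


def specificity(c: FilterCounts) -> float:
    # Share of non-relevant articles that the filter excludes.
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: FilterCounts) -> float:
    # Share of retrieved articles that are relevant (precision).
    return c.tp / (c.tp + c.fp)


# Hypothetical counts, not from the study.
example = FilterCounts(tp=990, fp=300, fn=10, tn=9700)
print(f"sensitivity = {sensitivity(example):.3f}")
print(f"specificity = {specificity(example):.3f}")
print(f"PPV         = {positive_predictive_value(example):.3f}")
```

    The sketch does not guard against empty denominators; a filter that retrieves nothing would need separate handling.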

  12. Genetics Home Reference: Emanuel syndrome

    Science.gov (United States)

    ... of Emanuel syndrome include an unusually small head ( microcephaly ), distinctive facial features, and a small lower jaw ( ... MedlinePlus Encyclopedia: Cleft Lip and Palate MedlinePlus Encyclopedia: Microcephaly MedlinePlus Encyclopedia: Preauricular Tag or Pit General Information ...

  13. Genetics Home Reference: Feingold syndrome

    Science.gov (United States)

    ... Feingold syndrome include an unusually small head size ( microcephaly ), a small jaw ( micrognathia ), a narrow opening of ... Duodenal Atresia MedlinePlus Encyclopedia: Esophageal Atresia MedlinePlus Encyclopedia: Microcephaly MedlinePlus Encyclopedia: Webbing of the Fingers or Toes ...

  14. Psoriasis Doesn't Slow Down Texan Brian LaFoy | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... willing to do. I'm one of the lucky ones." Find Out More National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) MedlinePlus-Psoriasis Clinical Trials Search Psoriasis Spring 2017 Issue: Volume 12 Number 1 Page 22 MedlinePlus Subscribe Magazine Information Contact ...

  15. From Bench to Bedside: Researchers of NIH’s Clinical Center | NIH MedlinePlus the Magazine

    Science.gov (United States)

  16. Molecular Identification of Unusual Pathogenic Yeast Isolates by Large Ribosomal Subunit Gene Sequencing: 2 Years of Experience at the United Kingdom Mycology Reference Laboratory

    Science.gov (United States)

    Linton, Christopher J.; Borman, Andrew M.; Cheung, Grace; Holmes, Ann D.; Szekely, Adrien; Palmer, Michael D.; Bridge, Paul D.; Campbell, Colin K.; Johnson, Elizabeth M.

    2007-01-01

    Rapid identification of yeast isolates from clinical samples is particularly important given their innately variable antifungal susceptibility profiles. We present here an analysis of the utility of PCR amplification and sequence analysis of the hypervariable D1/D2 region of the 26S rRNA gene for the identification of yeast species submitted to the United Kingdom Mycology Reference Laboratory over a 2-year period. A total of 3,033 clinical isolates were received from 2004 to 2006 encompassing 50 different yeast species. While more than 90% of the isolates, corresponding to the most common Candida species, could be identified by using the AUXACOLOR2 yeast identification kit, 153 isolates (5%), comprised of 47 species, could not be identified by using this system and were subjected to molecular identification via 26S rRNA gene sequencing. These isolates included some common species that exhibited atypical biochemical and phenotypic profiles and also many rarer yeast species that are infrequently encountered in the clinical setting. All 47 species requiring molecular identification were unambiguously identified on the basis of D1/D2 sequences, and the molecular identities correlated well with the observed biochemical profiles of the various organisms. Together, our data underscore the utility of molecular techniques as a reference adjunct to conventional methods of yeast identification. Further, we show that PCR amplification and sequencing of the D1/D2 region reliably identifies more than 45 species of clinically significant yeasts and can also potentially identify new pathogenic yeast species. PMID:17251397

  17. Married...with Food Allergies | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... Feature: Food Allergies Married...with Food Allergies Past Issues / Spring 2011 Table of Contents Photo: ... life together and a common problem—severe food allergies. NIH MedlinePlus magazine’s Naomi Miller caught up with ...

  18. Intelligent system for topic survey in MEDLINE by keyword recommendation and learning text characteristics.

    Science.gov (United States)

    Tanaka, M; Nakazono, S; Matsuno, H; Tsujimoto, H; Kitamura, Y; Miyano, S

    2000-01-01

    We have implemented a system for assisting experts in selecting MEDLINE records for database construction purposes. This system has two specific features: The first is a learning mechanism which extracts characteristics in the abstracts of MEDLINE records of interest as patterns. These patterns reflect selection decisions by experts and are used for screening the records. The second is a keyword recommendation system which assists and supplements experts' knowledge in unexpected cases. Combined with a conventional keyword-based information retrieval system, this system may provide an efficient and comfortable environment for MEDLINE record selection by experts. Some computational experiments are provided to prove that this idea is useful.

  19. Setting reference targets

    International Nuclear Information System (INIS)

    Ruland, R.E.

    1997-04-01

    Reference targets are used to represent virtual quantities like the magnetic axis of a magnet or the definition of a coordinate system. To explain the function of reference targets in the sequence of the alignment process, this paper will first briefly discuss the geometry of the trajectory design space and of the surveying space, then continue with an overview of a typical alignment process. This is followed by a discussion on magnet fiducialization. While the magnetic measurement methods to determine the magnetic centerline are only listed (they will be discussed in detail in a subsequent talk), emphasis is given to the optical/mechanical methods and to the task of transferring the centerline position to reference targets.

  20. PubFocus: semantic MEDLINE/PubMed citations analytics through integration of controlled biomedical dictionaries and ranking algorithm

    Directory of Open Access Journals (Sweden)

    Chuong Cheng-Ming

    2006-10-01

    Background: Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. Results: The PubFocus web server automates analysis of MEDLINE/PubMed search queries by enriching them with two widely used human factor-based bibliometric indicators of publication quality: journal impact factor and volume of forward references. In addition to providing basic volumetric statistics, PubFocus also prioritizes citations and evaluates authors' impact on the field of search. PubFocus also analyses presence and occurrence of biomedical key terms within citations by utilizing controlled vocabularies. Conclusion: We have developed a citation prioritisation algorithm based on journal impact factor, forward referencing volume, referencing dynamics, and author's contribution level. It can be applied either to the primary set of PubMed search results or to the subsets of these results identified through key terms from controlled biomedical vocabularies and ontologies. NCI (National Cancer Institute) thesaurus and MGD (Mouse Genome Database) mammalian gene orthology have been implemented for key terms analytics. PubFocus provides a scalable platform for the integration of multiple available ontology databases. PubFocus analytics can be adapted for input sources of biomedical citations other than PubMed.

  1. Temporal Reference, Attentional Modulation, and Crossmodal Assimilation

    Directory of Open Access Journals (Sweden)

    Yingqi Wan

    2018-06-01

    Crossmodal assimilation effect refers to the prominent phenomenon by which the ensemble mean extracted from a sequence of task-irrelevant distractor events, such as auditory intervals, assimilates/biases the perception (such as visual interval) of the subsequent task-relevant target events in another sensory modality. In the current experiments, using the visual Ternus display, we examined the roles of temporal reference, materialized as the time information accumulated before the onset of the target event, as well as the attentional modulation in crossmodal temporal interaction. Specifically, we examined how the global time interval, the mean auditory inter-intervals and the last interval in the auditory sequence assimilate and bias the subsequent percept of visual Ternus motion (element motion vs. group motion). We demonstrated that both the ensemble (geometric) mean and the last interval in the auditory sequence contribute to bias the percept of visual motion. A longer mean (or last) interval elicited more reports of group motion, whereas shorter mean (or last) auditory intervals gave rise to a more dominant percept of element motion. Importantly, observers have shown dynamic adaptation to the temporal reference of crossmodal assimilation: when the target visual Ternus stimuli were separated by a long gap interval after the preceding sound sequence, the assimilation effect by ensemble mean was reduced. Our findings suggested that crossmodal assimilation relies on a suitable temporal reference on adaptation level, and revealed a general temporal perceptual grouping principle underlying complex audio-visual interactions in everyday dynamic situations.
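    As a small illustration of the two temporal summaries the abstract contrasts, the ensemble (geometric) mean and the last interval of a sequence, the sketch below computes both for a made-up interval sequence. The interval values and function names are hypothetical and are not taken from the experiments.

```python
import math


def geometric_mean(intervals_ms):
    """Geometric mean of positive intervals, computed in log space for stability."""
    if not intervals_ms or any(x <= 0 for x in intervals_ms):
        raise ValueError("intervals must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(x) for x in intervals_ms) / len(intervals_ms))


def arithmetic_mean(intervals_ms):
    """Ordinary mean, shown only for comparison with the geometric mean."""
    return sum(intervals_ms) / len(intervals_ms)


# Hypothetical auditory inter-onset intervals in milliseconds (not from the study).
sequence = [100.0, 200.0, 400.0]

print("geometric mean :", geometric_mean(sequence))
print("arithmetic mean:", arithmetic_mean(sequence))
print("last interval  :", sequence[-1])
```

    The geometric mean never exceeds the arithmetic mean, so it is pulled less by a single long interval in the sequence.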

  2. Finding Better and More Personalized Ways to Diagnose Cancer at NIH | NIH MedlinePlus the Magazine

    Science.gov (United States)


  3. From the lab - CancerSEEK: Blood Test Could Detect Cancer Earlier | NIH MedlinePlus the Magazine

    Science.gov (United States)


  4. From the lab - Brain Scan Technology Extends Treatment Window for Stroke | NIH MedlinePlus the Magazine

    Science.gov (United States)


  5. The Match of Her Life | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... answer questions for this issue of NIH MedlinePlus magazine about her breast cancer. You discovered you had ... way of healing. As this issue of the magazine went to press, Navratilova was receiving radiation therapy ...

  6. FNLM 2013 Events & Programs Announced | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... study in their field of expertise. Annual Awards Dinner In 2013, the Awards Dinner to celebrate advances ... Mobile MedlinePlus! Trusted medical information on your mobile phone. http://m.medlineplus.gov and in Spanish at ...

  7. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
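
    The idea of storing only the differences between an input and a reference can be illustrated with a small sketch. This is not FRESCO's algorithm; it is a generic greedy factorization under assumed names (`factorize`, `reconstruct`, and the `min_match` threshold are all hypothetical), using a plain k-mer index of the reference.

```python
def factorize(reference: str, target: str, min_match: int = 4):
    """Greedy referential factorization (illustrative sketch, not FRESCO).

    Returns a list of entries:
      ("M", ref_start, length)  copy `length` symbols from the reference
      ("L", text)               literal symbols not found in the reference
    """
    k = min_match
    # Index every k-mer of the reference by its start positions.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)

    entries, literal, pos = [], [], 0
    while pos < len(target):
        best_len, best_ref = 0, -1
        # Try every reference position that shares the next k-mer.
        for cand in index.get(target[pos:pos + k], ()):
            length = k
            while (cand + length < len(reference)
                   and pos + length < len(target)
                   and reference[cand + length] == target[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_ref = length, cand
        if best_len >= k:
            if literal:  # flush pending literal symbols first
                entries.append(("L", "".join(literal)))
                literal = []
            entries.append(("M", best_ref, best_len))
            pos += best_len
        else:
            literal.append(target[pos])
            pos += 1
    if literal:
        entries.append(("L", "".join(literal)))
    return entries


def reconstruct(reference: str, entries) -> str:
    """Rebuild the target from the reference and the factorization."""
    parts = []
    for entry in entries:
        if entry[0] == "M":
            _, start, length = entry
            parts.append(reference[start:start + length])
        else:
            parts.append(entry[1])
    return "".join(parts)
```

    For two sequences that differ in a single base, the factorization reduces to two reference copies and one literal symbol, which is why highly similar inputs compress so well under this kind of scheme.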

  8. Genetics Home Reference: focal dermal hypoplasia

    Science.gov (United States)

    ... in people with focal dermal hypoplasia is an omphalocele , which is an opening in the wall of ... Dermal Hypoplasia MedlinePlus Encyclopedia: Ectodermal dysplasia MedlinePlus Encyclopedia: Omphalocele General Information from MedlinePlus (5 links) Diagnostic Tests ...

  9. Genetics Home Reference: Beckwith-Wiedemann syndrome

    Science.gov (United States)

    ... opening in the wall of the abdomen (an omphalocele ) that allows the abdominal organs to protrude through ... Beckwith-Wiedemann syndrome MedlinePlus Encyclopedia: Macroglossia MedlinePlus Encyclopedia: Omphalocele General Information from MedlinePlus (5 links) Diagnostic Tests ...

  10. Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.

    Science.gov (United States)

    Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M

    2017-08-16

    High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.

  11. Sequencing of individual chromosomes of plant pathogenic Fusarium oxysporum.

    Science.gov (United States)

    Kashiwa, Takeshi; Kozaki, Toshinori; Ishii, Kazuo; Turgeon, B Gillian; Teraoka, Tohru; Komatsu, Ken; Arie, Tsutomu

    2017-01-01

    A small chromosome in reference isolate 4287 of F. oxysporum f. sp. lycopersici (Fol) has been designated as a 'pathogenicity chromosome' because it carries several pathogenicity related genes such as the Secreted In Xylem (SIX) genes. Sequence assembly of small chromosomes in other isolates, based on a reference genome template, is difficult because of karyotype variation among isolates and a high number of sequences associated with transposable elements. These factors often result in misassembly of sequences, making it unclear whether other isolates possess the same pathogenicity chromosome harboring SIX genes as in the reference isolate. To overcome this difficulty, single chromosome sequencing after Contour-clamped Homogeneous Electric Field (CHEF) separation of chromosomes was performed, followed by de novo assembly of sequences. The assembled sequences of individual chromosomes were consistent with results of probing gels of CHEF separated chromosomes with SIX genes. Individual chromosome sequencing revealed that several SIX genes are located on a single small chromosome in two pathogenic forms of F. oxysporum, beyond the reference isolate 4287, and in the cabbage yellows fungus F. oxysporum f. sp. conglutinans. The particular combination of SIX genes on each small chromosome varied. Moreover, not all SIX genes were found on small chromosomes; depending on the isolate, some were on big chromosomes. This suggests that recombination of chromosomes and/or translocation of SIX genes may occur frequently. Our method improves sequence comparison of small chromosomes among isolates. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Rapid Polymer Sequencer

    Science.gov (United States)

    Stolc, Viktor (Inventor); Brock, Matthew W (Inventor)

    2013-01-01

    Method and system for rapid and accurate determination of each of a sequence of unknown polymer components, such as nucleic acid components. A self-assembling monolayer of a selected substance is optionally provided on an interior surface of a pipette tip, and the interior surface is immersed in a selected liquid. A selected electrical field is impressed in a longitudinal direction, or in a transverse direction, in the tip region, a polymer sequence is passed through the tip region, and a change in an electrical current signal is measured as each polymer component passes through the tip region. Each of the measured changes in electrical current signals is compared with a database of reference electrical change signals, with each reference signal corresponding to an identified polymer component, to identify the unknown polymer component with a reference polymer component. The nanopore preferably has a pore inner diameter of no more than about 40 nm and is prepared by heating and pulling a very small section of a glass tubing.

  13. Genetics Home Reference: hypochondroplasia

    Science.gov (United States)

    ... the elbows, a sway of the lower back ( lordosis ), and bowed legs. These signs are generally less ... Management Resources (2 links) GeneReview: Hypochondroplasia MedlinePlus Encyclopedia: Lordosis General Information from MedlinePlus (5 links) Diagnostic Tests ...

  14. Genetics Home Reference: hypophosphatasia

    Science.gov (United States)

    ... by a softening of the bones known as osteomalacia. In adults, recurrent fractures in the foot and ... Management Resources (2 links) GeneReview: Hypophosphatasia MedlinePlus Encyclopedia: Osteomalacia General Information from MedlinePlus (5 links) Diagnostic Tests ...

  15. The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

    Science.gov (United States)

    David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

    2017-01-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...
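
    The contig and scaffold N50 statistics quoted above follow a standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch, with a hypothetical function name and made-up example lengths:

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one sequence length")
```

    For example, lengths of 2 through 10 sum to 54, and the three longest sequences (10, 9, 8) reach 27, so the N50 of that toy assembly is 8.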

  16. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference

    International Nuclear Information System (INIS)

    Karlo, Christoph A.; Patcas, Raphael; Signorelli, Luca; Mueller, Lukas; Kau, Thomas; Watzal, Helmut; Kellenberger, Christian J.; Ullrich, Oliver; Luder, Hans-Ulrich

    2012-01-01

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. Upon all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. (orig.)

  17. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference

    Energy Technology Data Exchange (ETDEWEB)

    Karlo, Christoph A. [University Hospital Zurich, Department of Diagnostic and Interventional Radiology, Zurich (Switzerland); University Children's Hospital Zurich, Department of Diagnostic Imaging, Zurich (Switzerland); Patcas, Raphael; Signorelli, Luca; Mueller, Lukas [University of Zurich, Clinic for Orthodontics and Pediatric Dentistry, Center of Dental Medicine, Zurich (Switzerland); Kau, Thomas; Watzal, Helmut; Kellenberger, Christian J. [University Children's Hospital Zurich, Department of Diagnostic Imaging, Zurich (Switzerland); Ullrich, Oliver [University of Zurich, Institute of Anatomy, Faculty of Medicine, Zurich (Switzerland); Luder, Hans-Ulrich [University of Zurich, Section of Orofacial Structures and Development, Center of Dental Medicine, Zurich (Switzerland)

    2012-07-15

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. Upon all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. (orig.)

  18. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...... used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP...... identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  19. Parkinson's Disease Research at NIH | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... areas of its research: MedlinePlus. medlineplus.gov. Type "Parkinson's disease" in the Search box. NIHSeniorHealth: Parkinson's Disease ...

  20. Impact of SciELO and MEDLINE indexing on submissions to Jornal de Pediatria.

    Science.gov (United States)

    Blank, Danilo; Buchweitz, Claudia; Procianoy, Renato S

    2005-01-01

    To evaluate the impact of SciELO and MEDLINE indexing on the number of articles submitted to Jornal de Pediatria. Analysis of total article submission, submission of articles from foreign countries and acceptance figures in the following periods: stage I - pre-website (Jan 2000-Mar 2001); stage II - website (Apr 2001-Jul 2002); stage III - SciELO (Aug 2002-Aug 2003); stage IV - MEDLINE (Sep 2003-Dec 2004). There was a significant trend toward linear increase in the number of submissions along the study period (p = 0.009). The number of manuscripts submitted in stages I through IV was 184, 240, 297, and 482, respectively. The number of submissions was similar in stages I and II (p = 0.148), but statistically higher in Stage III (p SciELO indexing was associated with an increase in Brazilian manuscript submissions to Jornal de Pediatria, whereas MEDLINE indexing led to an increase in both Brazilian and foreign submissions.

  1. Reference free phasing and representation of complex variation

    DEFF Research Database (Denmark)

    Jensen, Jacob Malte

    2017-01-01

    High throughput sequencing has revolutionized our ability to interrogate genomes and entire human genomes are sequenced daily across the world. Mapping of short reads to a reference genome has enhanced our ability to detect genetic variation and is currently the most widely used technology....... Therefore, new methods for detecting variation that reduce reference bias are needed including ways of representing genomes that account for the variability within and between populations. The major histocompatibility complex (MHC) region is one of the most diverse and complex regions of the human genome...... to detect and call variation in humans. However, it has become evident that mapping of short reads to a single reference genome is subject to ascertainment bias (reference bias). This bias is especially pronounced in complex regions of the genome and particularly hampers detection of structural variation...

  2. The value of new genome references.

    Science.gov (United States)

    Worley, Kim C; Richards, Stephen; Rogers, Jeffrey

    2017-09-15

    Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.

  3. Genetics Home Reference: bladder cancer

    Science.gov (United States)

    ... Testing Registry: Malignant tumor of urinary bladder Other Diagnosis and Management Resources (1 link) MedlinePlus Encyclopedia: Bladder Cancer General Information from MedlinePlus (5 links) Diagnostic Tests ...

  4. Is a Widely Available Cure for Sickle Cell Disease on the Horizon? | NIH MedlinePlus the Magazine

    Science.gov (United States)


  5. Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval.

    Science.gov (United States)

    Yoo, Sooyoung; Choi, Jinwook

    2011-06-01

    The purpose of this study was to investigate the effects of query expansion algorithms for MEDLINE retrieval within a pseudo-relevance feedback framework. A number of query expansion algorithms were tested using various term ranking formulas, focusing on query expansion based on pseudo-relevance feedback. The OHSUMED test collection, which is a subset of the MEDLINE database, was used as a test corpus. Various ranking algorithms were tested in combination with different term re-weighting algorithms. Our comprehensive evaluation showed that the local context analysis ranking algorithm, when used in combination with one of the reweighting algorithms - Rocchio, the probabilistic model, and our variants - significantly outperformed other algorithm combinations by up to 12% (paired t-test; p algorithm pairs, at least in the context of the OHSUMED corpus. Comparative experiments on term ranking algorithms were performed in the context of a subset of MEDLINE documents. With medical documents, local context analysis, which uses co-occurrence with all query terms, significantly outperformed various term ranking methods based on both frequency and distribution analyses. Furthermore, the results of the experiments demonstrated that the term rank-based re-weighting method contributed to a remarkable improvement in mean average precision.

  6. MRI of the temporo-mandibular joint: which sequence is best suited to assess the cortical bone of the mandibular condyle? A cadaveric study using micro-CT as the standard of reference.

    Science.gov (United States)

    Karlo, Christoph A; Patcas, Raphael; Kau, Thomas; Watzal, Helmut; Signorelli, Luca; Müller, Lukas; Ullrich, Oliver; Luder, Hans-Ulrich; Kellenberger, Christian J

    2012-07-01

    To determine the best suited sagittal MRI sequence out of a standard temporo-mandibular joint (TMJ) imaging protocol for the assessment of the cortical bone of the mandibular condyles of cadaveric specimens using micro-CT as the standard of reference. Sixteen TMJs in 8 human cadaveric heads (mean age, 81 years) were examined by MRI. Upon all sagittal sequences, two observers measured the cortical bone thickness (CBT) of the anterior, superior and posterior portions of the mandibular condyles (i.e. objective analysis), and assessed for the presence of cortical bone thinning, erosions or surface irregularities as well as subcortical bone cysts and anterior osteophytes (i.e. subjective analysis). Micro-CT of the condyles was performed to serve as the standard of reference for statistical analysis. Inter-observer agreements for objective (r = 0.83-0.99, P < 0.01) and subjective (κ = 0.67-0.88) analyses were very good. Mean CBT measurements were most accurate, and cortical bone thinning, erosions, surface irregularities and subcortical bone cysts were best depicted on the 3D fast spoiled gradient echo recalled sequence (3D FSPGR). The most reliable MRI sequence to assess the cortical bone of the mandibular condyles on sagittal imaging planes is the 3D FSPGR sequence. MRI may be used to assess the cortical bone of the TMJ. • Depiction of cortical bone is best on 3D FSPGR sequences. • MRI can assess treatment response in patients with TMJ abnormalities.

  7. On the query reformulation technique for effective MEDLINE document retrieval.

    Science.gov (United States)

    Yoo, Sooyoung; Choi, Jinwook

    2010-10-01

    Improving the retrieval accuracy of MEDLINE documents is still a challenging issue due to low retrieval precision. Focusing on a query expansion technique based on pseudo-relevance feedback (PRF), this paper addresses the problem by systematically examining the effects of expansion term selection and adjustment of the term weights of the expanded query using a set of MEDLINE test documents called OHSUMED. Implementing a baseline information retrieval system based on the Okapi BM25 retrieval model, we compared six well-known term ranking algorithms for useful expansion term selection and then compared traditional term reweighting algorithms with our new variant of the standard Rocchio's feedback formula, which adopts a group-based weighting scheme. Our experimental results on the OHSUMED test collection showed a maximum improvement of 20.2% and 20.4% for mean average precision and recall measures over unexpanded queries when terms were expanded using a co-occurrence analysis-based term ranking algorithm in conjunction with our term reweighting algorithm (p-valueretrieval.
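
    Query expansion by pseudo-relevance feedback, as described above, can be sketched with the textbook Rocchio formula (positive feedback only). This is not the authors' group-based variant, and ranking candidate terms by mean term frequency is a stand-in for the co-occurrence-based ranking in the paper. The function name, the `alpha`/`beta` defaults and the tokenized example documents are assumptions for illustration.

```python
from collections import Counter


def rocchio_expand(query_terms, feedback_docs, n_expand=2, alpha=1.0, beta=0.75):
    """Expand and reweight a query from top-ranked (assumed relevant) documents.

    query_terms   : list of query tokens
    feedback_docs : list of token lists, e.g. the top-k documents of a first pass
    Returns a dict mapping term -> weight for the expanded query.
    """
    query_vec = Counter(query_terms)

    # Centroid of the feedback documents (mean raw term frequency per term).
    centroid = Counter()
    for doc in feedback_docs:
        for term, tf in Counter(doc).items():
            centroid[term] += tf / len(feedback_docs)

    # Rank new terms by centroid weight; break ties alphabetically.
    candidates = sorted(
        (t for t in centroid if t not in query_vec),
        key=lambda t: (-centroid[t], t),
    )[:n_expand]

    # Rocchio update: original terms keep alpha * q and gain beta * centroid.
    weights = {t: alpha * query_vec[t] + beta * centroid[t] for t in query_vec}
    for term in candidates:
        weights[term] = beta * centroid[term]
    return weights
```

    In a full system the first-pass ranking would come from a retrieval model such as Okapi BM25, and the expanded weights would be fed back into a second retrieval pass.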

  8. NIH Launches National COPD Action Plan | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... COPD Action Plan Follow us NIH Launches National COPD Action Plan Photo: National Heart, Lung, and Blood ... questions for NIH MedlinePlus magazine. Why was the COPD National Action Plan created? The staggering numbers associated ...

  9. Technical development of PubMed Interact: an improved interface for MEDLINE/PubMed searches

    OpenAIRE

    Muin, Michael; Fontelo, Paul

    2006-01-01

    Abstract Background The project aims to create an alternative search interface for MEDLINE/PubMed that may provide assistance to the novice user and added convenience to the advanced user. An earlier version of the project was the 'Slider Interface for MEDLINE/PubMed searches' (SLIM) which provided JavaScript slider bars to control search parameters. In this new version, recent developments in Web-based technologies were implemented. These changes may prove to be even more valuable in enhanci...

  10. MedlinePlus® Everywhere: Access from Your Phone, Tablet or Desktop

    Science.gov (United States)

    ... responsivefull.html MedlinePlus® Everywhere: Access from Your Phone, Tablet or Desktop To use the sharing features on ... provide a consistent user experience from a desktop, tablet, or phone. All users, regardless of how they ...

  11. Operationalizing Semantic Medline for meeting the information needs at point of care

    Science.gov (United States)

    Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang

    2015-01-01

    Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible. PMID:26306259

  12. Operationalizing Semantic Medline for meeting the information needs at point of care.

    Science.gov (United States)

    Rastegar-Mojarad, Majid; Li, Dingcheng; Liu, Hongfang

    2015-01-01

    Scientific literature is one of the popular resources for providing decision support at point of care. It is highly desirable to bring the most relevant literature to support the evidence-based clinical decision making process. Motivated by the recent advance in semantically enhanced information retrieval, we have developed a system, which aims to bring semantically enriched literature, Semantic Medline, to meet the information needs at point of care. This study reports our work towards operationalizing the system for real time use. We demonstrate that the migration of a relational database implementation to a NoSQL (Not only SQL) implementation significantly improves the performance and makes the use of Semantic Medline at point of care decision support possible.

  13. Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip

    Directory of Open Access Journals (Sweden)

    Carr Steven M

    2007-09-01

    Full Text Available Abstract Background: Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes. Results: In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy. Accuracy of SNP identification declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines with the increase of the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by > 30%, less than 4% of the sequence is recoverable, in short islands ≥ 12b that are conserved between primates and fish. Conclusion: Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by

  14. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure...... that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  15. GapMis: a tool for pairwise sequence alignment with a single gap.

    Science.gov (United States)

    Flouri, Tomás; Frousios, Kimon; Iliopoulos, Costas S; Park, Kunsoo; Pissis, Solon P; Tischler, German

    2013-08-01

    Pairwise sequence alignment has received a new motivation due to the advent of recent patents in next-generation sequencing technologies, particularly so for the application of re-sequencing: the assembly of a genome directed by a reference sequence. After the fast alignment between a factor of the reference sequence and a high-quality fragment of a short read by a short-read alignment programme, an important problem is to find the alignment between a relatively short succeeding factor of the reference sequence and the remaining low-quality part of the read, allowing a number of mismatches and the insertion of a single gap in the alignment. We present GapMis, a tool for pairwise sequence alignment with a single gap. It is based on a simple algorithm, which computes a different version of the traditional dynamic programming matrix. The presented experimental results demonstrate that GapMis is more suitable and efficient than most popular tools for this task.
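
    The problem statement above (mismatches plus at most one gap) can be illustrated with a brute-force sketch. This is not the GapMis algorithm, which the abstract describes as a modified dynamic programming matrix; the function name `align_single_gap`, the unit cost per mismatch and per gap position, and the requirement that the reference be long enough are all assumptions made for illustration.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def align_single_gap(ref, read, max_gap):
    """Brute-force alignment of `read` against the start of `ref`,
    allowing mismatches and at most one gap of length <= max_gap.

    Assumed cost model: 1 per mismatch, 1 per gap position.
    Returns (cost, gap_position, gap_length, gap_in), where gap_in is
    "none", "read" (ref bases unmatched) or "ref" (read bases unmatched).
    """
    n = len(read)
    if len(ref) < n + max_gap:
        raise ValueError("ref must be at least len(read) + max_gap long")
    # Baseline: no gap at all.
    best = (hamming(ref[:n], read), 0, 0, "none")
    for g in range(1, max_gap + 1):
        # Gap in the read: ref[p:p+g] has no partner in the read.
        for p in range(1, n):
            cost = (hamming(ref[:p], read[:p])
                    + hamming(ref[p + g:p + g + n - p], read[p:]) + g)
            if cost < best[0]:
                best = (cost, p, g, "read")
        # Gap in the reference: read[p:p+g] has no partner in the ref.
        for p in range(1, n - g):
            cost = (hamming(ref[:p], read[:p])
                    + hamming(ref[p:p + n - p - g], read[p + g:]) + g)
            if cost < best[0]:
                best = (cost, p, g, "ref")
    return best
```

    Trying every gap position and length costs more than a dynamic programming formulation, so the sketch only makes the search space concrete.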

  16. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Full Text Available Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
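
    The lowest common ancestor step mentioned above can be sketched as follows. This is a generic illustration, not CREST's code: the function name and the example lineages are hypothetical, and the rank similarity criteria described in the abstract are deliberately left out.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared prefix of several taxonomic lineages.

    Each lineage is a list of taxon names ordered from the root down,
    e.g. the lineages of the best alignment hits for one read.
    """
    if not lineages:
        return []
    lca = []
    # zip(*lineages) walks all lineages rank by rank and stops at the
    # shortest one.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            lca.append(taxa_at_rank[0])
        else:
            break  # the hits disagree from this rank downwards
    return lca


# Hypothetical hit lineages for a single environmental read.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Vibrionales"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Alteromonadales"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
]
print(lowest_common_ancestor(hits))
```

    Here the read would be assigned to the deepest node that all hits agree on, which is the phylum in this example.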

  17. MD-CTS: An integrated terminology reference of clinical and translational medicine

    Directory of Open Access Journals (Sweden)

    Will Ray

    2016-01-01

    Full Text Available New vocabularies are rapidly evolving in the literature relative to the practice of clinical medicine and translational research. To provide integrated access to new terms, we developed a mobile and desktop online reference, the Marshfield Dictionary of Clinical and Translational Science (MD-CTS). It is the first public resource that comprehensively integrates Wiktionary (word definition), BioPortal (ontology), Wiki (image reference), and Medline abstract (word usage) information. MD-CTS is accessible at http://spellchecker.mfldclin.edu/. The website provides a broadened capacity for the wider clinical and translational science community to keep pace with newly emerging scientific vocabulary. An initial evaluation using 63 randomly selected biomedical words suggests that online references generally provided better coverage (73–95%) than paper-based dictionaries (57–71%).

  18. Journalist Liz Hernandez hopes to make Alzheimer’s a thing of the past | NIH MedlinePlus the Magazine

    Science.gov (United States)


  19. The contribution of next generation sequencing to epilepsy genetics

    DEFF Research Database (Denmark)

    Møller, Rikke S.; Dahl, Hans A.; Helbig, Ingo

    2015-01-01

    During the last decade, next generation sequencing technologies such as targeted gene panels, whole exome sequencing and whole genome sequencing have led to an explosion of gene identifications in monogenic epilepsies including both familial epilepsies and severe epilepsies, often referred to as ...

  20. Genetics Home Reference: Rett syndrome

    Science.gov (United States)

    ... Genetic Testing Registry: Rett syndrome Other Diagnosis and Management Resources (4 links) Boston Children's Hospital GeneReview: MECP2-Related Disorders MedlinePlus Encyclopedia: Rett Syndrome RettSyndrome.org: Rett Syndrome Clinics General Information from MedlinePlus (5 links) Diagnostic Tests ...

  1. A structural SVM approach for reference parsing.

    Science.gov (United States)

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual reference to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure in references enables us to consider reference parsing a sequence learning problem and to study structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at token- and chunk-levels. When only basic observation features are used for each token, structural SVM achieves higher performance compared to SVM since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly, and is close to that of structural SVM after adding the second order contextual observation features. The comparison of these two methods with CRF using the same set of binary features show that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.

  2. Sequencing and characterization of the guppy (Poecilia reticulata) transcriptome

    Directory of Open Access Journals (Sweden)

    Rodd F Helen

    2011-04-01

    Full Text Available Abstract Background Next-generation sequencing is providing researchers with a relatively fast and affordable option for developing genomic resources for organisms that are not among the traditional genetic models. Here we present a de novo assembly of the guppy (Poecilia reticulata) transcriptome using 454 sequence reads, and we evaluate potential uses of this transcriptome, including detection of sex-specific transcripts and deployment as a reference for gene expression analysis in guppies and a related species. Guppies have been model organisms in ecology, evolutionary biology, and animal behaviour for over 100 years. An annotated transcriptome and other genomic tools will facilitate understanding the genetic and molecular bases of adaptation and variation in a vertebrate species with a uniquely well known natural history. Results We generated approximately 336 Mbp of mRNA sequence data from male brain, male body, female brain, and female body. The resulting 1,162,670 reads assembled into 54,921 contigs, creating a reference transcriptome for the guppy with an average read depth of 28×. We annotated nearly 40% of this reference transcriptome by searching protein and gene ontology databases. Using this annotated transcriptome database, we identified candidate genes of interest to the guppy research community, putative single nucleotide polymorphisms (SNPs), and male-specific expressed genes. We also showed that our reference transcriptome can be used for RNA-sequencing-based analysis of differential gene expression. We identified transcripts that, in juveniles, are regulated differently in the presence and absence of an important predator, Rivulus hartii, including two genes implicated in stress response. For each sample in the RNA-seq study, >50% of high-quality reads mapped to unique sequences in the reference database with high confidence. In addition, we evaluated the use of the guppy reference transcriptome for gene expression analyses in

  3. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences

    OpenAIRE

    Lescot, Magali; Déhais, Patrice; Thijs, Gert; Marchal, Kathleen; Moreau, Yves; Van de Peer, Yves; Rouzé, Pierre; Rombauts, Stephane

    2002-01-01

    PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Regulatory elements are represented by positional matrices, consensus sequences and individual sites on particular promoter sequences. Links to the EMBL, TRANSFAC and MEDLINE databases are provided when available. Data about the transcription sites are extracted mainly from the literature, supplemented with an increasing number of in silico predicted data. Apart from a general description for specific t...

  4. A short guide to peer-reviewed, MEDLINE-indexed complementary and alternative medicine journals.

    Science.gov (United States)

    Morgan, Sherry; Littman, Lynn; Palmer, Christina; Singh, Gurneet; LaRiccia, Patrick J

    2012-01-01

    Complementary and alternative medicine (CAM) comprises a multitude of disciplines, for example, acupuncture, ayurvedic medicine, biofeedback, herbal medicine, and homeopathic medicine. While research on CAM interventions has increased and the CAM literature has proliferated since the mid-1990s, a number of our colleagues have expressed difficulties in deciding where to publish CAM articles. In response, we created a short guide to peer-reviewed MEDLINE-indexed journals that publish CAM articles. We examined numerous English-language sources to identify titles that met our criteria, whether specific to or overlapping CAM. A few of the resources in which we found the journal titles that we included are Alternative Medicine Foundation, American Holistic Nurses Association, CINAHL/Nursing Database, Journal Citation Reports database, MEDLINE, PubMed, and Research Council for Complementary Medicine. We organized the 69 selected titles for easy use by creating 2 user-friendly tables, one listing titles in alphabetical order and one listing them in topical categories. A few examples of the topical categories are Acupuncture, CAM (general), Chinese Medicine, Herbal/Plant/Phytotherapy, Neuroscience/Psychology, Nursing/Clinical Care. Our study is the first to list general CAM journals, specialty CAM journals, and overlapping mainstream journals that are peer reviewed, in English, and indexed in MEDLINE. Our goal was to assist both authors seeking publication and mainstream journal editors who receive an overabundance of publishable articles but must recommend that authors seek publication elsewhere due to space and priority issues. Publishing in journals indexed by and included in MEDLINE (or PubMed) ensures that citations to articles will be found easily. Copyright © 2012 Lippincott Williams & Wilkins.

  5. Taxonomic reference libraries for environmental barcoding: a best practice example from diatom research.

    Directory of Open Access Journals (Sweden)

    Jonas Zimmermann

    Full Text Available DNA barcoding uses a short fragment of a DNA sequence to identify a taxon. After the target sequence is obtained, it is compared to reference sequences stored in a database to assign an organism name to it. The quality of data in the reference database is the key to the success of the analysis. In the study presented here, multiple types of data have been combined and critically examined in order to create best practice guidelines for taxonomic reference libraries for environmental barcoding. 70 unialgal diatom strains from Berlin waters have been established and cultured to obtain morphological and molecular data. The strains were sequenced for 18S V4 rDNA (the pre-Barcode for protists) as well as rbcL data, and identified by microscopy. LM and, for some strains, also SEM pictures were taken and physical vouchers deposited at the BGBM. 37 freshwater taxa from 15 naviculoid diatom genera were identified. Four taxa from the genera Amphora, Mayamaea, Planothidium and Stauroneis are described here as new. Names, molecular, morphological and habitat data as well as additional images of living cells are also available electronically in the AlgaTerra Information System. All reference sequences (or reference barcodes) presented here are linked to voucher specimens in order to provide a complete chain of evidence back to the formal taxonomic literature.

  6. Building the sequence map of the human pan-genome

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Zheng, Hancheng

    2010-01-01

    analysis of predicted genes indicated that the novel sequences contain potentially functional coding regions. We estimate that a complete human pan-genome would contain approximately 19-40 Mb of novel sequence not present in the extant reference genome. The extensive amount of novel sequence contributing...

  7. Sequencing of the Hepatitis C Virus: A Systematic Review.

    Directory of Open Access Journals (Sweden)

    Brendan Jacka

    Full Text Available Since the identification of hepatitis C virus (HCV), viral sequencing has been important in understanding HCV classification, epidemiology, evolution, transmission clustering, treatment response and natural history. The length and diversity of the HCV genome has resulted in analysis of certain regions of the virus, however there has been little standardisation of protocols. This systematic review was undertaken to map the location and frequency of sequencing on the HCV genome in peer reviewed publications, with the aim to produce a database of sequencing primers and amplicons to inform future research. Medline and Scopus databases were searched for English language publications based on keyword/MeSH terms related to sequence analysis (9 terms) or HCV (3 terms), plus "primer" as a general search term. Exclusion criteria included non-HCV research, review articles, duplicate records, and incomplete description of HCV sequencing methods. The PCR primer locations of accepted publications were noted, and purpose of sequencing was determined. A total of 450 studies were accepted from the 2099 identified, with 629 HCV sequencing amplicons identified and mapped on the HCV genome. The most commonly sequenced region was the HVR-1 region, often utilised for studies of natural history, clustering/transmission, evolution and treatment response. Studies related to genotyping/classification or epidemiology of HCV genotype generally targeted the 5'UTR, Core and NS5B regions, while treatment response/resistance was assessed mainly in the NS3-NS5B region with emphasis on the Interferon sensitivity determining region (ISDR) of NS5A. While the sequencing of HCV is generally constricted to certain regions of the HCV genome there is little consistency in the positioning of sequencing primers, with the exception of a few highly referenced manuscripts. This study demonstrates the heterogeneity of HCV sequencing, providing a comprehensive database of previously

  8. Generation and Characterisation of a Reference Transcriptome for Lentil (Lens culinaris Medik.)

    Directory of Open Access Journals (Sweden)

    Shimna Sudheesh

    2016-11-01

    Full Text Available RNA-Seq using second-generation sequencing technologies permits generation of a reference unigene set for a given species, in the absence of a well-annotated genome sequence, supporting functional genomics studies, gene characterisation and detailed expression analysis for specific morphophysiological or environmental stress response traits. A reference unigene set for lentil has been developed, consisting of 58,986 contigs and scaffolds with an N50 length of 1719 bp. Comparison to gene complements from related species, reference protein databases, previously published lentil transcriptomes and a draft genome sequence validated the current dataset in terms of degree of completeness and utility. A large proportion (98%) of unigenes were expressed in more than one tissue, at varying levels. Candidate genes associated with mechanisms of tolerance to both boron toxicity and time of flowering were identified, which can eventually be used for the development of gene-based markers. This study has provided a comprehensive, assembled and annotated reference gene set for lentil that can be used for multiple applications, permitting identification of genes for pathway-specific expression analysis, genetic modification approaches, development of resources for genotypic analysis, and assistance in the annotation of a future lentil genome sequence.
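
    The N50 length quoted above is a standard assembly statistic: the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of the usual definition follows; the function name and the toy lengths are illustrative and are not taken from the lentil dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```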

  9. Developing optimal search strategies for detecting clinically sound prognostic studies in MEDLINE: an analytic survey

    Directory of Open Access Journals (Sweden)

    Haynes R Brian

    2004-06-01

    Full Text Available Abstract Background Clinical end users of MEDLINE have a difficult time retrieving articles that are both scientifically sound and directly relevant to clinical practice. Search filters have been developed to assist end users in increasing the success of their searches. Many filters have been developed for the literature on therapy and reviews but little has been done in the area of prognosis. The objective of this study is to determine how well various methodologic textwords, Medical Subject Headings, and their Boolean combinations retrieve methodologically sound literature on the prognosis of health disorders in MEDLINE. Methods An analytic survey was conducted, comparing hand searches of journals with retrievals from MEDLINE for candidate search terms and combinations. Six research assistants read all issues of 161 journals for the publishing year 2000. All articles were rated using purpose and quality indicators and categorized into clinically relevant original studies, review articles, general papers, or case reports. The original and review articles were then categorized as 'pass' or 'fail' for methodologic rigor in the areas of prognosis and other clinical topics. Candidate search strategies were developed for prognosis and run in MEDLINE, and the retrievals were compared with the hand search data. The sensitivity, specificity, precision, and accuracy of the search strategies were calculated. Results 12% of studies classified as prognosis met basic criteria for scientific merit for testing clinical applications. Combinations of terms reached peak sensitivities of 90%. Compared with the best single term, multiple terms increased sensitivity for sound studies by 25.2% (absolute increase), and increased specificity, but by a much smaller amount (1.1%), when sensitivity was maximized. Combining terms to optimize both sensitivity and specificity achieved sensitivities and specificities of approximately 83% for each. Conclusion Empirically derived
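
    The four measures named in the Methods (sensitivity, specificity, precision, accuracy) follow from a 2×2 table that compares a search strategy's retrievals with the hand search. The sketch below uses the standard definitions; the function name and the counts are invented for illustration and are not the study's data.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures for a search strategy.

    tp: sound articles retrieved       fp: other articles retrieved
    fn: sound articles missed          tn: other articles not retrieved
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of sound articles found
        "specificity": tn / (tn + fp),   # share of other articles excluded
        "precision": tp / (tp + fp),     # share of retrievals that are sound
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Invented counts: 100 sound articles among 1,000 hand-searched records.
for name, value in retrieval_metrics(tp=90, fp=200, fn=10, tn=700).items():
    print(f"{name}: {value:.3f}")
```

    With counts like these, a filter can reach high sensitivity while precision stays low, which is the trade-off the abstract's results describe.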

  10. The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

    Directory of Open Access Journals (Sweden)

    David B. Neale

    2017-09-01

    Full Text Available A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.

  11. The development of search filters for adverse effects of surgical interventions in medline and Embase.

    Science.gov (United States)

    Golder, Su; Wright, Kath; Loke, Yoon Kong

    2018-03-31

    Search filter development for adverse effects has tended to focus on retrieving studies of drug interventions. However, a different approach is required for surgical interventions. To develop and validate search filters for medline and Embase for the adverse effects of surgical interventions. Systematic reviews of surgical interventions where the primary focus was to evaluate adverse effect(s) were sought. The included studies within these reviews were divided randomly into a development set, evaluation set and validation set. Using word frequency analysis we constructed a sensitivity maximising search strategy and this was tested in the evaluation and validation set. Three hundred and fifty eight papers were included from 19 surgical intervention reviews. Three hundred and fifty two papers were available on medline and 348 were available on Embase. Generic adverse effects search strategies in medline and Embase could achieve approximately 90% relative recall. Recall could be further improved with the addition of specific adverse effects terms to the search strategies. We have derived and validated a novel search filter that has reasonable performance for identifying adverse effects of surgical interventions in medline and Embase. However, we appreciate the limitations of our methods, and recommend further research on larger sample sizes and prospective systematic reviews. © 2018 The Authors Health Information and Libraries Journal published by John Wiley & Sons Ltd on behalf of Health Libraries Group.

  12. To Your Health: NLM update transcript - NIH MedlinePlus magazine Winter 2018

    Science.gov (United States)

    ... who is a star of 'The Big Bang Theory' television show, and the producer/narrator of a ... trials, NIH MedlinePlus magazine reports the current life expectancy of a person with sickle cell disease is ...

  13. From the lab - Eyes May be ‘Windows to the Brain’ in Stroke Patients | NIH MedlinePlus the Magazine

    Science.gov (United States)


  14. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.

  16. Light-weight reference-based compression of FASTQ data.

    Science.gov (United States)

    Zhang, Yongpeng; Li, Linsen; Yang, Yanli; Yang, Xiao; He, Shan; Zhu, Zexuan

    2015-06-09

    The exponential growth of next generation sequencing (NGS) data has posed big challenges to data storage, management and archive. Data compression is one of the effective solutions, where reference-based compression strategies can typically achieve superior compression ratios compared to the ones not relying on any reference. This paper presents a lossless light-weight reference-based compression algorithm, namely LW-FQZip, to compress FASTQ data. The three components of any given input, i.e., metadata, short reads and quality score strings, are first parsed into three data streams in which the redundant information is identified and eliminated independently. Particularly, well-designed incremental and run-length-limited encoding schemes are utilized to compress the metadata and quality score streams, respectively. To handle the short reads, LW-FQZip uses a novel light-weight mapping model to quickly map them against external reference sequence(s) and produce concise alignment results for storage. The three processed data streams are then packed together with some general purpose compression algorithms like LZMA. LW-FQZip was evaluated on eight real-world NGS data sets and achieved compression ratios in the range of 0.111-0.201. This is comparable or superior to other state-of-the-art lossless NGS data compression algorithms. LW-FQZip is a program that enables efficient lossless FASTQ data compression. It contributes to the state-of-the-art applications for NGS data storage and transmission. LW-FQZip is freely available online at: http://csse.szu.edu.cn/staff/zhuzx/LWFQZip.
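
    As a rough illustration of how run-length coding removes redundancy from a quality score string, the sketch below caps each run at a maximum length. It is a generic example, not the LW-FQZip scheme: the abstract does not specify its run-length-limited encoding, so the cap `max_run`, the function names and the sample string are assumptions.

```python
def rle_encode(quality, max_run=255):
    """Encode a string as (symbol, count) pairs, with each count <= max_run.

    Capping the count (an assumption here) lets every count fit in a
    fixed-width field, such as one byte when max_run is 255.
    """
    runs = []
    for symbol in quality:
        if runs and runs[-1][0] == symbol and runs[-1][1] < max_run:
            runs[-1] = (symbol, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((symbol, 1))               # start a new run
    return runs


def rle_decode(runs):
    """Invert rle_encode, so the round trip is lossless."""
    return "".join(symbol * count for symbol, count in runs)


quality = "IIIIHH##"  # made-up FASTQ quality string
encoded = rle_encode(quality, max_run=3)
print(encoded)
print(rle_decode(encoded) == quality)
```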

  17. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
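
    The data dependencies that make sequence alignment hard to vectorize are visible in a plain scalar version of Smith-Waterman: each cell needs its upper, left and diagonal neighbours. The sketch below is such a scalar reference with a linear gap penalty. It does not use parasail or its API, and the function name and scoring parameters are arbitrary choices for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # cur[j] depends on the diagonal (prev[j-1]), the cell above
            # (prev[j]) and the cell to the left (cur[j-1]); the last of
            # these is the dependency SIMD layouts have to work around.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

    The shared substring ACGT gives four matches at 2 points each, and no extension improves on that local score.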

  18. Genetics Home Reference: Donnai-Barrow syndrome

    Science.gov (United States)

    ... opening in the wall of the abdomen (an omphalocele ) that allows the abdominal organs to protrude through ... Hernia MedlinePlus Encyclopedia: Hearing Loss - Infants MedlinePlus Encyclopedia: Omphalocele Nemours Foundation: Hearing Evaluation in Children General Information ...

  19. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  20. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  1. Genetics Home Reference: Manitoba oculotrichoanal syndrome

    Science.gov (United States)

    ... opening in the wall of the abdomen (an omphalocele ) that allows the abdominal organs to protrude through ... 2 links) GeneReview: Manitoba Oculotrichoanal Syndrome MedlinePlus Encyclopedia: Omphalocele Repair General Information from MedlinePlus (5 links) Diagnostic ...

  2. Study of the fast inversion recovery pulse sequence. With reference to fast fluid attenuated inversion recovery and fast short TI inversion recovery pulse sequence

    International Nuclear Information System (INIS)

    Tsuchihashi, Toshio; Maki, Toshio; Suzuki, Takeshi

    1997-01-01

    The fast inversion recovery (fast IR) pulse sequence was evaluated. We compared the fast fluid attenuated inversion recovery (fast FLAIR) pulse sequence in which inversion time (TI) was established as equal to the water null point for the purpose of the water-suppressed T2-weighted image, with the fast short TI inversion recovery (fast STIR) pulse sequence in which TI was established as equal to the fat null point for the purpose of fat suppression. In the fast FLAIR pulse sequence, the water null point was increased by making TR longer. In the FLAIR pulse sequence, the longitudinal magnetization contrast is determined by TI. If TI is increased, T2-weighted contrast improves in the same way as increasing TR for the SE pulse sequence. Therefore, images should be taken with long TR and long TI, which are longer than TR and longer than the water null point. On the other hand, the fat null point is not affected by TR in the fast STIR pulse sequence. However, effective TE was affected by variation of the null point. This increased in proportion to the increase in effective TE. Our evaluation indicated that the fast STIR pulse sequence can control the extensive signals from fat in a short time. (author)

  3. An IR-Based Approach Utilizing Query Expansion for Plagiarism Detection in MEDLINE.

    Science.gov (United States)

    Nawab, Rao Muhammad Adeel; Stevenson, Mark; Clough, Paul

    2017-01-01

    The identification of duplicated and plagiarized passages of text has become an increasingly active area of research. In this paper, we investigate methods for plagiarism detection that aim to identify potential sources of plagiarism from MEDLINE, particularly when the original text has been modified through the replacement of words or phrases. A scalable approach based on Information Retrieval is used to perform candidate document selection (the identification of a subset of potential source documents given a suspicious text) from MEDLINE. Query expansion is performed using the UMLS Metathesaurus to deal with situations in which original documents are obfuscated. Various approaches to Word Sense Disambiguation are investigated to deal with cases where there are multiple Concept Unique Identifiers (CUIs) for a given term. Results using the proposed IR-based approach outperform a state-of-the-art baseline based on Kullback-Leibler Distance.

  4. Genotyping of Indian antigenic, vaccine, and field Brucella spp. using multilocus sequence typing.

    Science.gov (United States)

    Shome, Rajeswari; Krithiga, Natesan; Shankaranarayana, Padmashree B; Jegadesan, Sankarasubramanian; Udayakumar S, Vishnu; Shome, Bibek Ranjan; Saikia, Girin Kumar; Sharma, Narendra Kumar; Chauhan, Harshad; Chandel, Bharat Singh; Jeyaprakash, Rajendhran; Rahman, Habibur

    2016-03-31

    Brucellosis is one of the most important zoonotic diseases that affects multiple livestock species and causes great economic losses. The highly conserved genomes of Brucella, with > 90% homology among species, make it important to study the genetic diversity circulating in the country. A total of 26 Brucella spp. (4 reference strains and 22 field isolates) and 1 B. melitensis draft genome sequence from India (B. melitensis Bm IND1) were included for sequence typing. The field isolates were identified by biochemical tests and confirmed by both conventional and quantitative polymerase chain reaction (qPCR) targeting the bcsp31 Brucella genus-specific marker. Brucella speciation and biotyping were done by Bruce ladder, probe qPCR, and AMOS PCRs, respectively, and genotyping was done by multilocus sequence typing (MLST). The MLST typing of 27 Brucella spp. revealed five distinct sequence types (STs); the B. abortus S99 reference strain and 21 B. abortus field isolates belonged to ST1. On the other hand, the vaccine strain B. abortus S19 was genotyped as ST5. Similarly, the B. melitensis 16M reference strain and one B. melitensis field isolate were grouped into ST7. Another B. melitensis field isolate belonged to ST8 (draft genome sequence from India), and only the B. suis 1330 reference strain was found to be ST14. The sequences revealed genetic similarity of the Indian strains to the global reference and field strains. The study highlights the usefulness of MLST for typing of field isolates and validation of reference strains used for diagnosis and vaccination against brucellosis.

  5. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao

    2011-08-28

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  6. Multiple reference genomes and transcriptomes for Arabidopsis thaliana

    KAUST Repository

    Gan, Xiangchao; Stegle, Oliver; Behr, Jonas; Steffen, Joshua G.; Drewe, Philipp; Hildebrand, Katie L.; Lyngsoe, Rune; Schultheiss, Sebastian J.; Osborne, Edward J.; Sreedharan, Vipin T.; Kahles, André; Bohnert, Regina; Jean, Géraldine; Derwent, Paul; Kersey, Paul; Belfield, Eric J.; Harberd, Nicholas P.; Kemen, Eric; Toomajian, Christopher; Kover, Paula X.; Clark, Richard M.; Rätsch, Gunnar; Mott, Richard

    2011-01-01

    Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. ©2011 Macmillan Publishers Limited. All rights reserved.

  7. Geoseq: a tool for dissecting deep-sequencing datasets

    Directory of Open Access Journals (Sweden)

    Homann Robert

    2010-10-01

    Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), the Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results: Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Conclusions: Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
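
    The tile-coverage idea in this abstract can be illustrated with a minimal sketch. It is not Geoseq's implementation: the tile length, the step between tiles, the "$" read separator, the naive suffix-array construction, and all function names are assumptions made for illustration. The sketch counts exact forward-strand occurrences only.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order.
    Fine for a toy example; real tools use far more efficient construction."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, suffix_array, pattern):
    """Count exact occurrences of pattern via two binary searches."""
    m = len(pattern)

    def first_index(strictly_greater):
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[suffix_array[mid]:suffix_array[mid] + m]
            if prefix < pattern or (strictly_greater and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_index(True) - first_index(False)


def tile_coverage(reference, reads, tile_len=4, step=2):
    """Cut the reference into fixed-length tiles (hypothetical parameters)
    and count how often each tile occurs in the read library."""
    # "$" is assumed absent from the reads, so no tile can match across reads.
    library = "$".join(reads)
    suffix_array = build_suffix_array(library)
    coverage = []
    for start in range(0, len(reference) - tile_len + 1, step):
        tile = reference[start:start + tile_len]
        coverage.append((start, count_occurrences(library, suffix_array, tile)))
    return coverage


if __name__ == "__main__":
    reads = ["TTACGTAC", "CGTACGTG", "GGGGGG"]
    for start, hits in tile_coverage("ACGTACGTGG", reads):
        print(start, hits)
```

    Searching the reference tiles against an indexed read library, rather than aligning every read to a genome, is the inversion the abstract describes; the per-tile counts are what would be plotted along the target sequence.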

  8. Computer-aided visualization and analysis system for sequence evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  9. Consistency and reproducibility of next-generation sequencing and other multigene mutational assays: A worldwide ring trial study on quantitative cytological molecular reference specimens.

    Science.gov (United States)

    Malapelle, Umberto; Mayo-de-Las-Casas, Clara; Molina-Vila, Miguel A; Rosell, Rafael; Savic, Spasenija; Bihl, Michel; Bubendorf, Lukas; Salto-Tellez, Manuel; de Biase, Dario; Tallini, Giovanni; Hwang, David H; Sholl, Lynette M; Luthra, Rajyalakshmi; Weynand, Birgit; Vander Borght, Sara; Missiaglia, Edoardo; Bongiovanni, Massimo; Stieber, Daniel; Vielh, Philippe; Schmitt, Fernando; Rappa, Alessandra; Barberis, Massimo; Pepe, Francesco; Pisapia, Pasquale; Serra, Nicola; Vigliar, Elena; Bellevicine, Claudio; Fassan, Matteo; Rugge, Massimo; de Andrea, Carlos E; Lozano, Maria D; Basolo, Fulvio; Fontanini, Gabriella; Nikiforov, Yuri E; Kamel-Reid, Suzanne; da Cunha Santos, Gilda; Nikiforova, Marina N; Roy-Chowdhuri, Sinchita; Troncone, Giancarlo

    2017-08-01

    Molecular testing of cytological lung cancer specimens includes, beyond epidermal growth factor receptor (EGFR), emerging predictive/prognostic genomic biomarkers such as Kirsten rat sarcoma viral oncogene homolog (KRAS), neuroblastoma RAS viral [v-ras] oncogene homolog (NRAS), B-Raf proto-oncogene, serine/threonine kinase (BRAF), and phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit α (PIK3CA). Next-generation sequencing (NGS) and other multigene mutational assays are suitable for cytological specimens, including smears. However, the current literature reflects single-institution studies rather than multicenter experiences. Quantitative cytological molecular reference slides were produced with cell lines designed to harbor concurrent mutations in the EGFR, KRAS, NRAS, BRAF, and PIK3CA genes at various allelic ratios, including low allele frequencies (AFs; 1%). This interlaboratory ring trial study included 14 institutions across the world that performed multigene mutational assays, from tissue extraction to data analysis, on these reference slides, with each laboratory using its own mutation analysis platform and methodology. All laboratories using NGS (n = 11) successfully detected the study's set of mutations with minimal variations in the means and standard errors of variant fractions at dilution points of 10% (P = .171) and 5% (P = .063) despite the use of different sequencing platforms (Illumina, Ion Torrent/Proton, and Roche). However, when mutations at a low AF of 1% were analyzed, the concordance of the NGS results was low, and this reflected the use of different thresholds for variant calling among the institutions. In contrast, laboratories using matrix-assisted laser desorption/ionization-time of flight (n = 2) showed lower concordance in terms of mutation detection and mutant AF quantification. Quantitative molecular reference slides are a useful tool for monitoring the performance of different multigene mutational

  10. Describing, Instantiating and Evaluating a Reference Architecture : A Case Study

    NARCIS (Netherlands)

    Avgeriou, Paris

    2003-01-01

    The result of a domain maturing is the emergence of reference architectures that offer numerous advantages to software architects and other stakeholders. However there is no straightforward way to describe a reference architecture and in sequence to design instances for specific systems, while at

  11. LPTAU, Quasi Random Sequence Generator

    International Nuclear Information System (INIS)

    Sobol, Ilya M.

    1993-01-01

    1 - Description of program or function: LPTAU generates quasi random sequences. These are uniformly distributed sets of L=M N points in the N-dimensional unit cube: I^N = [0,1] x ... x [0,1]. These sequences are used as nodes for multidimensional integration; as searching points in global optimization; as trial points in multi-criteria decision making; as quasi-random points for quasi Monte Carlo algorithms. 2 - Method of solution: Uses LP-TAU sequence generation (see references). 3 - Restrictions on the complexity of the problem: The number of points that can be generated is L 30 . The dimension of the space cannot exceed 51
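
    To make the use of quasi-random points as integration nodes concrete, here is a minimal sketch. It does not reproduce LPTAU or the LP-tau (Sobol') construction; it swaps in the simpler Halton sequence, another low-discrepancy sequence, purely for illustration. The function names and the choice of bases are assumptions.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index
    around the radix point, giving a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(count, bases=(2, 3)):
    """First `count` Halton points in the unit cube, one prime base per axis."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, count + 1)]


def integrate(func, count, bases=(2, 3)):
    """Quasi-Monte Carlo estimate: average func over the Halton nodes."""
    points = halton_points(count, bases)
    return sum(func(*p) for p in points) / count


if __name__ == "__main__":
    # The exact integral of x*y over the unit square is 1/4.
    print(integrate(lambda x, y: x * y, 1024))
```

    The same averaging loop would work with any generator of uniformly distributed points in the unit cube; only the point source changes.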

  12. Single-Shell Tank (SST) Retrieval Sequence Fiscal Year 2000 Update

    International Nuclear Information System (INIS)

    GARFIELD, J.S.

    2000-01-01

    This document describes the baseline single-shell tank (SST) waste retrieval sequence for the River Protection Project (RPP) updated for Fiscal Year 2000. The SST retrieval sequence identifies the proposed retrieval order (sequence), the tank selection and prioritization rationale, and planned retrieval dates for Hanford SSTs. In addition, the tank selection criteria and reference retrieval method for this sequence are discussed

  13. RNA-sequence data normalization through in silico prediction of reference genes: the bacterial response to DNA damage as case study.

    Science.gov (United States)

    Berghoff, Bork A; Karlsson, Torgny; Källman, Thomas; Wagner, E Gerhart H; Grabherr, Manfred G

    2017-01-01

    Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or that the effects of up- and down-regulated genes cancel each other out during the normalization. Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) to subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/. The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data

  14. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes

    DEFF Research Database (Denmark)

    Albertsen, Mads; Hugenholtz, Philip; Skarshewski, Adam

    2013-01-01

    Reference genomes are required to understand the diverse roles of microorganisms in ecology, evolution, human and animal health, but most species remain uncultured. Here we present a sequence composition–independent approach to recover high-quality microbial genomes from deeply sequenced...

  15. Low-bandwidth and non-compute intensive remote identification of microbes from raw sequencing reads.

    Directory of Open Access Journals (Sweden)

    Laurent Gautier

    Cheap DNA sequencing may soon become routine not only for human genomes but also for practically anything requiring the identification of living organisms from their DNA: tracking of infectious agents, control of food products, bioreactors, or environmental samples. We propose a novel general approach to the analysis of sequencing data where a reference genome does not have to be specified. Using a distributed architecture we are able to query a remote server for hints about what the reference might be, transferring a relatively small amount of data. Our system consists of a server with known reference DNA indexed, and a client with raw sequencing reads. The client sends a sample of unidentified reads, and in return receives a list of matching references. Sequences for the references can be retrieved and used for exhaustive computation on the reads, such as alignment. To demonstrate this approach we have implemented a web server, indexing tens of thousands of publicly available genomes and genomic regions from various organisms and returning lists of matching hits from query sequencing reads. We have also implemented two clients: one running in a web browser, and one as a python script. Both can handle a large number of sequencing reads and, from portable devices (the browser-based client running on a tablet), perform their task within seconds and consume an amount of bandwidth compatible with mobile broadband networks. Such client-server approaches could develop in the future, allowing a fully automated processing of sequencing data and routine instant quality check of sequencing runs from desktop sequencers. A web access is available at http://tapir.cbs.dtu.dk. The source code for a python command-line client, a server, and supplementary data are available at http://bit.ly/1aURxkc.

  16. Low-Bandwidth and Non-Compute Intensive Remote Identification of Microbes from Raw Sequencing Reads

    Science.gov (United States)

    Gautier, Laurent; Lund, Ole

    2013-01-01

    Cheap DNA sequencing may soon become routine not only for human genomes but also for practically anything requiring the identification of living organisms from their DNA: tracking of infectious agents, control of food products, bioreactors, or environmental samples. We propose a novel general approach to the analysis of sequencing data where a reference genome does not have to be specified. Using a distributed architecture we are able to query a remote server for hints about what the reference might be, transferring a relatively small amount of data. Our system consists of a server with known reference DNA indexed, and a client with raw sequencing reads. The client sends a sample of unidentified reads, and in return receives a list of matching references. Sequences for the references can be retrieved and used for exhaustive computation on the reads, such as alignment. To demonstrate this approach we have implemented a web server, indexing tens of thousands of publicly available genomes and genomic regions from various organisms and returning lists of matching hits from query sequencing reads. We have also implemented two clients: one running in a web browser, and one as a python script. Both can handle a large number of sequencing reads and, from portable devices (the browser-based client running on a tablet), perform their task within seconds and consume an amount of bandwidth compatible with mobile broadband networks. Such client-server approaches could develop in the future, allowing a fully automated processing of sequencing data and routine instant quality check of sequencing runs from desktop sequencers. A web access is available at http://tapir.cbs.dtu.dk. The source code for a python command-line client, a server, and supplementary data are available at http://bit.ly/1aURxkc. PMID:24391826
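
    The exchange described here (the client sends a small sample of reads, the server answers with matching references) can be mocked in a few lines. This is a toy, in-process sketch: the abstract does not say how the server indexes references, so the exact k-mer lookup, the value of k, the sample size, and every function name below are assumptions, and no network transfer is modelled.

```python
import random
from collections import Counter


def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """'Server' side (hypothetical): map each k-mer to the references containing it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def sample_reads(reads, n, seed=0):
    """'Client' side: pick a small random subset of reads to send."""
    rng = random.Random(seed)
    return rng.sample(reads, min(n, len(reads)))


def match_references(index, sampled_reads, k):
    """'Server' side: rank references by how many sampled reads share a k-mer."""
    hits = Counter()
    for read in sampled_reads:
        matched = set()
        for kmer in kmers(read, k):
            matched |= index.get(kmer, set())
        hits.update(matched)  # one vote per read per reference
    return hits.most_common()


if __name__ == "__main__":
    references = {"refA": "ACGTACGTTAGC", "refB": "GGGCCCAAATTT"}
    reads = ["ACGTACG", "CGTTAGC", "GTACGTT"]
    index = build_index(references, k=5)
    print(match_references(index, sample_reads(reads, 2), k=5))
```

    Only the sampled reads would cross the network in such a design, which is what keeps the bandwidth low; the full read set stays on the client for the later exhaustive step such as alignment.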

  17. Genetics Home Reference: hereditary diffuse gastric cancer

    Science.gov (United States)

    ... Health Conditions Hereditary diffuse gastric cancer ... Diffuse Gastric Cancer MedlinePlus Encyclopedia: Gastric Cancer National Cancer ... Option Overview General Information from MedlinePlus ( ...

  18. The "most wanted" taxa from the human microbiome for whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Anthony A Fodor

    The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP's 16S data sets to several reference 16S collections to create a 'most wanted' list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, few novel taxa are represented by these OTUs and many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the 'most wanted', and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the 'most wanted' organisms are poorly represented among culture collections, suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.

  19. Locating and parsing bibliographic references in HTML medical articles.

    Science.gov (United States)

    Zou, Jie; Le, Daniel; Thoma, George R

    2010-06-01

    The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level.

  20. Systematic review of serum steroid reference intervals developed using mass spectrometry.

    Science.gov (United States)

    Tavita, Nevada; Greaves, Ronda F

    2017-12-01

    The aim of this study was to perform a systematic review of the published literature to determine the available serum/plasma steroid reference intervals generated by mass spectrometry (MS) methods across all age groups in healthy subjects and to suggest recommendations to achieve common MS based reference intervals for serum steroids. MEDLINE, EMBASE and PubMed databases were used to conduct a comprehensive search for English language, MS-based reference interval studies for serum/plasma steroids. Selection of steroids to include was based on those listed in the Royal College of Pathologists of Australasia Quality Assurance Programs, Chemical Pathology, Endocrine Program. This methodology has been registered onto the PROSPERO International prospective register of systematic reviews (ID number: CRD42015029637). After accounting for duplicates, a total of 60 manuscripts were identified through the search strategy. Following critical evaluation, a total of 16 studies were selected. Of the 16 studies, 12 reported reference intervals for testosterone, 11 for 17 hydroxy-progesterone, nine for androstenedione, six for cortisol, three for progesterone, two for dihydrotestosterone and only one for aldosterone and dehydroepiandrosterone sulphate. No studies established MS-based reference intervals for oestradiol. As far as we are aware, this report provides the first comparison of the peer reviewed literature for serum/plasma steroid reference intervals generated by MS-based methods. The reference intervals based on these published studies can be used to inform the process to develop common reference intervals, and agreed reporting units for mass spectrometry based steroid methods. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  1. Experience of targeted Usher exome sequencing as a clinical test

    Science.gov (United States)

    Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2014-01-01

    We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes, which is laborious to achieve with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered a pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627

  2. [Limiting a Medline/PubMed query to the "best" articles using the JCR relative impact factor].

    Science.gov (United States)

    Avillach, P; Kerdelhué, G; Devos, P; Maisonneuve, H; Darmoni, S J

    2014-12-01

    Medline/PubMed is the most frequently used medical bibliographic research database. The aim of this study was to propose a new generic method to limit any Medline/PubMed query based on the relative impact factor and the A & B categories of the SIGAPS score. The entire PubMed corpus was used for the feasibility study, followed by ten frequent diseases in terms of PubMed indexing and the citations of four Nobel prize winners. The relative impact factor (RIF) was calculated by medical specialty as defined in Journal Citation Reports. The two queries, which included all the journals in category A (or A OR B), were added to any Medline/PubMed query as a central point of the feasibility study. Limitation using the SIGAPS category A was larger than when using the Core Clinical Journals (CCJ): 15.65% of the PubMed corpus vs 8.64% for CCJ. The response time of this limit applied to the entire PubMed corpus was less than two seconds. For five diseases out of ten, limiting the citations with the RIF was more effective than with the CCJ. For the four Nobel prize winners, limiting the citations with the RIF was more effective than with the CCJ. The feasibility study to apply a new filter based on the relative impact factor on any Medline/PubMed query was positive. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  3. Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae)

    Directory of Open Access Journals (Sweden)

    Samuel E. Fox

    2013-03-01

    Full Text Available Premise of the study: We report the de novo assembly and characterization of the transcriptomes of Brachypodium sylvaticum (slender false-brome) accessions from native populations of Spain and Greece, and an invasive population west of Corvallis, Oregon, USA. Methods and Results: More than 350 million sequence reads from the mRNA libraries prepared from three B. sylvaticum genotypes were assembled into 120,091 (Corvallis), 104,950 (Spain), and 177,682 (Greece) transcript contigs. In comparison with the B. distachyon Bd21 reference genome and GenBank protein sequences, we estimate >90% exome coverage for B. sylvaticum. The transcripts were assigned Gene Ontology and InterPro annotations. Brachypodium sylvaticum sequence reads aligned against the Bd21 genome revealed 394,654 single-nucleotide polymorphisms (SNPs) and >20,000 simple sequence repeat (SSR) DNA sites. Conclusions: To our knowledge, this is the first report of transcriptome sequencing of invasive plant species with a closely related sequenced reference genome. The sequences and identified SNP variant and SSR sites will provide tools for developing novel genetic markers for use in genotyping and characterization of invasive behavior of B. sylvaticum.

  4. Effect of endogenous reference genes on digital PCR assessment of genetically engineered canola events

    Directory of Open Access Journals (Sweden)

    Tigst Demeke

    2018-05-01

    Full Text Available Droplet digital PCR (ddPCR) has been used for absolute quantification of genetically engineered (GE) events. Absolute quantification of GE events by duplex ddPCR requires the use of appropriate primers and probes for target and reference gene sequences in order to accurately determine the amount of GE materials. Single copy reference genes are generally preferred for absolute quantification of GE events by ddPCR. No study has been conducted comparing reference genes for absolute quantification of GE canola events by ddPCR. The suitability of four endogenous reference sequences (HMG-I/Y, FatA(A), CruA and Ccf) for absolute quantification of GE canola events by ddPCR was investigated. The effect of DNA extraction methods and DNA quality on the assessment of reference gene copy numbers was also investigated. ddPCR results were affected by the use of single vs. two copy reference genes. The single copy, FatA(A), reference gene was found to be stable and suitable for absolute quantification of GE canola events by ddPCR. For the copy numbers measured, the HMG-I/Y reference gene was less consistent than the FatA(A) reference gene. The expected ddPCR values were underestimated when CruA and Ccf (two copy endogenous Cruciferin sequences) were used because of the high number of copies. It is important to make an adjustment if two copy reference genes are used for ddPCR in order to obtain accurate results. On the other hand, real-time quantitative PCR results were not affected by the use of single vs. two copy reference genes. Keywords: Canola, Digital PCR, DNA extraction, GMO, Reference genes

  5. A Review of Published Articles in the Field of Biomedical Nanotechnology in Medline Database during 2000-2010

    OpenAIRE

    Peyman Sheikhzade

    2015-01-01

    Background and objectives: Nanotechnology is a new technology that has been increasingly used over the past decade. Because of its significance, governments tend to invest heavily in nanotechnology research and development across various sectors. The purpose of this study was to determine the status of biomedical nanotechnology publications over the past ten years (2000-2010) in Medline/PubMed. Material and Methods: This was a descriptive study. The Medline database wa...

  6. JVM: Java Visual Mapping tool for next generation sequencing read.

    Science.gov (United States)

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed JVM (Java Visual Mapping), a program for mapping next-generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to handle millions of short reads generated with Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read data sets from 1 MB to 10 GB.
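
    The seed index idea mentioned in this abstract can be illustrated with a minimal sketch. This is not JVM's code: JVM is written in Java, and its octal encoding is not reproduced here. The function names `build_seed_index` and `map_read`, the seed length, and the mismatch limit are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Map every k-mer (seed) of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return sorted reference positions where the read aligns without gaps."""
    hits = set()
    # Try every seed of the read, so a mismatch inside one seed
    # does not hide an otherwise acceptable alignment.
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for seed_pos in index.get(seed, []):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


reference = "ACGTACGTTAGCCGATAGGCTA"  # hypothetical toy reference
index = build_seed_index(reference, k=4)
exact_hits = map_read("TAGCCGAT", reference, index, k=4)
one_mismatch_hits = map_read("TAGCAGAT", reference, index, k=4)
```

    The index is built once and reused for every read, which is the reason seed-based mappers scale to large read sets; the verification step after each seed hit is what tolerates sequencing errors.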

  7. Impact of SciELO and MEDLINE indexing on submissions to Jornal de Pediatria

    Directory of Open Access Journals (Sweden)

    Danilo Blank

    2005-12-01

    Full Text Available OBJECTIVE: To evaluate the impact of SciELO and MEDLINE indexing on the number of articles submitted to Jornal de Pediatria. METHODS: Analysis of total article submission, submission of articles from foreign countries and acceptance figures in the following periods: stage I - pre-website (Jan 2000-Mar 2001); stage II - website (Apr 2001-Jul 2002); stage III - SciELO (Aug 2002-Aug 2003); stage IV - MEDLINE (Sep 2003-Dec 2004). RESULTS: There was a significant trend toward linear increase in the number of submissions along the study period (p = 0.009). The number of manuscripts submitted in stages I through IV was 184, 240, 297, and 482, respectively. The number of submissions was similar in stages I and II (p = 0.148), but statistically higher in stage III (p < 0.001 vs. stage I and p = 0.006 vs. stage II) and stage IV (p < 0.001 vs. stages I and II, and p < 0.05 vs. stage III). The rate of article acceptance decreased during the study period. The number of original articles published has been stable since the 2001 March/April issue (n = 10), when the journal reached a printed page limit, leading to stricter judgment criteria and a relative decrease in acceptance

  8. Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies.

    Science.gov (United States)

    Card, Daren C; Schield, Drew R; Reyes-Velasco, Jacobo; Fujita, Matthew K; Andrew, Audra L; Oyler-McCance, Sara J; Fike, Jennifer A; Tomback, Diana F; Ruggiero, Robert P; Castoe, Todd A

    2014-01-01

    As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5-5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

  9. Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies

    Science.gov (United States)

    Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

    2014-01-01

    As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (~3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

  10. Genetics Home Reference: Naegeli-Franceschetti-Jadassohn syndrome/dermatopathia pigmentosa reticularis

    Science.gov (United States)

    ... reticularis ( NFJS/DPR ) represents a rare type of ectodermal dysplasia, a group of about 150 conditions characterized by ... Related Skin Types (FIRST): Palmoplantar Keratodermas MedlinePlus Encyclopedia: Ectodermal Dysplasia MedlinePlus Encyclopedia: Nail Abnormalities General Information from MedlinePlus ( ...

  11. A Snapshot of the Emerging Tomato Genome Sequence

    Directory of Open Access Journals (Sweden)

    Lukas A. Mueller

    2009-03-01

    Full Text Available The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial chromosome (BAC) approach to generate a high-quality tomato euchromatic genome sequence for use as a reference genome for the Solanaceae and euasterids. Sequence is deposited at GenBank and at the SOL Genomics Network (SGN). Currently, there are around 1000 BACs finished or in progress, representing more than a third of the projected euchromatic portion of the genome. An annotation effort is also underway by the International Tomato Annotation Group. The expected number of genes in the euchromatin is ∼40,000, based on an estimate from a preliminary annotation of 11% of finished sequence. Here, we present this first snapshot of the emerging tomato genome and its annotation, a short comparison with potato (Solanum tuberosum L.) sequence data, and the tools available for researchers to exploit this new resource. In the future, whole-genome shotgun techniques will be combined with the BAC-by-BAC approach to cover the entire tomato genome. The high-quality reference euchromatic tomato sequence is expected to be near completion by 2010.

  12. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  13. Effect of endogenous reference genes on digital PCR assessment of genetically engineered canola events.

    Science.gov (United States)

    Demeke, Tigst; Eng, Monika

    2018-05-01

    Droplet digital PCR (ddPCR) has been used for absolute quantification of genetically engineered (GE) events. Absolute quantification of GE events by duplex ddPCR requires the use of appropriate primers and probes for target and reference gene sequences in order to accurately determine the amount of GE materials. Single copy reference genes are generally preferred for absolute quantification of GE events by ddPCR. No study has been conducted comparing reference genes for absolute quantification of GE canola events by ddPCR. The suitability of four endogenous reference sequences (HMG-I/Y, FatA(A), CruA and Ccf) for absolute quantification of GE canola events by ddPCR was investigated. The effect of DNA extraction methods and DNA quality on the assessment of reference gene copy numbers was also investigated. ddPCR results were affected by the use of single vs. two copy reference genes. The single copy, FatA(A), reference gene was found to be stable and suitable for absolute quantification of GE canola events by ddPCR. For the copy numbers measured, the HMG-I/Y reference gene was less consistent than the FatA(A) reference gene. The expected ddPCR values were underestimated when CruA and Ccf (two copy endogenous Cruciferin sequences) were used because of the high number of copies. It is important to make an adjustment if two copy reference genes are used for ddPCR in order to obtain accurate results. On the other hand, real-time quantitative PCR results were not affected by the use of single vs. two copy reference genes.
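
    The adjustment for a two copy reference gene described in this abstract comes down to simple arithmetic, sketched below. The abstract does not give the authors' formula, so this sketch assumes GE content is expressed as event copies per haploid genome equivalent, with genome equivalents obtained by dividing the measured reference copies by the reference gene's copy number. The function name `ge_content_percent` and the copy counts are hypothetical.

```python
def ge_content_percent(target_copies, reference_copies, reference_copies_per_genome=1):
    """Percent GE content as target copies per haploid genome equivalent.

    Assumption: genome equivalents = measured reference copies divided by
    the number of reference gene copies per haploid genome.
    """
    if reference_copies <= 0 or reference_copies_per_genome <= 0:
        raise ValueError("copy counts must be positive")
    genome_equivalents = reference_copies / reference_copies_per_genome
    return 100.0 * target_copies / genome_equivalents


# Hypothetical droplet counts: 50 event copies against 10,000 reference copies.
unadjusted = ge_content_percent(50, 10_000)     # treats the reference as single copy
adjusted = ge_content_percent(50, 10_000, 2)    # accounts for a two copy reference
```

    With these illustrative numbers the unadjusted value is half the adjusted one, which matches the direction of the underestimation the abstract reports for two copy reference sequences.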

  14. A hybrid reference-guided de novo assembly approach for generating Cyclospora mitochondrion genomes.

    Science.gov (United States)

    Gopinath, G R; Cinar, H N; Murphy, H R; Durigan, M; Almeria, M; Tall, B D; DaSilva, A J

    2018-01-01

    Cyclospora cayetanensis is a coccidian parasite associated with large and complex foodborne outbreaks worldwide. Linking samples from cyclosporiasis patients during foodborne outbreaks with suspected contaminated food sources, using conventional epidemiological methods, has been a persistent challenge. To address this issue, development of new methods based on potential genomically-derived markers for strain-level identification has been a priority for the food safety research community. The absence of reference genomes to identify nucleotide and structural variants with a high degree of confidence has limited the application of using sequencing data for source tracking during outbreak investigations. In this work, we determined the quality of a high resolution, curated, public mitochondrial genome assembly to be used as a reference genome by applying bioinformatic analyses. Using this reference genome, three new mitochondrial genome assemblies were built starting with metagenomic reads generated by sequencing DNA extracted from oocysts present in stool samples from cyclosporiasis patients. Nucleotide variants were identified in the new and other publicly available genomes in comparison with the mitochondrial reference genome. A consolidated workflow, presented here, to generate new mitochondrion genomes using our reference-guided de novo assembly approach could be useful in facilitating the generation of other mitochondrion sequences, and in their application for subtyping C. cayetanensis strains during foodborne outbreak investigations.

  15. Logic verification system for power plant sequence diagrams

    International Nuclear Information System (INIS)

    Fukuda, Mitsuko; Yamada, Naoyuki; Teshima, Toshiaki; Kan, Ken-ichi; Utsunomiya, Mitsugu.

    1994-01-01

    A logic verification system for sequence diagrams of power plants has been developed. The system's main function is to verify correctness of the logic realized by sequence diagrams for power plant control systems. The verification is based on a symbolic comparison of the logic of the sequence diagrams with the logic of the corresponding IBDs (Interlock Block Diagrams) in combination with reference to design knowledge. The developed system points out the sub-circuit which is responsible for any existing mismatches between the IBD logic and the logic realized by the sequence diagrams. Applications to the verification of actual sequence diagrams of power plants confirmed that the developed system is practical and effective. (author)

  16. Diagnostic reference levels in digital mammography: a systematic review

    International Nuclear Information System (INIS)

    Suleiman, Moayyad E.; Brennan, Patrick C.; McEntee, Mark F.

    2015-01-01

    This study aims to review the literature on existing diagnostic reference levels (DRLs) in digital mammography and methodologies for establishing them. To this end, a systematic search through Medline, Cinahl, Web of Science, Scopus and Google Scholar was conducted using search terms extracted from three terms: DRLs, digital mammography and breast screen. The search resulted in 1539 articles of which 22 were included after a screening process. Relevant data from the included studies were summarised and analysed. Differences were found in the methods utilised to establish DRLs including test subject types, protocols followed, conversion factors employed, breast compressed thicknesses and percentile values adopted. These differences complicate comparison of DRLs among countries; hence, an internationally accepted protocol would be valuable so that international comparisons can be made. (authors)
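
    Because the abstract names the adopted percentile as one source of variation between DRLs, a small sketch may help show why that choice matters. It assumes a DRL is taken as a chosen percentile of a dose distribution and uses the nearest-rank percentile definition; the dose values, the function name `nearest_rank_percentile`, and the two percentiles compared are illustrative assumptions, not data or methods from the reviewed studies.

```python
import math


def nearest_rank_percentile(values, percentile):
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical mean glandular doses (mGy) from eight facilities.
doses = [1.2, 1.5, 1.1, 1.8, 2.0, 1.4, 1.6, 1.3]
drl_75 = nearest_rank_percentile(doses, 75)
drl_95 = nearest_rank_percentile(doses, 95)
```

    On the same hypothetical data the two percentiles give different reference levels, so values reported under different percentile conventions are not directly comparable.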

  17. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.

    Science.gov (United States)

    Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron

    2012-02-01

    Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.
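
    The minimum depths quoted in this abstract can be turned into read counts with the usual average-depth relation, depth = (number of reads × read length) / target length. The sketch below assumes that relation and counts only reads that derive from the target itself; in a genome skimming run the plastome or rDNA fraction of all reads would also have to be estimated. The 160,000 bp target length and 100 bp read length are illustrative values, not figures from the study.

```python
import math


def mean_depth(num_reads, read_length, target_length):
    """Average sequencing depth over a target of the given length."""
    return num_reads * read_length / target_length


def reads_needed(target_depth, target_length, read_length):
    """Smallest number of on-target reads reaching the requested average depth."""
    return math.ceil(target_depth * target_length / read_length)


# Hypothetical plastome of 160,000 bp sequenced with 100 bp reads at 30x.
plastome_reads = reads_needed(30, 160_000, 100)
achieved_depth = mean_depth(plastome_reads, 100, 160_000)
```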

  18. Iterative normalization technique for reference sequence generation for zero-tail discrete fourier transform spread orthogonal frequency division multiplexing

    DEFF Research Database (Denmark)

    2017-01-01

    , and performing an iterative manipulation of the input sequence. The performing of the iterative manipulation of the input sequence may include, for example: computing frequency domain response of the sequence, normalizing elements of the computed frequency domain sequence to unitary power while maintaining phase...

  19. Harnessing cross-species alignment to discover SNPs and generate a draft genome sequence of a bighorn sheep (Ovis canadensis).

    Science.gov (United States)

    Miller, Joshua M; Moore, Stephen S; Stothard, Paul; Liao, Xiaoping; Coltman, David W

    2015-05-20

    Whole genome sequences (WGS) have proliferated as sequencing technology continues to improve and costs decline. While many WGS of model or domestic organisms have been produced, a growing number of non-model species are also being sequenced. In the absence of a reference, construction of a genome sequence necessitates de novo assembly which may be beyond the ability of many labs due to the large volumes of raw sequence data and extensive bioinformatics required. In contrast, the presence of a reference WGS allows for alignment which is more tractable than assembly. Recent work has highlighted that the reference need not come from the same species, potentially enabling a wide array of species WGS to be constructed using cross-species alignment. Here we report on the creation of a draft WGS from a single bighorn sheep (Ovis canadensis) using alignment to the closely related domestic sheep (Ovis aries). Two sequencing libraries on SOLiD platforms yielded over 865 million reads, and combined alignment to the domestic sheep reference resulted in a nearly complete sequence (95% coverage of the reference) at an average of 12x read depth (104 SD). From this we discovered over 15 million variants and annotated them relative to the domestic sheep reference. We then conducted an enrichment analysis of those SNPs showing fixed differences between the reference and sequenced individual and found significant differences in a number of gene ontology (GO) terms, including those associated with reproduction, muscle properties, and bone deposition. Our results demonstrate that cross-species alignment enables the creation of novel WGS for non-model organisms. The bighorn sheep WGS will provide a resource for future resequencing studies or comparative genomics.

  20. Genetics Home Reference: Müllerian aplasia and hyperandrogenism

    Science.gov (United States)

    ... do not begin menstruation by age 16 (primary amenorrhea) and will likely never have a menstrual period. ... Encyclopedia: Ovarian Overproduction of Androgens MedlinePlus Encyclopedia: Primary Amenorrhea General Information from MedlinePlus (5 links) Diagnostic Tests ...

  1. A platform-independent method for detecting errors in metagenomic sequencing data: DRISEE.

    Directory of Open Access Journals (Sweden)

    Kevin P Keegan

    Full Text Available We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non-amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
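
    The general idea of estimating positional error from duplicate reads can be sketched as follows. This is a simplified illustration and not the DRISEE implementation: it assumes that reads sharing an identical prefix are artifactual duplicates, takes the most common base at each position as the consensus, and reports the fraction of bases that disagree with it. The function name, the prefix length, and the minimum bin size are assumptions.

```python
from collections import Counter, defaultdict


def duplicate_read_error_profile(reads, prefix_len=5, min_bin_size=3):
    """Per-position mismatch rate against the consensus of each duplicate bin."""
    # Bin reads by identical prefix; each bin is treated as one set of duplicates.
    bins = defaultdict(list)
    for read in reads:
        if len(read) >= prefix_len:
            bins[read[:prefix_len]].append(read)

    mismatches = Counter()
    observed = Counter()
    for group in bins.values():
        if len(group) < min_bin_size:
            continue  # too few duplicates to form a meaningful consensus
        length = min(len(read) for read in group)
        for pos in range(length):
            column = [read[pos] for read in group]
            consensus, _ = Counter(column).most_common(1)[0]
            observed[pos] += len(column)
            mismatches[pos] += sum(1 for base in column if base != consensus)
    return {pos: mismatches[pos] / observed[pos] for pos in sorted(observed)}


# Hypothetical toy reads: four duplicates plus one unrelated read.
toy_reads = [
    "ACGTACGTAA",
    "ACGTACGTAA",
    "ACGTACGTCA",
    "ACGTACGAAA",
    "TTTTTGGGGG",
]
profile = duplicate_read_error_profile(toy_reads)
```

    No reference genome or quality score enters the calculation, which is the property the abstract emphasises; the per-position rates could then inform where to trim reads.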

  2. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  3. Usefulness of systematic review search strategies in finding child health systematic reviews in MEDLINE

    NARCIS (Netherlands)

    Boluyt, Nicole; Tjosvold, Lisa; Lefebvre, Carol; Klassen, Terry P.; Offringa, Martin

    2008-01-01

    OBJECTIVE: To determine the sensitivity and precision of existing search strategies for retrieving child health systematic reviews in MEDLINE using PubMed. DESIGN: Filter (diagnostic) accuracy study. We identified existing search strategies for systematic reviews, combined them with a filter that

  4. Efficient alignment of pyrosequencing reads for re-sequencing applications

    Directory of Open Access Journals (Sweden)

    Russo Luis MS

    2011-05-01

    Full Text Available Background: Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, these techniques also raise important computational difficulties mostly due to the huge volume of data produced, but also because of some of their specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects. Results: We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach explores the characteristics of the data in these re-sequencing applications and uses state of the art indexing techniques combined with a flexible seed-based approach, leading to a fast and accurate algorithm which needs very little user parameterization. An evaluation performed using real and simulated data shows that our proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on the execution time. Conclusions: The proposed methodology was implemented in a software tool called TAPyR--Tool for the Alignment of Pyrosequencing Reads--which is publicly available from http://www.tapyr.net.

  5. Plann: A command-line application for annotating plastome sequences.

    Science.gov (United States)

    Huang, Daisie I; Cronk, Quentin C B

    2015-08-01

    Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann's output can be used in the National Center for Biotechnology Information's tbl2asn to create a Sequin file for GenBank submission. Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.

  6. Bibliometric analysis of scientific production indexed in MEDLINE, about hospital based home care services

    Directory of Open Access Journals (Sweden)

    Javier Sanz-Valero

    2017-01-01

    Full Text Available Objective: To carry out a thematic and bibliometric analysis of the available scientific production on hospital-based home care services. Methods: Bibliometric analysis. Data were obtained from the MEDLINE database using the MeSH term “Home Care Services, Hospital-Based” as Major Topic. Search date: July 2016. The study sample size was calculated by estimating population parameters for an infinite population, and references were selected by simple random sampling without replacement. Results: A total of 386 references were analysed. There were 204 original articles (52.85%); 243 institutions were identified, with a cooperation index of 3.75±1.16 authors/article. English was the predominant language, used in 279 (72.28%) articles. Obsolescence was 13 years according to the Burton-Kebler Index, and the Price Index was 14.40%. The Bradford core comprised 23 journals. The thematic classification showed a relevance of 70.73%. Conclusions: There was high obsolescence and an anglophone orientation. There was also a weak relation between institutions and the cooperation index. Over time, access to the primary source improved, in line with the Open Access initiative. The production was dispersed across a large number of journals. The thematic classification matched the studied issue.
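
    The two obsolescence indicators reported in this abstract can be computed as sketched below, under the definitions commonly used in bibliometrics: the Burton-Kebler Index as the median age of the references, and the Price Index as the percentage of references no more than five years old. The abstract does not state the exact definitions the authors applied, so these are assumptions, and the publication years are hypothetical.

```python
from statistics import median


def reference_ages(publication_years, analysis_year):
    """Age of each reference in years at the time of analysis."""
    return [analysis_year - year for year in publication_years]


def burton_kebler_index(publication_years, analysis_year):
    """Median age of the references (assumed definition)."""
    return median(reference_ages(publication_years, analysis_year))


def price_index(publication_years, analysis_year, window=5):
    """Percentage of references aged `window` years or less (assumed definition)."""
    ages = reference_ages(publication_years, analysis_year)
    recent = sum(1 for age in ages if age <= window)
    return 100.0 * recent / len(ages)


# Hypothetical publication years for five references, analysed in 2016.
years = [2015, 2012, 2003, 1998, 1990]
bk = burton_kebler_index(years, 2016)
pi = price_index(years, 2016)
```

    A high median age together with a low share of recent references is the pattern the abstract summarises as high obsolescence.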

  7. Examining the role of MEDLINE as a patient care information resource: an analysis of data from the Value of Libraries study.

    Science.gov (United States)

    Dunn, Kathel; Marshall, Joanne Gard; Wells, Amber L; Backus, Joyce E B

    2017-10-01

    This study analyzed data from a study on the value of libraries to understand the specific role that the MEDLINE database plays in relation to other information resources that are available to health care providers and its role in positively impacting patient care. A previous study on the use of health information resources for patient care obtained 16,122 responses from health care providers in 56 hospitals about how providers make decisions affecting patient care and the role of information resources in that process. Respondents indicated resources used in answering a specific clinical question from a list of 19 possible resources, including MEDLINE. Study data were examined using descriptive statistics and regression analysis to determine the number of information resources used and how they were used in combination with one another. Health care professionals used 3.5 resources, on average, to aid in patient care. The 2 most frequently used resources were journals (print and online) and the MEDLINE database. Using a higher number of information resources was significantly associated with a higher probability of making changes to patient care and avoiding adverse events. MEDLINE was the most likely to be among consulted resources compared to any other information resource other than journals. MEDLINE is a critical clinical care tool that health care professionals use to avoid adverse events, make changes to patient care, and answer clinical questions.

  8. Accelerating plant DNA barcode reference library construction using herbarium specimens: improved experimental techniques.

    Science.gov (United States)

    Xu, Chao; Dong, Wenpan; Shi, Shuo; Cheng, Tao; Li, Changhao; Liu, Yanlei; Wu, Ping; Wu, Hongkun; Gao, Peng; Zhou, Shiliang

    2015-11-01

    A well-covered reference library is crucial for successful identification of species by DNA barcoding. The biggest difficulty in building such a reference library is the lack of materials of organisms. Herbarium collections are potentially an enormous resource of materials. In this study, we demonstrate that it is feasible to build such reference libraries using the reconstructed (self-primed PCR amplified) DNA from the herbarium specimens. We used 179 rosaceous specimens to test the effects of DNA reconstruction, 420 randomly sampled specimens to estimate the usable percentage and another 223 specimens of true cherries (Cerasus, Rosaceae) to test the coverage of usable specimens to the species. The barcodes rbcLb (the central four-sevenths of the rbcL gene) and matK were each amplified in two halves and sequenced on Roche GS 454 FLX+. DNA from the herbarium specimens was typically shorter than 300 bp. DNA reconstruction enabled amplification of fragments of 400-500 bp without introducing or inducing any sequence errors. About one-third of specimens in the national herbarium of China (PE) were proven usable after DNA reconstruction. The specimens in PE cover all Chinese true cherry species and 91.5% of vascular species listed in Flora of China. It is very possible to build well-covered reference libraries for DNA barcoding of vascular species in China. As exemplified in this study, DNA reconstruction and DNA-labelled next-generation sequencing can accelerate the construction of local reference libraries. By putting the local reference libraries together, a global library for DNA barcoding becomes closer to reality. © 2015 John Wiley & Sons Ltd.

  9. MR pulse sequences for selective relaxation time measurements: a phantom study

    DEFF Research Database (Denmark)

    Thomsen, C; Jensen, K E; Jensen, M

    1990-01-01

    a Siemens Magnetom wholebody magnetic resonance scanner operating at 1.5 Tesla was used. For comparison six imaging pulse sequences for relaxation time measurements were tested on the same phantom. The spectroscopic pulse sequences all had an accuracy better than 10% of the reference values....

  10. Nordic reference study on uncertainty and sensitivity analysis

    International Nuclear Information System (INIS)

    Hirschberg, S.; Jacobsson, P.; Pulkkinen, U.; Porn, K.

    1989-01-01

    This paper provides a review of the first phase of the Nordic reference study on uncertainty and sensitivity analysis. The main objective of this study is to use experiences from previous Nordic Benchmark Exercises and reference studies concerning critical modeling issues such as common cause failures and human interactions, and to demonstrate the impact of associated uncertainties on the uncertainty of the investigated accident sequence. This has been done independently by three working groups which used different approaches to modeling and to uncertainty analysis. The estimated uncertainty interval for the analyzed accident sequence is large. Also the discrepancies between the groups are substantial but can be explained. Sensitivity analyses which have been carried out concern e.g. use of different CCF-quantification models, alternative handling of CCF-data, time windows for operator actions and time dependences in phase mission operation, impact of state-of-knowledge dependences and ranking of dominating uncertainty contributors. Specific findings with respect to these issues are summarized in the paper.

  11. SIS: a program to generate draft genome sequence scaffolds for prokaryotes

    Directory of Open Access Journals (Sweden)

    Dias Zanoni

    2012-05-01

    Full Text Available Abstract Background Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes may result in incorrect scaffolds, if these rearrangements are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes. Even in draft genomes it is possible to detect the presence of inversions given sufficient sequencing coverage and a sufficiently close reference genome. Results We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, even though in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of sis, the program that implements the algorithm, to seven other scaffold-generating programs. The results of our tests show that sis has overall better performance. Conclusions sis is a new easy-to-use tool to generate contig scaffolds, available both as stand-alone and as a web server. The good performance of sis in our tests adds evidence that large

  12. Idiographic duo-trio tests using a constant-reference based on preference of each consumer: Sample presentation sequence in difference test can be customized for individual consumers to reduce error.

    Science.gov (United States)

    Kim, Min-A; Sim, Hye-Min; Lee, Hye-Seong

    2016-11-01

    As reformulations and processing changes are increasingly needed in the food industry to produce healthier, more sustainable, and cost effective products while maintaining superior quality, reliable measurements of consumers' sensory perception and discrimination are becoming more critical. Consumer discrimination methods using a preferred-reference duo-trio test design have been shown to be effective in improving the discrimination performance by customizing sample presentation sequences. However, this design can add complexity to the discrimination task for some consumers, resulting in more errors in sensory discrimination. The objective of the present study was to investigate the effects of different types of test instructions using the preference-reference duo-trio test design where a paired-preference test is followed by 6 repeated preferred-reference duo-trio tests, in comparison to the analytical method using the balanced-reference duo-trio. Analyses of d' estimates (product-related measure) and probabilistic sensory discriminators in momentary numbers of subjects showing statistical significance (subject-related measure) revealed that only preferred-reference duo-trio test using affective reference-framing, either by providing no information about the reference or information on a previously preferred sample, improved the sensory discrimination more than the analytical method. No decrease in discrimination performance was observed with any type of instruction, confirming that consumers could handle the test methods. These results suggest that when repeated tests are feasible, using the affective discrimination method would be operationally more efficient as well as ecologically more reliable for measuring consumers' sensory discrimination ability. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  14. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences

    Directory of Open Access Journals (Sweden)

    Robert C. Edgar

    2018-04-01

    Full Text Available Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal.
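
    The "lowest shared rank" quantity described in this abstract can be illustrated with a short sketch. It is not the author's implementation: the rank list, the dictionary representation of a lineage and the function names are assumptions made for illustration, and the example lineages are ordinary textbook taxonomy rather than data from the study.

```python
from collections import Counter

# Ranks ordered from highest to lowest (an assumed, conventional ordering).
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def lowest_shared_rank(tax_a, tax_b):
    """Return the lowest rank at which two lineages still agree, or None."""
    shared = None
    for rank in RANKS:
        name_a, name_b = tax_a.get(rank), tax_b.get(rank)
        if name_a is None or name_b is None or name_a != name_b:
            break  # lineages diverge (or are unannotated) from here down
        shared = rank
    return shared


def lowest_rank_probabilities(pairs):
    """Fraction of sequence pairs whose lowest shared rank is each rank.

    `pairs` is an iterable of (lineage, lineage) tuples that are assumed
    to have already been binned at one pair-wise identity level.
    """
    counts = Counter(lowest_shared_rank(a, b) for a, b in pairs)
    total = sum(counts.values())
    return {rank: n / total for rank, n in counts.items()}


ecoli = {
    "domain": "Bacteria", "phylum": "Pseudomonadota",
    "class": "Gammaproteobacteria", "order": "Enterobacterales",
    "family": "Enterobacteriaceae", "genus": "Escherichia",
    "species": "Escherichia coli",
}
salmonella = dict(ecoli, genus="Salmonella", species="Salmonella enterica")

probabilities = lowest_rank_probabilities([(ecoli, salmonella), (ecoli, ecoli)])
```

    Here the two lineages agree down to family and diverge at genus, so the first pair contributes to "family" and the identical pair to "species".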

  15. For Distinguished Public Service: Medical Library Association Honors FNLM and NIH MedlinePlus Magazine | NIH ...

    Science.gov (United States)

    ... Medical Library Association Honors FNLM and NIH MedlinePlus Magazine Past Issues / Summer 2011 Table of Contents MLA ... From You We want your feedback on the magazine and ideas for future issues, as well as ...

  16. Welcome from Library Director Donald A.B. Lindberg, M.D. | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... turn Javascript on. Welcome to the NIH MedlinePlus Magazine. Past Issues / Spring 2013 Table of Contents Donald ... about their efforts to cure disease. Lastly, the magazine's lively graphics, fun quizzes and practical tips have ...

  17. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    Directory of Open Access Journals (Sweden)

    Herring Christopher D

    2007-08-01

    Full Text Available Abstract Background With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.

  18. dinoref: A curated dinoflagellate (Dinophyceae) reference database for the 18S rRNA gene.

    Science.gov (United States)

    Mordret, Solenn; Piredda, Roberta; Vaulot, Daniel; Montresor, Marina; Kooistra, Wiebe H C F; Sarno, Diana

    2018-03-30

    Dinoflagellates are a heterogeneous group of protists present in all aquatic ecosystems where they occupy various ecological niches. They play a major role as primary producers, but many species are mixotrophic or heterotrophic. Environmental metabarcoding based on high-throughput sequencing is increasingly applied to assess diversity and abundance of planktonic organisms, and reference databases are needed to taxonomically assign the huge number of sequences. We provide an updated 18S rRNA reference database of dinoflagellates: dinoref. Sequences were downloaded from GenBank and filtered based on stringent quality criteria. All sequences were taxonomically curated, classified taking into account classical morphotaxonomic studies and molecular phylogenies, and linked to a series of metadata. dinoref includes 1,671 sequences representing 149 genera and 422 species. The taxonomic assignation of 468 sequences was revised. The largest number of sequences belongs to Gonyaulacales and Suessiales that include toxic and symbiotic species. dinoref provides an opportunity to test the level of taxonomic resolution of different 18S barcode markers based on a large number of sequences and species. As an example, when only the V4 region is considered, 374 of the 422 species included in dinoref can still be unambiguously identified. Clustering the V4 sequences at 98% similarity, a threshold that is commonly applied in metabarcoding studies, resulted in a considerable underestimation of species diversity. © 2018 John Wiley & Sons Ltd.

  19. Knowledge production status of Iranian researchers in the gastric cancer area: based on the medline database.

    Science.gov (United States)

    Ghojazadeh, Morteza; Naghavi-Behzad, Mohammad; Nasrolah-Zadeh, Raheleh; Bayat-Khajeh, Parvaneh; Piri, Reza; Mirnia, Keyvan; Azami-Aghdash, Saber

    2014-01-01

    Scientometrics is a useful method for management of financial and human resources and has been applied many times in medical sciences during recent years. The aim of this study was to investigate the status of science production by Iranian scientists in the gastric cancer field based on the Medline database. In this descriptive cross-sectional study, Iranian science production concerning gastric cancer during 2000-2011 was investigated based on Medline. After two stages of searching, 121 articles were found; we then reviewed publication date, authors' names, journal title, impact factor (IF), and cooperation coefficient between researchers. SPSS 19 was used for statistical analysis. There was a significant increase in published articles about gastric cancer by Iranian researchers in the Medline database during 2006-2011. Mean cooperation coefficient between researchers was 6.14±3.29 persons per article. Articles of this field were published in 19 countries and 56 journals. Those based in Thailand, England, and America had the most published Iranian articles. Tehran University of Medical Sciences and Mohammadreza Zali had the most outstanding role in publishing scientific articles. According to results of this study, improving cooperation of researchers in conducting research and scientometric studies about other fields may have an important role in increasing both quality and quantity of published studies.

  20. Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver

    Science.gov (United States)

    Blanquart, François; Golubchik, Tanya; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Croucher, Nicholas J; Hall, Matthew; Hillebregt, Mariska; Ratmann, Oliver; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle; Grabowski, M Kate; Gunsenheimer-Bartmeyer, Barbara; Günthard, Huldrych F; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Cornelissen, Marion; Kellam, Paul; Reiss, Peter

    2018-01-01

    Abstract Studying the evolution of viruses and their molecular epidemiology relies on accurate viral sequence data, so that small differences between similar viruses can be meaningfully interpreted. Despite its higher throughput and more detailed minority variant data, next-generation sequencing has yet to be widely adopted for HIV. The difficulty of accurately reconstructing the consensus sequence of a quasispecies from reads (short fragments of DNA) in the presence of large between- and within-host diversity, including frequent indels, may have presented a barrier. In particular, mapping (aligning) reads to a reference sequence leads to biased loss of information; this bias can distort epidemiological and evolutionary conclusions. De novo assembly avoids this bias by aligning the reads to themselves, producing a set of sequences called contigs. However contigs provide only a partial summary of the reads, misassembly may result in their having an incorrect structure, and no information is available at parts of the genome where contigs could not be assembled. To address these problems we developed the tool shiver to pre-process reads for quality and contamination, then map them to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences. Run with two commands per sample, it can easily be used for large heterogeneous data sets. We used shiver to reconstruct the consensus sequence and minority variant information from paired-end short-read whole-genome data produced with the Illumina platform, for sixty-five existing publicly available samples and fifty new samples. We show the systematic superiority of mapping to shiver’s constructed reference compared with mapping the same reads to the closest of 3,249 real references: median values of 13 bases called differently and more accurately, 0 bases called differently and less accurately, and 205 bases of missing sequence recovered. We also

  1. A Leader in Clinical Trials, Medical Data, & Electronic Health Information | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... of C. Everett Koop, the former Surgeon General; Charles Drew, the “father of the blood bank”; Rosalind ... MedlinePlus information resources about prescriptions. Soon, such information will be available through one click on the name ...

  2. The first draft reference genome of the American mink (Neovison vison)

    DEFF Research Database (Denmark)

    Cai, Zexi; Petersen, Bent; Sahana, Goutam

    2017-01-01

    The American mink (Neovison vison) is a semiaquatic species of mustelid native to North America. It’s an important animal for the fur industry. Many efforts have been made to locate genes influencing fur quality and color, but this search has been impeded by the lack of a reference genome. Here we...... present the first draft genome of mink. In our study, two mink individuals were sequenced by Illumina sequencing with 797 Gb sequence generated. Assembly yielded 7,175 scaffolds with an N50 of 6.3 Mb and length of 2.4 Gb including gaps. Repeat sequences constitute around 31% of the genome, which is lower...
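
    The N50 statistic quoted for the assembly can be computed as in the following minimal sketch. It uses the usual definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length); the function name and the toy lengths are illustrative, not taken from the mink assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one length")


# Toy example with made-up lengths.
toy_n50 = n50([2, 3, 4, 5, 6])
```

    For the toy lengths the total is 20, and the two longest scaffolds (6 and 5) already cover 11, so the N50 is 5.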

  3. Compression of FASTQ and SAM format sequencing data.

    Directory of Open Access Journals (Sweden)

    James K Bonfield

    Full Text Available Storage and transmission of the data produced by modern DNA sequencing instruments has become a major concern, which prompted the Pistoia Alliance to pose the SequenceSqueeze contest for compression of FASTQ files. We present several compression entries from the competition, Fastqz and Samcomp/Fqzcomp, including the winning entry. These are compared against existing algorithms for both reference based compression (CRAM, Goby) and non-reference based compression (DSRC, BAM), and other recently published competition entries (Quip, SCALCE). The tools are shown to be the new Pareto frontier for FASTQ compression, offering state of the art ratios at affordable CPU costs. All programs are freely available on SourceForge. Fastqz: https://sourceforge.net/projects/fastqz/, fqzcomp: https://sourceforge.net/projects/fqzcomp/, and samcomp: https://sourceforge.net/projects/samcomp/.

  4. Genetics Home Reference: platyspondylic lethal skeletal dysplasia, Torrance type

    Science.gov (United States)

    ... and an exaggerated curvature of the lower back ( lordosis ). Infants with this condition are born with a ... Diagnosis and Management Resources (1 link) MedlinePlus Encyclopedia: Lordosis General Information from MedlinePlus (5 links) Diagnostic Tests ...

  5. Alignment of 1000 Genomes Project reads to reference assembly GRCh38.

    Science.gov (United States)

    Zheng-Bradley, Xiangqun; Streeter, Ian; Fairley, Susan; Richardson, David; Clarke, Laura; Flicek, Paul

    2017-07-01

    The 1000 Genomes Project produced more than 100 trillion basepairs of short read sequence from more than 2600 samples in 26 populations over a period of five years. In its final phase, the project released over 85 million genotyped and phased variants on human reference genome assembly GRCh37. An updated reference assembly, GRCh38, was released in late 2013, but there was insufficient time for the final phase of the project analysis to change to the new assembly. Although it is possible to lift the coordinates of the 1000 Genomes Project variants to the new assembly, this is a potentially error-prone process as coordinate remapping is most appropriate only for non-repetitive regions of the genome and those that did not see significant change between the two assemblies. It will also miss variants in any region that was newly added to GRCh38. Thus, to produce the highest quality variants and genotypes on GRCh38, the best strategy is to realign the reads and recall the variants based on the new alignment. As the first step of variant calling for the 1000 Genomes Project data, we have finished remapping all of the 1000 Genomes sequence reads to GRCh38 with alternative scaffold-aware BWA-MEM. The resulting alignments are available as CRAM, a reference-based sequence compression format. The data have been released on our FTP site and are also available from European Nucleotide Archive to facilitate researchers discovering variants on the primary sequences and alternative contigs of GRCh38. © The Authors 2017. Published by Oxford University Press.

  6. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species.

    Directory of Open Access Journals (Sweden)

    Robert J Elshire

    Full Text Available Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two to three fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.

  7. Design of Protein Multi-specificity Using an Independent Sequence Search Reduces the Barrier to Low Energy Sequences.

    Directory of Open Access Journals (Sweden)

    Alexander M Sevy

    2015-07-01

    Full Text Available Computational protein design has found great success in engineering proteins for thermodynamic stability, binding specificity, or enzymatic activity in a 'single state' design (SSD) paradigm. Multi-specificity design (MSD), on the other hand, involves considering the stability of multiple protein states simultaneously. We have developed a novel MSD algorithm, which we refer to as REstrained CONvergence in multi-specificity design (RECON). The algorithm allows each state to adopt its own sequence throughout the design process rather than enforcing a single sequence on all states. Convergence to a single sequence is encouraged through an incrementally increasing convergence restraint for corresponding positions. Compared to MSD algorithms that enforce (constrain) an identical sequence on all states, the energy landscape is simplified, which accelerates the search drastically. As a result, RECON can readily be used in simulations with a flexible protein backbone. We have benchmarked RECON on two design tasks. First, we designed antibodies derived from a common germline gene against their diverse targets to assess recovery of the germline, polyspecific sequence. Second, we designed "promiscuous", polyspecific proteins against all binding partners and measured recovery of the native sequence. We show that RECON is able to efficiently recover native-like, biologically relevant sequences in this diverse set of protein complexes.

  8. PVWatts Version 1 Technical Reference

    Energy Technology Data Exchange (ETDEWEB)

    Dobos, A. P.

    2013-10-01

    The NREL PVWatts(TM) calculator is a web application developed by the National Renewable Energy Laboratory (NREL) that estimates the electricity production of a grid-connected photovoltaic system based on a few simple inputs. PVWatts combines a number of sub-models to predict overall system performance, and makes several hidden assumptions about performance parameters. This technical reference details the individual sub-models, documents assumptions and hidden parameters, and explains the sequence of calculations that yield the final system performance estimation.

  9. Gas: MedlinePlus Health Topic

    Science.gov (United States)

    ... attenuation via probiotic intervention reduces flatulence in adult human:... Article: The overlap of gastroesophageal reflux disease and functional constipation in... Gas -- see more articles Reference Desk Your Digestive System and How It Works (National Institute of Diabetes ...

  10. SEQUENCING AND DE NOVO DRAFT ASSEMBLIES OF A FATHEAD MINNOW (Pimephales promelas) reference genome

    Data.gov (United States)

    U.S. Environmental Protection Agency — The dataset provides the URLs for accessing the genome sequence data and two draft assemblies as well as fathead minnow genotyping data associated with estimating...

  11. An optimum analysis sequence for environmental gamma-ray spectrometry

    International Nuclear Information System (INIS)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L.

    2010-10-01

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library -based on evaluated nuclear data- of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma-ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)
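
    The two selection criteria named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the z-score is taken as the deviation from the reference peak area in units of the reported uncertainty, and the reduced χ² divides by the number of peaks because no parameters are fitted. The function names and the peak areas are hypothetical.

```python
def z_score(measured, reference, sigma):
    """Deviation of one measured peak area from its reference value,
    expressed in units of the (assumed) standard uncertainty `sigma`."""
    return (measured - reference) / sigma


def reduced_chi_square(measured, reference, sigmas):
    """Sum of squared z-scores divided by the number of peaks.

    Assumption: no parameters are fitted, so the number of degrees of
    freedom equals the number of peaks compared.
    """
    if not (len(measured) == len(reference) == len(sigmas)) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    chi2 = sum(z_score(m, r, s) ** 2
               for m, r, s in zip(measured, reference, sigmas))
    return chi2 / len(measured)


# Hypothetical peak areas (counts) for three gamma lines.
measured_areas = [102.0, 198.0, 305.0]
reference_areas = [100.0, 200.0, 300.0]
uncertainties = [2.0, 2.0, 5.0]

scores = [z_score(m, r, s)
          for m, r, s in zip(measured_areas, reference_areas, uncertainties)]
chi2_red = reduced_chi_square(measured_areas, reference_areas, uncertainties)
```

    An analysis sequence whose z-scores stay small and whose reduced χ² is close to 1 would be consistent with the reference spectrum within the stated uncertainties; in the toy data every peak is exactly one standard uncertainty away from its reference.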

  12. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Directory of Open Access Journals (Sweden)

    Stephan Pabinger

    Full Text Available Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage
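    The platform comparison in this abstract rests on the Pearson correlation coefficient. The sketch below shows how such a coefficient could be computed for paired per-site methylation levels. It is not TABSAT code; the function name and the methylation values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and standard deviations, all without the 1/n factor,
    # which cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical methylation levels (fraction methylated) for the same CpG
# sites measured by amplicon sequencing and by a methylation array.
sequencing = [0.10, 0.45, 0.80, 0.95, 0.30]
array = [0.15, 0.40, 0.85, 0.90, 0.35]
print(round(pearson(sequencing, array), 3))
```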

  13. Evaluation of the reliability of maize reference assays for GMO quantification.

    Science.gov (United States)

    Papazova, Nina; Zhang, David; Gruden, Kristina; Vojvoda, Jana; Yang, Litao; Buh Gasparic, Meti; Blejec, Andrej; Fouilloux, Stephane; De Loose, Marc; Taverniers, Isabel

    2010-03-01

    A reliable PCR reference assay for relative genetically modified organism (GMO) quantification must be specific for the target taxon and amplify uniformly along the commercialised varieties within the considered taxon. Different reference assays for maize (Zea mays L.) are used in official methods for GMO quantification. In this study, we evaluated the reliability of eight existing maize reference assays, four of which are used in combination with an event-specific polymerase chain reaction (PCR) assay validated and published by the Community Reference Laboratory (CRL). We analysed the nucleotide sequence variation in the target genomic regions in a broad range of transgenic and conventional varieties and lines: MON 810 varieties cultivated in Spain and conventional varieties from various geographical origins and breeding history. In addition, the reliability of the assays was evaluated based on their PCR amplification performance. A single base pair substitution, corresponding to a single nucleotide polymorphism (SNP) reported in an earlier study, was observed in the forward primer of one of the studied alcohol dehydrogenase 1 (Adh1) (70) assays in a large number of varieties. The SNP presence is consistent with a poor PCR performance observed for this assay along the tested varieties. The obtained data show that the Adh1 (70) assay used in the official CRL NK603 assay is unreliable. Based on our results from both the nucleotide stability study and the PCR performance test, we can conclude that the Adh1 (136) reference assay (T25 and Bt11 assays) as well as the tested high mobility group protein gene assay, which also form parts of CRL methods for quantification, are highly reliable. Despite the observed uniformity in the nucleotide sequence of the invertase gene assay, the PCR performance test reveals that this target sequence might occur in more than one copy. Finally, although currently not forming a part of official quantification methods, zein and SSIIb

  14. Highly accurate sequence imputation enables precise QTL mapping in Brown Swiss cattle.

    Science.gov (United States)

    Frischknecht, Mirjam; Pausch, Hubert; Bapst, Beat; Signer-Hasler, Heidi; Flury, Christine; Garrick, Dorian; Stricker, Christian; Fries, Ruedi; Gredler-Grandl, Birgit

    2017-12-29

    Within the last few years a large amount of genomic information has become available in cattle. Densities of genomic information vary from a few thousand variants up to whole genome sequence information. In order to combine genomic information from different sources and infer genotypes for a common set of variants, genotype imputation is required. In this study we evaluated the accuracy of imputation from high density chips to whole genome sequence data in Brown Swiss cattle. Using four popular imputation programs (Beagle, FImpute, Impute2, Minimac) and various compositions of reference panels, the accuracy of the imputed sequence variant genotypes was high and differences between the programs and scenarios were small. We imputed sequence variant genotypes for more than 1600 Brown Swiss bulls and performed genome-wide association studies for milk fat percentage at two stages of lactation. We found one and three quantitative trait loci for early and late lactation fat content, respectively. Known causal variants that were imputed from the sequenced reference panel were among the most significantly associated variants of the genome-wide association study. Our study demonstrates that whole-genome sequence information can be imputed at high accuracy in cattle populations. Using imputed sequence variant genotypes in genome-wide association studies may facilitate causal variant detection.

  15. Using incomplete citation data for MEDLINE results ranking.

    Science.gov (United States)

    Herskovic, Jorge R; Bernstam, Elmer V

    2005-01-01

    Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.
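    The two ranking signals compared in this abstract can be sketched on a toy citation graph. This is an illustration only, not the authors' system: the power-iteration PageRank below uses a damping factor of 0.85 as an assumed default, and the article identifiers are hypothetical.

```python
def citation_counts(links):
    """Number of incoming citations per article.

    links maps each article to the list of articles it cites.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    counts = {node: 0 for node in nodes}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(links, damping=0.85, iterations=100):
    """PageRank by power iteration on a citation graph."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # An article that cites nothing spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: A cites B and C, B cites C, D cites C.
graph = {"A": ["B", "C"], "B": ["C"], "D": ["C"]}
print(citation_counts(graph))
print(pagerank(graph))
```

    Removing edges from `graph` and re-running both functions is one way to mimic the incomplete citation data the abstract describes.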

  16. FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

    Science.gov (United States)

    Hoogenboom, Jerry; van der Gaag, Kristiaan J; de Leeuw, Rick H; Sijen, Titia; de Knijff, Peter; Laros, Jeroen F J

    2017-03-01

    Massively parallel sequencing (MPS) is on the advent of a broad scale application in forensic research and casework. The improved capabilities to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  17. PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy.

    Science.gov (United States)

    Decelle, Johan; Romac, Sarah; Stern, Rowena F; Bendif, El Mahdi; Zingone, Adriana; Audic, Stéphane; Guiry, Michael D; Guillou, Laure; Tessier, Désiré; Le Gall, Florence; Gourvil, Priscillia; Dos Santos, Adriana L; Probert, Ian; Vaulot, Daniel; de Vargas, Colomban; Christen, Richard

    2015-11-01

    Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny-based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface (http://phytoref.fr), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high-throughput sequencing. © 2015 John Wiley & Sons Ltd.

  18. High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies.

    Directory of Open Access Journals (Sweden)

    Anjana Srivatsan

    2008-08-01

    Full Text Available Whole-genome sequencing is a powerful technique for obtaining the reference sequence information of multiple organisms. Its use can be dramatically expanded to rapidly identify genomic variations, which can be linked with phenotypes to obtain biological insights. We explored these potential applications using the emerging next-generation sequencing platform Solexa Genome Analyzer, and the well-characterized model bacterium Bacillus subtilis. Combining sequencing with experimental verification, we first improved the accuracy of the published sequence of the B. subtilis reference strain 168, then obtained sequences of multiple related laboratory strains and different isolates of each strain. This provides a framework for comparing the divergence between different laboratory strains and between their individual isolates. We also demonstrated the power of Solexa sequencing by using its results to predict a defect in the citrate signal transduction pathway of a common laboratory strain, which we verified experimentally. Finally, we examined the molecular nature of spontaneously generated mutations that suppress the growth defect caused by deletion of the stringent response mediator relA. Using whole-genome sequencing, we rapidly mapped these suppressor mutations to two small homologs of relA. Interestingly, stable suppressor strains had mutations in both genes, with each mutation alone partially relieving the relA growth defect. This supports an intriguing three-locus interaction module that is not easily identifiable through traditional suppressor mapping. We conclude that whole-genome sequencing can drastically accelerate the identification of suppressor mutations and complex genetic interactions, and it can be applied as a standard tool to investigate the genetic traits of model organisms.

  19. CIG-DB: the database for human or mouse immunoglobulin and T cell receptor genes available for cancer studies

    Directory of Open Access Journals (Sweden)

    Furue Motoki

    2010-07-01

    Full Text Available Background: Immunoglobulin (IG or antibody) and the T-cell receptor (TR) are pivotal proteins in the immune system of higher organisms. In cancer immunotherapy, the immune responses mediated by tumor-epitope-binding IG or TR play important roles in anticancer effects. Although there are public databases specific for immunological genes, their contents have not been associated with clinical studies. Therefore, we developed an integrated database of IG/TR data reported in cancer studies (the Cancer-related Immunological Gene Database [CIG-DB]). Description: This database is designed as a platform to explore public human and murine IG/TR genes sequenced in cancer studies. A total of 38,308 annotation entries for IG/TR proteins were collected from GenBank/DDBJ/EMBL and the Protein Data Bank, and 2,740 non-redundant corresponding MEDLINE references were appended. Next, we filtered the MEDLINE texts by MeSH terms, titles, and abstracts containing keywords related to cancer. After we performed a manual check, we classified the protein entries into two groups: 611 on cancer therapy (Group I) and 1,470 on hematological tumors (Group II). Thus, a total of 2,081 cancer-related IG and TR entries were tabularized. To effectively classify future entries, we developed a computational method based on text mining and canonical discriminant analysis by parsing MeSH/title/abstract words. We performed a leave-one-out cross validation for the method, which showed high accuracy rates: 94.6% for IG references and 94.7% for TR references. We also collected 920 epitope sequences bound with IG/TR. The CIG-DB is equipped with search engines for amino acid sequences and MEDLINE references, sequence analysis tools, and a 3D viewer. This database is accessible without charge or registration at http://www.scchr-cigdb.jp/, and the search results are freely downloadable. Conclusions: The CIG-DB serves as a bridge between immunological gene data and cancer studies, presenting

  20. Accuracy of microbial community diversity estimated by closed- and open-reference OTUs

    Directory of Open Access Journals (Sweden)

    Robert C. Edgar

    2017-10-01

    Full Text Available Next-generation sequencing of 16S ribosomal RNA is widely used to survey microbial communities. Sequences are typically assigned to Operational Taxonomic Units (OTUs). Closed- and open-reference OTU assignment matches reads to a reference database at 97% identity (closed), then clusters unmatched reads using a de novo method (open). Implementations of these methods in the QIIME package were tested on several mock community datasets with 20 strains using different sequencing technologies and primers. Richness (number of reported OTUs) was often greatly exaggerated, with hundreds or thousands of OTUs generated on Illumina datasets. Between-sample diversity was also found to be highly exaggerated in many cases, with weighted Jaccard distances between identical mock samples often close to one, indicating very low similarity. Non-overlapping hyper-variable regions in 70% of species were assigned to different OTUs. On mock communities with Illumina V4 reads, 56% to 88% of predicted genus names were false positives. Biological inferences obtained using these methods are therefore not reliable.
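    The between-sample measure mentioned in this abstract can be sketched as follows, assuming the usual definition of weighted Jaccard distance: one minus the sum of per-OTU minimum abundances divided by the sum of per-OTU maximum abundances. Whether the paper applies it to raw counts or to normalized frequencies is not stated here, so the raw-count input and the OTU names are assumptions.

```python
def weighted_jaccard_distance(sample_a, sample_b):
    """Weighted Jaccard distance between two OTU abundance tables.

    Each sample maps an OTU identifier to a non-negative abundance.
    Returns 0.0 for identical samples and 1.0 for samples sharing no OTU.
    """
    otus = set(sample_a) | set(sample_b)
    min_sum = sum(min(sample_a.get(o, 0), sample_b.get(o, 0)) for o in otus)
    max_sum = sum(max(sample_a.get(o, 0), sample_b.get(o, 0)) for o in otus)
    if max_sum == 0:
        # Assumption: two empty samples are treated as identical.
        return 0.0
    return 1.0 - min_sum / max_sum


# Hypothetical OTU tables for two replicate samples.
replicate_1 = {"otu1": 3, "otu2": 1}
replicate_2 = {"otu1": 1, "otu3": 2}
print(weighted_jaccard_distance(replicate_1, replicate_1))
print(weighted_jaccard_distance(replicate_1, replicate_2))
```

    With this definition, spurious OTUs that appear in only one replicate inflate the denominator without adding to the numerator, which pushes the distance between otherwise identical samples toward one.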

  1. AGORA : Organellar genome annotation from the amino acid and nucleotide references.

    Science.gov (United States)

    Jung, Jaehee; Kim, Jong Im; Jeong, Young-Sik; Yi, Gangman

    2018-03-29

    Next-generation sequencing (NGS) technologies have led to the accumulation of high-throughput sequence data from various organisms in biology. To apply gene annotation of organellar genomes for various organisms, more optimized tools for functional gene annotation are required. Almost all gene annotation tools are mainly focused on the chloroplast genome of land plants or the mitochondrial genome of animals. We have developed a web application AGORA for the fast, user-friendly, and improved annotations of organellar genomes. AGORA annotates genes based on a BLAST-based homology search and clustering with selected reference sequences from the NCBI database or user-defined uploaded data. AGORA can annotate the functional genes in almost all mitochondrion and plastid genomes of eukaryotes. The gene annotation of a genome with an exon-intron structure within a gene or inverted repeat region is also available. It provides the start and end positions of each gene, BLAST results compared with the reference sequence, and a visualization of the gene map by OGDRAW. Users can freely use the software, and the accessible URL is https://bigdata.dongguk.edu/gene_project/AGORA/. The main module of the tool is implemented in Python and PHP, and the web page is built with HTML and CSS to support all browsers. Contact: gangman@dongguk.edu.

  2. Milestones in Medical Research, The Human Genome and ClinicalTrials.gov | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... the information in this issue of NIH MedlinePlus magazine helps you and your loved ones stay healthier! ... From You! We want your feedback on the magazine, ideas for future issues, as well as questions ...

  3. Plann: A command-line application for annotating plastome sequences

    Science.gov (United States)

    Huang, Daisie I.; Cronk, Quentin C. B.

    2015-01-01

    Premise of the study: Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Methods and Results: Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann’s output can be used in the National Center for Biotechnology Information’s tbl2asn to create a Sequin file for GenBank submission. Conclusions: Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved. PMID:26312193

  4. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystem's SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  5. Generalized locally Toeplitz sequences theory and applications

    CERN Document Server

    Garoni, Carlo

    2017-01-01

    Based on their research experience, the authors propose a reference textbook in two volumes on the theory of generalized locally Toeplitz sequences and their applications. This first volume focuses on the univariate version of the theory and the related applications in the unidimensional setting, while the second volume, which addresses the multivariate case, is mainly devoted to concrete PDE applications. This book systematically develops the theory of generalized locally Toeplitz (GLT) sequences and presents some of its main applications, with a particular focus on the numerical discretization of differential equations (DEs). It is the first book to address the relatively new field of GLT sequences, which occur in numerous scientific applications and are especially dominant in the context of DE discretizations. Written for applied mathematicians, engineers, physicists, and scientists who (perhaps unknowingly) encounter GLT sequences in their research, it is also of interest to those working in the fields of...

  6. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    Science.gov (United States)

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  7. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    Full Text Available BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  8. De novo assembly of human genomes with massively parallel short read sequencing

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Zhu, Hongmei; Ruan, Jue

    2010-01-01

    genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities...... for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way....
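    The N50 statistic quoted in this record can be computed as sketched below, using the common definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The contig lengths in the example are hypothetical and unrelated to the assemblies described above.

```python
def n50(lengths):
    """N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 is undefined for an empty assembly")


# Hypothetical contig lengths in kilobases.
contigs_kb = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(contigs_kb))
```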

  9. Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method

    Directory of Open Access Journals (Sweden)

    Bingfu Guo

    2016-07-01

    Full Text Available Molecular characterization of sequences flanking exogenous fragment insertions is essential for safety assessment and labeling of genetically modified organisms (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. About 21 Gb sequence data (~21× coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundary of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of the genomic insertion site of the G2-EPSPS and GAT transgenes will facilitate the use of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS is a cost-effective and rapid method of identifying sites of T-DNA insertions and flanking sequences in soybean.

  10. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

    Science.gov (United States)

    Schneider, Valerie A; Graves-Lindsay, Tina; Howe, Kerstin; Bouk, Nathan; Chen, Hsiu-Chuan; Kitts, Paul A; Murphy, Terence D; Pruitt, Kim D; Thibaud-Nissen, Françoise; Albracht, Derek; Fulton, Robert S; Kremitzki, Milinn; Magrini, Vincent; Markovic, Chris; McGrath, Sean; Steinberg, Karyn Meltz; Auger, Kate; Chow, William; Collins, Joanna; Harden, Glenn; Hubbard, Timothy; Pelan, Sarah; Simpson, Jared T; Threadgold, Glen; Torrance, James; Wood, Jonathan M; Clarke, Laura; Koren, Sergey; Boitano, Matthew; Peluso, Paul; Li, Heng; Chin, Chen-Shan; Phillippy, Adam M; Durbin, Richard; Wilson, Richard K; Flicek, Paul; Eichler, Evan E; Church, Deanna M

    2017-05-01

    The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health. © 2017 Schneider et al.; Published by Cold Spring Harbor Laboratory Press.

  11. Comparison of double-locus sequence typing (DLST) and multilocus sequence typing (MLST) for the investigation of Pseudomonas aeruginosa populations.

    Science.gov (United States)

    Cholley, Pascal; Stojanov, Milos; Hocquet, Didier; Thouverez, Michelle; Bertrand, Xavier; Blanc, Dominique S

    2015-08-01

    Reliable molecular typing methods are necessary to investigate the epidemiology of bacterial pathogens. Reference methods such as multilocus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE) are costly and time consuming. Here, we compared our newly developed double-locus sequence typing (DLST) method for Pseudomonas aeruginosa to MLST and PFGE on a collection of 281 isolates. DLST was as discriminatory as MLST and was able to recognize "high-risk" epidemic clones. Both methods were highly congruent. Not surprisingly, a higher discriminatory power was observed with PFGE. In conclusion, being a simple method (single-strand sequencing of only 2 loci), DLST is valuable as a first-line typing tool for epidemiological investigations of P. aeruginosa. Coupled to a more discriminant method like PFGE or whole genome sequencing, it might represent an efficient typing strategy to investigate or prevent outbreaks. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
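
    The k-mer searching step mentioned in this abstract can be illustrated with a minimal sketch. It is not SINA's implementation: the function names, the value of k, and the toy sequences are assumptions made for illustration only. The idea shown is simply to rank candidate reference sequences by how many k-mers they share with a query, so that only the closest references need to be passed on to a more expensive alignment step.

```python
def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4, top_n=2):
    """Rank reference sequences by the number of distinct k-mers shared with the query.

    `references` maps a (hypothetical) reference name to its sequence.
    Ties are broken alphabetically by name so the result is deterministic.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(seq, k))
        for name, seq in references.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Toy data, invented for this example.
references = {
    "ref_a": "ACGTACGTGGCC",
    "ref_b": "TTTTGGGGCCCC",
    "ref_c": "ACGTACGAGGCC",
}
best = rank_references("ACGTACGTGGCA", references, k=4, top_n=2)
print(best)
```

    A set intersection ignores k-mer multiplicity; a production tool would more plausibly use an index over the reference database rather than recomputing k-mer sets per query.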

  13. An optimum analysis sequence for environmental gamma-ray spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L., E-mail: fta777@hotmail.co [Universidad Autonoma de Zacatecas, Centro Regional de Estudios Nucleares, Calle Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas (Mexico)]

    2010-10-15

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for 1) peak finding and 2) peak area determination, with or without the use of a library, based on evaluated nuclear data, of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems caused by significant variations in the out-of-date nuclear parameters of commercial software libraries. Interference-free gamma-ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ² criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. In particular, the combination of second derivative peak locate with simple peak area integration algorithms provides the greatest accuracy. Lower accuracy comes from the combination of the library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)
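
    The abstract names the z-score and reduced chi-square criteria but does not give the formulas used. The sketch below assumes common textbook definitions: each z-score divides the difference between a measured and a reference peak area by their combined standard uncertainty, and the reduced chi-square is the mean of the squared z-scores. Dividing by the number of peaks (rather than by some other number of degrees of freedom) is an assumption, and all numbers are invented.

```python
import math


def z_scores(measured, measured_unc, reference, reference_unc):
    """z-score per peak: difference over combined standard uncertainty."""
    return [
        (m - r) / math.sqrt(um ** 2 + ur ** 2)
        for m, um, r, ur in zip(measured, measured_unc, reference, reference_unc)
    ]


def reduced_chi_square(z):
    """Mean of squared z-scores (assumes degrees of freedom = number of peaks)."""
    return sum(v * v for v in z) / len(z)


# Invented peak areas (counts) and uncertainties for two peaks.
measured = [1030.0, 480.0]
measured_unc = [30.0, 12.0]
reference = [1000.0, 500.0]
reference_unc = [40.0, 16.0]

z = z_scores(measured, measured_unc, reference, reference_unc)
chi2 = reduced_chi_square(z)
print(z, chi2)
```

    Under these definitions, an analysis sequence whose z-scores stay small in magnitude and whose reduced chi-square is close to 1 reproduces the reference peak areas within the stated uncertainties.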

  14. Sequencing and De novo Draft Assemblies of the Fathead Minnow (Pimphales promelas) Reference Genome

    Science.gov (United States)

    This study was undertaken to develop genome-scale resources for the fathead minnow (Pimphales promelas) an important model organism widely used in both aquatic ecotoxicology research and in regulatory toxicity testing. We report on the first sequencing and two draft assemblies fo...

  15. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  16. Harnessing Whole Genome Sequencing in Medical Mycology.

    Science.gov (United States)

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  17. Genetics Home Reference: osteoglophonic dysplasia

    Science.gov (United States)

    ... 1 link) Genetic Testing Registry: Osteoglophonic dysplasia Other Diagnosis and Management Resources (1 link) Seattle Children's Hospital: Dwarfism and Bone Dysplasias General Information from MedlinePlus (5 ...

  18. Dynamics of domain coverage of the protein sequence universe

    Science.gov (United States)

    2012-01-01

    Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data. PMID:23157439

  19. Dynamics of domain coverage of the protein sequence universe

    Directory of Open Access Journals (Sweden)

    Rekapalli Bhanu

    2012-11-01

    Full Text Available Abstract. Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data.

  20. Reference genotype and exome data from an Australian Aboriginal population for health-based research.

    Science.gov (United States)

    Tang, Dave; Anderson, Denise; Francis, Richard W; Syn, Genevieve; Jamieson, Sarra E; Lassmann, Timo; Blackwell, Jenefer M

    2016-04-12

    Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.

  1. Comparison of ompP5 sequence-based typing and pulsed-field gel ...

    African Journals Online (AJOL)

    In this study, outer membrane protein P5 gene (ompP5) sequence-based typing was compared with pulsed-field gel electrophoresis (PFGE) for the genotyping of Haemophilus parasuis; the 15 serovar reference strains and 43 isolates were investigated. When comparing the two methods, 31 ompP5 sequence types ...

  2. 454 sequencing of pooled BAC clones on chromosome 3H of barley

    Directory of Open Access Journals (Sweden)

    Yamaji Nami

    2011-05-01

    Full Text Available Abstract. Background: Genome sequencing of barley has been delayed due to its large genome size (ca. 5,000 Mbp). Among the fast sequencing systems, 454 liquid phase pyrosequencing provides the longest reads and is the most promising method for BAC clones. Here we report the results of pooled sequencing of BAC clones selected with ESTs genetically mapped to chromosome 3H. Results: We sequenced pooled barley BAC clones using a 454 parallel genome sequencer. A PCR screening system based on primer sets derived from genetically mapped ESTs on chromosome 3H was used for clone selection in a BAC library developed from cultivar "Haruna Nijo". The DNA samples of 10 or 20 BAC clones were pooled and used for shotgun library development. The homology between contig sequences generated in each pooled library and mapped EST sequences was studied. The number of contigs assigned on chromosome 3H was 372. Their lengths ranged from 1,230 bp to 58,322 bp, with an average of 14,891 bp. Of these contigs, 240 showed homology and colinearity with the genome sequence of rice chromosome 1. A contig annotation browser supplemented with query search by unique sequence or genetic map position was developed. The identified contigs can be annotated with barley cDNAs and reference sequences on the browser. Homology analysis of these contigs with rice genes indicated that 1,239 rice genes can be assigned to barley contigs by the simple comparison of sequence lengths in both species. Of these genes, 492 are assigned to rice chromosome 1. Conclusions: We demonstrate the efficiency of sequencing gene-rich regions from barley chromosome 3H, with special reference to syntenic relationships with rice chromosome 1.

  3. Using RNA-Seq Data to Evaluate Reference Genes Suitable for Gene Expression Studies in Soybean.

    Directory of Open Access Journals (Sweden)

    Aldrin Kay-Yuen Yim

    Full Text Available Differential gene expression profiles often provide important clues for gene functions. While reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is an important tool, the validity of the results depends heavily on the choice of proper reference genes. In this study, we employed new and published RNA-sequencing (RNA-Seq) datasets (26 sequencing libraries in total) to evaluate reference genes reported in previous soybean studies. In silico PCR showed that 13 out of 37 previously reported primer sets have multiple targets, and 4 of them have amplicons with different sizes. Using a probabilistic approach, we identified new and improved candidate reference genes. We further performed 2 validation tests (with 26 RNA samples) on 8 commonly used reference genes and 7 newly identified candidates, using RT-qPCR. In general, the new candidate reference genes exhibited more stable expression levels under the tested experimental conditions. The three newly identified candidate reference genes Bic-C2, F-box protein2, and VPS-like gave the best overall performance, together with the commonly used ELF1b. It is expected that the proposed probabilistic model could serve as an important tool to identify stable reference genes when more soybean RNA-Seq data from different growth stages and treatments are used.

  4. MRI Sequences in Head & Neck Radiology - State of the Art.

    Science.gov (United States)

    Widmann, Gerlig; Henninger, Benjamin; Kremser, Christian; Jaschke, Werner

    2017-05-01

    Background: Magnetic resonance imaging (MRI) has become an essential imaging modality for the evaluation of head & neck pathologies. However, the diagnostic power of MRI is strongly related to the appropriate selection and interpretation of imaging protocols and sequences. The aim of this article is to review state-of-the-art sequences for the clinical routine in head & neck MRI and to describe the evidence for which medical question these sequences and techniques are useful. Method: Literature review of state-of-the-art sequences in head & neck MRI. Results and Conclusion: Basic sequences (T1w, T2w, T1wC+) and fat suppression techniques (TIRM/STIR, Dixon, Spectral Fat sat) are important tools in the diagnostic workup of inflammation, congenital lesions and tumors including staging. Additional sequences (SSFP (CISS, FIESTA), SPACE, VISTA, 3D-FLAIR) are used for pathologies of the cranial nerves, labyrinth and evaluation of endolymphatic hydrops in Menière's disease. Vessel and perfusion sequences (3D-TOF, TWIST/TRICKS angiography, DCE) are used in vascular contact syndromes, vascular malformations and analysis of microvascular parameters of tissue perfusion. Diffusion-weighted imaging (EPI-DWI, non-EPI-DWI, RESOLVE) is helpful in cholesteatoma imaging, estimation of malignancy, and evaluation of treatment response and posttreatment recurrence in head & neck cancer. Understanding of MRI sequences and close collaboration with referring physicians improves the diagnostic confidence of MRI in the daily routine and drives further research in this fascinating image modality. Key Points: · Understanding of MRI sequences is essential for the correct and reliable interpretation of MRI findings. · MRI protocols have to be carefully selected based on relevant clinical information. · Close collaboration with referring physicians improves the output obtained from the diagnostic possibilities of MRI. Citation Format · Widmann G, Henninger B, Kremser C et

  5. Genetics Home Reference: citrullinemia

    Science.gov (United States)

    ... belongs to a class of genetic diseases called urea cycle disorders. Learn more about the genes associated with citrullinemia ... GeneReview: Citrin Deficiency GeneReview: Citrullinemia Type I GeneReview: Urea Cycle Disorders Overview MedlinePlus Encyclopedia: Hereditary Urea Cycle Abnormality National ...

  6. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of which were synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  7. A programmable method for massively parallel targeted sequencing

    Science.gov (United States)

    Hopmans, Erik S.; Natsoulis, Georges; Bell, John M.; Grimes, Susan M.; Sieh, Weiva; Ji, Hanlee P.

    2014-01-01

    We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy. PMID:24782526

  8. A Parvovirus B19 synthetic genome: sequence features and functional competence.

    Science.gov (United States)

    Manaresi, Elisabetta; Conti, Ilaria; Bua, Gloria; Bonvicini, Francesca; Gallinella, Giorgio

    2017-08-01

    Central to genetic studies for Parvovirus B19 (B19V) is the availability of genomic clones that may possess functional competence and ability to generate infectious virus. In our study, we established a new model genetic system for Parvovirus B19. A synthetic approach was followed, by design of a reference genome sequence, by generation of a corresponding artificial construct and its molecular cloning in a complete and functional form, and by setup of an efficient strategy to generate infectious virus, via transfection in UT7/EpoS1 cells and amplification in erythroid progenitor cells. The synthetic genome was able to generate virus with biological properties paralleling those of native virus, its infectious activity being dependent on the preservation of self-complementarity and sequence heterogeneity within the terminal regions. A virus of defined genome sequence, obtained from controlled cell culture conditions, can constitute a reference tool for investigation of the structural and functional characteristics of the virus. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.

    Science.gov (United States)

    Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M

    2004-06-12

    The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se

  10. Quality Control of the Traditional Patent Medicine Yimu Wan Based on SMRT Sequencing and DNA Barcoding

    Science.gov (United States)

    Jia, Jing; Xu, Zhichao; Xin, Tianyi; Shi, Linchun; Song, Jingyuan

    2017-01-01

    Substandard traditional patent medicines may lead to global safety-related issues. Protecting consumers from the health risks associated with the integrity and authenticity of herbal preparations is of great concern. Of particular concern is quality control for traditional patent medicines. Here, we establish an effective approach for verifying the biological composition of traditional patent medicines based on single-molecule real-time (SMRT) sequencing and DNA barcoding. Yimu Wan (YMW), a classical herbal prescription recorded in the Chinese Pharmacopoeia, was chosen to test the method. Two reference YMW samples were used to establish a standard method for analysis, which was then applied to three different batches of commercial YMW samples. A total of 3703 and 4810 circular-consensus sequencing (CCS) reads from two reference and three commercial YMW samples were mapped to the ITS2 and psbA-trnH regions, respectively. Moreover, comparison of intraspecific genetic distances based on SMRT sequencing data with reference data from Sanger sequencing revealed an ITS2 and psbA-trnH intergenic spacer that exhibited high intraspecific divergence, with the sites of variation showing significant differences within species. Using the CCS strategy for SMRT sequencing analysis was adequate to guarantee the accuracy of identification. This study demonstrates the application of SMRT sequencing to detect the biological ingredients of herbal preparations. SMRT sequencing provides an affordable way to monitor the legality and safety of traditional patent medicines. PMID:28620408

  11. Kismeth: Analyzer of plant methylation states through bisulfite sequencing

    Directory of Open Access Journals (Sweden)

    Martienssen Robert A

    2008-09-01

    Full Text Available Abstract. Background: There is great interest in probing the temporal and spatial patterns of cytosine methylation states in genomes of a variety of organisms. It is hoped that this will shed light on the biological roles of DNA methylation in the epigenetic control of gene expression. Bisulfite sequencing refers to the treatment of isolated DNA with sodium bisulfite to convert unmethylated cytosine to uracil, with PCR converting the uracil to thymidine followed by sequencing of the resultant DNA to detect DNA methylation. For the study of DNA methylation, plants provide an excellent model system, since they can tolerate major changes in their DNA methylation patterns and have long been studied for the effects of DNA methylation on transposons and epimutations. However, in contrast to the situation in animals, there aren't many tools that analyze bisulfite data in plants, which can exhibit methylation of cytosines in a variety of sequence contexts (CG, CHG, and CHH). Results: Kismeth (http://katahdin.mssm.edu/kismeth) is a web-based tool for bisulfite sequencing analysis. Kismeth was designed to be used with plants, since it considers potential cytosine methylation in any sequence context (CG, CHG, and CHH). It provides a tool for the design of bisulfite primers as well as several tools for the analysis of the bisulfite sequencing results. Kismeth is not limited to data from plants, as it can be used with data from any species. Conclusion: Kismeth simplifies bisulfite sequencing analysis. It is the only publicly available tool for the design of bisulfite primers for plants, and one of the few tools for the analysis of methylation patterns in plants. It facilitates analysis at both global and local scales, demonstrated in the examples cited in the text, allowing dissection of the genetic pathways involved in DNA methylation. Kismeth can also be used to study methylation states in different tissues and disease cells compared to a reference sequence.
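
    The bisulfite logic described in this abstract can be sketched in a few lines. This is an illustration only, not Kismeth's code: the function names and the toy sequence are hypothetical, only the forward strand is handled, the read is assumed to be already aligned to the reference without gaps, and sequences are assumed to contain only A, C, G and T. After conversion and PCR, an unmethylated cytosine is read as T while a methylated cytosine is still read as C; each reference cytosine is then classified by context, where H stands for A, C or T.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH'.

    Returns None if seq[i] is not C or if too few downstream bases
    remain to decide the context.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def bisulfite_convert(seq, methylated):
    """Simulate conversion plus PCR: unmethylated C is read as T."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference, read):
    """Map each reference C position to (context, is_methylated)."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C" and read_base in "CT":
            calls[i] = (cytosine_context(reference, i), read_base == "C")
    return calls


# Toy reference with one cytosine in each context; positions 1 and 8 methylated.
reference = "ACGTCAGTCTAC"
read = bisulfite_convert(reference, methylated={1, 8})
calls = call_methylation(reference, read)
print(read)
print(calls)
```

    A real analysis would also need to handle the reverse strand (where conversion appears as G to A), sequencing errors, and incomplete conversion, none of which are modeled here.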

  12. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  13. An evaluation of selected herbal reference texts and comparison to published reports of adverse herbal events.

    Science.gov (United States)

    Haller, Christine A; Anderson, Ilene B; Kim, Susan Y; Blanc, Paul D

    2002-01-01

    There has been a recent proliferation of medical reference texts intended to guide practitioners whose patients use herbal therapies. We systematically assessed six herbal reference texts to evaluate the information they contain on herbal toxicity. We selected six major herbal references published from 1996 to 2000 to evaluate the adequacy of their toxicological information in light of published adverse events. To identify herbs most relevant to toxicology, we reviewed herbal-related calls to our regional California Poison Control System, San Francisco division (CPCS-SF) in 1998 and identified the 12 herbs (defined as botanical dietary supplements) most frequently involved in these CPCS-SF referrals. We searched Medline (1966 to 2000) to identify published reports of adverse effects potentially related to these same 12 herbs. We scored each herbal reference text on the basis of information inclusiveness for the target 12 herbs, with a maximal overall score of 3. The herbs, identified on the basis of CPCS-SF call frequency were: St John's wort, ma huang, echinacea, guarana, ginkgo, ginseng, valerian, tea tree oil, goldenseal, arnica, yohimbe and kava kava. The overall herbal reference scores ranged from 2.2 to 0.4 (median 1.1). The Natural Medicines Comprehensive Database received the highest overall score and was the most complete and useful reference source. All of the references, however, lacked sufficient information on management of herbal medicine overdose, and several had incorrect overdose management guidelines that could negatively impact patient care. Current herbal reference texts do not contain sufficient information for the assessment and management of adverse health effects of botanical therapies.

  14. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    Directory of Open Access Journals (Sweden)

    Giorgio Palù

    2011-11-01

    Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.

  15. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  16. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

    NARCIS (Netherlands)

    J. Huang (Jie); B. Howie (Bryan); S. McCarthy (Shane); Y. Memari (Yasin); K. Walter (Klaudia); J.L. Min (Josine L.); P. Danecek (Petr); G. Malerba (Giovanni); E. Trabetti (Elisabetta); H.-F. Zheng (Hou-Feng); G. Gambaro (Giovanni); J.B. Richards (Brent); R. Durbin (Richard); N.J. Timpson (Nicholas); J. Marchini (Jonathan); N. Soranzo (Nicole); S.H. Al Turki (Saeed); A. Amuzu (Antoinette); C. Anderson (Carl); R. Anney (Richard); D. Antony (Dinu); M.S. Artigas; M. Ayub (Muhammad); S. Bala (Senduran); J.C. Barrett (Jeffrey); I.E. Barroso (Inês); P.L. Beales (Philip); M. Benn (Marianne); J. Bentham (Jamie); S. Bhattacharya (Shoumo); E. Birney (Ewan); D.H.R. Blackwood (Douglas); M. Bobrow (Martin); E. Bochukova (Elena); P.F. Bolton (Patrick F.); R. Bounds (Rebecca); C. Boustred (Chris); G. Breen (Gerome); M. Calissano (Mattia); K. Carss (Keren); J.P. Casas (Juan Pablo); J.C. Chambers (John C.); R. Charlton (Ruth); K. Chatterjee (Krishna); L. Chen (Lu); A. Ciampi (Antonio); S. Cirak (Sebahattin); P. Clapham (Peter); G. Clement (Gail); G. Coates (Guy); M. Cocca (Massimiliano); D.A. Collier (David); C. Cosgrove (Catherine); T. Cox (Tony); N.J. Craddock (Nick); L. Crooks (Lucy); S. Curran (Sarah); D. Curtis (David); A. Daly (Allan); I.N.M. Day (Ian N.M.); A.G. Day-Williams (Aaron); G.V. Dedoussis (George); T. Down (Thomas); Y. Du (Yuanping); C.M. van Duijn (Cornelia); I. Dunham (Ian); T. Edkins (Ted); R. Ekong (Rosemary); P. Ellis (Peter); D.M. Evans (David); I.S. Farooqi (I. Sadaf); D.R. Fitzpatrick (David R.); P. Flicek (Paul); J. Floyd (James); A.R. Foley (A. Reghan); C.S. Franklin (Christopher S.); M. Futema (Marta); L. Gallagher (Louise); P. Gasparini (Paolo); T.R. Gaunt (Tom); M. Geihs (Matthias); D. Geschwind (Daniel); C.M.T. Greenwood (Celia); H. Griffin (Heather); D. Grozeva (Detelina); X. Guo (Xiaosen); X. Guo (Xueqin); H. Gurling (Hugh); D. Hart (Deborah); A.E. Hendricks (Audrey E.); P.A. Holmans (Peter A.); L. Huang (Liren); T. Hubbard (Tim); S.E. 
Humphries (Steve E.); M.E. Hurles (Matthew); P.G. Hysi (Pirro); V. Iotchkova (Valentina); A. Isaacs (Aaron); D.K. Jackson (David K.); Y. Jamshidi (Yalda); J. Johnson (Jon); C. Joyce (Chris); K.J. Karczewski (Konrad); J. Kaye (Jane); T. Keane (Thomas); J.P. Kemp (John); K. Kennedy (Karen); A. Kent (Alastair); J. Keogh (Julia); F. Khawaja (Farrah); M.E. Kleber (Marcus); M. Van Kogelenberg (Margriet); A. Kolb-Kokocinski (Anja); J.S. Kooner (Jaspal S.); G. Lachance (Genevieve); C. Langenberg (Claudia); C. Langford (Cordelia); D. Lawson (Daniel); I. Lee (Irene); E.M. van Leeuwen (Elisa); M. Lek (Monkol); R. Li (Rui); Y. Li (Yingrui); J. Liang (Jieqin); H. Lin (Hong); R. Liu (Ryan); J. Lönnqvist (Jouko); L.R. Lopes (Luis R.); M.C. Lopes (Margarida); J. Luan; D.G. MacArthur (Daniel G.); M. Mangino (Massimo); G. Marenne (Gaëlle); W. März (Winfried); J. Maslen (John); A. Matchan (Angela); I. Mathieson (Iain); P. McGuffin (Peter); A.M. McIntosh (Andrew); A.G. McKechanie (Andrew G.); A. McQuillin (Andrew); S. Metrustry (Sarah); N. Migone (Nicola); H.M. Mitchison (Hannah M.); A. Moayyeri (Alireza); J. Morris (James); R. Morris (Richard); D. Muddyman (Dawn); F. Muntoni; B.G. Nordestgaard (Børge G.); K. Northstone (Kate); M.C. O'donovan (Michael); S. O'Rahilly (Stephen); A. Onoufriadis (Alexandros); K. Oualkacha (Karim); M.J. Owen (Michael J.); A. Palotie (Aarno); K. Panoutsopoulou (Kalliope); V. Parker (Victoria); J.R. Parr (Jeremy R.); L. Paternoster (Lavinia); T. Paunio (Tiina); F. Payne (Felicity); S.J. Payne (Stewart J.); J.R.B. Perry (John); O.P.H. Pietiläinen (Olli); V. Plagnol (Vincent); R.C. Pollitt (Rebecca C.); S. Povey (Sue); M.A. Quail (Michael A.); L. Quaye (Lydia); L. Raymond (Lucy); K. Rehnström (Karola); C.K. Ridout (Cheryl K.); S.M. Ring (Susan); G.R.S. Ritchie (Graham R.S.); N. Roberts (Nicola); R.L. Robinson (Rachel L.); D.B. Savage (David); P.J. Scambler (Peter); S. Schiffels (Stephan); M. Schmidts (Miriam); N. Schoenmakers (Nadia); R.H. 
Scott (Richard H.); R.A. Scott (Robert); R.K. Semple (Robert K.); E. Serra (Eva); S.I. Sharp (Sally I.); A.C. Shaw (Adam C.); H.A. Shihab (Hashem A.); S.-Y. Shin (So-Youn); D. Skuse (David); K.S. Small (Kerrin); C. Smee (Carol); G.D. Smith; L. Southam (Lorraine); O. Spasic-Boskovic (Olivera); T.D. Spector (Timothy); D. St. Clair (David); B. St Pourcain (Beate); J. Stalker (Jim); E. Stevens (Elizabeth); J. Sun (Jianping); G. Surdulescu (Gabriela); J. Suvisaari (Jaana); P. Syrris (Petros); I. Tachmazidou (Ioanna); R. Taylor (Rohan); J. Tian (Jing); M.D. Tobin (Martin); D. Toniolo (Daniela); M. Traglia (Michela); A. Tybjaerg-Hansen; A.M. Valdes; A.M. Vandersteen (Anthony M.); A. Varbo (Anette); P. Vijayarangakannan (Parthiban); P.M. Visscher (Peter); L.V. Wain (Louise); J.T. Walters (James); G. Wang (Guangbiao); J. Wang (Jun); Y. Wang (Yu); K. Ward (Kirsten); E. Wheeler (Eleanor); P.H. Whincup (Peter); T. Whyte (Tamieka); H.J. Williams (Hywel J.); K.A. Williamson (Kathleen); C. Wilson (Crispian); S.G. Wilson (Scott); K. Wong (Kim); C. Xu (Changjiang); J. Yang (Jian); G. Zaza (Gianluigi); E. Zeggini (Eleftheria); F. Zhang (Feng); P. Zhang (Pingbo); W. Zhang (Weihua)

    2015-01-01

    Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced

  17. The Use of Non-Variant Sites to Improve the Clinical Assessment of Whole-Genome Sequence Data.

    Directory of Open Access Journals (Sweden)

    Alberto Ferrarini

    Genetic testing, which is now a routine part of clinical practice and disease management protocols, is often based on the assessment of small panels of variants or genes. On the other hand, continuous improvements in the speed and per-base costs of sequencing have now made whole exome sequencing (WES) and whole genome sequencing (WGS) viable strategies for targeted or complete genetic analysis, respectively. Standard WGS/WES data analytical workflows generally rely on calling sequence variants with respect to the reference genome sequence. However, the reference genome sequence contains a large number of sites represented by rare alleles, by known pathogenic alleles and by alleles strongly associated with disease by GWAS. It is thus critical, for clinical applications of WGS and WES, to determine whether non-variant sites are homozygous for the reference allele or whether the corresponding genotype cannot be reliably called. Here we show that an alternative analytical approach based on the analysis of both variant and non-variant sites from WGS data makes it possible to genotype more than 92% of sites corresponding to known SNPs, compared to 6% genotyped by standard variant analysis. These include homozygous reference sites of clinical interest, thus leading to a broad and comprehensive characterization of variation necessary for an accurate evaluation of disease risk. Altogether, our findings indicate that characterization of both variant and non-variant clinically informative sites in the genome is necessary to allow an accurate clinical assessment of a personal genome. Finally, we propose a highly efficient extended VCF (eVCF) file format which can store genotype calls for sites of clinical interest while remaining compatible with current variant interpretation software.

  18. Toddler Nutrition: MedlinePlus Health Topic

    Science.gov (United States)

    ... Cardiometabolic Risk. Article: Bringing babies and breasts into workplaces: Support for breastfeeding mothers... Toddler Nutrition -- see more articles Reference Desk Toddler Nutrition and Health Resource List (Department of Agriculture) - PDF Find an ...

  19. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  20. [Guidelines of reference recording in scientific papers of Chinese Journal of Applied Ecology].

    Science.gov (United States)

    Xiao, Hong

    2008-01-01

    To improve the compilation quality of references, coordinate with article search and periodical evaluation, and promote international academic exchange, the Chinese Journal of Applied Ecology will adjust its principles for recording references in scientific papers in line with GB/T 7714-2005. From 2008, references in submitted scientific papers must be recorded using the citation-sequence system. This paper presents some examples and points out the issues to which authors need to pay more attention.

  1. Integrative mining of traditional Chinese medicine literature and MEDLINE for functional gene networks.

    Science.gov (United States)

    Zhou, Xuezhong; Liu, Baoyan; Wu, Zhaohui; Feng, Yi

    2007-10-01

    The amount of biomedical data in different disciplines is growing at an exponential rate. Integrating these significant knowledge sources to generate novel hypotheses for systems biology research is difficult. Traditional Chinese medicine (TCM) is a completely different discipline, and is a complementary knowledge system to modern biomedical science. This paper uses a significant TCM bibliographic literature database in China, together with MEDLINE, to help discover novel gene functional knowledge. We present an integrative mining approach to uncover the functional gene relationships from MEDLINE and TCM bibliographic literature. This paper introduces TCM literature (about 50,000 records) as one knowledge source for constructing literature-based gene networks. We use the TCM diagnosis, TCM syndrome, to automatically congregate the related genes. The syndrome-gene relationships are discovered based on the syndrome-disease relationships extracted from TCM literature and the disease-gene relationships in MEDLINE. Based on the bubble-bootstrapping and relation weight computing methods, we have developed a prototype system called MeDisco/3S, which has name entity and relation extraction, and online analytical processing (OLAP) capabilities, to perform the integrative mining process. We have got about 200,000 syndrome-gene relations, which could help generate syndrome-based gene networks, and help analyze the functional knowledge of genes from syndrome perspective. We take the gene network of Kidney-Yang Deficiency syndrome (KYD syndrome) and the functional analysis of some genes, such as CRH (corticotropin releasing hormone), PTH (parathyroid hormone), PRL (prolactin), BRCA1 (breast cancer 1, early onset) and BRCA2 (breast cancer 2, early onset), to demonstrate the preliminary results. The underlying hypothesis is that the related genes of the same syndrome will have some biological functional relationships, and will constitute a functional network. 
This paper presents

  2. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    Science.gov (United States)

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.

  3. Application of genotyping by sequencing technology to a variety of crop breeding programs.

    Science.gov (United States)

    Kim, Changsoo; Guo, Hui; Kong, Wenqian; Chandnani, Rahul; Shuang, Lan-Shuan; Paterson, Andrew H

    2016-01-01

    Since the Arabidopsis genome was completed, draft sequences or pseudomolecules have been published for more than 100 plant genomes including green algae, in large part due to advances in sequencing technologies. Advanced DNA sequencing technologies have also conferred new opportunities for high-throughput low-cost crop genotyping, based on single-nucleotide polymorphisms (SNPs). However, a recurring complication in crop genotyping that differs from other taxa is a higher level of DNA sequence duplication, noting that all angiosperms are thought to have polyploidy in their evolutionary history. In the current article, we briefly review current genotyping methods using next-generation sequencing (NGS) technologies. We also explore case studies of genotyping-by-sequencing (GBS) applications to several crops differing in genome size, organization and breeding system (paleopolyploids, neo-allopolyploids, neo-autopolyploids). GBS typically shows good results when it is applied to an inbred diploid species with a well-established reference genome. However, we have also made some progress toward GBS of outcrossing species lacking reference genomes and of polyploid populations, which still need much improvement. Regardless of some limitations, low-cost and multiplexed genotyping offered by GBS will be beneficial to breed superior cultivars in many crop species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  4. Zero-Sequence Voltage Modulation Strategy for Multiparallel Converters Circulating Current Suppression

    DEFF Research Database (Denmark)

    Zhu, Rongwu; Liserre, Marco; Chen, Zhe

    2017-01-01

    A zero-sequence circulating current (ZSCC) is typically generated among the multiparallel converters that share the common dc link and ac side without isolated transformers under the space vector modulation (SVM), due to the injected third-order zero-sequence voltage (ZSV). This paper analyzes SVM...... references and filter inductances. The simulation and experimental results based on the parallel converters clearly verify the effectiveness of the proposed control....

  5. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  6. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

    DEFF Research Database (Denmark)

    Huang, Jie; Howie, Bryan; Mccarthy, Shane

    2015-01-01

    Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low de...

  7. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  8. SRComp: short read sequence compression using burstsort and Elias omega coding.

    Directory of Open Access Journals (Sweden)

    Jeremy John Selva

    Next-generation sequencing (NGS) technologies permit the rapid production of vast amounts of data at low cost. Economical data storage and transmission hence becomes an increasingly important challenge for NGS experiments. In this paper, we introduce a new non-reference based read sequence compression tool called SRComp. It works by first employing a fast string-sorting algorithm called burstsort to sort read sequences in lexicographical order and then Elias omega-based integer coding to encode the sorted read sequences. SRComp has been benchmarked on four large NGS datasets, where experimental results show that it can run 5-35 times faster than current state-of-the-art read sequence compression tools such as BEETL and SCALCE, while retaining comparable compression efficiency for large collections of short read sequences. SRComp is a read sequence compression tool that is particularly valuable in certain applications where compression time is of major concern.
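
The abstract names two ingredients, lexicographic sorting and Elias omega integer coding, without saying how reads are turned into integers. The Python sketch below is one illustrative possibility and not SRComp's actual scheme: it assumes fixed-length reads over A/C/G/T, uses Python's built-in sort as a stand-in for burstsort, and encodes the gaps between consecutive sorted reads. All function names are hypothetical.

```python
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"


def elias_omega_encode(n):
    """Return the Elias omega code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias omega codes only positive integers")
    code = "0"                      # every codeword ends with a 0 bit
    while n > 1:
        group = bin(n)[2:]          # binary form of n, leading bit is 1
        code = group + code         # prepend the group
        n = len(group) - 1          # next, encode the group length minus one
    return code


def elias_omega_decode(bits):
    """Decode a concatenation of Elias omega codewords into integers."""
    values = []
    i = 0
    while i < len(bits):
        n = 1
        while bits[i] == "1":       # a 1 starts a group of n + 1 bits
            group = bits[i:i + n + 1]
            i += n + 1
            n = int(group, 2)
        i += 1                      # skip the terminating 0
        values.append(n)
    return values


def read_to_int(read):
    """Interpret a read as a base-4 number (assumed mapping A<C<G<T)."""
    value = 0
    for ch in read:
        value = value * 4 + BASE[ch]
    return value


def int_to_read(value, length):
    """Inverse of read_to_int for a known read length."""
    chars = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        chars.append(LETTERS[digit])
    return "".join(reversed(chars))


def compress_reads(reads):
    """Sort reads, then omega-code the gap to the previous read plus one."""
    previous = 0
    pieces = []
    for value in sorted(read_to_int(r) for r in reads):
        pieces.append(elias_omega_encode(value - previous + 1))
        previous = value
    return "".join(pieces)


def decompress_reads(bits, length):
    """Rebuild the sorted reads from the gap codes."""
    previous = 0
    reads = []
    for gap in elias_omega_decode(bits):
        previous += gap - 1
        reads.append(int_to_read(previous, length))
    return reads
```

In this sketch the reads come back in sorted order rather than input order, and duplicate reads cost a single bit each because a gap of zero is stored as the integer 1.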

  9. Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

    Science.gov (United States)

    Tanabe, Akifumi S; Toju, Hirokazu

    2013-01-01

    Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate

  10. Exome-wide DNA capture and next generation sequencing in domestic and wild species

    Directory of Open Access Journals (Sweden)

    Ng Sarah B

    2011-07-01

    Background: Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. Results: We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. Conclusions: This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.

  11. Exome-wide DNA capture and next generation sequencing in domestic and wild species.

    Science.gov (United States)

    Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

    2011-07-05

    Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.

  12. Probabilistic Methods for Processing High-Throughput Sequencing Signals

    DEFF Research Database (Denmark)

    Sørensen, Lasse Maretty

    High-throughput sequencing has the potential to answer many of the big questions in biology and medicine. It can be used to determine the ancestry of species, to chart complex ecosystems and to understand and diagnose disease. However, going from raw sequencing data to biological or medical insig....... By estimating the genotypes on a set of candidate variants obtained from both a standard mapping-based approach as well as de novo assemblies, we are able to find considerably more structural variation than previous studies...... for reconstructing transcript sequences from RNA sequencing data. The method is based on a novel sparse prior distribution over transcript abundances and is markedly more accurate than existing approaches. The second chapter describes a new method for calling genotypes from a fixed set of candidate variants....... The method queries the reads using a graph representation of the variants and hereby mitigates the reference-bias that characterise standard genotyping methods. In the last chapter, we apply this method to call the genotypes of 50 deeply sequencing parent-offspring trios from the GenomeDenmark project...

  13. Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

    Science.gov (United States)

    Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

    2012-08-01

    Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
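
The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the figures stated above; the helper name `polymorphism_rate` is hypothetical and introduced purely for illustration.

```python
def polymorphism_rate(polymorphic, assessed):
    """Percentage of assessed primer pairs that were polymorphic."""
    return 100.0 * polymorphic / assessed


# Figures taken from the abstract.
bes_total, est_total = 1164, 342        # SSRs from BAC-end sequences and ESTs
bes_poly, est_poly = 673, 145           # polymorphic primer pairs per class
previously_reported = 508               # SSRs from all earlier reports

total_ssrs = bes_total + est_total      # should match the 1,506 assessed SSRs
novel_poly = bes_poly + est_poly        # should match the 818 novel pairs
bes_rate = polymorphism_rate(bes_poly, bes_total)
est_rate = polymorphism_rate(est_poly, est_total)
growth = (previously_reported + novel_poly) / previously_reported

print(f"assessed SSRs: {total_ssrs}")
print(f"novel polymorphic pairs: {novel_poly}")
print(f"BES rate: {bes_rate:.1f} %, EST rate: {est_rate:.1f} %")
print(f"repertoire growth factor: {growth:.2f}")
```

The rounded rates agree with the 58 % and 42 % given in the abstract, and a growth factor above 2 is consistent with the statement that the repertoire has more than doubled.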

  14. Genetics Home Reference: spondylothoracic dysostosis

    Science.gov (United States)

    ... normal-length arms and legs, called short-trunk dwarfism. The spine and rib abnormalities, which are present ... Additional Information & Resources MedlinePlus (2 links) Health Topic: Dwarfism Health Topic: Spine Injuries and Disorders Genetic and ...

  15. Quantitative comparison between a multiecho sequence and a single-echo sequence for susceptibility-weighted phase imaging.

    Science.gov (United States)

    Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles

    2012-06-01

    The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Bunches of random cross-correlated sequences

    International Nuclear Information System (INIS)

    Maystrenko, A A; Melnik, S S; Pritula, G M; Usatenko, O V

    2013-01-01

    The statistical properties of random cross-correlated sequences constructed by the convolution method (likewise referred to as the Rice or the inverse Fourier transformation) are examined. We clarify the meaning of the filtering function—the kernel of the convolution operator—and show that it is the value of the cross-correlation function which describes correlations between the initial white noise and constructed correlated sequences. The matrix generalization of this method for constructing a bunch of N cross-correlated sequences is presented. Algorithms for their generation are reduced to solving the problem of decomposition of the Fourier transform of the correlation matrix into a product of two mutually conjugate matrices. Different decompositions are considered. The limits of weak and strong correlations for the one-point probability and pair correlation functions of sequences generated by the method under consideration are studied. Special cases of heavy-tailed distributions of the generated sequences are analyzed. We show that, if the filtering function is rather smooth, the distribution function of generated variables has the Gaussian or Lévy form depending on the analytical properties of the distribution (or characteristic) functions of the initial white noise. Anisotropic properties of statistically homogeneous random sequences related to the asymmetry of a filtering function are revealed and studied. These asymmetry properties are expressed in terms of the third- or fourth-order correlation functions. Several examples of the construction of correlated chains with a predefined correlation matrix are given. (paper)

  17. BOKP: A DNA Barcode Reference Library for Monitoring Herbal Drugs in the Korean Pharmacopeia

    Directory of Open Access Journals (Sweden)

    Jinxin Liu

    2017-12-01

    Full Text Available Herbal drug authentication is an important task in traditional medicine; however, it is challenged by the limitations of traditional authentication methods and the lack of trained experts. DNA barcoding is conspicuous in almost all areas of the biological sciences and has already been added to the British pharmacopeia and Chinese pharmacopeia for routine herbal drug authentication. However, DNA barcoding for the Korean pharmacopeia still requires significant improvements. Here, we present a DNA barcode reference library for herbal drugs in the Korean pharmacopeia and develop a species identification engine named KP-IDE to facilitate the adoption of this DNA reference library for herbal drug authentication. Using taxonomy records, specimen records, sequence records, and reference records, KP-IDE can identify an unknown specimen. Currently, there are 6,777 taxonomy records, 1,054 specimen records, 30,744 sequence records (ITS2 and psbA-trnH) and 285 reference records. Moreover, 27 herbal drug materials were collected from the Seoul Yangnyeongsi herbal medicine market to give an example of real herbal drug authentication. Our study demonstrates the prospects of the DNA barcode reference library for the Korean pharmacopeia and provides future directions for the use of DNA barcoding for authenticating herbal drugs listed in other modern pharmacopeias.

  18. Mitochondrial D-loop sequence variation among Italian horse breeds

    Directory of Open Access Journals (Sweden)

    Zanotti Marta

    2004-11-01

    Full Text Available Abstract The genetic variability of the mitochondrial D-loop DNA sequence in seven horse breeds bred in Italy (Giara, Haflinger, Italian trotter, Lipizzan, Maremmano, Thoroughbred and Sarcidano was analysed. Five unrelated horses were chosen in each breed and twenty-two haplotypes were identified. The sequences obtained were aligned and compared with a reference sequence and with 27 mtDNA D-loop sequences selected in the GenBank database, representing Spanish, Portuguese, North African, wild horses and an Equus asinus sequence as the outgroup. Kimura two-parameter distances were calculated and a cluster analysis using the Neighbour-joining method was performed to obtain phylogenetic trees among breeds bred in Italy and among Italian and foreign breeds. The cluster analysis indicates that all the breeds but Giara are divided in the two trees, and no clear relationships were revealed between Italian populations and the other breeds. These results could be interpreted as showing the mixed origin of breeds bred in Italy and probably indicate the presence of many ancient maternal lineages with high diversity in mtDNA sequences.
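
    The Kimura two-parameter distance mentioned above has a standard closed form, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P is the proportion of sites that differ by a transition and Q the proportion that differ by a transversion. The Python below is a minimal sketch of that formula for two pre-aligned sequences. The function name, the handling of gaps (sites with non-ACGT characters are skipped) and the toy sequences are assumptions for illustration, not details from the study.

```python
import math

# Purine-purine and pyrimidine-pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with gaps or ambiguous bases are ignored.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n   # transitions
    q = sum(1 for a, b in pairs
            if a != b and (a, b) not in TRANSITIONS) / n      # transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy aligned pair with a single A->G transition in 10 sites.
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

    A matrix of such pairwise distances is the usual input to a neighbour-joining tree builder; that step is not shown here.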

  19. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    . Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...

  20. Is there an added value of T1-weighted contrast-enhanced fat-suppressed spin-echo MR sequences compared to STIR sequences in MRI of the foot and ankle?

    Energy Technology Data Exchange (ETDEWEB)

    Zubler, Veronika; Zanetti, Marco; Dietrich, Tobias J.; Pfirrmann, Christian W.; Mamisch-Saupe, Nadja [University of Zurich, Faculty of Medicine, Zurich (Switzerland); Orthopedic University Hospital Balgrist, Department of Radiology, Zurich (Switzerland); Espinosa, Norman [University of Zurich, Faculty of Medicine, Zurich (Switzerland); Orthopedic University Hospital Balgrist, Orthopedic Surgery, Zurich (Switzerland)

    2017-08-15

    To prospectively compare T1-weighted fat-suppressed spin-echo magnetic resonance (MR) sequences after gadolinium application (T1wGdFS) to STIR sequences in patients with acute and chronic foot pain. In 51 patients referred for MRI of the foot and ankle, additional transverse and sagittal T1wGdFS sequences were obtained. Two sets of MR images (standard protocol with STIR or T1wGdFS) were analysed. Diagnosis, diagnostic confidence, and localization of the abnormality were noted. Standard of reference was established by an expert panel of two experienced MSK radiologists and one experienced foot surgeon based on MR images, clinical charts and surgical reports. Patients reported prospectively localization of pain. Descriptive statistics, McNemar test and Kappa test were used. Diagnostic accuracy with STIR protocol was 80% for reader 1, 67% for reader 2, with contrast-protocol 84%, both readers. Significance was found for reader 2. Diagnostic confidence for reader 1 was 1.7 with STIR, 1.3 with contrast-protocol; reader 2: 2.1/1.7. Significance was found for reader 1. Pain location correlated with STIR sequences in 64% and 52%, with gadolinium sequences in 70% and 71%. T1-weighted contrast material-enhanced fat-suppressed spin-echo magnetic resonance sequences improve diagnostic accuracy, diagnostic confidence and correlation of MR abnormalities with pain location in MRI of the foot and ankle. However, the additional value is small. (orig.)

  1. Is there an added value of T1-weighted contrast-enhanced fat-suppressed spin-echo MR sequences compared to STIR sequences in MRI of the foot and ankle?

    International Nuclear Information System (INIS)

    Zubler, Veronika; Zanetti, Marco; Dietrich, Tobias J.; Pfirrmann, Christian W.; Mamisch-Saupe, Nadja; Espinosa, Norman

    2017-01-01

    To prospectively compare T1-weighted fat-suppressed spin-echo magnetic resonance (MR) sequences after gadolinium application (T1wGdFS) to STIR sequences in patients with acute and chronic foot pain. In 51 patients referred for MRI of the foot and ankle, additional transverse and sagittal T1wGdFS sequences were obtained. Two sets of MR images (standard protocol with STIR or T1wGdFS) were analysed. Diagnosis, diagnostic confidence, and localization of the abnormality were noted. Standard of reference was established by an expert panel of two experienced MSK radiologists and one experienced foot surgeon based on MR images, clinical charts and surgical reports. Patients reported prospectively localization of pain. Descriptive statistics, McNemar test and Kappa test were used. Diagnostic accuracy with STIR protocol was 80% for reader 1, 67% for reader 2, with contrast-protocol 84%, both readers. Significance was found for reader 2. Diagnostic confidence for reader 1 was 1.7 with STIR, 1.3 with contrast-protocol; reader 2: 2.1/1.7. Significance was found for reader 1. Pain location correlated with STIR sequences in 64% and 52%, with gadolinium sequences in 70% and 71%. T1-weighted contrast material-enhanced fat-suppressed spin-echo magnetic resonance sequences improve diagnostic accuracy, diagnostic confidence and correlation of MR abnormalities with pain location in MRI of the foot and ankle. However, the additional value is small. (orig.)

  2. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into several general categories, including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used in the several steps of the data analysis workflow.

  3. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

    Science.gov (United States)

    Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

    2015-11-18

    RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains considerable room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as

  4. Genome sequences of three strains of Aspergillus flavus for the biological control of Aflatoxin

    Science.gov (United States)

    The genomes of three strains of Aspergillus flavus with demonstrated utility for the biological control of aflatoxin were sequenced. These sequences were assembled with MIRA and annotated with Augustus using A. flavus strain 3357 (NCBI EQ963472) as a reference. Each strain had a genome of 36.3 to ...

  5. Pleiades rapid rotators - evidence for an evolutionary sequence

    International Nuclear Information System (INIS)

    Butler, R.P.; Marcy, G.W.; Cohen, R.D.; Duncan, D.K. (California Univ., La Jolla; Space Telescope Science Institute, Baltimore, MD)

    1987-01-01

    Four rapidly rotating early-K dwarfs in the Pleiades are shown to contain an order of magnitude more Li than four slow rotators of the same spectral type, as would be expected if they were systematically younger. This supports the idea that late-type stars first arrive on the main sequence with V(rot) greater than about 100 km/s, that they spin down to V(rot) less than about 10 km/s in 10^7 to 10^8 yr, and that the Pleiades lower main sequence shows such an age spread. 14 references

  6. Genetics Home Reference: atopic dermatitis

    Science.gov (United States)

    ... adults, the rashes typically occur on the wrists, ankles, and eyelids in addition to the bend of ... Information from MedlinePlus (5 links) Diagnostic Tests Drug Therapy Genetic ... Manual Consumer Version The University of Chicago Medicine World ...

  7. Dispersed repetitive sequences in eukaryotic genomes and their possible biological significance

    International Nuclear Information System (INIS)

    Georgiev, G.P.; Kramerov, D.A.; Ryskov, A.P.; Skryabin, K.G.; Lukanidin, E.M.

    1983-01-01

    This paper describes the properties of a novel mouse mdg-like element, the A2 sequence, which is the most abundant repetitive sequence. We also characterized a ubiquitous B2 sequence that represents, after B1, the dominant family among the short interspersed repeats of the mouse genome. The existence of some putative transposition intermediates was shown for repeats of both A and B types of the mouse genome. These are closed circular DNA of the A type and small polyadenylated B + RNAs. The fundamental question that arises is whether these sequences are simply selfish DNA capable of transposition or whether they fulfill some useful biological functions within the genome. 66 references, 11 figures, 1 table

  8. Comparison of tiered formularies and reference pricing policies: a systematic review.

    Science.gov (United States)

    Morgan, Steve; Hanley, Gillian; Greyson, Devon

    2009-01-01

    To synthesize methodologically comparable evidence from the published literature regarding the outcomes of tiered formularies and therapeutic reference pricing of prescription drugs. We searched the following electronic databases: ABI/Inform, CINAHL, Clinical Evidence, Digital Dissertations & Theses, Evidence-Based Medicine Reviews (which incorporates ACP Journal Club, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, Cochrane Methodology Register, Database of Abstracts of Reviews of Effectiveness, Health Technology Assessments and NHS Economic Evaluation Database), EconLit, EMBASE, International Pharmaceutical Abstracts, MEDLINE, PAIS International and PAIS Archive, and the Web of Science. We also searched the reference lists of relevant articles and several grey literature sources. We sought English-language studies published from 1986 to 2007 that examined the effects of either therapeutic reference pricing or tiered formularies, reported on outcomes relevant to patient care and cost-effectiveness, and employed quantitative study designs that included concurrent or historical comparison groups. We abstracted and assessed potentially appropriate articles using a modified version of the data abstraction form developed by the Cochrane Effective Practice and Organisation of Care Group. From an initial list of 2964 citations, 12 citations (representing 11 studies) were deemed eligible for inclusion in our review: 3 studies (reported in 4 articles) of reference pricing and 8 studies of tiered formularies. The introduction of reference pricing was associated with reduced plan spending, switching to preferred medicines, reduced overall drug utilization and short-term increases in the use of physician services. Reference pricing was not associated with adverse health impacts. The introduction of tiered formularies was associated with reduced plan expenditures, greater patient costs and increased rates of non-compliance with

  9. Recent advances in nanopore-based nucleic acid analysis and sequencing

    International Nuclear Information System (INIS)

    Shi, Jidong; Fang, Ying; Hou, Junfeng

    2016-01-01

    Nanopore-based sequencing platforms are transforming the field of genomic science. This review (containing 116 references) highlights some recent progress on nanopore-based nucleic acid analysis and sequencing. These studies are classified into three categories, biological, solid-state, and hybrid nanopores, according to their nanoporous materials. We begin with a brief description of the translocation-based detection mechanism of nanopores. Next, specific examples are given in nanopore-based nucleic acid analysis and sequencing, with an emphasis on identifying strategies that can improve the resolution of nanopores. This review concludes with a discussion of future research directions that will advance the practical applications of nanopore technology. (author)

  10. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

    Science.gov (United States)

    Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

    2016-05-26

    Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform remains unexplored. Unlike Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of various lengths. Whether the performance of the commonly used software for handling such data is satisfactory is unknown. Using universal human reference RNA as the initial material, RNaseIII and chemical fragmentation methods in library construction showed similar results in the number of genes and junctions discovered and in the accuracy of expression level estimates. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27 % to genome, 86.46 % to transcriptome), though 47.83 % of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation guide, the mapping rate of TopHat2 significantly increased from 75.79 to 92.09 %, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq on the Ion Proton platform. Chemical fragmentation performed equally well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.

  11. Analysis of the AD sequence in Zion plant using the March 1.1 code

    International Nuclear Information System (INIS)

    Oriolo, F.; Paci, S.

    1985-01-01

    This paper presents the analyses of the AD sequence for the Zion power plant, performed at Pisa University in the framework of its participation in the Source Term Working Group. After a short description of the plant and the sequence under analysis, the model used for the reference computation and the results obtained using the March 1.1 code are shown. Alongside the reference computation, a series of parametric tests was also carried out on some code input variables in order to ascertain their influence on the transient trend. The results of these analyses are shown in the Appendix

  12. Draft Genome Sequence of Campylobacter jejuni 11168H

    Science.gov (United States)

    Macdonald, Sarah E.; Gundogdu, Ozan; Dorrell, Nick; Wren, Brendan W.; Blake, Damer

    2017-01-01

    ABSTRACT Campylobacter jejuni is the most prevalent cause of food-borne gastroenteritis in the developed world. The reference and original sequenced strain C. jejuni NCTC11168 has low levels of motility compared to clinical isolates. Here, we describe the draft genome of the laboratory derived hypermotile variant named 11168H. PMID:28153902

  13. Norgal: Extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data

    DEFF Research Database (Denmark)

    Al-Nakeeb, Kosai Ali Ahmed; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-01-01

    and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences...

  14. Valued Components of a Consultant Letter from Referring Physicians' Perspective: a Systematic Literature Synthesis.

    Science.gov (United States)

    Rash, Arjun H; Sheldon, Robert; Donald, Maoliosa; Eronmwon, Cindy; Kuriachan, Vikas P

    2018-03-05

    Effective communication between consultants and physicians forms an integral foundation of effective and expert patient care. A broad review of the literature has not been undertaken to determine the components of a consultant's letter of most value to the referring physician. We aimed to identify the components of a consultant's letter preferred by referring physicians. We searched Embase and MEDLINE (OVID) Medicine (EBM) Reviews and Cochrane Database of Systematic Reviews for English articles with no restriction on initial date to January 6, 2017. Articles containing letters from specialists to referring physicians regarding outpatient assessments with either an observational or experimental design were included. Studies were excluded if they pertained to communications from referring physicians to consultant specialists, or pertained to allied health professionals, inpatient documents, or opinion articles. We enumerated the frequencies with which three common themes were addressed, and the positive or negative nature of the comments. The three themes were the structure of consultant letters, their contents, and whether referring physicians and consultants shared a common opinion about the items. Eighteen articles were included in our synthesis. In 11 reports, 91% of respondents preferred structured formats. Other preferred structural features were problem lists and brevity (four reports each). The most preferred contents were oriented to insight: diagnosis, prognosis, and management plan (16/21 mentions in the top tertile). Data items such as history, physical examination, and medication lists were less important (1/23 mentions in the top tertile). Reports varied as to whether referring physicians and consultants shared common opinions about letter features. Referring physicians prefer brief, structured letters from consultants that feature diagnostic and prognostic opinions and management plans over unstructured letters that emphasize data elements such as

  15. National Library of Medicine Celebrates 30 Years of Progress and Charts the Future | NIH MedlinePlus the Magazine

    Science.gov (United States)

    ... described the genesis of the National Center for Biotechnology Information (NCBI); the Visible Human Project (a digital ... expand the publication and distribution of NIH MedlinePlus magazine, thousands and thousands more people will gain valuable, ...

  16. High-Throughput Next-Generation Sequencing of Polioviruses

    Science.gov (United States)

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  17. Clinical Whole-Exome Sequencing for the Diagnosis of Mendelian Disorders

    Science.gov (United States)

    Yang, Yaping; Muzny, Donna M.; Reid, Jeffrey G.; Bainbridge, Matthew N.; Willis, Alecia; Ward, Patricia A.; Braxton, Alicia; Beuten, Joke; Xia, Fan; Niu, Zhiyv; Hardison, Matthew; Person, Richard; Bekheirnia, Mir Reza; Leduc, Magalie S.; Kirby, Amelia; Pham, Peter; Scull, Jennifer; Wang, Min; Ding, Yan; Plon, Sharon E.; Lupski, James R.; Beaudet, Arthur L.; Gibbs, Richard A.; Eng, Christine M.

    2014-01-01

    BACKGROUND Whole-exome sequencing is a diagnostic approach for the identification of molecular defects in patients with suspected genetic disorders. METHODS We developed technical, bioinformatic, interpretive, and validation pipelines for whole-exome sequencing in a certified clinical laboratory to identify sequence variants underlying disease phenotypes in patients. RESULTS We present data on the first 250 probands for whom referring physicians ordered whole-exome sequencing. Patients presented with a range of phenotypes suggesting potential genetic causes. Approximately 80% were children with neurologic phenotypes. Insurance coverage was similar to that for established genetic tests. We identified 86 mutated alleles that were highly likely to be causative in 62 of the 250 patients, achieving a 25% molecular diagnostic rate (95% confidence interval, 20 to 31). Among the 62 patients, 33 had autosomal dominant disease, 16 had autosomal recessive disease, and 9 had X-linked disease. A total of 4 probands received two nonoverlapping molecular diagnoses, which potentially challenged the clinical diagnosis that had been made on the basis of history and physical examination. A total of 83% of the autosomal dominant mutant alleles and 40% of the X-linked mutant alleles occurred de novo. Recurrent clinical phenotypes occurred in patients with mutations that were highly likely to be causative in the same genes and in different genes responsible for genetically heterogeneous disorders. CONCLUSIONS Whole-exome sequencing identified the underlying genetic defect in 25% of consecutive patients referred for evaluation of a possible genetic condition. (Funded by the National Human Genome Research Institute.) PMID:24088041
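
    The reported diagnostic rate and its interval can be checked with a short calculation. The abstract does not state which interval method was used, so the Python below assumes a Wilson score interval purely for illustration. The helper name `wilson_interval` is hypothetical, and only the counts 62 and 250 come from the record.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

# 62 of 250 probands received a molecular diagnosis.
low, high = wilson_interval(62, 250)
print(f"rate = {62 / 250:.1%}, interval = {low:.1%} to {high:.1%}")
```

    Under this assumption the bounds round to the 20 to 31 quoted in the abstract. Other interval methods would give slightly different endpoints.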

  18. Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

    Science.gov (United States)

    Alkhateeb, Abedalrhman; Rueda, Luis

    2017-08-01

    Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
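
    The two-pass idea described above (score each read by its number of unique k-mers, then drop reads whose z-score falls below a threshold) can be sketched in a few lines of Python. This is not the Zseq implementation. The function names, the choice of k, the default threshold and the simple handling of ambiguous bases (k-mers containing N are not counted) are all assumptions for illustration, and the GC-content factor mentioned in the abstract is left out.

```python
from statistics import mean, pstdev

def unique_kmer_count(read, k=4):
    """Complexity score: number of distinct k-mers without ambiguous bases."""
    kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)

def filter_reads(reads, k=4, z_threshold=-1.0):
    """Keep reads whose complexity z-score is at or above z_threshold."""
    scores = [unique_kmer_count(read, k) for read in reads]  # first pass
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return list(reads)  # all reads equally complex, nothing to filter
    # Second pass: compare each read's z-score with the threshold.
    return [read for read, score in zip(reads, scores)
            if (score - mu) / sigma >= z_threshold]

# Made-up reads: three diverse sequences and one low-complexity homopolymer.
reads = ["ACGTTGCAAGCTAGGC", "ACGGTCATTGACCTAG",
         "TTGACCGATAGCTTAC", "AAAAAAAAAAAAAAAA"]
print(filter_reads(reads))
```

    Both passes touch each read once, which is consistent with the linear behaviour the abstract claims, although the real tool may organise the computation differently.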

  19. Draft genome sequence of the Coccolithovirus Emiliania huxleyi virus 203.

    Science.gov (United States)

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2011-12-01

    The Coccolithoviridae are a recently discovered group of viruses that infect the marine coccolithophorid Emiliania huxleyi. Emiliania huxleyi virus 203 (EhV-203) has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 400 kbp, consisting of 464 coding sequences (CDSs). Here we describe the genomic features of EhV-203 together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  20. G-MAPSEQ – a new method for mapping reads to a reference genome

    Directory of Open Access Journals (Sweden)

    Wojciechowski Pawel

    2016-06-01

    Mapping reads to a reference genome is one of the most essential problems in modern computational biology. The most popular algorithms used to solve this problem are based on the Burrows-Wheeler transform and the FM-index. However, this causes some issues with highly mutated sequences due to a limited number of mutations allowed. G-MAPSEQ is a novel, hybrid algorithm combining two interesting methods: alignment-free sequence comparison and an ultra-fast sequence alignment. The former is a fast heuristic algorithm which uses k-mer characteristics of nucleotide sequences to find potential mapping places. The latter is a very fast GPU implementation of sequence alignment used to verify the correctness of these mapping positions. The source code of G-MAPSEQ along with other bioinformatic software is available at: http://gpualign.cs.put.poznan.pl.
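
    The two-stage idea (k-mer based candidate search, then verification) can be sketched on the CPU as follows. This is not G-MAPSEQ's code: the function names are hypothetical, the candidate search is a plain k-mer voting scheme, and the GPU alignment step is replaced here by a simple Hamming-distance check.

```python
from collections import Counter, defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def candidate_positions(read, index, k, min_votes=2):
    """Vote for read start positions implied by shared k-mers."""
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for ref_pos in index.get(read[offset:offset + k], ()):
            votes[ref_pos - offset] += 1
    return [pos for pos, n in votes.most_common() if n >= min_votes]

def hamming_verify(read, reference, pos, max_mismatches):
    """Stand-in for the alignment step: accept if few mismatches at pos."""
    if pos < 0 or pos + len(read) > len(reference):
        return False
    window = reference[pos:pos + len(read)]
    return sum(a != b for a, b in zip(read, window)) <= max_mismatches

reference = "TTGACCGATAGGCTAACGTTAGCATCGGA"
read = "TAGGCGAACGTT"  # reference[8:20] with one substitution
index = build_kmer_index(reference, k=4)
candidates = candidate_positions(read, index, k=4)
hits = [p for p in candidates if hamming_verify(read, reference, p, 1)]
print(candidates, hits)
```

    Because candidates come from shared k-mers rather than from an exact-match index walk, a read with a substitution still receives votes from the k-mers that do not overlap the mutated base.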

  1. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data.

    Science.gov (United States)

    Al-Nakeeb, Kosai; Petersen, Thomas Nordahl; Sicheritz-Pontén, Thomas

    2017-11-21

    Whole-genome sequencing (WGS) projects provide short read nucleotide sequences from nuclear and possibly organelle DNA depending on the source of origin. Mitochondrial DNA is present in animals and fungi, while plants contain DNA from both mitochondria and chloroplasts. Current techniques for separating organelle reads from nuclear reads in WGS data require full reference or partial seed sequences for assembling. Norgal (de Novo ORGAneLle extractor) avoids this requirement by identifying a high frequency subset of k-mers that are predominantly of mitochondrial origin and performing a de novo assembly on a subset of reads that contains these k-mers. The method was applied to WGS data from a panda, brown algae seaweed, butterfly and filamentous fungus. We were able to extract full circular mitochondrial genomes and obtained sequence identities to the reference sequences in the range from 98.5 to 99.5%. We also assembled the chloroplasts of grape vines and cucumbers using Norgal together with seed-based de novo assemblers. Norgal is a pipeline that can extract and assemble full or partial mitochondrial and chloroplast genomes from WGS short reads without prior knowledge. The program is available at: https://bitbucket.org/kosaidtu/norgal .
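
    The core selection step (keep reads that contain high-frequency k-mers, which tend to come from high-copy organelle DNA) can be sketched as below. This is an illustration only and not Norgal's code: the function names and the fixed `min_count` threshold are hypothetical, and the abstract does not describe how the real frequency cutoff is chosen.

```python
from collections import Counter

def kmers(seq, k):
    """Yield all overlapping k-mers of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))

def select_high_frequency_reads(reads, k, min_count):
    """Keep reads containing at least one k-mer seen min_count times or more."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    frequent = {km for km, c in counts.items() if c >= min_count}
    return [r for r in reads if any(km in frequent for km in kmers(r, k))]

# Toy data: one sequence sampled five times (high copy number) and two
# reads seen once each (low copy number).
reads = ["ACGTACGGTC"] * 5 + ["TTGACCAGTA", "GGCATTCAGA"]
selected = select_high_frequency_reads(reads, k=5, min_count=3)
print(len(selected), "of", len(reads), "reads selected")
```

    The selected subset would then be handed to a de novo assembler, which is outside the scope of this sketch.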

  2. Pharmaceutical policies: effects of reference pricing, other pricing, and purchasing policies.

    Science.gov (United States)

    Acosta, Angela; Ciapponi, Agustín; Aaserud, Morten; Vietto, Valeria; Austvoll-Dahlgren, Astrid; Kösters, Jan Peter; Vacca, Claudia; Machado, Manuel; Diaz Ayala, Diana Hazbeydy; Oxman, Andrew D

    2014-10-16

    Pharmaceuticals are important interventions that could improve people's health. Pharmaceutical pricing and purchasing policies are used as cost-containment measures to determine or affect the prices that are paid for drugs. Internal reference pricing establishes a benchmark or reference price within a country which is the maximum level of reimbursement for a group of drugs. Other policies include price controls, maximum prices, index pricing, price negotiations and volume-based pricing. The objective was to determine the effects of pharmaceutical pricing and purchasing policies on health outcomes, healthcare utilisation, drug expenditures and drug use. We searched the Cochrane Central Register of Controlled Trials (CENTRAL), part of The Cochrane Library (including the Effective Practice and Organisation of Care Group Register) (searched 22/10/2012); MEDLINE In-Process & Other Non-Indexed Citations and MEDLINE, Ovid (searched 22/10/2012); EconLit, ProQuest (searched 22/10/2012); PAIS International, ProQuest (searched 22/10/2012); World Wide Political Science Abstracts, ProQuest (searched 22/10/2012); INRUD Bibliography (searched 22/10/2012); Embase, Ovid (searched 14/12/2010); NHSEED, part of The Cochrane Library (searched 08/12/2010); LILACS, VHL (searched 14/12/2010); International Political Science Abstracts (IPSA), Ebsco (searched 17/12/2010); OpenSIGLE (searched 21/12/10); WHOLIS, WHO (searched 17/12/2010); World Bank (Documents and Reports) (searched 21/12/2010); Jolis (searched 09/10/2011); Global Jolis (searched 09/10/2011); OECD (searched 30/08/2005); OECD iLibrary (searched 30/08/2005); World Bank eLibrary (searched 21/12/2010); WHO - The Essential Drugs and Medicines web site (browsed 21/12/2010). Policies in this review were defined as laws; rules; financial and administrative orders made by governments, non-government organisations or private insurers. To be included a study had to include an objective measure of at least one of the following outcomes: drug use

  3. Genetics Home Reference: Gorlin syndrome

    Science.gov (United States)

    ... for This Condition basal cell nevus syndrome BCNS Gorlin-Goltz syndrome NBCCS nevoid basal cell carcinoma syndrome Related Information ... named? Additional Information & Resources MedlinePlus (2 links) Encyclopedia: Basal Cell Nevus Syndrome Health Topic: Skin Cancer Genetic and Rare Diseases ...

  4. Learning sequences on the subject of energy

    International Nuclear Information System (INIS)

    1986-01-01

    The ten learning sequences follow on from one another. Each picks out a particular aspect from the energy field. The subject notebooks are self-contained and can therefore be used independently. Apart from actual data and energy-related information, the information for the teacher contains: - proposals for teaching - suggestions for further activities - sample solutions for the pupil's sheets - references to the literature and media. The worksheets for the pupils are different; it should be possible to use the learning sequences in all classes of secondary school stage 1. The multicoloured foils for projectors should motivate, on the one hand, and on the other hand should help to check the results of learning. (orig./HP) [de

  5. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    Science.gov (United States)

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  6. A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio.

    Science.gov (United States)

    Leung, Henry C M; Yiu, S M; Yang, Bin; Peng, Yu; Wang, Yi; Liu, Zhihua; Chen, Jingchi; Qin, Junjie; Li, Ruiqiang; Chin, Francis Y L

    2011-06-01

    With the rapid development of next-generation sequencing techniques, metagenomics, also known as environmental genomics, has emerged as an exciting research area that enables us to analyze the microbial environment in which we live. An important step for metagenomic data analysis is the identification and taxonomic characterization of DNA fragments (reads or contigs) resulting from sequencing a sample of mixed species. This step is referred to as 'binning'. Binning algorithms that are based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms or phylogenetic markers. Due to the limited availability of reference genomes and the bias and low availability of markers, these algorithms may not be applicable in all cases. Unsupervised binning algorithms which can handle fragments from unknown species provide an alternative approach. However, existing unsupervised binning algorithms only work on datasets either with balanced species abundance ratios or rather different abundance ratios, but not both. In this article, we present MetaCluster 3.0, an integrated binning method based on the unsupervised top-down separation and bottom-up merging strategy, which can bin metagenomic fragments of species with very balanced abundance ratios (say 1:1) to very different abundance ratios (e.g. 1:24) with consistently higher accuracy than existing methods. MetaCluster 3.0 can be downloaded at http://i.cs.hku.hk/~alse/MetaCluster/.

  7. Development of Reference Source Terms for EU-APR1400

    Energy Technology Data Exchange (ETDEWEB)

    Kim, ByungIl; Lee, Chonghui; Lee, Dongsu; Ko, Heejin; Kang, Sangho [KEPCO Engineering and Construction Co. Inc., Yongin (Korea, Republic of)

    2014-05-15

    These source terms are developed for the typical U.S. NPP and do not reflect the design characteristics of EU-APR1400 (1,400 MWe PWR), which will be applied for the EUR certification in European countries. The process of developing the RST for EU-APR1400 is to undergo a process similar to the one NUREG-1465 went through when it came out with its proposed source terms. The purpose of this study is to develop the EU-APR1400 design-specific RST compliant with the EUR. The Large LOCA is the reference sequence used in the NUREG-1465 evaluation, whereas the EU-APR1400 risk-significant sequences are dominated by small LOCA and non-LOCA sequences. Moreover, considering that the EU-APR1400 has many design features to mitigate the consequences of severe accident phenomena, it is not surprising that both release fractions and durations are distinctly different from those of NUREG-1465. This RST will be continuously updated to reflect the design features of EU-APR1400 and will then be used as the reference for design purposes such as criteria satisfaction of radioactivity releases, equipment survivability, control room habitability for severe accident, and so on.

  8. Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads.

    Science.gov (United States)

    Huson, Daniel H; Tappu, Rewati; Bazinet, Adam L; Xie, Chao; Cummings, Michael P; Nieselt, Kay; Williams, Rohan

    2017-01-25

    Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However, for some questions, only specific genes of interest need to be assembled. This is then a gene-centric assembly where the goal is to assemble reads into contigs for a family of orthologous genes. We present a new method for performing gene-centric assembly, called protein-alignment-guided assembly, and provide an implementation in our metagenome analysis tool MEGAN. Genes are assembled on the fly, based on the alignment of all reads against a protein reference database such as NCBI-nr. Specifically, the user selects a gene family based on a classification such as KEGG and all reads binned to that gene family are assembled. Using published synthetic community metagenome sequencing reads and a set of 41 gene families, we show that the performance of this approach compares favorably with that of full-featured assemblers and that of a recently published HMM-based gene-centric assembler, both in terms of the number of reference genes detected and of the percentage of reference sequence covered. Protein-alignment-guided assembly of orthologous gene families complements whole-metagenome assembly in a new and very useful way.

  9. Comparative performance of double-digest RAD sequencing across divergent arachnid lineages.

    Science.gov (United States)

    Burns, Mercedes; Starrett, James; Derkarabetian, Shahan; Richart, Casey H; Cabrero, Allan; Hedin, Marshal

    2017-05-01

    Next-generation sequencing technologies now allow researchers of non-model systems to perform genome-based studies without the requirement of a (often unavailable) closely related genomic reference. We evaluated the role of restriction endonuclease (RE) selection in double-digest restriction-site-associated DNA sequencing (ddRADseq) by generating reduced representation genome-wide data using four different RE combinations. Our expectation was that RE selections targeting longer, more complex restriction sites would recover fewer loci than RE with shorter, less complex sites. We sequenced a diverse sample of non-model arachnids, including five congeneric pairs of harvestmen (Opiliones) and four pairs of spiders (Araneae). Sample pairs consisted of either conspecifics or closely related congeneric taxa, and in total 26 sample pair analyses were tested. Sequence demultiplexing, read clustering and variant calling were performed in the pyRAD program. The 6-base pair cutter EcoRI combined with methylated site-specific 4-base pair cutter MspI produced, on average, the greatest numbers of intra-individual loci and shared loci per sample pair. As expected, the number of shared loci recovered for a sample pair covaried with the degree of genetic divergence, estimated with cytochrome oxidase I sequences, although this relationship was non-linear. Our comparative results will prove useful in guiding protocol selection for ddRADseq experiments on many arachnid taxa where reference genomes, even from closely related species, are unavailable. © 2016 John Wiley & Sons Ltd.
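
    The double-digest step can be simulated in silico to see which fragments a given enzyme pair would yield. The sketch below is an illustration under stated assumptions, not part of the pyRAD workflow used in the study: the function names are hypothetical, the recognition sites (EcoRI: G^AATTC, MspI: C^CGG) are the standard ones, only one strand is scanned, and size selection is reduced to an optional length window.

```python
import re

# Recognition site and cut offset (bases from the start of the site).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MspI": ("CCGG", 1)}

def cut_positions(seq, enzyme):
    """Positions where `enzyme` cuts seq (overlapping sites included)."""
    site, offset = ENZYMES[enzyme]
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def double_digest_fragments(seq, enzyme_a, enzyme_b, min_len=0, max_len=None):
    """Fragments flanked by one cut from each enzyme, within a size window."""
    cuts = sorted([(p, enzyme_a) for p in cut_positions(seq, enzyme_a)] +
                  [(p, enzyme_b) for p in cut_positions(seq, enzyme_b)])
    fragments = []
    for (start, e1), (end, e2) in zip(cuts, cuts[1:]):
        length = end - start
        in_window = length >= min_len and (max_len is None or length <= max_len)
        if e1 != e2 and in_window:
            fragments.append(seq[start:end])
    return fragments

seq = "AAGAATTCTTTTCCGGAAAACCGGTTGAATTCAA"
fragments = double_digest_fragments(seq, "EcoRI", "MspI")
print(fragments)
```

    In this toy sequence, the fragment between the two MspI cuts is discarded because ddRADseq libraries retain only fragments with one end from each enzyme.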

  10. Superior Cross-Species Reference Genes: A Blueberry Case Study

    Science.gov (United States)

    Die, Jose V.; Rowland, Lisa J.

    2013-01-01

    The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well. PMID:24058469
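
    Normalizing a target gene against several reference genes is commonly done by averaging the reference Ct values before computing a relative quantity. The sketch below shows that arithmetic; it assumes perfect doubling per PCR cycle, and the function name and Ct values are made up for illustration rather than taken from the study.

```python
from statistics import mean

def relative_expression(ct_target, ct_references):
    """2^-dCt, where dCt = Ct_target - mean(Ct of the reference genes).

    Averaging Ct values corresponds to taking the geometric mean of the
    reference quantities, assuming the amount doubles every cycle.
    """
    delta_ct = ct_target - mean(ct_references)
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values: one target gene and three reference genes.
value = relative_expression(24.0, [20.0, 21.0, 22.0])
print(value)
```

    Using two or three references, as the abstract recommends, dampens the effect of any single reference gene whose expression drifts between samples.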

  11. Superior cross-species reference genes: a blueberry case study.

    Directory of Open Access Journals (Sweden)

    Jose V Die

    The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well.

  12. Plans and progress for building a Great Lakes fauna DNA barcode reference library

    Science.gov (United States)

    DNA reference libraries provide researchers with an important tool for assessing regional biodiversity by allowing unknown genetic sequences to be assigned identities, while also providing a means for taxonomists to validate identifications. Expanding the representation of Great...

  13. Doubly excited states of the LiI isoelectronic sequence

    International Nuclear Information System (INIS)

    To, K.X.; Knystautas, E.; Drouin, R.; Berry, H.G.

    1978-01-01

    The term level diagrams of the doubly excited quartet systems of the LiI isoelectronic sequence up to Ne VIII are presented. The identifications are based on recent theoretical and experimental work which suggests a revision, particularly of the 2s3p $^4P^o$ terms. 11 references

  14. Construction of a virtual Mycobacterium tuberculosis consensus genome and its application to data from a next generation sequencer.

    Science.gov (United States)

    Okumura, Kayo; Kato, Masako; Kirikae, Teruo; Kayano, Mitsunori; Miyoshi-Akiyama, Tohru

    2015-03-20

    Although Mycobacterium tuberculosis isolates consist of several different lineages, epidemiological analyses are usually performed relative to a single reference genome, M. tuberculosis H37Rv, which might introduce bias. These analyses are essentially based on genome sequence information of M. tuberculosis and could, in theory, be performed in silico using whole genome sequence (WGS) data available in databases and obtained by next generation sequencers (NGSs). As an approach to establish higher resolution methods for such analyses, whole genome sequences of M. tuberculosis complex (MTBC) strains available in databases were aligned to construct a virtual reference genome sequence called the consensus sequence (CS), and its feasibility in in silico epidemiological analyses was evaluated. The CS was successfully constructed and used to perform phylogenetic analysis, evaluation of read mapping efficacy (which is crucial for detecting single nucleotide polymorphisms, SNPs), and virtual versions of various MTBC typing methods, including spoligotyping, VNTR, long sequence polymorphism and Beijing typing. SNPs detected based on the CS, in comparison with H37Rv, were used in concatemer-based phylogenetic analysis to determine their reliability relative to a phylogenetic tree based on whole genome alignment as the gold standard. Statistical comparison of phylogenetic trees based on the CS with those based on H37Rv indicated that the former always gave better results than the latter. SNP detection and concatenation with the CS was advantageous because the frequency of crucial SNPs distinguishing among strain lineages was higher than with H37Rv. The number of SNPs detected was lower with the consensus than with the H37Rv sequence, resulting in a significant reduction in computational time. The performance of each virtual typing method was satisfactory and agreed with published results where available. These results indicated that virtual CS
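
    The idea of replacing a single-strain reference with a consensus can be illustrated with a column-wise majority vote over aligned sequences. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names and toy sequences are hypothetical, the input is assumed to be already aligned and gap-free, and ties are broken by first occurrence.

```python
from collections import Counter

def consensus(aligned):
    """Column-wise majority base of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must be non-empty and equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))

def snps_against(reference, sample):
    """List (position, reference base, sample base) for every mismatch."""
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]

strains = ["ACGTAC", "ACGTTC", "ACCTAC"]  # toy aligned genomes
cs = consensus(strains)
sample = "ACGTTC"
single_reference = "ACCTAC"  # one strain standing in for a reference genome

print(cs)
print(snps_against(cs, sample))
print(snps_against(single_reference, sample))
```

    In this toy case the sample differs from the consensus at one position but from the single-strain reference at two, because the second difference is private to the reference strain. This mirrors the abstract's observation that fewer SNPs were detected against the consensus.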

  15. Sequence of tenses: playing with possibilities

    Directory of Open Access Journals (Sweden)

    Bohdan Ulašin

    2012-12-01

    The aim of this article is to analyse the sequence of tenses in Spanish, a phenomenon typical of the Romance languages. Our purpose is to systematize and formulate the rules of use by means of examples with graphical representations. We focus on such cases of use of the sequence of tenses where it is possible to apply a “double-access interpretation”, i.e. one which allows the possibility of choosing between reference points: the actual time (the moment of speech) or the main clause. This double interpretation is to be found in all three types of time relationships: simultaneousness, priority and posteriority. Some pairs differ semantically; others are used according to the preferences of the speaker with no change of meaning. Most examples are subordinate noun clauses; nonetheless, we included examples of other types of clauses as well (e.g. relative, reason clauses). We also analyze the problem of the sequence of tenses where the main verb is in the conditional tense.

  16. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  17. Genetics Home Reference: Tourette syndrome

    Science.gov (United States)

    ... and Vocal Tic Disorder Gilles de la Tourette Syndrome Gilles de la Tourette's syndrome GTS TD Tourette Disorder Tourette's Disease TS Related ... Additional Information & Resources MedlinePlus (2 links) Encyclopedia: Gilles de la Tourette syndrome Health Topic: Tourette Syndrome Genetic and Rare Diseases ...

  18. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested by the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting the emergency room (ER) or Walk-in clinic (WALKIN), and 333 patients referred through private physicians' offices, were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when patients living in Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  19. Performance Improvement of a Prefiltered Synchronous-Reference-Frame PLL By Using a PID-Type Loop Filter

    DEFF Research Database (Denmark)

    Golestan, Saeed; Monfared, Mohammad; Frejeido, Francisco

    2014-01-01

    Control parameter design of a three-phase synchronous reference frame phase locked loop (SRF-PLL) with a pre-filtering stage (acting as the sequence separator) is not a trivial task. The conventional way to deal with this problem is to neglect the interaction between the SRF-PLL and pre-filterin... ...-integral-derivative controller as the loop filter (instead of the commonly adopted proportional-integral controller) and arranging a pole-zero cancellation. The suggested method is simple and efficient, and is applicable to the joint operation of different sequence separation techniques and the SRF-PLL. The effectiveness...

  20. Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding

    OpenAIRE

    He, Jiangfeng; Zhao, Xiaoqing; Laroche, André; Lu, Zhen-Xiang; Liu, HongKui; Li, Ziqin

    2014-01-01

    Marker-assisted selection (MAS) refers to the use of molecular markers to assist phenotypic selections in crop improvement. Several types of molecular markers, such as single nucleotide polymorphism (SNP), have been identified and effectively used in plant breeding. The application of next-generation sequencing (NGS) technologies has led to remarkable advances in whole genome sequencing, which provides ultra-throughput sequences to revolutionize plant genotyping and breeding. To further broad...

  1. Draft genome sequence of the coccolithovirus Emiliania huxleyi virus 202.

    Science.gov (United States)

    Nissimov, Jozef I; Worthy, Charlotte A; Rooks, Paul; Napier, Johnathan A; Kimmance, Susan A; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J

    2012-02-01

    Emiliania huxleyi virus 202 (EhV-202) is a member of the Coccolithoviridae, a group of viruses that infect the marine coccolithophorid Emiliania huxleyi. EhV-202 has a 160- to 180-nm-diameter icosahedral structure and a genome of approximately 407 kbp, consisting of 485 coding sequences (CDSs). Here we describe the genomic features of EhV-202, together with a draft genome sequence and its annotation, highlighting the homology and heterogeneity of this genome in comparison with the EhV-86 reference genome.

  2. Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

    Science.gov (United States)

    PacBio long-read sequencing technology is increasingly popular in genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled based on long reads from this technology. To finely annotate this genome assembly, transcriptomes of nine tissues fr...

  3. BLEACHING EUCALYPTUS PULPS WITH SHORT SEQUENCES

    Directory of Open Access Journals (Sweden)

    Flaviana Reis Milagres

    2011-03-01

    Eucalyptus spp kraft pulp, due to its high content of hexenuronic acids, is quite easy to bleach. Therefore, investigations have been made attempting to decrease the number of stages in the bleaching process in order to minimize capital costs. This study focused on the evaluation of short ECF (Elemental Chlorine Free) and TCF (Totally Chlorine Free) sequences for bleaching oxygen delignified Eucalyptus spp kraft pulp to 90% ISO brightness: PMoDP (Molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide), PMoD/P (Molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide, without washing), PMoD(PO) (Molybdenum catalyzed acid peroxide, chlorine dioxide and pressurized peroxide), D(EPO)DP (chlorine dioxide, oxidative extraction with oxygen and peroxide, chlorine dioxide and hydrogen peroxide), PMoQ(PO) (Molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide), and XPMoQ(PO) (Enzyme, molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide). Uncommon pulp treatments, such as molybdenum catalyzed acid peroxide (PMo) and xylanase (X) bleaching stages, were used. Among the ECF alternatives, the two-stage PMoD/P sequence proved highly cost-effective without affecting pulp quality in relation to the traditional D(EPO)DP sequence and produced better quality effluent in relation to the reference. However, a four stage sequence, XPMoQ(PO), was required to achieve full brightness using the TCF technology. This sequence was highly cost-effective although it only produced pulp of acceptable quality.

  4. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards...

  5. Sequencing and analysis of the Mediterranean amphioxus (Branchiostoma lanceolatum) transcriptome.

    Directory of Open Access Journals (Sweden)

    Silvan Oulion

    Full Text Available BACKGROUND: The basally divergent phylogenetic position of amphioxus (Cephalochordata), as well as its conserved morphology, development and genetics, make it the best proxy for the chordate ancestor. Particularly, studies using the amphioxus model help our understanding of vertebrate evolution and development. Thus, interest in the amphioxus model led to the characterization of both the transcriptome and complete genome sequence of the American species, Branchiostoma floridae. However, recent technical improvements allowing induction of spawning in the laboratory during the breeding season on a daily basis with the Mediterranean species Branchiostoma lanceolatum have encouraged European Evo-Devo researchers to adopt this species as a model even though no genomic or transcriptomic data have been available. To fill this need we used the pyrosequencing method to characterize the B. lanceolatum transcriptome and then compared our results with the published transcriptome of B. floridae. RESULTS: Starting with total RNA from nine different developmental stages of B. lanceolatum, a normalized cDNA library was constructed and sequenced on Roche GS FLX (Titanium mode). Around 1.4 million reads were produced and assembled into 70,530 contigs (average length of 490 bp). Overall 37% of the assembled sequences were annotated by BlastX and their Gene Ontology terms were determined. These results were then compared to genomic and transcriptomic data of B. floridae to assess similarities and specificities of each species. CONCLUSION: We obtained a high-quality amphioxus (B. lanceolatum) reference transcriptome using a high throughput sequencing approach. We found that 83% of the predicted genes in the B. floridae complete genome sequence are also found in the B. lanceolatum transcriptome, while only 41% were found in the B. floridae transcriptome obtained with traditional Sanger based sequencing. Therefore, given the high degree of sequence conservation

  6. QTL analysis by sequencing of Water Use Efficiency (WUE) in potato

    DEFF Research Database (Denmark)

    Kaminski, Kacper Piotr; Sønderkær, Mads; Sørensen, Kirsten Kørup

    2013-01-01

    The traditional approach to potato breeding, the classical “mate and phenotype” approach, is relatively costly, and because phenotyping and growth capacity are limited, it is slowly being replaced by Marker Assisted Selection (MAS) breeding schemes. MAS is based on the presence of DNA polymorphic.......sparsipilum), phenotyped for water use efficiency. This population has also previously been phenotyped for the total glycoalkaloid (TGA) content....... and time consuming process. Here, a novel method for Quantitative Trait Locus (QTL) analysis has been developed, that allows for development of specific markers by use of genomic sequence reads and the recently published reference genome sequence for potato. Prior to sequencing the mapping population...

  7. Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

    Science.gov (United States)

    Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

    2008-01-01

    The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.
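    The identity percentages quoted in this record depend on how identity is counted. The abstract does not state the convention the authors used, so the following is only a minimal sketch of one common definition: matching positions divided by the number of aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use '-' for gaps. Gap handling is an assumption here;
    other conventions (e.g. counting gaps as mismatches) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 8 ungapped columns, 7 of them identical.
    print(percent_identity("ACGU-ACGU", "ACGUAACGA"))
```

    Because the denominator changes with the gap convention, identity values are only comparable across studies when the same convention and alignment method are used.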

  8. Swallow Event Sequencing: Comparing Healthy Older and Younger Adults.

    Science.gov (United States)

    Herzberg, Erica G; Lazarus, Cathy L; Steele, Catriona M; Molfenter, Sonja M

    2018-04-23

    Previous research has established that a great deal of variation exists in the temporal sequence of swallowing events for healthy adults. Yet, the impact of aging on swallow event sequence is not well understood. Kendall et al. (Dysphagia 18(2):85-91, 2003) suggested there are 4 obligatory paired-event sequences in swallowing. We directly compared adherence to these sequences, as well as event latencies, and quantified the percentage of unique sequences in two samples of healthy adults: young ( 65). The 8 swallowing events that contribute to the sequences were reliably identified from videofluoroscopy in a sample of 23 healthy seniors (10 male, mean age 74.7) and 20 healthy young adults (10 male, mean age 31.5) with no evidence of penetration-aspiration or post-swallow residue. Chi-square analyses compared the proportions of obligatory pairs and unique sequences by age group. Compared to the older subjects, younger subjects had significantly lower adherence to two obligatory sequences: Upper Esophageal Sphincter (UES) opening occurs before (or simultaneous with) the bolus arriving at the UES and UES maximum distention occurs before maximum pharyngeal constriction. The associated latencies were significantly different between age groups as well. Further, significantly fewer unique swallow sequences were observed in the older group (61%) compared with the young (82%) (χ² = 31.8; p < 0.001). Our findings suggest that paired swallow event sequences may not be robust across the age continuum and that variation in swallow sequences appears to decrease with aging. These findings provide normative references for comparisons to older individuals with dysphagia.

  9. Post-main-sequence planetary system evolution

    Science.gov (United States)

    Veras, Dimitri

    2016-01-01

    The fates of planetary systems provide unassailable insights into their formation and represent rich cross-disciplinary dynamical laboratories. Mounting observations of post-main-sequence planetary systems necessitate a complementary level of theoretical scrutiny. Here, I review the diverse dynamical processes which affect planets, asteroids, comets and pebbles as their parent stars evolve into giant branch, white dwarf and neutron stars. This reference provides a foundation for the interpretation and modelling of currently known systems and upcoming discoveries. PMID:26998326

  10. Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus.

    Science.gov (United States)

    Coughlan, Simone; Taylor, Ali Shirley; Feane, Eoghan; Sanders, Mandy; Schonian, Gabriele; Cotton, James A; Downing, Tim

    2018-04-01

    The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. (Viannia) braziliensis and L. (V.) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. (V.) naiffi and L. (V.) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia: aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia, there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni, L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3' end of chromosome 34. This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance.

  11. Automation tools for accelerator control a network based sequencer

    International Nuclear Information System (INIS)

    Clout, P.; Geib, M.; Westervelt, R.

    1991-01-01

    In conjunction with a major client, Vista Control Systems has developed a sequencer for control systems which works in conjunction with its realtime, distributed Vsystem database. Vsystem is a network-based data acquisition, monitoring and control system which has been applied successfully to both accelerator projects and projects outside this realm of research. The network-based sequencer allows a user to simply define a thread of execution in any supported computer on the network. The script defining a sequence has a simple syntax designed for non-programmers, with facilities for selectively abbreviating the channel names for easy reference. The semantics of the script contains most of the familiar capabilities of conventional programming languages, including standard stream I/O and the ability to start other processes with parameters passed. The script is compiled to threaded code for execution efficiency. The implementation is described in some detail and examples are given of applications for which the sequencer has been used.

  12. Processing sequence annotation data using the Lua programming language.

    Science.gov (United States)

    Ueno, Yutaka; Arita, Masanori; Kumagai, Toshitaka; Asai, Kiyoshi

    2003-01-01

    The data processing language in a graphical software tool that manages sequence annotation data from genome databases should provide flexible functions for the tasks in molecular biology research. Among currently available languages we adopted the Lua programming language. It fulfills our requirements to perform computational tasks for sequence map layouts, i.e. the handling of data containers, symbolic reference to data, and a simple programming syntax. Upon importing a foreign file, the original data are first decomposed in the Lua language while maintaining the original data schema. The converted data are parsed by the Lua interpreter and the contents are stored in our data warehouse. Then, portions of annotations are selected and arranged into our catalog format to be depicted on the sequence map. Our sequence visualization program was successfully implemented, embedding the Lua language for processing of annotation data and layout script. The program is available at http://staff.aist.go.jp/yutaka.ueno/guppy/.

  13. Microbial Dark Matter: Unusual intervening sequences in 16S rRNA genes of candidate phyla from the deep subsurface

    Energy Technology Data Exchange (ETDEWEB)

    Jarett, Jessica; Stepanauskas, Ramunas; Kieft, Thomas; Onstott, Tullis; Woyke, Tanja

    2014-03-17

    The Microbial Dark Matter project has sequenced genomes from over 200 single cells from candidate phyla, greatly expanding our knowledge of the ecology, inferred metabolism, and evolution of these widely distributed, yet poorly understood lineages. The second phase of this project aims to sequence an additional 800 single cells from known as well as potentially novel candidate phyla derived from a variety of environments. In order to identify whole genome amplified single cells, screening based on phylogenetic placement of 16S rRNA gene sequences is being conducted. Briefly, derived 16S rRNA gene sequences are aligned to a custom version of the Greengenes reference database and added to a reference tree in ARB using parsimony. In multiple samples from deep subsurface habitats but not from other habitats, a large number of sequences proved difficult to align and therefore to place in the tree. Based on comparisons to reference sequences and structural alignments using SSU-ALIGN, many of these “difficult” sequences appear to originate from candidate phyla, and contain intervening sequences (IVSs) within the 16S rRNA genes. These IVSs are short (39 - 79 nt) and do not appear to be self-splicing or to contain open reading frames. IVSs were found in the loop regions of stem-loop structures in several different taxonomic groups. Phylogenetic placement of sequences is strongly affected by IVSs; two out of three groups investigated were classified as different phyla after their removal. Based on data from samples screened in this project, IVSs appear to be more common in microbes occurring in deep subsurface habitats, although the reasons for this remain elusive.

  14. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: (1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and (2) the high

  15. Reference Materials for Calibration of Analytical Biases in Quantification of DNA Methylation.

    Science.gov (United States)

    Yu, Hannah; Hahn, Yoonsoo; Yang, Inchul

    2015-01-01

    Most contemporary methods for the quantification of DNA methylation employ bisulfite conversion and PCR amplification. However, many reports have indicated that bisulfite-mediated PCR methodologies can result in inaccurate measurements of DNA methylation owing to amplification biases. To calibrate analytical biases in quantification of gene methylation, especially those that arise during PCR, we utilized reference materials that represent exact bisulfite-converted sequences with 0% and 100% methylation status of specific genes. After determining relative quantities using qPCR, pairs of plasmids were gravimetrically mixed to generate working standards with predefined DNA methylation levels at 10% intervals in terms of mole fractions. The working standards were used as controls to optimize the experimental conditions and also as calibration standards in melting-based and sequencing-based analyses of DNA methylation. Use of the reference materials enabled precise characterization and proper calibration of various biases during PCR and subsequent methylation measurement processes, resulting in accurate measurements.
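    The gravimetric mixing step described in this record can be illustrated with a little arithmetic. The sketch below is not the authors' protocol: it assumes each plasmid stock (0% and 100% methylated) has a known concentration expressed as amount of plasmid per gram of solution, for example as determined by qPCR, and computes how much of each stock to weigh out for a target mole fraction. The names `working_standard_levels` and `mixing_masses`, the units, and the example numbers are all hypothetical.

```python
def working_standard_levels(step_percent: int = 10) -> list[float]:
    """Target methylation levels as mole fractions, e.g. 0.0, 0.1, ..., 1.0."""
    return [p / 100 for p in range(0, 101, step_percent)]


def mixing_masses(
    target_fraction: float,
    total_amount: float,
    conc_methylated: float,
    conc_unmethylated: float,
) -> tuple[float, float]:
    """Masses of the two stock solutions needed for one working standard.

    target_fraction   -- desired mole fraction of the 100%-methylated plasmid
    total_amount      -- total plasmid amount wanted (e.g. fmol; assumed unit)
    conc_methylated   -- stock concentration, amount per gram of solution
    conc_unmethylated -- stock concentration, amount per gram of solution
    Returns (grams of methylated stock, grams of unmethylated stock).
    """
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must lie in [0, 1]")
    if conc_methylated <= 0 or conc_unmethylated <= 0:
        raise ValueError("stock concentrations must be positive")

    amount_methylated = target_fraction * total_amount
    amount_unmethylated = (1.0 - target_fraction) * total_amount
    return (
        amount_methylated / conc_methylated,
        amount_unmethylated / conc_unmethylated,
    )


if __name__ == "__main__":
    # Hypothetical stocks: 10 and 20 fmol per gram; 100 fmol total per standard.
    for level in working_standard_levels():
        m_meth, m_unmeth = mixing_masses(level, 100.0, 10.0, 20.0)
        print(f"{level:.0%}: {m_meth:.3f} g methylated, {m_unmeth:.3f} g unmethylated")
```

    Measured methylation values for these standards can then be compared with their nominal levels to characterize the bias that the abstract attributes to PCR.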

  16. Genetics Home Reference: chylomicron retention disease

    Science.gov (United States)

    ... reflexes (hyporeflexia) and a decreased ability to feel vibrations. Related Information What does it mean if a ... Encyclopedia: Malabsorption General Information from MedlinePlus (5 links) Diagnostic Tests Drug Therapy Genetic Counseling Palliative Care Surgery ...

  17. Technical Report: Algorithm and Implementation for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

    Energy Technology Data Exchange (ETDEWEB)

    McLoughlin, Kevin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-01-11

    This report describes the design and implementation of an algorithm for estimating relative microbial abundances, together with confidence limits, using data from metagenomic DNA sequencing. For the background behind this project and a detailed discussion of our modeling approach for metagenomic data, we refer the reader to our earlier technical report, dated March 4, 2014. Briefly, we described a fully Bayesian generative model for paired-end sequence read data, incorporating the effects of the relative abundances, the distribution of sequence fragment lengths, fragment position bias, sequencing errors and variations between the sampled genomes and the nearest reference genomes. A distinctive feature of our modeling approach is the use of a Chinese restaurant process (CRP) to describe the selection of genomes to be sampled, and thus the relative abundances. The CRP component is desirable for fitting abundances to reads that may map ambiguously to multiple targets, because it naturally leads to sparse solutions that select the best representative from each set of nearly equivalent genomes.

  18. Lithium evolution in metal-poor stars: from Pre-Main Sequence to the Spite plateau

    OpenAIRE

    Fu, Xiaoting; Bressan, Alessandro; Molaro, Paolo; Marigo, Paola

    2015-01-01

    Lithium abundance derived in metal-poor main sequence stars is about three times lower than the value of primordial Li predicted by the standard Big Bang nucleosynthesis when the baryon density is taken from the CMB or the deuterium measurements. This disagreement is generally referred to as the lithium problem. We here reconsider the stellar Li evolution from the pre-main sequence to the end of the main sequence phase by introducing the effects of convective overshooting and residual mass accre...

  19. Colleagues Pay Tribute to Dr. Donald A.B. Lindberg, Retiring After Three Decades of NLM Leadership | NIH MedlinePlus ...

    Science.gov (United States)

    ... informatics and helped establish the National Center for Biotechnology Information (NCBI), a course-changing development that could ... expand the publication and distribution of NIH MedlinePlus magazine, thousands and thousands more people will gain valuable, ...

  20. A Probabilistic Approach for Improved Sequence Mapping in Metatranscriptomic Studies

    Science.gov (United States)

    Mapping millions of short DNA sequences to a reference genome is a necessary step in many experiments designed to investigate the expression of genes involved in disease resistance. This is a difficult task in which several challenges often arise, resulting in a suboptimal mapping. This mapping process ...