Abstract Background Coregulator proteins are "master regulators", directing transcriptional and posttranscriptional regulation of many target genes, and are critical in many normal physiological processes, but also in hormone-driven diseases, such as breast cancer. Little is known about how genetic changes in these genes impact disease development and progression. Thus, we set out to identify novel single nucleotide polymorphisms (SNPs) within SRC-1 (NCoA1), SRC-3 (NCoA3, AIB1), NCoR (NCoR1), and SMRT (NCoR2), and test the most promising SNPs for associations with breast cancer risk. Methods The identification of novel SNPs was accomplished by sequencing the coding regions of these genes in 96 apparently normal individuals (48 Caucasian Americans, 48 African Americans). To assess their association with breast cancer risk, five SNPs were genotyped in 1218 familial BRCA1/2-mutation negative breast cancer cases and 1509 controls (rs1804645, rs6094752, rs2230782, rs2076546, rs2229840). Results Through our resequencing effort, we identified 74 novel SNPs (30 in NCoR, 32 in SMRT, 10 in SRC-3, and 2 in SRC-1). Of these, 8 were found with minor allele frequency (MAF) >5%, illustrating the large amount of genetic diversity yet to be discovered. The previously shown protective effect of rs2230782 in SRC-3 was strengthened (OR = 0.45 [0.21-0.98], p = 0.04). No significant associations were found with the other SNPs genotyped. Conclusions These data illustrate the importance of coregulators, especially SRC-3, in breast cancer development and suggest that more focused studies, including functional analyses, should be conducted.
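The rs2230782 association above is reported as an odds ratio with a 95% confidence interval. As a rough illustration of how such a figure can be derived from a 2x2 carrier table, the sketch below computes an unadjusted odds ratio with a Wald interval. The function name and the counts are invented for illustration and are not the study's data; the study itself may have used a different model (for example, adjusted logistic regression).

```python
import math


def odds_ratio_wald_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 ~ 95%)."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                          + 1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the abstract.
or_value, ci_low, ci_high = odds_ratio_wald_ci(10, 90, 20, 80)
print(f"OR = {or_value:.2f} [{ci_low:.2f}-{ci_high:.2f}]")
```

An interval whose upper bound stays below 1, as in the reported 0.21-0.98, is what supports reading the minor allele as protective.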
Cregan Perry B
indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5–10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
Abstract Background Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.
Nathan A Baird
Single nucleotide polymorphism (SNP) discovery and genotyping are essential to genetic mapping. There remains a need for a simple, inexpensive platform that allows high-density SNP discovery and genotyping in large populations. Here we describe the sequencing of restriction-site associated DNA (RAD) tags, which identified more than 13,000 SNPs, and mapped three traits in two model organisms, using less than half the capacity of one Illumina sequencing run. We demonstrated that different marker densities can be attained by choice of restriction enzyme. Furthermore, we developed a barcoding system for sample multiplexing and fine mapped the genetic basis of lateral plate armor loss in threespine stickleback by identifying recombinant breakpoints in F2 individuals. Barcoding also facilitated mapping of a second trait, a reduction of pelvic structure, by in silico re-sorting of individuals. To further demonstrate the ease of the RAD sequencing approach we identified polymorphic markers and mapped an induced mutation in Neurospora crassa. Sequencing of RAD markers is an integrated platform for SNP discovery and genotyping. This approach should be widely applicable to genetic mapping in a variety of organisms.
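The barcoding system for sample multiplexing described above relies on sorting pooled reads back to their source individuals. A minimal sketch of that idea follows, assuming exact-match barcodes at the start of each read; the barcode sequences, sample names, and the `demultiplex` helper are all hypothetical and are not the authors' software.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode and strip the barcode.

    Reads that match no known barcode are collected under "unassigned".
    Exact matching only; no tolerance for sequencing errors.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcode_to_sample.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:  # no barcode matched this read
            by_sample["unassigned"].append(read)
    return dict(by_sample)


# Hypothetical barcodes and reads for illustration only.
barcodes = {"ACGT": "fish_1", "TGCA": "fish_2"}
pooled = ["ACGTGGAATTC", "TGCACCTTAAG", "NNNNGGGG"]
print(demultiplex(pooled, barcodes))
```

Because the assignment is purely computational, the same pooled run can be re-sorted by a different grouping of individuals, which is the kind of in silico re-sorting the abstract mentions for the second trait.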
Qin, Sisi; Ingle, James N; Liu, Mohan; Yu, Jia; Wickerham, D Lawrence; Kubo, Michiaki; Weinshilboum, Richard M; Wang, Liewei
We previously performed a case-control genome-wide association study in women treated with selective estrogen receptor modulators (SERMs) for breast cancer prevention and identified single nucleotide polymorphisms (SNPs) in ZNF423 as potential biomarkers for response to SERM therapy. The ZNF423 rs9940645 SNP, which is approximately 200 bp away from the estrogen response elements, resulted in the SNP, estrogen, and SERM-dependent regulation of ZNF423 expression and, "downstream", that of BRCA1. Electrophoretic mobility shift assay-mass spectrometry was performed to identify proteins binding to the ZNF423 SNP and coordinating with estrogen receptor alpha (ERα). Clustered, regularly interspaced short palindromic repeats (CRISPR)/Cas9 genome editing was applied to generate ZR75-1 breast cancer cells with different ZNF423 SNP genotypes. Both cultured cells and mouse xenograft models with different ZNF423 SNP genotypes were used to study the cellular responses to SERMs and poly(ADP-ribose) polymerase (PARP) inhibitors. We identified calmodulin-like protein 3 (CALML3) as a key sensor of this SNP and a coregulator of ERα, which contributes to differential gene transcription regulation in an estrogen and SERM-dependent fashion. Furthermore, using CRISPR/Cas9-engineered ZR75-1 breast cancer cells with different ZNF423 SNP genotypes, striking differences in cellular responses to SERMs and PARP inhibitors, alone or in combination, were observed not only in cells but also in a mouse xenograft model. Our results have demonstrated the mechanism by which the ZNF423 rs9940645 SNP might regulate gene expression and drug response as well as its potential role in achieving more highly individualized breast cancer therapy.
You Frank M
Abstract Background A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). Results The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other types of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar ‘Chandler’ were mapped to 48,661 ‘Chandler’ bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the
Background Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). Results The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. Conclusion This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and
Chen, Nancy; Van Hout, Cristopher V; Gottipati, Srikanth; Clark, Andrew G
Restriction site-associated DNA sequencing or genotyping-by-sequencing (GBS) approaches allow for rapid and cost-effective discovery and genotyping of thousands of single-nucleotide polymorphisms (SNPs) in multiple individuals. However, rigorous quality control practices are needed to avoid high levels of error and bias with these reduced representation methods. We developed a formal statistical framework for filtering spurious loci, using Mendelian inheritance patterns in nuclear families, that accommodates variable-quality genotype calls and missing data--both rampant issues with GBS data--and for identifying sex-linked SNPs. Simulations predict excellent performance of both the Mendelian filter and the sex-linkage assignment under a variety of conditions. We further evaluate our method by applying it to real GBS data and validating a subset of high-quality SNPs. These results demonstrate that our metric of Mendelian inheritance is a powerful quality filter for GBS loci that is complementary to standard coverage and Hardy-Weinberg filters. The described method, implemented in the software MendelChecker, will improve quality control during SNP discovery in nonmodel as well as model organisms. Copyright © 2014 by the Genetics Society of America.
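To make the idea of a Mendelian inheritance filter concrete, the sketch below checks hard genotype calls in parent-offspring trios at an autosomal biallelic locus and reports the fraction of inconsistent trios. This is a deliberately simplified stand-in: the framework described above is statistical and accommodates genotype-call quality, which this sketch ignores, and the function names here are hypothetical rather than part of MendelChecker.

```python
from itertools import product


def is_mendelian(father, mother, child):
    """True if the child genotype can be formed from one allele per parent.

    Genotypes are two-character strings such as "AG" (autosomal locus).
    """
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def mendelian_error_rate(trios):
    """Fraction of complete trios that violate Mendelian inheritance.

    Trios with any missing genotype (None) are skipped; returns None if
    no complete trio is available at the locus.
    """
    complete = [trio for trio in trios if None not in trio]
    if not complete:
        return None
    errors = sum(not is_mendelian(*trio) for trio in complete)
    return errors / len(complete)


# Hypothetical calls at one locus: (father, mother, child).
locus = [("AA", "AA", "AA"), ("AA", "AA", "AG"), ("AG", None, "AA")]
print(mendelian_error_rate(locus))
```

A locus with a high error rate under such a check would be a candidate for removal, alongside the coverage and Hardy-Weinberg filters the abstract calls complementary.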
Pootakham, Wirulda; Shearman, Jeremy R; Ruang-Areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke
Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10.
Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G
Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.
A. Janiak; M.Y. Kim; S.H. Lee
There are several strategies that can be applied in SNP discovery, for example the locus-specific amplification of target genome regions (Primmer et al., 2002; Van et al., 2004) or simultaneous assembly of anonymous sequences which are the product of whole genome shotgun sequencing (Webber and Myers, 1997) or reduced representation shotgun sequencing (Altshuler et al., 2000).
Helyar, Sarah J; Limborg, Morten; Bekkevold, Dorte;
details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed...
and exotic agrestis melons from India and Africa as compared to commercial cultivars, cultigens and landraces from Eastern Europe, Western Asia and the Mediterranean basin is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes. Conclusions This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. These data provide a valuable resource for creating a catalog of allelic variants of melon genes and will aid in future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties.
Asp, Torben; Studer, Bruno; Lübberstedt, Thomas
Gene-associated single nucleotide polymorphisms (SNPs) are of major interest for genome analysis and breeding applications in the key grassland species perennial ryegrass. High-throughput 454 Titanium transcriptome sequencing was performed on two genotypes, which previously have been used to establish the VrnA F2 mapping population. The sequences were assembled and used for in-silico SNP discovery. SNPs supported by a minimum number of eight reads, within candidate genes for important agronomic traits, were selected for Illumina GoldenGate genotyping and used to map 768 expressed genes in the VrnA mapping population. Here we report on large-scale SNP discovery, and the construction of a genetic map enabling QTL fine mapping, map-based cloning, and comparative genomics in perennial ryegrass.
González-Pérez Antonio; Gayán Javier; Ruiz Agustín
Abstract A response to Toplak et al: Does replication groups scoring reduce false positive rate in SNP interaction discovery? BMC Genomics 2010, 11:58. Background The genomewide evaluation of genetic epistasis is a computationally demanding task, and a current challenge in Genetics. HFCC (Hypothesis-Free Clinical Cloning) is one of the methods that have been suggested for genomewide epistasis analysis. In order to perform an exhaustive search of epistasis, HFCC has implemented several tools ...
Abstract Background The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus, FeLV; feline coronavirus, FECV; feline immunodeficiency virus, FIV) that are homologues to human scourges (cancer, SARS, and AIDS, respectively). However, to realize this biomedical potential, a high-density single nucleotide polymorphism (SNP) map is required in order to accomplish disease and phenotype association discovery. Description To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper limit was placed on sequence coverage and a high lower limit on the quality of the discrepant bases at a potential variant site. In all, domestic cats of different breeds (female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese, and a male Ragdoll) and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cat breeds. An empirical sample of 94 discovered SNPs was tested in the sequenced cats, resulting in a SNP validation rate of 99%. Conclusions These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.
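The false-positive control described above combines two thresholds: an upper limit on sequence coverage (unusually deep sites may be collapsed repeats) and a lower limit on the quality of the discrepant bases. A minimal sketch of such a filter is given below. The `Candidate` record, its field names, and the threshold values are assumptions made for illustration; the abstract does not state the actual cut-offs.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    position: int
    coverage: int               # number of reads covering the site
    min_discrepant_quality: int  # lowest base quality among discrepant bases


def filter_candidates(candidates, max_coverage, min_quality):
    """Keep sites that are not over-covered and whose discrepant bases
    all meet the minimum quality."""
    return [c for c in candidates
            if c.coverage <= max_coverage
            and c.min_discrepant_quality >= min_quality]


# Hypothetical sites and thresholds, not values from the study.
sites = [
    Candidate(position=101, coverage=3, min_discrepant_quality=40),
    Candidate(position=202, coverage=12, min_discrepant_quality=45),  # too deep
    Candidate(position=303, coverage=4, min_discrepant_quality=15),   # low quality
]
kept = filter_candidates(sites, max_coverage=6, min_quality=30)
print([c.position for c in kept])
```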
Trebbi, Daniele; Maccaferri, Marco; de Heer, Peter; Sørensen, Anker; Giuliani, Silvia; Salvi, Silvio; Sanguineti, Maria Corinna; Massi, Andrea; van der Vossen, Edwin Andries Gerard; Tuberosa, Roberto
We describe the application of complexity reduction of polymorphic sequences (CRoPS®) technology for the discovery of SNP markers in tetraploid durum wheat (Triticum durum Desf.). A next-generation sequencing experiment was carried out on reduced representation libraries obtained from four durum cultivars. SNP validation and minor allele frequency (MAF) estimation were carried out on a panel of 12 cultivars, and the feasibility of genotyping these SNPs in segregating populations was tested using the Illumina GoldenGate (GG) technology. A total of 2,659 SNPs were identified on 1,206 consensus sequences. Among the 768 SNPs that were chosen irrespective of their genomic repetitiveness level and assayed on the Illumina BeadExpress genotyping system, 275 (35.8%) SNPs matched the expected genotypes observed in the SNP discovery phase. MAF data indicated that the overall SNP informativeness was high: a total of 196 (71.3%) SNPs had MAF >0.2, of which 76 (27.6%) showed MAF >0.4. Of these SNPs, 157 were mapped in one of two mapping populations (Meridiano × Claudio and Colosseo × Lloyd) and integrated into a common genetic map. Despite the relatively low genotyping efficiency of the GG assay, the validated CRoPS-derived SNPs showed valuable features for genomics and breeding applications such as a uniform distribution across the wheat genome, a prevailing single-locus codominant nature and a high polymorphism. Here, we report a new set of 275 highly robust genome-wide Triticum SNPs that are readily available for breeding purposes.
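Since the informativeness figures above are stated as MAF thresholds, a short sketch of how MAF can be estimated from genotype calls on a cultivar panel may help. It assumes a biallelic locus with genotypes given as two-character strings; the function name and the example calls are hypothetical and are not data from the study.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Estimate MAF at a biallelic locus from genotype strings like "AG".

    Returns 0.0 for a monomorphic locus; raises ValueError if more than
    two alleles are observed.
    """
    allele_counts = Counter(allele for genotype in genotypes
                            for allele in genotype)
    if len(allele_counts) > 2:
        raise ValueError("locus is not biallelic")
    if len(allele_counts) < 2:
        return 0.0
    return min(allele_counts.values()) / sum(allele_counts.values())


# Hypothetical calls for a 12-cultivar panel: 9 "AA" and 3 "GG".
panel = ["AA"] * 9 + ["GG"] * 3
print(minor_allele_frequency(panel))  # 6 of 24 alleles are G
```

Under this definition the example locus would pass the MAF >0.2 threshold but not the MAF >0.4 one.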
Bachlava, Eleni; Taylor, Christopher A; Tang, Shunxue; Bowers, John E; Mandel, Jennifer R; Burke, John M; Knapp, Steven J
Recent advances in next-generation DNA sequencing technologies have made possible the development of high-throughput SNP genotyping platforms that allow for the simultaneous interrogation of thousands of single-nucleotide polymorphisms (SNPs). Such resources have the potential to facilitate the rapid development of high-density genetic maps, and to enable genome-wide association studies as well as molecular breeding approaches in a variety of taxa. Herein, we describe the development of a SNP genotyping resource for use in sunflower (Helianthus annuus L.). This work involved the development of a reference transcriptome assembly for sunflower, the discovery of thousands of high quality SNPs based on the generation and analysis of ca. 6 Gb of transcriptome re-sequencing data derived from multiple genotypes, the selection of 10,640 SNPs for inclusion in the genotyping array, and the use of the resulting array to screen a diverse panel of sunflower accessions as well as related wild species. The results of this work revealed a high frequency of polymorphic SNPs and relatively high level of cross-species transferability. Indeed, greater than 95% of successful SNP assays revealed polymorphism, and more than 90% of these assays could be successfully transferred to related wild species. Analysis of the polymorphism data revealed patterns of genetic differentiation that were largely congruent with the evolutionary history of sunflower, though the large number of markers allowed for finer resolution than has previously been possible.
Abstract Background The primary goal of genetic linkage analysis is to identify genes affecting a phenotypic trait. After localisation of the linkage region, efficient genetic dissection of the disease linked loci requires that functional variants are identified across the loci. These functional variations are difficult to detect due to the extent of genetic diversity and, to date, incomplete cataloguing of the large number of variants present both within and between populations. Massively parallel sequencing platforms offer unprecedented capacity for variant discovery; however, the number of samples analysed is still limited by cost per sample. Some progress has been made in reducing the cost of resequencing using either multiplexing methodologies or through the utilisation of targeted enrichment technologies which provide the ability to resequence genomic areas of interest rather than full genome sequencing. Results We developed a method that combines current multiplexing methodologies with a solution-based target enrichment method to further reduce the cost of resequencing where region-specific sequencing is required. Our multiplex/enrichment strategy produced high quality data with nominal reduction of sequencing depth. We undertook a genotyping study and were successful in the discovery of novel SNP alleles in all samples at uniplex, duplex and pentaplex levels. Conclusion Our work describes the successful combination of a targeted enrichment method and index barcode multiplexing to reduce costs, time and labour associated with processing large sample sets. Furthermore, we have shown that the sequencing depth obtained is adequate for credible SNP genotyping analysis at uniplex, duplex and pentaplex levels.
Abstract Background Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from
Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua
Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty nonsynonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those in non-synonymous, suggesting negative selection. Distribution of non-synonymous to synonymous substitutions (Ka/Ks) ratio ranges from 0 to 4.01, (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
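The split of predicted SNPs into transitions and transversions follows from the chemistry of the bases: a transition swaps a purine for a purine or a pyrimidine for a pyrimidine, while a transversion crosses the two classes. The sketch below is a generic illustration of that classification, not the authors' pipeline; the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return "transition" or "transversion" for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("A", "C"))  # purine to pyrimidine

# Transition/transversion ratio from the counts quoted in the abstract.
print(round(9546 / 5124, 2))
```

With the counts quoted above, transitions outnumber transversions by a ratio of roughly 1.86.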
Hedrich Hans J
Full Text Available Abstract Background The laboratory rat (Rattus norvegicus) is an important model for studying many aspects of human health and disease. Detailed knowledge on genetic variation between strains is important from a biomedical, particularly pharmacogenetic, point of view and useful for marker selection for genetic cloning and association studies. Results We show that Single Nucleotide Polymorphisms (SNPs) in commonly used rat strains are surprisingly well represented in wild rat isolates. Shotgun sequencing of 814 Kbp in one wild rat resulted in the identification of 485 SNPs as compared with the Brown Norway genome sequence. Genotyping 36 commonly used inbred rat strains showed that 84% of these alleles are also polymorphic in a representative set of laboratory rat strains. Conclusion We postulate that shotgun sequencing in a wild rat sample and subsequent genotyping in multiple laboratory or domesticated strains, rather than direct shotgun sequencing of multiple strains, could be the most efficient SNP discovery approach. For the rat, laboratory strains still harbor a large portion of the haplotypes present in wild isolates, suggesting a relatively recent common origin and supporting the idea that rat inbred strains, in contrast to mouse inbred strains, originate from a single species, R. norvegicus.
Hurgobin, Bhavna; Edwards, David
Increasing evidence suggests that a single individual is insufficient to capture the genetic diversity within a species due to gene presence absence variation. In order to understand the extent to which genomic variation occurs in a species, the construction of its pangenome is necessary. The pangenome represents the complete set of genes of a species; it is composed of core genes, which are present in all individuals, and variable genes, which are present only in some individuals. Aside from variations at the gene level, single nucleotide polymorphisms (SNPs) are also an important form of genetic variation. The advent of next-generation sequencing (NGS) coupled with the heritability of SNPs make them ideal markers for genetic analysis of human, animal, and microbial data. SNPs have also been extensively used in crop genetics for association mapping, quantitative trait loci (QTL) analysis, analysis of genetic diversity, and phylogenetic analysis. This review focuses on the use of pangenomes for SNP discovery. It highlights the advantages of using a pangenome rather than a single reference for this purpose. This review also demonstrates how extra information not captured in a single reference alone can be used to provide additional support for linking genotypic data to phenotypic data.
Crooijmans Richard PMA
Full Text Available Abstract Background Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of true SNPs on a large scale. Results DNA pooled from five animals from a commercial boar line was digested with DraI; 150–250-bp fragments were isolated and end-sequenced using the Illumina 1 G Genome Analyzer, yielding 70,348,064 sequences 36-bp long. Rules were developed to select sequences, which were then aligned to unique positions in a reference genome. Sequences were selected based on quality, and three thresholds of sequence quality (SQ) were compared. The highest threshold of SQ allowed identification of a larger number of SNPs (17,489), distributed widely across the pig genome. In total, 3,142 SNPs were validated with a success rate of 96%. The correlation between estimated minor allele frequency (MAF) and genotyped MAF was moderate, and SNPs were highly polymorphic in other pig breeds. Lowering the SQ threshold and maintaining the same criteria for SNP identification resulted in the discovery of fewer SNPs (16,768), of which 259 were not identified using higher SQ levels. Validation of SNPs found exclusively in the lower SQ threshold had a success rate of 94% and a low correlation between estimated MAF and genotyped MAF. Base change analysis suggested that the rate of transitions in the pig genome is likely to be similar to that observed in humans. Chromosome X showed reduced nucleotide diversity relative to autosomes, as observed for other species. Conclusion Large numbers of SNPs can be identified reliably by creating strict rules for sequence selection, which simultaneously decreases sequence ambiguity. Selection of sequences using a higher SQ
Chao, Shiaoman; Jellen, Eric N.; Carson, Martin L.; Rines, Howard W.; Obert, Donald E.; Lutz, Joseph D.; Shackelford, Irene; Korol, Abraham B.; Wight, Charlene P.; Gardner, Kyle M.; Hattori, Jiro; Beattie, Aaron D.; Bjørnstad, Åsmund; Bonman, J. Michael; Jannink, Jean-Luc; Sorrells, Mark E.; Brown-Guedira, Gina L.; Mitchell Fetch, Jennifer W.; Harrison, Stephen A.; Howarth, Catherine J.; Ibrahim, Amir; Kolb, Frederic L.; McMullen, Michael S.; Murphy, J. Paul; Ohm, Herbert W.; Rossnagel, Brian G.; Yan, Weikai; Miclaus, Kelci J.; Hiller, Jordan; Maughan, Peter J.; Redman Hulse, Rachel R.; Anderson, Joseph M.; Islamovic, Emir
A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources. PMID:23533580
Rebekah E Oliver
Full Text Available A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources.
Zhao Patrick X
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) are the most common type of sequence variation among plants and are often functionally important. We describe the use of 454 technology and high resolution melting analysis (HRM) for high throughput SNP discovery in tetraploid alfalfa (Medicago sativa L.), a species with high economic value but limited genomic resources. Results The alfalfa genotypes selected from M. sativa subsp. sativa var. 'Chilean' and M. sativa subsp. falcata var. 'Wisfal', which differ in water stress sensitivity, were used to prepare cDNA from tissue of clonally-propagated plants grown under either well-watered or water-stressed conditions, and then pooled for 454 sequencing. Based on 125.2 Mb of raw sequence, a total of 54,216 unique sequences were obtained including 24,144 tentative consensus (TCs) sequences and 30,072 singletons, ranging from 100 bp to 6,662 bp in length, with an average length of 541 bp. We identified 40,661 candidate SNPs distributed throughout the genome. A sample of candidate SNPs were evaluated and validated using high resolution melting (HRM) analysis. A total of 3,491 TCs harboring 20,270 candidate SNPs were located on the M. truncatula (MT) 3.5.1 chromosomes. Gene Ontology assignments indicate that sequences obtained cover a broad range of GO categories. Conclusions We describe an efficient method to identify thousands of SNPs distributed throughout the alfalfa genome covering a broad range of GO categories. Validated SNPs represent valuable molecular marker resources that can be used to enhance marker density in linkage maps, identify potential factors involved in heterosis and genetic variation, and as tools for association mapping and genomic selection in alfalfa.
A response to Toplak et al: Does replication groups scoring reduce false positive rate in SNP interaction discovery? BMC Genomics 2010, 11:58. Background The genomewide evaluation of genetic epistasis is a computationally demanding task, and a current challenge in Genetics. HFCC (Hypothesis-Free Clinical Cloning) is one of the methods that have been suggested for genomewide epistasis analysis. In order to perform an exhaustive search of epistasis, HFCC has implemented several tools and data filters, such as the use of multiple replication groups, and direction of effect and control filters. A recent article has claimed that the use of multiple replication groups (as implemented in HFCC) does not reduce the false positive rate, and we hereby try to clarify these issues. Results/Discussion HFCC uses, as an analysis strategy, the possibility of replicating findings in multiple replication groups, in order to select a liberal subset of preliminary results that are above a statistical criterion and consistent in direction of effect. We show that the use of replication groups and the direction filter reduces the false positive rate of a study, although at the expense of lowering the overall power of the study. A post-hoc analysis of these selected signals in the combined sample could then be performed to select the most promising results. Conclusion Replication of results in independent samples is generally used in scientific studies to establish credibility in a finding. Nonetheless, the combined analysis of several datasets is known to be a preferable and more powerful strategy for the selection of top signals. HFCC is a flexible and complete analysis tool, and one of its analysis options combines these two strategies: A preliminary multiple replication group analysis to eliminate inconsistent false positive results, and a post-hoc combined-group analysis to select the top signals. PMID:20576100
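The two-stage strategy described above (keep only signals that pass a statistical criterion in every replication group and agree in direction of effect, then rank the survivors in the combined sample) can be sketched roughly as follows. This is not HFCC code: the data layout, the function name `passes_replication_filter`, and the `alpha` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: one (effect estimate, p-value) pair per replication group.
GroupResult = Tuple[float, float]


def passes_replication_filter(results: List[GroupResult], alpha: float = 0.05) -> bool:
    """Keep a candidate only if every group is below alpha and all effects share a sign."""
    if not results:
        return False
    all_significant = all(p < alpha for _, p in results)
    all_positive = all(effect > 0 for effect, _ in results)
    all_negative = all(effect < 0 for effect, _ in results)
    return all_significant and (all_positive or all_negative)


if __name__ == "__main__":
    candidates = {
        "pair_1": [(0.8, 0.01), (0.6, 0.03), (0.9, 0.02)],    # consistent, kept
        "pair_2": [(0.8, 0.01), (-0.7, 0.02), (0.9, 0.02)],   # direction flips, dropped
        "pair_3": [(0.8, 0.01), (0.6, 0.20), (0.9, 0.02)],    # one group misses alpha, dropped
    }
    kept = [name for name, res in candidates.items() if passes_replication_filter(res)]
    print(kept)  # ['pair_1']
```

The sketch also makes the stated trade-off visible: requiring agreement in every group discards inconsistent signals, but a true signal that happens to miss the threshold in one small group is discarded too, which is the loss of power the abstract mentions.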
Pappas Georgios J
Full Text Available Abstract Background Benefits from high-throughput sequencing using 454 pyrosequencing technology may be most apparent for species with high societal or economic value but few genomic resources. Rapid means of gene sequence and SNP discovery using this novel sequencing technology provide a set of baseline tools for genome-level research. However, it is questionable how effective the sequencing of large numbers of short reads for species with essentially no prior gene sequence information will support contig assemblies and sequence annotation. Results With the purpose of generating the first broad survey of gene sequences in Eucalyptus grandis, the most widely planted hardwood tree species, we used 454 technology to sequence and assemble 148 Mbp of expressed sequences (EST). EST sequences were generated from a normalized cDNA pool comprised of multiple tissues and genotypes, promoting discovery of homologues to almost half of Arabidopsis genes, and a comprehensive survey of allelic variation in the transcriptome. By aligning the sequencing reads from multiple genotypes we detected 23,742 SNPs, 83% of which were validated in a sample. Genome-wide nucleotide diversity was estimated for 2,392 contigs using a modified theta (θ) parameter, adapted for measuring genetic diversity from polymorphisms detected by randomly sequencing a multi-genotype cDNA pool. Diversity estimates in non-synonymous nucleotides were on average 4x smaller than in synonymous, suggesting purifying selection. Non-synonymous to synonymous substitutions (Ka/Ks) among 2,001 contigs averaged 0.30 and was skewed to the right, further supporting that most genes are under purifying selection. Comparison of these estimates among contigs identified major functional classes of genes under purifying and diversifying selection in agreement with previous research. Conclusion In providing an abundance of foundational transcript sequences where limited prior genomic information existed, this
Aslam Muhammad L
whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.
Doran Anthony G
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results Here we present SNPdat, a high throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.
Full Text Available Abstract Background Computational methods that infer single nucleotide polymorphism (SNP) interactions from phenotype data may uncover new biological mechanisms in non-Mendelian diseases. However, practical aspects of such analysis face many problems. Present experimental studies typically use SNP arrays with hundreds of thousands of SNPs but record only hundreds of samples. Candidate SNP pairs inferred by interaction analysis may include a high proportion of false positives. Recently, Gayan et al. (2008) proposed to reduce the number of false positives by combining results of interaction analysis performed on subsets of data (replication groups), rather than analyzing the entire data set directly. If performing as hypothesized, replication groups scoring could improve interaction analysis and also any type of feature ranking and selection procedure in systems biology. Because Gayan et al. do not compare their approach to the standard interaction analysis techniques, we here investigate if replication groups indeed reduce the number of reported false positive interactions. Results A set of simulated and false interaction-imputed experimental SNP data sets were used to compare the inference of SNP-SNP interactions by means of replication groups to the standard approach where the entire data set was directly used to score all candidate SNP pairs. In all our experiments, the inference of interactions from the entire data set (e.g. without using the replication groups) reported fewer false positives. Conclusions With respect to the direct scoring approach the utility of replication groups does not reduce false positive rates, and may, depending on the data set, often perform worse.
Helyar, Sarah J; Limborg, Morten; Bekkevold, Dorte
to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic...
Gardner, Shea N; Hall, Barry G
Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths.
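To make the idea of alignment-free, reference-free SNP discovery from k-mers concrete, here is a toy sketch. It is not kSNP's implementation and ignores reverse complements, sequencing errors, and scaling; it only assumes that a candidate SNP can be recognised as a pair of odd-length k-mers that share both flanks and differ at the central base. The function name and the "." placeholder for the central position are choices made for this illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def find_kmer_snps(genomes: Dict[str, str], k: int = 5) -> Dict[str, Dict[str, str]]:
    """Return {flank_pattern: {genome_name: central_base}} for variable loci.

    A locus is the k-mer with its central base masked by '.'. Loci where a
    genome shows more than one central base are skipped as ambiguous.
    """
    if k < 3 or k % 2 == 0:
        raise ValueError("k must be an odd integer >= 3")
    mid = k // 2
    loci: Dict[str, Dict[str, Set[str]]] = defaultdict(dict)
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            pattern = kmer[:mid] + "." + kmer[mid + 1:]
            loci[pattern].setdefault(name, set()).add(kmer[mid])

    snps: Dict[str, Dict[str, str]] = {}
    for pattern, bases_by_genome in loci.items():
        if any(len(bases) != 1 for bases in bases_by_genome.values()):
            continue  # same flanks, several central bases within one genome
        alleles = {name: next(iter(bases)) for name, bases in bases_by_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[pattern] = alleles
    return snps


if __name__ == "__main__":
    toy = {"g1": "ACGTACCTGA", "g2": "ACGTATCTGA"}
    print(find_kmer_snps(toy, k=5))  # {'TA.CT': {'g1': 'C', 'g2': 'T'}}
```

Because loci are keyed by their flanking sequence rather than by a coordinate, no genome has to serve as the reference and no multiple alignment is built, which is the property the abstract highlights.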
Kottapalli, Pratibha; Ulloa, Mauricio; Kottapalli, Kameswara Rao; Payton, Paxton; Burke, John
The objective of this study was to explore the known narrow genetic diversity and discover single-nucleotide polymorphic (SNP) markers for marker-assisted breeding within Pima cotton (Gossypium barbadense L.) leaf transcriptomes. cDNA from 25-day plants of three diverse cotton genotypes [Pima S6 (PS6), Pima S7 (PS7), and Pima 3-79 (P3-79)] was sequenced on Illumina sequencing platform. A total of 28.9 million reads (average read length of 138 bp) were generated by sequencing cDNA libraries of these three genotypes. The de novo assembly of reads generated transcriptome sets of 26,369 contigs for PS6, 25,870 contigs for PS7, and 24,796 contigs for P3-79. A Pima leaf reference transcriptome was generated consisting of 42,695 contigs. More than 10,000 single-nucleotide polymorphisms (SNPs) were identified between the genotypes, with 100% SNP frequency and a minimum of eight sequencing reads. The most prevalent SNP substitutions were C—T and A—G in these cotton genotypes. The putative SNPs identified can be utilized for characterizing genetic diversity, genotyping, and eventually in Pima cotton breeding through marker-assisted selection.
Birolo, Giovanni; Prazzoli, Maria Lucia; Lorenzi, Silvia; Valle, Giorgio; Grando, Maria Stella
Whole-genome comparisons of Vitis vinifera subsp. sativa and V. vinifera subsp. sylvestris are expected to provide a better estimate of the valuable genetic diversity still present in grapevine, and help to reconstruct the evolutionary history of a major crop worldwide. To this aim, the increase of molecular marker density across the grapevine genome is fundamental. Here we describe the SNP discovery in a grapevine germplasm collection of 51 cultivars and 44 wild accessions through a novel protocol of restriction-site associated DNA (RAD) sequencing. By resequencing 1.1% of the grapevine genome at a high coverage, we recovered 34K BamHI unique restriction sites, of which 6.8% were absent in the ‘PN40024’ reference genome. Moreover, we identified 37,748 single nucleotide polymorphisms (SNPs), 93% of which belonged to the 19 assembled chromosomes with an average of 1.8K SNPs per chromosome. Nearly half of the SNPs fell in genic regions mostly assigned to the functional categories of metabolism and regulation, whereas some nonsynonymous variants were identified in genes related with the detection and response to environmental stimuli. SNP validation was carried-out, showing the ability of RAD-seq to accurately determine genotypes in a highly heterozygous species. To test the usefulness of our SNP panel, the main diversity statistics were evaluated, highlighting how the wild grapevine retained less genetic variability than the cultivated form. Furthermore, the analysis of Linkage Disequilibrium (LD) in the two subspecies separately revealed how the LD decays faster within the domesticated grapevine compared to its wild relative. Being the first application of RAD-seq in a diverse grapevine germplasm collection, our approach holds great promise for exploiting the genetic resources available in one of the most economically important fruit crops. PMID:28125640
Van, Kyujung; Kang, Yang Jae; Han, Kwang-Soo; Lee, Yeong-Ho; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Suk-Ha
Mungbean [Vigna radiata (L.) Wilczek], a self-pollinated diploid plant with 2n = 22 chromosomes, is an important legume crop with a high-quality amino acid profile. Sequence variation at the whole-genome level was examined by comparing two mungbean cultivars, Sunhwanokdu and Gyeonggijaerae 5, using Illumina HiSeq sequencing data. More than 40 billion bp from both mungbean cultivars were sequenced to a depth of 72×. After de novo assembly of Sunhwanokdu contigs by ABySS 1.3.2 (N50 = 9,958 bp), those longer than 10 kb were aligned with Gyeonggijaerae 5 reads using the Burrows-Wheeler Aligner. SAMTools was used for retrieving single nucleotide polymorphisms (SNPs) between Sunhwanokdu and Gyeonggijaerae 5, defining the lowest and highest depths as 5 and 100, respectively, and the sequence quality as 100. Of the 305,504 single-base changes identified, 40,503 SNPs were considered heterozygous in Gyeonggijaerae 5. Among the remaining 265,001 SNPs, 65.9 % (174,579 cases) were transitions and 34.1 % (90,422 cases) were transversions. For SNP validation, a total of 42 SNPs were chosen among Sunhwanokdu contigs longer than 10 kb and sharing at least 80 % sequence identity with common bean expressed sequence tags as determined with est2genome. Using seven mungbean cultivars from various origins in addition to Sunhwanokdu and Gyeonggijaerae 5, most of the SNPs identified by bioinformatics tools were confirmed by Sanger sequencing. These genome-wide SNP markers could enrich the current molecular resources and might be of value for the construction of a mungbean genetic map and the investigation of genetic diversity.
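The counts reported above are internally consistent, which can be checked with a few lines of arithmetic. The sketch below only re-derives the published percentages from the published counts; the function name is hypothetical and the transition/transversion ratio is a derived figure that the abstract itself does not state.

```python
def ts_tv_summary(total_changes: int, heterozygous: int, transitions: int, transversions: int) -> dict:
    """Re-derive the homozygous SNP count and the transition/transversion shares."""
    remaining = total_changes - heterozygous
    if transitions + transversions != remaining:
        raise ValueError("transitions + transversions must equal the remaining SNP count")
    return {
        "remaining": remaining,
        "transition_pct": round(100 * transitions / remaining, 1),
        "transversion_pct": round(100 * transversions / remaining, 1),
        "ts_tv_ratio": round(transitions / transversions, 2),
    }


if __name__ == "__main__":
    # Counts taken from the abstract: 305,504 changes, 40,503 heterozygous,
    # 174,579 transitions and 90,422 transversions.
    print(ts_tv_summary(305_504, 40_503, 174_579, 90_422))
```

With the abstract's numbers this gives 265,001 remaining SNPs, 65.9 % transitions and 34.1 % transversions, matching the reported values, and a transition/transversion ratio of about 1.93.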
Ruiz-Rojas, J J; Sargent, D J; Shulaev, V; Dickerman, A W; Pattison, J; Holt, S H; Ciordia, A; Veilleux, Richard E
As part of a program to develop forward and reverse genetics platforms in the diploid strawberry [Fragaria vesca L.; (2n = 2x = 14)] we have generated insertional mutant lines by T-DNA mutagenesis using pCAMBIA vectors. To characterize the T-DNA insertion sites of a population of 108 unique single copy mutants, we utilized thermal asymmetric interlaced PCR (hiTAIL-PCR) to amplify the flanking region surrounding either the left or right border of the T-DNA. Bioinformatics analysis of flanking sequences revealed little preference for insertion site with regard to G/C content; left borders tended to retain more of the plasmid backbone than right borders. Primers were developed from F. vesca flanking sequences to attempt to amplify products from both parents of the reference F. vesca 815 x F. bucharica 601 mapping population. Polymorphism occurred as presence/absence of an amplification product for 16 primer pairs and as different size products for 12 primer pairs. For 46 mutants, where polymorphism was not found by PCR, the amplification products were sequenced to reveal SNP polymorphism. A cleaved amplified polymorphic sequence/derived cleaved amplified polymorphism sequence (CAPS/dCAPS) strategy was then applied to find restriction endonuclease recognition sites in one of the parental lines to map the SNP position of 74 of the T-DNA insertion lines. BLAST search of flanking regions against GenBank revealed that 46 of 108 flanking sequences were close to presumed strawberry genes related to annotated genes from other plants.
Studer, Bruno; Kölliker, Roland
In recent years, single nucleotide polymorphism (SNP) markers have emerged as the marker technology of choice for plant genetics and breeding applications. Besides the efficient technologies available for SNP discovery even in complex genomes, one of the main reasons for this is the availability of high-throughput platforms for multiplexed SNP genotyping. Advancements in these technologies have enabled increased flexibility and throughput, allowing for the generation of adequate SNP marker data at very competitive cost per data point....
Full Text Available Abstract Background One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or genomic regions with phenotypes. The completion of the bovine genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of the genetic variations present in cattle. Here we describe the whole-genome resequencing of two Bos taurus bulls from distinct breeds for the purpose of identifying and annotating novel forms of genetic variation in cattle. Results The genomes of a Black Angus bull and a Holstein bull were sequenced to 22-fold and 19-fold coverage, respectively, using the ABI SOLiD system. Comparisons of the sequences with the Btau4.0 reference assembly yielded 7 million single nucleotide polymorphisms (SNPs), 24% of which were identified in both animals. Of the total SNPs found in Holstein, Black Angus, and in both animals, 81%, 81%, and 75% respectively are novel. In-depth annotations of the data identified more than 16 thousand distinct non-synonymous SNPs (85% novel) between the two datasets. Alignments between the SNP-altered proteins and orthologues from numerous species indicate that many of the SNPs alter well-conserved amino acids. Several SNPs predicted to create or remove stop codons were also found. A comparison between the sequencing SNPs and genotyping results from the BovineHD high-density genotyping chip indicates a detection rate of 91% for homozygous SNPs and 81% for heterozygous SNPs. The false positive rate is estimated to be about 2% for both the Black Angus and Holstein SNP sets, based on follow-up genotyping of 422 and 427 SNPs, respectively. Comparisons of read depth between the two bulls along the reference assembly identified 790 putative copy-number variations (CNVs). Ten
Koutsos Anastasios; Morlais Isabelle; Simard Frédéric; Krishnakumar Sujatha; Cohuet Anna; Fontenille Didier; Mindrinos Michael; Kafatos Fotis C
Abstract Background Anopheles innate immunity affects Plasmodium development and is a potential target of innovative malaria control strategies. The extent and distribution of nucleotide diversity in immunity genes might provide insights into the evolutionary forces that condition pathogen-vector interactions. The discovery of polymorphisms is an essential step towards association studies of susceptibility to infection. Results We sequenced coding fragments of 72 immune related genes in natur...
Norman, Anita J; Street, Nathaniel R; Spong, Göran
Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly-informative SNPs. We tested 150 putative SNPs of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia.
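The selection step described above (keeping the validated SNPs with the highest minor allele frequency) can be illustrated with a minimal Python sketch. It assumes biallelic diploid markers and a simple dictionary of genotype calls; the function names, the data layout and the toy genotypes are illustrative assumptions, not the authors' pipeline, and the sex-linked and mitochondrial markers of the real panel would need separate handling.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic SNP.

    `genotypes` is a list of two-character calls such as "AG";
    missing calls are encoded as None and skipped (an assumed encoding).
    """
    alleles = Counter()
    for call in genotypes:
        if call is None:
            continue
        alleles.update(call)  # counts each of the two allele characters
    total = sum(alleles.values())
    if total == 0 or len(alleles) < 2:
        return 0.0  # no data, or monomorphic in this sample
    return min(alleles.values()) / total


def top_snps_by_maf(panel, n):
    """Rank SNP ids by descending MAF and keep the first n."""
    ranked = sorted(
        panel,
        key=lambda snp: minor_allele_frequency(panel[snp]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical toy panel: three SNPs typed on a handful of individuals.
toy_panel = {
    "snp1": ["AA", "AG", "GG", None],
    "snp2": ["CC", "CC", "CT"],
    "snp3": ["TT", "TT", "TT"],
}
selected = top_snps_by_maf(toy_panel, 2)
```

With the real data, `n` would be chosen to fill the marker slots left on the 96-SNP panel after the sex-linked and mitochondrial markers are accounted for.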
Lepoittevin, C; Bodénès, C; Chancerel, E; Villate, L; Lang, T; Lesur, I; Boury, C; Ehrenmann, F; Zelenica, D; Boland, A; Besse, C; Garnier-Géré, P; Plomion, C; Kremer, A
An Illumina Infinium SNP genotyping array was constructed for European white oaks. Six individuals of Quercus petraea and Q. robur were considered for SNP discovery using both previously obtained Sanger sequences across 676 gene regions (1371 in vitro SNPs) and Roche 454 technology sequences from 5112 contigs (6542 putative in silico SNPs). The 7913 SNPs were genotyped across the six parental individuals, full-sib progenies (one within each species and two interspecific crosses between Q. petraea and Q. robur) and three natural populations from south-western France that included two additional interfertile white oak species (Q. pubescens and Q. pyrenaica). The genotyping success rate in mapping populations was 80.4% overall and 72.4% for polymorphic SNPs. In natural populations, these figures were lower (54.8% and 51.9%, respectively). Illumina genotype clusters with compression (shift of clusters on the normalized x-axis) were detected in ~25% of the successfully genotyped SNPs and may be due to the presence of paralogues. Compressed clusters were significantly more frequent for SNPs showing a priori incorrect Illumina genotypes, suggesting that they should be considered with caution or discarded. Altogether, these results show a high experimental error rate for the Infinium array (between 15% and 20% of SNPs potentially unreliable and 10% when excluding all compressed clusters), and recommendations are proposed when applying this type of high-throughput technique. Finally, results on diversity levels and shared polymorphisms across targeted white oaks and more distant species of the Quercus genus are discussed, and perspectives for future comparative studies are proposed.
Sim, Sung-Chur; Robbins, Matthew D; Chilcott, Charles; Zhu, Tong; Francis, David M
Cultivated tomato (Solanum lycopersicum L.) has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP) discovery as a high-throughput approach for marker development in cultivated tomato. Three varieties, FL7600 (fresh-market), OH9242 (processing), and PI114490 (cherry) were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α) used to define the confidence interval (CI), and ranged from 76% for polymorphisms identified at α ≤ 10^-6 to 60% for those identified at α ≤ 10^-2. Validation percentage reached a plateau between α ≤ 10^-4 and α ≤ 10^-7, but failure to identify known SFPs (Type II error) increased dramatically at α ≤ 10^-6. Through sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst) suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace), and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka) to synonymous substitutions (Ks) for 20 loci with multiple SNPs (≥ 4 per locus). Six of 20 loci showed ratios of Ka/Ks ≥ 0.9. Array-based SFP discovery was an efficient method to identify a large number of molecular markers for genetics and breeding in elite tomato germplasm. Patterns of sequence variation across five major tomato groups provided insight into the effect of human selection on genetic variation.
Targeted sequencing is a cost-efficient way to obtain answers to biological questions in many projects, but the choice of the enrichment method to use can be difficult. In this study we compared two hybridization methods for target enrichment for massively parallel sequencing and single nucleotide polymorphism (SNP) discovery, namely Nimblegen sequence capture arrays and the SureSelect liquid-based hybrid capture system. We prepared sequencing libraries from three HapMap samples using both methods, sequenced the libraries on the Illumina Genome Analyzer, mapped the sequencing reads back to the genome, and called variants in the sequences. 74-75% of the sequence reads originated from the targeted region in the SureSelect libraries and 41-67% in the Nimblegen libraries. We could sequence up to 99.9% and 99.5% of the regions targeted by capture probes from the SureSelect libraries and from the Nimblegen libraries, respectively. The Nimblegen probes covered 0.6 Mb more of the original 3.1 Mb target region than the SureSelect probes. In each sample, we called more SNPs and detected more novel SNPs from the libraries that were prepared using the Nimblegen method. Thus the Nimblegen method gave better results when judged by the number of SNPs called, but this came at the cost of more over-sampling.
Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.
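The assembly statistic quoted above (N50 of the contigs) is the length of the contig at which the cumulative length of the contigs, taken from longest to shortest, first reaches half of the total assembly length. A minimal Python sketch under that standard definition follows; the function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least
    half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("n50 requires at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # integer-safe test for "at least half"
            return length


# Hypothetical contig lengths in bp.
toy_contigs = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_contigs)
```

Because N50 weights long contigs more heavily than the arithmetic mean does, an N50 above the mean contig length (357 bp versus 323 bp in the abstract) is the usual pattern.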
Gaur, Rashmi; Azam, Sarwar; Jeena, Ganga; Khan, Aamir Waseem; Choudhary, Shalu; Jain, Mukesh; Yadav, Gitanjali; Tyagi, Akhilesh K; Chattopadhyay, Debasis; Bhatia, Sabhyata
The present study reports the large-scale discovery of genome-wide single-nucleotide polymorphisms (SNPs) in chickpea, identified mainly through the next generation sequencing of two genotypes, i.e. Cicer arietinum ICC4958 and its wild progenitor C. reticulatum PI489777, parents of an inter-specific reference mapping population of chickpea. Development and validation of a high-throughput SNP genotyping assay based on Illumina's GoldenGate Genotyping Technology and its application in building a high-resolution genetic linkage map of chickpea is described for the first time. In this study, 1022 SNPs were identified, of which 768 high-confidence SNPs were selected for designing the custom Oligo Pool All (CpOPA-I) for genotyping. Of these, 697 SNPs could be successfully used for genotyping, demonstrating a high success rate of 90.75%. Genotyping data of the 697 SNPs were compiled along with those of 368 co-dominant markers mapped in an earlier study, and a saturated genetic linkage map of chickpea was constructed. One thousand and sixty-three markers were mapped onto eight linkage groups spanning 1808.7 cM (centiMorgans) with an average inter-marker distance of 1.70 cM, thereby representing one of the most advanced maps of chickpea. The map was used for the synteny analysis of chickpea, which revealed a higher degree of synteny with the phylogenetically close Medicago than with soybean. The first set of validated SNPs and map resources developed in this study will not only facilitate QTL mapping, genome-wide association analysis and comparative mapping in legumes but also help anchor scaffolds arising out of the whole-genome sequencing of chickpea.
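The two summary figures in this abstract can be re-derived from the counts it reports. The short Python sketch below does the arithmetic; the function names are illustrative assumptions, and the inter-marker distance is computed the simple way (map length divided by marker count), which is assumed rather than stated in the abstract.

```python
def success_rate(successful, attempted):
    """Percentage of attempted assays that produced usable genotypes."""
    return 100.0 * successful / attempted


def mean_marker_spacing(map_length_cm, n_markers):
    """Average distance between markers, taken as map length / marker count."""
    return map_length_cm / n_markers


# Counts taken from the abstract: 697 of 768 SNPs genotyped successfully,
# and 1063 markers mapped across 1808.7 cM.
rate = success_rate(697, 768)
spacing = mean_marker_spacing(1808.7, 1063)
```

Here `rate` evaluates to roughly 90.755, which agrees with the reported 90.75% when cut to two decimals, and `spacing` rounds to the reported 1.70 cM.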
Abstract Background Cultivated tomato (Solanum lycopersicum L.) has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP) discovery as a high-throughput approach for marker development in cultivated tomato. Results Three varieties, FL7600 (fresh-market), OH9242 (processing), and PI114490 (cherry) were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α) used to define the confidence interval (CI), and ranged from 76% for polymorphisms identified at α ≤ 10^-6 to 60% for those identified at α ≤ 10^-2. Validation percentage reached a plateau between α ≤ 10^-4 and α ≤ 10^-7, but failure to identify known SFPs (Type II error) increased dramatically at α ≤ 10^-6. Through sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst) suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace), and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka) to synonymous substitutions (Ks) for 20 loci with multiple SNPs (≥ 4 per
Abstract Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA
You, Frank M; Huo, Naxin; Gu, Yong Q; Lazo, Gerard R; Dvorak, Jan; Anderson, Olin D
In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences that are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species, such as wheat, for which a large number of ESTs are available but for which genome sequences and intron/exon boundaries are not available. To date, no suitable high-throughput tool is available for this purpose. We have developed the ConservedPrimers 2.0 pipeline for designing intron-flanking primers for large-scale SNP discovery and marker development, and demonstrated its utility in wheat. This tool uses non-redundant wheat EST sequences, such as wheat contigs and singleton ESTs, and related genomic sequences, such as those of rice, as inputs. It aligns the ESTs to the genomic sequences to identify unique colinear exon blocks and predicts intron lengths. Intron-flanking primers are then designed based on the intron/exon information using the Primer3 core program or BatchPrimer3. Finally, a tab-delimited file containing intron-flanking primer pair sequences and their primer properties is generated for primer ordering and PCR applications. Using this tool, 1,922 bin-mapped wheat ESTs (31.8% of the 6,045 in total) were found to have unique colinear exon blocks suitable for primer design and 1,821 primer pairs were designed from these single- or low-copy genes for PCR amplification and SNP discovery. With these primers and subsequently designed genome-specific primers, a total of 1,527 loci were found to contain one or more genome-specific SNPs. The ConservedPrimers 2.0 pipeline for designing intron-flanking primers was developed and its
Derzelle, Sylviane; Girault, Guillaume; Roest, Hendrik Ido Jan; Koene, Miriam
Bacillus anthracis, the causative agent of anthrax, has been widely described as a clonal species. Here we report the use of both canonical SNP analysis and whole-genome sequencing to characterize the phylogenetic lineages of B. anthracis from the Netherlands. Eleven strains isolated over a 25-year period (1968-1993) were paired-end sequenced using parallel sequencing technology. Five canSNP groups or lineages, i.e. A.Br.001/002 (n=6), A.Br.Aust94 (n=2), A.Br.008/011 (n=1), A.Br.011/009 (n=1) and A.Br.Vollum (n=1) were identified. Comparative analyses, with a focus on SNP discovery, were carried out using a total of 52 B. anthracis genomes. A phylogeographic "Dutch" cluster within the dominant A.Br.001/002 group was discovered, involving isolates from a single outbreak. Diagnostic SNPs specific to the newly identified sub-groups were developed into high-resolution melting SNP discriminative assays for the purpose of rapid molecular epidemiology. Phylogenetic relationships with strains from other parts of the world are discussed.
Abstract Background Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results We used Illumina paired-end technology to sequence mRNA (RNAseq) from near-isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty-nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM, defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes.
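The bulk-comparison step above (keeping SNPs whose allele is enriched at least 3-fold in one pool) can be sketched in Python as follows. The abstract does not give the exact formula, so this is only one plausible reading: allele frequencies are estimated from read counts in each bulk, their ratio is taken with a small floor to avoid division by zero, and enrichment in either direction is accepted. All function names, the `floor` parameter, the data layout and the read counts are hypothetical.

```python
def allele_frequency(alt_reads, total_reads):
    """Fraction of reads carrying the alternate allele, or None without coverage."""
    if total_reads == 0:
        return None
    return alt_reads / total_reads


def fold_enrichment(freq_a, freq_b, floor=0.01):
    """Ratio of the allele frequency in bulk A to that in bulk B.

    `floor` is an assumed lower bound that keeps the ratio finite
    when the allele is absent from one bulk.
    """
    return max(freq_a, floor) / max(freq_b, floor)


def enriched_snps(counts, min_fold=3.0):
    """Return SNP ids enriched at least `min_fold` in either bulk.

    `counts` maps a SNP id to ((alt_a, total_a), (alt_b, total_b)).
    """
    selected = []
    for snp, ((alt_a, total_a), (alt_b, total_b)) in counts.items():
        freq_a = allele_frequency(alt_a, total_a)
        freq_b = allele_frequency(alt_b, total_b)
        if freq_a is None or freq_b is None:
            continue  # no reads in one bulk: cannot be compared
        fold = fold_enrichment(freq_a, freq_b)
        if fold >= min_fold or fold <= 1.0 / min_fold:
            selected.append(snp)
    return selected


# Hypothetical read counts for four SNPs in the two bulks.
toy_counts = {
    "snp1": ((18, 20), (4, 20)),   # 0.90 vs 0.20
    "snp2": ((10, 20), (9, 20)),   # 0.50 vs 0.45
    "snp3": ((1, 20), (12, 20)),   # 0.05 vs 0.60
    "snp4": ((0, 0), (5, 20)),     # no coverage in bulk A
}
candidates = enriched_snps(toy_counts)
```

In practice a minimum read depth per bulk would probably also be required before trusting a frequency estimate; that filter is omitted here to keep the sketch short.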
Talukder, Zahirul I; Seiler, Gerald J; Song, Qijian; Ma, Guojia; Qi, Lili
Basal stalk rot (BSR), caused by the ascomycete fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a serious disease of sunflower (Helianthus annuus L.) in the cool and humid production areas of the world. Quantitative trait loci (QTL) for BSR resistance were identified in a sunflower recombinant inbred line (RIL) population derived from the cross HA 441 × RHA 439. A genotyping-by-sequencing (GBS) approach was adapted to discover single nucleotide polymorphism (SNP) markers. A genetic linkage map comprising 1053 SNP markers on 17 linkage groups (LGs) and spanning 1401.36 cM was developed. The RILs were tested in five environments (locations and years) for resistance to BSR. Quantitative trait loci were identified in each environment separately and also with integrated data across environments. A total of six QTL were identified in all five environments: one of each on LGs 4, 9, 10, 11, 16, and 17. The two most significant QTL were identified in multiple environments on LGs 10 and 17, explaining 31.6 and 20.2% of the observed phenotypic variance, respectively. The remaining four QTL were detected in only one environment on LGs 4, 9, 11, and 16, respectively. Each of these QTL explains between 6.4 and 10.5% of the observed phenotypic variation in the RIL population. Alleles conferring increased resistance were contributed by both parents. The potential of these QTL in marker-assisted selection (MAS) breeding is discussed. Copyright © 2016 Crop Science Society of America.
Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M
Background Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tag (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the
Brant K Peterson
The ability to efficiently and accurately determine genotypes is a keystone technology in modern genetics, crucial to studies ranging from clinical diagnostics, to genotype-phenotype association, to reconstruction of ancestry and the detection of selection. To date, high capacity, low cost genotyping has been largely achieved via "SNP chip" microarray-based platforms which require substantial prior knowledge of both genome sequence and variability, and once designed are suitable only for those targeted variable nucleotide sites. This method introduces substantial ascertainment bias and inherently precludes detection of rare or population-specific variants, a major source of information for both population history and genotype-phenotype association. Recent developments in reduced-representation genome sequencing experiments on massively parallel sequencers (commonly referred to as RAD-tag or RADseq) have brought direct sequencing to the problem of population genotyping, but increased cost and procedural and analytical complexity have limited their widespread adoption. Here, we describe a complete laboratory protocol, including a custom combinatorial indexing method, and accompanying software tools to facilitate genotyping across large numbers (hundreds or more) of individuals for a range of markers (hundreds to hundreds of thousands). Our method requires no prior genomic knowledge and achieves per-site and per-individual costs below that of current SNP chip technology, while requiring similar hands-on time investment, comparable amounts of input DNA, and downstream analysis times on the order of hours. Finally, we provide empirical results from the application of this method to both genotyping in a laboratory cross and in wild populations. Because of its flexibility, this modified RADseq approach promises to be applicable to a diversity of biological questions in a wide range of organisms.
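The idea behind combinatorial indexing is that two small sets of index sequences, used in every pairing, label many more samples than either set could alone. The Python sketch below illustrates only that counting argument; the abstract does not describe the actual indexing scheme, so the two-level layout, the function names and the index sequences are all hypothetical.

```python
from itertools import product


def index_combinations(first_indices, second_indices):
    """All ordered pairings of one first-level and one second-level index."""
    return list(product(first_indices, second_indices))


def assign_samples(samples, first_indices, second_indices):
    """Give each sample a unique index pair, failing if there are too few pairs."""
    combos = index_combinations(first_indices, second_indices)
    if len(samples) > len(combos):
        raise ValueError(
            f"{len(samples)} samples but only {len(combos)} index pairs"
        )
    return dict(zip(samples, combos))


# Hypothetical index sequences: 3 x 2 = 6 distinguishable samples.
first_level = ["ACGT", "TGCA", "GATC"]
second_level = ["AAGG", "CCTT"]
layout = assign_samples(
    ["s1", "s2", "s3", "s4", "s5"], first_level, second_level
)
```

With, say, 48 first-level and 12 second-level indices the same pairing would distinguish 576 samples from only 60 synthesized sequences, which is the economy that makes multiplexing hundreds of individuals practical.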
The growing accessibility of genomic resources through next-generation sequencing (NGS) technologies has revolutionized the application of molecular genetic tools to ecology and evolutionary studies in non-model organisms. Here we present the case study of the European hake (Merluccius merluccius), one of the most important demersal resources of European fisheries. Two sequencing platforms, the Roche 454 FLX (454) and the Illumina Genome Analyzer (GAII), were used for single nucleotide polymorphism (SNP) discovery in the hake muscle transcriptome. De novo transcriptome assembly into unique contigs, annotation, and in silico SNP detection were carried out in parallel for 454 and GAII sequence data. High-throughput genotyping using the Illumina GoldenGate assay was performed for validating 1,536 putative SNPs. Validation results were analysed to compare the performances of 454 and GAII methods and to evaluate the role of several variables (e.g. sequencing depth, intron-exon structure, sequence quality and annotation). Despite well-known differences in sequence length and throughput, the two approaches showed similar assay conversion rates (approximately 43%) and percentages of polymorphic loci (67.5% and 63.3% for GAII and 454, respectively). Both NGS platforms therefore proved suitable for large-scale identification of SNPs in transcribed regions of non-model species, although the lack of a reference genome profoundly affects the genotyping success rate. The overall efficiency, however, can be improved using strict quality and filtering criteria for SNP selection (sequence quality, intron-exon structure, target region score).
BACKGROUND: Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. METHODOLOGY: We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed as compared to MEME, a more classical motif detection algorithm that treats orthologs independently. RESULTS AND CONCLUSIONS: Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights into this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE.
Valdisser, Paula Arielle M R; Pappas, Georgios J; de Menezes, Ivandilson P P; Müller, Bárbara S F; Pereira, Wendell J; Narciso, Marcelo G; Brondani, Claudio; Souza, Thiago L P O; Borba, Tereza C O; Vianello, Rosana P
Researchers have made great advances in the development and application of genomic approaches for common beans, creating opportunities to drive more realistic and applicable strategies for sustainable management of the genetic resource towards plant breeding. This work provides useful polymorphic single-nucleotide polymorphisms (SNPs) for high-throughput common bean genotyping developed by RAD (restriction site-associated DNA) sequencing. The RAD tags were generated from DNA pooled from 12 common bean genotypes, including breeding lines of different gene pools and market classes. The aligned sequences identified 23,748 putative RAD-SNPs, of which 3357 were adequate for genotyping; 1032 RAD-SNPs with the highest ADT (assay design tool) score are presented in this article. The RAD-SNPs were structurally annotated in different coding (47.00 %) and non-coding (53.00 %) sequence components of genes. A subset of 384 RAD-SNPs with broad genome distribution was used to genotype a diverse panel of 95 common bean germplasms and revealed a successful amplification rate of 96.6 %, showing 73 % of polymorphic SNPs within the Andean group and 83 % in the Mesoamerican group. A slightly increased He (0.161, n = 21) value was estimated for the Andean gene pool, compared to the Mesoamerican group (0.156, n = 74). For the linkage disequilibrium (LD) analysis, from a group of 580 SNPs (289 RAD-SNPs and 291 BARC-SNPs) genotyped for the same set of genotypes, 70.2 % were in LD, decreasing to 0.10 % in the Andean group and 0.77 % in the Mesoamerican group. Haplotype patterns spanning 310 Mb of the genome (60 %) were characterized in samples from different origins. However, the haplotype frameworks were under-represented for the Andean (7.85 %) and Mesoamerican (5.55 %) gene pools separately. In conclusion, RAD sequencing allowed the discovery of hundreds of useful SNPs for broad genetic analysis of common bean germplasm. From now on, this approach provides an excellent panel
Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for population structure, conservation genetics and genomic selection studies of large yellow croaker and other fish.
Derzelle, S.; Girault, G.; Roest, H.I.J.; Koene, M.G.J.
Bacillus anthracis, the causative agent of anthrax, has been widely described as a clonal species. Here we report the use of both canonical SNP analysis and whole-genome sequencing to characterize the phylogenetic lineages of B. anthracis from the Netherlands. Eleven strains isolated over a 25-years
González, Jorge; Fuentes, Glenda; Alarcón, Diego; Ruiz, Eduardo
Within a woody plant species, environmental heterogeneity has the potential to influence the distribution of genetic variation among populations through several evolutionary processes. In some species, a relationship between environmental characteristics and the distribution of genotypes can be detected, showing the importance of natural selection as the main source of differentiation. Nothofagus dombeyi (Mirb.) Oerst. (Nothofagaceae) is an endemic tree species occurring both in Chile and in Argentina temperate forests. Postglacial history has been studied with chloroplast DNA and evolutionary forces shaping genetic variation patterns have been analysed with isozymes but fine-scale genetic diversity studies are needed. The study of demographic and selection histories in Nothofagus dombeyi requires more informative markers such as single nucleotide polymorphisms (SNP). Genotyping-by-Sequencing tools now allow studying thousands of SNP markers at reasonable prices in nonmodel species. We investigated more than 10 K SNP loci for signatures of local adaptation and showed that interrogation of genomic resources can identify shifts in genetic diversity and putative adaptive signals in this nonmodel woody species. PMID:27446942
Zhao, Pengshan; Zhang, Jiwei; Qian, Chaoju; Zhou, Qin; Zhao, Xin; Chen, Guoxiong; Ma, Xiao-Fei
The extreme stress tolerance and high nutritional value of sand rice (Agriophyllum squarrosum) make it attractive for use as an alternative crop in response to concerns about ongoing climate change and future food security. However, a lack of genetic information hinders understanding of the mechanisms underpinning the morphological and physiological adaptations of sand rice. In the present study, we sequenced and analyzed the transcriptomes of two individuals representing semi-arid [Naiman (NM)] and arid [Shapotou (SPT)] sand rice genotypes. A total of 105,868 pairwise single nucleotide polymorphisms (SNPs) distributed in 24,712 Unigenes were identified among SPT and NM samples; the average SNP frequency was 0.3% (one SNP per 333 base pair). Characterization of gene annotation demonstrated that variations in genes involved in DNA recombination were associated with the survival of the NM population in the semi-arid environment. A set of genes predicted to be relevant to heat stress response and agronomic traits was functionally annotated using the accumulated knowledge from Arabidopsis and several crop plants, including rice, barley, maize, and sorghum. Four candidate genes related to heat tolerance (heat-shock transcription factor, HsfA1d), seed size (DA1-Related, DAR1), and flowering (early flowering 3, ELF3 and late elongated hypocotyl, LHY) were subjected to analysis of the genetic diversity in 10 natural populations, representing the core germplasm resource across the area of sand rice distribution in China. Only one SNP was detected in each of HsfA1d and DAR1, among 60 genotypes, with two in ELF3 and four in LHY. Nucleotide diversity ranged from 0.00032 to 0.00118. Haplotype analysis indicated that the NM population carried a specific allele for all four genes, suggesting that divergence has occurred between NM and other populations. These four genes could be further analyzed to determine whether they are associated with phenotype variation and identify
Yu, Long-Xi; Zheng, Ping; Bhamidimarri, Suresh; Liu, Xiang-Ping; Main, Dorie
Verticillium wilt (VW) of alfalfa is a soilborne disease causing severe yield loss in alfalfa. To identify molecular markers associated with VW resistance, we used an integrated framework of genome-wide association study (GWAS) with high-throughput genotyping by sequencing (GBS) to identify loci associated with VW resistance in an F1 full-sib alfalfa population. Phenotyping was performed using manual inoculation of the pathogen to cloned plants of each individual and disease severity was scored using a standard scale. Genotyping was done by GBS, followed by genotype calling using three bioinformatics pipelines including the TASSEL-GBS pipeline (TASSEL), the Universal Network Enabled Analysis Kit (UNEAK), and the haplotype-based FreeBayes pipeline (FreeBayes). The resulting numbers of SNPs, marker density, minor allele frequency (MAF) and heterozygosity were compared among the pipelines. The TASSEL pipeline generated more markers with the highest density and MAF, whereas the highest heterozygosity was obtained by the UNEAK pipeline. The FreeBayes pipeline generated tetraploid genotypes, with the least number of markers. SNP markers generated from each pipeline were used independently for marker-trait association. Markers significantly associated with VW resistance identified by each pipeline were compared. Similar marker loci were found on chromosomes 5, 6, and 7, whereas different loci on chromosome 1, 2, 3, and 4 were identified by different pipelines. Most significant markers were located on chromosome 6 and they were identified by all three pipelines. Of those identified, several loci were linked to known genes whose functions are involved in the plants’ resistance to pathogens. Further investigation on these loci and their linked genes would provide insight into understanding molecular mechanisms of VW resistance in alfalfa. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance
Gorlov, Ivan P.; Moore, Jason H.; Peng, Bo; Jin, Jennifer L.; Gorlova, Olga Y.; Amos, Christopher I.
Successful independent replication is the most direct approach for distinguishing real genotype-disease associations from false discoveries in Genome Wide Association Studies (GWAS). Selecting SNPs for replication has been primarily based on p-values from the discovery stage, although additional characteristics of SNPs may be used to improve replication success. We used disease-associated SNPs from more than 2,000 published GWASs to identify predictors of SNP reproducibility. SNP reproducibility was defined as a proportion of successful replications among all replication attempts. The study reporting association for the first time was considered to be discovery and all consequent studies targeting the same phenotype replications. We found that −Log(P), where P is a p-value from the discovery study, is the strongest predictor of the SNP reproducibility. Other significant predictors include type of the SNP (e.g. missense vs intronic SNPs) and minor allele frequency. Features of the genes linked to the disease-associated SNP also predict SNP reproducibility. Based on empirically defined rules, we developed a reproducibility score (RS) to predict SNP reproducibility independently of −Log(P). We used data from two lung cancer GWAS studies as well as recently reported disease-associated SNPs to validate RS. Minus Log(P) outperforms RS when the very top SNPs are selected, while RS works better with relaxed selection criteria. In conclusion, we propose an empirical model to predict SNP reproducibility, which can be used to select SNPs for validation and prioritization. PMID:25273843
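The two quantities this abstract relies on, SNP reproducibility (successful replications divided by all replication attempts) and −Log(P) of the discovery p-value, can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the base-10 logarithm is an assumption, since the abstract does not state the base.

```python
import math


def reproducibility(successes: int, attempts: int) -> float:
    """Proportion of successful replications among all replication attempts."""
    if attempts <= 0:
        raise ValueError("at least one replication attempt is required")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must lie between 0 and attempts")
    return successes / attempts


def minus_log_p(p_value: float) -> float:
    """-log10 of a discovery-stage p-value (base 10 is assumed here)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)
```

For instance, a SNP replicated in 3 of 4 follow-up studies would have a reproducibility of 0.75 under this definition.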
Lin, Hui-Yi; Chen, Dung-Tsa; Huang, Po-Yu
MOTIVATION: Testing SNP-SNP interactions is considered as a key for overcoming bottlenecks of genetic association studies. However, related statistical methods for testing SNP-SNP interactions are underdeveloped. RESULTS: We propose the SNP Interaction Pattern Identifier (SIPI), which tests 45...
Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. 40 SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.
The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized.
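The scale of an exhaustive pairwise search explains the very strict significance level quoted in this abstract. The sketch below counts the unordered SNP pairs for about 300,000 SNPs and derives a Bonferroni-style threshold. The family-wise alpha of 0.05 and the Bonferroni correction itself are assumptions made for illustration; the abstract does not say how its threshold was chosen.

```python
import math


def pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs evaluated in an exhaustive search."""
    return math.comb(n_snps, 2)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level under a Bonferroni correction (assumed)."""
    return alpha / n_tests


n_pairs = pairwise_tests(300_000)          # roughly 4.5e10 pairs
threshold = bonferroni_threshold(n_pairs)  # on the order of 1e-12
print(f"{n_pairs:,} pairs, per-test threshold = {threshold:.2e}")
```

Under these assumptions the per-test threshold comes out on the order of 10^-12, which agrees in magnitude with the level reported above.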
U.S. Department of Health & Human Services — dbSNP is a database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and...
Gumus, Ergun; Gormez, Zeliha; Kursun, Olcay
Biomarker discovery is a challenging task of bioinformatics especially when targeting high dimensional problems such as SNP (single nucleotide polymorphism) datasets. Various types of feature selection methods can be applied to accomplish this task. Typically, using features versus class labels of samples in the training dataset, these methods aim at selecting feature subsets with maximal classification accuracies. Although finding such class-discriminative features is crucial, selection of relevant SNPs for maximizing other properties that exist in the nature of population genetics such as the correlation between genetic diversity and geographical distance of ethnic groups can also be equally important. In this work, a methodology using a multi objective optimization technique called Pareto Optimal is utilized for selecting SNP subsets offering both high classification accuracy and correlation between genomic and geographical distances. In this method, discriminatory power of an SNP is determined using mutual information and its contribution to the genomic-geographical correlation is estimated using its loadings on principal components. Combining these objectives, the proposed method identifies SNP subsets that can better discriminate ethnic groups than those obtained with sole mutual information and yield higher correlation than those obtained with sole principal components on the Human Genome Diversity Project (HGDP) SNP dataset.
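The two ingredients named in this abstract, mutual information as a measure of a SNP's discriminatory power and Pareto optimality across two objectives, can be sketched as follows. This is a simplified illustration rather than the authors' procedure: the function names are hypothetical, genotypes are assumed to be discrete codes, and the second objective is passed in as a precomputed score instead of being derived from principal-component loadings.

```python
import math
from collections import Counter


def mutual_information(genotypes, labels):
    """Mutual information (in bits) between genotype codes and class labels."""
    n = len(genotypes)
    joint = Counter(zip(genotypes, labels))
    p_geno = Counter(genotypes)
    p_label = Counter(labels)
    mi = 0.0
    for (g, c), count in joint.items():
        p_gc = count / n
        mi += p_gc * math.log2(p_gc / ((p_geno[g] / n) * (p_label[c] / n)))
    return mi


def pareto_front(scores):
    """Indices of candidates not dominated on two objectives, both maximized."""
    front = []
    for i, (a1, a2) in enumerate(scores):
        dominated = any(
            b1 >= a1 and b2 >= a2 and (b1 > a1 or b2 > a2)
            for j, (b1, b2) in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

A candidate stays on the front when no other candidate is at least as good on both objectives and strictly better on one, so the front exposes the trade-off between the two criteria instead of collapsing them into a single score.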
Abstract Background Breast cancer predisposition genes identified to date (e.g., BRCA1 and BRCA2) are responsible for less than 5% of all breast cancer cases. Many studies have shown that the cancer risks associated with individual commonly occurring single nucleotide polymorphisms (SNPs) are incremental. However, polygenic models suggest that multiple commonly occurring low to modestly penetrant SNPs of cancer related genes might have a greater effect on a disease when considered in combination. Methods In an attempt to identify the breast cancer risk conferred by SNP interactions, we have studied 19 SNPs from genes involved in major cancer related pathways. All SNPs were genotyped by TaqMan 5' nuclease assay. The association between the case-control status and each individual SNP, measured by the odds ratio and its corresponding 95% confidence interval, was estimated using unconditional logistic regression models. At the second stage, two-way interactions were investigated using multivariate logistic models. The robustness of the interactions, which were observed among SNPs with stronger functional evidence, was assessed using a bootstrap approach, and correction for multiple testing based on the false discovery rate (FDR) principle. Results None of these SNPs contributed to breast cancer risk individually. However, we have demonstrated evidence for gene-gene (SNP-SNP) interaction among these SNPs, which were associated with increased breast cancer risk. Our study suggests cross talk between the SNPs of the DNA repair and immune system (XPD-[Lys751Gln] and IL10-[G(-1082)A]), cell cycle and estrogen metabolism (CCND1-[Pro241Pro] and COMT-[Met108/158Val]), cell cycle and DNA repair (BARD1-[Pro24Ser] and XPD-[Lys751Gln]), and within carcinogen metabolism (GSTP1-[Ile105Val] and COMT-[Met108/158Val]) pathways. Conclusion The importance of these pathways and their communication in breast cancer predisposition has been emphasized previously, but their
Doan, Tram B; Eriksson, Natalie A; Graham, Dinny; Funder, John W; Simpson, Evan R; Kuczek, Elizabeth S; Clyne, Colin; Leedman, Peter J; Tilley, Wayne D; Fuller, Peter J; Muscat, George E O; Clarke, Christine L
Although molecular signatures based on transcript expression in breast cancer samples have provided new insights into breast cancer classification and prognosis, there are acknowledged limitations in current signatures. To provide rational, pathway-based signatures of disrupted physiology in cancer tissues that may be relevant to prognosis, this study has directly quantitated changed gene expression, between normal breast and cancer tissue, as a basis for signature development. The nuclear receptor (NR) family of transcription factors, and their coregulators, are fundamental regulators of every aspect of metazoan life, and were rigorously quantified in normal breast tissues and ERα positive and ERα negative breast cancers. Coregulator expression was highly correlated with that of selected NR in normal breast, particularly from postmenopausal women. These associations were markedly decreased in breast cancer, and the expression of the majority of coregulators was down-regulated in cancer tissues compared with normal. While in cancer the loss of NR-coregulator associations observed in normal breast was common, a small number of NR (Rev-ERBβ, GR, NOR1, LRH-1 and PGR) acquired new associations with coregulators in cancer tissues. Elevated expression of these NR in cancers was associated with poorer outcome in large clinical cohorts, as well as suggesting the activation of ERα -related, but ERα-independent, pathways in ERα negative cancers. In addition, the combined expression of small numbers of NR and coregulators in breast cancer was identified as a signature predicting outcome in ERα negative breast cancer patients, not linked to proliferation and with predictive power superior to existing signatures containing many more genes. These findings highlight the power of predictive signatures derived from the quantitative determination of altered gene expression between normal breast and breast cancers. Taken together, the findings of this study identify networks
Jing Fan; Jennifer G.Dy; Chung-Che Chang; Xiaobo Zhou
Myelodysplastic syndromes have increased in frequency and incidence in the American population, but patient prognosis has not significantly improved over the last decade. Such improvements could be realized if biomarkers for accurate diagnosis and prognostic stratification were successfully identified. In this study, we propose a method that associates two state-of-the-art array technologies, single nucleotide polymorphism (SNP) array and gene expression array, with gene motifs considered transcription factor-binding sites (TFBS). We are particularly interested in SNP-containing motifs introduced by genetic variation and mutation as TFBS. The potential regulation of SNP-containing motifs affects only when certain mutations occur. These motifs can be identified from a group of co-expressed genes with copy number variation. Then, we used a sliding window to identify motif candidates near SNPs on gene sequences. The candidates were filtered by coarse thresholding and fine statistical testing. Using the regression-based LARS-EN algorithm and a level-wise sequence combination procedure, we identified 28 SNP-containing motifs as candidate TFBS. We confirmed 21 of the 28 motifs with ChIP-chip fragments in the TRANSFAC database. Another six motifs were validated by TRANSFAC via searching binding fragments on coregulated genes. The identified motifs and their location genes can be considered potential biomarkers for myelodysplastic syndromes. Thus, our proposed method, a novel strategy for associating two data categories, is capable of integrating information from different sources to identify reliable candidate regulatory SNP-containing motifs introduced by genetic variation and mutation.
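The sliding-window step mentioned in this abstract can be illustrated with a short sketch that enumerates every fixed-width window of a sequence that covers a given SNP position. The function name, the zero-based indexing, and the fixed window width are assumptions made for illustration; the abstract does not specify how the window was defined, and the later filtering and LARS-EN steps are not shown.

```python
def windows_containing_snp(sequence: str, snp_index: int, width: int):
    """Return (start, subsequence) for each window of `width` covering the SNP.

    `snp_index` is a zero-based position in `sequence` (an assumed convention).
    """
    if not 0 <= snp_index < len(sequence):
        raise ValueError("snp_index is outside the sequence")
    if not 0 < width <= len(sequence):
        raise ValueError("width must be between 1 and the sequence length")
    first = max(0, snp_index - width + 1)            # leftmost window covering the SNP
    last = min(snp_index, len(sequence) - width)     # rightmost window that still fits
    return [(start, sequence[start:start + width]) for start in range(first, last + 1)]
```

Each returned subsequence is a motif candidate in the loose sense used above; a real pipeline would then score and filter these candidates.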
Schoenfelder, Stefan; Sexton, Tom; Chakalova, Lyubomira; Cope, Nathan F; Horton, Alice; Andrews, Simon; Kurukuti, Sreenivasulu; Mitchell, Jennifer A; Umlauf, David; Dimitrova, Daniela S; Eskiw, Christopher H; Luo, Yanquan; Wei, Chia-Lin; Ruan, Yijun; Bieker, James J; Fraser, Peter
The discovery of interchromosomal interactions in higher eukaryotes points to a functional interplay between genome architecture and gene expression, challenging the view of transcription as a one-dimensional process. However, the extent of interchromosomal interactions and the underlying mechanisms are unknown. Here we present the first genome-wide analysis of transcriptional interactions using the mouse globin genes in erythroid tissues. Our results show that the active globin genes associate with hundreds of other transcribed genes, revealing extensive and preferential intra- and interchromosomal transcription interactomes. We show that the transcription factor Klf1 mediates preferential co-associations of Klf1-regulated genes at a limited number of specialized transcription factories. Our results establish a new gene expression paradigm, implying that active co-regulated genes and their regulatory factors cooperate to create specialized nuclear hot spots optimized for efficient and coordinated transcriptional control.
Albrechtsen, Anders; Nielsen, Finn Cilius; Nielsen, Rasmus
Chip-based high-throughput genotyping has facilitated genome-wide studies of genetic diversity. Many studies have utilized these large data sets to make inferences about the demographic history of human populations using measures of genetic differentiation such as F(ST) or principal component...... analyses. However, the single nucleotide polymorphism (SNP) chip data suffer from ascertainment biases caused by the SNP discovery process in which a small number of individuals from selected populations are used as discovery panels. In this study, we investigate the effect of the ascertainment bias...... on inferences regarding genetic differentiation among populations in one of the common genome-wide genotyping platforms. We generate SNP genotyping data for individuals that previously have been subject to partial genome-wide Sanger sequencing and compare inferences based on genotyping data to inferences based...
Xie, Zhengzhi; Ma, Xiaoqiang; Gang, David R
Turmeric is an excellent example of a plant that produces large numbers of metabolites from diverse metabolic pathways or networks. It is hypothesized that these metabolic pathways or networks contain biosynthetic modules, which lead to the formation of metabolite modules-groups of metabolites whose production is co-regulated and biosynthetically linked. To test whether such co-regulated metabolite modules do exist in this plant, metabolic profiling analysis was performed on turmeric rhizome samples that were collected from 16 different growth and development treatments, which had significant impacts on the levels of 249 volatile and non-volatile metabolites that were detected. Importantly, one of the many co-regulated metabolite modules that were indeed readily detected in this analysis contained the three major curcuminoids, whereas many other structurally related diarylheptanoids belonged to separate metabolite modules, as did groups of terpenoids. The existence of these co-regulated metabolite modules supported the hypothesis that the 3-methoxyl groups on the aromatic rings of the curcuminoids are formed before the formation of the heptanoid backbone during the biosynthesis of curcumin and also suggested the involvement of multiple polyketide synthases with different substrate selectivities in the formation of the array of diarylheptanoids detected in turmeric. Similar conclusions about terpenoid biosynthesis could also be made. Thus, discovery and analysis of metabolite modules can be a powerful predictive tool in efforts to understand metabolism in plants.
Helm, Jonathan L; Sbarra, David A; Ferrer, Emilio
Questions surrounding physiological interdependence in romantic relationships are gaining increased attention in the research literature. One specific form of interdependence, coregulation, can be defined as the bidirectional linkage of oscillating signals within optimal bounds. Conceptual and theoretical work suggests that physiological coregulation should be instantiated in romantic couples. Although these ideas are appealing, the central tenets of most coregulatory models await empirical evaluation. In the current study, we evaluate the covariation of respiratory sinus arrhythmia (RSA) in 32 romantic couples during a series of laboratory tasks using a cross-lagged panel model. During the tasks, men's and women's RSA were associated with their partners' previous RSA responses, and this pattern was stronger for those couples with higher relationship satisfaction. The findings are discussed in terms of their implications for attachment theory, as well as the association between relationships and health.
Abstract Background Genome-wide single-nucleotide polymorphism (SNP) arrays containing hundreds of thousands of SNPs from the human genome have proven useful for studying important human genome questions. Data quality of SNP arrays plays a key role in the accuracy and precision of downstream data analyses. However, good indices for assessing data quality of SNP arrays have not yet been developed. Results We developed new quality indices to measure the quality of SNP arrays and/or DNA samples and investigated their statistical properties. The indices quantify a departure of estimated individual-level allele frequencies (AFs) from expected frequencies via standardized distances. The proposed quality indices followed lognormal distributions in several large genomic studies that we empirically evaluated. AF reference data and quality index reference data for different SNP array platforms were established based on samples from various reference populations. Furthermore, a confidence interval method based on the underlying empirical distributions of quality indices was developed to identify poor-quality SNP arrays and/or DNA samples. Analyses of authentic biological data and simulated data show that this new method is sensitive and specific for the detection of poor-quality SNP arrays and/or DNA samples. Conclusions This study introduces new quality indices, establishes references for AFs and quality indices, and develops a detection method for poor-quality SNP arrays and/or DNA samples. We have developed a new computer program that utilizes these methods called SNP Array Quality Control (SAQC). SAQC software is written in R and R-GUI and was developed as a user-friendly tool for the visualization and evaluation of data quality of genome-wide SNP arrays. The program is available online (http://www.stat.sinica.edu.tw/hsinchou/genetics/quality/SAQC.htm).
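The idea of flagging poor-quality arrays from a lognormally distributed quality index can be sketched as below. This is not SAQC's implementation (SAQC is written in R, and the abstract does not give its formulas): the sketch assumes that larger index values mean worse quality, fits a lognormal to reference values by taking the mean and standard deviation of their logarithms, and uses a one-sided upper bound. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_upper_bound(reference_qi, confidence: float = 0.95) -> float:
    """One-sided upper bound for a quality index assumed to be lognormal."""
    logs = [math.log(q) for q in reference_qi]   # reference values must be > 0
    mu, sigma = mean(logs), stdev(logs)
    z = NormalDist().inv_cdf(confidence)         # standard normal quantile
    return math.exp(mu + z * sigma)


def flag_poor_quality(sample_qi: dict, bound: float) -> list:
    """Names of samples whose quality index exceeds the reference bound."""
    return [name for name, qi in sample_qi.items() if qi > bound]


reference = [1.0, 1.2, 0.9, 1.1, 1.05, 0.95]     # made-up reference indices
bound = lognormal_upper_bound(reference)
flagged = flag_poor_quality({"array_1": 1.0, "array_2": 3.0}, bound)
```

The reference values and sample names here are invented solely to make the sketch runnable; in practice the reference distribution would come from platform-specific reference populations, as the abstract describes.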
performance of snape to that of other packages. Conclusions We present a software which helps in calling SNPs in pooled samples: it has good power while retaining a low false discovery rate (FDR). The method also provides the posterior probability that a SNP is segregating and the full posterior distribution of f for every SNP. In order to test the behaviour of our software, we generated (through simulated coalescence) artificial genomes and computed the effect of a pooled sequencing protocol, followed by SNP calling. In this setting, snape has better power and False Discovery Rate (FDR) than the comparable packages samtools, PoPoolation, Varscan: for N = 50 chromosomes, snape has power ≈ 35% and FDR ≈ 2.5%. snape is available at http://code.google.com/p/snape-pooled/ (source code and precompiled binaries).
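Power and FDR, the two figures used above to compare SNP callers on simulated genomes, can be computed from the set of truly segregating sites and the set of called sites. The sketch below uses the standard definitions (power = TP / (TP + FN), FDR = FP / (TP + FP)); the function name and the representation of sites as sets of positions are assumptions, not part of snape.

```python
def power_and_fdr(true_snps: set, called_snps: set):
    """Return (power, FDR) for a set of SNP calls against known true SNPs."""
    tp = len(true_snps & called_snps)     # true SNPs that were called
    fp = len(called_snps - true_snps)     # calls at non-segregating sites
    fn = len(true_snps - called_snps)     # true SNPs that were missed
    power = tp / (tp + fn) if true_snps else float("nan")
    fdr = fp / (tp + fp) if called_snps else 0.0
    return power, fdr
```

With four true SNPs at positions {1, 2, 3, 4} and calls at {1, 2, 9}, this gives a power of 0.5 and an FDR of one third.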
Nicholas A. Tinker
Recognizing a need in cultivated hexaploid oat (Avena sativa L.) for a reliable set of reference single nucleotide polymorphisms (SNPs), we have developed a 6000 (6K) BeadChip design containing 257 Infinium I and 5486 Infinium II designs corresponding to 5743 SNPs. Of those, 4975 SNPs yielded successful assays after array manufacturing. These SNPs were discovered based on a variety of bioinformatics pipelines in complementary DNA (cDNA) and genomic DNA originating from 20 or more diverse oat cultivars. The array was validated in 1100 samples from six recombinant inbred line (RIL) mapping populations and sets of diverse oat cultivars and breeding lines, and provided approximately 3500 discernible Mendelian polymorphisms. Here, we present an annotation of these SNPs, including methods of discovery, gene identification and orthology, population-genetic characteristics, and tentative positions on an oat consensus map. We also evaluate a new cluster-based method of calling SNPs. The SNP design sequences are made publicly available, and the full SNP genotyping platform is available for commercial purchase from an independent third party.
Graham, J D; Bain, D L; Richer, J K; Jackson, T A; Tung, L; Horwitz, K B
The development of tamoxifen resistance and consequent disease progression are common occurrences in breast cancers, often despite the continuing expression of estrogen receptors (ER). Tamoxifen is a mixed antagonist, having both agonist and antagonist properties. We have suggested that the development of tamoxifen resistance is associated with an increase in its agonist-like properties, resulting in loss of antagonist effects or even inappropriate tumor stimulation. Nuclear receptor function is influenced by a family of transcriptional coregulators, that either enhance or suppress transcriptional activity. Using a mixed antagonist-biased two-hybrid screening strategy, we identified two such proteins: the human homolog of the nuclear receptor corepressor, N-CoR, and a novel coactivator, L7/SPA (Switch Protein for Antagonists). In transcriptional studies, N-CoR suppressed the agonist properties of tamoxifen and RU486, and L7/SPA increased agonist effects. We speculated that the relative levels of these coactivators and corepressors may determine the balance of agonist and antagonist properties of mixed antagonists, such as tamoxifen. Using quantitative RT-PCR, we, therefore, measured the levels of transcripts encoding these coregulators, as well as the corepressor SMRT, and the coactivator SRC-1, in a small cohort of tamoxifen-resistant and sensitive breast tumors. The results suggest that tumor sensitivity to mixed antagonists may be governed by a complex set of transcription factors, which we are only now beginning to understand.
Scaglione Davide; Acquadro Alberto; Portis Ezio; Tirone Matteo; Knapp Steven J; Lanteri Sergio
Abstract Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic ...
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was the design of a 50-60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable for use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome, stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed: Alpine, Boer, Creole, Katjang, Saanen and Savanna), comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc.) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools, proved to be efficient in producing this GoatSNP50 chip. Advances in the use of molecular markers are expected to accelerate goat genomic studies in coming years.
Butler, John M.; Budowle, B.; Gill, P.;
Six scientists presented their views and experience with single nucleotide polymorphism (SNP) markers, multiplexes, and methods regarding their potential application in forensic identity and relationship testing. Benefits and limitations of SNPs were reviewed, as were different SNP marker categor...
Identification of single nucleotide polymorphisms (SNPs) and mutations is important for the discovery of genetic predisposition to complex diseases. PCR resequencing is the method of choice for de novo SNP discovery. However, manual curation of putative SNPs has been a major bottleneck in the application of this method to high-throughput screening. Therefore it is critical to develop a more sensitive and accurate computational method for automated SNP detection. We developed a software tool, SNPdetector, for automated identification of SNPs and mutations in fluorescence-based resequencing reads. SNPdetector was designed to model the process of human visual inspection and has a very low false positive and false negative rate. We demonstrate the superior performance of SNPdetector in SNP and mutation analysis by comparing its results with those derived by human inspection, PolyPhred (a popular SNP detection tool), and independent genotype assays in three large-scale investigations. The first study identified and validated inter- and intra-subspecies variations in 4,650 traces of 25 inbred mouse strains that belong to either the Mus musculus species or the M. spretus species. Unexpected heterozygosity in CAST/Ei strain was observed in two out of 1,167 mouse SNPs. The second study identified 11,241 candidate SNPs in five ENCODE regions of the human genome covering 2.5 Mb of genomic sequence. Approximately 50% of the candidate SNPs were selected for experimental genotyping; the validation rate exceeded 95%. The third study detected ENU-induced mutations (at 0.04% allele frequency) in 64,896 traces of 1,236 zebrafish. Our analysis of three large and diverse test datasets demonstrated that SNPdetector is an effective tool for genome-scale research and for large-sample clinical studies. SNPdetector runs on Unix/Linux platform and is available publicly (http://lpg.nci.nih.gov).
Panitz, Frank; Nielsen, Rasmus Ory; van Houdt, Jeroen K J
(Solea solea). For each species, a total of 7-8 individuals from 4-5 populations, that in most cases spanned major parts of the species’ distributions, were individually tagged and subsequently pooled for cDNA library construction. After 454 sequencing reads were de-multiplexed, cleaned, repeat...... using GigaBayes (http://bioinformatics.bc.edu/marthlab/GigaBayes). From the predicted polymorphic sites potential SNPs were filtered for intron-exon boundaries and visually inspected. We selected 1,536 candidate SNPs which were then used to design probes for validation by large-scale genotyping assays...
Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A.; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L.; Alexandrov, Nickolai
We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. PMID:27899667
17-beta estradiol (E2), a steroid hormone that plays a critical role in various cellular processes such as cell proliferation, differentiation, migration and apoptosis, is essential for reproduction and mammary gland development. E2 actions are mediated by two classical nuclear hormone receptors, estrogen receptor alpha and beta (ERs). The activity of ERs depends on the coordinate activity of ligand binding, posttranslational modification, and, importantly, their interaction with partner proteins called ‘coregulators’. Because the majority of breast cancers are ERalpha positive and coregulators have proved to be crucial for ER transcriptional activity, increased interest in the field has led to the identification of a large number of coregulators. In the last decade, gene knockout studies using mouse models have furthered our understanding of the role of these coregulators in mammary gland development. Several coregulators appear to be critical for terminal end bud formation, ductal branching and alveologenesis during mammary gland development. Emerging studies indicate that, in addition to these coregulators, the other ER partner proteins, ‘pioneering factors’, also seem to contribute significantly to E2 signaling and mammary cell fate. This review discusses emerging themes in coregulator- and pioneering factor-mediated action on ER functions, particularly their role in mammary gland cell fate and development.
Mohammadnejad, Afsaneh; Brasch-Andersen, Charlotte; Haagerup, Annette
Background: Allergic Rhinitis (AR) is a complex disorder that affects many people around the world. There is a high genetic contribution to the development of AR, as twin and family studies have estimated heritability of more than 33%. Due to the complex nature of the disease, single SNP...... analysis has limited power in identifying the genetic variations for AR. We combined genome-wide association analysis (GWAS) with polygenic risk score (PRS) in exploring the genetic basis underlying the disease. Methods: We collected clinical data on 631 Danish subjects with AR cases consisting of 434...... sibling pairs and unrelated individuals and control subjects of 197 unrelated individuals. SNP genotyping was done by Affymetrix Genome-Wide Human SNP Array 5.0. SNP imputation was performed using "IMPUTE2". Using additive effect model, GWAS was conducted in discovery sample, the genotypes...
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...
Tosser-klopp, G.; Bardou, P.; Bouchez, O.; Cabau, C.; Crooijmans, R.P.M.A.; Dong, Y.; Donnadieu-Tonon, C.; Eggen, A.; Heuven, H.C.M.; Jamli, S.; Jiken, A.J.; Klopp, C.; Lawley, C.T.; McEwen, J.; Martin, P.; Moreno, C.R.; Mulsant, P.; Nabihoudine, I.; Pailhoux, E.; Palhiere, I.; Rupp, R.; Sarry, J.; Sayre, B.L.; Tircazes, A.; Wang, J.; Wang, W.; Zhang, W.G.
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was design of a 50–60
Abstract Background Transcriptional profiling of prostate cancer (PC) has unveiled new markers of neoplasia and allowed insights into mechanisms underlying this disease. Genomewide analyses have also identified new chromosomal abnormalities associated with PC. The combination of both classes of data for the same sample cohort might provide better criteria for identifying relevant factors involved in neoplasia. Here we describe transcriptional signatures identifying distinct normal and tumoral prostate tissue compartments, and the inference and demonstration of a new, highly recurrent copy number gain on chromosome 17q25.3. Methods We have applied transcriptional profiling to tumoral and non-tumoral prostate samples with relatively homogeneous epithelial representations as well as pure stromal tissue from peripheral prostate and cultured cell lines, followed by quantitative RT-PCR validations and immunohistochemical analysis. In addition, we have performed in silico colocalization analysis of co-regulated genes and validation by fluorescent in situ hybridization (FISH). Results The transcriptomic analysis has allowed us to identify signatures corresponding to non-tumoral luminal and tumoral epithelium, basal epithelial cells, and prostate stromal tissue. In addition, in silico analysis of co-regulated expression of physically linked genes has allowed us to predict the occurrence of a copy number gain at chromosomal region 17q25.3. This computational inference was validated by fluorescent in situ hybridization, which showed gains in this region in over 65% of primary and metastatic tumoral samples. Conclusion Our approach makes it possible to link gene copy number variations directly with transcript co-regulation in association with neoplastic states. Therefore, transcriptomic studies of carefully selected samples can unveil new diagnostic markers and transcriptional signatures highly specific to PC, and lead to the discovery of novel genomic abnormalities
Mercedes Muñoz-Saldaña, Ph.D.
On 17 November 2009 the first co-regulation code for the audiovisual media sector was established in Spain: “2010 Co-regulation Code for the Quality of Audiovisual Contents in Navarra”. This Code is pioneering in the field and, taking into account the content of the recently approved General Law on Audiovisual Communication, is an example of the kind of work that shall be carried out in the future by Spain’s National Media Council (Consejo Estatal de Medios Audiovisuales, a.k.a. CEMA) or the corresponding regulatory body. This initiative shows the need to apply co-regulatory codes to the national systems of regulation in the audiovisual sector, as the European institutions urged in their latest Directive in 2010. This article addresses three issues that demonstrate the need for and advantages of applying co-regulation practices to guarantee the protection of minors, pluralism, and the promotion of media literacy: the failure of traditional regulatory instruments and the inefficiency of self-regulation; the conceptual definition of co-regulation as an instrument separated from self-regulation and regulation; and the added value of co-regulation in its application to concrete areas.
Genome-wide association studies (GWASs) have identified low-penetrance common variants (i.e., single nucleotide polymorphisms, SNPs) associated with breast cancer susceptibility. Although GWASs are primarily focused on single-locus effects, gene-gene interactions (i.e., epistasis) are also assumed to contribute to the genetic risks for complex diseases including breast cancer. While it has been hypothesized that moderately ranked (P value based) weak single-locus effects in GWASs could potentially harbor valuable information for evaluating epistasis, we lack systematic efforts to investigate SNPs showing consistent associations with weak statistical significance across independent discovery and replication stages. The objectives of this study were (i) to select SNPs showing single-locus effects with weak statistical significance for breast cancer in a GWAS and/or candidate-gene studies; (ii) to replicate these SNPs in an independent set of breast cancer cases and controls; and (iii) to explore their potential SNP-SNP interactions contributing to breast cancer susceptibility. A total of 17 SNPs related to DNA repair, modification and metabolism pathway genes were selected since these pathways offer a priori knowledge for potential epistatic interactions and an overall role in breast carcinogenesis. The study design included predominantly Caucasian women (2,795 cases and 4,505 controls) from Alberta, Canada. We observed two two-way SNP-SNP interactions (APEX1-rs1130409 and RPAP1-rs2297381; MLH1-rs1799977 and MDM2-rs769412) in logistic regression that conferred elevated risks for breast cancer (P(interaction) < 7.3 × 10⁻³). Logic regression identified an interaction involving four SNPs (MBD2-rs4041245, MLH1-rs1799977, MDM2-rs769412, BRCA2-rs1799943) (P(permutation) = 2.4 × 10⁻³). SNPs involved in SNP-SNP interactions also showed single-locus effects with weak statistical significance, while BRCA2-rs1799943 showed stronger statistical significance (P
Lee, Nayoung; Park, Jeongmoo; Kim, Keunhwa; Choi, Giltsu
PHYTOCHROME-INTERACTING FACTOR1 (PIF1) is a basic helix-loop-helix transcription factor that inhibits light-dependent seed germination in Arabidopsis thaliana. However, it remains unclear whether PIF1 requires other factors to regulate its direct targets. Here, we demonstrate that LEUNIG_HOMOLOG (LUH), a Groucho family transcriptional corepressor, binds to PIF1 and coregulates its targets. Not only are the transcriptional profiles of the luh and pif1 mutants remarkably similar, more than 80% of the seeds of both genotypes germinate in the dark. We show by chromatin immunoprecipitation that LUH binds a subset of PIF1 targets in a partially PIF1-dependent manner. Unexpectedly, we found LUH binds and coregulates not only PIF1-activated targets but also PIF1-repressed targets. Together, our results indicate LUH functions with PIF1 as a transcriptional coregulator to inhibit seed germination.
Abstract Background PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. Results The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information with multiple functionality about SNPs, such as SNP retrieval to multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. Conclusions The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2.
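The core idea behind restriction enzyme mining for a SNP can be sketched in a few lines of Python: an enzyme is informative for PCR-RFLP when its recognition site is present with one allele and absent with the other. This is only an illustrative sketch, not the SNP-RFLPing 2 implementation. The function names, the four-enzyme table and the example flanking sequences are assumptions made for the illustration. The recognition sequences themselves are the standard ones for these enzymes.

```python
# Recognition sequences of four common restriction enzymes (all palindromic,
# so scanning one strand is sufficient). The table is a tiny illustrative subset.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "TaqI": "TCGA",
}


def count_sites(seq, site):
    """Count occurrences of `site` in `seq`, allowing overlaps."""
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def discriminating_enzymes(left_flank, alleles, right_flank):
    """Return {enzyme: {allele: site_count}} for enzymes whose number of
    recognition sites differs between the alleles (hypothetical helper)."""
    result = {}
    for name, site in ENZYMES.items():
        counts = {
            allele: count_sites((left_flank + allele + right_flank).upper(), site)
            for allele in alleles
        }
        if len(set(counts.values())) > 1:
            result[name] = counts
    return result


# Made-up example: an A/G SNP flanked by ACGGA and TTCAA.
hits = discriminating_enzymes("ACGGA", ["A", "G"], "TTCAA")
print(hits)
```

In this made-up example only the A allele completes a GAATTC site, so EcoRI would digest the A-allele amplicon and leave the G-allele amplicon intact. A real tool would also need to handle degenerate recognition sequences, mutagenic primers and additional sites elsewhere in the amplicon.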
Damgaard, Christian Kroun; Lykke-Andersen, Jens
The response of cells to changes in their environment often requires coregulation of gene networks, but little is known about how this can occur at the post-transcriptional level. An important example of post-transcriptional coregulation is the selective translational regulation in response......-associated TIA-1 and TIAR proteins as key factors in human 5'TOP mRNA regulation, which upon amino acid starvation assemble onto the 5' end of 5'TOP mRNAs and arrest translation at the initiation step, as evidenced by TIA-1/TIAR-dependent 5'TOP mRNA translation repression, polysome release, and accumulation...
Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming
In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
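The random walk with restart (RWR) iteration mentioned above can be illustrated on a toy network. The sketch below is a generic textbook form of RWR, not the authors' code. The graph, the gene labels A to E, the restart probability of 0.3 and the function name are all assumptions chosen for the example. At each step the walker either returns to a seed gene with probability `restart` or moves to a uniformly chosen neighbour.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj   : dict mapping each node to a list of its neighbours (undirected graph)
    seeds : set of known disease genes; the walker restarts uniformly among them
    Assumes every node has at least one neighbour, so probability mass is conserved.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Spread the non-restarting mass of u evenly over its neighbours.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical five-gene network with one known disease gene, "A".
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
ranking = sorted((g for g in scores if g != "A"), key=scores.get, reverse=True)
print(ranking)
```

Candidate genes are then prioritized by their steady-state score, so genes that are close to the seed in the network rank ahead of distant ones. Integrating two networks, as in the study, would amount to running the same iteration on a combined adjacency structure.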
Marco Di Stefano
The connection between chromatin nuclear organization and gene activity is vividly illustrated by the observation that transcriptional coregulation of certain genes appears to be directly influenced by their spatial proximity. This fact poses the more general question of whether it is at all feasible that the numerous genes that are coregulated on a given chromosome, especially those at large genomic distances, might become proximate inside the nucleus. This problem is studied here using steered molecular dynamics simulations in order to enforce the colocalization of thousands of knowledge-based gene sequences on a model for the gene-rich human chromosome 19. Remarkably, it is found that most (≈ 88%) gene pairs can be brought simultaneously into contact. This is made possible by the low degree of intra-chromosome entanglement and the large number of cliques in the gene coregulatory network. A clique is a set of genes coregulated all together as a group. The constrained conformations for the model chromosome 19 are further shown to be organized in spatial macrodomains that are similar to those inferred from recent HiC measurements. The findings indicate that gene coregulation and colocalization are largely compatible and that this relationship can be exploited to draft the overall spatial organization of the chromosome in vivo. The more general validity and implications of these findings could be investigated by applying to other eukaryotic chromosomes the general and transferable computational strategy introduced here.
Abstract Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach permitted the generation of a large and robust SNP dataset by the adoption of optimized filtering criteria.
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is a daunting task for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a
Yoshinaga Yoshimura, Tomoko Ohtake, Hajime Okada, Takehiro Ami, Tadashi Tsukaguchi and Kenzo Fujimoto
We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.
Yoshimura, Yoshinaga; Ohtake, Tomoko; Okada, Hajime; Fujimoto, Kenzo [School of Materials Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292 (Japan); Ami, Takehiro [Innovation Plaza Ishikawa, Japan Science and Technology Agency, 2-13 Asahidai, Nomi, Ishikawa 923-1211 (Japan); Tsukaguchi, Tadashi [Faculty of Bioresources and Environmental Sciences, Ishikawa Prefectural University, 1-308 Suematsu, Nonoichi, Ishikawa 921-8836 (Japan)
We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.
non-tagSNP -: SNP not included in LD bin calculation (MAF Best tagSNP The flag that indicates whether the SNP is the best... tagSNP or not. 1: Best tagSNP 0: non-best tagSNP -: SNP
Aldred Micheala A
Abstract Background The recent discovery of widespread copy number variation in humans has forced a shift away from the assumption of two copies per locus per cell throughout the autosomal genome. In particular, a SNP site can no longer always be accurately assigned one of three genotypes in an individual. In the presence of copy number variability, the individual may theoretically harbor any number of copies of each of the two SNP alleles. Results To address this issue, we have developed a method to infer a "generalized genotype" from raw SNP microarray data. Here we apply our approach to data from 48 individuals and uncover thousands of aberrant SNPs, most in regions that were previously unreported as copy number variants. We show that our allele-specific copy numbers follow Mendelian inheritance patterns that would be obscured in the absence of SNP allele information. The interplay between duplication and point mutation in our data sheds light on the relative frequencies of these events in human history, showing that at least some of the duplication events were recurrent. Conclusion This new multi-allelic view of SNPs has a complicated role in disease association studies, and further work will be necessary in order to accurately assess its importance. Software to perform generalized genotyping from SNP array data is freely available online [1].
Presented are four mathematical discoveries made by students on an arithmetical function using the Fibonacci sequence. Discussed is the nature of the role of the teacher in directing the students' discovery activities. (KR)
Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis
The price discovery literature investigates how homogenous securities traded on different markets incorporate information into prices. We take this literature one step further and investigate how these markets contribute to stochastic volatility (volatility discovery). We formally show...... that the realized measures from homogenous securities share a fractional stochastic trend, which is a combination of the price and volatility discovery measures. Furthermore, we show that volatility discovery is associated with the way that market participants process information arrival (market sensitivity...
Evers, Nynke M; van den Berg, Johannes H J; Wang, Si; Melchers, Diana; Houtman, René; de Haan, Laura H J; Ederveen, Antwan G H; Groten, John P; Rietjens, Ivonne M C M
The aim of the present study was to investigate modulation of the interaction of the ERα and ERβ with coregulators in the ligand responses induced by estrogenic compounds. To this end, selective ERα and ERβ agonists were characterized for intrinsic relative potency reflected by EC50 and maximal efficacy towards ERα and ERβ mediated response in ER selective reporter gene assays, and subsequently tested for induction of cell proliferation in T47D-ERβ cells with variable ERα/ERβ ratio, and finally for ligand dependent modulation of the interaction of ERα and ERβ with coregulators using the MARCoNI assay, with 154 unique nuclear receptor coregulator peptides derived from 66 different coregulators. Results obtained reveal an important influence of the ERα/ERβ ratio and receptor selectivity of the compounds tested on induction of cell proliferation. ERα agonists activate cell proliferation whereas ERβ suppresses ERα mediated cell proliferation. The responses in the MARCoNI assay reveal that upon ERα or ERβ activation by a specific agonist, the modulation of the interaction of the ERs with coregulators is very similar indicating only a limited number of differences upon ERα or ERβ activation by a specific ligand. Differences in the modulation of the interaction of the ERs with coregulators between the different agonists were more pronounced. Based on ligand dependent differences in the modulation of the interaction of the ERs with coregulators, the MARCoNI assay was shown to be able to classify the ER agonists discriminating between different agonists for the same receptor, a characteristic not defined by the ER selective reporter gene or proliferation assays. It is concluded that the ultimate effect of the model compounds on proliferation of estrogen responsive cells depends on the intrinsic relative potency of the agonist towards ERα and ERβ and the cellular ERα/ERβ ratio whereas differences in the modulation of the interaction of the ERα and
Abstract Background Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita. This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research
Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross
High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle.
Androutsellis-Theotokis, A.; Chrousos, G. P.; McKay, R. D.; DeCherney, A. H.; Kino, T.
Neural stem cells (NSCs) are pluripotent precursors with the ability to proliferate and differentiate into 3 neural cell lineages, neurons, astrocytes and oligodendrocytes. Elucidation of the mechanisms underlying these biologic processes is essential for understanding both physiologic and pathologic neural development and regeneration after injury. Nuclear hormone receptors (NRs) and their transcriptional coregulators also play crucial roles in neural development, functions and fate. To identify key NRs and their transcriptional regulators in NSC differentiation, we examined mRNA expression of 49 NRs and many of their coregulators during differentiation (0–5 days) of mouse embryonic NSCs induced by withdrawal of fibroblast growth factor-2 (FGF2). 37 out of 49 NRs were expressed in NSCs before induction of differentiation, while receptors known to play major roles in neural development, such as THRα, RXRs, RORs, TRs, and COUPTFs, were highly expressed. CAR, which plays important roles in xenobiotic metabolism, was also highly expressed. FGF2 withdrawal induced mRNA expression of RORγ, RXRγ, and MR by over 20-fold. Most of the transcriptional coregulators examined were expressed basally and throughout differentiation without major changes, while FGF2 withdrawal strongly induced mRNA expression of several histone deacetylases (HDACs), including HDAC11. Dexamethasone and aldosterone, respectively a synthetic glucocorticoid and natural mineralocorticoid, increased NSC numbers and induced differentiation into neurons and astrocytes. These results indicate that the NRs and their coregulators are present and/or change their expression during NSC differentiation, suggesting that they may influence development of the central nervous system in the absence or presence of their ligands. PMID:22990992
Richard A Notebaart
To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools.
Yu-Hai Zhao; Guo-Ren Wang; Ying Yin; Guang-Yu Xu
As explored by biologists, there is a real and emerging need to identify co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, the existing pattern-based and tendency-based clustering approaches are only designed for finding positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positively and negatively co-regulated genes in one shot, 2) it removes the restriction of a magnitude transformation relationship among co-regulated genes, and 3) it guarantees the quality of clusters and the significance of regulations using a novel similarity measurement, gCode, and a user-specified regulation threshold 5, respectively. No previous work measures up to the task which has been set. Moreover, the MDL technique is introduced to avoid generating insignificant g-Clusters. A tree structure, namely GS-tree, is also designed, and two algorithms combined with efficient pruning and optimization strategies are developed to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithm is able to find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
Herbers, Janette E; Cutuli, J J; Supkoff, Laura M; Narayan, Angela J; Masten, Ann S
The role of effective parenting in promoting child executive functioning and school success was examined among 138 children (age 4 to 6 years) staying in family emergency shelters the summer before kindergarten or 1st grade. Parent-child coregulation, which refers to relationship processes wherein parents guide and respond to the behavior of their children, was observed during structured interaction tasks and quantified as a dyadic construct using state space grid methodology. Positive coregulation was related to children's executive functioning and IQ, which in turn were related to teacher-reported outcomes once school began. Separate models considering parenting behavior demonstrated that executive function carried indirect effects of parents' directive control to school outcomes. Meanwhile, responsive parenting behaviors directly predicted children's peer acceptance at school beyond effects of executive function and IQ. Findings support theory and past research in developmental science, indicating the importance of effective parenting in shaping positive adaptive skills among children who overcome adversity, in part through processes of coregulation.
Angiogenesis has been shown to be associated with prostate cancer development. The majority of prostate cancer studies have focused on individual single nucleotide polymorphisms (SNPs), whereas SNP-SNP interactions are thought to have a great impact on unveiling the underlying mechanism of complex disease. Using 1,151 prostate cancer patients in the Cancer Genetic Markers of Susceptibility (CGEMS) dataset, 2,651 SNPs in the angiogenesis genes associated with prostate cancer aggressiveness were evaluated. SNP-SNP interactions were primarily assessed using the two-stage Random Forests plus Multivariate Adaptive Regression Splines (TRM) approach in the CGEMS group, and were then re-evaluated in the Moffitt group with 1,040 patients. For the identified gene pairs, cross-evaluation was applied to evaluate SNP interactions in both study groups. Five SNP-SNP interactions in three gene pairs (MMP16 + ROBO1, MMP16 + CSF1, and MMP16 + EGFR) were identified to be associated with aggressive prostate cancer in both groups. Three pairs of SNPs (rs1477908 + rs1387665, rs1467251 + rs7625555, and rs1824717 + rs7625555) were in MMP16 and ROBO1, one pair (rs2176771 + rs333970) in MMP16 and CSF1, and one pair (rs1401862 + rs6964705) in MMP16 and EGFR. The results suggest that MMP16 may play an important role in prostate cancer aggressiveness. By integrating our novel findings and available biomedical literature, a hypothetical gene interaction network was proposed. This network demonstrates that our identified SNP-SNP interactions are biologically relevant and shows that EGFR may be the hub for the interactions. The findings provide valuable information to identify genotype combinations at risk of developing aggressive prostate cancer and improve understanding of the genetic etiology of angiogenesis associated with prostate cancer aggressiveness.
Thiel, Thomas; Kota, Raja; Grosse, Ivo; Stein, Nils; Graner, Andreas
With the influx of various SNP genotyping assays in recent years, there has been a need for an assay that is robust, yet cost effective, and could be performed using standard gel-based procedures. In this context, CAPS markers have been shown to meet these criteria. However, converting SNPs to CAPS markers can be a difficult process if done manually. In order to address this problem, we describe a computer program, SNP2CAPS, that facilitates the computational conversion of SNP markers into CA...
Cytogenetic analysis is essential for the diagnosis and prognosis of hematopoietic neoplasms in current clinical practice. Many hematopoietic malignancies are characterized by structural chromosomal abnormalities such as specific translocations, inversions, deletions and/or numerical abnormalities that can be identified by karyotype analysis or fluorescence in situ hybridization (FISH) studies. Single nucleotide polymorphism (SNP) arrays offer high-resolution identification of copy number variants (CNVs) and acquired copy-neutral loss of heterozygosity (LOH)/uniparental disomy (UPD) that are usually not identifiable by conventional cytogenetic analysis and FISH studies. As a result, SNP arrays have been increasingly applied to hematopoietic neoplasms to search for clinically significant genetic abnormalities. A large number of CNVs and UPDs have been identified in a variety of hematopoietic neoplasms. CNVs detected by SNP array in some hematopoietic neoplasms are of prognostic significance. A few specific genes in the affected regions have been implicated in the pathogenesis and may be the targets for specific therapeutic agents in the future. In this review, we summarize the current findings of application of SNP arrays in a variety of hematopoietic malignancies with an emphasis on the clinically significant genetic variants.
Abstract Background Until recently, only a small number of low- and mid-throughput methods have been used for single nucleotide polymorphism (SNP) discovery and genotyping in grapevine (Vitis vinifera L.). However, following completion of the sequence of the highly heterozygous genome of Pinot Noir, it has been possible to identify millions of electronic SNPs (eSNPs), thus providing a valuable source for high-throughput genotyping methods. Results Herein we report the first application of the SNPlex™ genotyping system in grapevine aiming at the anchoring of a eukaryotic genome. This approach combines robust SNP detection with automated assay readout and data analysis. 813 candidate eSNPs were developed from non-repetitive contigs of the assembled genome of Pinot Noir and tested in 90 progeny of a Syrah × Pinot Noir cross. 563 new SNP-based markers were obtained and mapped. The efficiency rate of 69% was enhanced to 80% when multiple displacement amplification (MDA) methods were used for preparation of genomic DNA for the SNPlex assay. Conclusion Unlike other SNP genotyping methods used to investigate thousands of SNPs in a few genotypes, or a few SNPs in around a thousand genotypes, the SNPlex genotyping system represents a good compromise to investigate several hundred SNPs in a hundred or more samples simultaneously. Therefore, the use of the SNPlex assay, coupled with whole genome amplification (WGA), is a good solution for future applications in well-equipped laboratories.
Abstract Background Copy number variation (CNV) is essential to understand the pathology of many complex diseases at the DNA level. Affymetrix SNP arrays, which are widely used for CNV studies, depend significantly on accurate copy number (CN) estimation. Nevertheless, CN estimation may be biased by several factors, including cross-hybridization and training sample batch, as well as genomic waves of intensities induced by sequence-dependent hybridization rate and amplification efficiency. Since many available algorithms only address one or two of the three factors, a high false discovery rate (FDR) often results when identifying CNV. Therefore, we have developed a new CNV detection pipeline which is based on hybridization and amplification rate correction (CNVhac). Methods CNVhac first estimates the allelic concentrations (ACs) of target sequences by using the sample-independent parameters trained through physicochemical hybridization law. Then the raw CN is estimated by taking the ratio of AC to the corresponding average AC from a reference sample set for one specific site. Finally, a hidden Markov model (HMM) segmentation process is implemented to detect CNV regions. Results Based on public HapMap data, the results show that CNVhac effectively smoothes the genomic waves and facilitates more accurate raw CN estimates compared to other methods. Moreover, CNVhac alleviates, to a certain extent, the sample dependence of inference and yields CNV calls with appreciably low FDRs. Conclusion CNVhac is an effective approach to address the common difficulties in SNP array analysis, and the working principles of CNVhac can be easily extended to other platforms.
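The ratio step described in the Methods of the abstract above can be illustrated with a minimal Python sketch. This is not the CNVhac implementation: the function name `raw_copy_number`, the input format, and the scaling to a diploid baseline of two copies are assumptions made only for illustration.

```python
from statistics import mean


def raw_copy_number(sample_ac, reference_acs, baseline_copies=2.0):
    """Raw copy number at one site: sample AC divided by the mean reference AC.

    sample_ac       -- allelic concentration of the test sample at this site
    reference_acs   -- allelic concentrations of the reference samples at the same site
    baseline_copies -- ASSUMPTION: the reference set is treated as diploid,
                       so a ratio of 1.0 maps to two copies
    """
    reference_mean = mean(reference_acs)
    if reference_mean <= 0:
        raise ValueError("mean reference AC must be positive")
    return baseline_copies * sample_ac / reference_mean


# Hypothetical values: the sample signal is 1.5x the reference average,
# which this sketch reports as three copies.
print(raw_copy_number(4.5, [2.0, 4.0]))
```

In a full pipeline these per-site estimates would then be passed to a segmentation step such as the HMM mentioned in the abstract; that step is not sketched here.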
Liu, Yu-Xuan; Hu, Qing-Qing; Ma, Hong-Du; Huang, Dai-Xin
Single nucleotide polymorphism (SNP) refers to the single base sequence variation in specific location of the human genome. Phenotype informative SNP has gradually become one of the research hot spots in forensic science. In this paper, the forensic research situation and application prospect of phenotype informative SNP in the characteristics of hair, eye and skin color, height, and facial feature are reviewed.
Alonso, Santos; Boyano, M. Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A.; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M.; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L.; Gardeazabal, Jesus; de Lizarduy, Iñigo Martinez; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria
As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date. PMID:21559390
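To make the Bonferroni step in the abstract above concrete, the following Python sketch computes the per-test threshold for 363 tests. The family-wise level of 0.05 is an assumption (the abstract does not state it), and the p-values and SNP labels in the example are invented placeholders, not results from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, n_tests, alpha=0.05):
    """Return the names whose p-value passes the corrected threshold.

    p_values -- dict mapping a marker name to its raw p-value
    n_tests  -- total number of tests performed (not just those listed)
    alpha    -- ASSUMPTION: family-wise error rate of 0.05
    """
    threshold = bonferroni_threshold(alpha, n_tests)
    return sorted(name for name, p in p_values.items() if p <= threshold)


# 363 SNPs were genotyped, so the corrected threshold is 0.05 / 363.
print(bonferroni_threshold(0.05, 363))

# Hypothetical raw p-values for three placeholder markers.
example = {"snp_a": 2e-5, "snp_b": 4e-3, "snp_c": 1e-4}
print(significant_after_bonferroni(example, n_tests=363))
```

With these assumptions the threshold is roughly 1.4 × 10⁻⁴, which is far stricter than the 0.01 cut-off used to shortlist candidates in phase I.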
Fondevila, M; Børsting, C; Phillips, C
This review explores the key factors that influence the optimization, routine use, and profile interpretation of the SNaPshot single-base extension (SBE) system applied to forensic single-nucleotide polymorphism (SNP) genotyping. Despite being a mainly complementary DNA genotyping technique to routine STR profiling, use of SNaPshot is an important part of the development of SNP sets for a wide range of forensic applications with these markers, from genotyping highly degraded DNA with very short amplicons to the introduction of SNPs to ascertain the ancestry and physical characteristics of an unidentified contact trace donor. However, this technology, as resourceful as it is, displays several features that depart from the usual STR genotyping far enough to demand a certain degree of expertise from the forensic analyst before tackling the complex casework on which SNaPshot application provides...
The analysis of next-generation sequence (NGS) data is often a fragmented step-wise process. For example, multiple pieces of software are typically needed to map NGS reads, extract variant sites, and construct a DNA sequence matrix containing only single nucleotide polymorphisms (i.e., a SNP matrix) for a set of individuals. The management and chaining of these software pieces and their outputs can often be a cumbersome and difficult task. Here, we present CFSAN SNP Pipeline, which combines into a single package the mapping of NGS reads to a reference genome with Bowtie2, processing of those mapping (BAM) files using SAMtools, identification of variant sites using VarScan, and production of a SNP matrix using custom Python scripts. We also introduce a Python package (CFSAN SNP Mutator) that, when given a reference genome, will generate variants of known position against which we validate our pipeline. We created 1,000 simulated Salmonella enterica sp. enterica Serovar Agona genomes at 100× and 20× coverage, each containing 500 SNPs, 20 single-base insertions and 20 single-base deletions. For the 100× dataset, the CFSAN SNP Pipeline recovered 98.9% of the introduced SNPs and had a false positive rate of 1.04 × 10⁻⁶; for the 20× dataset 98.8% of SNPs were recovered and the false positive rate was 8.34 × 10⁻⁷. Based on these results, CFSAN SNP Pipeline is a robust and accurate tool that is among the first to combine into a single executable the myriad steps required to produce a SNP matrix from NGS data. Such a tool is useful to those working in an applied setting (e.g., food safety traceback investigations) as well as for those interested in evolutionary questions.
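The validation idea in the abstract above, comparing called SNP positions against variants introduced at known positions, can be sketched in Python as follows. This is not code from CFSAN SNP Pipeline or CFSAN SNP Mutator: the function `evaluate_calls` is hypothetical, and defining the false positive rate as false calls divided by the number of unmutated reference positions is an assumption, since the abstract does not give the denominator.

```python
def evaluate_calls(true_positions, called_positions, genome_length):
    """Compare called SNP positions with the positions that were actually mutated.

    Returns (recovery, false_positive_rate), where
      recovery            = recovered true SNPs / all introduced SNPs
      false_positive_rate = false calls / positions that were not mutated
                            (ASSUMED definition)
    """
    true_set = set(true_positions)
    called_set = set(called_positions)

    true_positives = len(true_set & called_set)
    false_positives = len(called_set - true_set)
    unmutated_positions = genome_length - len(true_set)

    recovery = true_positives / len(true_set) if true_set else 0.0
    false_positive_rate = (
        false_positives / unmutated_positions if unmutated_positions > 0 else 0.0
    )
    return recovery, false_positive_rate


# Toy example: 4 introduced SNPs on a 1,004 bp reference, 3 recovered, 1 false call.
recovery, fpr = evaluate_calls([10, 20, 30, 40], [10, 20, 30, 99], genome_length=1004)
print(recovery, fpr)
```

For a simulated data set, the same comparison would be repeated per genome and the counts pooled before computing the two rates.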
Qin, Zhaohui S; McCue, Lee Ann; Thompson, William; Mayerhofer, Linda; Lawrence, Charles E; Liu, Jun S
The identification of co-regulated genes and their transcription-factor binding sites (TFBS) are key steps toward understanding transcription regulation. In addition to effective laboratory assays, various computational approaches for the detection of TFBS in promoter regions of coexpressed genes have been developed. The availability of complete genome sequences combined with the likelihood that transcription factors and their cognate sites are often conserved during evolution has led to the development of phylogenetic footprinting. The modus operandi of this technique is to search for conserved motifs upstream of orthologous genes from closely related species. The method can identify hundreds of TFBS without prior knowledge of co-regulation or coexpression. Because many of these predicted sites are likely to be bound by the same transcription factor, motifs with similar patterns can be put into clusters so as to infer the sets of co-regulated genes, that is, the regulons. This strategy utilizes only genome sequence information and is complementary to and confirmative of gene expression data generated by microarray experiments. However, the limited data available to characterize individual binding patterns, the variation in motif alignment, motif width, and base conservation, and the lack of knowledge of the number and sizes of regulons make this inference problem difficult. We have developed a Gibbs sampling-based Bayesian motif clustering (BMC) algorithm to address these challenges. Tests on simulated data sets show that BMC produces many fewer errors than hierarchical and K-means clustering methods. The application of BMC to hundreds of predicted gamma-proteobacterial motifs correctly identified many experimentally reported regulons, inferred the existence of previously unreported members of these regulons, and suggested novel regulons.
Van Loo, Peter; Nilsen, Gro; Nordgard, Silje H; Vollan, Hans Kristian Moen; Børresen-Dale, Anne-Lise; Kristensen, Vessela N; Lingjærde, Ole Christian
Single nucleotide polymorphism (SNP) arrays are powerful tools to delineate genomic aberrations in cancer genomes. However, the analysis of these SNP array data of cancer samples is complicated by three phenomena: (a) aneuploidy: due to massive aberrations, the total DNA content of a cancer cell can differ significantly from its normal two copies; (b) nonaberrant cell admixture: samples from solid tumors do not exclusively contain aberrant tumor cells, but always contain some portion of nonaberrant cells; (c) intratumor heterogeneity: different cells in the tumor sample may have different aberrations. We describe here how these phenomena impact the SNP array profile, and how these can be accounted for in the analysis. In an extended practical example, we apply our recently developed and further improved ASCAT (allele-specific copy number analysis of tumors) suite of tools to analyze SNP array data using data from a series of breast carcinomas as an example. We first describe the structure of the data, how it can be plotted and interpreted, and how it can be segmented. The core ASCAT algorithm next determines the fraction of nonaberrant cells and the tumor ploidy (the average number of DNA copies), and calculates an ASCAT profile. We describe how these ASCAT profiles visualize both copy number aberrations as well as copy-number-neutral events. Finally, we touch upon regions showing intratumor heterogeneity, and how they can be detected in ASCAT profiles. All source code and data described here can be found at our ASCAT Web site ( http://www.ifi.uio.no/forskning/grupper/bioinf/Projects/ASCAT/).
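The effect of aneuploidy and nonaberrant cell admixture on a SNP array profile, as described in the abstract above, can be illustrated with a simplified mixture model in Python. This sketch is an assumption-laden illustration and not the ASCAT source code: the function and parameter names are hypothetical, array-specific scaling of the log ratio is ignored, and normal cells are assumed to carry one copy of each allele.

```python
import math


def expected_signals(n_a, n_b, tumor_fraction, tumor_ploidy):
    """Expected log ratio (logR) and B-allele frequency (BAF) at one locus.

    n_a, n_b       -- copies of allele A and allele B in the tumor cells
    tumor_fraction -- fraction of aberrant (tumor) cells in the sample, 0..1
    tumor_ploidy   -- average number of DNA copies per tumor cell
    ASSUMPTION: nonaberrant cells have exactly one A and one B copy.
    """
    normal_fraction = 1.0 - tumor_fraction
    # Average copies per cell at this locus, over tumor and normal cells.
    locus_copies = tumor_fraction * (n_a + n_b) + 2.0 * normal_fraction
    # Average copies per cell over the whole genome.
    sample_ploidy = tumor_fraction * tumor_ploidy + 2.0 * normal_fraction
    if locus_copies <= 0:
        raise ValueError("no DNA expected at this locus; signals are undefined")

    log_r = math.log2(locus_copies / sample_ploidy)
    baf = (tumor_fraction * n_b + normal_fraction) / locus_copies
    return log_r, baf


# Copy-number-neutral loss of heterozygosity (two A copies, no B copy) in a
# sample with 50% tumor cells: logR stays at 0 while BAF shifts from 0.5 to 0.25.
print(expected_signals(n_a=2, n_b=0, tumor_fraction=0.5, tumor_ploidy=2.0))
```

The example shows why total intensity alone misses copy-number-neutral events, and why the allele-specific signal is needed alongside it.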
Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.
Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.
The goal of this paper is to highlight some thoughts about regulation and self-regulation for e-commerce, and to present cases and the importance of co- and self-regulation in e-commerce. Several questions arise in this area, which we try to answer in this paper: What are the benefits and disadvantages of self- and co-regulation? How can the online industry be a cause and effect of self- and co-regulation? Should web portals and search engines be subject to any kind of regulation? What is the impact of the Internet and of self- and co-regulation on e-commerce?
Abstract Background Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with Conclusions We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.
Hand Melanie L
GoldenGate™ assay is capable of high-throughput co-dominant SNP allele detection, and minimises the problems associated with SNP genotyping in a polyploid by effectively reducing the complexity to a diploid system. This SNP collection may now be refined and used in applications such as cultivar identification, genetic linkage map construction, genome-wide association studies and genomic selection in tall fescue. The bioinformatic pipeline described here represents an effective general method for SNP discovery within outbreeding allopolyploid species.
Rollins, David A; Coppo, Maddalena; Rogatsky, Inez
Nuclear receptor coactivators (NCOAs) are multifunctional transcriptional coregulators for a growing number of signal-activated transcription factors. The members of the p160 family (NCOA1/2/3) are increasingly recognized as essential and nonredundant players in a number of physiological processes. In particular, accumulating evidence points to the pivotal roles that these coregulators play in inflammatory and metabolic pathways, both under homeostasis and in disease. Given that chronic inflammation of metabolic tissues ("metainflammation") is a driving force for the widespread epidemic of obesity, insulin resistance, cardiovascular disease, and associated comorbidities, deciphering the role of NCOAs in "normal" vs "pathological" inflammation and in metabolic processes is indeed a subject of extreme biomedical importance. Here, we review the evolving and, at times, contradictory, literature on the pleiotropic functions of NCOA1/2/3 in inflammation and metabolism as related to nuclear receptor actions and beyond. We then briefly discuss the potential utility of NCOAs as predictive markers for disease and/or possible therapeutic targets once a better understanding of their molecular and physiological actions is achieved.
Cortazar, Ana Rosa; Liu, Xiaojing; Urosevic, Jelena; Castillo-Martin, Mireia; Fernández-Ruiz, Sonia; Morciano, Giampaolo; Caro-Maldonado, Alfredo; Guiu, Marc; Zúñiga-García, Patricia; Graupera, Mariona; Bellmunt, Anna; Pandya, Pahini; Lorente, Mar; Martín-Martín, Natalia; Sutherland, James David; Sanchez-Mosquera, Pilar; Bozal-Basterra, Laura; Zabala-Letona, Amaia; Arruabarrena-Aristorena, Amaia; Berenguer, Antonio; Embade, Nieves; Ugalde-Olano, Aitziber; Lacasa-Viscasillas, Isabel; Loizaga-Iriarte, Ana; Unda-Urzaiz, Miguel; Schultz, Nikolaus; Aransay, Ana Maria; Sanz-Moreno, Victoria; Barrio, Rosa; Velasco, Guillermo; Pinton, Paolo; Cordon-Cardo, Carlos; Carracedo, Arkaitz
Cellular transformation and cancer progression is accompanied by changes in the metabolic landscape. Master co-regulators of metabolism orchestrate the modulation of multiple metabolic pathways through transcriptional programs, and hence constitute a probabilistically parsimonious mechanism for general metabolic rewiring. Here we show that the transcriptional co-activator PGC1α suppresses prostate cancer progression and metastasis. A metabolic co-regulator data mining analysis unveiled that PGC1α is down-regulated in prostate cancer and associated to disease progression. Using genetically engineered mouse models and xenografts, we demonstrated that PGC1α opposes prostate cancer progression and metastasis. Mechanistically, the use of integrative metabolomics and transcriptomics revealed that PGC1α activates an Oestrogen-related receptor alpha (ERRα)-dependent transcriptional program to elicit a catabolic state and metastasis suppression. Importantly, a signature based on the PGC1α-ERRα pathway exhibited prognostic potential in prostate cancer, thus uncovering the relevance of monitoring and manipulating this pathway for prostate cancer stratification and treatment. PMID:27214280
Graham, J D; Bain, D L; Richer, J K; Jackson, T A; Tung, L; Horwitz, K B
The antiestrogen tamoxifen is an effective treatment for estrogen receptor positive breast cancers, slowing tumor growth and preventing disease recurrence, with relatively few side effects. However, many patients who initially respond to treatment, later become resistant to treatment. Tamoxifen has both agonist and antagonist activities, which are manifested in a tissue-specific pattern. Development of tamoxifen resistance can be characterized by an increase in the partial agonist properties of the antiestrogen in the breast, resulting in loss of growth inhibition and even inappropriate tumor stimulation. Nuclear receptor function is modulated by transcriptional coregulators, which either enhance or repress receptor activity. Using a mixed antagonist-biased two-hybrid screening strategy, we identified two such proteins: the human homolog of the nuclear receptor corepressor, N-CoR, and a novel coactivator, L7/SPA (Switch Protein for Antagonists). In transcriptional studies N-CoR suppressed the agonist properties of tamoxifen and RU486, while L7/SPA increased agonist effects. We speculated that the relative level of these coactivators and corepressors might determine the balance of agonist and antagonist properties of mixed antagonists such as tamoxifen. Using quantitative RT-PCR we therefore measured the levels of transcripts encoding these coregulators, as well as the corepressor SMRT, and the coactivator SRC-1, in a small cohort of tamoxifen resistant and sensitive breast tumors. The results suggest that tumor sensitivity to mixed antagonists may be governed by a complex set of transcription factors, which we are only now beginning to understand.
Torrano, Veronica; Valcarcel-Jimenez, Lorea; Cortazar, Ana Rosa; Liu, Xiaojing; Urosevic, Jelena; Castillo-Martin, Mireia; Fernández-Ruiz, Sonia; Morciano, Giampaolo; Caro-Maldonado, Alfredo; Guiu, Marc; Zúñiga-García, Patricia; Graupera, Mariona; Bellmunt, Anna; Pandya, Pahini; Lorente, Mar; Martín-Martín, Natalia; Sutherland, James David; Sanchez-Mosquera, Pilar; Bozal-Basterra, Laura; Zabala-Letona, Amaia; Arruabarrena-Aristorena, Amaia; Berenguer, Antonio; Embade, Nieves; Ugalde-Olano, Aitziber; Lacasa-Viscasillas, Isabel; Loizaga-Iriarte, Ana; Unda-Urzaiz, Miguel; Schultz, Nikolaus; Aransay, Ana Maria; Sanz-Moreno, Victoria; Barrio, Rosa; Velasco, Guillermo; Pinton, Paolo; Cordon-Cardo, Carlos; Locasale, Jason W; Gomis, Roger R; Carracedo, Arkaitz
Cellular transformation and cancer progression is accompanied by changes in the metabolic landscape. Master co-regulators of metabolism orchestrate the modulation of multiple metabolic pathways through transcriptional programs, and hence constitute a probabilistically parsimonious mechanism for general metabolic rewiring. Here we show that the transcriptional co-activator peroxisome proliferator-activated receptor gamma co-activator 1α (PGC1α) suppresses prostate cancer progression and metastasis. A metabolic co-regulator data mining analysis unveiled that PGC1α is downregulated in prostate cancer and associated with disease progression. Using genetically engineered mouse models and xenografts, we demonstrated that PGC1α opposes prostate cancer progression and metastasis. Mechanistically, the use of integrative metabolomics and transcriptomics revealed that PGC1α activates an oestrogen-related receptor alpha (ERRα)-dependent transcriptional program to elicit a catabolic state and metastasis suppression. Importantly, a signature based on the PGC1α-ERRα pathway exhibited prognostic potential in prostate cancer, thus uncovering the relevance of monitoring and manipulating this pathway for prostate cancer stratification and treatment.
Tang, J.; Vosman, B.; Voorrips, R.E.; Linden, van der C.G.; Leunissen, J.A.M.
Background - Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are
Li, S.; Ma, L.; Li, H.
research. Using a user-friendly web interface, genes can be searched by name, description, position, SNP ID or clone name. Several public databases are integrated, including gene information from Ensembl, protein features from Uniprot/SWISS-PROT, Pfam and DAS-CBS. Gene relationships are fetched from BIND......, MINT, KEGG and are integrated with ortholog data from TreeFam to extend the current interaction networks. Integrated tools for primer-design and mis-splicing analysis have been developed to facilitate experimental analysis of individual genes with focus on their variation. Snap is available at http...
Sherry, S T; Ward, M H; Kholodov, M; Baker, J; Phan, L; Smigielski, E M; Sirotkin, K
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K. Sirotkin (1999) Genome Res., 9, 677-679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.
Wang, Jingbo; Ronaghi, Mostafa; Chong, Samuel S; Lee, Caroline G L
Currently, >14,000,000 single nucleotide polymorphisms (SNPs) are reported. Identifying phenotype-affecting SNPs among these many SNPs poses significant challenges. Although several Web resources are available that can inform about the functionality of SNPs, these resources are mainly annotation databases and are not very comprehensive. In this article, we present a comprehensive, well-annotated, integrated pfSNP (potentially functional SNPs) Web resource (http://pfs.nus.edu.sg/), which aims to facilitate better hypothesis generation through knowledge syntheses mediated by better data integration and a user-friendly Web interface. pfSNP integrates >40 different algorithms/resources to interrogate >14,000,000 SNPs from the dbSNP database for SNPs of potential functional significance based on previous published reports, inferred potential functionality from genetic approaches as well as predicted potential functionality from sequence motifs. Its query interface has the user-friendly "auto-complete, prompt-as-you-type" feature and is highly customizable, facilitating different combinations of queries using Boolean logic. Additionally, to facilitate better understanding of the results and aid in hypothesis generation, gene/pathway-level information with text clouds highlighting enriched tissues/pathways as well as detailed related information are also provided on the results page. Hence, the pfSNP resource will be of great interest to scientists focusing on association studies as well as those interested in experimentally addressing the functionality of SNPs.
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980
Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun
Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and offers an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, or divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene
This research investigated, via quasi-experiments, the effects of web-based co-regulated learning (CRL) on developing students' computing skills. Two classes of 68 undergraduates in a one-semester course titled "Applied Information Technology: Data Processing" were chosen for this research. The first class (CRL group, n = 38) received…
Evers, N.M.; Berg, van den J.H.J.; Wang, S.; Melchers, D.; Houtman, J.; Haan, de L.H.J.; Ederveen, A.G.H.; Groten, J.P.; Rietjens, I.
The aim of the present study was to investigate modulation of the interaction of the ERα and ERβ with coregulators in the ligand responses induced by estrogenic compounds. To this end, selective ERα and ERβ agonists were characterized for intrinsic relative potency reflected by EC50 and maximal effi
Wang, S.; Houtman, R.; Melchers, D.; Aarts, J.; Peijnenburg, A.A.C.M.; Beuningen, van R.; Rietjens, I.M.C.M.; Bovee, T.F.H.
To further develop an integrated in vitro testing strategy for replacement of in vivo tests for (anti-)estrogenicity testing, the ligand-modulated interaction of coregulators with estrogen receptor α was assessed using a PamChip® plate. The relative estrogenic potencies determined, based on ERα bind
Korsgaard, Steffen; Sassmannshausen, Sean Patrick
as their central concepts and conceptualization of the entrepreneurial function. On this basis we discuss three central themes that cut across the four alternatives: process, uncertainty, and agency. These themes provide new foci for entrepreneurship research and can help to generate new research questions......In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship; the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery presented in these four alternatives, as well...
Full Text Available Abstract Background Mitochondrial single nucleotide polymorphisms (mtSNPs) constitute important data when trying to shed some light on human diseases and cancers. Unfortunately, providing relevant mtSNP genotyping information in mtDNA databases in a neatly organized and transparent visual manner still remains a challenge. Amongst the many methods reported for SNP genotyping, determining the restriction fragment length polymorphisms (RFLPs) is still one of the most convenient and cost-saving methods. In this study, we prepared the visualization of the mtDNA genome in a way that integrates the RFLP genotyping information with mitochondria related cancers and diseases in a user-friendly, intuitive and interactive manner. The inherent problem associated with mtDNA sequences in BLAST of the NCBI database was also solved. Description V-MitoSNP provides complete mtSNP information for four different kinds of inputs: (1) color-coded visual input by selecting genes of interest on the genome graph, (2) keyword search by locus, disease and mtSNP rs# ID, (3) visualized input of nucleotide range by clicking the selected region of the mtDNA sequence, and (4) sequences mtBLAST. The V-MitoSNP output provides 500 bp (base pairs) flanking sequences for each SNP coupled with the RFLP enzyme and the corresponding natural or mismatched primer sets. The output format enables users to see the SNP genotype pattern of the RFLP by virtual electrophoresis of each mtSNP. The rate of successful design of enzymes and primers for RFLPs in all mtSNPs was 99.1%. The RFLP information was validated by actual agarose electrophoresis and showed successful results for all mtSNPs tested. The mtBLAST function in V-MitoSNP provides the gene information within the input sequence rather than providing the complete mitochondrial chromosome as in the NCBI BLAST database. All mtSNPs with rs number entries in NCBI are integrated in the corresponding SNP in V-MitoSNP. Conclusion V-MitoSNP is a web
McKenna, Neil J; Evans, Ronald M; O'Malley, Bert W
The field of nuclear receptor and coregulator signaling has grown into one of the most active and interdisciplinary in eukaryotic biology. Papers in this field are spread widely across a vast number of journals, which complicates the task of investigators in keeping current with the literature in the field. In 2003, we launched Nuclear Receptor Signaling as an Open Access reviews, perspectives and methods journal for the nuclear receptor signaling field. Building on its success and impact on the community, we have added primary research and dataset articles to this list of article categories, and we now announce the re-launch of the journal this month. Here we will summarize the rationale that informed the creation and expansion of the journal, and discuss the possibilities for its future development.
Kwan, Kenneth Y; Sestan, Nenad; Anton, E S
The cerebral neocortex is segregated into six horizontal layers, each containing unique populations of molecularly and functionally distinct excitatory projection (pyramidal) neurons and inhibitory interneurons. Development of the neocortex requires the orchestrated execution of a series of crucial processes, including the migration of young neurons into appropriate positions within the nascent neocortex, and the acquisition of layer-specific neuronal identities and axonal projections. Here, we discuss emerging evidence supporting the notion that the migration and final laminar positioning of cortical neurons are also co-regulated by cell type- and layer-specific transcription factors that play concomitant roles in determining the molecular identity and axonal connectivity of these neurons. These transcriptional programs thus provide direct links between the mechanisms controlling the laminar position and identity of cortical neurons.
Full Text Available Prostate cancer cells adhere to a tumor basement membrane, while secretory epithelial cells reside in a suprabasal cell compartment. Since tumor cells are derived from suprabasal epithelial cells, they experience de novo substratum adhesion in the context of oncogenesis. We therefore analyzed whether cell-matrix adhesion could affect the protein expression and activity of the AR. In this study, AR protein expression declined upon suspension of BPH-1-AR cells, but not in PC-3-AR cells, as shown by Western blot. In a time course study, BPH-1 cells lost AR expression within 6 hours, and the synthetic androgen R1881 reduced the loss of AR expression. We further explored the mechanism of AR loss in suspended BPH-1 cells. BPH-1-AR cells underwent apoptosis (anoikis) when suspended for 2-5 hours. Suspension did not induce significant apoptosis or a decrease in AR expression in PC-3 cells. Inhibition of apoptosis in suspended BPH-1-AR cells, either by expression of Bcl-2 or Bcl-xl or by treatment with Z-VAD, a caspase inhibitor, prevented loss of AR protein. In contrast, the calpain protease inhibitor ALLN accelerated the loss of AR protein expression. Additionally, cell-matrix adhesion changed the expression of AR coregulators at the mRNA level in prostate cancer cells. Our results demonstrate that AR protein expression was reduced through activation of cell death pathways, and thus indirectly through cell suspension, in BPH-AR cells. The activity of AR can also be regulated by adhesion in PC-3-AR and LNCaP cells by affecting coregulator levels.
Full Text Available BACKGROUND: Heparanase, a mammalian endo-beta-D-glucuronidase, specifically degrades heparan sulfate proteoglycans ubiquitously associated with the cell surface and extracellular matrix. This single gene encoded enzyme is over-expressed in most human cancers, promoting tumor metastasis and angiogenesis. PRINCIPAL FINDINGS: We report that targeted disruption of the murine heparanase gene eliminated heparanase enzymatic activity, resulting in accumulation of long heparan sulfate chains. Unexpectedly, the heparanase knockout (Hpse-KO) mice were fertile, exhibited a normal life span and did not show prominent pathological alterations. The lack of major abnormalities is attributed to a marked elevation in the expression of matrix metalloproteinases, for example, MMP2 and MMP14 in the Hpse-KO liver and kidney. Co-regulation of heparanase and MMPs was also noted by a marked decrease in MMP (primarily MMP-2, -9 and -14) expression following transfection and over-expression of the heparanase gene in cultured human mammary carcinoma (MDA-MB-231) cells. Immunostaining (kidney tissue) and chromatin immunoprecipitation (ChIP) analysis (Hpse-KO mouse embryonic fibroblasts) suggest that the newly discovered co-regulation of heparanase and MMPs is mediated by stabilization and transcriptional activity of beta-catenin. CONCLUSIONS/SIGNIFICANCE: The lack of heparanase expression and activity was accompanied by alterations in the expression level of MMP family members, primarily MMP-2 and MMP-14. It is conceivable that MMP-2 and MMP-14, which exert some of the effects elicited by heparanase (i.e., over branching of mammary glands, enhanced angiogenic response), can compensate for its absence, in spite of their different enzymatic substrate. Generation of viable Hpse-KO mice lacking significant abnormalities may provide a promising indication for the use of heparanase as a target for drug development.
Chamovitz Daniel A
Full Text Available Abstract Background Analysis of gene expression data from microarray experiments has become a central tool for identifying co-regulated, functional gene modules. A crucial aspect of such analysis is the integration of data from different experiments and different laboratories. How to weigh the contribution of different experiments is an important point influencing the final outcomes. We have developed a novel method for this integration, and applied it to genome-wide data from multiple Arabidopsis microarray experiments performed under a variety of experimental conditions. The goal of this study is to identify functional globally co-regulated gene modules in the Arabidopsis genome. Results Following the analysis of 21,000 Arabidopsis genes in 43 datasets and about 2 × 10^8 gene pairs, we identified a globally co-expressed gene network. We found clusters of globally co-expressed Arabidopsis genes that are enriched for known Gene Ontology annotations. Two types of modules were identified in the regulatory network that differed in their sensitivity to the node-scoring parameter; we further showed these two pertain to general and specialized modules. Some of these modules were further investigated using the Genevestigator compendium of microarray experiments. Analyses of smaller subsets of data led to the identification of condition-specific modules. Conclusion Our method for identification of gene clusters allows the integration of diverse microarray experiments from many sources. The analysis reveals that part of the Arabidopsis transcriptome is globally co-expressed, and can be further divided into known as well as novel functional gene modules. Our methodology is general enough to apply to any set of microarray experiments, using any scoring function.
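The gene-pair count quoted in this abstract is consistent with counting every unordered pair among the 21,000 genes. The short check below is only an illustrative calculation; the function name `gene_pair_count` is an assumption introduced here and is not taken from the study.

```python
from math import comb


def gene_pair_count(n_genes: int) -> int:
    """Number of unordered pairs that can be formed from n_genes distinct genes."""
    return comb(n_genes, 2)


# 21,000 genes, as stated in the abstract
pairs = gene_pair_count(21_000)
print(f"{pairs:,} gene pairs")  # on the order of 2 x 10^8
```

Because the count grows quadratically with the number of genes, any per-pair scoring step dominates the cost of such an analysis.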
Etherington, Graham J; Monaghan, Jacqueline; Zipfel, Cyril; MacLean, Dan
Analysis of mutants isolated from forward-genetic screens has revealed key components of several plant signalling pathways. Mapping mutations by position, either using classical methods or whole genome high-throughput sequencing (HTS), largely relies on the analysis of genome-wide polymorphisms in F2 recombinant populations. Combining bulk segregant analysis with HTS has accelerated the identification of causative mutations and has been widely adopted in many research programmes. A major advantage of HTS is the ability to perform bulk segregant analysis after back-crossing to the parental line rather than out-crossing to a polymorphic ecotype, which reduces genetic complexity and avoids issues with phenotype penetrance in different ecotypes. Plotting the positions of homozygous polymorphisms in a mutant genome identifies areas of low recombination and is an effective way to detect molecular linkage to a phenotype of interest. We describe the use of single nucleotide polymorphism (SNP) density plots as a mapping strategy to identify and refine chromosomal positions of causative mutations from screened plant populations. We developed a web application called CandiSNP that generates density plots from user-provided SNP data obtained from HTS. Candidate causative mutations, defined as SNPs causing non-synonymous changes in annotated coding regions are highlighted on the plots and listed in a table. We use data generated from a recent mutant screen in the model plant Arabidopsis thaliana as proof-of-concept for the validity of our tool. CandiSNP is a user-friendly application that will aid in novel discoveries from forward-genetic mutant screens. It is particularly useful for analysing HTS data from bulked back-crossed mutants, which contain fewer polymorphisms than data generated from out-crosses. The web-application is freely available online at http://candisnp.tsl.ac.uk.
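The idea of a SNP density plot can be sketched as simple binning of SNP positions along a chromosome. The code below is a minimal illustration, not CandiSNP's implementation; the function names, the window size and the example positions are all assumptions made for this sketch.

```python
from collections import Counter


def snp_density(positions, window=500_000):
    """Count SNPs per fixed-width window; returns {window_start: count}."""
    counts = Counter((pos // window) * window for pos in positions)
    return dict(sorted(counts.items()))


def densest_window(positions, window=500_000):
    """Start coordinate of the window holding the most SNPs (first one on ties)."""
    density = snp_density(positions, window)
    return max(density, key=density.get)


# Hypothetical homozygous SNP positions on one chromosome
example_positions = [120_000, 480_000, 510_000, 530_000, 560_000, 1_200_000]
print(snp_density(example_positions))
print(densest_window(example_positions))
```

In a bulked back-cross, a window that is enriched for homozygous SNPs would be a candidate region for linkage to the phenotype, which is the signal the density plot is meant to expose.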
Full Text Available Abstract With ever-increasing numbers of microbial genomes being sequenced, efficient tools are needed to perform strain-level identification of any newly sequenced genome. Here, we present the SNP identification for strain typing (SNIT) pipeline, a fast and accurate software system that compares a newly sequenced bacterial genome with other genomes of the same species to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels). Based on this information, the pipeline analyzes the polymorphic loci present in all input genomes to identify the genome that has the fewest differences with the newly sequenced genome. Similarly, for each of the other genomes, SNIT identifies the input genome with the fewest differences. Results from five bacterial species show that the SNIT pipeline identifies the correct closest neighbor with 75% to 100% accuracy. The SNIT pipeline is available for download at http://www.bhsai.org/snit.html
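The closest-neighbor step described in this abstract can be illustrated as a count of differing alleles over shared polymorphic loci. This is a sketch under stated assumptions, not the SNIT code: the data layout (a dict mapping locus position to allele), the function names and the strain names are hypothetical.

```python
def count_differences(query, reference):
    """Number of query loci at which the reference carries a different (or no) allele."""
    return sum(1 for locus, allele in query.items() if reference.get(locus) != allele)


def closest_genome(query, references):
    """Return (name, differences) for the reference with the fewest differences."""
    scored = ((name, count_differences(query, alleles)) for name, alleles in references.items())
    return min(scored, key=lambda item: item[1])


# Hypothetical alleles at three polymorphic loci
new_genome = {101: "A", 250: "G", 900: "T"}
known_genomes = {
    "strainA": {101: "A", 250: "C", 900: "T"},
    "strainB": {101: "G", 250: "C", 900: "C"},
}
print(closest_genome(new_genome, known_genomes))
```

Running the same function with each known genome in turn as the query gives the all-against-all neighbor assignment that the abstract also mentions.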
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083
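To make the allele-coding problem concrete, the sketch below translates genotypes from an A/B coding into nucleotides using a per-SNP lookup table. It is not part of SNPConvert and does not reproduce its interface; the function name, the `--` missing-call symbol, the SNP names and the allele map are hypothetical values chosen for illustration.

```python
MISSING = "--"  # assumed symbol for a failed genotype call


def ab_to_nucleotides(genotype, allele_map):
    """Translate an AB-coded genotype (e.g. 'AB') into nucleotides (e.g. 'AG')."""
    if genotype == MISSING:
        return MISSING
    return "".join(allele_map[allele] for allele in genotype)


# Hypothetical per-SNP definition of which nucleotide the A and B alleles stand for
snp_map = {
    "snp1": {"A": "A", "B": "G"},
    "snp2": {"A": "T", "B": "C"},
}
calls = {"snp1": "AB", "snp2": "BB"}
converted = {snp: ab_to_nucleotides(gt, snp_map[snp]) for snp, gt in calls.items()}
print(converted)
```

The lookup table is the essential piece: the same two-letter call means different nucleotides at different markers, so data sets cannot be merged until every source is expressed in one coding.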
Børsting, Claus; Fordyce, Sarah L; Olofsson, Jill Katharina
in our ISO 17025 accredited laboratory. Concordance between the Ion Torrent™ HID SNP assay and the SNPforID assay was tested by typing 44 Iraqis twice with the Ion Torrent™ HID SNP assay. The same samples were previously typed with the SNPforID assay and the Y-chromosome haplogroups of the individuals...
Ezequiel Luis Nicolazzi
Full Text Available One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git.
Jamshidi, Maral; Nevanlinna, Heli; Van Dyck, Laurien
In breast cancer, constitutive activation of NF-κB has been reported, however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-...
Smith, Edward M.; Littrell, Jack; Olivier, Michael
High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.
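The benefit of a second calling algorithm can be sketched as a merge in which the first algorithm's calls are kept and only its missing calls are filled from the second. The authors' software is written in Perl and is not reproduced here; this Python sketch, its function name and its data layout (a dict keyed by sample and SNP, with `None` for a missed call) are assumptions.

```python
def merge_calls(primary, secondary):
    """Keep primary calls; fill missing (None) calls from secondary.

    Returns the merged calls and the number of genotypes rescued.
    """
    merged = {}
    rescued = 0
    for key, call in primary.items():
        fallback = secondary.get(key)
        if call is None and fallback is not None:
            merged[key] = fallback
            rescued += 1
        else:
            merged[key] = call
    return merged, rescued


# Hypothetical calls keyed by (sample, SNP)
first_pass = {("s1", "snp1"): "AA", ("s1", "snp2"): None, ("s2", "snp1"): None}
second_pass = {("s1", "snp2"): "AB", ("s2", "snp1"): None}
merged, rescued = merge_calls(first_pass, second_pass)
print(merged, rescued)
```

This policy never overrides an existing call, so it can only raise completeness; checking concordance on the genotypes that both algorithms do call would be a separate step.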
Kevin Chen; WANG Xin-xin; SONG Hai-ying
Food safety has received a great deal of attention in both developed and developing countries in recent years. In China, the numerous food scandals and scares that have struck over the past decade have spurred significant food safety regulatory reform, which has been increasingly oriented towards the public-private partnership model adopted by the European Union’s (EU) food safety regulatory system. This paper analyzes the development of both the EU’s and China’s food safety regulatory systems, identifies the current challenges for China and additionally considers the role of public-private partnership. The success of co-regulation in the food regulatory system would bring significant benefits and opportunities for China. Finally, this paper recommends additional measures like training and grants to improve the private sector’s effectiveness in co-regulating China’s food safety issues.
Netherlands ) (see appendices). Small Molecule Inhibitors of the Androgen Receptor Transcriptional Activity for Prostate Cancer Drug Discovery...peritoneal injection, tail injection, oral gavage, retro-orbital blood sampling, isoflurane anesthesia, CO2 euthanasia , cardiac stick, organ harvesting...Discovery Poster Award, Androgens 2008 Meeting, Rotterdam (The Netherlands ), October 2008 Novel Small Molecules Antagonists of the Interaction of
Many people don't realise quite how much is going on at CERN. Would you like to gain first-hand knowledge of CERN's scientific and technological activities and their many applications? Try out some experiments for yourself, or pick the brains of the people in charge? If so, then the «Lundis Découverte» or Discovery Mondays, will be right up your street. Starting on May 5th, on every first Monday of the month you will be introduced to a different facet of the Laboratory. CERN staff, non-scientists, and members of the general public, everyone is welcome. So tell your friends and neighbours and make sure you don't miss this opportunity to satisfy your curiosity and enjoy yourself at the same time. You won't have to listen to a lecture, as the idea is to have open exchange with the expert in question and for each subject to be illustrated with experiments and demonstrations. There's no need to book, as Microcosm, CERN's interactive museum, will be open non-stop from 7.30 p.m. to 9 p.m. On the first Discovery M...
Studer, Bruno; Nielsen, Rasmus Ory; Panitz, Frank;
a clear cluster separation. An additional 83 (12%) were monomorphic. A total of 513 gene-associated SNPs were available for linkage mapping, out of which 495 (64% of the total 768 SNPs on the array) were successfully mapped in the VrnA population. The current VrnA map contains a total of 837 DNA markers......-assisted breeding strategies, a surprisingly low number of validated SNPs are currently available in perennial ryegrass. The advent of next generation sequencing opened up the opportunity for efficient and high throughput in silico SNP discovery in absence of a reference genome sequence. However, the percentages...... of 768 SNP markers were selected for GoldenGate genotyping on 181 individuals of the perennial ryegrass mapping population VrnA, which has been previously evaluated for important agronomic traits. A total of 692 (90%) of the 768 SNPs tested were successfully called. Of these, 96 (14%) did not reveal...
Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A; Michailidou, Kyriaki; Bolla, Manjeet K; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A; Loehberg, Christian R; Burwinkel, Barbara; Marme, Frederik; Hopper, John L; Southey, Melissa C; Bojesen, Stig E; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Van Dyck, Laurien; Nevelsteen, Ines; Couch, Fergus J; Olson, Janet E; Giles, Graham G; McLean, Catriona; Haiman, Christopher A; Henderson, Brian E; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A E M; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J; Martens, John W M; Cox, Angela; Cross, Simon S; Simard, Jacques; Dunning, Alison M; Easton, Douglas F; Pharoah, Paul D P; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K; Nevanlinna, Heli
In breast cancer, constitutive activation of NF-κB has been reported; however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI=3.3-14.4, P=1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI=0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas the rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses.
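The likelihood-ratio test mentioned above compares two nested models, one without and one with the interaction term. The following is a minimal sketch of that comparison, assuming the two fitted log-likelihoods are already available and that the models differ by exactly one parameter; the function name and the input values are hypothetical, and this is not the consortium's actual analysis code.

```python
import math


def lrt_one_df(loglik_reduced: float, loglik_full: float) -> tuple[float, float]:
    """Likelihood-ratio test for nested models that differ by one parameter.

    loglik_reduced: log-likelihood of the model without the interaction term.
    loglik_full:    log-likelihood of the model with the interaction term.
    Returns (test statistic, p-value).
    """
    # Test statistic: twice the gain in log-likelihood (clamped at zero).
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Chi-squared survival function with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_one_df(-100.0, -98.0792)
print(f"LRT statistic = {stat:.4f}, p = {p:.4f}")
```

A small p-value would suggest that the interaction term improves the fit beyond what the two SNPs explain separately.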
Full Text Available Abstract Background With the increasing availability of EST databases and whole genome sequences, SNPs have become the most abundant and powerful polymorphic markers. However, SNP chip data generally suffers from ascertainment biases caused by the SNP discovery and selection process in which a small number of individuals are used as discovery panels. The ongoing International Citrus Genome Consortium sequencing project of the highly heterozygous Clementine and sweet orange genomes will soon result in the release of several hundred thousand SNPs. The primary goals of this study were: (i) to estimate the transferability within the genus Citrus of SNPs discovered from Clementine BACend sequencing (BES), (ii) to estimate bias associated with the very narrow discovery panel, and (iii) to evaluate the usefulness of the Clementine-derived SNP markers for diversity analysis and comparative mapping studies between the different cultivated Citrus species. Results Fifty-four accessions covering the main Citrus species and 52 interspecific hybrids between pummelo and Clementine were genotyped on a GoldenGate array platform using 1,457 SNPs mined from Clementine BES and 37 SNPs identified between and within C. maxima, C. medica, C. reticulata and C. micrantha. Consistent results were obtained from 622 SNP loci. Of these markers, 116 displayed incomplete transferability primarily in C. medica, C. maxima and wild Citrus species. The two primary biases associated with the SNP mining in Clementine were an overestimation of the C. reticulata diversity and an underestimation of the interspecific differentiation. However, the genetic stratification of the gene pool was high, with very frequent significant linkage disequilibrium. Furthermore, the shared intraspecific polymorphism and accession heterozygosity were generally enough to perform interspecific comparative genetic mapping. Conclusions A set of 622 SNP markers providing consistent results was selected. Of the
Ollitrault, Patrick; Terol, Javier; Garcia-Lor, Andres; Bérard, Aurélie; Chauveau, Aurélie; Froelicher, Yann; Belzile, Caroline; Morillon, Raphaël; Navarro, Luis; Brunel, Dominique; Talon, Manuel
With the increasing availability of EST databases and whole genome sequences, SNPs have become the most abundant and powerful polymorphic markers. However, SNP chip data generally suffers from ascertainment biases caused by the SNP discovery and selection process in which a small number of individuals are used as discovery panels. The ongoing International Citrus Genome Consortium sequencing project of the highly heterozygous Clementine and sweet orange genomes will soon result in the release of several hundred thousand SNPs. The primary goals of this study were: (i) to estimate the transferability within the genus Citrus of SNPs discovered from Clementine BACend sequencing (BES), (ii) to estimate bias associated with the very narrow discovery panel, and (iii) to evaluate the usefulness of the Clementine-derived SNP markers for diversity analysis and comparative mapping studies between the different cultivated Citrus species. Fifty-four accessions covering the main Citrus species and 52 interspecific hybrids between pummelo and Clementine were genotyped on a GoldenGate array platform using 1,457 SNPs mined from Clementine BES and 37 SNPs identified between and within C. maxima, C. medica, C. reticulata and C. micrantha. Consistent results were obtained from 622 SNP loci. Of these markers, 116 displayed incomplete transferability primarily in C. medica, C. maxima and wild Citrus species. The two primary biases associated with the SNP mining in Clementine were an overestimation of the C. reticulata diversity and an underestimation of the interspecific differentiation. However, the genetic stratification of the gene pool was high, with very frequent significant linkage disequilibrium. Furthermore, the shared intraspecific polymorphism and accession heterozygosity were generally enough to perform interspecific comparative genetic mapping. A set of 622 SNP markers providing consistent results was selected. Of the markers mined from Clementine, 80.5% were successfully
杜艳; 贾文祥; 刘莉
The cholera epidemic is an important public health problem in many developing countries. Highly effective and preventive vaccines against cholera are under investigation as alternatives to the one available presently. Much of the vaccine research focuses on colonization factors. Colonization of a human by the Vibrio cholerae (V. cholerae) O1 strain is mediated by toxin-coregulated pilus (TCP), 1 which was shown to play a role in the infant mouse cholera model and subsequently in human volunteers. 2 TCP-loaded vaccines could potentially provide cross-protection among experimental strains. Data have indicated that poly (D,L-lactide)-polyethylene glycol copolymer (PELA) microparticles loaded with antigens were strongly immunogenic, 3 and that these microparticles served as an effective delivery system for a single dose of vaccine. 4 Microparticle formulation could represent the next generation of vaccines, as they are highly effective at delivery of vaccines, thus requiring fewer doses. 5 We prepared PELA microparticles loaded with TCP for testing as a vaccine; their targeting distributions were identified and related immune responses were analyzed.
Full Text Available Abstract Background There are several isolated tools for partial analysis of microarray expression data. To provide an integrative, easy-to-use and automated toolkit for the analysis of Affymetrix microarray expression data we have developed Array2BIO, an application that couples several analytical methods into a single web based utility. Results Array2BIO converts raw intensities into probe expression values, automatically maps those to genes, and subsequently identifies groups of co-expressed genes using two complementary approaches: (1) comparative analysis of signal versus control and (2) clustering analysis of gene expression across different conditions. The identified genes are assigned to functional categories based on Gene Ontology classification and KEGG protein interaction pathways. Array2BIO reliably handles low-expressor genes and provides a set of statistical methods for quantifying expression levels, including Benjamini-Hochberg and Bonferroni multiple testing corrections. An automated interface with the ECR Browser provides evolutionary conservation analysis for the identified gene loci while the interconnection with Crème allows prediction of gene regulatory elements that underlie observed expression patterns. Conclusion We have developed Array2BIO – a web based tool for rapid comprehensive analysis of Affymetrix microarray expression data, which also allows users to link expression data to Dcode.org comparative genomics tools and integrates a system for translating co-expression data into mechanisms of gene co-regulation. Array2BIO is publicly available at http://array2bio.dcode.org.
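The two multiple testing corrections named above follow standard textbook definitions, which can be sketched briefly. This is an illustrative implementation only, not Array2BIO's code; the function names and the example p-values are assumptions.

```python
def bonferroni(pvalues: list[float]) -> list[float]:
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(raw))
print("Benjamini-Hochberg:", benjamini_hochberg(raw))
```

Bonferroni controls the chance of any false positive and is the more conservative of the two; Benjamini-Hochberg controls the expected fraction of false positives among the genes called significant, which is often the more practical choice for expression screens.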
Li, Siwei; Wang, Qian; Wang, Yi; Chen, Xinmei; Wang, Zhixiang
It is well established that epidermal growth factor (EGF) induces cytoskeleton reorganization and cell migration through two major signaling cascades: phospholipase C-gamma1 (PLC-gamma1) and Rho GTPases. However, little is known about the cross talk between PLC-gamma1 and Rho GTPases. Here we showed that PLC-gamma1 forms a complex with Rac1 in response to EGF. This interaction is direct and mediated by PLC-gamma1 Src homology 3 (SH3) domain and Rac1 (106)PNTP(109) motif. This interaction is critical for EGF-induced Rac1 activation in vivo, and PLC-gamma1 SH3 domain is actually a potent and specific Rac1 guanine nucleotide exchange factor in vitro. We have also demonstrated that the interaction between PLC-gamma1 SH3 domain and Rac1 plays a significant role in EGF-induced F-actin formation and cell migration. We conclude that PLC-gamma1 and Rac1 coregulate EGF-induced cell cytoskeleton remodeling and cell migration by a direct functional interaction.
Wilson, Gabrielle R; Sim, Marcus L-J; Brody, Kate M; Taylor, Juliet M; McLachlan, Robert I; O'Bryan, Moira K; Delatycki, Martin B; Lockhart, Paul J
To investigate the potential role of PArkin co-regulated gene (PACRG) in human male infertility. Case-control study. Academic reproductive biology department. Blood samples were obtained from 610 patients and 156 normal control subjects. Genomic DNA was used as template for polymerase chain reaction amplification of the PACRG promoter and coding exons. The amplified fragments were tested for DNA sequence variations by direct sequencing and restriction enzyme analysis. Gene structure and sequence alterations of PACRG in infertile male patients. The structure of PACRG was determined to comprise 5 coding exons, generating a single transcript in the testis which encoded a predicted protein of 257 amino acids. No pathogenic mutations were identified; however, a variant in the promoter of PACRG was shown to be significantly associated with azoospermia, but not oligospermia, in the case-control cohort. Mutation of PACRG was not identified as a cause of male infertility, but variation in the promoter was demonstrated to be a risk factor associated with azoospermia. Copyright 2010 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
Full Text Available Chromosome 2 of Vibrio cholerae carries a chromosomal superintegron, composed of an integrase, a cassette integration site (attI) and an array of mostly promoterless gene cassettes. We determined the precise location of the promoter, Pc, which drives the transcription of the first cassettes of the V. cholerae superintegron. We found that cassette mRNA starts 65 bp upstream of the attI site, so that the inversely oriented promoters Pc and Pint (integrase promoter) partly overlap, allowing for their potential co-regulation. Pint was previously shown to be induced during the SOS response and is further controlled by the catabolite repression cAMP-CRP complex. We found that cassette expression from Pc was also controlled by the cAMP-CRP complex, but is not part of the SOS regulon. Pint and Pc promoters were both found to be induced in rich medium, at high temperature, high salinity and at the end of exponential growth phase, although at very different levels and independently of sigma factor RpoS. All these results show that expression from the integrase and cassette promoters can take place at the same time, thus leading to coordinated excisions and integrations within the superintegron and potentially coupling cassette shuffling to immediate selective advantage.
Kononenko, Olga; Galatenko, Vladimir; Andersson, Malin; Bazov, Igor; Watanabe, Hiroyuki; Zhou, Xing Wu; Iatsyshyna, Anna; Mityakina, Irina; Yakovleva, Tatiana; Sarkisyan, Daniil; Ponomarev, Igor; Krishtal, Oleg; Marklund, Niklas; Tonevitsky, Alex; Adkins, DeAnna L; Bakalkin, Georgy
Regulation of the formation and rewiring of neural circuits by neuropeptides may require coordinated production of these signaling molecules and their receptors that may be established at the transcriptional level. Here, we address this hypothesis by comparing absolute expression levels of opioid peptides with their receptors, the largest neuropeptide family, and by characterizing coexpression (transcriptionally coordinated) patterns of these genes. We demonstrated that expression patterns of opioid genes highly correlate within and across functionally and anatomically different areas. Opioid peptide genes, compared with their receptor genes, are transcribed at much greater absolute levels, which suggests formation of a neuropeptide cloud that covers the receptor-expressed circuits. Surprisingly, we found that both expression levels and the proportion of opioid receptors are strongly lateralized in the spinal cord, interregional coexpression patterns are side-specific, and intraregional coexpression profiles are affected differently by left- and right-side unilateral body injury. We propose that opioid genes are regulated as interconnected components of the same molecular system distributed between distinct anatomic regions. The striking feature of this system is its asymmetric coexpression patterns, which suggest side-specific regulation of selective neural circuits by opioid neurohormones.-Kononenko, O., Galatenko, V., Andersson, M., Bazov, I., Watanabe, H., Zhou, X. W., Iatsyshyna, A., Mityakina, I., Yakovleva, T., Sarkisyan, D., Ponomarev, I., Krishtal, O., Marklund, N., Tonevitsky, A., Adkins, D. L., Bakalkin, G. Intra- and interregional coregulation of opioid genes: broken symmetry in spinal circuits.
Liang Chen; Hong-Yu Zhao
Absolute or relative transcript amounts measured through high-throughput technologies (e.g., microarrays) are now commonly used in bioinformatics analysis, such as gene clustering and DNA binding motif finding. However, transcription rates that represent mRNA synthesis may be more relevant in these analyses. Because transcription rates are not equivalent to transcript amounts unless the mRNA degradation rates as well as other factors that affect transcript amount are identical across different genes, the use of transcription rates in bioinformatics analysis may lead to a better description of the relationships among genes and better identification of genomic signals. In this article, we propose to use experimentally measured mRNA decay rates and mRNA transcript amounts to jointly infer transcription rates, and then use the inferred transcription rates in downstream analyses. For gene expression similarity analysis, we find that there tends to be higher correlations among co-regulated genes when transcription-rate-based correlations are used compared to those based on transcript amounts. In the context of identifying DNA binding motifs, using inferred transcription rates leads to more significant findings than those based on transcript amounts. These analyses suggest that the incorporation of mRNA decay rates and the use of the inferred transcription rates can facilitate the study of gene regulations and the reconstruction of transcriptional regulatory networks.
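One simple way to see why decay rates matter is the steady-state relation: if the transcript amount X obeys dX/dt = r - kX and is at equilibrium, then the transcription rate is r = kX, with the decay constant k = ln(2) / half-life. The sketch below uses that relation; the steady-state assumption, the function name, and the example numbers are all assumptions made here for illustration and may differ from the inference procedure the authors actually use.

```python
import math


def transcription_rate(amount: float, half_life: float) -> float:
    """Steady-state transcription rate from transcript amount and mRNA half-life.

    Assumes first-order decay at equilibrium: rate = k * amount,
    where k = ln(2) / half_life.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    decay_constant = math.log(2) / half_life
    return decay_constant * amount


# Two hypothetical genes with the same transcript amount but different stability.
stable = transcription_rate(amount=100.0, half_life=40.0)
unstable = transcription_rate(amount=100.0, half_life=10.0)
print(f"stable mRNA:   rate = {stable:.3f} per minute")
print(f"unstable mRNA: rate = {unstable:.3f} per minute")
```

Under this assumption, two genes with identical measured transcript amounts can have very different synthesis rates, which is why correlations computed on rates can differ from those computed on amounts.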
Marit W. Vermunt
Full Text Available Understanding the complexity of the human brain and its functional diversity remains a major challenge. Distinct anatomical regions are involved in an array of processes, including organismal homeostasis, cognitive functions, and susceptibility to neurological pathologies, many of which define our species. Distal enhancers have emerged as key regulatory elements that acquire histone modifications in a cell- and species-specific manner, thus enforcing specific gene expression programs. Here, we survey the epigenomic landscape of promoters and cis-regulatory elements in 136 regions of the adult human brain. We identify a total of 83,553 promoter-distal H3K27ac-enriched regions showing global characteristics of brain enhancers. We use coregulation of enhancer elements across many distinct regions of the brain to uncover functionally distinct networks at high resolution and link these networks to specific neuroglial functions. Furthermore, we use these data to understand the relevance of noncoding genomic variations previously linked to Parkinson’s disease incidence.
Voorrips Roeland E
Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only. Results We have developed a new algorithm to detect reliable SNPs and insertions/deletions (indels) in EST data, both with and without quality files. Implemented in a pipeline called QualitySNP, it uses three filters for the identification of reliable SNPs. Filter 1 screens for all potential SNPs and identifies variation between or within genotypes. Filter 2 is the core filter that uses a haplotype-based strategy to detect reliable SNPs. Clusters with potential paralogs as well as false SNPs caused by sequencing errors are identified. Filter 3 screens SNPs by calculating a confidence score, based upon sequence redundancy and quality. Non-synonymous SNPs are subsequently identified by detecting open reading frames of consensus sequences (contigs) with SNPs. The pipeline includes a data storage and retrieval system for haplotypes, SNPs and alignments. QualitySNP's versatility is demonstrated by the identification of SNPs in EST datasets from potato, chicken and humans. Conclusion QualitySNP is an efficient tool for SNP detection, storage and retrieval in diploid as well as polyploid species. It is available for running on Linux or UNIX systems. The program, test data, and user manual are available at
Full Text Available Abstract Background High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays at different circumstances. Results We created a program (ClusterA) for calculating "Silhouette scores", and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found Silhouettes to provide a relevant measure for the quality of SNP assays at different reaction conditions, illustrated by the four DNA polymerases here. According to our result, the genotypes can be unequivocally assigned without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system. Conclusion "Silhouette scores" for assessing the quality of SNP genotyping clusters is convenient for evaluating the quality of SNP genotype
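The silhouette value of a data point is conventionally defined as s = (b - a) / max(a, b), where a is the mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster. The sketch below applies that textbook definition to hypothetical two-allele signal coordinates; it is not the ClusterA program, and how ClusterA condenses per-point values into one score per assay is assumed here to be a simple mean.

```python
import math
from collections import defaultdict


def silhouette_scores(points, labels):
    """Per-point silhouette values for points with assigned cluster labels."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        # Convention: singleton clusters (or a single cluster overall) score 0.
        if not own or len(clusters) < 2:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical allele-signal coordinates forming two tight genotype clusters.
signals = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
calls = ["AA", "AA", "BB", "BB"]
per_point = silhouette_scores(signals, calls)
print("assay score:", sum(per_point) / len(per_point))
```

With well-separated clusters every point scores close to 1.0; swapping labels between clusters drives the values negative, which is the behaviour the abstract relies on when it uses the score as a quality threshold.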
Krag, Kristian; Janss, Luc; Mahdi Shariati, Mohammad;
Heritability is a central element in quantitative genetics. New molecular markers to assess genetic variance and heritability are continually under development. The availability of molecular single nucleotide polymorphism (SNP) markers can be applied for estimation of variance components and heri...
Antonio M Ramos
Full Text Available BACKGROUND: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. METHODOLOGY/PRINCIPAL FINDINGS: A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain and a Wild Boar population) and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. CONCLUSIONS/SIGNIFICANCE: Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs.
Claudia D. Gherman
Full Text Available Objectives. We hypothesized that adiponectin gene SNP+45 (rs2241766) and SNP+276 (rs1501299) would be associated with atherosclerotic peripheral arterial disease (PAD). Furthermore, the association between circulating adiponectin levels, fetuin-A, and tumor necrosis factor-alpha (TNF-α) in patients with atherosclerotic peripheral arterial disease was investigated. Method. Several blood parameters (such as adiponectin, fetuin-A, and TNF-α) were measured in 346 patients, 226 with atherosclerotic peripheral arterial disease (PAD) and 120 without symptomatic PAD (non-PAD). Two common SNPs of the ADIPOQ gene represented by +45T/G 2 and +276G/T were also investigated. Results. Adiponectin concentrations showed lower circulating levels in the PAD patients compared to non-PAD patients (P0.05. Conclusion. The results of our study demonstrated that neither adiponectin SNP+45 nor SNP+276 is associated with the risk of PAD.
Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven
Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.
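For orientation, the classical Fisher method combines k independent p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis. The sketch below implements only this classical form; Pascal's modified variant is not reproduced here, and the function name and example values are assumptions.

```python
import math


def fisher_combined_pvalue(pvalues: list[float]) -> tuple[float, float]:
    """Classical Fisher combination of independent p-values (each in (0, 1]).

    Returns (test statistic, combined p-value).
    """
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    # Chi-squared survival function with 2k degrees of freedom has a closed
    # form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


# Hypothetical gene-level p-values belonging to one pathway.
stat, combined = fisher_combined_pvalue([0.05, 0.05])
print(f"Fisher statistic = {stat:.3f}, combined p = {combined:.5f}")
```

Because every gene contributes its full p-value rather than a yes/no membership call, no significance threshold has to be chosen before scoring the pathway, which is the property the abstract highlights.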
Full Text Available We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method.
Yamamura, T; Hikita, J; Bleakley, M; Hirosawa, T; Sato-Otsubo, A; Torikai, H; Hamajima, T; Nannya, Y; Demachi-Okamura, A; Maruya, E; Saji, H; Yamamoto, Y; Takahashi, T; Emi, N; Morishima, Y; Kodera, Y; Kuzushima, K; Riddell, S R; Ogawa, S; Akatsuka, Y
Minor histocompatibility (H) antigens are targets of graft-vs-host disease and graft-vs-tumor responses after human leukocyte antigen matched allogeneic hematopoietic stem cell transplantation. Recently, we reported a strategy for genetic mapping of linkage disequilibrium blocks that encoded novel minor H antigens using the large dataset from the International HapMap Project combined with conventional immunologic assays to assess recognition of HapMap B-lymphoid cell line by minor H antigen-specific T cells. In this study, we have constructed and provide an online interactive program and demonstrate its utility for searching for single-nucleotide polymorphisms (SNPs) responsible for minor H antigen generation. The website is available as 'HapMap SNP Scanner', and can incorporate T-cell recognition and other data with genotyping datasets from CEU, JPT, CHB, and YRI to provide a list of candidate SNPs that correlate with observed phenotypes. This method should substantially facilitate discovery of novel SNPs responsible for minor H antigens and be applicable for assaying of other specific cell phenotypes (e.g. drug sensitivity) to identify individuals who may benefit from SNP-based customized therapies.
Mooser, V; Waterworth, D M; Isenhour, T; Middleton, L
In the past pharmacological agents have contributed to a significant reduction in age-adjusted incidence of cardiovascular events. However, not all patients treated with these agents respond favorably, and some individuals may develop side-effects. With aging of the population and the growing prevalence of cardiovascular risk factors worldwide, it is expected that the demand for cardiovascular drugs will increase in the future. Accordingly, there is a growing need to identify the 'good' responders as well as the persons at risk for developing adverse events. Evidence is accumulating to indicate that responses to drugs are at least partly under genetic control. As such, pharmacogenetics - the study of variability in drug responses attributed to hereditary factors in different populations - may significantly assist in providing answers toward meeting this challenge. Pharmacogenetics mostly relies on associations between a specific genetic marker like single nucleotide polymorphisms (SNPs), either alone or arranged in a specific linear order on a certain chromosomal region (haplotypes), and a particular response to drugs. Numerous associations have been reported between selected genotypes and specific responses to cardiovascular drugs. Recently, for instance, associations have been reported between specific alleles of the apoE gene and the lipid-lowering response to statins, or the lipid-elevating effect of isotretinoin. Thus far, these types of studies have been mostly limited to a priori selected candidate genes due to restricted genotyping and analytical capacities. Thanks to the large number of SNPs now available in the public domain through the SNP Consortium and the newly developed technologies (high throughput genotyping, bioinformatics software), it is now possible to interrogate more than 200,000 SNPs distributed over the entire human genome. One pharmacogenetic study using this approach has been launched by GlaxoSmithKline to identify the approximately 4% of
Boom, Dirk Van Den; Beaulieu, Martin; Oeth, Paul; Roth, Rich; Honisch, Christiane; Nelson, Matthew R.; Jurinke, Christian; Cantor, Charles
Matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) mass spectrometry (MS) has been applied as a high-throughput platform technology for qualitative and quantitative nucleic acid analysis in the genetic discovery of target genes and their biological validation. Mass spectrometric methods for the elucidation of genetic variability and for subsequent large-scale genotyping of genetic markers are exemplified. The use of quantitative MALDI-TOF MS is described for large-scale validation of SNP markers and their analysis in DNA sample pools. Initial results of genome-wide association studies employing this technology are provided exemplifying a genetics-driven approach to drug discovery.
High-throughput sequencing (CHIP-Seq) data exhibit binding events with possible binding locations and their strengths, followed by interpretation of the locations of peaks. Recent methods tend to summarize all CHIP-Seq peaks detected within a limited upstream and downstream region of each gene into one real-valued score in order to quantify the probability of regulation in a region. Applying subspace clustering techniques to these scores can help discover important knowledge such as potential co-regulation or co-factor mechanisms. The ideal biclusters generated would contain subsets of genes and transcription factors (TFs) such that the cell values in biclusters are distributed around a mean value with very low variance. Such biclusters would indicate TF sets regulating gene sets with very similar probability values. However, most existing biclustering algorithms neither enforce low variance as the desired property of a bicluster, nor use variance as a guiding metric while searching for the desirable biclusters. In this paper we present an algorithm that searches a space of all overlapping biclusters organized in a lattice, and uses an upper bound on variance values of biclusters as the guiding metric. We show the algorithm to be an efficient and effective method for discovering the possibly overlapping biclusters under pre-defined variance bounds. We present our algorithm and its results with synthetic, CHIP-Seq and motif datasets, and compare them with the results obtained by other algorithms to demonstrate the power and effectiveness of our algorithm.
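The low-variance criterion described in this abstract can be illustrated with a minimal sketch. It only shows the acceptance test for a single candidate bicluster, not the lattice search itself; the score matrix, the function names and the bound of 0.001 are hypothetical and chosen purely for illustration.

```python
from statistics import pvariance


def bicluster_variance(scores, genes, tfs):
    """Population variance of the cells selected by gene rows and TF columns."""
    cells = [scores[g][t] for g in genes for t in tfs]
    return pvariance(cells)


def within_bound(scores, genes, tfs, max_variance):
    """True if the candidate bicluster respects the variance upper bound."""
    return bicluster_variance(scores, genes, tfs) <= max_variance


# Hypothetical gene-by-TF regulation scores (rows: genes, columns: TFs).
scores = [
    [0.90, 0.88, 0.10],
    [0.92, 0.91, 0.75],
    [0.20, 0.15, 0.80],
]

# Genes 0-1 with TFs 0-1 have nearly identical scores, so the bound holds.
print(within_bound(scores, [0, 1], [0, 1], 0.001))
# The full matrix mixes high and low scores, so the bound is violated.
print(within_bound(scores, [0, 1, 2], [0, 1, 2], 0.001))
```

A search procedure would call a test of this kind on many candidate gene and TF subsets; how the candidates are enumerated and pruned is specific to the published algorithm and is not reproduced here.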
Gachon, Claire M M; Langlois-Meurinne, Mathilde; Henry, Yves; Saindrenan, Patrick
The combined knowledge of the Arabidopsis genome and transcriptome now makes it possible to obtain an integrated view of the dynamics and evolution of metabolic pathways in plants. We used publicly available sets of microarray data obtained in a wide range of different stress and developmental conditions to investigate the co-expression of genes encoding enzymes of secondary metabolism pathways, in particular indoles, phenylpropanoids, and flavonoids. We performed hierarchical clustering of gene expression profiles and found that major enzymes of each pathway display a clear and robust co-expression throughout all the conditions studied. Moreover, detailed analysis showed that some genes display co-regulation in particular physiological conditions only, certainly reflecting their modular recruitment into stress- or developmentally regulated biosynthetic pathways. The combination of these microarray data with sequence analysis makes it possible to draw very precise hypotheses on the function of otherwise uncharacterized genes. To illustrate this approach, we focused our analysis on secondary metabolism glycosyltransferases (UGTs), a multigenic family involved in the conjugation of small molecules to sugars like glucose. We propose that UGT74B1 and UGT74C1 may be involved in aromatic and aliphatic glucosinolate synthesis, respectively. We also suggest that UGT75C1 may function as an anthocyanin-5-O-glucosyltransferase in planta. Therefore, this data-mining approach appears very powerful for the functional prediction of unknown genes, and could be transposed to virtually any other gene family. Finally, we suggest that analysis of expression pattern divergence of duplicated genes also provides some insight into the mechanisms of metabolic pathway evolution.
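Hierarchical clustering of expression profiles needs a pairwise distance between genes. The abstract does not say which measure was used, so the sketch below assumes a Pearson-correlation distance, a common choice for co-expression work. The profiles and gene labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (norm_x * norm_y)


def correlation_distance(x, y):
    """Assumed distance: 0 for perfectly co-expressed genes, 2 for opposite ones."""
    return 1.0 - pearson(x, y)


# Hypothetical expression values across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # rises with gene_a
gene_c = [4.0, 3.0, 2.0, 1.0]   # falls as gene_a rises

print(correlation_distance(gene_a, gene_b))
print(correlation_distance(gene_a, gene_c))
```

A matrix of such distances is what an agglomerative clustering routine would take as input; genes that stay close across all conditions would then fall into the same branch.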
BACKGROUND: MicroRNAs play important roles in various biological processes through fairly complex mechanisms. Analyses of genome-wide miRNA microarrays demonstrate that a single miRNA can regulate hundreds of genes, but the extent of regulation of most individual genes is surprisingly mild, so that it is difficult to understand how a miRNA provokes detectable functional changes with such mild regulation. RESULTS: To explore the internal mechanism of miRNA-mediated regulation, we re-analyzed the data collected from genome-wide miRNA microarrays with bioinformatic analysis, and found that the transfection of miR-181b and miR-34a in HeLa and HCT-116 tumor cells regulated large numbers of genes, among which the genes related to cell growth and cell death demonstrated high enrichment scores, suggesting that these miRNAs may be important in cell growth and cell death. MiR-181b induced changes in protein expression of most genes that were seemingly related to enhancing cell growth and decreasing cell death, while miR-34a mediated contrary changes of gene expression. Cell growth assays further confirmed this finding. In a further study on miR-20b-mediated osteogenesis in hMSCs, miR-20b was found to enhance osteogenesis by activating the BMPs/Runx2 signaling pathway at several stages by co-repressing PPARγ, Bambi and Crim1. CONCLUSIONS: With their multi-target characteristics, miR-181b, miR-34a and miR-20b provoked detectable functional changes by co-regulating functionally related gene groups or several genes in the same signaling pathway, and thus mild regulation of individual miRNA target genes could have contributed to an additive effect. This might also be one of the modes of miRNA-mediated gene regulation.
Mason, H S; Dewald, D B; Creelman, R A; Mullet, J E
The soybean vegetative storage protein genes vspA and vspB are highly expressed in developing leaves, stems, flowers, and pods as compared with roots, seeds, and mature leaves and stems. In this paper, we report that physiological levels of methyl jasmonate (MeJA) and soluble sugars synergistically stimulate accumulation of vsp mRNAs. Treatment of excised mature soybean (Glycine max Merr. cv Williams) leaves with 0.2 molar sucrose and 10 micromolar MeJA caused a large accumulation of vsp mRNAs, whereas little accumulation occurred when these compounds were supplied separately. In soybean cell suspension cultures, the synergistic effect of sucrose and MeJA on the accumulation of vspB mRNA was maximal at 58 millimolar sucrose and was observed with fructose or glucose substituted for sucrose. In dark-grown soybean seedlings, the highest levels of vsp mRNAs occurred in the hypocotyl hook, which also contained high levels of MeJA and soluble sugars. Lower levels of vsp mRNAs, MeJA, and soluble sugars were found in the cotyledons, roots, and nongrowing regions of the stem. Wounding of mature soybean leaves induced a large accumulation of vsp mRNAs when wounded plants were incubated in the light. Wounded plants kept in the dark or illuminated plants sprayed with dichlorophenyldimethylurea, an inhibitor of photosynthetic electron transport, showed a greatly reduced accumulation of vsp mRNAs. The time courses for the accumulation of vsp mRNAs induced by wounding or sucrose/MeJA treatment were similar. These results strongly suggest that vsp expression is coregulated by endogenous levels of MeJA (or jasmonic acid) and soluble carbohydrate during normal vegetative development and in wounded leaves.
Aslam, M.L.; Bastiaansen, J.W.M.; Elferink, M.G.; Megens, H.J.W.C.; Crooijmans, R.P.M.A.; Blomberg, L.; Fleischer, G.; Groenen, M.
Background The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world’s poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and
Esraa M. Hashem
Cancer represents one of the greatest medical causes of mortality. The majority of hepatocellular carcinomas arise from the accumulation of genetic abnormalities, possibly induced by external etiological factors, especially HCV and HBV infections. There is a need for new tools to analyze the large volume of data and to present relevant genetic changes that may be critical both for understanding how cancers develop and for determining how they could ultimately be treated. Gene expression profiling may lead to new biomarkers that may help improve diagnostic accuracy for detecting hepatocellular carcinoma. In this work, a statistical technique (the discrete stationary wavelet transform) is proposed for detecting copy number alterations in high-density single-nucleotide polymorphism array data from 30 cell lines, on specific chromosomes frequently implicated in hepatocellular carcinoma. The results demonstrate the feasibility of whole-genome fine mapping of copy number alterations via high-density single-nucleotide polymorphism genotyping. A novel altered chromosomal region was discovered: amplification of region 4q22.1 was detected in 22 of 30 hepatocellular carcinoma cell lines (73%). This region affects AFF1 and DSPP, tumor suppressor genes. This finding has not previously been reported to be involved in liver carcinogenesis; it can be used to discover a new HCC biomarker, which helps in a better understanding of hepatocellular carcinoma.
Background: Melon (Cucumis melo L.) is a highly diverse species that is cultivated worldwide. Recent advances in massively parallel sequencing have begun to allow the study of nucleotide diversity in this species. The Sanger method combined with medium-throughput 454 technology were used in a previous study to analyze the genetic diversity of germplasm representing 3 botanical varieties, yielding a collection of about 40,000 SNPs distributed in 14,000 unigenes. However, the usefulness of this...
Smits, B.M.; Guryev, V.; Zeegers, D.; Wedekind, D.; Hedrich, H.J.; Cuppen, E.
BACKGROUND: The laboratory rat (Rattus norvegicus) is an important model for studying many aspects of human health and disease. Detailed knowledge on genetic variation between strains is important from a biomedical, particularly pharmacogenetic point of view and useful for marker selection for genet
Hedrich Hans J; Wedekind Dirk; Zeegers Dimphy; Guryev Victor; Smits Bart MG; Cuppen Edwin
Abstract Background The laboratory rat (Rattus norvegicus) is an important model for studying many aspects of human health and disease. Detailed knowledge on genetic variation between strains is important from a biomedical, particularly pharmacogenetic point of view and useful for marker selection for genetic cloning and association studies. Results We show that Single Nucleotide Polymorphisms (SNPs) in commonly used rat strains are surprisingly well represented in wild rat isolates. Shotgun ...
Chen, Jin-Bor; Chuang, Li-Yeh; Lin, Yu-Da; Liou, Chia-Wei; Lin, Tsu-Kung; Lee, Wen-Chin; Cheng, Ben-Chung; Chang, Hsueh-Wei; Yang, Cheng-Hong
Single nucleotide polymorphism (SNP) interaction analysis can simultaneously evaluate the complex SNP interactions present in complex diseases. However, it is less commonly applied to evaluate the predisposition to chronic dialysis, and its computational analysis remains challenging. In this study, we aimed to improve the analysis of SNP-SNP interactions within the mitochondrial D-loop in chronic dialysis. The SNP-SNP interactions between 77 reported SNPs within the mitochondrial D-loop in a chronic dialysis study were evaluated in terms of SNP barcodes (different SNP combinations with their corresponding genotypes). We propose a genetic algorithm (GA) to generate SNP barcodes. The χ² values were then calculated from the occurrences of the specific SNP barcodes and their non-specific combinations between cases and controls. Each SNP barcode (2- to 7-SNP) with the highest value in the χ² test was regarded as the best SNP barcode (11.304 to 23.310; p [...] algorithm to address the SNP-SNP interactions and demonstrated that many non-significant SNPs within the mitochondrial D-loop may play a role in joint effects on chronic dialysis susceptibility.
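The χ² scoring of a SNP barcode can be sketched as a 2x2 comparison: carriers versus non-carriers of one barcode, in cases versus controls. The abstract does not give the exact table layout, so this layout, the use of the uncorrected Pearson statistic, and the counts below are all assumptions made for illustration.

```python
def chi_square_2x2(case_with, case_without, control_with, control_without):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: 30 of 100 cases and 10 of 100 controls carry the barcode.
print(chi_square_2x2(30, 70, 10, 90))
# Identical carrier rates in both groups give a statistic of zero.
print(chi_square_2x2(20, 80, 20, 80))
```

In a genetic algorithm of the kind the abstract proposes, a statistic like this could serve as the fitness of each candidate barcode, with the highest-scoring barcode of each size retained.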
Fadista, João; Bendixen, Christian
The field of genetics has come to rely heavily on commercial genotyping arrays and accompanying annotations for insights into genotype-phenotype associations. However, in order to avoid errors and false leads, it is imperative that the annotation of SNP chromosomal positions is accurate and unambiguous. We report on genomic positional discrepancies of various SNP chips for human, cattle and mouse species, and discuss their causes and consequences.
Cornelis, Senne; Gansemans, Yannick; Deleye, Lieselot; Deforce, Dieter; Van Nieuwerburgh, Filip
One of the latest developments in next generation sequencing is the Oxford Nanopore Technologies’ (ONT) MinION nanopore sequencer. We studied the applicability of this system to perform forensic genotyping of the forensic female DNA standard 9947 A using the 52 SNP-plex assay developed by the SNPforID consortium. All but one of the loci were correctly genotyped. Several SNP loci were identified as problematic for correct and robust genotyping using nanopore sequencing. All these loci contained homopolymers in the sequence flanking the forensic SNP and most of them were already reported as problematic in studies using other sequencing technologies. When these problematic loci are avoided, correct forensic genotyping using nanopore sequencing is technically feasible. PMID:28155888
Tuan Anh Pham
Hybrid nanoparticle (NP) structures containing organic building units such as polymers, peptides, DNA and proteins have great potential in biosensor and electronic applications. The nearly free modification of the polymer chain, the variation of the protein and DNA sequence and the implementation of functional moieties provide a great platform to create inorganic structures of different morphology, resulting in different optical and magnetic properties. Nevertheless, the design and modification of a protein structure with functional groups or sequences for the assembly of biohybrid materials is not trivial. This is mainly due to the sensitivity of its secondary, tertiary and quaternary structure to changes in the interactions (e.g., hydrophobic, hydrophilic, electrostatic, chemical groups) between the protein subunits and the inorganic material. Here, we use hemolysin coregulated protein 1 (Hcp1) from Pseudomonas aeruginosa as a building and gluing unit for the formation of biohybrid structures by implementing cysteine anchoring points at defined positions on the protein rim (Hcp1_cys3). We successfully apply the Hcp1_cys3 gluing unit for the assembly of often linear, hybrid structures of plasmonic gold (Au) NPs, magnetite (Fe3O4) NPs, and cobalt ferrite (CoFe2O4) NPs. Furthermore, the assembly of Au NPs into linear structures using Hcp1_cys3 is investigated by UV–vis spectroscopy, TEM and cryo-TEM. One key parameter for the formation of the Au NP assembly is the specific ionic strength in the mixture. The resulting network-like structure of Au NPs is characterized by Raman spectroscopy, showing surface-enhanced Raman scattering (SERS) by a factor of 8·10^4 and a stable secondary structure of the Hcp1_cys3 unit. In order to prove the catalytic performance of the gold hybrid structures, they are used as a catalyst in the reduction reaction of 4-nitrophenol, showing similar catalytic activity to the pure Au NPs. To further extend the
Han, Wen; Jones, Frank E.
unaffected by loss of HER4 expression. In summary, we demonstrate for the first time that a cell surface receptor functions as an obligate ER coactivator with functional specificity associated with breast tumor cell proliferation and cell cycle progression. Nearly 90% of ER positive tumors coexpress HER4; therefore, we predict that the majority of breast cancer patients would benefit from a strategy to therapeutically disengage ER/4ICD coregulated tumor cell proliferation.
Wang, Kainan; Degerny, Cindy; Xu, Minghong; Yang, Xiang-Jiao
How extracellular cues are transduced to the nucleus is a fundamental issue in biology. The paralogous WW-domain proteins YAP (Yes-associated protein) and TAZ (transcriptional coactivator with PDZ-binding motif; also known as WWTR1, for WW-domain containing transcription regulator 1) constitute a pair of transducers linking cytoplasmic signaling events to transcriptional regulation in the nucleus. A cascade composed of mammalian Ste20-like (MST) and large tumor suppressor (LATS) kinases directs multisite phosphorylation, promotes 14-3-3 binding, and hinders nuclear import of YAP and TAZ, thereby inhibiting their transcriptional coactivator and growth-promoting activities. A similar cascade regulates the trafficking and function of Yorkie, the fly orthologue of YAP. Mammalian YAP and TAZ are expressed in various tissues and serve as coregulators for transcriptional enhancer factors (TEFs; also referred to as TEADs, for TEA-domain proteins), runt-domain transcription factors (Runxs), peroxisome proliferator-activated receptor gamma (PPARgamma), T-box transcription factor 5 (Tbx5), and several others. YAP and TAZ play distinct roles during mouse development. Both, and their upstream regulators, are intimately linked to tumorigenesis and other pathogenic processes. Here, we review studies on this family of signal-responsive transcriptional coregulators and emphasize how relative sequence conservation predicates their function and regulation, to provide a conceptual framework for organizing available information and seeking new knowledge about these signal transducers.
Deborah L. Butler
This paper reports findings from a longitudinal project in which secondary teachers were working collaboratively to support adolescents' self-regulated learning through reading (LTR) in subject-area classrooms. We build from prior research to “connect the dots” between teachers' engagement in self- and co-regulated inquiry, associated shifts in classroom practice, and student self-regulation. More specifically, we investigated whether and how teachers working within a community of inquiry were mobilizing research to shape classroom practice and advance student learning. Drawing on evidence from 18 teachers and their respective classrooms, we describe findings related to the following research questions: (1) While engaged in self- and co-regulated inquiry, what types of practices did teachers enact to support LTR in their subject-area classrooms? (2) How did teachers draw on research-based resources to inform practice development? (3) What kinds of practices could be associated with gains in students' self-regulated LTR? In our discussion, we highlight contributions to understanding how teachers can be supported to situate research in authentic classroom environments and about qualities of practices supportive of students' self-regulated LTR. We also identify limitations of this work and important future directions.
Toropainen, Sari; Malinen, Marjo; Kaikkonen, Sanna; Rytinki, Miia; Jääskeläinen, Tiina; Sahu, Biswajyoti; Jänne, Olli A; Palvimo, Jorma J
Androgen receptor (AR) is a ligand-activated transcription factor that plays a central role in the development and growth of prostate carcinoma. PIAS1 is an AR- and SUMO-interacting protein and a putative transcriptional coregulator overexpressed in prostate cancer. To study the importance of PIAS1 for the androgen-regulated transcriptome of VCaP prostate cancer cells, we silenced its expression by RNAi. Transcriptome analyses revealed that a subset of the AR-regulated genes is significantly influenced, either activated or repressed, by PIAS1 depletion. Interestingly, PIAS1 depletion also exposed a new set of genes to androgen regulation, suggesting that PIAS1 can mask distinct genomic loci from AR access. In keeping with gene expression data, silencing of PIAS1 attenuated VCaP cell proliferation. ChIP-seq analyses showed that PIAS1 interacts with AR at chromatin sites harboring also SUMO2/3 and surrounded by H3K4me2; androgen exposure increased the number of PIAS1-occupying sites, resulting in nearly complete overlap with AR chromatin binding events. PIAS1 interacted also with the pioneer factor FOXA1. Of note, PIAS1 depletion affected AR chromatin occupancy at binding sites enriched for HOXD13 and GATA motifs. Taken together, PIAS1 is a genuine chromatin-bound AR coregulator that functions in a target gene selective fashion to regulate prostate cancer cell growth. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M
The advances in and decreasing cost of whole genome sequencing (WGS) will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic trees from WGS data. Here we introduce snpTree, a server for online automatic SNP analysis. This tool is composed of different SNP analysis suites and perl and python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS data as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA, while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on their position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a perl script. The online server was implemented with HTML, Java and python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the latter case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy-to-use option for rapid, standardised and automatic SNP analysis in epidemiological studies, also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
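The step of concatenating SNPs by reference position can be sketched as follows. This is not snpTree's code: the function name, the 0-based coordinates, and the rule that an isolate without a call at a position takes the reference base are assumptions made for illustration, and the sequences are hypothetical.

```python
def concatenate_snps(reference, calls):
    """Build one SNP string per isolate, ordered by reference position.

    reference: reference sequence as a string (0-based positions).
    calls: {isolate_name: {position: variant_base}}.
    Assumption: an isolate without a call at a position carries the reference base.
    """
    positions = sorted({pos for variants in calls.values() for pos in variants})
    alignment = {
        name: "".join(variants.get(pos, reference[pos]) for pos in positions)
        for name, variants in calls.items()
    }
    return positions, alignment


# Hypothetical reference and per-isolate SNP calls.
reference = "ACGTACGTAC"
calls = {
    "iso1": {2: "T", 7: "A"},
    "iso2": {2: "T"},
    "iso3": {5: "G"},
}

positions, alignment = concatenate_snps(reference, calls)
print(positions)
for name, snps in alignment.items():
    print(f">{name}\n{snps}")  # FASTA-style records, usable as tree-building input
```

Every isolate ends up with a string of the same length, one column per variable site, which is the shape of input a tree-building program expects. Real pipelines also have to handle uncovered positions and filtered calls, which this sketch ignores.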
Nielsen, Rasmus; Williamson, Scott; Kim, Yuseob
of the selection coefficient. To illustrate the method, we apply our approach to data from the Seattle SNP project and to Chromosome 2 data from the HapMap project. In Chromosome 2, the most extreme signal is found in the lactase gene, which previously has been shown to be undergoing positive selection. Evidence...
Børsting, Claus; Sanchez Sanchez, Juan Jose; Morling, Niels
We describe a single nucleotide polymorphism (SNP) typing protocol developed for the NanoChip electronic microarray. The NanoChip array consists of 100 electrodes covered by a thin hydrogel layer containing streptavidin. An electric current can be applied to one, several, or all electrodes...
Andersen, Jeppe Dyrberg; Tvedebrink, Torben; Mogensen, Helle Smidt
-out of true alleles is possible. As part of the validation of the IrisPlex assay in our ISO17025 accredited, forensic genetic laboratory, we estimated the probability of drop-out of specific SNP alleles using 29 and 30 PCR cycles and 25, 50 and 100 Single Base Extension (SBE) cycles. We observed no drop...
Abstract Background Single nucleotide polymorphisms (SNPs) are the most common genetic variations in the human genome and are useful as genomic markers. Oligonucleotide SNP microarrays have been developed for high-throughput genotyping of up to 900,000 human SNPs and have been used widely in linkage and cancer genomics studies. We have previously used Hidden Markov Models (HMM) to analyze SNP array data for inferring copy numbers and loss-of-heterozygosity (LOH) from paired normal and tumor samples and unpaired tumor samples. Results We proposed and implemented major copy proportion (MCP) analysis of oligonucleotide SNP array data. An HMM was constructed to infer unobserved MCP states from observed allele-specific signals through emission and transition distributions. We used 10 K, 100 K and 250 K SNP array datasets to compare MCP analysis with LOH and copy number analysis, and showed that MCP performs better than LOH analysis for allelic-imbalanced chromosome regions and normal-contaminated samples. The major and minor copy alleles can also be inferred from allelic-imbalanced regions by MCP analysis. Conclusion MCP extends tumor LOH analysis to allelic imbalance analysis and supplies complementary information to total copy numbers. MCP analysis of mixed normal and tumor samples suggests the utility of MCP analysis of normal-contaminated tumor samples. The described analysis and visualization methods are readily available in the user-friendly dChip software.
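The abstract does not spell out how MCP is computed. The sketch below assumes the natural reading of the name, that MCP is the major-allele copy number divided by the total copy number, and adds a simple mixing model for normal contamination in which normal cells contribute one copy of each allele. Both functions and all values are illustrative assumptions, not the dChip implementation.

```python
def major_copy_proportion(major, minor):
    """Assumed definition: copies of the major allele divided by total copies."""
    total = major + minor
    if total <= 0:
        raise ValueError("total copy number must be positive")
    if minor > major:
        raise ValueError("the major allele must have at least as many copies")
    return major / total


def observed_mcp(tumor_major, tumor_minor, tumor_fraction):
    """MCP of a mixed sample, assuming normal cells carry one copy of each allele."""
    normal_fraction = 1.0 - tumor_fraction
    major = tumor_fraction * tumor_major + normal_fraction * 1.0
    minor = tumor_fraction * tumor_minor + normal_fraction * 1.0
    return major_copy_proportion(major, minor)


print(major_copy_proportion(1, 1))   # balanced heterozygous region
print(major_copy_proportion(2, 0))   # copy-neutral LOH in a pure tumor
print(observed_mcp(2, 0, 0.5))       # same LOH diluted by 50% normal cells
```

Under these assumptions, contamination pulls the value of an LOH region away from 1 and towards 0.5 without erasing the imbalance, which is consistent with the abstract's point that MCP remains informative for normal-contaminated samples where a binary LOH call may not.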
Prokhorenko, Igor A.; Astakhova, Irina V.; Momynaliev, Kuvat T.
Excimer formation is a unique feature of some fluorescent dyes (e.g., pyrene) which can be used for probing the proximity of biomolecules. Pyrene excimer fluorescence has previously been used for homogeneous detection of single nucleotide polymorphism (SNP) on DNA. 1-Phenylethynylpyrene (1-1-PEPy...
Abstract Background The recent development of new high-throughput technologies for SNP genotyping has opened the possibility of taking a genome-wide linkage approach to the search for new candidate genes involved in hereditary diseases. The two major breast cancer susceptibility genes BRCA1 and BRCA2 are involved in 30% of hereditary breast cancer cases, but the discovery of additional breast cancer predisposition genes for the non-BRCA1/2 breast cancer families has so far been unsuccessful. Results In order to evaluate the power improvement provided by using SNP markers in a real situation, we have performed a whole genome screen of 19 non-BRCA1/2 breast cancer families using 4720 genomewide SNPs with Illumina technology (Illumina's Linkage III Panel), with an average distance of 615 Kb/SNP. We identified six regions on chromosomes 2, 3, 4, 7, 11 and 14 as candidates to contain genes involved in breast cancer susceptibility, and additional fine mapping genotyping using microsatellite markers around linkage peaks confirmed five of them, excluding the region on chromosome 3. These results were consistent in analyses that excluded SNPs in high linkage disequilibrium. The results were compared with those obtained previously using a 10 cM microsatellite scan (STR-GWS), and we found lower or non-significant linkage signals with STR-GWS data compared to SNP data in all cases. Conclusion Our results show the power increase that SNPs can supply in linkage studies.
Arnaiz-Villena, Antonio; Fernández-Honrado, Mercedes; Rey, Diego; Enríquez-de-Salamanca, Mercedes; Abd-El-Fatah-Khalil, Sedeka; Arribas, Ignacio; Coca, Carmen; Algora, Manuel; Areces, Cristina
Adiponectin gene polymorphisms SNP45 and SNP276 have been related to metabolic syndrome (MS) and related pathologies, including obesity. However, reported associations are contradictory depending on the population studied. In the present study, these adiponectin SNPs are studied for the first time in Amerindians. Allele frequencies are obtained and compared with obesity and other MS-related parameters. Amerindians were also defined by characteristic HLA genes. Our main results are: (1) SNP276 T is associated with low diastolic blood pressure in Amerindians, (2) the SNP45 G allele is correlated with obesity in female but not in male Amerindians, (3) the SNP45/SNP276 T/G haplotype in total obese/non-obese subjects tends to show a linkage with non-obese Amerindians, (4) the SNP45/SNP276 T/T haplotype is linked to obese Amerindian males. Also, a world population study is carried out, finding that the SNP45 T and SNP276 T alleles are the most frequent in African Blacks and are found at significantly lower frequencies in Europeans and Asians. This, together with the fact that there is a linkage of this haplotype to obese Amerindian males, suggests that evolutionary forces related to famine (or population density in relation to available food) may have shaped world population adiponectin polymorphism frequencies.
Bhatt, Shantanu; Anyanful, Akwasi; Kalman, Daniel
Enteropathogenic Escherichia coli (EPEC) requires the tnaA-encoded enzyme tryptophanase and its substrate tryptophan to synthesize diffusible exotoxins that kill the nematode Caenorhabditis elegans. Here, we demonstrate that the RNA-binding protein CsrA and the tryptophan permease TnaB coregulate tryptophanase activity, through mutually exclusive pathways, to stimulate toxin-mediated paralysis and killing of C. elegans.
Wang, Si; Houtman, René; Melchers, Diana; Aarts, Jac; Peijnenburg, Ad; van Beuningen, Rinie; Rietjens, Ivonne; Bovee, Toine F
To further develop an integrated in vitro testing strategy for replacement of in vivo tests for (anti-)estrogenicity testing, the ligand-modulated interaction of coregulators with estrogen receptor α was assessed using a PamChip® plate. The relative estrogenic potencies determined, based on ERα binding to coregulator peptides in the presence of ligands on the PamChip® plate, were compared to the relative estrogenic potencies as determined in the in vivo uterotrophic assay. The results show that the estrogenic potencies predicted by the 57 coactivators on the peptide microarray for 18 compounds that display a clear E2 dose-dependent response (goodness of fit of a logistic dose-response model of 0.90 or higher) correlated very well with their in vivo potencies in the uterotrophic assay, i.e., coefficient of determination values for 30 coactivators higher than or equal to 0.85. Moreover, this coregulator binding assay is able to distinguish ER agonists from ER antagonists: profiles of selective estrogen receptor modulators, such as tamoxifen, were distinct from those of pure ER agonists, such as dienestrol. Combination of this coregulator binding assay with other types of in vitro assays, e.g., reporter gene assays and the H295R steroidogenesis assay, will frame an in vitro test panel for screening and prioritization of chemicals, thereby contributing to the reduction and ultimately the replacement of animal testing for (anti-)estrogenic effects.
Aerts, J.; Wetzels, Y.; Cohen, N.; Aerssens, J.
Different strategies to search public single nucleotide polymorphism (SNP) databases for intragenic SNPs were evaluated. First, we assembled a strategy to annotate SNPs onto candidate genes based on a BLAST search of public SNP databases (Intragenic SNP Annotation by BLAST, ISAB). Only BLAST hits th
Bulens, J.D.; Vullings, L.A.E.; Houtkamp, J.M.; Vanmeulebrouk, B.
As INSPIRE progresses to be implemented in the EU, many new discovery portals are built to facilitate finding spatial data. Currently the structure of the discovery portals is determined by the way spatial data experts like to work. However, we argue that the main target group for discovery portals
Roden, Suzanne E; Dutton, Peter H; Morin, Phillip A
The green sea turtle, Chelonia mydas, was used as a case study for single nucleotide polymorphism (SNP) discovery in a species that has little genetic sequence information available. As green turtles have a complex population structure, additional nuclear markers other than microsatellites could add to our understanding of their complex life history. Amplified fragment length polymorphism technique was used to generate sets of random fragments of genomic DNA, which were then electrophoretically separated with precast gels, stained with SYBR green, excised, and directly sequenced. It was possible to perform this method without the use of polyacrylamide gels, radioactive or fluorescent labeled primers, or hybridization methods, reducing the time, expense, and safety hazards of SNP discovery. Within 13 loci, 2547 base pairs were screened, resulting in the discovery of 35 SNPs. Using this method, it was possible to yield a sufficient number of loci to screen for SNP markers without the availability of prior sequence information.
Bekkevold, Dorte; Limborg, Morten; Helyar, Sarah;
Atlantic herring (Clupea harengus) exhibit biocomplexity, with widespread, geographically explicit populations that perform long‐range migration to common feeding and wintering areas, where they are exploited by fisheries. This means that exploited stocks do not describe discrete units, thereby complicating stock assessment and management. It is therefore of management interest to trace individual population migration patterns and contributions to fisheries. To underpin management and to develop a validated tool for traceability of individuals from mixed‐stock samples we applied single nucleotide polymorphism (SNP) markers in Northeast Atlantic herring population samples. Marker panels were targeted to include gene‐associated loci to maximize statistical resolution. Application of 281 SNP markers to samples representing different levels of stock complexity showed that the regional origin of individual...
Martin W Ganal; Andreas Polley; Eva-Maria Graner; Joerg Plieske; Ralf Wieseke; Hartmut Luerssen; Gregor Durstewitz
Genotyping with large numbers of molecular markers is now an indispensable tool within plant genetics and breeding. Especially through the identification of large numbers of single nucleotide polymorphism (SNP) markers using the novel high-throughput sequencing technologies, it is now possible to reliably identify many thousands of SNPs at many different loci in a given plant genome. For a number of important crop plants, SNP markers are now being used to design genotyping arrays containing thousands of markers spread over the entire genome and to analyse large numbers of samples. In this article, we discuss aspects that should be considered during the design of such large genotyping arrays and the analysis of individuals. The fact that crop plants are also often autopolyploid or allopolyploid is given due consideration. Furthermore, we outline some potential applications of large genotyping arrays including high-density genetic mapping, characterization (fingerprinting) of genetic material and breeding-related aspects such as association studies and genomic selection.
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
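To make the central quantity of this abstract concrete, the following Python sketch computes a site frequency spectrum from a small set of haplotypes: entry k counts the sites at which exactly k sampled chromosomes carry the derived allele. The function names and the 0/1 input encoding are assumptions chosen for illustration; they are not taken from the framework described above, which additionally performs simulation-based composite-likelihood inference that is not shown here.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS as a list of length n + 1.

    `haplotypes` is a list of n equal-length sequences of 0/1 values
    (0 = ancestral allele, 1 = derived allele). This input format is a
    hypothetical convention used only for this sketch.
    """
    if not haplotypes:
        return []
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    # Count derived alleles at each site, then tally sites per count.
    counts = Counter(sum(h[s] for h in haplotypes) for s in range(n_sites))
    return [counts.get(k, 0) for k in range(n + 1)]


def fold(sfs):
    """Fold an unfolded SFS into minor-allele-count classes."""
    n = len(sfs) - 1
    folded = []
    for k in range(n // 2 + 1):
        if k == n - k:
            folded.append(sfs[k])
        else:
            folded.append(sfs[k] + sfs[n - k])
    return folded
```

The folded form is the one to use when the ancestral state of each SNP is unknown, since it only distinguishes minor from major alleles.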
Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A
Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool.
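The filtering step described in this abstract (keep homozygous SNPs, drop heterozygous calls, MNPs and InDels) can be sketched in a few lines of Python. This is a minimal illustration that assumes tab-separated VCF data lines with a `GT` entry in the FORMAT column; the function names are hypothetical and the code is not taken from iBrowser itself.

```python
BASES = {"A", "C", "G", "T"}


def is_homozygous_snp(vcf_line, sample_index=0):
    """Return True if the record is a biallelic SNP that is homozygous
    for the alternate allele in the chosen sample.

    Assumes a standard tab-separated VCF data line (hypothetical helper,
    not part of iBrowser).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) <= 9 + sample_index:
        return False  # no genotype column for this sample
    ref, alt, fmt = fields[3], fields[4], fields[8]
    # Single-base REF and ALT only: this excludes MNPs, InDels,
    # multi-allelic sites ("A,T") and symbolic alleles.
    if ref.upper() not in BASES or alt.upper() not in BASES:
        return False
    keys = fmt.split(":")
    if "GT" not in keys:
        return False
    values = fields[9 + sample_index].split(":")
    gt_pos = keys.index("GT")
    if gt_pos >= len(values):
        return False
    # Treat phased ("|") and unphased ("/") genotypes alike.
    alleles = values[gt_pos].replace("|", "/").split("/")
    return alleles == ["1", "1"]


def filter_homozygous_snps(lines, sample_index=0):
    """Yield only the data lines that pass is_homozygous_snp."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip VCF header lines
        if is_homozygous_snp(line, sample_index):
            yield line
```

Homozygous-reference calls (`0/0`) are also dropped in this sketch because they carry no variant for the sample; whether the original tool treats them the same way is not stated in the abstract.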
Background: Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on LD between SNP pairs. We use the Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. Essentially, this model is reduced to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm. Results: In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. Our algorithm performed better than previously reported models in a case-control association study aimed at mapping a single locus trait, based on simulation results that were evaluated by a block-based statistical test. Compared to methods of haplotype block partitioning, we performed best on detection of recombination hotspots. Conclusion: Our proposed method divides chromosomes into the regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.
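The pairwise test at the heart of this abstract can be illustrated with a short Python sketch: build the 2x2 haplotype table for a SNP pair, compute the LD coefficient D, and assess it with a two-sided Fisher's exact test. The function names and the 0/1 allele coding are assumptions made for this example. The abstract does not give the thresholds that separate "associated", "independent" and "not-statistically-significant" pairs, so the sketch stops at the p-value and does not attempt that three-way classification or the dynamic-programming partitioning.

```python
from math import comb


def pair_table(haplotypes):
    """Count two-locus haplotypes for one SNP pair.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) tuples with
    alleles coded 0/1 (hypothetical encoding for this sketch).
    Returns (a, b, c, d) = counts of (0,0), (0,1), (1,0), (1,1).
    """
    a = sum(1 for h in haplotypes if h == (0, 0))
    b = sum(1 for h in haplotypes if h == (0, 1))
    c = sum(1 for h in haplotypes if h == (1, 0))
    d = sum(1 for h in haplotypes if h == (1, 1))
    return a, b, c, d


def ld_coefficient(a, b, c, d):
    """Return D = p(0,0) - p(0,.) * p(.,0) for the 2x2 table."""
    n = a + b + c + d
    return a / n - ((a + b) / n) * ((a + c) / n)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for a 2x2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    denom = comb(n, c1)

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

For example, ten haplotypes in perfect LD (five `(0, 0)` and five `(1, 1)`) give D = 0.25 and a small p-value, whereas a perfectly balanced table gives D = 0 and p = 1.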
Katherine A Dick Krueger
The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped there has been a trend away from studying single-gene disorders. In part, this is due to the fact that many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold-mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid-cell lines to determine if high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid-cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families.
Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC companies offer the analysis of a large number of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, these data are not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide area of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under MIT-license at http://github.com/gedankenstuecke/snpr.
Strain-specific genomic diversity in the Mycobacterium tuberculosis complex (MTBC) is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Several systems have been proposed to classify MTBC strains into distinct lineages and families. Here, we investigate single-nucleotide polymorphisms (SNPs) as robust (stable) markers of genetic variation for phylogenetic analysis. We identify ∼92k SNPs across a global collection of 1,601 genomes. The SNP-based phylogeny is consistent with the gold-standard regions of difference (RD) classification system. Of the ∼7k strain-specific SNPs identified, 62 markers are proposed to discriminate known circulating strains. This SNP-based barcode is the first to cover all main lineages, and classifies a greater number of sublineages than current alternatives. It may be used to classify clinical isolates to evaluate tools to control the disease, including therapeutics and vaccines whose effectiveness may vary by strain type.
The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.
A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery, that is, searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary. An implementation of
Recent high-throughput transcript discoveries have yielded a growing recognition of long intergenic non-coding RNAs (lincRNAs), a class of arbitrarily defined transcripts (>200 nt) that are primarily produced from the intergenic space. LincRNAs have been increasingly acknowledged for their expressional dynamics and likely functional associations with cancers. However, differential gene dosage of lincRNA genes between cancer genomes is less studied. By using the high-density Human Omni5-Quad BeadChips (Illumina), we investigated genomic copy number aberrations in a set of seven tumor-normal paired primary human mammary epithelial cells (HMECs) established from patients with invasive ductal carcinoma. This BeadChip platform includes a total of 2,435,915 SNP loci dispersed at an average interval of ~700 nt throughout the intergenic region of the human genome. We mapped annotated or putative lincRNA genes to a subset of 332,539 SNP loci, which were included in our analysis for lincRNA-associated copy number variations (CNV). We have identified 122 lincRNAs, which were affected by somatic CNV with overlapped aberrations ranging from 0.14% to 100% in length. LincRNA-associated aberrations were detected predominantly with copy number losses and preferential clustering to the ends of chromosomes. Interestingly, lincRNA genes appear to be much less susceptible to CNV in comparison to both protein-coding and intergenic regions (CNV affected segments in percentage: 1.8%, 37.5% and 60.6%, respectively). In summary, our study established a novel approach utilizing high-resolution SNP array to identify lincRNA candidates, which could functionally link to tumorigenesis, and provide new strategies for the diagnosis and treatment of breast cancer.
Carole F S Koning-Boucoiran
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
Sarachana, Tewarit; Hu, Valerie W
Our independent cohort studies have consistently shown the reduction of the nuclear receptor RORA (retinoic acid-related orphan receptor-alpha) in lymphoblasts as well as in brain tissues from individuals with autism spectrum disorder (ASD). Moreover, we have found that RORA regulates the gene for aromatase, which converts androgen to estrogen, and that male and female hormones regulate RORA in opposite directions, with androgen suppressing RORA, suggesting that the sexually dimorphic regulation of RORA may contribute to the male bias in ASD. However, the molecular mechanisms through which androgen and estrogen differentially regulate RORA are still unknown. Here we use functional knockdown of hormone receptors and coregulators with small interfering RNA (siRNA) to investigate their involvement in sex hormone regulation of RORA in human neuronal cells. Luciferase assays using a vector containing various RORA promoter constructs were first performed to identify the promoter regions required for inverse regulation of RORA by male and female hormones. Sequential chromatin immunoprecipitation methods followed by quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR) analyses of RORA expression in hormone-treated SH-SY5Y cells were then utilized to identify coregulators that associate with hormone receptors on the RORA promoter. siRNA-mediated knockdown of interacting coregulators was performed followed by qRT-PCR analyses to confirm the functional requirement of each coregulator in hormone-regulated RORA expression. Our studies demonstrate the direct involvement of androgen receptor (AR) and estrogen receptor (ER) in the regulation of RORA by male and female hormones, respectively, and that the promoter region between -10055 bp and -2344 bp from the transcription start site of RORA is required for the inverse hormonal regulation. We further show that AR interacts with SUMO1, a reported suppressor of AR transcriptional activity, whereas ERα interacts
Potkin Steven G
from European-American (EA) and the other from African-American (AA). In the EA data set, we found 22 pathways with nominal P-value less than or equal to 0.001 and corresponding false discovery rate (FDR) less than 5%. In the AA data set, we found 11 pathways by controlling the same nominal P-value and FDR threshold. Interestingly, 8 of these pathways overlap with those found in the EA sample. We have implemented our method in a JAVA software package, called SNP Set Enrichment Analysis (SSEA), which contains a user-friendly interface and is freely available at http://cbcl.ics.uci.edu/SSEA. Conclusions: The SNP-based pathway enrichment method described here offers a new alternative approach for analysing GWAS data. By applying it to schizophrenia GWAS studies, we show that our method is able to identify statistically significant pathways, and importantly, pathways that can be replicated in large genetically distinct samples.
Si-sheng OU-YANG; Jun-yan LU; Xiang-qian KONG; Zhong-jie LIANG; Cheng LUO; Hualiang JIANG
Computational drug discovery is an effective strategy for accelerating and economizing the drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization, and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field.
Background: Single nucleotide polymorphisms (SNPs) have been associated with many aspects of human development and disease, and many non-coding SNPs associated with disease risk are presumed to affect gene regulation. We have previously shown that SNPs within transcription factor binding sites can affect transcription factor binding in an allele-specific and heritable manner. However, such analysis has relied on prior whole-genome genotypes provided by large external projects such as HapMap and the 1000 Genomes Project. This requirement limits the study of allele-specific effects of SNPs in primary patient samples from diseases of interest, where complete genotypes are not readily available. Results: In this study, we show that we are able to identify SNPs de novo and accurately from ChIP-seq data generated in the ENCODE Project. Our de novo identified SNPs from ChIP-seq data are highly concordant with published genotypes. Independent experimental verification of more than 100 sites estimates our false discovery rate at less than 5%. Analysis of transcription factor binding at de novo identified SNPs revealed widespread heritable allele-specific binding, confirming previous observations. SNPs identified from ChIP-seq datasets were significantly enriched for disease-associated variants, and we identified dozens of allele-specific binding events in non-coding regions that could distinguish between disease and normal haplotypes. Conclusions: Our approach combines SNP discovery, genotyping and allele-specific analysis, but is selectively focused on functional regulatory elements occupied by transcription factors or epigenetic marks, and will therefore be valuable for identifying the functional regulatory consequences of non-coding SNPs in primary disease samples.
Dai, Honghua; Smirnov, Evgueni
Reliable Knowledge Discovery focuses on theory, methods, and techniques for RKDD, a new sub-field of KDD. It studies the theory and methods to assure the reliability and trustworthiness of discovered knowledge and to maintain the stability and consistency of knowledge discovery processes. RKDD has a broad spectrum of applications, especially in critical domains like medicine, finance, and military. Reliable Knowledge Discovery also presents methods and techniques for designing robust knowledge-discovery processes. Approaches to assessing the reliability of the discovered knowledge are introduc
Jared R Kohler; Tobias Guennel; Scott L Marshall (BioStat Solutions, Inc., Frederick, MD, USA)
In the past decade, the pharmaceutical industry and biomedical research sector have devoted considerable resources to pharmacogenomics (PGx) with the hope that understanding genetic variation in patients would deliver on the promise of personalized medicine. With the advent of new technologies and the improved collection of DNA samples, the roadblock to advancements in PGx discovery is no longer the lack of high-density genetic information captured on patient populations, but rather the development, adaptation, and tailoring of analytical strategies to effectively harness this wealth of information. The current analytical paradigm in PGx considers the single-nucleotide polymorphism (SNP) as the genomic feature of interest and performs single SNP association tests to discover PGx effects, i.e., genetic effects impacting drug response. While it can be straightforward to process single SNP results and to consider how this information may be extended for use in downstream patient stratification, the rate of replication for single SNP associations has been low and the desired success of producing clinically and commercially viable biomarkers has not been realized. This may be due to the fact that single SNP association testing is suboptimal given the complexities of PGx discovery in the clinical trial setting, including: 1) relatively small sample sizes; 2) diverse clinical cohorts within and across trials due to genetic ancestry (potentially impacting the ability to replicate findings); and 3) the potential polygenic nature of a drug response. Subsequently, a shift in the current paradigm is proposed: to consider the gene as the genomic feature of interest in PGx discovery. The proof-of-concept study presented in this manuscript demonstrates that genomic region-based association testing has the potential to improve the power of detecting single SNP or
Background: We recently reported that aquaporin 5 (AQP5), a water channel never identified in the kidney before, co-localizes with pendrin at the apical membrane of type-B intercalated cells in the kidney cortex. Since co-expression of AQP5 and pendrin in the apical membrane domain is a common feature of several other epithelia such as cochlear and bronchial epithelial cells, we evaluated here whether this strict membrane association may reflect a co-regulation of the two proteins. To investigate this possibility, we analyzed AQP5 and pendrin expression and trafficking in mice under chronic K+ depletion, a condition that results in an increased ability of renal tubule to reabsorb bicarbonate, often leads to metabolic alkalosis and is known to strongly reduce pendrin expression. Methods: Mice were housed in metabolic cages and pair-fed with either a standard laboratory chow or a K+-deficient diet. AQP5 abundance was assessed by western blot in whole kidney homogenates and AQP5 and pendrin were localized by confocal microscopy in kidney sections from those mice. In addition, the short-term effect of changes in external pH on pendrin trafficking was evaluated by fluorescence resonance energy transfer (FRET) in MDCK cells, and the functional activity of pendrin was tested in the presence and absence of AQP5 in HEK 293 Phoenix cells. Results: Chronic K+ depletion caused a strong reduction in pendrin and AQP5 expression. Moreover, both proteins shifted from the apical cell membrane to an intracellular compartment. An acute pH shift from 7.4 to 7.0 caused pendrin internalization from the plasma membrane. Conversely, a pH shift from 7.4 to 7.8 caused a significant increase in the cell surface expression of pendrin. Finally, pendrin ion transport activity was not affected by co-expression with AQP5. Conclusions: The co-regulation of pendrin and AQP5 membrane expression under chronic K+-deficiency indicates that these two molecules could cooperate as an
Procino, Giuseppe; Milano, Serena; Tamma, Grazia; Dossena, Silvia; Barbieri, Claudia; Nicoletti, Maria Celeste; Ranieri, Marianna; Di Mise, Annarita; Nofziger, Charity; Svelto, Maria; Paulmichl, Markus; Valenti, Giovanna
We recently reported that aquaporin 5 (AQP5), a water channel never identified in the kidney before, co-localizes with pendrin at the apical membrane of type-B intercalated cells in the kidney cortex. Since co-expression of AQP5 and pendrin in the apical membrane domain is a common feature of several other epithelia such as cochlear and bronchial epithelial cells, we evaluated here whether this strict membrane association may reflect a co-regulation of the two proteins. To investigate this possibility, we analyzed AQP5 and pendrin expression and trafficking in mice under chronic K(+) depletion, a condition that results in an increased ability of renal tubule to reabsorb bicarbonate, often leads to metabolic alkalosis and is known to strongly reduce pendrin expression. Mice were housed in metabolic cages and pair-fed with either a standard laboratory chow or a K(+)-deficient diet. AQP5 abundance was assessed by western blot in whole kidney homogenates and AQP5 and pendrin were localized by confocal microscopy in kidney sections from those mice. In addition, the short-term effect of changes in external pH on pendrin trafficking was evaluated by fluorescence resonance energy transfer (FRET) in MDCK cells, and the functional activity of pendrin was tested in the presence and absence of AQP5 in HEK 293 Phoenix cells. Chronic K(+) depletion caused a strong reduction in pendrin and AQP5 expression. Moreover, both proteins shifted from the apical cell membrane to an intracellular compartment. An acute pH shift from 7.4 to 7.0 caused pendrin internalization from the plasma membrane. Conversely, a pH shift from 7.4 to 7.8 caused a significant increase in the cell surface expression of pendrin. Finally, pendrin ion transport activity was not affected by co-expression with AQP5. The co-regulation of pendrin and AQP5 membrane expression under chronic K(+)-deficiency indicates that these two molecules could cooperate as an osmosensor to rapidly detect and respond to alterations
Rizzi, Giovanni; Østerberg, Frederik Westergaard; Dufva, Martin
We present a magnetoresistive sensor platform for hybridization assays and demonstrate its applicability on single nucleotide polymorphism (SNP) genotyping. The sensor relies on anisotropic magnetoresistance in a new geometry with a local negative reference and uses the magnetic field from...... the sensor bias current to magnetize magnetic beads in the vicinity of the sensor. The method allows for real-time measurements of the specific bead binding to the sensor surface during DNA hybridization and washing. Compared to other magnetic biosensing platforms, our approach eliminates the need...... for external electromagnets and thus allows for miniaturization of the sensor platform....
Li, Ruiqiang; Li, Yingrui; Fang, Xiaodong
of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36x coverage and assembled a high-quality nonrepetitive consensus sequence for 92......-genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All...
Purple photosynthetic bacteria are capable of generating cellular energy from several sources, including photosynthesis, respiration, and H2 oxidation. Under nutrient-limiting conditions, cellular energy can be used to assimilate carbon and nitrogen. This study provides the first evidence of a molecular link for the coregulation of nitrogenase and hydrogenase biosynthesis in an anoxygenic photosynthetic bacterium. We demonstrated that molybdenum nitrogenase biosynthesis is under the control o...
The common approach to find co-regulated genes is to cluster genes based on gene expression. However, due to the limited information present in any dataset, genes in the same cluster might be co-expressed but not necessarily co-regulated. In this paper, we propose to integrate known transcription factor binding site information and gene expression data into a single clustering scheme. This scheme will find clusters of co-regulated genes that are not only expressed similarly under the measured conditions, but also share a regulatory structure that may explain their common regulation. We demonstrate the utility of this approach on a microarray dataset of yeast grown under different nutrient and oxygen limitations. Our integrated clustering method not only unravels many regulatory modules that are consistent with current biological knowledge, but also provides a more profound understanding of the underlying process. The added value of our approach, compared with the clustering solely based on gene expression, is its ability to uncover clusters of genes that are involved in more specific biological processes and are evidently regulated by a set of transcription factors.
Reed, Rebecca G.; Barnard, Kobus; Butler, Emily A.
Well-regulated emotions, both within people and between relationship partners, play a key role in facilitating health and well-being. The present study examined 39 heterosexual couples’ joint weight status (both partners are healthy-weight, both overweight, one healthy-weight and one overweight) as a predictor of two interpersonal emotional patterns during a discussion of their shared lifestyle choices. The first pattern, co-regulation, is one in which partners’ coupled emotions show a dampening pattern over time and ultimately return to homeostatic levels. The second, co-dysregulation, is one in which partners’ coupled emotions are amplified away from homeostatic balance. We demonstrate how a coupled linear oscillator (CLO) model (Butner, Amazeen, & Mulvey, 2005) can be used to distinguish co-regulation from co-dysregulation. As predicted, healthy-weight couples and mixed-weight couples in which the man was heavier than the woman displayed co-regulation, but overweight couples and mixed-weight couples in which the woman was heavier showed co-dysregulation. These results suggest that heterosexual couples in which the woman is overweight may face formidable co-regulatory challenges that could undermine both partners’ well-being. The results also demonstrate the importance of distinguishing between various interpersonal emotional dynamics for understanding connections between interpersonal emotions and health. PMID:25664951
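The distinction between the two patterns can be illustrated with a small simulation. The sketch below is a simplified, hypothetical form of a coupled linear oscillator, not the model fitted in the study: each partner's acceleration depends on their own level (eta), their own rate of change (zeta), and the difference from the partner (coupling). All parameter values and the function name are assumptions chosen for illustration. A negative zeta damps the oscillations (co-regulation), while a positive zeta amplifies them (co-dysregulation).

```python
import numpy as np

def simulate_clo(eta, zeta, coupling, x0, v0, dt=0.01, steps=5000):
    """Integrate two coupled linear oscillators with semi-implicit Euler.

    Illustrative form (an assumption, not the study's fitted model):
        x_i'' = eta * x_i + zeta * x_i' + coupling * (x_j - x_i)
    Returns the larger of the two partners' absolute levels at each step.
    """
    x = np.array(x0, dtype=float)  # emotional level of each partner
    v = np.array(v0, dtype=float)  # rate of change of each partner
    amplitude = np.empty(steps)
    for t in range(steps):
        # x[::-1] swaps the two partners, giving each one the other's level
        acc = eta * x + zeta * v + coupling * (x[::-1] - x)
        v = v + dt * acc
        x = x + dt * v
        amplitude[t] = np.abs(x).max()
    return amplitude

# Hypothetical parameters: same frequency and coupling, opposite damping sign.
damped = simulate_clo(eta=-1.0, zeta=-0.2, coupling=0.1,
                      x0=[1.0, -0.5], v0=[0.0, 0.0])
amplified = simulate_clo(eta=-1.0, zeta=0.2, coupling=0.1,
                         x0=[1.0, -0.5], v0=[0.0, 0.0])

print("co-regulation, late amplitude:   ", damped[-500:].max())
print("co-dysregulation, late amplitude:", amplified[-500:].max())
```

In this toy setting the sign of the damping term alone separates a return toward baseline from a runaway pattern, which is the qualitative contrast the abstract draws.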
Kirkegaard, Henriette Schultz; Valentin, Finn
Academic drug discovery centres (ADDCs) are seen as one of the solutions to fill the innovation gap in early drug discovery, which has proven challenging for previous organisational models. Prior studies of ADDCs have identified the need to analyse them from the angle of their economic...... their performance....
The ATLAS & CMS Experiments Celebrate the 2nd Anniversary of the Discovery of the Higgs boson. Here are some images of the path from LHC startup to Nobel Prize, featuring a musical composition by Roger Zare, performed by the Donald Sinta Quartet, called “LHC”. Happy Discovery Day!
This article features Friends' Discovery Camp, a program that allows children with and without autism spectrum disorder to learn and play together. In Friends' Discovery Camp, campers take part in sensory-rich experiences, ranging from hands-on activities and performing arts to science experiments and stories teaching social skills. Now in its 7th…
Rosenman, Martin F.
The discovery of penicillin is cited in a discussion of the role of serendipity as it relates to scientific discovery. The importance of sagacity as a personality trait is noted. Successful researchers have questioning minds, are willing to view data from several perspectives, and recognize and appreciate the unexpected. (JW)
A discovery system for detecting correspondences in data is described, based on the familiar induction methods of J. S. Mill. Given a set of observations, the system induces the "causally" related facts in these observations. Its application to empirical linguistic discovery is described.
Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.
Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several promin
Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.
Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several prominent
Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.
Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several
Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.
Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several prominent
Sundramoorthy, Vasughi; Scholten, Hans; Jansen, Pierre; Hartel, Pieter
Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several promin
Accidental discoveries have been of significant value in the progress of science. Although accidental discoveries are more common in pharmacology and chemistry, other branches of science have also benefited from such discoveries. While most discoveries are the result of persistent research, famous accidental discoveries provide a fascinating…
Harn, Tony; Spielman, Ingrid; Gao, Yang; Kovacikova, Gabriela; Biais, Nicolas
Type IV pilus (T4P) systems are complex molecular machines that polymerize major pilin proteins into thin filaments displayed on bacterial surfaces. Pilus functions require rapid extension and depolymerization of the pilus, powered by the assembly and retraction ATPases, respectively. A set of low abundance minor pilins influences pilus dynamics by unknown mechanisms. The Vibrio cholerae toxin-coregulated pilus (TCP) is among the simplest of the T4P systems, having a single minor pilin TcpB and lacking a retraction ATPase. Here we show that TcpB, like its homolog CofB, initiates pilus assembly. TcpB co-localizes with the pili but at extremely low levels, equivalent to one subunit per pilus. We used a micropillars assay to demonstrate that TCP are retractile despite the absence of a retraction ATPase, and that retraction relies on TcpB, as a V. cholerae tcpB Glu5Val mutant is fully piliated but does not induce micropillars movements. This mutant is impaired in TCP-mediated autoagglutination and TcpF secretion, consistent with retraction being required for these functions. We propose that TcpB initiates pilus retraction by incorporating into the growing pilus in a Glu5-dependent manner, which stalls assembly and triggers processive disassembly. These results provide a framework for understanding filament dynamics in more complex T4P systems and the closely related Type II secretion system. PMID:27992883
Costa, José Hélio; Arnholdt-Schmitt, Birgit
The alternative oxidase (AOX) gene family is a hot candidate for functional marker development that could help plant breeding on yield stability through more robust plants based on multi-stress tolerance. However, knowledge is missing on the interplay between gene family members that might interfere with the efficiency of marker development. It is a common view that AOX1 and AOX2 have different physiological roles. Nevertheless, both family member groups act in terms of molecular-biochemical function as "typical" alternative oxidases, and co-regulation of AOX1 and AOX2 has been reported. Although conserved sequence differences have been identified, the basis for differential effects on physiology regulation is not sufficiently explored. This protocol gives instructions for a bioinformatics approach that supports discovering potential interaction of AOX family members in regulating growth and development. It further provides a strategy to elucidate the relevance of gene sequence diversity and copy number variation for final functionality in target tissues and finally the whole plant. Thus, overall this protocol provides the means for efficiently identifying plant AOX variants as functional marker candidates related to growth and development.
Chen, Cynthia; Lodish, Harvey F
Key transcriptional regulators of terminal erythropoiesis, such as GATA-binding factor 1 (GATA1) and T-cell acute lymphocytic leukemia protein 1 (TAL1), have been well characterized, but transcription factors and cofactors and their expression modulations have not yet been explored on a global scale. Here, we use global gene expression analysis to identify 28 transcription factors and 19 transcriptional cofactors induced during terminal erythroid differentiation whose promoters are enriched for binding by GATA1 and TAL1. Utilizing protein-protein interaction databases to identify cofactors for each transcription factor, we pinpoint several co-induced pairs, of which E2f2 and its cofactor transcription factor Dp-2 (Tfdp2) were the most highly induced. TFDP2 is a critical cofactor required for proper cell cycle control and gene expression. GATA1 and TAL1 are bound to the regulatory regions of Tfdp2 and upregulate its expression and knockdown of Tfdp2 results in significantly reduced rates of proliferation as well as reduced upregulation of many erythroid-important genes. Loss of Tfdp2 also globally inhibits the normal downregulation of many E2F2 target genes, including those that regulate the cell cycle, causing cells to accumulate in S phase and resulting in increased erythrocyte size. Our findings highlight the importance of TFDP2 in coupling the erythroid cell cycle with terminal differentiation and validate this study as a resource for future work on elucidating the role of diverse transcription factors and coregulators in erythropoiesis.
Chung, Ho-Ryun; Xu, Chao; Fuchs, Alisa; Mund, Andreas; Lange, Martin; Staege, Hannah; Schubert, Tobias; Bian, Chuanbing; Dunkel, Ilona; Eberharter, Anton; Regnard, Catherine; Klinker, Henrike; Meierhofer, David; Cozzuto, Luca; Winterpacht, Andreas; Di Croce, Luciano; Min, Jinrong; Will, Hans; Kinkley, Sarah
PHF13 is a chromatin affiliated protein with a functional role in differentiation, cell division, DNA damage response and higher chromatin order. To gain insight into PHF13's ability to modulate these processes, we elucidate the mechanisms targeting PHF13 to chromatin, its genome wide localization and its molecular chromatin context. Size exclusion chromatography, mass spectrometry, X-ray crystallography and ChIP sequencing demonstrate that PHF13 binds chromatin in a multivalent fashion via direct interactions with H3K4me2/3 and DNA, and indirectly via interactions with PRC2 and RNA PolII. Furthermore, PHF13 depletion disrupted the interactions between PRC2, RNA PolII S5P, H3K4me3 and H3K27me3 and resulted in the up and down regulation of genes functionally enriched in transcriptional regulation, DNA binding, cell cycle, differentiation and chromatin organization. Together our findings argue that PHF13 is an H3K4me2/3 molecular reader and transcriptional co-regulator, affording it the ability to impact different chromatin processes.
Gali Ramamoorthy, Thanuja; Laverny, Gilles; Schlagowski, Anna-Isabel; Zoll, Joffrey; Messaddeq, Nadia; Bornert, Jean-Marc; Panza, Salvatore; Ferry, Arnaud; Geny, Bernard; Metzger, Daniel
The transcriptional coregulators PGC-1α and PGC-1β modulate the expression of numerous partially overlapping genes involved in mitochondrial biogenesis and energetic metabolism. The physiological role of PGC-1β is poorly understood in skeletal muscle, a tissue of high mitochondrial content to produce ATP levels required for sustained contractions. Here we determine the physiological role of PGC-1β in skeletal muscle using mice, in which PGC-1β is selectively ablated in skeletal myofibres at adulthood (PGC-1β(i)skm−/− mice). We show that myofibre myosin heavy chain composition and mitochondrial number, muscle strength and glucose homeostasis are unaffected in PGC-1β(i)skm−/− mice. However, decreased expression of genes controlling mitochondrial protein import, translational machinery and energy metabolism in PGC-1β(i)skm−/− muscles leads to mitochondrial structural and functional abnormalities, impaired muscle oxidative capacity and reduced exercise performance. Moreover, enhanced free-radical leak and reduced expression of the mitochondrial anti-oxidant enzyme Sod2 increase muscle oxidative stress. PGC-1β is therefore instrumental for skeletal muscles to cope with high energetic demands. PMID:26674215
Type IV pilus (T4P) systems are complex molecular machines that polymerize major pilin proteins into thin filaments displayed on bacterial surfaces. Pilus functions require rapid extension and depolymerization of the pilus, powered by the assembly and retraction ATPases, respectively. A set of low abundance minor pilins influences pilus dynamics by unknown mechanisms. The Vibrio cholerae toxin-coregulated pilus (TCP) is among the simplest of the T4P systems, having a single minor pilin TcpB and lacking a retraction ATPase. Here we show that TcpB, like its homolog CofB, initiates pilus assembly. TcpB co-localizes with the pili but at extremely low levels, equivalent to one subunit per pilus. We used a micropillars assay to demonstrate that TCP are retractile despite the absence of a retraction ATPase, and that retraction relies on TcpB, as a V. cholerae tcpB Glu5Val mutant is fully piliated but does not induce micropillars movements. This mutant is impaired in TCP-mediated autoagglutination and TcpF secretion, consistent with retraction being required for these functions. We propose that TcpB initiates pilus retraction by incorporating into the growing pilus in a Glu5-dependent manner, which stalls assembly and triggers processive disassembly. These results provide a framework for understanding filament dynamics in more complex T4P systems and the closely related Type II secretion system.
Amin R Mazloom
Coregulator proteins (CoRegs) are part of multi-protein complexes that transiently assemble with transcription factors and chromatin modifiers to regulate gene expression. In this study we analyzed data from 3,290 immuno-precipitations (IP) followed by mass spectrometry (MS) applied to human cell lines aimed at identifying CoRegs complexes. Using the semi-quantitative spectral counts, we scored binary protein-protein and domain-domain associations with several equations. Unlike previous applications, our methods scored prey-prey protein-protein interactions regardless of the baits used. We also predicted domain-domain interactions underlying predicted protein-protein interactions. The quality of predicted protein-protein and domain-domain interactions was evaluated using known binary interactions from the literature, whereas one protein-protein interaction, between STRN and CTTNBP2NL, was validated experimentally; and one domain-domain interaction, between the HEAT domain of PPP2R1A and the Pkinase domain of STK25, was validated using molecular docking simulations. The scoring schemes presented here recovered known, and predicted many new, complexes, protein-protein, and domain-domain interactions. The networks that resulted from the predictions are provided as a web-based interactive application at http://maayanlab.net/HT-IP-MS-2-PPI-DDI/.
Geraldes, A; Difazio, S P; Slavov, G T; Ranjan, P; Muchero, W; Hannemann, J; Gunter, L E; Wymore, A M; Grassa, C J; Farzaneh, N; Porth, I; McKown, A D; Skyba, O; Li, E; Fujita, M; Klápště, J; Martin, J; Schackwitz, W; Pennacchio, C; Rokhsar, D; Friedmann, M C; Wasteneys, G O; Guy, R D; El-Kassaby, Y A; Mansfield, S D; Cronk, Q C B; Ehlting, J; Douglas, C J; Tuskan, G A
Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost-effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids.
SHEN Li; WANG Lei; YANG Hai-Feng; LIU Xiao-Jun; LIU Hong-Ping
We present a simple and efficient method for measuring atomic lifetimes on the order of tens of microseconds and demonstrate it in the lifetime determination of barium Rydberg states. This method extracts the lifetime information directly from the time-of-flight spectrum, which is much more efficient than other methods such as time-delayed field ionization and traditional laser-induced fluorescence. The lifetimes determined with our method for the barium Rydberg 6snp (n=37-59) series agree well with the values deduced from the absolute oscillator strengths of barium reported in the experimental literature [J. Phys. B 14 (1981) 4489, 29 (1996) 655].
Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B
We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.
Background: With the availability of large-scale genome-wide association study (GWAS) data, choosing an optimal set of SNPs for disease susceptibility prediction is a challenging task. This study aimed to use single nucleotide polymorphisms (SNPs) to predict psoriasis by searching GWAS data. Methods: In total, we had 2,798 samples and 451,724 SNPs. The process of searching for a set of SNPs to predict susceptibility to psoriasis consisted of two steps. The first was to search for the top 1,000 SNPs with high accuracy for prediction of psoriasis from the GWAS dataset. The second was to search for an optimal SNP subset for predicting psoriasis. The sequential information bottleneck (sIB) method was compared with classical linear discriminant analysis (LDA) for classification performance. Results: The best test harmonic mean of sensitivity and specificity for predicting psoriasis by sIB was 0.674 (95% CI: 0.650-0.698), while only 0.520 (95% CI: 0.472-0.524) was reported for predicting disease by LDA. Our results indicate that the new classifier sIB performs better than LDA in the study. Conclusions: The fact that a small set of SNPs can predict disease status with average accuracy of 68% makes it possible to use SNP data for psoriasis prediction.
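For readers unfamiliar with the performance measure used here, the harmonic mean of sensitivity and specificity can be computed from a confusion matrix as sketched below. The function name and the counts are hypothetical and are not taken from the study; they only illustrate the arithmetic.

```python
def harmonic_mean_sens_spec(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cases predicted as cases
    specificity = tn / (tn + fp)  # fraction of true controls predicted as controls
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)

# Hypothetical counts: sensitivity = 0.80, specificity = 0.60.
score = harmonic_mean_sens_spec(tp=80, fn=20, tn=60, fp=40)
print(round(score, 4))  # 0.6857
```

Unlike plain accuracy, this measure is pulled toward the weaker of the two rates, so a classifier cannot score well by favouring only cases or only controls.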
Motif discovery for the identification of functional regulatory elements underlying gene expression is a challenging problem. Sequence inspection often leads to discovery of novel motifs (including transcription factor sites) with previously uncharacterized function in gene expression. Coupled with the complexity underlying tissue-specific gene expression, there are several motifs that are putatively responsible for expression in a certain cell type. This has important implications in understanding fundamental biological processes such as development and disease progression. In this work, we present an approach to the identification of motifs (not necessarily transcription factor sites) and examine its application to some questions in current bioinformatics research. These motifs are seen to discriminate tissue-specific gene promoter or regulatory regions from those that are not tissue-specific. There are two main contributions of this work. Firstly, we propose the use of directed information for such classification constrained motif discovery, and then use the selected features with a support vector machine (SVM) classifier to find the tissue specificity of any sequence of interest. Such analysis yields several novel interesting motifs that merit further experimental characterization. Furthermore, this approach leads to a principled framework for the prospective examination of any chosen motif as a discriminatory motif for a group of coexpressed/coregulated genes, thereby integrating sequence and expression perspectives. We hypothesize that the discovery of these motifs would enable the large-scale investigation of the tissue-specific regulatory role of any conserved sequence element identified from genome-wide studies.
Background: Sweet cherry (Prunus avium L.), a non-model crop with narrow genetic diversity, is an important member of sub-family Amygdoloideae within Rosaceae. Compared to other important members like peach and apple, sweet cherry lacks genetic and genomic information, impeding understanding of important biological processes and development of efficient breeding approaches. Availability of single nucleotide polymorphism (SNP)-based molecular markers can greatly benefit breeding efforts in such non-model species. RNA-seq approaches employing second generation sequencing platforms offer a unique avenue to rapidly identify gene-based SNPs. Additionally, haplotype markers can be rapidly generated from transcript-based SNPs since they have been found to be extremely useful in identification of genetic variants related to health, disease and response to environment as highlighted by the human HapMap project. Results: RNA-seq was performed on two sweet cherry cultivars, Bing and Rainier, using a 3' untranslated region (UTR) sequencing method yielding 43,396 assembled contigs. In order to test our approach of rapid identification of SNPs without any reference genome information, over 25% (10,100) of the contigs were screened for the SNPs. A total of 207 contigs from this set were identified to contain high quality SNPs. A set of 223 primer pairs were designed to amplify SNP containing regions from these contigs and high resolution melting (HRM) analysis was performed with eight important parental sweet cherry cultivars. Six of the parent cultivars were distantly related to Bing and Rainier, the cultivars used for initial SNP discovery. Further, HRM analysis was also performed on 13 seedlings derived from a cross between two of the parents. Our analysis resulted in the identification of 84 (38.7%) primer sets that demonstrated variation among the tested germplasm. Reassembly of the raw 3'UTR sequences using upgraded transcriptome assembly software
Bailey, David H.; Borwein, Jonathan M.
What mathematical discovery more than 1500 years ago: (1) Is one of the greatest, if not the greatest, single discovery in the field of mathematics? (2) Involved three subtle ideas that eluded the greatest minds of antiquity, even geniuses such as Archimedes? (3) Was fiercely resisted in Europe for hundreds of years after its discovery? (4) Even today, in historical treatments of mathematics, is often dismissed with scant mention, or else is ascribed to the wrong source? Answer: Our modern system of positional decimal notation with zero, together with the basic arithmetic computational schemes, which were discovered in India about 500 CE.
Single nucleotide polymorphisms (SNPs) play important roles as molecular markers in plant genomics and breeding studies. Although onion (Allium cepa L.) is an important crop globally, relatively few molecular marker resources have been reported due to its large genome and high heterozygosity. Genotyping-by-sequencing (GBS) offers a greater degree of complexity reduction followed by concurrent SNP discovery and genotyping for species with complex genomes. In this study, GBS was employed for SNP mining in onion, which currently lacks a reference genome. A segregating F2 population, derived from a cross between ‘NW-001’ and ‘NW-002,’ as well as multiple parental lines were used for GBS analysis. A total of 56.15 Gbp of raw sequence data were generated and 1,851,428 SNPs were identified from the de novo assembled contigs. Stringent filtering resulted in 10,091 high-fidelity SNP markers. Robust SNPs that satisfied the segregation ratio criteria and with even distribution in the mapping population were used to construct an onion genetic map. The final map contained eight linkage groups and spanned a genetic length of 1,383 centiMorgans (cM), with an average marker interval of 8.08 cM. These robust SNPs were further analyzed using the high-throughput Fluidigm platform for marker validation. This is the first study in onion to develop genome-wide SNPs using GBS. The resulting SNP markers and developed linkage map will be valuable tools for genetic mapping of important agronomic traits and marker-assisted selection in onion breeding programs.
BACKGROUND: Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO is not guaranteed to find the best result in every run, especially when high-dimensional data are investigated for SNP-SNP interactions. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we propose the IPSO algorithm to improve the reliability of PSO for the identification of the best protective SNP barcodes (SNP combinations and genotypes with maximum difference between cases and controls) associated with breast cancer. SNP barcodes containing different numbers of SNPs were computed. The top five SNP barcode results are retained for computing the next SNP barcode with a one-SNP increase at each processing step. Based on simulated data for 23 SNPs of six steroid hormone metabolism and signalling-related genes, the performance of our proposed IPSO algorithm is evaluated. Among the 23 SNPs, 13 displayed significant odds ratio (OR) values (1.268 to 0.848; p<0.05) for breast cancer. Based on the IPSO algorithm, the joint effects of SNP barcodes with two to seven SNPs show significantly decreasing OR values (0.84 to 0.57; p<0.05 to 0.001). Using the PSO algorithm, barcodes with two to four SNPs show significantly decreasing OR values (0.84 to 0.77; p<0.05 to 0.001). Based on the results of 20 simulations, the medians of the maximum differences for each SNP barcode generated by IPSO are higher than those generated by PSO. The interquartile ranges of the boxplot, as well as the upper and lower hinges for each n-SNP barcode (n = 3∼10), are narrower in IPSO than in PSO, suggesting that IPSO is highly reliable for SNP barcode identification. CONCLUSIONS/SIGNIFICANCE: Overall, the proposed IPSO algorithm provides robust and exact identification of the best protective SNP barcodes for breast cancer.
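The odds ratios quoted above compare how often a SNP barcode occurs in cases versus controls; values below 1 indicate a protective association. A minimal sketch of the calculation, with a Wald 95% confidence interval, follows. The function name and all counts are hypothetical and are not taken from the study, and the study may have used a different interval or test.

```python
import math

def odds_ratio_ci(case_with, case_without, ctrl_with, ctrl_without, z=1.96):
    """Odds ratio for carrying a barcode, with a Wald confidence interval.

    Assumes all four cell counts are non-zero.
    """
    odds_ratio = (case_with * ctrl_without) / (case_without * ctrl_with)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_with + 1 / case_without
                   + 1 / ctrl_with + 1 / ctrl_without)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical 2x2 table: 40 of 200 cases and 60 of 200 controls carry the barcode.
or_value, ci_low, ci_high = odds_ratio_ci(40, 160, 60, 140)
print(f"OR = {or_value:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With these illustrative counts the whole interval lies below 1, which is the pattern a "protective" barcode would show.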
Sun, Guirong; Li, Ming; Li, Hong; Tian, Yadong; Chen, Qixin; Bai, Yichun; Kang, Xiangtao
The pre-melanin-concentrating hormone (PMCH) gene is functionally involved in the regulation of body fat content, feeding behavior and energy balance. In this study, the full-length cDNA of the chicken PMCH gene was amplified by the SMART RACE method. The single nucleotide polymorphisms (SNPs) in the PMCH gene were screened by comparative sequence analysis. The non-synonymous coding SNPs (ncSNPs) obtained were genotyped first. Their effects on growth, carcass characteristics and meat quality traits were investigated in the F2 resource population of Gushi chicken crossed with Anak broiler by AluI CRS-PCR-RFLP. Our results indicated that the cDNA of chicken PMCH shared 67.25 and 66.47% homology with that of human and bovine PMCH, respectively. The deduced amino acid sequence of chicken PMCH (163 amino acids) was 52.07 and 50.89% identical to those of human and bovine PMCH, respectively. The PMCH protein sequence is predicted to have several functional domains, including pro-MCH, CSP, IL7, XPGI and some low complexity sequence. It has 8 phosphorylation sites and no signal peptide sequence. Targeting sites for the microRNAs gga-miR-18a, gga-miR-18b and gga-miR-499 were predicted in the 3' untranslated region of chicken PMCH mRNA. In addition, a total of seven SNPs, including an ncSNP and a synonymous coding SNP, were identified in the PMCH gene. The ncSNP c.81 A>T was found to be in a moderately polymorphic state (polymorphic index=0.365), and the frequencies for genotypes AA, AB and BB were 0.3648, 0.4682 and 0.1670, respectively. Significant associations between the locus and shear force of breast and leg were observed. This polymorphic site may serve as a useful target for the marker-assisted selection of growth and meat quality traits in chicken.
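The genotype frequencies reported above are enough to recompute the allele frequencies and a diversity index. The sketch below assumes that the "polymorphic index" in the abstract is the polymorphism information content (PIC) in its standard biallelic form, PIC = 1 - p² - q² - 2p²q²; the abstract does not state the formula, and the function name is hypothetical.

```python
def biallelic_diversity(freq_aa, freq_ab, freq_bb):
    """Return (p, q, expected heterozygosity, PIC) for a biallelic locus.

    Inputs are genotype frequencies. PIC uses the standard biallelic form;
    that this is the abstract's "polymorphic index" is an assumption.
    """
    total = freq_aa + freq_ab + freq_bb
    p = (freq_aa + 0.5 * freq_ab) / total   # frequency of allele A
    q = 1.0 - p                             # frequency of allele B
    het = 1.0 - (p * p + q * q)             # expected heterozygosity, equals 2pq
    pic = het - 2.0 * p * p * q * q         # polymorphism information content
    return p, q, het, pic


# Genotype frequencies for AA, AB and BB as reported in the abstract.
p, q, het, pic = biallelic_diversity(0.3648, 0.4682, 0.1670)
print(f"p={p:.4f} q={q:.4f} He={het:.4f} PIC={pic:.4f}")
```

Under this assumption the computed PIC rounds to the reported 0.365, which falls in the range conventionally described as moderately polymorphic (0.25 to 0.5).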
Watson-Haigh Nathan S
Background: Whole genome association studies using highly dense single nucleotide polymorphisms (SNPs) are a set of methods to identify DNA markers associated with variation in a particular complex trait of interest. One of the main outcomes from these studies is a subset of statistically significant SNPs. Finding the potential biological functions of such SNPs can be an important step towards further use in human and agricultural populations (e.g., for identifying genes related to susceptibility to complex diseases or genes playing key roles in development or performance). The current challenge is that the information holding the clues to SNP functions is distributed across many different databases. Efficient bioinformatics tools are therefore needed to seamlessly integrate up-to-date functional information on SNPs. Many web services have arisen to meet the challenge but most work only within the framework of human medical research. Although we acknowledge the importance of human research, we identify there is a need for SNP annotation tools for other organisms. Description: We introduce an R package called FunctSNP, which is the user interface to custom built species-specific databases. The local relational databases contain SNP data together with functional annotations extracted from online resources. FunctSNP provides a unified bioinformatics resource to link SNPs with functional knowledge (e.g., genes, pathways, ontologies). We also introduce dbAutoMaker, a suite of Perl scripts, which can be scheduled to run periodically to automatically create/update the customised SNP databases. We illustrate the use of FunctSNP with a livestock example, but the approach and software tools presented here can be applied also to human and other organisms. Conclusions: Finding the potential functional significance of SNPs is important when further using the outcomes from whole genome association studies. FunctSNP is unique in that it is the only R
Discovery and Innovation publishes articles and reports in a wide range of ... with the social sciences, particularly as they relate to major areas of concern in Africa. ... The article should begin with an Introduction, stating the hypothesis, defining ...
Susie J. Lee
"The Art of Discovery" discusses an ambitious educational program taught by the artist which incorporated locative media, contemporary art, site specificity, and creative work as a proposal for the integration of art, technology and science.
The learning discovery of youngsters is a do-it-yourself teaching method for clerical, administrative, and accountant trainees at the Bankside House headquarters of the Central Electricity Generating Board's South Eastern Region, London. (Author)
Goethals, George R
This book, a collection of essays from scholars across disciplines, explores leadership of discovery, probing the guided and collaborative exploration and interpretation of the experience of our inner thoughts and feelings, and of our external worlds
Discovery and Innovation is a journal of the African Academy of Sciences (AAS) and ... World, emphasizing the progress in scientific research and issues that impinge on these two areas as well as circumscribe science-driven development.
The spread of resistant bacteria, leading to untreatable infections, is a major public health threat but the pace of antibiotic discovery to combat these pathogens has slowed down. Most antibiotics were originally isolated by screening soil-derived actinomycetes during the golden era of antibiotic discovery in the 1940s to 1960s. However, diminishing returns from this discovery platform led to its collapse, and efforts to create a new platform based on target-focused screening of large libraries of synthetic compounds failed, in part owing to the lack of penetration of such compounds through the bacterial envelope. This article considers strategies to re-establish viable platforms for antibiotic discovery. These include investigating untapped natural product sources such as uncultured bacteria, establishing rules of compound penetration to enable the development of synthetic antibiotics, developing species-specific antibiotics and identifying prodrugs that have the potential to eradicate dormant persisters, which are often responsible for hard-to-treat infections.
Wohlleben, Wolfgang; Mast, Yvonne; Stegmann, Evi; Ziemert, Nadine
Due to the threat posed by the increase of highly resistant pathogenic bacteria, there is an urgent need for new antibiotics, all the more so since the approval of new antibacterial agents has decreased over the last 20 years. The field of natural product discovery has undergone tremendous development over the past few years. This has been the consequence of several new and revolutionizing drug discovery and development techniques, which are initiating a 'New Age of Antibiotic Discovery'. In this review, we concentrate on the most significant discovery approaches during the last and present years and comment on the challenges facing the community in the coming years. © 2016 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
"The discovery of the fission of uranium exactly half a century ago is at risk of passing unremarked because of the general ambivalence towards the consequences of this development. Can that be wise?" (4 pages)
Bukh, Per Nikolaj
Review of Discovery-Driven Growth: A Breakthrough Process to Reduce Risk and Seize Opportunity, by Rita G. McGrath & Ian C. MacMillan, Boston: Harvard Business Press. Publication date: 14 August...
This article reviews current achievements in the field of chemoinformatics and their impact on modern drug discovery processes. The main data mining approaches used in chemoinformatics, such as descriptor computations, structural similarity matrices, and classification algorithms, are outlined. The applications of chemoinformatics in drug discovery, such as compound selection, virtual library generation, virtual high throughput screening, HTS data mining, and in silico ADMET, are discussed. In conclusion, future directions of chemoinformatics are suggested.
Børsting, Claus; Fordyce, Sarah L; Olofsson, Jill Katharina;
The Ion Torrent™ HID SNP assay amplified 136 autosomal SNPs and 33 Y-chromosome markers in one PCR and the markers were subsequently typed using the Ion PGM™ second generation sequencing platform. A total of 51 of the autosomal SNPs were selected from the SNPforID panel that is routinely used...... allele balance among samples. These SNPs should be excluded from the panel. The optimal amount of DNA in the PCR seemed to be ≥0.5ng. Allele drop-outs were rare and only seen in experiments with ... of the heterozygote allele balances were between 0.6 and 1.6, which is comparable to the heterozygote balances of STRs typed with PCR-CE. The number of reads with base calls that differed from the genotype call was typically less than five. This allowed detection of 1:100 mixtures with a high degree of certainty...
Background: Arrayed primer extension (APEX) is a microarray-based rapid minisequencing methodology that may have utility in 'personalized medicine' applications that involve genetic diagnostics of single nucleotide polymorphisms (SNPs). However, to date there have been few reports that objectively evaluate the assay completion rate, call rate and accuracy of APEX. We have further developed robust assay design, chemistry and analysis methodologies, and have sought to determine how effective APEX is in comparison to leading 'gold-standard' genotyping platforms. Our methods have been tested against industry-leading technologies in two blinded experiments based on Coriell DNA samples and SNP genotype data from the International HapMap Project. Results: In the first experiment, we genotyped 50 SNPs across the entire 270 HapMap Coriell DNA sample set. For each Coriell sample, DNA template was amplified in a total of 7 multiplex PCRs prior to genotyping. We obtained good results for 41 of the SNPs, with 99.8% genotype concordance with HapMap data, at an automated call rate of 94.9% (not including the 9 failed SNPs). In the second experiment, involving modifications to the initial DNA amplification so that a single 50-plex PCR could be achieved, genotyping of the same 50 SNPs across each of 49 randomly chosen Coriell DNA samples allowed extremely robust 50-plex genotyping from as little as 5 ng of DNA, with 100% assay completion rate, 100% call rate and >99.9% accuracy. Conclusion: We have shown our methods to be effective for robust multiplex SNP genotyping using APEX, with 100% call rate and >99.9% accuracy. We believe that such methodology may be useful in future point-of-care clinical diagnostic applications where accuracy and call rate are both paramount.
Glover, Kevin A.; Hansen, Michael Møller; Lien, Sigbjørn
between SNP and STR data sets and variants thereof. The best 15 SNPs (30 alleles) gave a similar level of self-assignment to the best 4 STR loci (83 alleles), however, addition of further STR loci did not lead to a notable increase assignment whereas addition of up to 100 SNP loci increased assignment...
Jyh-Der Leu; I-Feng Lin; Ying-Fang Sun; Su-Mei Chen; Chih-Chao Liu; Yi-Jang Lee
AIM: To investigate the risk association and compare the onset age of hepatocellular carcinoma (HCC) patients in Taiwan with different genotypes of MDM2-SNP309. METHODS: We analyzed MDM2-SNP309 genotypes from 58 patients with HCC and 138 cancer-free healthy controls consecutively. Genotyping of MDM2-SNP309 was conducted by restriction fragment length polymorphism assay. RESULTS: The proportion of homozygous MDM2-SNP309 genotype (G/G) in cases and cancer-free healthy controls was similar (17.2% vs 16.7%). Multivariate analysis showed that the risk of G/G genotype of MDM2-SNP309 vs wild-type T/T genotype in patients with HCC was not significant (OR = 1.265, 95% CI = 0.074-21.77) after adjustment for sex, hepatitis B or C virus infection, age, and cardiovascular disease/diabetes. Nevertheless, there was a trend that GG genotype of MDM2-SNP309 might increase the risk in HCC patients infected with hepatitis virus (OR = 2.568, 95% CI = 0.054-121.69). Besides, the homozygous MDM2-SNP309 genotype did not exhibit a significantly earlier age of onset for HCC. CONCLUSION: Current data suggest that the association between MDM2-SNP309 GG genotype and HCC is not significant, while the risk may be enhanced in patients infected by hepatitis virus in Taiwan.
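For readers unfamiliar with how an odds ratio and its confidence interval are obtained, the following is a minimal sketch of the unadjusted calculation from a 2x2 table, using the Woolf (log) method. It is not the multivariate model used in the study. The counts are back-calculated from the reported percentages (10 of 58 is 17.2%, 23 of 138 is 16.7%), which is an assumption, and the comparison here is G/G versus all other genotypes rather than G/G versus T/T.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a: cases with the genotype      b: cases without it
    c: controls with the genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, inferred from the percentages in the abstract:
# 10 of 58 cases and 23 of 138 controls carry G/G.
crude_or, ci_low, ci_high = odds_ratio_ci(10, 48, 23, 115)
print(f"crude OR = {crude_or:.3f}, 95% CI = {ci_low:.3f}-{ci_high:.3f}")
```

An interval that spans 1 indicates no significant association at the chosen level, which is how the wide intervals reported in the abstract should be read.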
A custom 60K SNP panel, extracted from Bovine HD SNP chip was used to evaluate genotypic frequency changes in Braford (BF, a composite breed) when compared to progenitor breeds: Hereford (HF), Brahman (BR), and Nelore (NE). Samples from both the U. S. and Brazil were used. The new panel differentiat...
Shi, Shanshan; Lin, Shaobin; Liao, Yanfen; Li, Weijing
To analyze a case with Angelman syndrome (AS) using single nucleotide polymorphism array (SNP array) and explore its genotype-phenotype correlation. G-banded karyotyping and SNP array were performed on a child featuring congenital malformations, intellectual disability and developmental delay. Mendelian error checking based on the SNP information was used to delineate the parental origin of the detected abnormality. The result of the SNP array was validated with fluorescence in situ hybridization (FISH). The SNP array detected a 6.053 Mb deletion at 15q11.2q13.1 (22,770,421-28,823,722) which overlapped with the critical region of AS (type 1). The parents of the child showed no abnormal results for G-banded karyotyping, SNP array and FISH analysis, indicating a de novo origin of the deletion. Mendelian error checking based on the SNP information suggested that the 15q11.2q13.1 deletion was of maternal origin. SNP array can accurately define the size, location and parental origin of chromosomal microdeletions, which may facilitate the diagnosis of AS due to 15q11q13 deletion and better understanding of its genotype-phenotype correlation.
Tomas Mas, Carmen; Børsting, Claus; Morling, Niels
SNPs are being increasingly used by forensic laboratories. Different platforms have been developed for SNP typing. We describe the GenPlex™ HID system protocol, a new SNP-typing platform developed by Applied Biosystems where 48 of the 52 SNPforID SNPs and amelogenin are included. The GenPlex™ HID...
In this thesis the results are described of investigations of various application of genome wide SNP (single nucleotide polymorphism) markers. The set of SNP markers was identified by GBS (genotyping by sequencing) strategy. The resulting dataset of 129,156 SNPs across 83 tetraploid varieties was us
Zhao, Xiao-Su; Fu, Wing-Yu; Chien, Winnie W Y; Li, Zhen; Fu, Amy K Y; Ip, Nancy Y
Cyclin-dependent kinase 5 (Cdk5) is a proline-directed serine/threonine kinase, which plays critical roles in a wide spectrum of neuronal functions including neuronal survival, neurite outgrowth, and synapse development and plasticity. Cdk5 activity is controlled by its specific activators: p35 or p39. While knockout studies reveal that Cdk5/p35 is critical for neuronal migration during early brain development, functions of Cdk5/p35 have been unraveled through the identification of the interacting proteins of p35, most of which are Cdk5/p35 substrates. However, it remains unclear whether p35 can regulate neuronal functions independent of Cdk5 activity. Here, we report that a nuclear protein, nuclear hormone receptor coregulator (NRC)-interacting factor 1 (NIF-1), is a new interacting partner of p35. Interestingly, p35 regulates the functions of NIF-1 independent of Cdk5 activity. NIF-1 was initially discovered as a transcriptional regulator that enhances the transcriptional activity of nuclear hormone receptors. Our results show that p35 interacts with NIF-1 and regulates its nucleocytoplasmic trafficking via the nuclear export pathway. Furthermore, we identified a nuclear export signal on p35; mutation of this site or blockade of the CRM1/exportin-dependent nuclear export pathway resulted in the nuclear accumulation of p35. Intriguingly, blocking the nuclear export of p35 attenuated the nuclear accumulation of NIF-1. These findings reveal a new p35-dependent mechanism in transcriptional regulation that involves the nucleocytoplasmic shuttling of transcription regulators.
BACKGROUND: The androgen receptor (AR) is a steroid-activated transcription factor that binds at specific DNA locations and plays a key role in the etiology of prostate cancer. While numerous studies have identified a clear connection between AR binding and expression of target genes for a limited number of loci, high-throughput elucidation of these sites allows for a deeper understanding of the complexities of this process. METHODOLOGY/PRINCIPAL FINDINGS: We have mapped 189 AR occupied regions (ARORs) and 1,388 histone H3 acetylation (AcH3) loci to a 3% continuous stretch of human genomic DNA using chromatin immunoprecipitation (ChIP) microarray analysis. Of 62 highly reproducible ARORs, 32 (52%) were also marked by AcH3. While the number of ARORs detected in prostate cancer cells exceeded the number of nearby DHT-responsive genes, the AcH3 mark defined a subclass of ARORs much more highly associated with such genes -- 12% of the genes flanking AcH3+ARORs were DHT-responsive, compared to only 1% of genes flanking AcH3-ARORs. Most ARORs contained enhancer activities as detected in luciferase reporter assays. Analysis of the AROR sequences, followed by site-directed ChIP, identified binding sites for AR transcriptional coregulators FoxA1, CEBPbeta, NFI and GATA2, which had diverse effects on endogenous AR target gene expression levels in siRNA knockout experiments. CONCLUSIONS/SIGNIFICANCE: We suggest that only some ARORs function under the given physiological conditions, utilizing diverse mechanisms. This diversity points to differential regulation of gene expression by the same transcription factor related to the chromatin structure.
Background: HAND2, a key regulator for the development of the sympathetic nervous system, is located on chromosome 4q33 in a head-to-head orientation with DEIN, a recently identified novel gene with stage specific expression in primary neuroblastoma (NB). Both genes are expressed in primary NB as well as most NB cell lines and are separated by a genomic sequence of 228 bp. The similar expression profile of both genes suggests a common transcriptional regulation mediated by a bidirectional promoter. Results: Northern Blot analysis of DEIN and HAND2 in 20 primary NBs indicated concurrent expression levels of the two genes, which was confirmed by microarray analysis of 236 primary NBs (Pearson's correlation coefficient r = 0.65). While DEIN expression in the latter cohort was associated with stage 4S (p = 0.02), HAND2 expression was not associated with tumor stage. In contrast, both HAND2 and DEIN transcript levels were highly associated with age at diagnosis DEIN orientation, an average 3.4 fold increase in activity was observed as compared to the promoterless vector, whereas an average 15.4 fold activation was detected in HAND2 orientation. The presence of two highly conserved putative regulatory elements, one of which was shown to enhance HAND2 expression in branchial arches previously, displayed weak repressor activity for both genes. Conclusion: HAND2 and DEIN represent a gene pair that is tightly linked by a bidirectional promoter in an evolutionary highly conserved manner. Expression of both genes in NB is co-regulated by asymmetrical activity of this promoter and modulated by the activity of two cis-regulatory elements acting as weak repressors. The concurrent quantitative and tissue specific expression of HAND2 and DEIN suggests a functional link between both genes.
Doron, Shany; Shweiki, Dorit
SNP-based research strongly affects our biomedical and clinically associated knowledge. Nonunique and false-positive SNP existence in commonly used datasets may thus lead to biased, inaccurate clinically associated conclusions. We designed a computational study to reveal the degree of nonunique/false-positive SNPs in the HapMap dataset. Two sets of SNP flanking sequences were used as queries for BLAT analysis against the human genome. 4.2% and 11.9% of HapMap SNPs align to the genome nonuniquely (long and short, respectively). Furthermore, an average of 7.9% nonunique SNPs are included in common commercial genotyping arrays (according to our designed probes). Nonunique SNPs identified in this study are represented to various degrees in clinically associated databases, stressing the consequence of inaccurate SNP annotation and hence SNP utilization. Unfortunately, our results question some disease-related genotyping analyses, raising a worrisome concern on their validity.
Background: The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model species. However, this abundance of sequences and polymorphisms creates new software needs. To fulfill these needs, we have developed a powerful, yet easy-to-use application. Results: The ngs_backbone software is a parallel pipeline capable of analyzing Sanger, 454, Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequence reads. Its main supported analyses are: read cleaning, transcriptome assembly and annotation, read mapping and single nucleotide polymorphism (SNP) calling and selection. In order to build a truly useful tool, the software development was paired with a laboratory experiment. All public tomato Sanger EST reads plus 14.2 million Illumina reads were employed to test the tool and predict polymorphism in tomato. The cleaned reads were mapped to the SGN tomato transcriptome obtaining a coverage of 4.2 for Sanger and 8.5 for Illumina. 23,360 single nucleotide variations (SNVs) were predicted. A total of 76 SNVs were experimentally validated, and 85% were found to be real. Conclusions: ngs_backbone is a new software package capable of analyzing sequences produced by NGS technologies and predicting SNVs with great accuracy. In our tomato example, we created a highly polymorphic collection of SNVs that will be a useful resource for tomato researchers and breeders. The software developed along with its documentation is freely available under the AGPL license and can be downloaded from http://bioinf.comav.upv.es/ngs_backbone/ or http://github.com/JoseBlanca/franklin.
Background: Good genetic progress for pig reproduction traits has been achieved using a quantitative genetics-based multi-trait BLUP evaluation system. At present, whole-genome single nucleotide polymorphism (SNP) panels provide a new tool for pig selection. The purpose of this study was to identify SNP associated with reproduction traits in the Finnish Landrace pig breed using the Illumina PorcineSNP60 BeadChip. Methods: Association of each SNP with different traits was tested with a weighted linear model, using SNP genotype as a covariate and animal as a random variable. Deregressed estimated breeding values of the progeny tested boars were used as the dependent variable and weights were based on their reliabilities. Statistical significance of the associations was based on Bonferroni-corrected P-values. Results: Deregressed estimated breeding values were available for 328 genotyped boars. Of the 62 163 SNP in the chip, 57 868 SNP had a call rate > 0.9 and 7 632 SNP were monomorphic. Statistically significant results (P-value P-value P-value = 1.69E-08 more than unfavourable double homozygote animals. A region on chromosome 9 (66 Mb) was statistically significant for piglet mortality between birth and weaning in later parity (0.44 piglets between homozygotes, P-value = 6.94E-08). Conclusions: Three separate regions on chromosome 9 gave significant results for litter size and pig mortality. The frequencies of favourable alleles of the significant SNP are moderate in the Finnish Landrace population and these SNP are thus valuable candidates for possible marker-assisted selection.
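The Bonferroni correction mentioned in the Methods can be illustrated with a short sketch. The number of SNPs actually tested is not stated in the abstract; the value below assumes that the monomorphic SNPs were removed from those passing the call-rate filter, and the significance level of 0.05 is likewise an assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumption: tested SNPs = call-rate-passing SNPs minus monomorphic SNPs.
n_tests = 57868 - 7632
threshold = bonferroni_threshold(0.05, n_tests)

# Nominal P-values quoted in the abstract.
reported_p_values = [1.69e-08, 6.94e-08]
significant = [p < threshold for p in reported_p_values]
print(f"tests={n_tests} threshold={threshold:.3e} significant={significant}")
```

Equivalently, each nominal P-value can be multiplied by the number of tests and compared with 0.05; under the stated assumptions both quoted P-values remain well below the threshold.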
Peter P. Grimminger
Background. To further improve the screening, diagnosis, and therapy of patients with nonsmall cell lung cancer (NSCLC), additional diagnostic tools are urgently needed. Gene expression of Cyclooxygenase-2 (COX-2) has been linked to prognosis in patients with NSCLC. The role of the COX-2 926G>C Single Nucleotide Polymorphism (SNP) in patients with NSCLC remains unclear. The aim of this study was to investigate the potential of the COX-2 926G>C SNP as a molecular marker in this disease. Methods. COX-2 926G>C SNP was analyzed in surgically resected tumor tissue of 85 patients with NSCLC using a PCR-based RFLP technique. Results. The COX-2 926G>C SNP genotypes were detected with the following frequencies: GG n=62 (73%), GC n=20 (23%), CC n=3 (4%). No associations were detectable between COX-2 SNP genotype and histology, grading or gender. COX-2 SNP was significantly associated with tumor stage (P=.032) and lymph node status (P=.016, Chi-square test). With a median followup of 85.9 months, the median survival was 59.7 months. No associations were seen between the COX-2 SNP genotype and patient prognosis. Conclusions. The COX-2 926G>C SNP is detectable at a high frequency in patients with NSCLC. The COX-2 926G>C SNP genotype is not a prognostic molecular marker in this disease. However, patients with the GC or CC genotype seem more susceptible to lymph node metastases and higher tumor stage than patients with the GG genotype. The results suggest COX-2 926G>C SNP as a molecular marker for lymph node involvement in this disease.
Selinski, Silvia; Blaszkewicz, Meinolf; Lehmann, Marie-Louise; Ovsiannikov, Daniel; Moormann, Oliver; Guballa, Christoph; Kress, Alexander; Truss, Michael C; Gerullis, Holger; Otto, Thomas; Barski, Dimitri; Niegisch, Günter; Albers, Peter; Frees, Sebastian; Brenner, Walburgis; Thüroff, Joachim W; Angeli-Greaves, Miriam; Seidel, Thilo; Roth, Gerhard; Dietrich, Holger; Ebbinghaus, Rainer; Prager, Hans M; Bolt, Hermann M; Falkenstein, Michael; Zimmermann, Anna; Klein, Torsten; Reckwitz, Thomas; Roemer, Hermann C; Löhlein, Dietrich; Weistenhöfer, Wobbeke; Schöps, Wolfgang; Hassan Rizvi, Syed Adibul; Aslam, Muhammad; Bánfi, Gergely; Romics, Imre; Steffens, Michael; Ekici, Arif B; Winterpacht, Andreas; Ickstadt, Katja; Schwender, Holger; Hengstler, Jan G; Golka, Klaus
Genotyping N-acetyltransferase 2 (NAT2) is of high relevance for individualized dosing of antituberculosis drugs and bladder cancer epidemiology. In this study we compared a recently published tagging single nucleotide polymorphism (SNP) (rs1495741) to the conventional 7-SNP genotype (G191A, C282T, T341C, C481T, G590A, A803G and G857A haplotype pairs) and systematically analysed whether novel SNP combinations outperform the latter. For this purpose, we studied 3177 individuals by PCR and phenotyped 344 individuals by the caffeine test. Although the tagSNP and the 7-SNP genotype showed a high degree of correlation (R=0.933, P<0.0001), the 7-SNP genotype nevertheless outperformed the tagging SNP with respect to specificity (1.0 vs. 0.9444, P=0.0065). Considering all possible SNP combinations in a receiver operating characteristic analysis, we identified a 2-SNP genotype (C282T, T341C) that outperformed the tagging SNP and was equivalent to the 7-SNP genotype. The 2-SNP genotype predicted the correct phenotype with a sensitivity of 0.8643 and a specificity of 1.0. In addition, it predicted the 7-SNP genotype with sensitivity and specificity of 0.9993 and 0.9880, respectively. The prediction of the NAT2 genotype by the 2-SNP genotype performed similarly in populations of Caucasian, Venezuelan and Pakistani background. A 2-SNP genotype predicts NAT2 phenotypes with similar sensitivity and specificity as the conventional 7-SNP genotype. This procedure simplifies individualized dosing of NAT2 substrates without losing sensitivity or specificity.
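Sensitivity and specificity, the two measures used above to compare genotype-based predictions with the measured phenotype, follow directly from a confusion table. The sketch below uses hypothetical counts chosen only for illustration; they are not the study's data, and which acetylator phenotype is treated as the "positive" class is an assumption the abstract does not settle.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from confusion-table counts.

    tp: positives predicted positive   fn: positives predicted negative
    fp: negatives predicted positive   tn: negatives predicted negative
    """
    sensitivity = tp / (tp + fn)   # share of true positives that are detected
    specificity = tn / (tn + fp)   # share of true negatives correctly excluded
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=90, fn=10, fp=5, tn=95)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

A specificity of 1.0, as reported for the 2-SNP and 7-SNP genotypes, corresponds to zero false positives in such a table.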
CluGene: A Bioinformatics Framework for the Identification of Co-Localized, Co-Expressed and Co-Regulated Genes Aimed at the Investigation of Transcriptional Regulatory Networks from High-Throughput Expression Data.
The full understanding of the mechanisms underlying transcriptional regulatory networks requires unravelling of complex causal relationships. Genome high-throughput technologies produce a huge amount of information pertaining to gene expression and regulation; however, the complexity of the available data is often overwhelming and tools are needed to extract and organize the relevant information. This work starts from the assumption that the observation of co-occurrent events (in particular co-localization, co-expression and co-regulation) may provide a powerful starting point to begin unravelling transcriptional regulatory networks. Co-expressed genes often imply shared functional pathways; co-expressed and functionally related genes are often co-localized, too; moreover, co-expressed and co-localized genes are also potential targets for co-regulation; finally, co-regulation seems more frequent for genes mapped to proximal chromosome regions. Despite the recognized importance of analysing co-occurrent events, no bioinformatics solution allowing the simultaneous analysis of co-expression, co-localization and co-regulation is currently available. Our work resulted in the development and evaluation of CluGene, software that provides tools to analyze multiple types of co-occurrences within a single interactive environment, allowing the interactive investigation of combined co-expression, co-localization and co-regulation of genes. The use of CluGene will enhance the power of hypothesis testing and of experimental approaches aimed at unravelling transcriptional regulatory networks. The software is freely available at http://bioinfolab.unipg.it/.
Villagran, Marcelo A; Gutierrez-Castro, Francisco A; Pantoja, Diego F; Alarcon, Jose C; Fariña, Macarena A; Amigo, Romina F; Muñoz-Godoy, Natalia A; Pinilla, Mabel G; Peña, Eduardo A; Gonzalez-Chavarria, Ivan; Toledo, Jorge R; Rivas, Coralia I; Vera, Juan C; McNerney, Eileen M; Onate, Sergio A
Prostate cancer (CaP) bone metastasis is an early event that remains inactive until later-stage progression. Reduced levels of circulating androgens, due to andropause or androgen deprivation therapies, alter androgen receptor (AR) coactivator expression. Coactivators shift the balance towards enhanced AR-mediated gene transcription that promotes progression to androgen-resistance. Disruptions in coregulators may represent a molecular switch that reactivates latent bone metastasis. Changes in AR-mediated transcription in androgen-sensitive LNCaP and androgen-resistant C4-2 cells were analyzed for AR coregulator recruitment in co-culture with Saos-2 and THP-1. The Saos-2 cell line derived from human osteosarcoma and THP-1 cell line representing human monocytes were used to display osteoblast and osteoclast activity. Increased AR activity in androgen-resistant C4-2 was due to increased AR expression and SRC1/TIF2 recruitment and decreased SMRT/NCoR expression. AR activity in both cell types was decreased over 90% when co-cultured with Saos-2 or THP-1 due to dissociation of AR from the SRC1/TIF2 and SMRT/NCoR coregulators complex, in a ligand-dependent and cell-type specific manner. In the absence of androgens, Saos-2 decreased while THP-1 increased proliferation of LNCaP cells. In contrast, both Saos-2 and THP-1 decreased proliferation of C4-2 in absence and presence of androgens. Global changes in gene expression from both CaP cell lines identified potential cell cycle and androgen regulated genes as mechanisms for changes in cell proliferation and AR-mediated transactivation in the context of bone marrow stroma cells. Copyright © 2015 Elsevier Inc. All rights reserved.
Villagran, Marcelo A.; Gutierrez-Castro, Francisco A.; Pantoja, Diego F.; Alarcon, Jose C.; Fariña, Macarena A.; Amigo, Romina F.; Muñoz-Godoy, Natalia A. [Molecular Endocrinology and Oncology Laboratory, University of Concepcion, Concepcion (Chile); Pinilla, Mabel G. [Department of Medical Specialties, School of Medicine, University of Concepcion, Concepcion (Chile); Peña, Eduardo A.; Gonzalez-Chavarria, Ivan; Toledo, Jorge R.; Rivas, Coralia I.; Vera, Juan C. [Department of Physiopathology, School of Biological Sciences, University of Concepcion, Concepcion (Chile); McNerney, Eileen M. [Molecular Endocrinology and Oncology Laboratory, University of Concepcion, Concepcion (Chile); Onate, Sergio A., E-mail: email@example.com [Molecular Endocrinology and Oncology Laboratory, University of Concepcion, Concepcion (Chile); Department of Medical Specialties, School of Medicine, University of Concepcion, Concepcion (Chile); Department of Urology, State University of New York at Buffalo, NY (United States)
Highlights: • Decreased corepressor expression change AR in androgen-resistance prostate cancer. • Bone stroma-derived cells change AR coregulator recruitment in prostate cancer. • Bone stroma cells change cell proliferation in androgen-resistant cancer cells. • Global gene expression in CaP cells is modified by bone stroma cells in co
Bose, Niranjan; Taylor, Ronald K.
The toxin-coregulated pilus (TCP) of Vibrio cholerae and the soluble TcpF protein that is secreted via the TCP biogenesis apparatus are essential for intestinal colonization. The TCP biogenesis apparatus is composed of at least nine proteins but is largely uncharacterized. TcpC is an outer membrane lipoprotein required for TCP biogenesis that is a member of the secretin protein superfamily. In the present study, analysis of TcpC in a series of strains deficient in each of the TCP biogenesis p...
In colorectal cancer (CRC), chromosomal instability (CIN) is typically studied using comparative-genomic hybridization (CGH) arrays. We studied paired (tumor and surrounding healthy) fresh frozen tissue from 86 CRC patients using Illumina's Infinium-based SNP array. This method allowed us to study CIN in CRC, with simultaneous analysis of copy number (CN) and B-allele frequency (BAF), a representation of allelic composition. These data helped us to detect mono-allelic and bi-allelic amplifications/deletions, copy neutral loss of heterozygosity, and levels of mosaicism for mixed cell populations, some of which cannot be assessed with other methods that do not measure BAF. We identified associations between CN abnormalities and different CRC phenotypes (histological diagnosis, location, tumor grade, stage, MSI and presence of lymph node metastasis). We showed commonalities between regions of CN change observed in CRC and the regions reported in previous studies of other solid cancers (e.g. amplifications of 20q, 13q, 8q, 5p and deletions of 18q, 17p and 8p). From the Therapeutic Target Database, we identified relevant drugs, targeted to the genes located in these regions with CN changes, approved or in trials for other cancers and common diseases. These drugs may be considered for future therapeutic trials in CRC, based on personalized cytogenetic diagnosis. We also found many regions, harboring genes, which are not currently targeted by any relevant drugs and which may be considered for future drug discovery studies. Our study shows the application of high density SNP arrays for cytogenetic study in CRC and its potential utility for personalized treatment.
Garcés-Claver, Ana; Fellman, Shanna Moore; Gil-Ortega, Ramiro; Jahn, Molly; Arnedo-Andrés, María S
A single nucleotide polymorphism (SNP) associated with pungency was detected within an expressed sequence tag (EST) of 307 bp. This fragment was identified after expression analysis of the EST clone SB2-66 in placenta tissue of Capsicum fruits. Sequence alignments corresponding to this new fragment allowed us to identify an SNP between pungent and non-pungent accessions. Two methods were chosen for the development of the SNP marker linked to pungency: tetra-primer amplification refractory mutation system-PCR (tetra-primer ARMS-PCR) and cleaved amplified polymorphic sequence. Results showed that both methods were successful in distinguishing genotypes. Nevertheless, tetra-primer ARMS-PCR was chosen for SNP genotyping because it was more rapid, more reliable and less costly. The utility of this SNP marker for pungency was demonstrated by the ability to distinguish between 29 pungent and non-pungent cultivars of Capsicum annuum. In addition, the SNP was also associated with phenotypic pungent character in the tested genotypes of C. chinense, C. baccatum, C. frutescens, C. galapagoense, C. eximium, C. tovarii and C. cardenasi. This SNP marker is a faster, cheaper and more reproducible method for identifying pungent peppers than other techniques such as panel tasting, and allows rapid screening of the trait in early growth stages.
Yang, Cheng-Hong; Cheng, Yu-Huei; Yang, Cheng-Huei; Chuang, Li-Yeh
Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) is useful in small-scale basic research studies of complex genetic diseases that are associated with single nucleotide polymorphisms (SNPs). Designing a feasible primer pair is an important step before performing PCR-RFLP for SNP genotyping. However, in many cases no available restriction enzyme discriminates the target SNP, so a conventional primer design is not applicable. A mutagenic primer is introduced to solve this problem. GA-based Mismatch PCR-RFLP Primers Design (GAMPD) provides a method that uses a genetic algorithm to search for optimal mutagenic primers and available restriction enzymes from REBASE. In order to improve the efficiency of the proposed method, a mutagenic matrix is employed to judge whether a hypothetical mutagenic primer can discriminate the target SNP by digestion with available restriction enzymes. The available restriction enzymes for the target SNP are mined by the updated core of SNP-RFLPing. GAMPD has been used to simulate the SNPs in the human SLC6A4 gene under different parameter settings and compared with SNP Cutter for mismatch PCR-RFLP primer design. The in silico simulation of the proposed GAMPD program showed that it designs mismatch PCR-RFLP primers. The GAMPD program is implemented in Java and is freely available at http://bio.kuas.edu.tw/gampd/.
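The discrimination test at the heart of mismatch PCR-RFLP can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy flanking sequence and the three-enzyme table are hypothetical, the recognition sites are hard-coded rather than mined from REBASE, only the forward strand is scanned, and an exhaustive single-base search stands in for GAMPD's genetic algorithm.

```python
# Toy enzyme table (assumption: hard-coded here; a real tool would query REBASE).
ENZYMES = {"EcoRI": "GAATTC", "TaqI": "TCGA", "HinfI": "GANTC"}

# Minimal IUPAC lookup: N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def site_matches(site, seq, start):
    """True if the recognition site matches seq beginning at index start."""
    return all(seq[start + i] in IUPAC[c] for i, c in enumerate(site))


def site_covers_snp(site, seq, snp_pos):
    """True if some occurrence of the site overlaps the SNP position."""
    first = max(0, snp_pos - len(site) + 1)
    last = min(snp_pos, len(seq) - len(site))
    return any(site_matches(site, seq, s) for s in range(first, last + 1))


def discriminating_enzymes(left, alleles, right):
    """Enzymes whose site covers the SNP for exactly one of the two alleles."""
    snp_pos = len(left)
    hits = []
    for name, site in ENZYMES.items():
        covered = [site_covers_snp(site, left + a + right, snp_pos) for a in alleles]
        if covered[0] != covered[1]:
            hits.append(name)
    return hits


def mutagenic_candidates(left, alleles, right, window=3):
    """Try every single-base mismatch in the last `window` bases before the SNP.

    Returns (offset_from_snp, substituted_base, enzyme) tuples for mismatches
    that create an allele-specific recognition site.
    """
    found = []
    for offset in range(1, window + 1):
        idx = len(left) - offset
        for base in "ACGT":
            if base == left[idx]:
                continue
            mutated = left[:idx] + base + left[idx + 1:]
            for enzyme in discriminating_enzymes(mutated, alleles, right):
                found.append((offset, base, enzyme))
    return found


# Hypothetical C/T SNP with invented flanking sequence.
print(discriminating_enzymes("ACGTTAGCAA", ("C", "T"), "GATGCA"))
print(mutagenic_candidates("ACGTTAGCAA", ("C", "T"), "GATGCA"))
```

In this toy case no enzyme in the table distinguishes the alleles natively, while a single mismatch directly upstream of the SNP creates a TaqI site for the C allele only.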
Marjolein van Gent
To monitor changes in Bordetella pertussis populations, mainly two typing methods are used: Pulsed-Field Gel Electrophoresis (PFGE) and Multiple-Locus Variable-Number Tandem Repeat Analysis (MLVA). In this study, a single nucleotide polymorphism (SNP) typing method, based on 87 SNPs, was developed and compared with PFGE and MLVA. The discriminatory indices of SNP typing, PFGE and MLVA were found to be 0.85, 0.95 and 0.83, respectively. Phylogenetic analysis, using SNP typing as Gold Standard, revealed false homoplasies in the PFGE and MLVA trees. Further, in contrast to the SNP-based tree, the PFGE- and MLVA-based trees did not reveal a positive correlation between root-to-tip distance and the isolation year of strains. Thus PFGE and MLVA do not allow an estimation of the relative age of the selected strains. In conclusion, SNP typing was found to be phylogenetically more informative than PFGE and more discriminative than MLVA. Further, in contrast to PFGE, it is readily standardized, allowing interlaboratory comparisons. We applied SNP typing to study strains with a novel allele for the pertussis toxin promoter, ptxP3, which have a worldwide distribution and which have replaced the resident ptxP1 strains in the last 20 years. Previously, we showed that ptxP3 strains showed increased pertussis toxin expression and that their emergence was associated with increased notification in The Netherlands. SNP typing showed that the ptxP3 strains isolated in the Americas, Asia, Australia and Europe formed a monophyletic branch which recently diverged from ptxP1 strains. Two predominant ptxP3 SNP types were identified which spread worldwide. The widespread use of SNP typing will enhance our understanding of the evolution and global epidemiology of B. pertussis.
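The abstract does not say how the discriminatory indices were calculated. A common choice for comparing typing methods is Simpson's index of diversity in the Hunter-Gaston form, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of isolates of type j and N the total number of isolates. The sketch below assumes that formula; the function name and the example type labels are invented for illustration.

```python
from collections import Counter


def discriminatory_index(type_assignments):
    """Hunter-Gaston discriminatory index for a list of type labels.

    Gives the probability that two isolates drawn at random without
    replacement are assigned to different types.
    """
    n = len(type_assignments)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_assignments)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Invented labels: four isolates, two of which share a type.
print(discriminatory_index(["T1", "T1", "T2", "T3"]))
```

An index of 1.0 means every isolate has its own type; 0.0 means all isolates are indistinguishable by the method.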
Wong, Melissa M L; Cannon, Charles H; Wickneswari, Ratnam
Next Generation Sequencing has provided comprehensive, affordable and high-throughput DNA sequences for Single Nucleotide Polymorphism (SNP) discovery in Acacia auriculiformis and Acacia mangium. Like other non-model species, SNP detection and genotyping in Acacia are challenging due to lack of genome sequences. The main objective of this study is to develop the first high-throughput SNP genotyping assay for linkage map construction of A. auriculiformis x A. mangium hybrids. We identified a total of 37,786 putative SNPs by aligning short read transcriptome data from four parents of two Acacia hybrid mapping populations using Bowtie against 7,839 de novo transcriptome contigs. Given a set of 10 validated SNPs from two lignin genes, our in silico SNP detection approach is highly accurate (100%) compared to the traditional in vitro approach (44%). Further validation of 96 SNPs using Illumina GoldenGate Assay gave an overall assay success rate of 89.6% and conversion rate of 37.5%. We explored possible factors lowering assay success rate by predicting exon-intron boundaries and paralogous genes of Acacia contigs using Medicago truncatula genome as reference. This assessment revealed that presence of exon-intron boundary is the main cause (50%) of assay failure. Subsequent SNPs filtering and improved assay design resulted in assay success and conversion rate of 92.4% and 57.4%, respectively based on 768 SNPs genotyping. Analysis of clustering patterns revealed that 27.6% of the assays were not reproducible and flanking sequence might play a role in determining cluster compression. In addition, we identified a total of 258 and 319 polymorphic SNPs in A. auriculiformis and A. mangium natural germplasms, respectively. We have successfully discovered a large number of SNP markers in A. auriculiformis x A. mangium hybrids using next generation transcriptome sequencing. By using a reference genome from the most closely related species, we converted most SNPs to successful
Wong Melissa ML
Abstract Background Next Generation Sequencing has provided comprehensive, affordable and high-throughput DNA sequences for Single Nucleotide Polymorphism (SNP) discovery in Acacia auriculiformis and Acacia mangium. Like other non-model species, SNP detection and genotyping in Acacia are challenging due to lack of genome sequences. The main objective of this study is to develop the first high-throughput SNP genotyping assay for linkage map construction of A. auriculiformis x A. mangium hybrids. Results We identified a total of 37,786 putative SNPs by aligning short read transcriptome data from four parents of two Acacia hybrid mapping populations using Bowtie against 7,839 de novo transcriptome contigs. Given a set of 10 validated SNPs from two lignin genes, our in silico SNP detection approach is highly accurate (100%) compared to the traditional in vitro approach (44%). Further validation of 96 SNPs using Illumina GoldenGate Assay gave an overall assay success rate of 89.6% and conversion rate of 37.5%. We explored possible factors lowering assay success rate by predicting exon-intron boundaries and paralogous genes of Acacia contigs using Medicago truncatula genome as reference. This assessment revealed that presence of exon-intron boundary is the main cause (50%) of assay failure. Subsequent SNPs filtering and improved assay design resulted in assay success and conversion rate of 92.4% and 57.4%, respectively based on 768 SNPs genotyping. Analysis of clustering patterns revealed that 27.6% of the assays were not reproducible and flanking sequence might play a role in determining cluster compression. In addition, we identified a total of 258 and 319 polymorphic SNPs in A. auriculiformis and A. mangium natural germplasms, respectively. Conclusion We have successfully discovered a large number of SNP markers in A. auriculiformis x A. mangium hybrids using next generation transcriptome sequencing. By using a reference genome from the most closely
Zheng, Jie; Gaunt, Tom R; Day, Ian N M
Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data. © 2012 Blackwell Publishing Ltd/University College London.
Valenzuela-Miranda, Diego; Gallardo-Escárate, Cristian; Valenzuela-Muñoz, Valentina; Farlora, Rodolfo; Gajardo, Gonzalo
In order to enhance genomic resources for the brine shrimp Artemia franciscana, RNA-Seq analysis was conducted for adult females and males. Through de novo assembly, 36,896 high quality contigs were obtained, of which 13,749 sequences were annotated with arthropod sequences. Just 4.5% matched against previously reported sequences for Artemia spp. Additionally, different transcriptional patterns between males and females were found, evidencing sex-related transcriptional responses. Furthermore, 221 and 534 putative SNPs were identified exclusively in males and females, respectively. These results will build the foundation for further genomic studies in A. franciscana.
Basal stalk rot (BSR) caused by the ascomycete fungus Sclerotinia sclerotiorum (Lib.) de Bary is a serious disease of sunflower (Helianthus annuus L.) in the cool and humid production areas of the world. Quantitative trait loci (QTL) for BSR resistance were identified in a sunflower recombinant inbr...
Binder, Harald; Müller, Tina; Schwender, Holger; Golka, Klaus; Steffens, Michael; Hengstler, Jan G; Ickstadt, Katja; Schumacher, Martin
The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, these groups of individuals are identified using a clustering approach, where each group may be defined via different SNPs. This allows for representing complex interaction patterns, such as compositional epistasis, that might not be detected by a single main effects model. In a simulation study, the CLR approach results in improved prediction performance, compared to the main effects approach, and identification of important SNPs in several scenarios. Improved prediction performance is also obtained for an application example considering urinary bladder cancer. Some of the identified SNPs are predictive for all individuals, while others are only relevant for a specific group. Together with the sets of SNPs that define the groups, potential interaction patterns are uncovered.
Abstract Background Several million single nucleotide polymorphisms (SNPs) have already been collected and deposited in public databases and these are important resources not only for use as markers to identify disease-associated genes, but also to understand the mechanisms that underlie the genome diversification. Results A spectrum analysis of SNP density distribution in the genomic regions around transcription start sites (TSSs) revealed a remarkable periodicity of 146 nucleotides. This periodicity was observed in the regions that were associated with CpG islands (CGIs), but not in the regions without CpG islands (nonCGIs). An analysis of the sequence divergence of the same genomic regions between humans and chimpanzees also revealed a similar periodical pattern in CGI. The occurrences of any mono- or di-nucleotide sequences in these regions did not reveal such a periodicity, thus indicating that an interpretation of this periodicity solely based on the sequence-dependent susceptibility to mutation is highly unlikely. Conclusion The periodical patterns of nucleotide variability suggest the location of nucleosomes that are phased at TSS, and can be viewed as the genetic footprint of the chromatin state that has been maintained throughout mammalian evolutionary history. The results suggest the possible involvement of the nucleosome structure in the promoter function, and also a fundamental functional/structural difference between the two promoter classes, i.e., those with and without CGIs.
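A spectrum analysis of this kind can be illustrated with a plain discrete Fourier periodogram evaluated at candidate periods. The sketch below is not the authors' pipeline. The function names are hypothetical, and the input is a synthetic, noise-free density profile constructed with a 146-nucleotide period purely to show the mechanics; it is not real SNP data.

```python
import cmath
import math


def power_at_period(signal, period):
    """Periodogram power of the mean-centred signal at frequency 1/period."""
    n = len(signal)
    mean = sum(signal) / n
    acc = sum(
        (x - mean) * cmath.exp(-2j * math.pi * t / period)
        for t, x in enumerate(signal)
    )
    return abs(acc) ** 2 / n


def dominant_period(signal, candidate_periods):
    """Candidate period (in positions) with the highest periodogram power."""
    return max(candidate_periods, key=lambda p: power_at_period(signal, p))


# Synthetic per-position density: constant background plus a cosine with a
# 146-position period, ten full cycles long (illustrative only).
synthetic_density = [10 + 3 * math.cos(2 * math.pi * t / 146) for t in range(1460)]
print(dominant_period(synthetic_density, range(100, 200)))
```

With real data the profile would be the SNP count per position averaged over many TSS-aligned regions, and the peak would have to be judged against a background such as shuffled positions.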
Vukcevic, Damjan; Traherne, James A; Næss, Sigrid; Ellinghaus, Eva; Kamatani, Yoichiro; Dilthey, Alexander; Lathrop, Mark; Karlsen, Tom H; Franke, Andre; Moffatt, Miriam; Cookson, William; Trowsdale, John; McVean, Gil; Sawcer, Stephen; Leslie, Stephen
Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR∗IMP, a method for imputation of KIR copy number. We show that KIR∗IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease.
Douglas S Goodin
BACKGROUND: Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association. METHODOLOGY/PRINCIPAL FINDINGS: Here we develop a method to identify the two SNP-haplotypes, which combine to produce each person's SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled: DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ∼200 kb of DNA. The SNP-information was converted into an ordered-set of eleven-numbers (subject-vectors) based on whether a person had zero, one, or two copies of particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered-combinations of eleven-numbers (0 or 1), representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself
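The relationship between a subject-vector (entries 0, 1 or 2) and the pair of SNP-strings (entries 0 or 1) that sum to it can be made concrete with a short Python sketch. The abstract does not describe the probabilistic resolution step, so the scoring here is an assumption: candidate pairs are ranked by the product of population haplotype frequencies under Hardy-Weinberg proportions. The function names and the frequency table are hypothetical.

```python
from itertools import product


def compatible_pairs(subject_vector):
    """All unordered pairs of 0/1 SNP-strings that sum to the subject-vector."""
    het_sites = [i for i, g in enumerate(subject_vector) if g == 1]
    # Homozygous sites are fixed: 0 -> (0, 0) and 2 -> (1, 1).
    base = [g // 2 for g in subject_vector]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def most_likely_pair(subject_vector, haplotype_freq):
    """Pick the compatible pair with the highest Hardy-Weinberg probability.

    haplotype_freq maps SNP-strings (tuples of 0/1) to assumed population
    frequencies; strings missing from the table are treated as absent.
    """
    best_pair, best_prob = None, 0.0
    for h1, h2 in compatible_pairs(subject_vector):
        prob = haplotype_freq.get(h1, 0.0) * haplotype_freq.get(h2, 0.0)
        if h1 != h2:
            prob *= 2  # two orderings of a heterozygous pair
        if prob > best_prob:
            best_pair, best_prob = (h1, h2), prob
    return best_pair, best_prob


# Four-SNP toy example (the study used eleven SNPs per region).
freqs = {(0, 1, 0, 0): 0.5, (1, 1, 0, 1): 0.3, (0, 1, 0, 1): 0.1, (1, 1, 0, 0): 0.1}
print(compatible_pairs((1, 2, 0, 1)))
print(most_likely_pair((1, 2, 0, 1), freqs))
```

A subject-vector with k heterozygous sites has 2^(k-1) compatible pairs for k ≥ 1, which is why observing only a small number of SNP-strings in a region makes the resolution tractable.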
Martinez, Pierre; Kimberley, Christopher; Birkbak, Nicolai Juul
standard genomic data such as SNP-arrays, that could be implemented routinely. We designed two novel scores S and R, respectively based on the Shannon diversity index and Ripley's L statistic of spatial homogeneity, to quantify ITH in single SNP-array samples. We created in-silico and in-vitro mixtures...... sequencing data but heterogeneity in the fraction of tumour cells present across samples hampered accurate quantification. The prognostic potential of both scores was moderate but significantly predictive of survival in several tumour types (corrected p = 0.03). Our work thus shows how individual SNP...
Sedighi, Abootaleb; Li, Paul C H
Here, we describe detection of single nucleotide polymorphism (SNP) in genomic DNA samples using a NanoBioArray (NBA) chip. Fast DNA hybridization is achieved in the chip when target DNAs are introduced to the surface-arrayed probes using centrifugal force. Gold nanoparticles (AuNPs) are used to assist SNP detection at room temperature. The parallel setting of sample introduction in the spiral channels of the NBA chip enables multiple analyses on many samples, resulting in a technique appropriate for high-throughput SNP detection. The experimental procedure, including chip fabrication, probe array printing, DNA amplification, hybridization, signal detection, and data analysis, is described in detail.
Hansen, Thomas V. O.; Vikesaa, Jonas; Buhl, Sine S
) arrays can provide additional diagnostic power to assess HER2 gene status. METHODS: DNA from 65 breast tumor samples previously diagnosed by HER2 IHC and FISH analysis were blinded and examined for HER2 copy number variation employing SNP array analysis. RESULTS: SNP array analysis identified 24 (37......%) samples with selective amplification or imbalance of the HER2 region in the q-arm of chromosome 17. In contrast, only 15 (23%) tumors were found to have HER2 amplification by IHC and FISH analysis. In total, there was a discrepancy in 19 (29%) samples between SNP array and IHC/FISH analysis. In 12...
Interim Report on SNP analysis and forensic microarray probe design for South American hemorrhagic fever viruses, tick-borne encephalitis virus, henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever viruses, Rift Valley fever
Jaing, C; Gardner, S
The goal of this project is to develop forensic genotyping assays for select agent viruses, enhancing the current capabilities for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the whole genome wide SNP analysis and microarray probe design for forensics characterization of South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.
LIAONING Province, in northeastern China, has been inhabited by many ethnic groups since ancient times. It is one of the sites of China’s earliest civilization. Since the 1950s many archaeological discoveries from periods beginning with the Paleolithic of 200,000 years ago, and through all the following historic periods, have been made in the province.
Christiansen, Elisabeth; Hansen, Steffen V F; Urban, Christian;
Free fatty acid receptor 1 (FFA1 or GPR40) enhances glucose-stimulated insulin secretion from pancreatic β-cells and currently attracts high interest as a new target for the treatment of type 2 diabetes. We here report the discovery of a highly potent FFA1 agonist with favorable physicochemical a...
Balachandran, Premalatha; Govindarajan, Rajgopal
Ayurveda is a major traditional system of Indian medicine that is still being successfully used in many countries. Recapitulation and adaptation of the older science to modern drug discovery processes can bring renewed interest to the pharmaceutical world and offer unique therapeutic solutions for a wide range of human disorders. Even though time-tested evidence vouches for immense therapeutic benefits of ayurvedic herbs and formulations, several important issues need to be resolved for successful implementation of ayurvedic principles in present drug discovery methodologies. Additionally, the efficacy, safety and drug interactions of newly developed ayurvedic drugs and formulations need to be carefully evaluated in clinical examinations. Ayurvedic experts suggest a reverse-pharmacology approach focusing on the potential targets for which ayurvedic herbs and herbal products could bring tremendous leads to ayurvedic drug discovery. Although several novel leads and drug molecules have already been discovered from ayurvedic medicinal herbs, further scientific explorations in this arena along with customization of present technologies to ayurvedic drug manufacturing principles would greatly facilitate a standardized ayurvedic drug discovery.
Contributes to a special issue on how the reconsideration of what scholarship is affects the way in which scholarship is assessed. Examines traditional criteria for evaluating faculty research. Identifies activities pertinent to the scholarship of discovery, and the assessment practices in the field of communication as well as in general use. (SR)
Wilson, Harold C.
Discovery Education is based on the writings of Henry David Thoreau, an early champion of experiential learning. After 2 months of preparation, 10th-grade students spent 4 days in the wilderness reenacting a piece of history, such as the Lewis and Clark Expedition. The interdisciplinary approach always included journal-writing. Students gained…
Haeupler, Bernhard; Peleg, David; Rajaraman, Rajmohan; Sun, Zhifeng
We study randomized gossip-based processes in dynamic networks that are motivated by discovery processes in large-scale distributed networks like peer-to-peer or social networks. A well-studied problem in peer-to-peer networks is the resource discovery problem. There, the goal for nodes (hosts with IP addresses) is to discover the IP addresses of all other hosts. In social networks, nodes (people) discover new nodes through exchanging contacts with their neighbors (friends). In both cases the discovery of new nodes changes the underlying network - new edges are added to the network - and the process continues in the changed network. Rigorously analyzing such dynamic (stochastic) processes with a continuously self-changing topology remains a challenging problem with obvious applications. This paper studies and analyzes two natural gossip-based discovery processes. In the push process, each node repeatedly chooses two random neighbors and puts them in contact (i.e., "pushes" their mutual information to each oth...
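The push process described above can be simulated in a few lines of Python. This is a minimal sketch and not the paper's analysis. It assumes synchronous rounds in which every node picks its two neighbors from the graph as it stood at the start of the round; the function names and the path-graph starting topology are illustrative choices.

```python
import random


def path_graph(n):
    """Adjacency sets for a path 0 - 1 - ... - (n-1)."""
    adj = {v: set() for v in range(n)}
    for v in range(n - 1):
        adj[v].add(v + 1)
        adj[v + 1].add(v)
    return adj


def push_round(adj, rng):
    """One synchronous round: each node introduces two random neighbors."""
    introductions = []
    for node in adj:
        neighbors = sorted(adj[node])
        if len(neighbors) >= 2:
            introductions.append(tuple(rng.sample(neighbors, 2)))
    # Apply all new edges only after every node has chosen.
    for u, v in introductions:
        adj[u].add(v)
        adj[v].add(u)


def rounds_until_complete(adj, seed=0, max_rounds=100_000):
    """Run push rounds until every node knows every other node."""
    rng = random.Random(seed)
    n = len(adj)
    rounds = 0
    while any(len(nbrs) < n - 1 for nbrs in adj.values()):
        if rounds >= max_rounds:
            raise RuntimeError("graph did not become complete")
        push_round(adj, rng)
        rounds += 1
    return rounds


print(rounds_until_complete(path_graph(8), seed=1))
```

Because each introduction adds an edge, later rounds run on a denser graph than earlier ones, which is the self-changing topology that makes the process hard to analyze rigorously.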
Dana B Hancock
Genome-wide association studies have identified numerous genetic loci for spirometric measures of pulmonary function, forced expiratory volume in one second (FEV1), and its ratio to forced vital capacity (FEV1/FVC). Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA) of single nucleotide polymorphism (SNP) and SNP-by-smoking (ever-smoking or pack-years) associations on FEV1 and FEV1/FVC across 19 studies (total N = 50,047). We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest P_JMA = 5.00×10^-11), HLA-DQB1 and HLA-DQA2 (smallest P_JMA = 4.35×10^-9), and KCNJ2 and SOX9 (smallest P_JMA = 1.28×10^-8) were associated with FEV1/FVC or FEV1 in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years) interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects.
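A joint test of a SNP main effect and a SNP-by-smoking interaction can be sketched as a 2-degree-of-freedom Wald test. The snippet below is a single-study illustration only, assuming the two effect estimates and their 2×2 covariance matrix are already available; it does not reproduce the cross-study meta-analysis, and the function name and the numbers in the example are hypothetical.

```python
import math


def joint_wald_test(beta_snp, beta_int, var_snp, var_int, cov):
    """2-df Wald test of H0: beta_snp = 0 and beta_int = 0.

    W = b' V^-1 b, where b = (beta_snp, beta_int) and V is the 2x2
    covariance matrix of the two estimates. Under H0, W is approximately
    chi-square with 2 degrees of freedom, whose survival function is
    exp(-W / 2).
    """
    det = var_snp * var_int - cov * cov
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    w = (beta_snp ** 2 * var_int
         - 2.0 * beta_snp * beta_int * cov
         + beta_int ** 2 * var_snp) / det
    p_value = math.exp(-w / 2.0)
    return w, p_value


if __name__ == "__main__":
    # Hypothetical estimates, for illustration only.
    w, p = joint_wald_test(0.10, 0.05, 0.0004, 0.0004, 0.0)
    print(f"W = {w:.2f}, P = {p:.2e}")
```

The joint statistic can be large even when neither coefficient is striking on its own, which is why such a test can flag loci that a main-effect-only scan misses.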
Gardner, S; Jaing, C
The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we describe the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.
张婧; 沈礼; 戴长建
The field ionization process of the Eu 4f^7 6snp Rydberg states, converging to the first ionization limit, 4f^7 6s ^9S_4, is systematically investigated. The Eu 4f^7 6snp Rydberg states are populated with three-step laser excitation and detected by the electric field ionization (EFI) method. Two different kinds of EFI pulses are applied after laser excitation to observe the possible impacts on the EFI process. The exact EFI thresholds for the 4f^7 6snp Rydberg states can be determined by observing the corresponding EFI spectra. In particular, some structures above the EFI threshold are found in the EFI spectra, which may be interpreted as the effect of black body radiation (BBR). Finally, the scaling law of the EFI threshold for the Eu 4f^7 6snp Rydberg states with the effective quantum number is established.
Ren, Jing; Chen, Liang; Sun, Daokun; You, Frank M; Wang, Jirui; Peng, Yunliang; Nevo, Eviatar; Beiles, Avigdor; Sun, Dongfa; Luo, Ming-Cheng; Peng, Junhua
.... However, few studies have been performed on the genetic structure and population divergence in wild emmer wheat using a large number of EST-related single nucleotide polymorphism (SNP) markers...
Sherry, S T; Ward, M; Sirotkin, K
While high quality information regarding variation in genes is currently available in locus-specific or specialized mutation databases, the need remains for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping, and evolutionary biology. In response to this need, the National Center for Biotechnology Information (NCBI) has established the dbSNP database http://ncbi.nlm.nih.gov/SNP/ to serve as a generalized, central variation database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink, and the Human Genome Project data, and the complete contents of dbSNP are available to the public via anonymous FTP. Hum Mutat 15:68-75, 2000. Published 2000 Wiley-Liss, Inc.
Malov, Sergey V; Antonik, Alexey; Tang, Minzhong; Berred, Alexandre; Zeng, Yi; O'Brien, Stephen J
A new approach for statistical association signal identification is developed in this paper. We consider a strategy for nonprecise signal identification by extending the well-known signal detection and signal identification methods applicable to the multiple testing problem. Collection of statistical instruments under the presented approach is much broader than under the traditional signal identification methods, allowing more efficient signal discovery. Further assessments of maximal value and average statistics in signal discovery are improved. While our method does not attempt to detect individual predictors, it instead detects sets of predictors that are jointly associated with the outcome. Therefore, an important application would be in genome wide association study (GWAS), where it can be used to detect genes which influence the phenotype but do not contain any individually significant single nucleotide polymorphism (SNP). We compare power of the signal identification method based on extremes of single p-values with the signal localization method based on average statistics for logarithms of p-values. A simulation analysis informs the application of signal localization using the average statistics for wide signals discovery in Gaussian white noise process. We apply average statistics and the localization method to GWAS to discover better gene influences of regulating loci in a Chinese cohort developed for risk of nasopharyngeal carcinoma (NPC).
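The contrast between a statistic based on the smallest single p-value and one based on the average of log p-values can be illustrated as follows. This is a sketch under a strong assumption that the p-values in a set are independent (SNPs in linkage disequilibrium violate this), and it uses two standard stand-ins rather than the paper's own procedure: a Šidák correction for the minimum p-value and the Erlang (Fisher-type) distribution for the sum of -ln(p). All function names are hypothetical.

```python
import math


def min_p_set_pvalue(pvals):
    """Set-level p-value from the smallest p-value (Sidak correction).

    Assumes the m p-values are independent and uniform under the null.
    """
    m = len(pvals)
    return 1.0 - (1.0 - min(pvals)) ** m


def mean_log_p_set_pvalue(pvals):
    """Set-level p-value from the average of -ln(p).

    Under independence each -ln(p) is Exp(1), so the sum over m p-values
    is Erlang(m, 1) with survival function exp(-x) * sum_{k<m} x^k / k!.
    This is equivalent to Fisher's combination method.
    """
    m = len(pvals)
    x = sum(-math.log(p) for p in pvals)
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= x / k
        total += term
    return math.exp(-x) * total


if __name__ == "__main__":
    # A "wide" signal: ten predictors, each only weakly significant.
    wide = [0.04] * 10
    print("min-p  :", min_p_set_pvalue(wide))
    print("average:", mean_log_p_set_pvalue(wide))
```

In the wide-signal example no single p-value survives correction, while the averaged statistic is strongly significant, which mirrors the case of a gene that influences a phenotype without containing any individually significant SNP.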
In a guided-discovery lesson, students sequentially uncover layers of mathematical information one step at a time and learn new mathematics. We have identified eight critical steps necessary in developing a successful guided-discovery lesson.
Thomas George H
Abstract Background A variety of diseases are caused by chromosomal abnormalities such as aneuploidies (having an abnormal number of chromosomes), microdeletions, microduplications, and uniparental disomy. High density single nucleotide polymorphism (SNP) microarrays provide information on chromosomal copy number changes, as well as genotype (heterozygosity and homozygosity). SNP array studies generate multiple types of data for each SNP site, some with more than 100,000 SNPs represented on each array. The identification of different classes of anomalies within SNP data has been challenging. Results We have developed SNPscan, a web-accessible tool to analyze and visualize high density SNP data. It enables researchers (1) to visually and quantitatively assess the quality of user-generated SNP data relative to a benchmark data set derived from a control population, (2) to display SNP intensity and allelic call data in order to detect chromosomal copy number anomalies (duplications and deletions), (3) to display uniparental isodisomy based on loss of heterozygosity (LOH) across genomic regions, (4) to compare paired samples (e.g. tumor and normal), and (5) to generate a file type for viewing SNP data in the University of California, Santa Cruz (UCSC) Human Genome Browser. SNPscan accepts data exported from Affymetrix Copy Number Analysis Tool as its input. We validated SNPscan using data generated from patients with known deletions, duplications, and uniparental disomy. We also inspected previously generated SNP data from 90 apparently normal individuals from the Centre d'Étude du Polymorphisme Humain (CEPH) collection, and identified three cases of uniparental isodisomy, four females having an apparently mosaic X chromosome, two mislabelled SNP data sets, and one microdeletion on chromosome 2 with mosaicism from an apparently normal female. These previously unrecognized abnormalities were all detected using SNPscan. The microdeletion was independently
Tomas Mas, Carmen; Børsting, Claus; Morling, Niels
SNPs are being increasingly used by forensic laboratories. Different platforms have been developed for SNP typing. We describe the GenPlex™ HID system protocol, a new SNP-typing platform developed by Applied Biosystems where 48 of the 52 SNPforID SNPs and amelogenin are included. The GenPlex™ HID...... system protocol has been successfully tested by a number of forensic laboratories using both ordinary and forensic samples....
Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei
Genetic association is a challenging task for the identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases. To fully execute genetic studies of complex diseases, modern geneticists face the challenge of detecting interactions between loci. A genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and noncancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy offers the top five results to the random population process, in which they guide the GA toward a significant search course. The IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and noncancer cases. The study systematically evaluates the joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways, with the IGA successfully detecting significant ratio differences between breast cancer cases and noncancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. Analysis results support that the IGA provides higher ratio difference values than the GA between breast cancer cases and noncancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided.
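The odds-ratio and risk-ratio step mentioned above reduces to arithmetic on a 2×2 table of carriers and non-carriers of a given SNP barcode among cases and controls. The sketch below assumes that table layout and uses the standard Woolf (log) method for the 95% confidence interval; the counts in the example are invented for illustration and are not from the study.

```python
import math


def odds_ratio(a, b, c, d):
    """OR for a 2x2 table.

    a: cases carrying the barcode      b: controls carrying the barcode
    c: cases without the barcode       d: controls without the barcode
    """
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Risk among carriers divided by risk among non-carriers."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_ci95(a, b, c, d):
    """Woolf 95% confidence interval for the OR (all cells must be > 0)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    a, b, c, d = 30, 10, 70, 90
    low, high = odds_ratio_ci95(a, b, c, d)
    print(f"OR = {odds_ratio(a, b, c, d):.2f} (95% CI {low:.2f} to {high:.2f})")
    print(f"RR = {risk_ratio(a, b, c, d):.2f}")
```

An OR is usually called significantly higher than 1 when the lower confidence bound exceeds 1, as it does for these example counts.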
Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Cheung, Sau Wai; Bacino, Carlos; Patel, Ankita
In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60,000 SNP probes, referred to as Chromosomal Microarray Analysis - Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner.
Stangegaard, Michael; Tomas, Carmen; Hansen, Anders J.;
Single nucleotide polymorphism genotyping provides a supplement for conventional short tandem repeats-based kits currently used for human identification. GenPlex (Applied Biosystems (AB), Foster City, CA) is an SNP-genotyping kit based on a multiplex of 48 informative, autosomal SNPs from...... but one sample. The results demonstrate that the Biomek-3000 can perform a series of complex reactions leading to highly consistent forensic genetic SNP-typing results. Publication date: 2008/10...
Single nucleotide polymorphisms (SNPs) have been increasingly utilized to investigate somatic genetic abnormalities in premalignancy and cancer. LOH is a common alteration observed during cancer development, and SNP assays have been used to identify LOH at specific chromosomal regions. The design of such studies requires consideration of the resolution for detecting LOH throughout the genome and identification of the number and location of SNPs required to detect genetic alterations in specific genomic regions. Our study evaluated SNP distribution patterns and used probability models, Monte Carlo simulation, and real human subject genotype data to investigate the relationships between the number of SNPs, SNP HET rates, and the sensitivity (resolution) for detecting LOH. We report that variances of SNP heterozygosity rate in dbSNP are high for a large proportion of SNPs. Two statistical methods proposed for directly inferring SNP heterozygosity rates require much smaller sample sizes (intermediate sizes) and are feasible for practical use in SNP selection or verification. Using HapMap data, we showed that a region of LOH greater than 200 kb can be reliably detected, with losses smaller than 50 kb having a substantially lower detection probability when using all SNPs currently in the HapMap database. Higher densities of SNPs may exist in certain local chromosomal regions that provide some opportunities for reliably detecting LOH of segment sizes smaller than 50 kb. These results suggest that the interpretation of the results from genome-wide scans for LOH using commercial arrays needs to consider the relationships among inter-SNP distance, detection probability, and sample size for a specific study. New experimental designs for LOH studies would also benefit from considering the power of detection and sample sizes required to accomplish the proposed aims.
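The relationship between the number of SNPs in a region, their heterozygosity rates, and the chance of detecting LOH can be sketched with a simple probability model and a Monte Carlo check. The sketch assumes that LOH is observable only at SNPs that are heterozygous in the individual's normal tissue and that SNPs are independent (linkage disequilibrium is ignored); the function names and the `min_informative` threshold are hypothetical and not the study's actual model.

```python
import random


def detection_prob_exact(het_rates):
    """P(at least one SNP in the region is heterozygous, hence informative)."""
    p_none = 1.0
    for h in het_rates:
        p_none *= (1.0 - h)
    return 1.0 - p_none


def detection_prob_mc(het_rates, min_informative=1, trials=20000, seed=0):
    """Monte Carlo estimate of P(at least `min_informative` informative SNPs)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        informative = sum(1 for h in het_rates if rng.random() < h)
        if informative >= min_informative:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    region = [0.3] * 5  # five SNPs, each heterozygous in 30% of people
    print("exact      :", detection_prob_exact(region))
    print("monte carlo:", detection_prob_mc(region))
    print("need >= 2  :", detection_prob_mc(region, min_informative=2))
```

Short LOH segments contain fewer SNPs, so the product of (1 - h) terms stays large and the detection probability drops, consistent with the lower detection probability reported for losses smaller than 50 kb.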
Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong
Copy number variations (CNVs) are abundant in the human genome. They have been associated with complex traits in genome-wide association studies (GWAS) and are expected to continue playing an important role in identifying the etiology of disease phenotypes. As a result of current high throughput whole-genome single-nucleotide polymorphism (SNP) arrays, we currently have datasets that simultaneously have integer copy numbers in CNV regions as well as SNP genotypes. At the same time, haplotypes, which have been shown to offer advantages over genotypes in identifying disease traits, are available for SNP genotypes but largely unavailable for CNV/SNP data due to insufficient computational tools. We introduce a new framework for inferring haplotypes in CNV/SNP data using a sequential Monte Carlo sampling scheme 'Tree-Based Deterministic Sampling CNV' (TDSCNV). We compare our method with polyHap(v2.0), the only currently available software able to perform inference in CNV/SNP genotypes, on datasets of varying number of markers. We have found that both algorithms show similar accuracy but TDSCNV is an order of magnitude faster while scaling linearly with the number of markers and number of individuals and thus could be the method of choice for haplotype inference in such datasets. Our method is implemented in the TDSCNV package which is available for download at http://www.ee.columbia.edu/~anastas/tdscnv.
The present study evaluated the role of SNP microarray in 101 cases of clinically suspected FISH negative (noninformative/normal) 22q11.2 microdeletion syndrome. SNP microarray was carried out using 300 K HumanCytoSNP-12 BeadChip array or CytoScan 750 K array. SNP microarray identified 8 cases of 22q11.2 microdeletions and/or microduplications in addition to cases of chromosomal abnormalities and other pathogenic/likely pathogenic CNVs. Clinically suspected specific deletions (22q11.2) were detectable in approximately 8% of cases by SNP microarray, mostly from FISH noninformative cases. This study also identified several LOH/AOH loci with known and well-defined UPD (uniparental disomy) disorders. In conclusion, this study suggests more strict clinical criteria for FISH analysis. However, if clinical criteria are few or doubtful, in particular for a newborn/neonate in intensive care, SNP microarray should be the first screening test to be ordered. FISH is an ideal test for detecting mosaicism, screening family members, and prenatal diagnosis in proven families.
Kumar, Sunil; Ambrosini, Giovanna; Bucher, Philipp
SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/. PMID:27899579
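The idea of estimating a SNP's effect on TF binding from a PWM can be sketched as follows: score every motif-length window that overlaps the SNP for the reference and for the alternate allele, then compare the best scores. This is a simplified illustration, not the SNP2TFBS pipeline; it scans the forward strand only, uses a uniform background, and the count matrix, pseudocount and function names are all hypothetical.

```python
import math

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts (ACGT order) to log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column) + 4 * pseudocount
        pwm.append({
            base: math.log2(((n + pseudocount) / total) / background)
            for base, n in zip(BASES, column)
        })
    return pwm


def best_score(pwm, seq, pos):
    """Best PWM score over all windows that overlap position `pos`.

    Assumes `seq` is at least as long as the motif and 0 <= pos < len(seq).
    """
    w = len(pwm)
    starts = range(max(0, pos - w + 1), min(pos, len(seq) - w) + 1)
    return max(sum(pwm[i][seq[s + i]] for i in range(w)) for s in starts)


def snp_effect(pwm, seq, pos, alt):
    """Score change (alt minus ref) caused by substituting `alt` at `pos`."""
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    return best_score(pwm, alt_seq, pos) - best_score(pwm, seq, pos)


if __name__ == "__main__":
    # Hypothetical 4-bp motif with consensus ACGT.
    counts = [[8, 0, 1, 1], [0, 9, 0, 1], [0, 0, 10, 0], [1, 0, 0, 9]]
    pwm = pwm_from_counts(counts)
    ref_seq = "TTACGTAA"
    print("delta score:", snp_effect(pwm, ref_seq, pos=4, alt="A"))
```

A strongly negative difference suggests the variant weakens or abolishes a predicted site, and a strongly positive one suggests it creates or strengthens one; where to place the thresholds is a separate modelling decision not shown here.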
Barnes, Michael R
SNP data has grown exponentially over the last two years. SNP database evolution has matched this growth, as initial development of several independent SNP databases has given way to one central SNP database, dbSNP. Other SNP databases have instead evolved to complement this central database by providing gene specific focus and an increased level of curation and analysis on subsets of data, derived from the central data set. By contrast, human mutation data, which has been collected over many years, is still stored in disparate sources, although moves are afoot to move to a similar central database. These developments are timely, because human mutation and polymorphism data both hold complementary keys to a better understanding of how genes function and malfunction in disease. The impending availability of a complete human genome presents us with an ideal framework to integrate both these forms of data. As our understanding of the mechanisms of disease increases, the full genomic context of variation may become increasingly significant.
Park, Jae-Wan; Park, Cheol-Min
The development of new anode materials having high electrochemical performances and interesting reaction mechanisms is highly required to satisfy the need for long-lasting mobile electronic devices and electric vehicles. Here, we report a layer crystalline structured SnP3 and its unique electrochemical behaviors with Li. The SnP3 was simply synthesized through modification of Sn crystallography by combination with P and its potential as an anode material for LIBs was investigated. During Li insertion reaction, the SnP3 anode showed an interesting two-step electrochemical reaction mechanism comprised of a topotactic transition (0.7–2.0 V) and a conversion (0.0–2.0 V) reaction. When the SnP3-based composite electrode was tested within the topotactic reaction region (0.7–2.0 V) between SnP3 and LixSnP3 (x ≤ 4), it showed excellent electrochemical properties, such as a high volumetric capacity (1st discharge/charge capacity was 840/663 mA h cm−3) with a high initial coulombic efficiency, stable cycle behavior (636 mA h cm−3 over 100 cycles), and fast rate capability (550 mA h cm−3 at 3C). This layered SnP3 anode will be applicable to a new anode material for rechargeable LIBs. PMID:27775090
Finner, Helmut; Strassburger, Klaus; Heid, Iris M; Herder, Christian; Rathmann, Wolfgang; Giani, Guido; Dickhaus, Thorsten; Lichtner, Peter; Meitinger, Thomas; Wichmann, H-Erich; Illig, Thomas; Gieger, Christian
We study the link between two quality measures of SNP (single nucleotide polymorphism) data in genome-wide association (GWA) studies, that is, per SNP call rates (CR) and p-values for testing Hardy-Weinberg equilibrium (HWE). The aim is to improve these measures by applying methods based on realized randomized p-values, the false discovery rate and estimates for the proportion of false hypotheses. While exact non-randomized conditional p-values for testing HWE cannot be recommended for estimating the proportion of false hypotheses, their realized randomized counterparts should be used. P-values corresponding to the asymptotic unconditional chi-square test lead to reasonable estimates only if SNPs with low minor allele frequency are excluded. We provide an algorithm to compute the probability that SNPs violate HWE given the observed CR, which yields an improved measure of data quality. The proposed methods are applied to SNP data from the KORA (Cooperative Health Research in the Region of Augsburg, Southern Germany) 500 K project, a GWA study in a population-based sample genotyped by Affymetrix GeneChip 500 K arrays using the calling algorithm BRLMM 1.4.0. We show that all SNPs with CR = 100 per cent are nearly in perfect HWE which militates in favor of the population to meet the conditions required for HWE at least for these SNPs. Moreover, we show that the proportion of SNPs not being in HWE increases with decreasing CR. We conclude that using a single threshold for judging HWE p-values without taking the CR into account is problematic. Instead we recommend a stratified analysis with respect to CR.
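The two per-SNP quality measures discussed above can be computed from genotype counts. The sketch below shows the call rate and the asymptotic 1-df chi-square test for HWE, which the abstract notes gives reasonable estimates only when SNPs with low minor allele frequency are excluded; it does not implement the realized randomized p-values or the stratified analysis the authors recommend. Function names and example counts are hypothetical.

```python
import math


def call_rate(n_called, n_samples):
    """Fraction of samples with a genotype call for this SNP."""
    return n_called / n_samples


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Asymptotic chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). For 1 df the chi-square survival function
    equals erfc(sqrt(chi2 / 2)).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic SNP: the test is not informative
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Hypothetical SNP: 100 of 104 samples called, with a heterozygote deficit.
    print("CR   :", call_rate(100, 104))
    print("HWE  :", hwe_chi2_test(40, 20, 40))
    print("ideal:", hwe_chi2_test(25, 50, 25))
```

Reporting the HWE p-value alongside the call rate, rather than applying one p-value threshold to all SNPs, is in the spirit of the stratified analysis recommended above.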
Geraldes, Armando [University of British Columbia, Vancouver; Hannemann, Jan [University of Victoria, Canada; Grassa, Chris [University of British Columbia, Vancouver; Farzaneh, Nima [University of British Columbia, Vancouver; Porth, Ilga [University of British Columbia, Vancouver; McKown, Athena [University of British Columbia, Vancouver; Skyba, Oleksandr [University of British Columbia, Vancouver; Li, Eryang [University of British Columbia, Vancouver; Mike, Fujita [University of British Columbia, Vancouver; Friedmann, Michael [University of British Columbia, Vancouver; Wasteneys, Geoffrey [University of British Columbia, Vancouver; Guy, Robert [University of British Columbia, Vancouver; El-Kassaby, Yousry [University of British Columbia, Vancouver; Mansfield, Shawn [University of British Columbia, Vancouver; Cronk, Quentin [University of British Columbia, Vancouver; Ehlting, Juergen [University of Victoria, Canada; Douglas, Carl [University of British Columbia, Vancouver; DiFazio, Stephen P [West Virginia University, Morgantown; Slavov, Gancho [West Virginia University, Morgantown; Ranjan, Priya [ORNL; Muchero, Wellington [ORNL; Gunter, Lee E [ORNL; Wymore, Ann [ORNL; Tuskan, Gerald A [ORNL; Martin, Joel [U.S. Department of Energy, Joint Genome Institute; Schackwitz, Wendy [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Christa [U.S. Department of Energy, Joint Genome Institute; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute
Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. Despite the declining costs of genotyping by sequencing, for most studies, the use of large SNP genotyping arrays still offers the most cost-effective solution for large-scale targeted genotyping. Here we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species range. Due to the rapid decay of linkage disequilibrium in P. trichocarpa we adopted a candidate gene approach to the array design that resulted in the selection of 34,131 SNPs, the majority of which are located in, or within 2 kb, of 3,543 candidate genes. A subset of the SNPs (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%, indicating that high-quality data are generated with this array. We demonstrate that even among small numbers of samples (n=10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that due to ascertainment bias the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca (P. balsamifera and P. angustifolia). Finally, we provide evidence for the utility of the array for intraspecific studies of genetic differentiation and for species assignment and the detection of natural hybrids.
Four hundred years ago in Middelburg, in the Netherlands, the telescope was invented. The invention unleashed a revolution in the exploration of the universe. Galileo Galilei discovered mountains on the Moon, spots on the Sun, and moons around Jupiter. Christiaan Huygens saw details on Mars and rings around Saturn. William Herschel discovered a new planet and mapped binary stars and nebulae. Other astronomers determined the distances to stars, unraveled the structure of the Milky Way, and discovered the expansion of the universe. And, as telescopes became bigger and more powerful, astronomers delved deeper into the mysteries of the cosmos. In his Atlas of Astronomical Discoveries, astronomy journalist Govert Schilling tells the story of 400 years of telescopic astronomy. He looks at the 100 most important discoveries since the invention of the telescope. In his direct and accessible style, the author takes his readers on an exciting journey encompassing the highlights of four centuries of astronomy. Spectacul...
Davies, Shelley L; Moral, Maria Angels; Bozzo, Jordi
Chronicles in Drug Discovery features special interest reports on advances in drug discovery. This month we highlight agents that target and deplete immunosuppressive regulatory T cells, which are produced by tumor cells to hinder innate immunity against, or chemotherapies targeting, tumor-associated antigens. Antiviral treatments for respiratory syncytial virus, a severe and prevalent infection in children, are limited due to their side effect profiles and cost. New strategies currently under clinical development include monoclonal antibodies, siRNAs, vaccines and oral small molecule inhibitors. Recent therapeutic lines for Huntington's disease include gene therapies that target the mutated human huntingtin gene or deliver neuroprotective growth factors and cellular transplantation in apoptotic regions of the brain. Finally, we highlight the antiinflammatory and antinociceptive properties of new compounds targeting the somatostatin receptor subtype sst4, which warrant further study for their potential application as clinical analgesics.
DNA microarrays provide an efficient means of identifying single-nucleotide polymorphisms (SNPs) in DNA samples and characterizing their frequencies in individual and mixed samples. We have studied the parameters that determine the sensitivity of DNA probes to SNPs and found that the melting temperature (Tm) of the probe is the primary determinant of probe sensitivity. An isothermal-melting temperature DNA microarray design, in which the Tm of all probes is tightly distributed, can be implemented by varying the length of DNA probes within a single DNA microarray. I describe guidelines for designing isothermal-melting temperature DNA microarrays and protocols for labeling and hybridizing DNA samples to DNA microarrays for SNP discovery, genotyping, and quantitative determination of allele frequencies in mixed samples.
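Varying probe length to equalize melting temperature can be sketched as a simple loop that extends a probe around the SNP position until a target temperature is reached. As a stand-in for the author's actual thermodynamic model, which is not given here, the sketch uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough approximation intended for short oligonucleotides; real designs typically use nearest-neighbor calculations. Function names, the target temperature and the length cap are hypothetical.

```python
def wallace_tm(seq):
    """Approximate melting temperature in degrees C (Wallace rule)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def design_probe(template, center, target_tm, max_len=40):
    """Grow a probe around `center` until its Tm reaches `target_tm`.

    The probe is extended alternately to the left and right so the SNP
    stays near the middle, where a mismatch is usually most destabilizing.
    """
    left = right = center
    extend_left = True
    while wallace_tm(template[left:right + 1]) < target_tm and (right - left + 1) < max_len:
        can_left = left > 0
        can_right = right < len(template) - 1
        if not (can_left or can_right):
            break  # template exhausted
        if (extend_left and can_left) or not can_right:
            left -= 1
        else:
            right += 1
        extend_left = not extend_left
    return template[left:right + 1]


if __name__ == "__main__":
    at_rich = "AT" * 25
    gc_rich = "GC" * 25
    for name, tpl in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
        probe = design_probe(tpl, center=25, target_tm=60)
        print(name, len(probe), "nt, Tm ~", wallace_tm(probe))
```

AT-rich targets end up with longer probes and GC-rich targets with shorter ones, so the Tm values cluster near the target even though probe lengths differ.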
Quarks are widely recognized today as being among the elementary particles of which matter is composed. The key evidence for their existence came from a series of inelastic electron-nucleon scattering experiments conducted between 1967 and 1973 at the Stanford Linear Accelerator Center. Other theoretical and experimental advances of the 1970s confirmed this discovery, leading to the present standard model of elementary particle physics.
The three great myths, which form a sort of triumvirate of misunderstanding, are the Eureka! myth, the hypothesis myth, and the measurement myth. These myths are prevalent among scientists as well as among observers of science. The Eureka! myth asserts that discovery occurs as a flash of insight, and as such is not subject to investigation. This leads to the perception that discovery or deriving a hypothesis is a moment or event rather than a process. Events are singular and not subject to description. The hypothesis myth asserts that proper science is motivated by testing hypotheses, and that if something is not experimentally testable then it is not scientific. This myth leads to absurd posturing by some workers conducting empirical descriptive studies, who dress up their study with a "hypothesis" to obtain funding or get it published. Methods papers are often rejected because they do not address a specific scientific problem. The fact is that many of the great breakthroughs in science involve methods and not hypotheses or arise from largely descriptive studies. Those captured by this myth also try to block funding for those developing methods. The third myth is the measurement myth, which holds that determining what to measure is straightforward, so one doesn't need a lot of introspection to do science. As one ecologist put it to me, "Don't give me any of that philosophy junk, just let me out in the field. I know what to measure." These myths lead to difficulties for scientists who must face peer review to obtain funding and to get published. These myths also inhibit the study of science as a process. Finally, these myths inhibit creativity and suppress innovation. In this paper I first explore these myths in more detail and then propose a new model of discovery that opens the supposedly miraculous process of discovery to closer scrutiny.
the backbone of most of the materials discovery activities we pursue. We call the internal structure of a qualitative model an envisionment. A simple... envisionment (discussed in detail in the next chapter) from the polymer curing domain that is illustrated in figure 1-7. The initial conditions are...TSC "fires" IF-THEN rules which the initial conditions enable, building an envisionment. A stylized envisionment for the immunology knowledge base looks
Background: L-ascorbic acid (AsA; vitamin C) is essential for all living plants, where it functions as the main hydrosoluble antioxidant. It has diverse roles in the regulation of plant cell growth and expansion, photosynthesis, and hormone-regulated processes. AsA is also an essential component of the human diet, and tomato fruit is one of the main sources of this vitamin. To identify genes responsible for AsA content in tomato fruit, transcriptomic studies followed by clustering analysis were applied to two groups of fruits with contrasting AsA content. These fruits were identified after AsA profiling of an F8 Recombinant Inbred Line (RIL) population generated from a cross between the domesticated species Solanum lycopersicum and the wild relative Solanum pimpinellifolium. Results: We found large variability in AsA content within the RIL population, with individual RILs showing up to a 4-fold difference in AsA content. Transcriptomic analysis identified genes whose expression correlated either positively (PVC genes) or negatively (NVC genes) with the AsA content of the fruits. Cluster analysis using SOTA allowed the identification of subsets of co-regulated genes mainly involved in hormone signaling, such as ethylene, ABA, gibberellin and auxin, rather than any of the known AsA biosynthetic genes. Data mining of the corresponding PVC and NVC orthologs in Arabidopsis databases identified flagellin and other ROS-producing processes as cues resulting in differential regulation of a high percentage of the genes from both groups of co-regulated genes; more specifically, 26.6% of the orthologous PVC genes and 15.5% of the orthologous NVC genes were induced and repressed, respectively, under flagellin22 treatment in Arabidopsis thaliana. Conclusion: The results reported here indicate that the content of AsA in red tomato fruit from our selected RILs is not correlated with the expression of genes involved in its biosynthesis. On the contrary, the data
张园园; 肖利云; 伍会健
Cells respond to external stimuli by regulating gene expression, and the regulation of transcription initiation and of pre-mRNA splicing are key steps in this process. Increasing evidence shows that gene transcription and pre-mRNA splicing are closely linked in both space and time. Transcription can affect the choice of splicing pattern and, conversely, the splicing process also influences transcription. Recent studies have found that transcription coregulators play important roles in linking transcription and splicing. Coregulators not only modulate the amount of transcript produced but also regulate alternative splicing of pre-mRNA, generating different splice variants that are translated into proteins with different biological functions. In this review, we describe the relationship between gene transcription and splicing and the mechanisms of their interaction, which should help in understanding the process of gene expression regulation more deeply.
Batley, Jacqueline; Jewell, Erica; Edwards, David
Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate applications in modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.
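As a rough illustration of what mining sequence data for SSRs involves, the sketch below scans a DNA string for perfect tandem repeats with a regular expression. It is not the SSRPrimer algorithm, which the abstract does not describe; the function name `find_ssrs` and the thresholds (motifs of 2 to 6 bases, at least 4 copies) are assumptions chosen only for this example.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    # Lazy motif group so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, count))
    return hits


if __name__ == "__main__":
    for start, motif, count in find_ssrs("TTGACACACACACAGGT"):
        print(start, motif, count)
```

A production tool would also handle imperfect repeats and ambiguous bases, and would pass the flanking sequence to a primer design step, none of which is attempted here.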
Nab Raj Roshyara
Modern analysis of high-dimensional SNP data requires a number of biometrical and statistical methods such as pre-processing, analysis of population structure, association analysis and genotype imputation. Software used for these purposes often relies on specific and incompatible input and output data formats. Therefore, extensive data management including multiple format conversions is necessary during analyses. In order to support fast and efficient management and bio-statistical quality control of high-dimensional SNP data, we developed the publicly available software fcGENE using the C++ object-oriented programming language. This software simplifies and automates the use of different existing analysis packages, especially during the workflow of genotype imputations and corresponding analyses. fcGENE transforms SNP data and imputation results into different formats required for a large variety of analysis packages such as PLINK, SNPTEST, HAPLOVIEW, EIGENSOFT, GenABEL and tools used for genotype imputation such as MaCH, IMPUTE, BEAGLE and others. Data management tasks like merging, splitting, and extracting SNP and pedigree information can be performed. fcGENE also supports a number of bio-statistical quality control processes and quality-based filtering processes at SNP- and sample-wise level. The tool also generates templates of commands required to run specific software packages, especially those required for genotype imputation. We demonstrate the functionality of fcGENE by example workflows of SNP data analyses and provide a comprehensive manual of commands, options and applications. We have developed a user-friendly open-source software, fcGENE, which comprehensively supports SNP data management, quality control and analysis workflows. Download statistics and corresponding feedback indicate that the software is highly recognised and extensively applied by the scientific community.
Hirschler-Guttenberg, Yael; Feldman, Ruth; Ostfeld-Etzion, Sharon; Laor, Nathaniel; Golan, Ofer
Emotion regulation (ER) difficulties are a major concern in children with autism spectrum disorder (ASD). Maternal temperament and parenting style have significant effects on children's ER. However, these effects have not been studied in children with ASD. Forty preschoolers with ASD and their mothers and forty matched controls engaged in fear and anger ER paradigms, micro-coded for child self- and co-regulatory behaviors and parent's regulation-facilitation. Mothers' parenting style and temperament were self-reported. In the ASD group only, maternal authoritarian style predicted higher self-regulation and lower co-regulation of anger and maternal authoritative style predicted higher self-regulation of fear. Maternal temperament did not predict child's ER. Findings emphasize the importance of maternal flexible parenting style in facilitating ER among children with ASD.
Granneman, Sander; Baserga, Susan J
Ribosomes, the large RNPs that translate mRNA into protein in the cytoplasm of eukaryotic cells, are synthesized in a subcompartment of the nucleus, the nucleolus. There, transcription by Pol I yields a pre-rRNA which is modified, cleaved and assembled with ribosomal proteins to make functional ribosomes. Previously, rRNA transcription and pre-rRNA cleavage in eukaryotes were considered to be separable steps in gene expression. However, recent findings suggest that these two steps in gene expression can be concurrent and are co-regulated. Unexpectedly, optimal rDNA transcription requires the presence of a defined subset of components of the pre-rRNA processing machinery.
Millis, R. L.; Dunham, E. W.; Sebring, T. A.; Smith, B. W.; de Kock, M.; Wiecha, O.
The Discovery Channel Telescope (DCT) is a 4.2-m telescope to be built at a new site near Happy Jack, Arizona. The DCT features a large prime focus mosaic CCD camera with a 2-degree-diameter field of view especially designed for surveys of KBOs, Centaurs, NEAs and other moving or time-variable targets. The telescope can be switched quickly to a Ritchey-Chretien configuration for optical/IR spectroscopy or near-IR imaging. This flexibility allows timely follow-up physical studies of high priority objects discovered in survey mode. The ULE (ultra-low-expansion) meniscus primary and secondary mirror blanks for the telescope are currently in fabrication by Corning Glass. Goodrich Aerospace, Vertex RSI, M3 Engineering and Technology Corp., and e2v Technologies have recently completed in-depth conceptual design studies of the optics, mount, enclosure, and mosaic focal plane, respectively. The results of these studies were subjected to a formal design review in July, 2004. Site testing at the 7760-ft altitude Happy Jack site began in 2001. Differential image motion observations from 117 nights since January 1, 2003 gave median seeing of 0.84 arcsec FWHM, and the average of the first quartile was 0.62 arcsec. The National Environmental Policy Act (NEPA) process for securing long-term access to this site on the Coconino National Forest is nearing completion and ground breaking is expected in the spring of 2005. The Discovery Channel Telescope is a project of the Lowell Observatory with major financial support from Discovery Communications, Inc. (DCI). DCI plans ongoing television programming featuring the construction of the telescope and the research ultimately undertaken with the DCT. An additional partner can be accommodated in the project. Interested parties should contact the lead author.
Gao, Yang; Hauke, Caitlyn A; Marles, Jarrad M; Taylor, Ronald K
Vibrio cholerae is the etiological agent of the acute intestinal disorder cholera. The toxin-coregulated pilus (TCP), a type IVb pilus, is an essential virulence factor of V. cholerae. Recent work has shown that TcpB is a large minor pilin encoded within the tcp operon. TcpB contributes to efficient pilus formation and is essential for all TCP functions. Here, we have initiated a detailed targeted mutagenesis approach to further characterize this salient TCP component. We have identified (thus far) 20 residues of TcpB which affect either the steady-state level of TcpB or alter one or more TCP functions. This study provides a solid framework for further understanding of the complex role of TcpB and will be of use upon determination of the crystal structure of TcpB or related minor pilin orthologs of type IVb pilus systems. Type IV pili, such as the toxin-coregulated pilus (TCP) in V. cholerae, are bacterial appendages that often act as essential virulence factors. Minor pilins, like TcpB, of these pili systems often play integral roles in pilus assembly and function. In this study, we have generated mutations in tcpB to determine residues of importance for TCP stability and function. Combined with a predicted tertiary structure, characterization of these mutants allows us to better understand critical residues in TcpB and the role they may play in the mechanisms underlying minor pilin functions. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Rollenhagen, Julianne E; Kalsy, Anuj; Cerda, Francisca; John, Manohar; Harris, Jason B; Larocque, Regina C; Qadri, Firdausi; Calderwood, Stephen B; Taylor, Ronald K; Ryan, Edward T
Toxin-coregulated pilin A (TcpA) is the main structural subunit of a type IV bundle-forming pilus of Vibrio cholerae, the cause of cholera. Toxin-coregulated pilus is involved in formation of microcolonies of V. cholerae at the intestinal surface, and strains of V. cholerae deficient in TcpA are attenuated and unable to colonize intestinal surfaces. Anti-TcpA immunity is common in humans recovering from cholera in Bangladesh, and immunization against TcpA is protective in murine V. cholerae models. To evaluate whether transcutaneously applied TcpA is immunogenic, we transcutaneously immunized mice with 100 µg of TcpA or TcpA with an immunoadjuvant (cholera toxin [CT], 50 µg) on days 0, 19, and 40. Mice immunized with TcpA alone did not develop anti-TcpA responses. Mice that received transcutaneously applied TcpA and CT developed prominent anti-TcpA immunoglobulin G (IgG) serum responses but minimal anti-TcpA IgA. Transcutaneous immunization with CT induced prominent IgG and IgA anti-CT serum responses. In an infant mouse model, offspring born to dams transcutaneously immunized either with TcpA and CT or with CT alone were challenged with 10^6 CFU (one 50% lethal dose) wild-type V. cholerae O1 El Tor strain N16961. At 48 h, mice born to females transcutaneously immunized with CT alone had 36% ± 10% (mean ± standard error of the mean) survival, while mice born to females transcutaneously immunized with TcpA and CT had 69% ± 6% survival (P < 0.001). Our results suggest that transcutaneous immunization with TcpA and an immunoadjuvant induces protective anti-TcpA immune responses. Anti-TcpA responses may contribute to an optimal cholera vaccine.
Background: Next-generation sequencing technologies are widely used for detection of millions of Single Nucleotide Polymorphisms (SNPs) and also provide a means of assessing their variation. This information is useful for composing subsets of highly informative SNPs for region-specific or genome-wide analysis and to identify mutations regulating phenotypic differences within or between populations. In this study, we investigated the sensitivity of SNP detection and introduced the flanking SNPs value (FSV) as a novel measure for predicting SNP-variability using ~5X genome resequencing with ABI SOLID and DNA pools from two chicken lines divergently selected for juvenile bodyweight. Results: Genotyping with a 60 K SNP chip revealed polymorphisms within or between two divergently selected chicken lines for 31 363 SNPs, 48% of which were also detected using resequencing of DNA pools. SNP detection using resequencing was more powerful for positions with larger differences in allele frequency between the lines. About 50% of the SNPs with non-reference allele frequencies in the range 0.5-0.6 and 67% of those with frequencies > 0.9 could be detected. On average, ~3.7 SNPs/kb were detected by resequencing, with about 5% lower density on microchromosomes than on macrochromosomes. There was a positive correlation between the observed between-line SNP variation from the 60 K chip analysis and our proposed FSV score computed from the genome resequencing data. The strongest correlations on macrochromosomes and microchromosomes were observed when the FSV was calculated with total flanking regions of 62 kb (correlation 0.55) and 38 kb (correlation 0.45), respectively. Conclusions: Genome resequencing with limited coverage (~5X), using pooled DNA samples and three non-reference reads as a threshold for SNP detection, identified 50 - 67% of the 60 K SNPs with a non-reference allele frequency larger than 0.5. The SNP density was around 5% lower on the
Background: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, and is therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In
Chen, M.; Ertl, T.; Jirotka, M.; Trefethen, A.; Schmidt, A.; Coecke, B.; Bañares-Alcántara, R.
Causality is the fabric of our dynamic world. We all make frequent attempts to reason about the causal relationships behind everyday events (e.g., what was the cause of my headache, or what has upset Alice?). We attempt to manage causality all the time through planning and scheduling. The greatest scientific discoveries are usually about causality (e.g., Newton found the cause for an apple to fall, and Darwin discovered natural selection). Meanwhile, we continue to seek a comprehensive understanding about the causes of numerous complex phenomena, such as social divisions, economic crises, global warming, home-grown terrorism, etc. Humans analyse and reason about causality based on observation, experimentation and acquired a priori knowledge. Today's technologies enable us to make observations and carry out experiments on an unprecedented scale, which has created data mountains everywhere. Whereas there are exciting opportunities to discover new causal relationships, there are also unparalleled challenges in benefiting from such data mountains. In this article, we present a case for developing a new piece of ICT, called Causality Discovery Technology. We reason about the necessity, feasibility and potential impact of such a technology.
Post, R. S.
(Abstract only) We are developing a system of robotic telescopes for automatic recognition of Supernovas as well as other transient events in collaboration with the Puckett Supernova Search Team. At the SAS2014 meeting, the discovery program, SNARE, was first described. Since then, it has been continuously improved to handle searches under a wide variety of atmospheric conditions. Currently, two telescopes are used to build a reference library while searching for PSN with a partial library. Since data is taken every night without clouds, we must deal with varying atmospheric and high background illumination from the moon. Software is configured to identify a PSN, reshoot for verification with options to change the run plan to acquire photometric or spectrographic data. The telescopes are 24-inch CDK24, with Alta U230 cameras, one in CA and one in NM. Images and run plans are sent between sites so the CA telescope can search while photometry is done in NM. Our goal is to find bright PSNs with magnitude 17.5 or less which is the limit of our planned spectroscopy. We present results from our first automated PSN discoveries and plans for PSN data acquisition.
Wei Jiang; Lijie Zhang; Bo Na; Lihong Wang; Jiankai Xu; Xia Li; Yadong Wang; Shaoqi Rao
Variations of gene expression and DNA sequence are genetically associated. The goal of this study was to build genetic networks to map from SNPs to gene expressions and to characterize the two different kinds of networks. We employed mutual information to evaluate the strength of SNP-SNP and gene-gene associations based on SNPs identity by descent (IBD) data and differences of gene expressions. We applied the approach to one dataset of Genetics of Gene Expression in Humans, and discovered that both the SNP relevance network and the gene relevance network approximated the scale-free topology. We also found that 12.09% of SNP-SNP interactions matched 24.49% of gene-gene interactions, which was consistent with that of the previous studies. Finally, we identified 49 hub SNPs and 115 hub genes in their relevance networks, in which 27 hub SNPs were associated with 25 hub genes. © 2009 National Natural Science Foundation of China and Chinese Academy of Sciences. Published by Elsevier Limited and Science in China Press. All rights reserved.
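The association measure named in this abstract, mutual information, can be sketched for two discrete variables as follows. This is a plug-in estimate from empirical frequencies and assumes the inputs have already been discretized; how the authors binned IBD values or expression differences is not stated in the abstract, and the function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, of two discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # identical variables
    print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # independent variables
```

In a relevance network, an edge would then be drawn between two SNPs (or two genes) whose mutual information exceeds some threshold; the choice of threshold is outside the scope of this sketch.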
Although highly polymorphic SSRs are currently the marker of choice worldwide in maize breeding, single nucleotide polymorphisms (SNPs), a newer marker system, have recently been used more extensively. The objective of this study was to investigate the utility of SSR and SNP markers for mapping of a maize population adapted to conditions of Southeast Europe. A total of 294 F2:3 lines derived from a biparental mapping population were genotyped using 121 polymorphic SNP and SSR markers. The SNP markers were analyzed using the SNPlex technology. 56 of the 142 tested SNPs (39%) were polymorphic between the parents of the mapping population and were successfully mapped. The remaining markers were either not functional (5 = 3.5%) or not polymorphic (81 = 57%). No mapped SNP marker showed more than 10% missing data. On average, the level of missing data for SNPs (1.5%) was considerably lower than that for SSRs (3.4%). For the mapping procedure, the SNP data were combined with SSR data. A comparison of the mapping data with the publicly available mapping data on SSR markers and the proprietary mapping data indicates that the map is of good quality and that the map position of almost all markers agrees with their published map position. Thus, information obtained from both marker systems is utilizable for further QTL analysis.
Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh
Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine.
McKay Stephanie D
Background: Genetic markers can be used to identify and verify the origin of individuals. Motivation for the inference of ancestry ranges from conservation genetics to forensic analysis. High density assays featuring Single Nucleotide Polymorphism (SNP) markers can be exploited to create a reduced panel containing the most informative markers for these purposes. The objectives of this study were to evaluate methods of marker selection and determine the minimum number of markers from the BovineSNP50 BeadChip required to verify the origin of individuals in European cattle breeds. Delta, Wright's FST, Weir & Cockerham's FST and PCA methods for population differentiation were compared. The level of informativeness of each SNP was estimated from the breed specific allele frequencies. Individual assignment analysis was performed using the ranked informative markers. Stringency levels were applied by log-likelihood ratio to assess the confidence of the assignment test. Results: A 95% assignment success rate for the 384 individually genotyped animals was achieved with ST (60 to 140 SNPs depending on the chosen degree of confidence. Certain breeds required fewer markers ( 95% assignment success. The power of assignment success, and therefore the number of SNP markers required, is dependent on the levels of genetic heterogeneity and pool of samples considered. Conclusions: While all SNP selection methods produced marker panels capable of breed identification, the power of assignment varied markedly among analysis methods. Thus, with effective exploration of available high density genetic markers, a diagnostic panel of highly informative markers can be produced.
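Two of the informativeness measures compared in this abstract can be sketched for a single biallelic SNP and two breeds, given the breed-specific frequency of one allele. Delta is taken here as the absolute allele frequency difference, and Wright's FST as (HT - HS) / HT with equal weight on both breeds; these are textbook forms assumed for illustration rather than the study's exact estimators, and the SNP names in the usage example are made up.

```python
def delta(p1, p2):
    """Absolute difference in allele frequency between two breeds."""
    return abs(p1 - p2)


def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP, two equally weighted breeds."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic overall, so it is uninformative
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def rank_snps(freqs, score=delta):
    """Order SNP names from most to least informative under `score`."""
    return sorted(freqs, key=lambda name: score(*freqs[name]), reverse=True)


if __name__ == "__main__":
    # Hypothetical allele frequencies (breed 1, breed 2) for three SNPs.
    example = {"snpA": (0.9, 0.2), "snpB": (0.5, 0.5), "snpC": (1.0, 0.0)}
    print(rank_snps(example))
    print(rank_snps(example, score=wright_fst))
```

A reduced panel would then be formed by taking the top-ranked markers and checking how assignment success changes as the panel shrinks.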
Background: Identification of causal SNPs in most genome wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account. Hence, modern computationally expensive regression methods that consider all markers simultaneously, and thus incorporate dependencies among SNPs, are increasingly employed for SNP selection. Results: We develop a novel multivariate algorithm for large scale SNP selection using CAR score regression, a promising new approach for prioritizing biomarkers. Specifically, we propose a computationally efficient procedure for shrinkage estimation of CAR scores from high-dimensional data. Subsequently, we conduct a comprehensive comparison study including five advanced regression approaches (boosting, lasso, NEG, MCP, and CAR score) and a univariate approach (marginal correlation) to determine the effectiveness in finding true causal SNPs. Conclusions: Simultaneous SNP selection is a challenging task. We demonstrate that our CAR score-based algorithm consistently outperforms all competing approaches, both uni- and multivariate, in terms of correctly recovered causal SNPs and SNP ranking. An R package implementing the approach as well as R code to reproduce the complete study presented here is available from http://strimmerlab.org/software/care/.
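To make the contrast between marginal correlation and CAR scores concrete, the sketch below works through the two-predictor case in closed form. It assumes the usual definition of CAR scores as the marginal correlations premultiplied by the inverse square root of the predictor correlation matrix; the abstract itself does not give the formula, and the shrinkage estimation the authors propose for high-dimensional data is not attempted. The function name is hypothetical.

```python
from math import sqrt


def car_scores_two_predictors(rho1, rho2, r):
    """CAR scores for two predictors (illustrative closed form).

    rho1, rho2: marginal correlations of each predictor with the response.
    r: correlation between the two predictors, with -1 < r < 1.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # For P = [[1, r], [r, 1]] the inverse square root is [[a, b], [b, a]],
    # obtained from the eigenvalues 1 + r and 1 - r.
    a = 0.5 * (1.0 / sqrt(1.0 + r) + 1.0 / sqrt(1.0 - r))
    b = 0.5 * (1.0 / sqrt(1.0 + r) - 1.0 / sqrt(1.0 - r))
    return (a * rho1 + b * rho2, b * rho1 + a * rho2)


if __name__ == "__main__":
    # Uncorrelated predictors: CAR scores equal the marginal correlations.
    print(car_scores_two_predictors(0.6, 0.2, 0.0))
    # Correlated predictors: each score is reduced to avoid double counting.
    print(car_scores_two_predictors(0.6, 0.6, 0.5))
```

Under this definition the squared scores sum to the proportion of variance explained by the two predictors jointly, which is why ranking by squared CAR score accounts for correlation among SNPs in a way that ranking by marginal correlation does not.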
Complex diseases are often highly heritable. However, for many complex traits only a small proportion of the heritability can be explained by observed genetic variants in traditional genome-wide association (GWA) studies. Moreover, for some of those traits few significant SNPs have been identified. Single SNP association methods test for association at a single SNP, ignoring the effect of other SNPs. We show using a simple multi-locus odds model of complex disease that moderate to large effect sizes of causal variants may be estimated as relatively small effect sizes in single SNP association testing. This underestimation effect is most severe for diseases influenced by numerous risk variants. We relate the underestimation effect to the concept of non-collapsibility found in the statistics literature. As described, continuous phenotypes generated with linear genetic models are not affected by this underestimation effect. Since many GWA studies apply single SNP analysis to dichotomous phenotypes, previously reported results potentially underestimate true effect sizes, thereby impeding identification of true effect SNPs. Therefore, when a multi-locus model of disease risk is assumed, a multi SNP analysis may be more appropriate.
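The underestimation described here can be reproduced with a small numerical sketch of non-collapsibility. It assumes a logistic model with two independent binary risk factors (carrier or non-carrier), which is a simplification chosen for illustration and not necessarily the multi-locus odds model used in the paper; all parameter values are made up.

```python
from math import exp, log


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-z))


def marginal_odds_ratio(b0, b1, b2, f2=0.5):
    """Odds ratio for locus 1 when locus 2 is ignored.

    Assumed model: logit P(disease) = b0 + b1*G1 + b2*G2, with G1 and G2
    independent and P(G2 = 1) = f2.
    """
    def risk(g1):
        # Average the disease risk over the unobserved second locus.
        return (1.0 - f2) * expit(b0 + b1 * g1) + f2 * expit(b0 + b1 * g1 + b2)

    r1, r0 = risk(1), risk(0)
    return (r1 / (1.0 - r1)) / (r0 / (1.0 - r0))


if __name__ == "__main__":
    conditional_or = 2.0  # true effect of locus 1, holding locus 2 fixed
    for other_or in (1.0, 5.0, 20.0):
        marginal = marginal_odds_ratio(-1.0, log(conditional_or), log(other_or))
        print(other_or, round(marginal, 3))
```

The marginal odds ratio equals the conditional one only when the second locus has no effect, and it moves toward 1 as the ignored locus becomes stronger, even though the two loci are independent and there is no confounding.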
Background: The aim of this study was to investigate if there is an association between different SNP combinations in the guanosine triphosphate cyclohydrolase (GCH1) gene and a number of pain behavior related outcomes during labor. A population-based sample of pregnant women (n = 814) was recruited at gestational week 18. A plasma sample was collected from each subject. Genotyping was performed and three single nucleotide polymorphisms (SNPs) previously defined as a pain-protective SNP combination of GCH1 were used. Results: Homozygous carriers of the pain-protective SNP combination of GCH1 arrived at the delivery ward with a more advanced stage of cervical dilation compared to heterozygous carriers and non-carriers. However, homozygous carriers more often used second line labor analgesia compared to the others. Conclusion: The pain-protective SNP combination of GCH1 may be of importance in the limited number of homozygous carriers during the initial dilation of the cervix, but upon arrival at the delivery unit these women are more inclined to use second line labor analgesia.
Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long
SNPs (single nucleotide polymorphisms) are a popular tool for the study of genetic diversity, evolution, and other areas. It is therefore necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Because SNP detection requires special software and a series of steps including alignment, detection, analysis, and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative-segment SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. The platform contains the reference genomic sequences and coding sequences of 60 plant species and gives users new opportunities to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative-segment SNPs in their own sequences, gives users information about and analysis of the SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.
Barnett, Ian; Mukherjee, Rajarshi; Lin, Xihong
It is of substantial interest to study the effects of genes, genetic pathways, and networks on the risk of complex diseases. These genetic constructs each contain multiple SNPs, which are often correlated and function jointly, and might be large in number. However, only a sparse subset of SNPs in a genetic construct is generally associated with the disease of interest. In this article, we propose the generalized higher criticism (GHC) to test for the association between an SNP set and a disease outcome. The higher criticism is a test traditionally used in high-dimensional signal detection settings when marginal test statistics are independent and the number of parameters is very large. However, these assumptions do not always hold in genetic association studies, due to linkage disequilibrium among SNPs and the finite number of SNPs in an SNP set in each genetic construct. The proposed GHC overcomes the limitations of the higher criticism by allowing for arbitrary correlation structures among the SNPs in an SNP-set, while performing accurate analytic p-value calculations for any finite number of SNPs in the SNP-set. We obtain the detection boundary of the GHC test. We compared empirically using simulations the power of the GHC method with existing SNP-set tests over a range of genetic regions with varied correlation structures and signal sparsity. We apply the proposed methods to analyze the CGEM breast cancer genome-wide association study. Supplementary materials for this article are available online.
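For orientation, the classical higher criticism statistic that the GHC generalizes can be sketched from a list of p-values. The form below is one common version of the statistic and assumes independent tests, which is exactly the assumption the abstract says fails under linkage disequilibrium; it is therefore not the proposed GHC, and the function name and the `alpha0` default are assumptions for this example.

```python
from math import sqrt


def higher_criticism(pvalues, alpha0=0.5):
    """Classical higher criticism statistic for independent p-values.

    Takes the maximum, over the smallest alpha0 fraction of sorted
    p-values, of sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
    """
    p = sorted(pvalues)
    n = len(p)
    if n == 0:
        raise ValueError("at least one p-value is required")
    k = max(1, int(alpha0 * n))
    best = float("-inf")
    for i in range(1, k + 1):
        p_i = p[i - 1]
        if p_i <= 0.0 or p_i >= 1.0:
            continue  # the standardized term is undefined at 0 and 1
        best = max(best, sqrt(n) * (i / n - p_i) / sqrt(p_i * (1.0 - p_i)))
    return best


if __name__ == "__main__":
    null_like = [(i - 0.5) / 100 for i in range(1, 101)]  # evenly spread
    sparse_signal = [1e-6] * 3 + null_like[3:]            # three strong SNPs
    print(higher_criticism(null_like))
    print(higher_criticism(sparse_signal))
```

The statistic stays small when p-values are spread evenly and becomes large when a few of them are far smaller than expected, which is the sparse-signal setting the abstract describes for SNP sets.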
Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu
High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery
Federal Laboratory Consortium — Description: CORAL Name: Sputter 2 Similar to the existing 4-Gun Denton Discovery 22 Sputter system, with the following enhancements: Specifications / Capabilities:...
Song, Chenchen; Knöpfel, Thomas
Optogenetics - the use of light and genetics to manipulate and monitor the activities of defined cell populations - has already had a transformative impact on basic neuroscience research. Now, the conceptual and methodological advances associated with optogenetic approaches are providing fresh momentum to neuroscience drug discovery, particularly in areas that are stalled on the concept of 'fixing the brain chemistry'. Optogenetics is beginning to translate and transit into drug discovery in several key domains, including target discovery, high-throughput screening and novel therapeutic approaches to disease states. Here, we discuss the exciting potential of optogenetic technologies to transform neuroscience drug discovery.
Mouawad, Amer E; Mansour, Nashat
Despite advances in genotyping technologies that have greatly reduced genotyping cost, tag SNP selection remains an important problem for computational biologists and geneticists. Selecting the smallest subset of tag SNPs that can predict the other SNPs would considerably reduce the complexity of genome-wide or block-based SNP-disease association studies. These studies would lead to better diagnosis and treatment of diseases. In this work, we propose three variations of a genetic algorithm based on two-marker linkage disequilibrium, multi-marker linkage disequilibrium, and a third measure that we denote by prediction power. The performance of the three algorithms is compared with that of a recognized tag SNP selection algorithm using three different real data sets from the HapMap project. The results indicate that the multi-marker linkage disequilibrium based genetic algorithm yields better prediction accuracy.
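The two-marker linkage disequilibrium measure mentioned in the abstract above is commonly quantified by the squared correlation r² between two biallelic SNPs. The sketch below computes r² from phased haplotypes coded as 0/1; it illustrates the measure only, not the authors' genetic algorithm, and the function name and input encoding are assumptions.

```python
def r_squared(hap_a, hap_b):
    """Pairwise linkage disequilibrium r^2 between two biallelic SNPs.

    hap_a, hap_b: equal-length sequences of 0/1 alleles observed on the
    same set of phased haplotypes (assumed input encoding).
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("need two non-empty haplotype vectors of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))  # perfectly correlated SNPs
    print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))  # statistically independent SNPs
```

A tag SNP with r² near 1 to another SNP predicts it almost perfectly, which is the property tag selection methods exploit.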
Stjernqvist, Susann; Rydén, Tobias; Greenman, Chris D
SNP allelic copy number data provides intensity measurements for the two different alleles separately. We present a method that estimates the number of copies of each allele at each SNP position, using a continuous-index hidden Markov model. The method is especially suited for cancer data, since it includes the fraction of normal tissue contamination, often present when studying data from cancer tumors, into the model. The continuous-index structure takes into account the distances between the SNPs, and is thereby appropriate also when SNPs are unequally spaced. In a simulation study we show that the method performs favorably compared to previous methods even with as much as 70% normal contamination. We also provide results from applications to clinical data produced using the Affymetrix genome-wide SNP 6.0 platform.
Motivation: A review of the available single nucleotide polymorphism (SNP) calling procedures for Illumina high-throughput sequencing (HTS) platform data reveals that most rely mainly on base-calling and mapping qualities as sources of error when calling SNPs. Thus, errors not involved in base-calling or alignment, such as those in genomic sample preparation, are not accounted for. Results: A novel method of consensus and SNP calling, Genotype Model Selection (GeMS), is given which accounts for the errors that occur during the preparation of the genomic sample. Simulations and real data analyses indicate that GeMS has the best performance balance of sensitivity and positive predictive value among the tested SNP callers. © The Author 2012. Published by Oxford University Press. All rights reserved.
Veerkamp Roel F
Background: The simulated dataset of the 13th QTL-MAS workshop was analysed to (i) detect QTL and (ii) predict breeding values for animals without phenotypic information. Several parameterisations considering all SNP simultaneously were applied using Gibbs sampling. Results: Fourteen QTL were detected at the different time points. Correlations between estimated breeding values were high between models, except when the model was used that assumed that all SNP effects came from one distribution. The model that used the selected 14 SNP found associated with QTL gave close to unity correlations with the full parameterisations. Conclusions: Nine out of 18 QTL were detected; however, the six QTL for inflection point were missed. Models for genomic selection were indicated to be fairly robust, e.g. with respect to accuracy of estimated breeding values. Still, it is worthwhile to investigate the number of QTL underlying the quantitative traits before choosing the model used for genomic selection.
Zhang, Jia; Li, Kai; Pardinas, Jose R; Liao, Duan F; Li, Hong J; Zhang, Xu
Single nucleotide polymorphisms (SNPs) are useful physical markers for genetic studies as well as the cause of some genetic diseases. To develop more reliable SNP assays, we examined the underlying molecular mechanisms by which deoxyribonucleic acid (DNA) polymerases with 3' exonuclease activity maintain the high fidelity of DNA replication. In addition to mismatch removal by proofreading, we have discovered a premature termination of polymerization mediated by a novel OFF-switch mechanism. Two SNP assays were developed, one based on proofreading using 3' end-labeled primer extension and the other based on the newly identified OFF-switch, respectively. These two new assays are well suited for conventional techniques, such as electrophoresis and microplates detection systems as well as the sophisticated microchips. Application of these reliable SNP assays will greatly facilitate genetic and biomedical studies in the postgenome era.
Many genetic association studies have used single nucleotide polymorphism (SNP) data to identify genetic variants for complex diseases. Although SNP-based associations are most common in genome-wide association studies (GWAS), gene-based association analysis has received increasing attention in understanding genetic etiologies for complex diseases. While both methods have been used to analyze the same data, few genome-wide association studies compare the results or observe the connection between them. We performed a comprehensive analysis of the data from the Study of Addiction: Genetics and Environment (SAGE) and compared the results from the SNP-based and gene-based analyses. Our results suggest that the gene-based method complements the individual SNP-based analysis, and conceptually they are closely related. In terms of gene findings, our results validate many genes that were either reported from the analysis of the same dataset or based on animal studies for substance dependence.
Wang, Yan; Zhang, Jingjing; Liu, Cuiyun; Hu, Ping; Liu, An; Xu, Zhengfeng
Objective: To explore the clinical value of single nucleotide polymorphism array (SNP array) in the prenatal diagnosis of fetuses with congenital heart disease (CHD). Methods: SNP array analysis was performed in 102 fetuses sonographically diagnosed with CHD who had a normal karyotype and negative results for 22q11.2 microdeletion syndrome. The 102 fetuses were divided into two groups: isolated CHD (66 cases) and CHD with extracardiac structural abnormalities (36 cases). Copy number variations (CNVs) were classified into three categories: benign CNVs, variants of uncertain significance (VOUS), and pathogenic CNVs. Results: Benign CNVs were detected in 22/102 (21.6%) cases, VOUS in 3/102 (2.9%) cases, and pathogenic CNVs in 10/102 (9.8%) cases. The detection rates of pathogenic CNVs for isolated CHD and CHD with extracardiac structural abnormalities were 12.1% (8/66) and 5.6% (2/36), respectively. Conclusion: SNP array offers high resolution and accuracy. Applied to CHD fetuses with normal karyotype results, it can detect additional pathogenic CNVs and has considerable clinical value in the prenatal diagnosis of fetuses with congenital heart disease.
Braun, Rosemary; Buetow, Kenneth
Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences in cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.
The paper presents the development of the theory of asynchronous motors since Tesla’s discovery until the present day. The theory of steady state, as we know it today, was completed already during the first dozen of years. That was followed by a period of stagnation during a number of decades, when the theory of asynchronous motors was developed only in the framework of the general theory of electric machines, which was stimulated by the problems of the development of synchronous generators and big electric networks. It is only in our time that this simple motor, which was used for a long time just to perform crude tasks, became again the inspiration for the researchers and engineers who enabled it, with the help of power electronics and semi-conductor technology, to be used in the finest drives.
The Rotating Service Structure has been retracted at Pad 39A. Discovery, the Space Shuttle for the STS-82 mission, is ready for the launch of the second Hubble Space Telescope service mission. The payload consists of the Near Infrared Camera and Multi-Object Spectrometer (NICMOS), which will be installed, the Fine Guidance Sensor #1 (FGS-1), and the Space Telescope Imaging Spectrograph (STIS), which will also be installed. STS-82 will launch with a crew of seven at 3:54 a.m. February 11, 1997. The launch window is 65 minutes. The Mission Commander for STS-82 is Ken Bowersox. The purpose of the mission is to upgrade the scientific capabilities, service or replace aging components on the Telescope, and provide a reboost to the optimum altitude.
Banyuaji, Andhini; Rahayu, Endang S.; Utami, Tyas
Because of their health benefits, probiotics have been incorporated into a range of dairy products, including yogurt, cheese, and ice cream. However, the viability of probiotics can decrease during ice cream processing. The reduction of viable probiotics after consumption may also be due to stomach acid and the presence of bile salt. This research studied the encapsulation of Lactobacillus acidophilus SNP 2 probiotic bacteria using extrusion and emulsion techniques, and their effect on...
Matsumoto, Mitsuyuki; Walton, Noah M; Yamada, Hiroshi; Kondo, Yuji; Marek, Gerard J; Tajinda, Katsunori
Failures of investigational new drugs (INDs) for schizophrenia have left huge unmet medical needs for patients. Given the recent lackluster results, it is imperative that new drug discovery approaches (and resultant drug candidates) target pathophysiological alterations that are shared in specific, stratified patient populations that are selected based on pre-identified biological signatures. One path to implementing this paradigm is achievable by leveraging recent advances in genetic information and technologies. Genome-wide exome sequencing and meta-analysis of single nucleotide polymorphism (SNP)-based association studies have already revealed rare deleterious variants and SNPs in patient populations. Areas covered: Herein, the authors review the impact that genetics have on the future of schizophrenia drug discovery. The high polygenicity of schizophrenia strongly indicates that this disease is biologically heterogeneous so the identification of unique subgroups (by patient stratification) is becoming increasingly necessary for future investigational new drugs. Expert opinion: The authors propose a pathophysiology-based stratification of genetically-defined subgroups that share deficits in particular biological pathways. Existing tools, including lower-cost genomic sequencing and advanced gene-editing technology render this strategy ever more feasible. Genetically complex psychiatric disorders such as schizophrenia may also benefit from synergistic research with simpler monogenic disorders that share perturbations in similar biological pathways.
Bogari, Neda M; Rayes, Husni H; Mostafa, Fakri; Abdel-Latif, Azza M; Ramadan, Abeer; Al-Allaf, Faisal A; Taher, Mohiuddin M; Fawzy, Ahmed
Neonatal diabetes mellitus (NDM) is a rare condition with a prevalence of 1 in 300,000 live births. We have found 3 known SNPs in the 5' UTR and a novel SNP in the 3' UTR of the INS gene. These SNPs were present in a 9-month-old girl from Saudi Arabia and were also present in the father and mother. The novel SNP we found is not present in the 1000 Genomes Project or other databases. Further, the newly identified 3' UTR mutation in the INS gene may abolish the polyadenylation signal and result in severe RNA instability.
Huang, Hailiang; Tata, Sandeep; Prill, Robert J
Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation, and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. http://github.com/ibm-bioinformatics/bluesnp
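One of the computationally intensive analyses named in the abstract above, estimating empirical p-values via data permutation, can be sketched on a single machine as follows. This is a plain Python illustration, not the BlueSNP R API or its Hadoop execution model; the function name, the covariance-style test statistic, and the add-one correction are assumptions chosen for brevity.

```python
import random


def permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Empirical p-value for genotype-phenotype association at one SNP.

    The statistic is the absolute centered cross-product between
    genotype (e.g. 0/1/2 allele counts) and phenotype. Phenotype labels
    are shuffled n_perm times to build the null distribution.
    """
    n = len(genotypes)
    g_mean = sum(genotypes) / n

    def statistic(y):
        y_mean = sum(y) / n
        return abs(sum((g - g_mean) * (v - y_mean) for g, v in zip(genotypes, y)))

    observed = statistic(phenotypes)
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    geno = [0] * 10 + [2] * 10
    pheno = [0.0] * 10 + [1.0] * 10
    print(permutation_pvalue(geno, pheno))
```

Because each SNP needs its own set of permutations and the permutations are independent of one another, the work parallelizes naturally, which is the motivation for distributing it across a cluster.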
Of the about 3000 isotopes presently known, about 20% have been discovered in fission. The history of fission as it relates to the discovery of isotopes as well as the various reaction mechanisms leading to isotope discoveries involving fission are presented.
Goering, P.T.H.; Heijenk, Gerhard J.; Lelieveldt, B.P.F.; Haverkort, Boudewijn R.H.M.; de Laat, C.T.A.M.; Heijnsdijk, J.W.J.
A protocol to perform service discovery in adhoc networks is introduced in this paper. Attenuated Bloom filters are used to distribute services to nodes in the neighborhood and thus enable local service discovery. The protocol has been implemented in a discrete event simulator to investigate the
The study substantiates that the effectiveness of Discovery Learning method in learning English Grammar for the learners at standard V. Discovery Learning is particularly beneficial for any student learning a second language. It promotes peer interaction and development of the language and the learning of concepts with content. Reichert and…
...(c) or 111 of the Act has been filed. 30 U.S.C. 815(c) and 821. (e) Completion of discovery... 29 Labor 9 2010-07-01 2010-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or...
van der Lee, Theo A J; Medema, Marnix H
Fungal natural products possess biological activities that are of great value to medicine, agriculture and manufacturing. Recent metagenomic studies accentuate the vastness of fungal taxonomic diversity, and the accompanying specialized metabolic diversity offers a great and still largely untapped resource for natural product discovery. Although fungal natural products show an impressive variation in chemical structures and biological activities, their biosynthetic pathways share a number of key characteristics. First, genes encoding successive steps of a biosynthetic pathway tend to be located adjacently on the chromosome in biosynthetic gene clusters (BGCs). Second, these BGCs are often located on specific regions of the genome and show a discontinuous distribution among evolutionarily related species and isolates. Third, the same enzyme (super)families are often involved in the production of widely different compounds. Fourth, genes that function in the same pathway are often co-regulated, and therefore co-expressed across various growth conditions. In this mini-review, we describe how these partly interlinked characteristics can be exploited to computationally identify BGCs in fungal genomes and to connect them to their products. Particular attention will be given to novel algorithms to identify unusual classes of BGCs, as well as integrative pan-genomic approaches that use a combination of genomic and metabolomic data for parallelized natural product discovery across multiple strains. Such novel technologies will not only expedite the natural product discovery process, but will also allow the assembly of a high-quality toolbox for the re-design or even de novo design of biosynthetic pathways using synthetic biology approaches.
Douglas Gary Bielenberg
Low-cost, high-throughput genotyping methods are crucial to marker discovery and marker-assisted breeding efforts, but have not been available for many 'specialty crops' such as fruit and nut trees. Here we apply the Genotyping-By-Sequencing (GBS) method developed for cereals to the discovery of single nucleotide polymorphisms (SNPs) in a peach F2 mapping population. Peach is a genetic and genomic model within the Rosaceae and will provide a template for the use of this method with other members of this family. Our F2 mapping population of 57 genotypes segregates for bloom time (BD) and chilling requirement (CR), and we have extensively phenotyped this population. The population derives from a selfed F1 progeny of a cross between 'Hakuho' (high CR) and 'UFGold' (low CR). We were able to successfully employ GBS and the TASSEL GBS pipeline without modification of the original methodology using the ApeKI restriction enzyme and multiplexing at an equivalent of 96 samples per Illumina HiSeq 2000 lane. We obtained hundreds of SNP markers which were then used to construct a genetic linkage map and identify quantitative trait loci (QTL) for BD and CR.
Mondal, Pinaki; Datta, Sayantan; Maiti, Guru Prasad; Baral, Aradhita; Jha, Ganga Nath; Panda, Chinmay Kumar; Chowdhury, Shantanu; Ghosh, Saurabh; Roy, Bidyut; Roychoudhury, Susanta
Polymorphic variants of DNA repair and damage response genes play major role in carcinogenesis. These variants are suspected as predisposition factors to Oral Squamous Cell Carcinoma (OSCC). For identification of susceptible variants affecting OSCC development in Indian population, the "maximally informative" method of SNP selection from HapMap data to non-HapMap populations was applied. Three hundred twenty-five SNPs from 11 key genes involved in double strand break repair, mismatch repair and DNA damage response pathways were genotyped on a total of 373 OSCC, 253 leukoplakia and 535 unrelated control individuals. The significantly associated SNPs were validated in an additional cohort of 144 OSCC patients and 160 controls. The rs12515548 of MSH3 showed significant association with OSCC both in the discovery and validation phases (discovery P-value: 1.43E-05, replication P-value: 4.84E-03). Two SNPs (rs12360870 of MRE11A, P-value: 2.37E-07 and rs7003908 of PRKDC, P-value: 7.99E-05) were found to be significantly associated only with leukoplakia. Stratification of subjects based on amount of tobacco consumption identified SNPs that were associated with either high or low tobacco exposed group. The study reveals a synergism between associated SNPs and lifestyle factors in predisposition to OSCC and leukoplakia.
In this paper, we discuss the datasets stored in SNP-Seek, architecture of the database and web application, interoperability methodologies in place, and discuss a few use cases demonstrating the utility of SNP-Seek for diversity analysis and molecular breeding.
White Frank F
Background: Eight diverse sorghum (Sorghum bicolor L. Moench) accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs). Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effects of library type and genomic diversity on SNP discovery and imputation are evaluated. Results: Alignment of eight genome equivalents (6 Gb) to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions: A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.
This study was undertaken to clarify the molecular basis for human skin color variation and the environmental adaptability to ultraviolet irradiation, with the ultimate goal of predicting the impact of changes in future environments on human health risk. One hundred twenty-two Caucasians living in Toledo, Ohio participated. Back and cheek skin were assayed for melanin as a quantitative trait marker. Buccal cell samples were collected and used for DNA extraction. DNA was used for SNP genotyping using the Masscode™ system, which entails two-step PCR amplification and a platform chemistry which allows cleavable mass spectrometry tags. The results show that gene-gene interaction between SNP alleles at multiple loci (not necessarily on the same chromosome) contributes to inter-individual skin color variation while suggesting a high probability of linkage disequilibrium. Confirmation of these findings requires further study with other ethnic groups to analyze the associations between SNP alleles at multiple loci and human skin color variation. Our overarching goal is to use remote sensing data to clarify the interaction between atmospheric environments and SNP allelic frequency and investigate human adaptability to ultraviolet irradiation. Such information should greatly assist in the prediction of the health effects of future environmental changes such as ozone depletion and increased ultraviolet exposure. If such health effects are to some extent predictable, it might be possible to prepare for such changes in advance and thus reduce the extent of their impact.
Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas
There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoprotein A-I and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.
In this investigation 45 parental cacao plants and five progeny derived from the parental stock studied were genotyped using six SNP markers to determine off-types or mislabeled clones and to authenticate crosses made in the Cocoa Research Institute of Ghana (CRIG) breeding program. Investigation wa...
Meisel, S.F.; Beeken, R.J.; Jaarsveld, C.H.M. van; Wardle, J.
AIM: We tested the hypothesis that the obesity-associated FTO SNP rs9939609 would be associated with clinically significant weight gain (≥5% of initial body weight) in the first year of university, a time identified as high risk for weight gain. METHODS: We collected anthropometric data from
Saccone, Scott F; Quan, Jiaxi; Mehta, Gaurang; Bolze, Raphael; Thomas, Prasanth; Deelman, Ewa; Tischfield, Jay A; Rice, John P
Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...
Mikkelsen, Martin; Rockenbauer, Eszter; Sørensen, Erik;
Mitochondrial DNA (mtDNA) is maternally inherited without recombination events and has a high copy number, which makes mtDNA analysis feasible even when genomic DNA is sparse or degraded. Here, we present a SNP typing assay with 33 previously described mtDNA coding region SNPs for haplogroup assi...
Scholten, Olga E.; Kaauwen, van Martijn P.W.; Shahin, Arwa; Hendrickx, Patrick M.; Keizer, Paul; Burger-Meijer, Karin; Heusden, van Sjaak; Linden, van der Gerard; Vosman, Ben
Background: Within onion, Allium cepa L., the availability of disease resistance is limited. The identification of sources of resistance in related species, such as Allium roylei and Allium fistulosum, was a first step towards the improvement of onion cultivars by breeding. SNP markers linked to
Aubrey E Hill
Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene, as opposed to an entire genome or species, should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or from random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated, and continue to accrue, in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of 'directional' or 'intelligent design-type' SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive.
Børsting, Claus; Sanchez, Juan J; Hansen, Hanna E; Hansen, Anders J; Bruun, Hanne Q; Morling, Niels
The performance of a multiplex assay with 52 autosomal single nucleotide polymorphisms (SNPs) developed for human identification was tested on 124 mother-child-father trios. The typical paternity indices (PIs) were 10^5-10^6 for the trios and 10^3-10^4 for the child-father duos. Using the SNP profiles from the randomly selected trios and 700 previously typed individuals, a total of 83,096 comparisons between mother, child and an unrelated man were performed. On average, 9-10 mismatches per comparison were detected. Four mismatches were genetic inconsistencies and 5-6 mismatches were opposite homozygosities. In only two of the 83,096 comparisons did an unrelated man match perfectly to a mother-child duo, and in both cases the PI of the true father was much higher than the PI of the unrelated man. The trios were also typed for 15 short tandem repeats (STRs) and seven variable number of tandem repeats (VNTRs). The typical PIs based on 15 STRs or seven VNTRs were 5-50 times higher than the typical PIs based on 52 SNPs. Six mutations in tandem repeats were detected among the randomly selected trios. In contrast, no mutations were found in the SNP loci. The results showed that the 52 SNP-plex assay is a very useful alternative to currently used methods in relationship testing. The usefulness of SNP markers with low mutation rates in paternity and immigration casework is discussed.
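The two mismatch types named in this abstract can be illustrated with a small sketch. It is not the authors' software: the function names are hypothetical, genotypes are assumed to be unordered allele pairs at biallelic SNPs, and an "opposite homozygosity" is assumed to mean that child and tested man are homozygous for different alleles, with every other Mendelian conflict in the trio counted as a "genetic inconsistency".

```python
from typing import List, Tuple

Genotype = Tuple[str, str]  # unordered pair of alleles at one biallelic SNP


def is_homozygous(g: Genotype) -> bool:
    return g[0] == g[1]


def opposite_homozygous(a: Genotype, b: Genotype) -> bool:
    """True when both genotypes are homozygous for different alleles (e.g. AA vs GG)."""
    return is_homozygous(a) and is_homozygous(b) and a[0] != b[0]


def trio_consistent(mother: Genotype, child: Genotype, man: Genotype) -> bool:
    """True if the child can be formed from one maternal allele and one allele of the man."""
    target = sorted(child)
    return any(sorted((m, f)) == target for m in mother for f in man)


def count_mismatches(mothers: List[Genotype],
                     children: List[Genotype],
                     men: List[Genotype]) -> Tuple[int, int]:
    """Return (opposite homozygosities, other genetic inconsistencies) over all loci."""
    opposite = 0
    inconsistent = 0
    for m, c, f in zip(mothers, children, men):
        if opposite_homozygous(c, f):
            opposite += 1
        elif not trio_consistent(m, c, f):
            inconsistent += 1
    return opposite, inconsistent


# Toy three-locus profiles (invented for illustration, not data from the study).
mothers = [("A", "G"), ("A", "A"), ("A", "G")]
children = [("A", "A"), ("A", "G"), ("A", "G")]
men = [("G", "G"), ("A", "A"), ("A", "G")]
print(count_mismatches(mothers, children, men))
```

Counting the two categories separately mirrors how the abstract reports them; a real casework tool would also have to handle missing calls and possible mutations.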
LI Jing-qiong; ZHENG You-liang; WEI Yu-ming
Forty-three gene sequences encoding purothionin were characterized from the three species or subspecies of einkorn wheats. These sequences contained 887 bp, among which 92 SNPs including 29 indel loci were detected, giving an average SNP frequency of one SNP per 9.64 bases. According to these sequences, 5 SNP markers were successfully designed, which were used to mine the variations of purothionin genes of 102 einkorn wheat accessions. Based on the 5 detected SNP loci, the 102 einkorn wheat accessions could be divided into 21 haplotypes, among which 11 haplotypes contained a single sample. Phylogenetic analysis indicated that the purothionin genes from einkorn wheats were more closely related to those from the D genome than to those from the B genome. Seven out of the 43 gene sequences were assumed to be pseudogenes by the definition of containing in-frame stop codons and small insertions/deletions leading to frameshift. In the remaining 36 amino acid sequences, the 8 Cys and Tyr-13 loci in the mature thionin domain, which play important roles in the biological activities, were all conserved, whereas some variation occurred at other important amino acid residues such as Lys and Arg.
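As a rough illustration of the bookkeeping in this abstract, the sketch below reproduces the SNP density figure (887 bp over 92 SNPs) and shows how accessions might be grouped into haplotypes from their alleles at a handful of SNP loci. The helper names and the toy accession data are hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import Dict, List


def snp_density(sequence_length: int, n_snps: int) -> float:
    """Average number of bases per SNP."""
    return sequence_length / n_snps


def haplotype_counts(alleles_by_accession: Dict[str, str]) -> Counter:
    """Count accessions per haplotype, where a haplotype is the string of alleles across loci."""
    return Counter(alleles_by_accession.values())


def singleton_haplotypes(counts: Counter) -> List[str]:
    """Haplotypes carried by exactly one accession."""
    return sorted(h for h, n in counts.items() if n == 1)


# Figure reported in the abstract: 887 bp and 92 SNPs.
print(round(snp_density(887, 92), 2))

# Invented accessions typed at 5 SNP loci (one allele per locus).
toy = {"acc1": "ACGTA", "acc2": "ACGTA", "acc3": "GCGTA", "acc4": "ATGCA"}
counts = haplotype_counts(toy)
print(len(counts), singleton_haplotypes(counts))
```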
Evidence for the impact of mislabeling and/or pollen contamination on consistency of field performance has been lacking to reinforce the need for strict adherence to quality control protocols in cacao seed garden and germplasm plot management. The present study used SNP fingerprinting at 64 loci to ...
Calus, M.P.L.; Mulder, H.A.; Bastiaansen, J.W.M.
Background Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype info
Johnatty, S.E.; Couch, F.J.; Fredericksen, Z.; Tarrell, R.; Spurdle, A.B.; Beesley, J.; Chen, X.; Gschwantler-Kaulich, D.; Singer, C.F.; Fuerhauser, C.; Fink-Retter, A.; Domchek, S.M.; Nathanson, K.L.; Pankratz, V.S.; Lindor, N.; Godwin, A.K.; Caligo, M.A.; Hopper, J.; Southey, M.C.; Giles, G.G.; Justenhoven, C.; Brauch, H.; Hamann, U.; Ko, Y.D.; Heikkinen, T.; Aaltonen, K.; Aittomaki, K.; Blomqvist, C.; Nevanlinna, H.; Hall, P.; Czene, K.; Liu, J.; Peock, S.; Cook, M.; Platte, R.; Evans, D.G.; Lalloo, F.; Eeles, R.; Pichert, G.; Eccles, D.; Davidson, R.; Cole, T.; Cook, J.; Douglas, F.; Chu, C.; Hodgson, S.; Paterson, J.; Hogervorst, F.B.L.; Rookus, M.A.; Seynaeve, C.; Wijnen, J.; Vreeswijk, M.; Ligtenberg, M.J.L.; Luijt, R.B. van der; Os, T.A. van; Gille, H.J.; Blok, M.J.; Issacs, C.; Humphreys, M.K.; McGuffog, L.; Healey, S.; Sinilnikova, O.M.; Antoniou, A.C.; Easton, D.F.; Chenevix-Trench, G.
GATA-binding protein 3 (GATA3) is a transcription factor that is crucial to mammary gland morphogenesis and differentiation of progenitor cells, and has been suggested to have a tumor suppressor function. The rs570613 single nucleotide polymorphism (SNP) in intron 4 of GATA3 was previously found to
Fountain, Emily D; Pauli, Jonathan N; Reid, Brendan N; Palsbøll, Per J; Peery, M Zachariah
Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in non-model organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown
Orr, J.L.; Back, W.; Gu, J.; Leegwater, P.H.; Govindarajan, P.; Conroy, J.; Ducro, B.J.; Arendonk, van J.A.M.
The recent completion of the horse genome and commercial availability of an equine SNP genotyping array has facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of
Tvedegaard, Kristine C.; Parner, Erik; Hooper, Craig W.
Multiplex SNP analysis on whole genome amplified DNA from archived dried bloodspots, a validation study Kristine C. Tvedegaard,1 Erik Parner,1 Craig W. Hooper,2 Jørn Atterman,1 Niels Gregersen3, Poul Thorsen,1 1Institute of Public Health, NANEA at Department of Epidemiology, University of Aarhus...
A CottonSNP63K array and accompanying cluster file has been developed and includes 45,104 intra-specific SNPs and 17,954 inter-specific SNPs for automated genotyping of cotton (Gossypium spp.) samples. Development of the cluster file included genotyping of 1,156 samples, a subset of which were iden...
Groenen, M.A.M.; Megens, H.J.W.C.; Zare, Y.; Warren, W.C.; Hillier, L.W.; Crooijmans, R.P.M.A.; Vereijken, A.; Okimoto, R.; Muir, W.M.; Cheng, H.H.
Background In livestock species like the chicken, high throughput single nucleotide polymorphism (SNP) genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). To be of value in a wide variety of breeds and popul
Nasri, Soroush; Anjomshoaa, Ahmad; Song, Sarah; Guilford, Parry; McNoe, Les; Black, Michael; Phillips, Vicky; Reeve, Anthony; Humar, Bostjan
Compromised quality of formalin-fixed paraffin-embedded (FFPE)-derived DNA has complicated the use of archival specimens for array-based genomic studies. Recent technological advances have led to first successes in this field; however, there is currently no general agreement on the most suitable platform for the array-based analysis of FFPE DNA. In this study, FFPE and matched fresh-frozen (FF) specimens were separately analyzed with Affymetrix single nucleotide polymorphism (SNP) 6.0 and Agilent 4x44K oligonucleotide arrays to compare the genomic profiles from the two tissue sources and to assess the relative performance of the two platforms on FFPE material. Genomic DNA was extracted from matched FFPE-FF pairs of normal intestinal epithelium from four patients and applied to the SNP and oligonucleotide platforms according to the manufacturer-recommended protocols. On the Affymetrix platform, a substantial increase in apparent copy number alterations was observed in all FFPE tissues relative to their matched FF counterparts. In contrast, FFPE and matched FF genomic profiles obtained via the Agilent platform were very similar. Both the SNP and the oligonucleotide platform performed comparably on FF material. This study demonstrates that Agilent oligonucleotide array comparative genomic hybridization generates reliable results from FFPE-extracted DNA, whereas the Affymetrix SNP-based array seems less suitable for the analysis of FFPE material.
Fett, James D; Markham, David W
The past decade has seen remarkable gains for outcomes in peripartum cardiomyopathy (PPCM), one of the leading causes of maternal mortality and morbidity in the USA and many other countries, including the high-incidence areas of Haiti and South Africa. This review article emphasizes the importance of continuing the process of increasing awareness of PPCM and presents details of this evolving picture, including important discoveries that point the way to full recovery for almost all PPCM subjects. In addition, new interventions will be highlighted, which may facilitate recovery. Numerous studies have demonstrated that when the diagnosis of PPCM is made with LVEF > 0.30, the probability is that recovery to LVEF ≥ 0.50 will occur in the overwhelming majority of subjects. PPCM patients diagnosed with severely depressed systolic function (LVEF < 0.30) and a remodeled left ventricle with greater dilatation (LVEDd ≥ 60mm) are least likely to reach the outcome recovery goals. These are the patients with the greatest need for newer interventional strategies.
Surveyors of all ages, have your rulers and compasses at the ready! This sixth edition of Discovery Monday is your chance to learn about the surveyor's tools - the state of the art in measuring instruments - and see for yourself how they work. With their usual daunting precision, the members of CERN's Surveying Group have prepared some demonstrations and exercises for you to try. Find out the techniques for ensuring accelerator alignment and learn about high-tech metrology systems such as deviation indicators, tracking lasers and total stations. The surveyors will show you how they precisely measure magnet positioning, with accuracy of a few thousandths of a millimetre. You can try your hand at precision measurement using different types of sensor and a modern-day version of the Romans' bubble level, accurate to within a thousandth of a millimetre. You will learn that photogrammetry techniques can transform even a simple digital camera into a remarkable measuring instrument. Finally, you will have a chance t...
Devoted teachers and mentors during early childhood and adolescence nurtured my ambition to become a scientist, but it was not until I actually began doing experiments in college and graduate school that I was confident about that choice and of making it a reality. During my postdoctoral experiences and thereafter, I made several significant advances, most notably the discovery of the then novel acyl- and aminoacyl adenylates: the former as intermediates in fatty acyl coenzyme A (CoA) formation and the latter as precursors to aminoacyl tRNAs. In the early 1970s, my research changed from a focus on transcription and translation in Escherichia coli to the molecular genetics of mammalian cells. To that end, my laboratory developed a method for creating recombinant DNAs that led us and others, over the next two decades, to create increasingly sophisticated ways for introducing "foreign" DNAs into cultured mammalian cells and to target modifications of specific chromosomal loci. Circumstances surrounding that work drew me into the public policy debates regarding recombinant DNA practices. As an outgrowth of my commitment to teaching, I co-authored several textbooks on molecular genetics and a biography of George Beadle. The colleagues, students, and wealth of associates with whom I interacted have made being a scientist far richer than I could have imagined.
Duan, Pei; Ding, Feng; Wang, Fang; Wang, Bao-Shan
The effect of SNP, an NO donor, on seed germination of wheat (Triticum aestivum L. cv. 'DK961') under salt stress was studied. The results showed that priming of seeds with 0.06 mmol/L SNP for 24 h markedly alleviated the decrease of the germination percentage, germination index, vigor index and imbibition rate of wheat seeds under salt stress. SNP significantly alleviated the decrease of the beta-amylase activity but had almost no effect on the alpha-amylase activity of wheat seeds under salt stress. SNP slightly increased the alpha-amylase isoenzymes (especially isoenzyme 3) and significantly increased the beta-amylase isoenzymes (especially isoenzymes d, e, f and g). SNP pretreatment decreased the Na+ content but increased the K+ content, resulting in a marked increase of the K+/Na+ ratio of wheat seedlings under salt stress. These results suggested that NO is involved in promoting wheat seed germination under salt stress by increasing the beta-amylase activity.
Background In livestock species like the chicken, high throughput single nucleotide polymorphism (SNP) genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). To be of value in a wide variety of breeds and populations, the success rate of the SNP genotyping assay, the distribution of the SNP across the genome and the minor allele frequencies (MAF) of the SNPs used are extremely important. Results We describe the design of a moderate density (60k) Illumina SNP BeadChip in chicken consisting of SNPs known to be segregating at high to medium minor allele frequencies (MAF) in the two major types of commercial chicken (broilers and layers). This was achieved by the identification of 352,303 SNPs with moderate to high MAF in 2 broilers and 2 layer lines using Illumina sequencing on reduced representation libraries. To further increase the utility of the chip, we also identified SNPs on sequences currently not covered by the chicken genome assembly (Gallus_gallus-2.1). This was achieved by 454 sequencing of the chicken genome at a depth of 12x and the identification of SNPs on 454-derived contigs not covered by the current chicken genome assembly. In total we added 790 SNPs that mapped to 454-derived contigs as well as 421 SNPs with a position on Chr_random of the current assembly. The SNP chip contains 57,636 SNPs of which 54,293 could be genotyped and were shown to be segregating in chicken populations. Our SNP identification procedure appeared to be highly reliable and the overall validation rate of the SNPs on the chip was 94%. We were able to map 328 SNPs derived from the 454 sequence contigs on the chicken genome. The majority of these SNPs map to chromosomes that are already represented in genome build Gallus_gallus-2.1.0. Twenty-eight SNPs were used to construct two new linkage groups most likely representing two micro-chromosomes not covered by the
Wayne E Clarke
Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci, QTL, analysis) with traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and, to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty-eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome, generating a novel data resource for research and breeding of this crop species.
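The transition/transversion split reported in this abstract follows from a standard definition: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch of that classification is given below; the function names and the toy SNP list are illustrative assumptions, not part of the study's pipeline.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_fractions(snps: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return the fractions of transitions and transversions in a list of (ref, alt) pairs."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    if not kinds:
        raise ValueError("no SNPs given")
    ts = kinds.count("transition") / len(kinds)
    return ts, 1.0 - ts


# Invented example: three transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T"), ("T", "C")]
print(ts_tv_fractions(toy_snps))
```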
Ratcliffe, B; El-Dien, O G; Klápště, J; Porth, I; Chen, C; Jaquish, B; El-Kassaby, Y A
Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3-40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31-0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04-0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated.
Tomas, Carmen; Stangegaard, Michael; Børsting, Claus; Hansen, Anders Johannes; Morling, Niels
GenPlex (Applied Biosystems) is a new SNP genotyping system based on an initial PCR amplification followed by an oligo ligation assay (OLA). The OLA consists of the hybridization of allele and locus specific oligonucleotides (ASOs and LSOs) to PCR products and subsequent ligation of ASOs and LSOs. The ligation products are immobilized to microtitre plates and reporter oligonucleotides (ZipChute probes) are hybridized to the ligation products. ZipChute probes are subsequently eluted and detected using capillary electrophoresis. Applied Biosystems developed the GenPlex SNP genotyping system with amelogenin and 48 of the 52 SNPs used in the 52 SNP-plex assay developed by the SNPforID consortium. The system requires equipment that is usually found in forensic genetic laboratories. The use of a robot for the pipetting steps is highly recommended. A total of 286 individuals from Denmark, Somalia and Greenland were investigated with GenPlex using a Biomek 3000 (Beckman Coulter) robot. The results were compared to results obtained with an ISO 17025 accredited SNP typing assay based on single base extension (SBE). With the GenPlex SNP genotyping system, full SNP profiles were obtained in 97.6% of the investigations. Perfect concordance was obtained in duplicate investigations, and the SNP genotypes obtained with the GenPlex system were concordant with those of the accredited SBE-based SNP typing system except for one result in rs901398 in one of 286 individuals, most likely due to a mutation 6 bp downstream of the SNP. Reproducible SNP genotypes were obtained from as little as 250 pg of DNA.
Yang, M; Geng, G-J; Zhang, W; Cui, L; Zhang, H-X; Zheng, J-L
To find out the relationship between SNP genotypes of canine olfactory receptor genes and olfactory ability, 28 males and 20 females from German Shepherd dogs in police service were scored by odor detection tests and analyzed using the Beckman GenomeLab SNPstream. The representative 22 SNP loci from the exonic regions of 12 olfactory receptor genes were investigated, and three kinds of odor (human, ice drug and trinitrotoluene) were detected. The results showed that the SNP genotypes at the OR10H1-like:c.632C>T, OR10H1-like:c.770A>T, OR2K2-like:c.518G>A, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A loci had a statistically significant effect on the scenting abilities (P dogs (P T, OR10H1-like:c.770A>T, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A (P dogs with genotype CC at the OR10H1-like:c.632C>T, genotype AA at the OR10H1-like:c.770A>T, genotype TT at the OR4C11-like:c.511T>G and genotype GG at the OR4C11-like:c.692G>A loci did better at detecting the ice drug. We concluded that there was linkage between certain SNP genotypes and the olfactory ability of dogs and that SNP genotypes might be useful in determining dogs' scenting potential.
Lakshmi K Matukumalli
The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in human has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with median interval of 37 kb and maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle.
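The two summary statistics quoted for marker spacing (median inter-marker interval and maximum gap) can be computed from sorted marker positions as sketched below. This only illustrates the statistics; it does not reproduce the spacing algorithm itself, which the abstract does not describe. The function name and the toy positions are assumptions, and a single chromosome is assumed.

```python
from statistics import median
from typing import Sequence, Tuple


def interval_stats(positions: Sequence[int]) -> Tuple[float, int]:
    """Return (median interval, maximum gap) in bp for markers on one chromosome."""
    pos = sorted(positions)
    if len(pos) < 2:
        raise ValueError("need at least two marker positions")
    # Distance between each marker and the next one along the chromosome.
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return median(gaps), max(gaps)


# Invented positions in bp on one chromosome (input order does not matter).
toy_positions = [130_000, 10_000, 80_000, 47_000]
med, largest = interval_stats(toy_positions)
print(f"median interval: {med / 1000:.0f} kb, largest gap: {largest / 1000:.0f} kb")
```

For a genome-wide panel, the same function would be applied per chromosome and the gaps pooled before taking the median.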
Knowledge, as intellectual capital, has become the main resource of an organization, and the process of knowledge discovery, acquisition and storage is a very important one. Knowledge discovery can be easily realized through Data Mining, a Machine Learning technique that allows the discovery of useful knowledge from a large amount of data; this knowledge supports the decision process. Proper management of the discovered knowledge can improve the organization's results and will increase its intellectual capital, the result being more efficient management.
Gawehn, Erik; Hiss, Jan A; Schneider, Gisbert
Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing from the field of "deep learning". Compared with some of the other life sciences, their application in drug discovery is still limited. Here, we provide an overview of this emerging field of molecular informatics, present the basic concepts of prominent deep learning methods and offer motivation to explore these techniques for their usefulness in computer-assisted drug discovery and design. We specifically emphasize deep neural networks, restricted Boltzmann machine networks and convolutional networks.
Gallinger, C. [Elk Valley Coal Corporation, Sparwood, BC (Canada)]
The opening of the 30-kilometre Coal Discovery Trail in August is described. The trail, through a pine, spruce, and larch forest, extends from Sparwood to Fernie and passes through Hosmer, a historic mining site. The trail, part of the Elk Valley Coal Discovery Centre, will be used for hiking, bicycling, horseback riding, and cross-country skiing. The Coal Discovery Centre will provide an interpretive centre that concentrates on history of coal mining and miners, preservation of mining artifacts and sites, and existing technology. 3 figs.
Wilhelmson, R.; Moore, C. W.
Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development/adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high performance computing environments including the TeraGrid. Portal grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions who are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed
Farha, Maya A; Brown, Eric D
.... Nevertheless, a paucity of new antibacterial drugs in discovery and development pipelines using traditional approaches has prompted a variety of unconventional and disruptive strategies for antibacterial drug discovery...
Hestand, Matthew S; van Galen, Michiel; Villerius, Michel P; van Ommen, Gert-Jan B; den Dunnen, Johan T; 't Hoen, Peter AC
Background The identification of transcription factor binding sites is difficult since they are only a small number of nucleotides in size, resulting in large numbers of false positives and false negatives in current approaches. Computational methods for reducing false positives look for over-representation of transcription factor binding sites in a set of similarly regulated promoters or for conservation in orthologous promoter alignments. Results We have developed a novel tool, "CORE_TF" (Conserved and Over-REpresented Transcription Factor binding sites), that identifies common transcription factor binding sites in promoters of co-regulated genes. To improve upon existing binding site predictions, the tool searches for position weight matrices from the TRANSFAC® database that are over-represented in an experimental set compared to a random set of promoters and identifies cross-species conservation of the predicted transcription factor binding sites. The algorithm has been evaluated with expression and chromatin-immunoprecipitation on microarray data. We also implement and demonstrate the importance of matching the random set of promoters to the experimental promoters by GC content, which is a unique feature of our tool. Conclusion The program CORE_TF is accessible through a user-friendly web interface. It provides a table of over-represented transcription factor binding sites in the promoters of the user's input genes and a graphical view of evolutionarily conserved transcription factor binding sites. In our test data sets it successfully predicts target transcription factors and their binding sites. PMID:19036135
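Over-representation of a binding-site motif in an experimental promoter set relative to a background set is often assessed with a hypergeometric upper-tail probability. The abstract does not state which statistic CORE_TF uses, so the sketch below is only an assumed, generic illustration of the idea; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total promoters (experimental plus background)
    K: promoters among the N that contain the motif
    n: size of the experimental set
    k: experimental promoters that contain the motif
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= n):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Invented counts: 10 promoters in total, 5 carry the motif,
# and both promoters of a 2-gene experimental set carry it.
print(round(hypergeom_upper_tail(2, 2, 5, 10), 4))
```

A small tail probability suggests the motif occurs in the experimental set more often than expected by chance; matching the background set to the experimental set by GC content, as the abstract emphasizes, affects which promoters enter N and K.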
Kundu, Juthika; Mazumder, Rupa; Srivastava, Ranjana; Srivastava, Brahm S
Intranasal immunization, a noninvasive method of vaccination, has been found to be effective in inducing systemic and mucosal immune responses. The present study was aimed at investigating the efficacy of intranasal immunization in inducing mucosal immunity in experimental cholera by subunit recombinant protein vaccines from Vibrio cholerae O1. The structural genes encoding toxin-coregulated pilus A (TcpA) and B subunit of cholera toxin (CtxB) from V. cholerae O1 were cloned and expressed in Escherichia coli. Rabbits were immunized intranasally with purified TcpA and CtxB alone or a mixture of TcpA and CtxB. Immunization with TcpA and CtxB alone conferred, respectively, 41.1% and 70.5% protection against V. cholerae challenge, whereas immunization with a mixture of both antigens conferred complete (100%) protection, as assayed in the rabbit ileal loop model. Serum titers of immunoglobulin G (IgG) antibodies to TcpA and CtxB, and anti-TcpA- and anti-CtxB-specific sIgA in intestinal lavage of vaccinated animals were found to be significantly elevated compared with unimmunized controls. Vibriocidal antibodies were detected at remarkable levels in rabbits receiving TcpA antigen and their titers correlated with protection. Thus, mucosal codelivery of pertinent cholera toxoids provides enhanced protection against experimental cholera.
Transcription factors, transcriptional coregulators, and epigenetic modulation in the control of pulmonary vascular cell phenotype: therapeutic implications for pulmonary hypertension (2015 Grover Conference series).
Pullamsetti, Soni S; Perros, Frédéric; Chelladurai, Prakash; Yuan, Jason; Stenmark, Kurt
Pulmonary hypertension (PH) is a complex and multifactorial disease involving genetic, epigenetic, and environmental factors. Numerous stimuli and pathological conditions facilitate severe vascular remodeling in PH by activation of a complex cascade of signaling pathways involving vascular cell proliferation, differentiation, and inflammation. Multiple signaling cascades modulate the activity of certain sequence-specific DNA-binding transcription factors (TFs) and coregulators that are critical for the transcriptional regulation of gene expression that facilitates PH-associated vascular cell phenotypes, as demonstrated by several studies summarized in this review. Past studies have largely focused on the role of the genetic component in the development of PH, while the presence of epigenetic alterations such as microRNAs, DNA methylation, histone levels, and histone deacetylases in PH is now also receiving increasing attention. Epigenetic regulation of chromatin structure is also recognized to influence gene expression in development or disease states. Therefore, a complete understanding of the mechanisms involved in altered gene expression in diseased cells is vital for the design of novel therapeutic strategies. Recent technological advances in DNA sequencing will provide a comprehensive improvement in our understanding of mechanisms involved in the development of PH. This review summarizes current concepts in TF and epigenetic control of cell phenotype in pulmonary vascular disease and discusses the current issues and possibilities in employing potential epigenetic or TF-based therapies for achieving complete reversal of PH.
Endang S Rahayu
Full Text Available Functional food is defined as any potentially healthful food or food ingredient that may provide a health benefit beyond the traditional nutrients it contains. Much research has been conducted on the health benefits of probiotics (live bacterial cells), one of the ingredients of functional foods. Lactic acid bacteria (LAB) are among the bacteria that can be used as probiotic agents and are also involved in traditional fermented foods. Previous research showed that Lactobacillus acidophilus SNP-2, isolated from the faecal material of a healthy infant, is resistant to acid and bile salt and has an antagonistic effect against several enteric bacterial pathogens. The objective of this research was to study the health benefits of L. acidophilus SNP-2 as a probiotic agent. These bacteria were supplemented into tape ketan (fermented sticky rice), an indigenous Indonesian fermented food. Tape ketan was chosen as the carrier of probiotic biomass based on the high population of LAB in this product, i.e., 1.3 x 10^8 CFU/g. Addition of L. acidophilus SNP-2 biomass prior to fermentation of tape ketan resulted in a higher total of LAB cells, i.e., 2.1 x 10^9 CFU/g, compared to 1.5 x 10^8 CFU/g when the addition was done after fermentation. Consumption of tape ketan containing the probiotic agent by volunteers increased the population of lactobacilli (from 1.7 x 10^7 CFU/g to 9.9 x 10^7 CFU/g) and decreased the population of Enterobacteriaceae (from 5.4 x 10^9 CFU/g to 4.4 x 10^8 CFU/g) in their faecal material. This result indicated that the probiotic agent was able to colonize the gastrointestinal tract and inhibit the growth of Enterobacteriaceae there. It implies that tape ketan can be used as a carrier for probiotic agents and can be categorized as a functional food.
Full Text Available Wheat leaf rust is an important disease worldwide. Growing resistant cultivars is an effective means to control the disease. In the present study, 244 recombinant inbred lines from Zhou 8425B/Chinese Spring cross were phenotyped for leaf rust severities during the 2011–2012, 2012–2013, 2013–2014, and 2014–2015 cropping seasons at Baoding, Hebei province, and 2012–2013 and 2013–2014 cropping seasons in Zhoukou, Henan province. The population was genotyped using the high-density Illumina iSelect 90K SNP assay and SSR markers. Inclusive composite interval mapping identified eight QTL, designated as QLr.hebau-2AL, QLr.hebau-2BS, QLr.hebau-3A, QLr.hebau-3BS, QLr.hebau-4AL, QLr.hebau-4B, QLr.hebau-5BL, and QLr.hebau-7DS, respectively. QLr.hebau-2BS, QLr.hebau-3A, QLr.hebau-3BS, and QLr.hebau-5BL were derived from Zhou 8425B, whereas the other four were from Chinese Spring. Three stable QTL on chromosomes 2BS, 4B and 7DS explained 7.5–10.6%, 5.5–24.4%, and 11.2–20.9% of the phenotypic variance, respectively. QLr.hebau-2BS in Zhou 8425B might be the same as LrZH22 in Zhoumai 22; QLr.hebau-4B might be the residual resistance of Lr12, and QLr.hebau-7DS is Lr34. QLr.hebau-2AL, QLr.hebau-3BS, QLr.hebau-4AL, and QLr.hebau-5BL are likely to be novel QTL for leaf rust. These QTL and their closely linked SNP and SSR markers can be used for fine mapping, candidate gene discovery, and marker-assisted selection in wheat breeding.
Full Text Available High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. It is also an efficient way to discover coding SNPs, and it is particularly effective for SNP identification when multiple individuals with different genetic backgrounds are used. The objective of this study was to perform SNP and INDEL discovery in the human airway transcriptome of healthy never smokers, healthy current smokers, smokers without lung cancer and smokers with lung cancer. Preliminary comparative analysis of these four data sets is expected to reveal SNP and INDEL patterns responsible for lung cancer. A total of 85,028 SNPs and 5738 INDELs in healthy never smokers, 32,671 SNPs and 1561 INDELs in healthy current smokers, 50,205 SNPs and 3008 INDELs in smokers without lung cancer and 51,299 SNPs and 3138 INDELs in smokers with lung cancer were identified. The SNPs and INDELs in genes previously reported as differentially expressed were also analyzed. Smokers were found to have SNPs at positions 62,186,542 and 62,190,293 in the SCGB1A1 gene and 180,017,251, 180,017,252, and 180,017,597 in the SCGB3A1 gene, and INDELs at position 35,871,168 in the NFKBIA gene and 180,017,797 in the SCGB3A1 gene. The SNPs identified in this study provide a resource for genetic studies in smokers and should contribute to the development of personalized medicine. This study is preliminary, and more rigorous data analysis and wet lab validation are required.
Li, Gang; Hillier, LaDeana W; Grahn, Robert A; Zimin, Aleksey V; David, Victor A; Menotti-Raymond, Marilyn; Middleton, Rondo; Hannah, Steven; Hendrickson, Sher; Makunin, Alex; O'Brien, Stephen J; Minx, Pat; Wilson, Richard K; Lyons, Leslie A; Warren, Wesley C; Murphy, William J
High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands of shorter scaffolds in earlier assemblies, which instead relied heavily on comparative mapping with related species. A high-resolution map would aid in validating and ordering chromosome scaffolds from existing and new genome assemblies. Here, we describe a high-resolution genetic linkage map of the domestic cat genome based on genotyping 453 domestic cats from several multi-generational pedigrees on the Illumina 63K SNP array. The final maps include 58,055 SNP markers placed relative to 6637 markers with unique positions, distributed across all autosomes and the X chromosome. Our final sex-averaged maps span a total autosomal length of 4464 cM, the longest described linkage map for any mammal, confirming length estimates from a previous microsatellite-based map. The linkage map was used to order and orient the scaffolds from a substantially more contiguous domestic cat genome assembly (Felis catus v8.0), which incorporated ∼20 × coverage of Illumina fragment reads. The new genome assembly shows substantial improvements in contiguity, with a nearly fourfold increase in N50 scaffold size to 18 Mb. We use this map to report probable structural errors in previous maps and assemblies, and to describe features of the recombination landscape, including a massive (∼50 Mb) recombination desert (of virtually zero recombination) on the X chromosome that parallels a similar desert on the porcine X chromosome in both size and physical location.
National Aeronautics and Space Administration — The proposal addresses the NASA's need to enable scientific discovery and the topic's requirements for: processing large volumes of data, commonly available on the...
NCI funded the development of rituximab, one of the first monoclonal antibody cancer treatments. With the discovery of rituximab, more than 70 percent of patients diagnosed with non-Hodgkin lymphoma now live five years past their initial diagnosis.
Zakeri, Bijan; Lu, Timothy K.
Antibiotic discovery has a storied history. From the discovery of penicillin by Sir Alexander Fleming to the relentless quest for antibiotics by Selman Waksman, the stories have become like folklore, used to inspire future generations of scientists. However, recent discovery pipelines have run dry at a time when multidrug resistant pathogens are on the rise. Nature has proven to be a valuable reservoir of antimicrobial agents, which are primarily produced by modularized biochemical pathways. Such modularization is well suited to remodeling by an interdisciplinary approach that spans science and engineering. Herein, we discuss the biological engineering of small molecules, peptides, and non-traditional antimicrobials and provide an overview of the growing applicability of synthetic biology to antimicrobials discovery. PMID:23654251
Szymanski, T.; Thoennessen, M.
Twenty-six cobalt isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.
Shore, A; Heim, M; Schuh, A; Thoennessen, M
Twenty-nine arsenic isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.
The RAS Drug Discovery group aims to develop assays that will reveal aspects of RAS biology upon which cancer cells depend. Successful assay formats are made available for high-throughput screening programs to yield potentially effective drug compounds.
Sumudu P. Leelananda
Full Text Available The process of drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, helping to expedite this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power, has made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein–ligand docking, pharmacophore modeling and QSAR techniques are reviewed.
Shupla, Christine; Shipp, S. S.; Halligan, E.; Dalton, H.; Boonstra, D.; Buxner, S.; SMD Planetary Forum, NASA
"New Worlds, New Discoveries" is a synthesis of NASA’s 50-year exploration history which provides an integrated picture of our new understanding of our solar system. As NASA spacecraft head to and arrive at key locations in our solar system, "New Worlds, New Discoveries" provides an integrated picture of our new understanding of the solar system to educators and the general public! The site combines the amazing discoveries of past NASA planetary missions with the most recent findings of ongoing missions, and connects them to the related planetary science topics. "New Worlds, New Discoveries," which includes the "Year of the Solar System" and the ongoing celebration of the "50 Years of Exploration," includes 20 topics that share thematic solar system educational resources and activities, tied to the national science standards. This online site and ongoing event offers numerous opportunities for the science community - including researchers and education and public outreach professionals - to raise awareness, build excitement, and make connections with educators, students, and the public about planetary science. Visitors to the site will find valuable hands-on science activities, resources and educational materials, as well as the latest news, to engage audiences in planetary science topics and their related mission discoveries. The topics are tied to the big questions of planetary science: how did the Sun’s family of planets and bodies originate and how have they evolved? How did life begin and evolve on Earth, and has it evolved elsewhere in our solar system? Scientists and educators are encouraged to get involved either directly or by sharing "New Worlds, New Discoveries" and its resources with educators, by conducting presentations and events, sharing their resources and events to add to the site, and adding their own public events to the site’s event calendar! Visit to find quality resources and ideas. Connect with educators, students and the public to
Meng, Jia; Gao, Shou-Jiang; Huang, Yufei
An algorithm for the discovery of time-varying modules using genome-wide expression data is presented here. When applied to large-scale time series data, our method is designed to discover not only the transcription modules but also their timing information, which is rarely annotated by existing approaches. Rather than assuming the commonly defined time-constant transcription modules, a module is depicted as a set of genes that are co-regulated during a specific period of time, i.e., a time-dependent transcription module (TDTM). A rigorous mathematical definition of TDTM is provided, which serves as an objective function for retrieving modules. Based on the definition, an effective signature algorithm is proposed that iteratively searches for transcription modules in the time series data. The proposed method was tested on simulated systems and applied to human time series microarray data obtained during Kaposi's sarcoma-associated herpesvirus (KSHV) infection. The result has been verified by Expression Analysis Systematic Explorer.
Calus Mario PL
Full Text Available Abstract Background Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Methods Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Results Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Conclusions
Calus, Mario P L; Mulder, Han A; Bastiaansen, John W M
Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Tests to remove Mendelian inconsistencies between sibs should
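The counting step shared by the PAR-OFF and SIBCOUNT tests can be sketched as follows. This is only a minimal illustration: the 0/1/2 allele-count coding, the use of None for missing calls, and the function name are assumptions, and the iterative removal of animals and the thresholds used in the study are not reproduced.

```python
def count_opposing_homozygotes(geno_a, geno_b):
    """Count loci where one animal is homozygous for one allele and the other
    animal is homozygous for the opposite allele.

    Genotypes are coded as copies of an arbitrary reference allele (0, 1 or 2);
    None marks a missing call and the locus is then skipped.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must cover the same loci")
    count = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue
        if {a, b} == {0, 2}:
            count += 1
    return count


sire = [0, 2, 1, 2, 0, None]
calf = [2, 2, 1, 0, 1, 0]
print(count_opposing_homozygotes(sire, calf))  # -> 2 (first and fourth locus)
```

For a true parent-offspring pair, any opposing homozygote points to a genotyping or pedigree error. True sibs can legitimately be opposing homozygotes at some loci (for example when both parents are heterozygous), which is why a sib test has to compare the count with its expected distribution rather than with zero.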
Rapid, economical single-nucleotide polymorphism and microsatellite discovery based on de novo assembly of a reduced representation genome in a non-model organism: a case study of Atlantic cod Gadus morhua.
Carlsson, J; Gauthier, D T; Carlsson, J E L; Coughlan, J P; Dillane, E; Fitzgerald, R D; Keating, U; McGinnity, P; Mirimin, L; Cross, T F
By combining next-generation sequencing technology (454) and reduced representation library (RRL) construction, the rapid and economical isolation of over 25 000 potential single-nucleotide polymorphisms (SNP) and >6000 putative microsatellite loci from c. 2% of the genome of the non-model teleost, Atlantic cod Gadus morhua from the Celtic Sea, south of Ireland, was demonstrated. A small-scale validation of markers indicated that 80% (11 of 14) of SNP loci and 40% (6 of 15) of the microsatellite loci could be amplified and showed variability. The results clearly show that small-scale next-generation sequencing of RRL genomes is an economical and rapid approach for simultaneous SNP and microsatellite discovery that is applicable to any species. The low cost and relatively small investment in time allows for positive exploitation of ascertainment bias to design markers applicable to specific populations and study questions.
Lu, Haitao; Zhang, Tong; Wen, Mei; Sun, Li
Background Little is known about the effects of low-frequency repetitive transcranial magnetic stimulation (rTMS) on dysmnesia and the impact of the brain-derived neurotrophic factor (BDNF) Val66Met single-nucleotide polymorphism (SNP). This study investigated the impact of low-frequency rTMS on post-stroke dysmnesia and the impact of the BDNF Val66Met SNP. Material/Methods Forty patients with post-stroke dysmnesia were prospectively randomized into the rTMS and sham groups. BDNF Val66Met SNP was ...
... PENALTIES ACT OF 1986 § 1264.120 Discovery. (a) The following types of discovery are authorized: (1... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Discovery. 1264.120 Section 1264.120..., discovery is available only as ordered by the presiding officer. The presiding officer shall regulate...
... IMPLEMENTING THE PROGRAM FRAUD CIVIL REMEDIES ACT § 42.21 Discovery. (a) The following types of discovery are... 38 Pensions, Bonuses, and Veterans' Relief 2 2010-07-01 2010-07-01 false Discovery. 42.21 Section... creation of a document. (c) Unless mutually agreed to by the parties, discovery is available only...
... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon...
... COMMERCE RULES OF PRACTICE IN TRADEMARK CASES Procedure in Inter Partes Proceedings § 2.120 Discovery. (a... develop a disclosure and discovery plan, the scope, timing and sequence of discovery, protective orders... foreign countries. (1) The discovery deposition of a natural person residing in a foreign country who is...
Lopes, M S; Bastiaansen, J W M; Janss, Luc
The contributions of additive, dominance and imprinting effects to the variance of number of teats (NT) were evaluated in two purebred pig populations using SNP markers. Three different random regression models were evaluated, accounting for the mean and: 1) additive effects (MA), 2) additive...... and dominance effects (MAD) and 3) additive, dominance and imprinting effects (MADI). Additive heritability estimates were 0.30, 0.28 and 0.27-0.28 in both lines using MA, MAD and MADI, respectively. Dominance heritability ranged from 0.06 to 0.08 using MAD and MADI. Imprinting heritability ranged from 0.......01 to 0.02. Dominance effects make an important contribution to the genetic variation of NT in the two lines evaluated. Imprinting effects appeared less important for NT than additive and dominance effects. The SNP random regression model presented and evaluated in this study is a feasible approach...
Wu, Xiaoping; Lund, Mogens S; Sahana, Goutam;
for mastitis traits: 54 k markers of a medium-density SNP (single nucleotide polymorphism) chip (MD), imputed 777 k markers of a high-density SNP chip (HD), and imputed whole-genome sequencing data (SEQ). Each dataset contained data for 4496 Danish Holstein cattle. Comparisons were performed using a linear...... when tested using the same statistical model. With the LM model, 120 (MD), 967 (HD), and 7209 (SEQ) SNPs were significantly associated with mastitis, whereas with the BVS model, 43 (MD), 131 (HD), and 1052 (SEQ) significant SNPs (Bayes factor > 3.2) were observed. A total of 26 (MD), 75 (HD), and 465......, LIFR, and EDN3 may be considered as candidate genes for mastitis susceptibility....
Full Text Available The single nucleotide polymorphism (SNP) rs13438494 in intron 24 of PCLO was significantly associated with bipolar disorder in a meta-analysis of genome-wide association studies. In this study, we performed functional minigene analysis and bioinformatics prediction of splicing regulatory sequences to characterize the deep intronic SNP rs13438494. We constructed minigenes with A and C alleles containing exon 24, intron 24, and exon 25 of PCLO to assess the genetic effect of rs13438494 on splicing. We found that the C allele of rs13438494 reduces the splicing efficiency of the PCLO minigene. In addition, prediction analysis of enhancer/silencer motifs using the Human Splice Finder web tool indicated that rs13438494 induces the abrogation or creation of such binding sites. Our results indicate that rs13438494 alters splicing efficiency by creating or disrupting a splicing motif, which functions by binding of splicing regulatory proteins, and may ultimately result in bipolar disorder in affected people.
Uncu, Ali Tevfik; Frary, Anne; Doganlar, Sami
The aim of this study was to establish a DNA-based identification key to ascertain the cultivar origin of Turkish monovarietal olive oils. To reach this aim, we sequenced short fragments from five olive genes for SNP (single nucleotide polymorphism) identification and developed CAPS (cleaved amplified polymorphic DNA) assays for SNPs that alter restriction enzyme recognition motifs. When applied on the oils of 17 olive cultivars, a maximum of five CAPS assays were necessary to discriminate the varietal origin of the samples. We also tested the efficiency and limit of our approach for detecting olive oil admixtures. As a result of the analysis, we were able to detect admixing down to a limit of 20%. The SNP-based CAPS assays developed in this work can be used for testing and verification of the authenticity of Turkish monovarietal olive oils, for olive tree certification, and in germplasm characterization and preservation studies.
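The idea behind a CAPS assay, a SNP whose alleles differ in whether a restriction enzyme recognition motif is present, can be sketched as below. The sequences are invented for illustration and are not olive gene fragments; the function names are hypothetical. EcoRI (recognition site GAATTC) is used only as a familiar example enzyme, not one named in the abstract.

```python
def site_count(seq, site):
    """Number of (possibly overlapping) occurrences of site in seq, forward strand only."""
    seq, site = seq.upper(), site.upper()
    return sum(
        1 for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site
    )


def caps_informative(flank5, alleles, flank3, site):
    """Return per-allele site counts and whether the alleles differ in site count.

    flank5 and flank3 are the sequences immediately upstream and downstream of the SNP.
    """
    counts = {allele: site_count(flank5 + allele + flank3, site) for allele in alleles}
    return counts, len(set(counts.values())) > 1


# Invented example: the T allele completes an EcoRI site, the C allele does not.
counts, informative = caps_informative("ACGTGAAT", ("T", "C"), "CGGTA", "GAATTC")
print(counts, informative)  # -> {'T': 1, 'C': 0} True
```

Searching only the forward strand is sufficient for palindromic sites such as GAATTC; for a non-palindromic recognition sequence the reverse complement would also have to be scanned.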
Shen, Terry H; Tarczy-Hornoch, Peter; Detwiler, Landon T; Cadag, Eithon; Carlson, Christopher S
Genome wide association studies (GWAS) are an important approach to understanding the genetic mechanisms behind human diseases. Single nucleotide polymorphisms (SNPs) are the predominant markers used in genome wide association studies, and the ability to predict which SNPs are likely to be functional is important for both a priori and a posteriori analyses of GWA studies. This article describes the design, implementation and evaluation of a family of systems for the purpose of identifying SNPs that may cause a change in phenotypic outcomes. The methods described in this article characterize the feasibility of combinations of logical and probabilistic inference with federated data integration for both point and regional SNP annotation and analysis. Evaluations of the methods demonstrate the overall strong predictive value of logical, and logical with probabilistic, inference applied to the domain of SNP annotation.
In addition to genetic testing techniques, the embryo biopsy stage (polar body, cleavage embryo or blastocyst) and the mode of embryo transfer (fresh or frozen embryos) can affect the outcome of PGD. It is now generally recommended that blastomere biopsy should be replaced by blastocyst biopsy to avoid a high mosaic rate and biopsy-related damage to cleavage-stage embryos, which might affect embryo development. However, more clinical data are required to confirm that the technique of SNP array-based PGD (SNP-PGD) combined with trophectoderm (TE) biopsy and frozen embryo transfer (FET) is superior to traditional FISH-PGD combined with Day 3 (D3) blastomere biopsy and fresh embryo transfer.
Malhis, Nawar; Butterfield, Yaron S N; Ester, Martin; Jones, Steven J M
A plethora of alignment tools have been created that are designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none of these are fully utilizing its probability (prb) output. In this article, we will introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files. Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality.
Data from the international HapMap project were mined to determine if the degree of genetic differentiation (Fst) is dependent on single nucleotide polymorphism (SNP) category. The Fst statistic was evaluated across all SNPs for each of 30 genes and for each of five chromosomes. A consistent decrease in diversity between Europeans and Africans was seen for nonsynonymous coding region SNPs compared to the three other SNP categories: synonymous SNPs, UTR, and intronic SNPs. This suggests an effect of balancing selection in reducing interpopulation genetic diversity at sites that would be expected to influence phenotype and therefore be subject to selection. This result is inconsistent with the concept of large population specific genetic differences that could have applications in "racialized medicine."
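For readers unfamiliar with the statistic, a worked example of Fst for a single biallelic SNP in two populations follows. The abstract does not state which estimator was used; this sketch assumes the basic heterozygosity-based form Fst = (H_T - H_S) / H_T with equal population weights and no sample-size correction, and the function name is hypothetical.

```python
def fst_biallelic(p1, p2):
    """Heterozygosity-based Fst for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in population 1 and 2.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        # Both populations are fixed for the same allele; Fst is undefined,
        # and this sketch reports 0.0 for that case.
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return (h_t - h_s) / h_t


print(round(fst_biallelic(0.2, 0.8), 2))  # -> 0.36
print(fst_biallelic(0.5, 0.5))            # -> 0.0 (identical frequencies)
print(fst_biallelic(0.0, 1.0))            # -> 1.0 (fixed for different alleles)
```

With allele frequencies of 0.2 and 0.8, H_T = 0.5 and H_S = 0.32, giving Fst = 0.18 / 0.5 = 0.36. Lower values for a SNP category mean that allele frequencies in the two populations are more alike for that category.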
Background Technological advances have led to a rapid increase in the availability of single nucleotide polymorphisms (SNPs) in a range of organisms, and there is a general optimism that SNPs will become the marker of choice for a range of evolutionary applications. Here, comparisons between 300 polymorphic SNPs and 14 short tandem repeats (STRs) were conducted on a data set consisting of approximately 500 Atlantic salmon arranged in 10 samples/populations. Results Global FST ranged from 0.033-0.115 and -0.002-0.316 for the 14 STR and 300 SNP loci respectively. Global FST was similar among 28 linkage groups when averaging data from mapped SNPs. With the exception of selecting a panel of SNPs taking the locus displaying the highest global FST for each of the 28 linkage groups, which inflated estimation of genetic differentiation among the samples, inferred genetic relationships were highly similar between SNP and STR data sets and variants thereof. The best 15 SNPs (30 alleles) gave a similar level of self-assignment to the best 4 STR loci (83 alleles); however, addition of further STR loci did not lead to a notable increase in assignment, whereas addition of up to 100 SNP loci increased assignment. Conclusion Whilst the optimal combinations of SNPs identified in this study are linked to the samples from which they were selected, this study demonstrates that identification of highly informative SNP loci from larger panels will provide researchers with a powerful approach to delineate genetic relationships at the individual and population levels. PMID:20051144
Rudy M Jonker
Full Text Available Migratory birds are of particular interest for population genetics because of the high connectivity between habitats and populations. A high degree of connectivity requires using many genetic markers to achieve the required statistical power, and a genome wide SNP set can fit this purpose. Here we present the development of a genome wide SNP set for the Barnacle Goose Branta leucopsis, a model species for the study of bird migration. We used the genome of a different waterfowl species, Mallard Anas platyrhynchos, as a reference to align Barnacle Goose second generation sequence reads from an RRL library and detected 2188 SNPs genome wide. Furthermore, we used chimeric flanking sequences, merged from both Mallard and Barnacle Goose DNA sequence information, to create primers for validation by genotyping. Validation with a 384 SNP genotyping set resulted in 374 (97%) successfully typed SNPs in the assay, of which 358 (96%) were polymorphic. Additionally, we validated our SNPs on relatively old (30 years) museum samples, which resulted in a success rate of at least 80%. This shows that museum samples could be used in standard SNP genotyping assays. Our study also shows that the genome of a related species can be used as reference to detect genome wide SNPs in birds, because genomes of birds are highly conserved. This is illustrated by the use of chimeric flanking sequences, which showed that the incorporation of flanking nucleotides from Mallard into Barnacle Goose sequences led to equal genotyping performance when compared to flanking sequences solely composed of Barnacle Goose sequence.
Vulpe Chris D
Abstract Background We report our experience of selecting tag SNPs in 35 genes involved in iron metabolism in a cohort study seeking to discover genetic modifiers of hereditary hemochromatosis. Methods We combined our own and publicly available resequencing data with HapMap to maximise our coverage to select 384 SNPs in candidate genes suitable for typing on the Illumina platform. Results Validation/design scores above 0.6 were not strongly correlated with SNP performance as estimated by Gentrain score. We contrasted results from two tag SNP selection algorithms, LDselect and Tagger. Varying r2 from 0.5 to 1.0 produced a near linear correlation with the number of tag SNPs required. We examined the pattern of linkage disequilibrium of three levels of resequencing coverage for the transferrin gene and found HapMap phase 1 tag SNPs capture 45% of the ≥ 3% MAF SNPs found in SeattleSNPs where there is nearly complete resequencing. Resequencing can reveal adjacent SNPs (within 60 bp) which may affect assay performance. We report the number of SNPs present within the region of six of our larger candidate genes, for different versions of stock genotyping assays. Conclusion A candidate gene approach should seek to maximise coverage, and this can be improved by adding to HapMap data any available sequencing data. Tag SNP software must be fast and flexible to data changes, since tag SNP selection involves iteration as investigators seek to satisfy the competing demands of coverage within and between populations, and typability on the technology platform chosen.
Chromosomal DNA is characterized by variation between individuals at the level of entire chromosomes (e.g., aneuploidy in which the chromosome copy number is altered), segmental changes (including insertions, deletions, inversions, and translocations), and changes to small genomic regions (including single nucleotide polymorphisms). A variety of alterations that occur in chromosomal DNA, many of which can be detected using high density single nucleotide polymorphism (SNP)...
Background: Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set o...
Abstract Background With improvements in genotyping technologies, genome-wide association studies with hundreds of thousands of SNPs allow the identification of candidate genetic loci for multifactorial diseases in different populations. However, genotyping errors caused by genotyping platforms or genotype calling algorithms may lead to inflation of false associations between markers and phenotypes. In addition, the number of SNPs available for genome-wide association studies in the Japanese population has been investigated using only 45 samples in the HapMap project, which could lead to an inaccurate estimation of the number of SNPs with low minor allele frequencies. We genotyped 400 Japanese samples in order to estimate the number of SNPs available for genome-wide association studies in the Japanese population and to examine the performance of the current SNP Array 6.0 platform and the genotype calling algorithm "Birdseed". Results About 20% of the 909,622 SNP markers on the array were revealed to be monomorphic in the Japanese population. Consequently, 661,599 SNPs were available for genome-wide association studies in the Japanese population, after excluding the poorly behaving SNPs. The Birdseed algorithm accurately determined the genotype calls of each sample with a high overall call rate of over 99.5% and a high concordance rate of over 99.8% using more than 48 samples after removing low-quality samples by adjusting QC criteria. Conclusion Our results confirmed that the SNP Array 6.0 platform reached the level reported by the manufacturer, and thus genome-wide association studies using the SNP Array 6.0 platform have considerable potential to identify candidate susceptibility or resistance genetic factors for multifactorial diseases in the Japanese population, as well as in other populations.
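The two quality metrics quoted above, call rate and concordance rate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used with Birdseed: genotypes are represented as strings such as "AA", "AB" or "BB", a no-call is represented as `None`, and concordance is computed only over markers called in both replicates. The function names are hypothetical.

```python
from typing import Optional, Sequence

Genotype = Optional[str]  # e.g. "AA", "AB", "BB"; None marks a no-call


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of markers that received a genotype call."""
    if not calls:
        raise ValueError("no markers supplied")
    return sum(g is not None for g in calls) / len(calls)


def concordance_rate(first: Sequence[Genotype], second: Sequence[Genotype]) -> float:
    """Fraction of identical calls among markers called in both replicates."""
    if len(first) != len(second):
        raise ValueError("replicates must cover the same markers")
    # Keep only markers where both replicates produced a call.
    pairs = [(a, b) for a, b in zip(first, second) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no markers were called in both replicates")
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    rep1 = ["AA", "AB", None, "BB"]
    rep2 = ["AA", "BB", "AB", "BB"]
    print(call_rate(rep1), concordance_rate(rep1, rep2))
```

Whether no-calls count against concordance is a design choice; excluding them, as here, keeps the two metrics independent.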
Groeneveld, Eildert; Lichtenberg, Helmut
The fast development of high throughput genotyping has opened up new possibilities in genetics while at the same time producing considerable data handling issues. TheSNPpit is a database system for managing large amounts of multi panel SNP genotype data from any genotyping platform. With an increasing rate of genotyping in areas like animal and plant breeding as well as human genetics, hundreds of thousands of individuals already need to be managed. While the common database design with on...
Michael A Eberle
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (lambda approximately 1.8-2.0). Relative risks as low as lambda approximately 1.1-1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power to detect at least one locus increases significantly, such that relative risks 20%-35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
Xu, Zongli; Kaplan, Norman L; Taylor, Jack A
HapMap provides linkage disequilibrium (LD) information on a sample of 3.7 million SNPs that can be used for tag SNP selection in whole-genome association studies. HapMap can also be used for tag SNP selection in candidate genes, although its performance has yet to be evaluated against gene resequencing data, where there is near-complete SNP ascertainment. The Environmental Genome Project (EGP) is the largest gene resequencing effort to date with over 500 resequenced genes. We used HapMap data to select tag SNPs and calculated the proportions of common SNPs (MAF ≥ 0.05) tagged (ρ² ≥ 0.8) for each of 127 EGP Panel 2 genes where individual ethnic information was available. Median gene-tagging proportions are 50, 80 and 74% for African, Asian, and European groups, respectively. These low gene-tagging proportions may be problematic for some candidate gene studies. In addition, although HapMap targeted nonsynonymous SNPs (nsSNPs), we estimate only approximately 30% of nonsynonymous SNPs in EGP are in high LD with any HapMap SNP. We show that gene-tagging proportions can be improved by adding a relatively small number of tag SNPs that were selected based on resequencing data. We also demonstrate that ethnic-mixed data can be used to improve HapMap gene-tagging proportions, but are not as efficient as ethnic-specific data. Finally, we generalized the greedy algorithm proposed by Carlson et al (2004) to select tag SNPs for multiple populations and implemented the algorithm into a freely available software package mPopTag.
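The greedy idea behind threshold-based tag SNP selection can be sketched for a single population as follows. This is an illustrative simplification, not the mPopTag implementation or its multi-population generalization: at each step it picks the SNP that covers the most still-untagged SNPs at or above the LD threshold, then removes those SNPs from further consideration. The function name, the dictionary layout for pairwise LD and the example values are all assumptions.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_tag_snps(
    snps: List[str],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Greedy single-population tag SNP selection (illustrative sketch).

    snps: SNP identifiers; input order is used to break ties.
    r2: pairwise LD keyed by frozenset({snp_a, snp_b}); missing pairs count as 0.
    """
    untagged: Set[str] = set(snps)
    tags: List[str] = []

    def covered_by(candidate: str) -> Set[str]:
        # A SNP tags itself plus every untagged SNP in LD at or above the threshold.
        return {
            s
            for s in untagged
            if s == candidate or r2.get(frozenset((s, candidate)), 0.0) >= threshold
        }

    while untagged:
        # max() returns the first maximal element, so ties follow input order.
        best = max(snps, key=lambda c: len(covered_by(c)))
        tags.append(best)
        untagged -= covered_by(best)
    return tags


if __name__ == "__main__":
    ld = {
        frozenset(("a", "b")): 0.90,
        frozenset(("b", "c")): 0.85,
        frozenset(("a", "c")): 0.50,
    }
    print(greedy_tag_snps(["a", "b", "c", "d"], ld))
```

The loop always terminates because every untagged SNP covers at least itself, so each iteration removes at least one SNP from the untagged set.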
Yang Mary Qu; Chen Zhongxue; Yang Jack; Liu Qingzhong; Sung Andrew H; Huang Xudong
Abstract Background Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information...
Edwards, B.C. [Los Alamos National Lab., NM (United States)]; Chyba, C.F. [Univ. of Arizona, Tucson, AZ (United States)]; Abshire, J.B. [National Aeronautics and Space Administration, Greenbelt, MD (United States). Goddard Space Flight Center] [and others]
Since it was first proposed that tidal heating of Europa by Jupiter might lead to liquid water oceans below Europa's ice cover, there has been speculation over the possible exobiological implications of such an ocean. Liquid water is the essential ingredient for life as it is known, and the existence of a second water ocean in the Solar System would be of paramount importance for seeking the origin and existence of life beyond Earth. The authors present here a Discovery-class mission concept (Europa Ocean Discovery) to determine the existence of a liquid water ocean on Europa and to characterize Europa's surface structure. The technical goal of the Europa Ocean Discovery mission is to study Europa with an orbiting spacecraft. This goal is challenging but entirely feasible within the Discovery envelope. There are four key challenges: entering Europan orbit, generating power, surviving long enough in the radiation environment to return valuable science, and completing the mission within the Discovery program's launch vehicle and budget constraints. The authors will present here a viable mission that meets these challenges.
Hansen, C. J.; Paige, D. A.
The Mars Exploration Program should consider following the Discovery Program model. In the Discovery Program a team of scientists led by a PI develop the science goals of their mission, decide what payload achieves the necessary measurements most effectively, and then choose a spacecraft with the capabilities needed to carry the payload to the desired target body. The primary constraints associated with the Discovery missions are time and money. The proposer must convince reviewers that their mission has scientific merit and is feasible. Every Announcement of Opportunity has resulted in a collection of creative ideas that fit within advertised constraints. Following this model, a "Mars Discovery Program" would issue an Announcement of Opportunity for each launch opportunity with schedule constraints dictated by the launch window and fiscal constraints in accord with the program budget. All else would be left to the proposer to choose, based on the science the team wants to accomplish, consistent with the program theme of "Life, Climate and Resources". A proposer could propose a lander, an orbiter, a fleet of SCOUT vehicles or penetrators, an airplane, a balloon mission, a large rover, a small rover, etc. depending on what made the most sense for the science investigation and payload. As in the Discovery program, overall feasibility relative to cost, schedule and technology readiness would be evaluated and be part of the selection process.
Abstract Background Suicide and major depressive disorders (MDD) are strongly associated, and genetic factors are responsible for at least part of the variability in suicide risk. We investigated whether variation at the tryptophan hydroxylase-2 (TPH2) gene rs7305115 SNP may predispose to suicide attempts in MDD. Methods We genotyped TPH2 gene rs7305115 SNP in 215 MDD patients with suicide and matched MDD patients without suicide. Differences in behavioral and personality traits according to genotypic variation were investigated by logistic regression analysis. Results There were no significant differences between MDD patients with suicide and controls in genotypic (AG and GG) frequencies for rs7305115 SNP, but the distribution of AA genotype differed significantly (14.4% vs. 29.3%, p p p Conclusions The study suggested that hopelessness, negative life events and family history of suicide were risk factors of attempted suicide in MDD while the TPH2 rs7305115A remained a significant protective predictor of suicide attempts.
Xiaoman Wo; Dong Han; Haiming Sun; Yang Liu; Xiangning Meng; Jing Bai; Feng Chen
The potentially functional polymorphism, SNP309, in the promoter region of the MDM2 gene has been implicated in cancer risk, but individual published studies showed inconclusive results. To obtain a more precise estimate of the association between MDM2 SNP309 and risk of cancer, we performed a meta-analysis of 70 individual studies in 59 publications that included 26,160 cases with different types of tumors and 33,046 controls. Summary odds ratios (OR) and corresponding 95% confidence intervals (CIs) were estimated using fixed- and random-effects models when appropriate. Overall, the variant genotypes were associated with a significantly increased cancer risk for all cancer types in different genetic models (GG vs. TT: OR, 1.123; 95% CI, 1.056-1.193; GG/GT vs. TT: OR, 1.028; 95% CI, 1.006-1.050). In the stratified analyses, the increased risk remained for the studies of most types of cancers, Asian populations, and hospital-/population-based studies in different genetic models, whereas significantly decreased risk was found in prostate cancer (GG vs. TT: OR, 0.606; 95% CI, 0.407-0.903; GG/GT vs. TT: OR, 0.748; 95% CI, 0.579-0.968). In conclusion, the meta-analysis data suggest that MDM2 SNP309 is a potential biomarker for cancer risk.
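A summary odds ratio under a fixed-effects model is commonly obtained by inverse-variance weighting of per-study log odds ratios. The sketch below shows that calculation for 2x2 genotype tables; it is an assumption about the general technique, since the abstract does not state which fixed-effects method was applied, and the counts in the example are invented. The random-effects variant mentioned in the abstract is not covered.

```python
import math
from typing import Iterable, Tuple

# (variant cases, reference cases, variant controls, reference controls)
Table = Tuple[int, int, int, int]


def log_or_and_variance(table: Table) -> Tuple[float, float]:
    """Log odds ratio and its approximate variance (Woolf method) for one study."""
    a, b, c, d = table
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive in this simple sketch")
    return math.log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def pooled_or_fixed(tables: Iterable[Table]) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effects pooled OR with a 95% confidence interval."""
    weight_sum = 0.0
    weighted_log_or = 0.0
    for table in tables:
        log_or, variance = log_or_and_variance(table)
        weight = 1.0 / variance  # more precise studies get more weight
        weight_sum += weight
        weighted_log_or += weight * log_or
    if weight_sum == 0.0:
        raise ValueError("no studies supplied")
    pooled = weighted_log_or / weight_sum
    se = math.sqrt(1.0 / weight_sum)
    return math.exp(pooled), math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)


if __name__ == "__main__":
    # Invented counts for two hypothetical studies comparing GG with TT carriers.
    print(pooled_or_fixed([(20, 80, 10, 90), (35, 65, 30, 70)]))
```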
Lopes, Joao S; Marques, Isabel; Soares, Patricia; Nebenzahl-Guimaraes, Hanna; Costa, Joao; Miranda, Anabela; Duarte, Raquel; Alves, Adriana; Macedo, Rita; Duarte, Tonya A; Barbosa, Theolis; Oliveira, Martha; Nery, Joilda S; Boechat, Neio; Pereira, Susan M; Barreto, Mauricio L; Pereira-Leal, Jose; Gomes, Maria Gabriela Miranda; Penha-Goncalves, Carlos
Human tuberculosis is an infectious disease caused by bacteria from the Mycobacterium tuberculosis complex (MTBC). Although spoligotyping and MIRU-VNTR are standard methodologies in MTBC genetic epidemiology, recent studies suggest that Single Nucleotide Polymorphisms (SNP) are advantageous in phylogenetics and strain group/lineage identification. In this work we use a set of 79 SNPs to characterize 1987 MTBC isolates from Portugal and 141 from Northeast Brazil. All Brazilian samples were further characterized using spoligotyping. Phylogenetic analysis against a reference set revealed that about 95% of the isolates in both populations are singly attributed to bacterial lineage 4. Within this lineage, the most frequent strain groups in both Portugal and Brazil are LAM, followed by Haarlem and X. Contrary to these groups, strain group T showed a very different prevalence between Portugal (10%) and Brazil (1.5%). Spoligotype identification shows about 10% mismatches compared to the use of SNPs and leaves a little more than 1% of strains unidentified. The mismatches are observed in the most represented groups of our sample set (i.e., LAM and Haarlem) in almost the same proportion. Besides being more accurate in identifying strain groups/lineages, SNP-typing can also provide phylogenetic relationships between strain groups/lineages and, thus, indicate cases showing phylogenetic incongruence. Overall, the use of SNP-typing revealed striking similarities between MTBC populations from Portugal and Brazil.
Ochiai, Eriko; Minaguchi, Kiyoshi; Nambiar, Phrabhakaran; Kakimoto, Yu; Satoh, Fumiko; Nakatome, Masato; Miyashita, Keiko; Osawa, Motoki
The Y chromosomal haplogroup determined from single nucleotide polymorphism (SNP) combinations is a valuable genetic marker to study ancestral male lineage and ethnic distribution. Next-generation sequencing has been developed for widely diverse genetics fields. For this study, we demonstrate 34 Y-SNP typing employing the Ion PGM™ system to perform haplogrouping. DNA libraries were constructed using the HID-Ion AmpliSeq™ Identity Panel. Emulsion PCR was performed, then DNA sequences were analyzed on the Ion 314 and 316 Chip Kit v2. Some difficulties became apparent during the analytic processes. No-call was reported at rs2032599 and M479 in six samples, in which the least coverage was observed at M479. A minor misreading occurred at rs2032631 and M479. A real time PCR experiment using other pairs of oligonucleotide primers showed that these events might result from the flanking sequence. Finally, Y haplogroup was determined completely for 81 unrelated males including Japanese (n=59) and Malay (n=22) subjects. The allelic divergence differed between the two populations. In comparison with the conventional Sanger method, next-generation sequencing provides a comprehensive SNP analysis with convenient procedures, but further system improvement is necessary. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Abstract Background Diagnostic analysis of patients with developmental disorders has improved over recent years largely due to the use of microarray technology. Array methods that facilitate copy number analysis have enabled the diagnosis of up to 20% more patients with previously normal karyotyping results. A substantial number of patients remain undiagnosed, however. Methods and Results Using the Genome-Wide Human SNP array 6.0, we analyzed 35 patients with a developmental disorder of unknown cause and normal array comparative genomic hybridization (array CGH) results, in order to characterize previously undefined genomic aberrations. We detected no seemingly pathogenic copy number aberrations. Most of the vast amount of data produced by the array was polymorphic and non-informative. Filtering of this data, based on copy number variant (CNV) population frequencies as well as phenotypically relevant genes, enabled pinpointing regions of allelic homozygosity that included candidate genes correlating to the phenotypic features in four patients, but results could not be confirmed. Conclusions In this study, the use of an ultra high-resolution SNP array did not contribute to further diagnosis of patients with developmental disorders of unknown cause. The statistical power of these results is limited by the small size of the patient cohort, and interpretation of these negative results can only be applied to the patients studied here. We present the results of our study and the recurrence of clustered allelic homozygosity present in this material, as detected by the SNP 6.0 array.
MAHA REBAĨ; AHMED REBAĨ
Single-nucleotide polymorphism (SNP) association studies have become crucial in uncovering the genetic correlations of genomic variants with complex diseases, quantitative traits and physiological responses to drugs. However, the identification of SNPs responsible for specific phenotypes is a difficult problem to solve, requiring multiple testing of hundreds or thousands of SNPs in candidate genes. In this study, we performed an analysis of the genetic variations that can alter the structure and function of oestrogen receptor α using different computational tools. Among the nonsynonymous SNPs, a total of four SNPs were found to be damaging by both a sequence homology-based tool (SIFT) and structural homology-based methods (PolyPhen-2, SNAP), as well as by the ESEfinder program, and one nonsense nsSNP was found. For noncoding SNPs, we found that one SNP in the 5'UTR may potentially change protein expression level, nine SNPs were found to affect miRNA binding sites and 28 SNPs might affect transcriptional regulation of the ESR1 gene. Reviewing the literature, 89 SNPs were found to be functional, among which only four were located in exons.
Keren, Boris; Chantot-Bastaraud, Sandra; Brioude, Frédéric; Mach, Corinne; Fonteneau, Eric; Azzi, Salah; Depienne, Christel; Brice, Alexis; Netchine, Irène; Le Bouc, Yves; Siffroi, Jean-Pierre; Rossignol, Sylvie
Beckwith-Wiedemann syndrome is an overgrowth disorder with an increased risk of childhood tumors that results from a dysregulation of imprinted gene expression in the 11p15 region. Since epigenetic defects are the most frequent anomalies, first-line diagnostic methods involve methylation analysis. When paternal isodisomy is suspected, it should be confirmed by a second technique capable of distinguishing true 11p15 paternal disomy (patUPD) from paternal 11p15 duplication or 11p15 trisomy. We sought to evaluate the value of using SNP arrays in the Beckwith-Wiedemann syndrome diagnostic strategy. We analyzed the SNP profiles of 25 Beckwith-Wiedemann patients with previously determined methylation indexes. Among them, 3 had 11p15 trisomies, 13 had patUPD, 8 had an inconclusive methylation index and 1 had a normal result. All known trisomies and known patUPDs were detected. Moreover, we found 7 low-rate mosaic 11p15 patUPDs among the 8 patients with an inconclusive methylation index. We were able to precisely characterize the sizes and mosaicism rates of the anomalies. We demonstrate that SNP arrays are of real diagnostic value in Beckwith-Wiedemann syndrome: 1) they help to distinguish patUPDs from trisomies more precisely than karyotyping and FISH, 2) they help determine the size and mosaicism rate of patUPDs, 3) they provide complementary information in inconclusive cases, helping to distinguish low-rate patUPD mosaicism from other BWS-related molecular defects.
Esteras, Cristina; Formisano, Gelsomina; Roig, Cristina; Díaz, Aurora; Blanca, José; Garcia-Mas, Jordi; Gómez-Guillamón, María Luisa; López-Sesé, Ana Isabel; Lázaro, Almudena; Monforte, Antonio J; Picó, Belén
Novel sequencing technologies were recently used to generate sequences from multiple melon (Cucumis melo L.) genotypes, enabling the in silico identification of large single nucleotide polymorphism (SNP) collections. In order to optimize the use of these markers, SNP validation and large-scale genotyping are necessary. In this paper, we present the first validated design for a genotyping array with 768 SNPs that are evenly distributed throughout the melon genome. This customized Illumina GoldenGate assay was used to genotype a collection of 74 accessions, representing most of the botanical groups of the species. Of the assayed loci, 91 % were successfully genotyped. The array provided a large number of polymorphic SNPs within and across accessions. This set of SNPs detected high levels of variation in accessions from this crop's center of origin as well as from several other areas of melon diversification. Allele distribution throughout the genome revealed regions that distinguished between the two main groups of cultivated accessions (inodorus and cantalupensis). Population structure analysis showed a subdivision into five subpopulations, reflecting the history of the crop. A considerably low level of LD was detected, which decayed rapidly within a few kilobases. Our results show that the GoldenGate assay can be used successfully for high-throughput SNP genotyping in melon. Since many of the genotyped accessions are currently being used as the parents of breeding populations in various programs, this set of mapped markers could be used for future mapping and breeding efforts.
Fernández Ana I
Abstract Background Recent studies in pigs have detected copy number variants (CNVs) using the Comparative Genomic Hybridization technique in arrays designed to cover specific porcine chromosomes. The goal of this study was to identify CNV regions (CNVRs) in swine species based on whole genome SNP genotyping chips. Results We used predictions from three different programs (cnvPartition, PennCNV and GADA) to analyze data from the Porcine SNP60 BeadChip. A total of 49 CNVRs were identified in 55 animals from an Iberian x Landrace cross (IBMAP) according to three criteria: detected in at least two animals, contained three or more consecutive SNPs and recalled by at least two programs. Mendelian inheritance of CNVRs was confirmed in animals belonging to several generations of the IBMAP cross. Subsequently, a segregation analysis of these CNVRs was performed in 372 additional animals from the IBMAP cross and their distribution was studied in 133 unrelated pig samples from different geographical origins. Five out of seven analyzed CNVRs were validated by real time quantitative PCR, some of which coincide with well known examples of CNVs conserved across mammalian species. Conclusions Our results illustrate the usefulness of the Porcine SNP60 BeadChip to detect CNVRs and show that structural variants cannot be neglected when studying the genetic variability in this species.
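The three acceptance criteria listed above (at least two animals, three or more consecutive SNPs, at least two calling programs) amount to a simple filter over candidate regions. The sketch below shows one way to express it; the record layout, field names and example regions are hypothetical, and merging overlapping calls into candidate regions is assumed to have happened upstream.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class CandidateCNVR:
    """A merged candidate region; all fields are illustrative assumptions."""
    name: str
    animals: FrozenSet[str]      # animals in which the region was detected
    n_consecutive_snps: int      # consecutive SNP probes spanned by the region
    programs: FrozenSet[str]     # callers that reported it


def passes_filters(
    region: CandidateCNVR,
    min_animals: int = 2,
    min_snps: int = 3,
    min_programs: int = 2,
) -> bool:
    """Apply the three criteria described in the abstract."""
    return (
        len(region.animals) >= min_animals
        and region.n_consecutive_snps >= min_snps
        and len(region.programs) >= min_programs
    )


def filter_cnvrs(regions: List[CandidateCNVR]) -> List[CandidateCNVR]:
    """Keep only regions that satisfy every criterion."""
    return [r for r in regions if passes_filters(r)]


if __name__ == "__main__":
    candidates = [
        CandidateCNVR("r1", frozenset({"pig1", "pig2"}), 5, frozenset({"PennCNV", "GADA"})),
        CandidateCNVR("r2", frozenset({"pig3"}), 8, frozenset({"PennCNV", "GADA"})),
        CandidateCNVR("r3", frozenset({"pig1", "pig4"}), 2, frozenset({"cnvPartition", "GADA"})),
    ]
    print([r.name for r in filter_cnvrs(candidates)])
```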
Mei, Xingxing; Kang, Xiangtao; Liu, Xiaojun; Jia, Lijuan; Li, Hong; Li, Zhuanjian; Jiang, Ruirui
A novel gene that was predicted to encode a long noncoding RNA (lncRNA) transcript was identified in a previous study that aimed to detect candidate genes related to growth rate differences between Chinese local breed Gushi chickens and Anka broilers. To characterise the biological function of the lncRNA, we cloned and sequenced the complete open reading frame of the gene. We performed quantitative real-time polymerase chain reaction (qPCR) to analyse the expression patterns of the lncRNA in different tissues of chicken at different development stages. The qPCR data showed that the novel lncRNA gene was expressed extensively, with the highest abundance in spleen and lung and the lowest abundance in pectoralis and leg muscle. Additionally, we identified a single nucleotide polymorphism (SNP) at the 5'-end of the gene and studied the association between the SNP and chicken growth traits using data from an F2 resource population of Gushi chickens and Anka broilers. The association analysis showed that the SNP was significantly (P chickens at 1 day, 4 weeks and 6 weeks of age. We concluded that the novel lncRNA gene, which we designated pouBW1, may play an important role in regulating chicken growth.
Baral, Aradhita; Kumar, Pankaj; Halder, Rashi; Mani, Prithvi; Yadav, Vinod Kumar; Singh, Ankita; Das, Swapan K; Chowdhury, Shantanu
Non-canonical guanine quadruplex structures are not only predominant but also conserved among bacterial and mammalian promoters. Moreover, recent findings directly implicate quadruplex structures in transcription. These argue for an intrinsic role of the structural motif and thereby posit that single nucleotide polymorphisms (SNP) that compromise the quadruplex architecture could influence function. To test this, we analysed SNPs within quadruplex motifs (Quad-SNP) and gene expression in 270 individuals across four populations (HapMap) representing more than 14,500 genotypes. Findings reveal significant association between quadruplex-SNPs and expression of the corresponding gene in individuals (P analysis of Quad-SNPs obtained from population-scale sequencing of 1000 human genomes showed relative selection bias against alteration of the structural motif. To directly test the quadruplex-SNP-transcription connection, we constructed a reporter system using the RPS3 promoter: a remarkable difference in promoter activity in the 'quadruplex-destabilized' versus 'quadruplex-intact' promoter was noticed. As a further test, we incorporated a quadruplex motif or its disrupted counterpart within a synthetic promoter reporter construct. The quadruplex motif, and not the disrupted motif, enhanced transcription in human cell lines of different origin. Together, these findings build direct support for quadruplex-mediated transcription and suggest quadruplex-SNPs may play a significant role in mechanistically understanding variations in gene expression among individuals.
Das, Ashutosh; Panitz, Frank; Holm, Lars-Erik
We sequenced the whole-genome of a Danish Jutland bull to identify genetic variants (SNP/indel). Using UnifiedGenotyper from the Genome Analysis Toolkit (GATK), we identified 6,812,198 SNPs and 804,453 indels. There were 2,598,000 (38.1%) novel SNPs and 607,923 (75.6%) novel indels while the remaining was annotated in dbSNP build 133. In-depth annotation of the variants revealed that 45,776 SNPs affected the coding sequences of 11,538 genes, 221 SNPs predicted to cause a premature stop codon, 17 to cause a gain in coding sequence and 20,828 predicted to be non-synonymous. We identified 1...,122 indels in coding sequences, 832 predicted to cause frame shift, 89 predicted to be inframe insertion and 115 to be inframe deletion. We detected a higher level of genetic variation in the Jutland bull compared to similar data from Holstein cattle...
Spaniolas, Stelios; Bazakos, Christos; Tucker, Gregory A; Bennett, Malcolm J
Recently, DNA-based authentication methods were developed to serve as complementary approaches to analytical chemistry techniques. The single nucleotide polymorphism (SNP)-based reaction chemistries, when combined with the existing detection methods, could result in numerous analytical approaches, all with particular advantages and disadvantages. The dual aim of this study was (a) to develop SNP-based analytical assays such as the single-base primer extension (SNaPShot) and pyrosequencing in order to differentiate Arabica and Robusta varieties for the authentication of coffee beans and (b) to compare the performances of SNaPshot, pyrosequencing and the previously developed polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) using an Agilent 2100 Bioanalyzer on the basis of linearity (R2) and LOD, expressed as percentage of the adulterant species, using green coffee beans (Arabica and Robusta) as a food model. The results showed that SNaPshot analysis exhibited the best LOD, whereas pyrosequencing revealed the best linearity (R2 = 0.997). The PCR-RFLP assay using the Agilent 2100 Bioanalyzer could prove to be a very useful method for a laboratory that lacks sequencing facilities but it can be used only if a SNP creates/deletes a restriction site.
Fowler Katie E
Abstract Background The ability to transport and store DNA at room temperature in low volumes has the advantage of optimising cost, time and storage space. Blood spots on adapted filter papers are popular for this, with FTA (Flinders Technology Associates) Whatman™ technology being one of the most recent. Plant material, plasmids, viral particles, bacteria and animal blood have been stored and transported successfully using this technology. However, porcine DNA extraction from FTA Whatman™ cards is a relatively new approach, and the preparation of nucleic acids in this way for downstream applications such as PCR, whole genome amplification, sequencing and subsequent application to single nucleotide polymorphism microarrays has hitherto been under-explored. Findings DNA was extracted from FTA Whatman™ cards (following adaptations of the manufacturer's instructions), whole genome amplified and subsequently analysed to validate the integrity of the DNA for downstream SNP analysis. DNA was successfully extracted from 288/288 samples and amplified by WGA. Allele dropout post WGA was observed in less than 2% of samples and there was no clear evidence of amplification bias or contamination. Acceptable call rates on porcine SNP chips were also achieved using DNA extracted and amplified in this way. Conclusions DNA extracted from FTA Whatman cards is of a high enough quality and quantity following whole genomic amplification to perform meaningful SNP chip studies.