Background: Sweet cherry (Prunus avium L.), a non-model crop with narrow genetic diversity, is an important member of sub-family Amygdoloideae within Rosaceae. Compared to other important members like peach and apple, sweet cherry lacks genetic and genomic information, impeding understanding of important biological processes and development of efficient breeding approaches. Availability of single nucleotide polymorphism (SNP)-based molecular markers can greatly benefit breeding efforts in such non-model species. RNA-seq approaches employing second generation sequencing platforms offer a unique avenue to rapidly identify gene-based SNPs. Additionally, haplotype markers can be rapidly generated from transcript-based SNPs since they have been found to be extremely useful in identification of genetic variants related to health, disease and response to environment as highlighted by the human HapMap project. Results: RNA-seq was performed on two sweet cherry cultivars, Bing and Rainier, using a 3' untranslated region (UTR) sequencing method yielding 43,396 assembled contigs. In order to test our approach of rapid identification of SNPs without any reference genome information, over 25% (10,100) of the contigs were screened for SNPs. A total of 207 contigs from this set were identified to contain high quality SNPs. A set of 223 primer pairs was designed to amplify SNP-containing regions from these contigs, and high resolution melting (HRM) analysis was performed with eight important parental sweet cherry cultivars. Six of the parent cultivars were distantly related to Bing and Rainier, the cultivars used for initial SNP discovery. Further, HRM analysis was also performed on 13 seedlings derived from a cross between two of the parents. Our analysis resulted in the identification of 84 (38.7%) primer sets that demonstrated variation among the tested germplasm. Reassembly of the raw 3'UTR sequences using upgraded transcriptome assembly software
Nayak, Spurthi N.; Zhu, Hongyan; Varghese, Nicy; Datta, Subhojit; Choi, Hong-Kyu; Horres, Ralf; Jüngling, Ruth; Singh, Jagbir; Kavi Kishor, P. B.; Sivaramakrishnan, S.; Hoisington, Dave A.; Kahl, Günter; Winter, Peter; Cook, Douglas R.
This study presents the development and mapping of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers in chickpea. The mapping population is based on an inter-specific cross between domesticated and non-domesticated genotypes of chickpea (Cicer arietinum ICC 4958 × C. reticulatum PI 489777). This same population has been the focus of previous studies, permitting integration of new and legacy genetic markers into a single genetic map. We report a set of 311 novel SSR markers (designated ICCM—ICRISAT chickpea microsatellite), obtained from an SSR-enriched genomic library of ICC 4958. Screening of these SSR markers on a diverse panel of 48 chickpea accessions provided 147 polymorphic markers with 2–21 alleles and polymorphic information content value 0.04–0.92. Fifty-two of these markers were polymorphic between parental genotypes of the inter-specific population. We also analyzed 233 previously published (H-series) SSR markers that provided another set of 52 polymorphic markers. An additional 71 gene-based SNP markers were developed from transcript sequences that are highly conserved between chickpea and its near relative Medicago truncatula. By using these three approaches, 175 new marker loci along with 407 previously reported marker loci were integrated to yield an improved genetic map of chickpea. The integrated map contains 521 loci organized into eight linkage groups that span 2,602 cM, with an average inter-marker distance of 4.99 cM. Gene-based markers provide anchor points for comparing the genomes of Medicago and chickpea, and reveal extended synteny between these two species. The combined set of genetic markers and their integration into an improved genetic map should facilitate chickpea genetics and breeding, as well as translational studies between chickpea and Medicago. Electronic supplementary material The online version of this article (doi:10.1007/s00122-010-1265-1) contains supplementary material, which is
Chickpea is an important food legume crop for the semi-arid regions; however, its productivity is adversely affected by various biotic and abiotic stresses. Identification of candidate genes associated with abiotic stress response will help breeding efforts aiming to enhance its productivity. With this objective, 10 abiotic stress responsive candidate genes were selected on the basis of prior knowledge of this complex trait. These 10 genes were subjected to allele specific sequencing across a chickpea reference set comprising 300 genotypes including 211 accessions of the chickpea mini core collection. A total of 1.3 Mbp of sequence data were generated. Multiple sequence alignment revealed 79 SNPs and 41 indels in nine genes, while the CAP2 gene was found to be conserved across all the genotypes. Among the ten candidate genes, the maximum number of SNPs (34) was observed in the abscisic acid stress and ripening (ASR) gene, including 22 transitions, 11 transversions and one tri-allelic SNP. Nucleotide diversity varied from 0.0004 to 0.0029, while PIC values ranged from 0.01 (AKIN gene) to 0.43 (CAP2 promoter). Haplotype analysis revealed that alleles were represented by more than two haplotype blocks, except alleles of the CAP2 and sucrose synthase (SuSy) genes, where only one haplotype was identified. These genes can be used for association analysis and, if validated, may be useful for enhancing abiotic stress tolerance, including drought tolerance, through molecular breeding.
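The PIC (polymorphic information content) values quoted above can be illustrated with a small calculation. The abstract does not state which formula was used; the sketch below assumes the widely used Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over pairs of 2 p_i^2 p_j^2, and the function name `pic` and the example frequencies are purely illustrative.

```python
def pic(freqs):
    """Polymorphic information content for one locus.

    freqs: allele frequencies that sum to 1.
    Assumes the Botstein-style formula; the abstract does not say
    which definition the authors applied.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    # Expected homozygosity term.
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered allele pairs.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical loci: a balanced biallelic SNP and a rare-allele SNP.
    print(pic([0.5, 0.5]))
    print(pic([0.99, 0.01]))
```

Under this definition a monomorphic locus scores 0 and a locus with a very rare second allele scores close to 0, which is the sense in which low PIC values indicate uninformative markers.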
Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G
Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.
Thiel, Thomas; Kota, Raja; Grosse, Ivo; Stein, Nils; Graner, Andreas
With the influx of various SNP genotyping assays in recent years, there has been a need for an assay that is robust, yet cost effective, and could be performed using standard gel-based procedures. In this context, CAPS markers have been shown to meet these criteria. However, converting SNPs to CAPS markers can be a difficult process if done manually. In order to address this problem, we describe a computer program, SNP2CAPS, that facilitates the computational conversion of SNP markers into CA...
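The core idea behind converting a SNP into a CAPS marker can be sketched in a few lines: a SNP is a CAPS candidate when a restriction enzyme recognizes the sequence of one allele but not the other. This is only a minimal illustration of that idea, not the SNP2CAPS program itself; the function names, the small enzyme table and the example alleles are assumptions made for the sketch.

```python
# Recognition sites of a few common restriction enzymes (all palindromic,
# so scanning one strand is sufficient for this sketch).
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "TaqI": "TCGA",
}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def caps_candidates(allele_a, allele_b, enzymes=ENZYMES):
    """Return enzymes whose site count differs between the two alleles."""
    allele_a, allele_b = allele_a.upper(), allele_b.upper()
    return [
        name
        for name, site in enzymes.items()
        if count_sites(allele_a, site) != count_sites(allele_b, site)
    ]


if __name__ == "__main__":
    # Hypothetical amplicon: a T/C SNP that destroys an EcoRI site.
    print(caps_candidates("ACGTGAATTCTTAG", "ACGTGAATCCTTAG"))
```

An enzyme returned by `caps_candidates` would cut the PCR product of one allele only, so the two alleles could in principle be told apart on a standard gel.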
Bekkevold, Dorte; Limborg, Morten; Helyar, Sarah;
Atlantic herring (Clupea harengus) exhibit biocomplexity, with widespread, geographically explicit populations that perform long‐range migration to common feeding and wintering areas, where they are exploited by fisheries. This means that exploited stocks do not describe discrete units, thereby complicating stock assessment and management. It is therefore of management interest to trace individual population migration patterns and contributions to fisheries. To underpin management and to develop a validated tool for traceability of individuals from mixed‐stock samples we applied single nucleotide polymorphism (SNP) markers in Northeast Atlantic herring population samples. Marker panels were targeted to include gene‐associated loci to maximize statistical resolution. Application of 281 SNP markers to samples representing different levels of stock complexity showed that the regional origin of individual...
Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome, and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. Forty SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.
Nathan A Baird
Single nucleotide polymorphism (SNP) discovery and genotyping are essential to genetic mapping. There remains a need for a simple, inexpensive platform that allows high-density SNP discovery and genotyping in large populations. Here we describe the sequencing of restriction-site associated DNA (RAD) tags, which identified more than 13,000 SNPs, and mapped three traits in two model organisms, using less than half the capacity of one Illumina sequencing run. We demonstrated that different marker densities can be attained by choice of restriction enzyme. Furthermore, we developed a barcoding system for sample multiplexing and fine mapped the genetic basis of lateral plate armor loss in threespine stickleback by identifying recombinant breakpoints in F2 individuals. Barcoding also facilitated mapping of a second trait, a reduction of pelvic structure, by in silico re-sorting of individuals. To further demonstrate the ease of the RAD sequencing approach we identified polymorphic markers and mapped an induced mutation in Neurospora crassa. Sequencing of RAD markers is an integrated platform for SNP discovery and genotyping. This approach should be widely applicable to genetic mapping in a variety of organisms.
Blair, Matthew W; Cortés, Andrés J; Penmetsa, R Varma; Farmer, Andrew; Carrasquilla-Garcia, Noelia; Cook, Doug R
Single nucleotide polymorphism (SNP) detection has become a marker system of choice, because of the high abundance of source polymorphisms and the ease with which allele calls are automated. Various technologies exist for the evaluation of SNP loci and previously we validated two medium throughput technologies. In this study, our goal was to utilize a 768 feature, Illumina GoldenGate assay for common bean (Phaseolus vulgaris L.) developed from conserved legume gene sequences and to use the new technology for (1) the evaluation of parental polymorphisms in a mini-core set of common bean accessions and (2) the analysis of genetic diversity in the crop. A total of 736 SNPs were scored on 236 diverse common bean genotypes with the GoldenGate array. Missing data and heterozygosity levels were low and 94 % of the SNPs were scorable. With the evaluation of the parental polymorphism genotypes, we estimated the utility of the SNP markers in mapping for inter-genepool and intra-genepool populations, the latter being of lower polymorphism than the former. When we performed the diversity analysis with the diverse genotypes, we found Illumina GoldenGate SNPs to provide equivalent evaluations as previous gene-based SNP markers, but less fine-distinctions than with previous microsatellite marker analysis. We did find, however, that the gene-based SNPs in the GoldenGate array had some utility in race structure analysis despite the low polymorphism. Furthermore the SNPs detected high heterozygosity in wild accessions which was probably a reflection of ascertainment bias. The Illumina SNPs were shown to be effective in distinguishing between the genepools, and therefore were most useful in saturation of inter-genepool genetic maps. The implications of these results for breeding in common bean are discussed as well as the advantages and disadvantages of the GoldenGate system for SNP detection.
McKay Stephanie D
Background: Genetic markers can be used to identify and verify the origin of individuals. Motivation for the inference of ancestry ranges from conservation genetics to forensic analysis. High density assays featuring Single Nucleotide Polymorphism (SNP) markers can be exploited to create a reduced panel containing the most informative markers for these purposes. The objectives of this study were to evaluate methods of marker selection and determine the minimum number of markers from the BovineSNP50 BeadChip required to verify the origin of individuals in European cattle breeds. Delta, Wright's FST, Weir & Cockerham's FST and PCA methods for population differentiation were compared. The level of informativeness of each SNP was estimated from the breed specific allele frequencies. Individual assignment analysis was performed using the ranked informative markers. Stringency levels were applied by log-likelihood ratio to assess the confidence of the assignment test. Results: A 95% assignment success rate for the 384 individually genotyped animals was achieved with ST (60 to 140 SNPs depending on the chosen degree of confidence. Certain breeds required fewer markers ( 95% assignment success. The power of assignment success, and therefore the number of SNP markers required, is dependent on the levels of genetic heterogeneity and pool of samples considered. Conclusions: While all SNP selection methods produced marker panels capable of breed identification, the power of assignment varied markedly among analysis methods. Thus, with effective exploration of available high density genetic markers, a diagnostic panel of highly informative markers can be produced.
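Two of the marker-informativeness measures named above can be sketched for the simplest case of two breeds and one biallelic SNP. This is a hedged illustration, not the study's pipeline: it assumes equal sample sizes, takes Delta as the absolute allele-frequency difference and Wright's FST as (HT - HS) / HT, and the SNP names and frequencies are invented.

```python
def delta(p1, p2):
    """Absolute difference in allele frequency between two breeds."""
    return abs(p1 - p2)


def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP and two equally sized breeds."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # SNP fixed for the same allele in both breeds
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def rank_snps(freqs, score=wright_fst):
    """Order SNP names from most to least informative under `score`."""
    return sorted(freqs, key=lambda name: score(*freqs[name]), reverse=True)


if __name__ == "__main__":
    # Hypothetical breed-specific allele frequencies (breed 1, breed 2).
    freqs = {"snpA": (0.9, 0.1), "snpB": (0.5, 0.6), "snpC": (0.2, 0.7)}
    print(rank_snps(freqs))
    print(rank_snps(freqs, score=delta))
```

A reduced panel would then be built by taking the top-ranked SNPs until the assignment test reaches the desired success rate; the number needed depends on the breeds and confidence level, as the abstract notes.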
Although highly polymorphic SSRs are currently the marker of choice worldwide in maize breeding, single nucleotide polymorphisms (SNPs), as a newer marker system, have recently been used more extensively. The objective of this study was to investigate the utility of SSR and SNP markers for mapping of a maize population adapted to conditions of Southeast Europe. A total of 294 F2:3 lines derived from a biparental mapping population were genotyped using 121 polymorphic SNP and SSR markers. The SNP markers were analyzed using the SNPlex technology. 56 of the 142 tested SNPs (39%) were polymorphic between the parents of the mapping population and were successfully mapped. The remaining markers were either not functional (5 = 3.5%) or not polymorphic (81 = 57%). No mapped SNP marker showed more than 10% missing data. On average, the level of missing data for SNPs (1.5%) was considerably lower than that for SSRs (3.4%). For the mapping procedure, the SNP data were combined with the SSR data. A comparison of the mapping data with the publicly available mapping data on SSR markers and the proprietary mapping data indicates that the map is of good quality and that the map position of almost all markers agrees with their published map position. Thus, information obtained from both marker systems is utilizable for further QTL analysis.
Scaglione Davide; Acquadro Alberto; Portis Ezio; Tirone Matteo; Knapp Steven J; Lanteri Sergio
Background: The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results: RAD tags were sequenced from the genomic ...
Foley Brad R; Rose Colin G; Rundle Daniel E; Leong Wai; Moy Gary W; Burton Ronald S; Edmands Suzanne
Background: As yet, few genomic resources have been developed in crustaceans. This lack is particularly evident in Copepoda, given the extraordinary numerical abundance, and taxonomic and ecological diversity of this group. Tigriopus californicus is ideally suited to serve as a genetic model copepod and has been the subject of extensive work in environmental stress and reproductive isolation. Accordingly, we set out to develop a broadly-useful panel of genetic markers and to construct a linkage map dense enough for quantitative trait locus detection in an interval mapping framework for T. californicus, a first for copepods. Results: One hundred and ninety Single Nucleotide Polymorphisms (SNPs) were used to genotype our mapping population of 250 F2 larvae. We were able to construct a linkage map with an average intermarker distance of 1.8 cM, and a maximum intermarker distance of 10.3 cM. All markers were assembled into linkage groups, and the 12 linkage groups corresponded to the 12 known chromosomes of T. californicus. We estimate a total genome size of 401.0 cM, and a total coverage of 73.7%. Seventy five percent of the mapped markers were detected in 9 additional populations of T. californicus. Of available model arthropod genomes, we were able to show more colocalized pairs of homologues between T. californicus and the honeybee Apis mellifera than expected by chance, suggesting preserved macrosynteny between Hymenoptera and Copepoda. Conclusions: Our study provides an abundance of linked markers spanning all chromosomes. Many of these markers are also found in multiple populations of T. californicus, and in two other species in the genus. The genomic resource we have developed will enable mapping throughout the geographical range of this species and in closely related species. This linkage map will facilitate genome sequencing, mapping and assembly in an ecologically and taxonomically interesting group for which genomic resources are
Mouawad, Amer E; Mansour, Nashat
Despite the advances in genotyping technologies, which have led to a large reduction in genotyping cost, the Tag SNP Selection problem remains an important problem for computational biologists and geneticists. Selecting the smallest subset of tag SNPs that can predict the other SNPs would considerably reduce the complexity of genome-wide or block-based SNP-disease association studies. These studies would lead to better diagnosis and treatment of diseases. In this work, we propose three variations of a genetic algorithm based on two-marker linkage disequilibrium, multi-marker linkage disequilibrium, and a third measure that we denote by prediction power. The performance of the three algorithms is compared with that of a recognized tag SNP selection algorithm using three different real data sets from the HapMap project. The results indicate that the multi-marker linkage disequilibrium based genetic algorithm yields better prediction accuracy.
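To make the two-marker linkage disequilibrium criterion concrete, the sketch below computes the pairwise r² statistic from phased 0/1 haplotypes and then picks tag SNPs with a simple greedy cover. The greedy step is a deliberately simpler stand-in, not the genetic algorithm proposed in the abstract; the function names, the 0.8 threshold and the toy haplotypes are all assumptions made for illustration.

```python
def r_squared(a, b):
    """Pairwise LD (r^2) between two biallelic SNPs coded 0/1 per haplotype."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return 0.0 if denom == 0 else d * d / denom


def greedy_tags(haplotypes, threshold=0.8):
    """Greedy tag SNP selection (a stand-in for the genetic algorithm).

    haplotypes: list of rows, one per haplotype, each a list of 0/1 alleles.
    Returns indices of SNPs chosen so that every SNP is either a tag or has
    r^2 >= threshold with some tag.
    """
    n_snps = len(haplotypes[0])
    columns = [[row[j] for row in haplotypes] for j in range(n_snps)]
    untagged = set(range(n_snps))
    tags = []
    while untagged:
        def covered_by(i):
            return {
                j for j in untagged
                if j == i or r_squared(columns[i], columns[j]) >= threshold
            }
        # Pick the SNP that covers the most still-untagged SNPs;
        # sorting makes ties resolve to the lowest index.
        best = max(sorted(untagged), key=lambda i: len(covered_by(i)))
        untagged -= covered_by(best)
        tags.append(best)
    return tags


if __name__ == "__main__":
    # Toy data: SNP 0 and SNP 1 are in perfect LD, SNP 2 is independent.
    haps = [[0, 0, 1], [0, 0, 0], [1, 1, 1], [1, 1, 0]]
    print(greedy_tags(haps))
```

A genetic algorithm would search over candidate tag subsets with a fitness function built on a measure such as this r², rather than committing to one SNP at a time as the greedy cover does.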
Cheong, H S; Kim, L H; Namgoong, S; Shin, H D
In the Korean meat market, beef from the native cattle breed, Hanwoo, is preferred over imported beef and domestic Holstein beef despite its relatively high price. In order to hold the beef industry accountable and support consumers' right to know, correct beef-origin labeling is required. For this purpose, we developed 90 single-nucleotide polymorphism markers to discriminate between Hanwoo and other breeds including Holstein using 1602 cattle DNAs. The probability of discrimination was found to be 100% in a subsequent validation set consisting of 632 DNAs. Our study suggests that improved beef-origin discrimination can be achieved by using a combined genetic model that takes into account small genetic differences among a large number of markers. These markers could be useful for discriminating between Hanwoo and imported breeds including domestic Holsteins, and would contribute to the prevention of falsified beef origin.
In this investigation 45 parental cacao plants and five progeny derived from the parental stock studied were genotyped using six SNP markers to determine off-types or mislabeled clones and to authenticate crosses made in the Cocoa Research Institute of Ghana (CRIG) breeding program. Investigation wa...
Scholten, Olga E.; Kaauwen, van Martijn P.W.; Shahin, Arwa; Hendrickx, Patrick M.; Keizer, Paul; Burger-Meijer, Karin; Heusden, van Sjaak; Linden, van der Gerard; Vosman, Ben
Background: Within onion, Allium cepa L., the availability of disease resistance is limited. The identification of sources of resistance in related species, such as Allium roylei and Allium fistulosum, was a first step towards the improvement of onion cultivars by breeding. SNP markers linked to
Kamphuis, Lars G; Hane, James K; Nelson, Matthew N; Gao, Lingling; Atkins, Craig A; Singh, Karam B
Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is an important grain legume crop that is valuable for sustainable farming and is becoming recognized as a human health food. NLL breeding is directed at improving grain production, disease resistance, drought tolerance and health benefits. However, genetic and genomic studies have been hindered by a lack of extensive genomic resources for the species. Here, the generation, de novo assembly and annotation of transcriptome datasets derived from five different NLL tissue types of the reference accession cv. Tanjil are described. The Tanjil transcriptome was compared to transcriptomes of an early domesticated cv. Unicrop, a wild accession P27255, as well as accession 83A:476, together being the founding parents of two recombinant inbred line (RIL) populations. In silico predictions for transcriptome-derived gene-based length and SNP polymorphic markers were conducted and corroborated using a survey assembly sequence for NLL cv. Tanjil. This yielded extensive indel and SNP polymorphic markers for the two RIL populations. A total of 335 transcriptome-derived markers and 66 BAC-end sequence-derived markers were evaluated, and 275 polymorphic markers were selected to genotype the reference NLL 83A:476 × P27255 RIL population. This significantly improved the completeness, marker density and quality of the reference NLL genetic map. PMID:25060816
Background: The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results: RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion: The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach permitted the generation of a large and robust SNP dataset by the adoption of optimized filtering criteria.
Bertioli, David J; Ozias-Akins, Peggy; Chu, Ye; Dantas, Karinne M; Santos, Silvio P; Gouvea, Ediene; Guimarães, Patricia M; Leal-Bertioli, Soraya C M; Knapp, Steven J; Moretzsohn, Marcio C
Single nucleotide polymorphic markers (SNPs) are attractive for use in genetic mapping and marker-assisted breeding because they can be scored in parallel assays at favorable costs. However, scoring SNP markers in polyploid plants like the peanut is problematic because of interfering signal generated from the DNA bases that are homeologous to those being assayed. The present study used a previously constructed 1536 GoldenGate SNP assay developed using SNPs identified between two A. duranensis accessions. In this study, the performance of this assay was tested on two RIL mapping populations, one diploid (A. duranensis × A. stenosperma) and one tetraploid [A. hypogaea cv. Runner IAC 886 × synthetic tetraploid (A. ipaënsis × A. duranensis)(4×)]. The scoring was performed using the software GenomeStudio version 2011.1. For the diploid, polymorphic markers provided excellent genotyping scores with default software parameters. In the tetraploid, as expected, most of the polymorphic markers provided signal intensity plots that were distorted compared to diploid patterns and that were incorrectly scored using default parameters. However, these scorings were easily corrected using the GenomeStudio software. The degree of distortion was highly variable. Of the polymorphic markers, approximately 10% showed no distortion at all behaving as expected for single-dose markers, and another 30% showed low distortion and could be considered high-quality. The genotyped markers were incorporated into diploid and tetraploid genetic maps of Arachis and, in the latter case, were located almost entirely on A genome linkage groups.
Kottapalli, Pratibha; Ulloa, Mauricio; Kottapalli, Kameswara Rao; Payton, Paxton; Burke, John
The objective of this study was to explore the known narrow genetic diversity and discover single-nucleotide polymorphic (SNP) markers for marker-assisted breeding within Pima cotton (Gossypium barbadense L.) leaf transcriptomes. cDNA from 25-day plants of three diverse cotton genotypes [Pima S6 (PS6), Pima S7 (PS7), and Pima 3-79 (P3-79)] was sequenced on Illumina sequencing platform. A total of 28.9 million reads (average read length of 138 bp) were generated by sequencing cDNA libraries of these three genotypes. The de novo assembly of reads generated transcriptome sets of 26,369 contigs for PS6, 25,870 contigs for PS7, and 24,796 contigs for P3-79. A Pima leaf reference transcriptome was generated consisting of 42,695 contigs. More than 10,000 single-nucleotide polymorphisms (SNPs) were identified between the genotypes, with 100% SNP frequency and a minimum of eight sequencing reads. The most prevalent SNP substitutions were C—T and A—G in these cotton genotypes. The putative SNPs identified can be utilized for characterizing genetic diversity, genotyping, and eventually in Pima cotton breeding through marker-assisted selection.
Brad S. Coates
Microsatellite markers are difficult to apply within lepidopteran studies due to the lack of locus-specific PCR amplification and the high proportion of null alleles, such that erroneous estimations of population genetic parameters often result. Herein single nucleotide polymorphism (SNP) markers are developed from Ostrinia nubilalis (Lepidoptera: Crambidae) using next-generation expressed sequence tag (EST) data. A total of 2742 SNPs were predicted within a reference assembly of 7414 EST contigs, and a subset of 763 were incorporated into 24 multiplex PCR reactions. To validate this pipeline, 5 European and North American sample sites were genotyped at 178 SNP loci, which indicated 84 (47.2%) were in Hardy-Weinberg equilibrium. Locus-by-locus FST, analysis of molecular variance (AMOVA), and STRUCTURE analyses indicate significant genetic differentiation may exist between European and North American O. nubilalis. The observed genetic diversity was significantly lower among European sites, which may result from genetic drift, natural selection, a genetic bottleneck, or ascertainment bias due to the North American origin of the EST sequence data. SNPs are an abundant source of mutation data for molecular genetic marker development in non-model species, with shared ancestral SNPs showing application within closely related species. These markers offer advantages over microsatellite markers for genetic and genomic analyses of Lepidoptera, but the source of mutation data may affect the estimation of population parameters and likely needs to be considered in the interpretation of empirical data.
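Whether a SNP locus is in Hardy-Weinberg equilibrium can be checked from its genotype counts. The abstract does not say which test was applied; the sketch below assumes the textbook chi-square goodness-of-fit test with one degree of freedom, and the function name and the example counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_1DF_005 = 3.841


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed counts of the three genotypes at one
    biallelic locus. Expected counts are p^2 N, 2pq N and q^2 N, with p
    estimated from the sample itself.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical counts: one locus in equilibrium, one lacking heterozygotes.
    for counts in [(25, 50, 25), (50, 0, 50)]:
        stat = hwe_chi_square(*counts)
        print(counts, stat, "deviates" if stat > CRITICAL_1DF_005 else "consistent")
```

With small samples or rare alleles an exact test is generally preferred over the chi-square approximation, so this should be read as an illustration of the principle only.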
LI Jing-qiong; ZHENG You-liang; WEI Yu-ming
Forty-three gene sequences encoding purothionin were characterized from the three species or subspecies of einkorn wheats. These sequences contained 887 bp, among which 92 SNPs including 29 indel loci were detected, giving an average SNP frequency of one SNP per 9.64 bases. According to these sequences, 5 SNP markers were successfully designed, which were used to mine the variations of purothionin genes of 102 einkorn wheat accessions. Based on the 5 detected SNP loci, the 102 einkorn wheat accessions could be divided into 21 haplotypes, among which 11 haplotypes contained a single sample. Phylogenetic analysis indicated that the purothionin genes from einkorn wheats were more closely related to those from the D genome than the B genome. Seven out of the 43 gene sequences were assumed to be pseudogenes by the definition of containing in-frame stop codons and small insertions/deletions leading to frameshift. In the remaining 36 amino acid sequences, the 8 Cys and Tyr-13 loci in the mature thionin domain, which play important roles in the biological activities, were all conserved, whereas some variation occurred in other important amino acid residues such as Lys and Arg.
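The two pieces of arithmetic in this abstract, the SNP frequency and the grouping of accessions into haplotypes, can be sketched as follows. The accession names and allele calls are invented for illustration; only the figures 887 bp and 92 SNPs come from the abstract.

```python
from collections import defaultdict


def bases_per_snp(sequence_length, n_snps):
    """Average number of bases per polymorphic site."""
    return sequence_length / n_snps


def group_haplotypes(genotypes):
    """Group accessions that share the same alleles at every SNP locus.

    genotypes: dict mapping accession name -> sequence of alleles,
    one allele per SNP locus, in a fixed locus order.
    """
    groups = defaultdict(list)
    for accession, alleles in genotypes.items():
        groups[tuple(alleles)].append(accession)
    return dict(groups)


if __name__ == "__main__":
    print(round(bases_per_snp(887, 92), 2))

    # Hypothetical accessions scored at 5 SNP loci.
    calls = {
        "acc1": "ACGTA",
        "acc2": "ACGTA",
        "acc3": "ACGTC",
        "acc4": "GCGTA",
    }
    groups = group_haplotypes(calls)
    singletons = [h for h, members in groups.items() if len(members) == 1]
    print(len(groups), "haplotypes,", len(singletons), "with a single accession")
```

Applied to real allele calls for the 102 accessions, the same grouping would yield the haplotype count and the number of single-sample haplotypes reported above.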
Abstract. Background: Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. Results: In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism that differentiated the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Conclusion: Date palm EST sequences are a good resource for developing gene-based markers. The genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity studies, QTL mapping, and molecular breeding.
Lopes, M S; Bastiaansen, J W M; Janss, Luc
The contributions of additive, dominance and imprinting effects to the variance of number of teats (NT) were evaluated in two purebred pig populations using SNP markers. Three different random regression models were evaluated, accounting for the mean and: 1) additive effects (MA), 2) additive and dominance effects (MAD) and 3) additive, dominance and imprinting effects (MADI). Additive heritability estimates were 0.30, 0.28 and 0.27-0.28 in both lines using MA, MAD and MADI, respectively. Dominance heritability ranged from 0.06 to 0.08 using MAD and MADI. Imprinting heritability ranged from 0.01 to 0.02. Dominance effects make an important contribution to the genetic variation of NT in the two lines evaluated. Imprinting effects appeared less important for NT than additive and dominance effects. The SNP random regression model presented and evaluated in this study is a feasible approach...
Egbadzor, Kenneth F; Ofori, Kwadwo; Yeboah, Martin; Aboagye, Lawrence M; Opoku-Agyeman, Michael O; Danquah, Eric Y; Offei, Samuel K
Single nucleotide polymorphism (SNP) markers were used to characterize 113 cowpea accessions, comprising 108 from Ghana and 5 from abroad. Leaf tissues from plants cultivated at the University of Ghana were genotyped at KBioscience in the United Kingdom. Data were generated for 477 SNPs, of which 458 revealed polymorphism. The results were used to analyze genetic dissimilarity among the accessions using DARwin 5 software. The markers discriminated among all of the cowpea accessions, and the dissimilarity values, which ranged from 0.006 to 0.63, were used for a factorial plot. Unexpectedly high levels of heterozygosity were observed in some of the accessions. Accessions known to be closely related clustered together in a dendrogram drawn with the WPGMA method. A maximum length sub-tree comprising 48 core accessions was constructed. The software package STRUCTURE was used to separate accessions into three groups, and the programme correctly identified varieties that were known hybrids. The hybrids were those accessions with numerous heterozygous loci. The STRUCTURE plot showed closely related accessions with similar genome patterns. The SNP markers were more efficient in discriminating among the cowpea germplasm than the morphological, seed protein polymorphism and simple sequence repeat studies reported earlier on the same collection.
Huang, Zhen; Peng, Gary; Liu, Xunjia; Deora, Abhinandan; Falk, Kevin C.; Gossen, Bruce D.; McDonald, Mary R.; Yu, Fengqun
Clubroot, caused by Plasmodiophora brassicae, is an important disease of canola (Brassica napus) in western Canada and worldwide. In this study, a clubroot resistance gene (Rcr2) was identified and fine mapped in Chinese cabbage cv. “Jazz” using single-nucleotide polymorphism (SNP) markers identified from bulked segregant RNA sequencing (BSR-Seq), and molecular markers were developed for use in marker-assisted selection. In total, 203.9 million raw reads were generated from one pooled resistant (R) and one pooled susceptible (S) sample, and >173,000 polymorphic SNP sites were identified between the R and S samples. One significant peak was observed between 22 and 26 Mb of chromosome A03, which had been predicted by BSR-Seq to contain the causal gene Rcr2. There were 490 polymorphic SNP sites identified in the region. A segregating population consisting of 675 plants was analyzed with 15 SNP sites in the region using the Kompetitive Allele Specific PCR method, and Rcr2 was fine mapped between two SNP markers, SNP_A03_32 and SNP_A03_67, at 0.1 and 0.3 cM from Rcr2, respectively. Five SNP markers co-segregated with Rcr2 in this region. Variants were identified in 14 of 36 genes annotated in the Rcr2 target region. The numbers of polymorphic variants differed among the genes. Four genes encode TIR-NBS-LRR proteins, and two of them, Bra019410 and Bra019413, had high numbers of polymorphic variants and so are the most likely candidates for Rcr2. PMID:28894454
Fjalestad Kjersti T
Abstract. Background: The Atlantic cod (Gadus morhua) is a groundfish of great economic value in fisheries and an emerging species in aquaculture. Genetic markers are needed to identify wild stocks in order to ensure sustainable management, and for marker-assisted selection and pedigree determination in aquaculture. Here, we report on the development and evaluation of a large number of Single Nucleotide Polymorphism (SNP) markers from the alignment of Expressed Sequence Tag (EST) sequences in Atlantic cod. We also present basic population parameters of the SNPs in samples of North-East Arctic cod and Norwegian coastal cod obtained from three different localities, and test for SNPs that may have been targeted by natural selection. Results: A total of 17,056 EST sequences were used to find 724 putative SNPs, from which 318 segregating SNPs were isolated. The SNPs were tested on Atlantic cod from four different sites, comprising both North-East Arctic cod (NEAC) and Norwegian coastal cod (NCC). The average heterozygosity of the SNPs was 0.25 and the average minor allele frequency was 0.18. FST values were highly variable, with the majority of SNPs displaying very little differentiation while others had FST values as high as 0.83. The FST values of 29 SNPs were found to be larger than expected under a strictly neutral model, suggesting that these loci are, or have been, influenced by natural selection. For the majority of these outlier SNPs, allele frequencies in a northern sample of NCC were intermediate between allele frequencies in a southern sample of NCC and a sample of NEAC, indicating a cline in allele frequencies similar to that found at the Pantophysin I locus. Conclusion: The SNP markers presented here are powerful tools for future genetics work related to management and aquaculture. In particular, some SNPs exhibiting high levels of population divergence have potential to significantly enhance studies on the population structure of Atlantic cod.
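The per-SNP summary statistics named in the abstract above (heterozygosity, minor allele frequency, FST) can be sketched as follows. This is a simplified illustration, not the estimators used in the study: it assumes a biallelic locus, known allele frequencies, and two equally weighted samples, and all function names are hypothetical.

```python
# Minimal sketch (assumed): basic per-SNP statistics from allele frequencies.
# p is the frequency of one of the two alleles in a sample.


def expected_heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def minor_allele_frequency(p):
    """Frequency of the rarer of the two alleles."""
    return min(p, 1.0 - p)


def fst_two_pops(p1, p2):
    """Wright's F_ST = (H_T - H_S) / H_T for two equally weighted samples."""
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)  # heterozygosity of the pooled sample
    if h_t == 0.0:
        return 0.0                        # locus fixed for the same allele
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_two_pops(0.5, 0.5))  # identical frequencies, no differentiation
    print(fst_two_pops(0.9, 0.1))  # strongly diverged frequencies
```

Outlier scans of the kind described in the abstract compare such per-locus values against a neutral expectation; that step needs a simulation or model-based method and is not shown here.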
KUANG Meng; WEI Shou-jun; WANG Yan-qin; ZHOU Da-yun; MA Lei; FANG Dan; YANG Wei-hua; MA Zhi-ying
Considering the advantages of single nucleotide polymorphisms (SNP) in genotyping and variety identification, the first set of public SNP markers at the Cotton Marker Database (http://www.cottonmarker.org/) was validated and screened across standard varieties of the cotton distinctness, uniformity and stability (DUS) test, aiming to obtain an appropriate set of core SNP markers suitable for upland cotton cultivars in China. A total of 399 out of 1005 SNPs from 270 loci, including 170 insertions-deletions (InDels), were evaluated for their polymorphisms among 30 standard varieties using Sanger sequencing. As a result, 147 loci were sequenced successfully, and 377 SNP and 49 InDel markers were obtained. Among the 377 SNP markers, 333 markers (88.3%) were polymorphic between Gossypium hirsutum and G. barbadense, while 164 markers (43.5%) were polymorphic within upland cotton. As for InDel markers, the polymorphic rate was lower than that of SNPs both between species and within species. The homozygous DNA locus ratio of 121 SNPs was higher than 86.2%, while that of the other 43 SNPs was less than 70%. Only 64 SNPs displayed completely homozygous genotypes among all of the detected upland cotton varieties, with a 100% homozygous DNA locus ratio. Finally, a set of 23 pairs of core SNPs was obtained with a view to avoiding linkage, with polymorphism information content (PIC) values varying from 0.21 to 0.38 with an average of 0.28. Genotype characteristics and genetic diversity were analyzed based on the set of core markers, while 40 pairs of core simple-sequence repeat (SSR) primers, comprising 10 sets of four multiplex PCR combinations, were also used for analysis based on a fluorescence detection system. Comparison results indicated that the genetic diversity level was almost equal, while various varieties were significantly different from each other. Genetic relationship revealed by SSR markers is related to geographic source to a certain extent. Meanwhile clustering results analyzed
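The PIC values quoted in the abstract above can be related to allele frequencies with a short sketch. It assumes the commonly used Botstein-style definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the abstract does not state which formula the authors applied, and the function name is illustrative.

```python
# Minimal sketch (assumed formula): polymorphism information content from
# allele frequencies, PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
from itertools import combinations


def pic(freqs):
    """Return PIC for a list of allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    print(pic([0.5, 0.5]))    # biallelic marker with equal allele frequencies
    print(pic([0.85, 0.15]))  # biallelic marker with a rare allele
```

Under this definition a biallelic SNP cannot exceed 0.375, which is roughly in line with the upper end of the reported range and explains why SNP PIC values sit below those typical of multi-allelic SSRs.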
Nemli, Seda; Kaya, Hilal Betul; Tanyolac, Bahattin
Peroxidase, a plant-specific oxidoreductase, is a heme-containing glycoprotein encoded by a large multigenic family in plants. Plant peroxidases (POXs, EC 1.11.1.7) play important roles in many self-defense interactions in plants. Here, 67 common bean (Phaseolus vulgaris L.) genotypes were studied using a POX gene-based marker method. Comparison of POX genes could resolve evolutionary relationships in common bean. Eighty fragments were obtained with 20 primer pairs that amplified one (POX8c) to eight (ATP29) bands, with a mean of four bands per primer pair. The average polymorphic information content (PIC) value for the POX products was 0.40. The maximum variation (93%) was found between Turkey (#33) and India (#52) and between Antalya (#33) and India (#53). The minimum variation (0%) was found among four pairs: Bozdag (#2) and Karadeniz (#38), Kirklareli (#11) and Turkey (#15, 16, 43), Bandirma (#13) and Turkey (#15, 16, 43), and Kirklareli (#10) and Bandirma (#22). UPGMA was used to discriminate the common bean genotypes into five clusters, while STRUCTURE software was used to investigate the genetic population structure. The results showed that POX gene family markers can be used to study genotypic diversity and provide new information for breeding programs and common bean improvement practices. © 2013 Society of Chemical Industry.
Juan M. Montes
Jatropha curcas L. (jatropha) is an undomesticated plant that has recently received great attention for its utilization in biofuel production, rehabilitation of wasteland, and rural development. Knowledge of genetic diversity and marker-trait associations is urgently needed for the design of breeding strategies. The main goal of this study was to assess the genetic structure and diversity in jatropha germplasm with co-dominant markers (simple sequence repeats (SSR) and single nucleotide polymorphisms (SNP)) in a diverse, worldwide germplasm panel of 70 accessions. We found a high level of homozygosity in the germplasm that does not correspond to the purely outcrossing mating system assumed to be present in jatropha. We hypothesize that the prevalent mating system of jatropha comprises a high level of self-fertilization and that the outcrossing rate is low. Genetic diversity in accessions from Central America and Mexico was higher than in accessions from Africa, Asia, and South America. We identified markers associated with the presence of phorbol esters. We think that the utilization of molecular markers in breeding of jatropha will significantly accelerate the development of improved cultivars.
Long, Nanye; Gianola, Daniel; Rosa, Guilherme J M; Weigel, Kent A; Kranis, Andreas; González-Recio, Oscar
A challenge when predicting total genetic values for complex quantitative traits is that an unknown number of quantitative trait loci may affect phenotypes via cryptic interactions. If markers are available, assuming that their effects on phenotypes are additive may lead to poor predictive ability. Non-parametric radial basis function (RBF) regression, which does not assume a particular form of the genotype-phenotype relationship, was investigated here by simulation and analysis of body weight and food conversion rate data in broilers. The simulation included a toy example in which an arbitrary non-linear genotype-phenotype relationship was assumed, and five different scenarios representing different broad sense heritability levels (0.1, 0.25, 0.5, 0.75 and 0.9) were created. In addition, a whole genome simulation was carried out, in which three different gene action modes (pure additive, additive+dominance and pure epistasis) were considered. In all analyses, a training set was used to fit the model and a testing set was used to evaluate predictive performance. The latter was measured by correlation and predictive mean-squared error (PMSE) on the testing data. For comparison, a linear additive model known as Bayes A was used as benchmark. Two RBF models with single nucleotide polymorphism (SNP)-specific (RBF I) and common (RBF II) weights were examined. Results indicated that, in the presence of complex genotype-phenotype relationships (i.e. non-linearity and non-additivity), RBF outperformed Bayes A in predicting total genetic values using SNP markers. Extension of Bayes A to include all additive, dominance and epistatic effects could improve its prediction accuracy. RBF I was generally better than RBF II, and was able to identify relevant SNPs in the toy example.
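The two evaluation measures named in the abstract above (correlation and predictive mean-squared error on a testing set) and a Gaussian radial basis function over SNP genotypes can be sketched as below. The kernel form exp(-Σ_k θ_k (x_k - c_k)²), with per-SNP weights θ_k standing in for "SNP-specific" weights and a single shared value standing in for "common" weights, is an assumption made for illustration; the abstract does not give the exact parameterization, and all names are hypothetical.

```python
# Minimal sketch (assumed forms): an RBF over SNP genotype vectors plus the
# two test-set metrics mentioned in the abstract.
import math


def rbf_kernel(x, c, theta):
    """exp(-sum_k theta_k * (x_k - c_k)^2) for genotype x and center c.

    Per-SNP theta values mimic SNP-specific weights; repeating one value
    for every SNP mimics a common weight.
    """
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x, c)))


def pmse(y_true, y_pred):
    """Predictive mean-squared error on a testing set."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Genotypes coded as 0/1/2 copies of an allele (an assumed coding).
    print(rbf_kernel([0, 1, 2], [0, 1, 2], [1.0, 1.0, 1.0]))
    print(pmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Setting a SNP's weight to zero removes it from the distance entirely, which gives some intuition for how SNP-specific weights could flag relevant loci.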
Abstract. Background: Identification of genes underlying drought tolerance (DT) quantitative trait loci (QTLs) will facilitate understanding of molecular mechanisms of drought tolerance, and also will accelerate genetic improvement of pearl millet through marker-assisted selection. We report a map based on genes with assigned functional roles in plant adaptation to drought and other abiotic stresses and demonstrate its use in identifying candidate genes underlying a major DT-QTL. Results: Seventy-five single nucleotide polymorphism (SNP) and conserved intron spanning primer (CISP) markers were developed from available expressed sequence tags (ESTs) using four genotypes, H 77/833-2, PRLT 2/89-33, ICMR 01029 and ICMR 01004, representing parents of two mapping populations. A total of 228 SNPs were obtained from a 30.5 kb sequenced region, resulting in a SNP frequency of 1/134 bp. The positions of major pearl millet linkage group (LG) 2 DT-QTLs (reported from crosses H 77/833-2 × PRLT 2/89-33 and 841B × 863B) were added to the present consensus function map, which identified 18 genes, coding for PSI reaction center subunit III, PHYC, actin, alanine glyoxylate aminotransferase, uridylate kinase, acyl-CoA oxidase, dipeptidyl peptidase IV, MADS-box, serine/threonine protein kinase, ubiquitin conjugating enzyme, zinc finger C-x8-C-x5-C-x3-H type, Hd3, acetyl CoA carboxylase, chlorophyll a/b binding protein, photolyase, protein phosphatase1 regulatory subunit SDS22 and two hypothetical proteins, co-mapping in this DT-QTL interval. Many of these candidate genes were found to have significant association with QTLs of grain yield, flowering time and leaf rolling under drought stress conditions. Conclusions: We have exploited available pearl millet EST sequences to generate a mapped resource of seventy-five new gene-based markers for pearl millet and demonstrated its use in identifying candidate genes underlying a major DT-QTL in this species. The reported gene-based
Expressed sequence tags (ESTs) were analyzed in silico in order to identify single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels) in cotton. A total of 1349 EST-based SNP and InDel markers were developed by comparing ESTs between Gossypium hirsutum and G. barbadense, m...
Angel Fernandez i Marti
Most previous studies on genetic fingerprinting and cultivar relatedness in sweet cherry were based on isoenzyme, RAPD and SSR markers. This study was carried out to assess the utility of SNP markers generated from 3’UTRs for genetic fingerprinting in sweet cherry. A total of 114 sweet cherry germplasm accessions, representing advanced selections, commercial cultivars and old cultivars imported from different parts of the world, were screened with 7 SSR markers developed from other Prunus species and with 40 SNPs obtained from 3’UTR sequences of the Rainier and Bing sweet cherry cultivars. Both types of marker study had 99 accessions in common. The SSR data were used to validate the SNP results. Results showed that the average number of alleles per locus, mean observed heterozygosity, expected heterozygosity and polymorphic information content (PIC) values were higher in SSRs than in SNPs, although both sets of markers were similar in their grouping of the sweet cherry accessions as shown in the dendrogram. SNPs were able to distinguish sport mutants from their wild type germplasm. For example, ‘Stella’ was separated from ‘Compact Stella’. This demonstrates the greater power of SNPs, compared with SSRs, for discriminating mutants from their original parents. In addition, SNP markers confirmed parentage and also determined relationships of the accessions in a manner consistent with their pedigree relationships. We would recommend the use of 3’UTR SNPs for genetic fingerprinting, parentage verification, gene mapping and study of genetic diversity in sweet cherry.
LIU Rui; SUN Dong-xiao; WANG Ya-chun; YU Ying; ZHANG Yi; CHEN Hui-yong; ZHANG Qin; ZHANG Sheng-li; ZHANG Yuan
Our previous studies demonstrated that the region around markers BMS470 and BMS1242 on BTA6 showed linkage to 305-d milk yield and composition traits in the Chinese Holstein population. We herein focused on this narrow region to fine map milk production QTLs with 15 SNPs across 25 Mb, with each SNP in 1 Mb within most regions, in a Chinese Holstein population with a daughter design. A total of 1,449 Holstein cows and 11 sires were genotyped for these SNPs by using TaqMan probe and RFLP assays. Multipoint linkage analysis across families revealed a QTL affecting milk yield between PPARGC1A C4075T and SLC34A2 T1713C. Meanwhile, within-family analysis found three milk yield QTLs (two in CR T60984131G-CEP135 C501T and one in PDLIM5 A106C-OPN T3907), a fat yield QTL in the UGDH T1670C-CR T60984131G region, and two protein yield QTLs in TBC1D1 G501C-UGDH T1670C and PPARGC1A C4075T-SLC34A2 T1713C, respectively. Associations between the aforementioned significant SNP markers and milk production traits were further tested. We found significant associations of PPARGC1A C4075T, SLC34A2 T1713C with milk yield (P<0.05, P<0.01, P<0.01), UGDH T1670C, and CR T60984131G with fat yield (P<0.01, P<0.01), and PPARGC1A C4075T, SLC34A2 T1713C, UGDH T1670C and OPN T3907 with protein yield (P<0.01, P<0.01, P<0.01, P<0.01). Our findings implied that QTLs affecting milk production traits on BTA6 reflect pleiotropic or multigenic effects, and that PPARGC1A and OPN may carry the causal mutations behind milk production QTLs on BTA6 in the Chinese Holstein population.
Yang, Jinfen; Ma, Xinye; Zhan, Ruoting; Xu, Hui; Chen, Weiwen
Amomum villosum Lour., produced in Yangchun, Guangdong Province, China, is a Daodi medicinal material of Amomi Fructus in traditional Chinese medicine. This herb germplasm should be accurately identified and collected to ensure its quality and safety in medication. In the present study, a single nucleotide polymorphism (SNP) typing method was evaluated on the basis of DNA barcoding markers to identify the germplasm of Amomi Fructus. Genomic DNA was extracted from the leaves of 29 landraces representing three Amomum species (A. villosum Lour., A. xanthioides Wall. ex Baker and A. longiligulare T. L. Wu) by using the CTAB method. Six barcoding markers (ITS, ITS2, LSU D1–D3, matK, rbcL and trnH-psbA) were PCR amplified and sequenced; SNP typing and phylogenetic analysis were performed to differentiate the landraces. Results showed that high-quality bidirectional sequences were acquired for five candidate regions (ITS, ITS2, LSU D1–D3, matK, and rbcL) but not for trnH-psbA. Three ribosomal regions, namely, ITS, ITS2, and LSU D1–D3, contained more SNP genotypes (STs) than the plastid genes rbcL and matK. In the 29 specimens, 19 STs were detected from the combination of four regions (ITS, LSU D1–D3, rbcL, and matK). Phylogenetic analysis results further revealed two clades. A minimum-spanning tree demonstrated the existence of two main groups: group I consisted of 9 STs (ST1–8 and ST11) of A. villosum Lour., and group II was composed of 3 STs (ST16–18) of A. longiligulare T.L. Wu. Our results suggested that ITS and LSU D1–D3 should be incorporated with the core barcodes rbcL and matK. The four combined regions could be used as a multiregional DNA barcode to precisely differentiate the Amomi Fructus landraces in different producing areas. PMID:25531885
Stephen N Carmichael
The salmon louse (Lepeophtheirus salmonis (Krøyer, 1837)) is a parasitic copepod that can, if untreated, cause considerable damage to Atlantic salmon (Salmo salar Linnaeus, 1758) and incurs significant costs to the Atlantic salmon mariculture industry. Salmon lice are gonochoristic and normally show sex ratios close to 1:1. While this observation suggests that sex determination in salmon lice is genetic, with only minor environmental influences, the mechanism of sex determination in the salmon louse is unknown. This paper describes the identification of a sex-linked Single Nucleotide Polymorphism (SNP) marker, providing the first evidence for a genetic mechanism of sex determination in the salmon louse. Restriction site-associated DNA sequencing (RAD-seq) was used to isolate SNP markers in a laboratory-maintained salmon louse strain. A total of 85 million raw Illumina 100 base paired-end reads produced 281,838 unique RAD-tags across 24 unrelated individuals. RAD marker Lsa101901 showed complete association with phenotypic sex for all individuals analysed, being heterozygous in females and homozygous in males. Using an allele-specific PCR assay for genotyping, this SNP association pattern was further confirmed for three unrelated salmon louse strains, displaying complete association with phenotypic sex in a total of 96 genotyped individuals. The marker Lsa101901 was located in the coding region of the prohibitin-2 gene, which showed sex-dependent differential expression, with mRNA levels determined by RT-qPCR about 1.8-fold higher in adult female than in adult male salmon lice. This study's observations of a novel sex-linked SNP marker are consistent with sex determination in the salmon louse being genetic and following a female heterozygous system. Marker Lsa101901 provides a tool to determine the genetic sex of salmon lice, and could be useful in the development of control strategies.
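The association pattern described in the abstract above (heterozygous in every female, homozygous in every male) can be expressed as a simple check. This sketch is illustrative only: the sample encoding, the "F"/"M" labels and the function names are assumptions, not part of the study.

```python
# Minimal sketch (assumed data layout): test whether a SNP shows complete
# association with phenotypic sex under a female-heterozygous pattern.


def is_heterozygous(genotype):
    """genotype is a pair of allele symbols, e.g. "AG" or ("A", "G")."""
    first, second = genotype
    return first != second


def complete_sex_association(samples):
    """samples: iterable of (sex, genotype) with sex "F" or "M".

    Returns True only if every female is heterozygous and every male is
    homozygous; an empty sample set returns False.
    """
    samples = list(samples)
    if not samples:
        return False
    for sex, genotype in samples:
        if sex not in ("F", "M"):
            raise ValueError(f"unexpected sex label: {sex!r}")
        het = is_heterozygous(genotype)
        if sex == "F" and not het:
            return False
        if sex == "M" and het:
            return False
    return True


if __name__ == "__main__":
    lice = [("F", "AG"), ("M", "AA"), ("F", "GA"), ("M", "AA")]
    print(complete_sex_association(lice))
    print(complete_sex_association(lice + [("M", "AG")]))
```

A single male heterozygote is enough to break the pattern, which is why the abstract's emphasis on complete association across unrelated strains matters.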
Shrestha, Sandesh; Hu, Jian; Fryxell, Rebecca Trout; Mudge, Joann; Lamour, Kurt
Taro (Colocasia esculenta) is an important food crop, and taro leaf blight caused by Phytophthora colocasiae can significantly affect production. Our objectives were to develop single nucleotide polymorphism (SNP) markers for P. colocasiae and characterize populations in Hawaii (HI), Vietnam (VN) and Hainan Island, China (HIC). In total, 379 isolates were analyzed for mating type and multilocus SNP profiles including 214 from HI, 97 from VN and 68 from HIC. A total of 1152 single nucleotide variant (SNV) sites were identified via restriction site-associated DNA (RAD) sequencing of two field isolates. Genotyping with 27 SNPs revealed 41 multilocus SNP genotypes grouped into seven clonal lineages containing 2-232 members. Three clonal lineages were shared among countries. In addition, five SNP markers had a low incidence of loss of heterozygosity (LOH) during asexual laboratory growth. For HI and VN, >95% of isolates were the A2 mating type. On HIC, isolates within single clonal lineages had A1, A2 and A0 (neuter) isolates. The implications for the wide dispersal of clonal lineages are discussed.
Studer, Bruno; Nielsen, Rasmus Ory; Panitz, Frank;
...-assisted breeding strategies, a surprisingly low number of validated SNPs are currently available in perennial ryegrass. The advent of next generation sequencing opened up the opportunity for efficient and high throughput in silico SNP discovery in absence of a reference genome sequence. However, the percentages... of 768 SNP markers were selected for GoldenGate genotyping on 181 individuals of the perennial ryegrass mapping population VrnA, which has been previously evaluated for important agronomic traits. A total of 692 (90%) of the 768 SNPs tested were successfully called. Of these, 96 (14%) did not reveal a clear cluster separation. An additional 83 (12%) were monomorphic. A total of 513 gene-associated SNPs were available for linkage mapping, out of which 495 (64% of the total 768 SNPs on the array) were successfully mapped in the VrnA population. The current VrnA map contains a total of 837 DNA markers...
Studer, Bruno; Kölliker, Roland
In recent years, single nucleotide polymorphism (SNP) markers have emerged as the marker technology of choice for plant genetics and breeding applications. Besides the efficient technologies available for SNP discovery even in complex genomes, one of the main reasons for this is the availability of high-throughput platforms for multiplexed SNP genotyping. Advancements in these technologies have enabled increased flexibility and throughput, allowing for the generation of adequate SNP marker data at very competitive cost per data point.
Milano, Ilaria; Babbucci, Massimiliano; Cariani, Alessia; Atanassova, Miroslava; Bekkevold, Dorte; Carvalho, Gary R; Espiñeira, Montserrat; Fiorentino, Fabio; Garofalo, Germana; Geffen, Audrey J; Hansen, Jakob H; Helyar, Sarah J; Nielsen, Einar E; Ogden, Rob; Patarnello, Tomaso; Stagioni, Marco; Tinti, Fausto; Bargelloni, Luca
Shallow population structure is generally reported for most marine fish and explained as a consequence of high dispersal, connectivity and large population size. Targeted gene analyses and more recently genome-wide studies have challenged such view, suggesting that adaptive divergence might occur even when neutral markers provide genetic homogeneity across populations. Here, 381 SNPs located in transcribed regions were used to assess large- and fine-scale population structure in the European hake (Merluccius merluccius), a widely distributed demersal species of high priority for the European fishery. Analysis of 850 individuals from 19 locations across the entire distribution range showed evidence for several outlier loci, with significantly higher resolving power. While 299 putatively neutral SNPs confirmed the genetic break between basins (F(CT) = 0.016) and weak differentiation within basins, outlier loci revealed a dramatic divergence between Atlantic and Mediterranean populations (F(CT) range 0.275-0.705) and fine-scale significant population structure. Outlier loci separated North Sea and Northern Portugal populations from all other Atlantic samples and revealed a strong differentiation among Western, Central and Eastern Mediterranean geographical samples. Significant correlation of allele frequencies at outlier loci with seawater surface temperature and salinity supported the hypothesis that populations might be adapted to local conditions. Such evidence highlights the importance of integrating information from neutral and adaptive evolutionary patterns towards a better assessment of genetic diversity. Accordingly, the generated outlier SNP data could be used for tackling illegal practices in hake fishing and commercialization as well as to develop explicit spatial models for defining management units and stock boundaries.
Wang, Hongtao; Li, Guisheng; Kwon, Woo-Saeng; Yang, Deok-Chun
Panax ginseng is one of the most valuable medicinal plants in the Orient. The low level of genetic variation has limited the application of molecular markers for cultivar authentication and marker-assisted selection in cultivated ginseng. To exploit DNA polymorphism within ginseng cultivars, ginseng expressed sequence tags (ESTs) were searched against the potential intron polymorphism (PIP) database to predict the positions of introns. Intron-flanking primers were then designed in conserved exon regions and used to amplify across the more variable introns. Sequencing results showed that single nucleotide polymorphisms (SNPs), as well as indels, were detected in four EST-derived introns, and SNP markers specific to "Gopoong" and "K-1" were first reported in this study. Based on cultivar-specific SNP sites, allele-specific polymerase chain reaction (PCR) was conducted and proved to be effective for the authentication of ginseng cultivars. Additionally, the combination of a simple NaOH-Tris DNA isolation method and real-time allele-specific PCR assay enabled the high throughput selection of cultivars from ginseng fields. The established real-time allele-specific PCR assay should be applied to molecular authentication and marker assisted selection of P. ginseng cultivars, and the EST intron-targeting strategy will provide a potential approach for marker development in species without whole genomic DNA sequence information.
Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-by-sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide valuable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish.
To estimate genetic diversity within and between 10 interfertile Cicer species (94 genotypes) from the primary, secondary and tertiary gene pool, we analysed 5,257 DArT markers and 651 KASPar SNP markers. Based on successful allele calling in the tertiary gene pool, 2,763 DArT and 624 SNP markers that are polymorphic between genotypes from the gene pools were analysed further. STRUCTURE analyses were consistent with 3 cultivated populations, representing kabuli, desi and pea-shaped seed types, with substantial admixture among these groups, while two wild populations were observed using DArT markers. AMOVA was used to partition variance among hierarchical sets of landraces and wild species at both the geographical and species level, with 61% of the variation found between species, and 39% within species. Molecular variance among the wild species was high (39%) compared to the variation present in cultivated material (10%). Observed heterozygosity was higher in wild species than in the cultivated species for each linkage group. Our results support the Fertile Crescent as the center of both domestication and diversification of chickpea. The collection used in the present study covers all three regions of historical chickpea cultivation, with the highest diversity in the Fertile Crescent region. Shared alleles between different gene pools suggest the possibility of gene flow among these species or incomplete lineage sorting and could indicate complicated patterns of divergence and fusion of wild chickpea taxa in the past.
Ilyasov, R A; Poskryakov, A V; Nikolenko, A G
Preservation of the gene pool of honeybee subspecies Apis mellifera mellifera is of vital importance for successful beekeeping development in the northern regions of Eurasia. An effective method of genotyping honeybee colonies used in modern science is the mapping of sites of single nucleotide polymorphism (SNP). The honeybee vitellogenin gene (Vg) encodes a protein that affects reproductive function, behavior, immunity, longevity, and social organization in the honeybee Apis mellifera and is therefore a topical research subject. The results of comparative analysis of honeybee Vg sequences show that there are 26 SNP sites that differentiate M and C evolutionary branches and can be used as markers in selective breeding, DNA-barcoding, and the creation of genetic passports for A. m. mellifera colonies.
Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick
Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP
Background: Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel-based methods have been reported for the detection of sequence variants; however, they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST-based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes. Results: A set of 418 EST-based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon-based markers were then used for genetic mapping with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10⁻¹². The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions, consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. Conclusion: The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker-assisted selection. In addition, the genetic markers based on ESTs allowed the construction
Labuschagne, Christiaan; Nupen, Lisa; Kotzé, Antoinette; Grobler, Paul J; Dalton, Desiré L
Captive management of ex situ populations of endangered species is traditionally based on pedigree information derived from studbook data. However, molecular methods could provide a powerful set of complementary tools to verify studbook records and also contribute to improving the understanding of the genetic status of captive populations. Here, we compare the utility of single nucleotide polymorphisms (SNPs) and microsatellites (MS) and two analytical methods for assigning parentage in ten families of captive African penguins held in South African facilities. We found that SNPs performed better than microsatellites under both analytical frameworks, but a combination of all markers was most informative. A subset of combined SNP (n = 14) and MS loci (n = 10) provided robust assessments of parentage. Captive or supportive breeding programs will play an important role in future African penguin conservation efforts as a source of individuals for reintroduction. Cooperation among these captive facilities is essential to facilitate this process and improve management. This study provided us with a useful set of SNP and MS markers for parentage and relatedness testing among these captive populations. Further assessment of the utility of these markers over multiple (>3) generations and the incorporation of a larger variety of relationships among individuals (e.g., half-siblings or cousins) is strongly suggested.
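One simple way SNP genotypes can be checked against studbook parentage is Mendelian exclusion, sketched below. This is an assumption for illustration: the abstract does not name its two analytical methods, and the function names and the 0/1/2 genotype coding (copies of the alternate allele) are hypothetical.

```python
def transmittable(genotype):
    """Alternate-allele counts (0 or 1) a parent with this genotype can pass on."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def trio_compatible(child, mother, father):
    """True if the child's genotype can arise from one allele of each parent."""
    return any(m + f == child
               for m in transmittable(mother)
               for f in transmittable(father))


def count_mismatches(child, mother, father):
    """Number of loci at which the claimed trio violates Mendelian inheritance.

    Each argument is a list of genotypes (0, 1 or 2), one entry per SNP.
    """
    return sum(not trio_compatible(c, m, f)
               for c, m, f in zip(child, mother, father))
```

A claimed parent pair with zero or very few mismatches is retained; in practice a small tolerance is usually allowed for genotyping error, and likelihood-based methods handle incomplete data more gracefully than strict exclusion.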
A. Janiak; M.Y. Kim; S.H. Lee
There are several strategies that can be applied in SNP discovery, for example the locus-specific amplification of target genome regions (Primmer et al., 2002; Van et al., 2004) or simultaneous assembly of anonymous sequences which are the product of whole genome shotgun sequencing (Webber and Myers, 1997) or reduced representation shotgun sequencing (Altshuler et al., 2000).
Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira
Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg. is the primary source of natural rubber that is native to the Amazon rainforest. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.
Cuenca, Jose; Aleza, Pablo; Garcia-Lor, Andres; Ollitrault, Patrick; Navarro, Luis
Alternaria brown spot (ABS) is a serious disease affecting susceptible citrus genotypes, which is a strong concern regarding citrus breeding programs. Resistance is conferred by a recessive locus (ABSr) previously located by our group within a 3.3 Mb genome region near the centromere in chromosome III. This work addresses fine-linkage mapping of this region for identifying candidate resistance genes and develops new molecular markers for ABS-resistance effective marker-assisted selection (MAS). Markers closely linked to ABSr locus were used for fine mapping using a 268-segregating diploid progeny derived from a heterozygous susceptible × resistant cross. Fine mapping limited the genomic region containing the ABSr resistance gene to 366 kb, flanked by markers at 0.4 and 0.7 cM. This region contains nine genes related to pathogen resistance. Among them, eight are resistance (R) gene homologs, with two of them harboring a serine/threonine protein kinase domain. These two genes along with a gene encoding a S-adenosyl-L-methionine-dependent-methyltransferase protein, should be considered as strong candidates for ABS-resistance. Moreover, the closest SNP was genotyped in 40 citrus varieties, revealing very high association with the resistant/susceptible phenotype. This new marker is currently used in our citrus breeding program for ABS-resistant parent and cultivar selection, at diploid, triploid and tetraploid level. PMID:28066498
Variations in the human genome (e.g., single nucleotide polymorphisms, SNPs) may be associated with hereditary diseases, their complications, comorbidities, and drug responses. Using the Web service SNP_TATA_Comparator presented in our previous paper, here we analyzed the immediate surroundings of known SNP markers of diseases and identified several candidate SNP markers that can significantly change the affinity of TATA-binding protein for human gene promoters, with circadian consequences. For example, rs572527200 may be related to asthma, where symptoms are circadian (worse at night), and rs367732974 may be associated with heart attacks that are characterized by a circadian preference (early morning). By the same method, we analyzed the 90 bp proximal promoter region of each protein-coding transcript of each human gene of the circadian clock core. This analysis yielded 53 candidate SNP markers, such as rs181985043 (susceptibility to acute Q fever in male patients), rs192518038 (higher risk of a heart attack in patients with diabetes), and rs374778785 (emphysema and lung cancer in smokers). If they are properly validated according to clinical standards, these candidate SNP markers may turn out to be useful for physicians (to select optimal treatment for each patient) and for the general population (to choose a lifestyle preventing possible circadian complications of diseases).
Preety Panwar; Manoj Nath; Vijay Kumar Yadav; Anil Kumar
Genetic relationships among 52 Eleusine coracana (finger millet) genotypes collected from different districts of Uttarakhand were investigated by using randomly amplified polymorphic DNA (RAPD), simple sequence repeat (SSR) and cytochrome P450 gene based markers. A total of 18 RAPD primers, 10 SSR primers, and 10 pairs of cytochrome P450 gene based markers, respectively, revealed 49.4%, 50.2% and 58.7% polymorphism in 52 genotypes of E. coracana. Mean polymorphic information content (PIC) for each of these marker systems (0.351 for RAPD, 0.505 for SSR and 0.406 for cyt P450 gene based markers) suggested that all the marker systems were effective in determining polymorphisms. Pair-wise similarity index values ranged from 0.011 to 0.999 (RAPD), 0.010 to 0.999 (SSR) and 0.001 to 0.998 (cyt P450 gene based markers) and mean similarity index value of 0.505, 0.504 and 0.499, respectively. The dendrogram developed by RAPD, SSR and cytochrome P450 gene based primers analyses revealed that the genotypes are grouped in different clusters according to high calcium (300–450 mg/100 g), medium calcium (200–300 mg/100 g) and low calcium (100–200 mg/100 g). Mantel test employed for detection of goodness of fit established cophenetic correlation values above 0.95 for all the three marker systems. The dendrograms and principal coordinate analysis (PCA) plots derived from the binary data matrices of the three marker systems are highly concordant. High bootstrap values were obtained at major nodes of phenograms through WINBOOT software. Comparison of RAPD, SSR and cytochrome P450 gene based markers, in terms of the quality of data output, indicated that SSRs and cyt P450 gene based markers are particularly promising for the analysis of plant genome diversity. The genotypes of finger millet collected from different districts of Uttarakhand constitute a wide genetic base and clustered according to calcium contents. The identified genotypes could be used in breeding programmes and
Tyrka, Mirosław; Tyrka, Dorota; Wędzony, Maria
Triticale (×Triticosecale Wittm) is an economically important crop for fodder and biomass production. To facilitate the identification of markers for agronomically important traits and for genetic and genomic characteristics of this species, a new high-density genetic linkage map of triticale was constructed using doubled haploid (DH) population derived from a cross between cultivars 'Hewo' and 'Magnat'. The map consists of 1615 bin markers, that represent 50 simple sequence repeat (SSR), 842 diversity array technology (DArT), and 16888 DArTseq markers mapped onto 20 linkage groups assigned to the A, B, and R genomes of triticale. No markers specific to chromosome 7R were found, instead mosaic linkage group composed of 1880 highly distorted markers (116 bins) from 10 wheat chromosomes was identified. The genetic map covers 4907 cM with a mean distance between two bins of 3.0 cM. Comparative analysis in respect to published maps of wheat, rye and triticale revealed possible deletions in chromosomes 4B, 5A, and 6A, as well as inversion in chromosome 7B. The number of bin markers in each chromosome varied from 24 in chromosome 3R to 147 in chromosome 6R. The length of individual chromosomes ranged between 50.7 cM for chromosome 2R and 386.2 cM for chromosome 7B. A total of 512 (31.7%) bin markers showed significant (P triticale will facilitate fine mapping of quantitative trait loci, the identification of candidate genes and map-based cloning.
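Segregation distortion in a doubled haploid population is commonly assessed with a chi-square goodness-of-fit test against the expected 1:1 ratio of parental alleles. The sketch below shows that test for a single marker. The significance threshold in the abstract is lost to truncation, so the critical value used here (3.841, corresponding to P = 0.05 with one degree of freedom) is an assumption, as are the function names.

```python
def segregation_chi2(n_a, n_b):
    """Chi-square statistic for a 1:1 segregation of two parental alleles.

    n_a, n_b: number of DH lines carrying the allele of parent A and parent B.
    """
    n = n_a + n_b
    if n == 0:
        raise ValueError("no genotyped lines for this marker")
    expected = n / 2.0
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def is_distorted(n_a, n_b, critical=3.841):
    """True if the marker deviates from 1:1 at the assumed threshold (df = 1)."""
    return segregation_chi2(n_a, n_b) > critical
```

For example, a marker scored as 60:40 among 100 lines gives a statistic of 4.0 and would be flagged, while 55:45 gives 1.0 and would not.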
Feng, Chen; Feng, Chao; Kang, Ming
Primulina eburnea is a promising candidate for domestication and floriculture, since it is easy to culture and has beautiful flowers. An F₂ population of 189 individuals was established for the construction of first-generation linkage maps based on expressed sequence tags-derived single-nucleotide polymorphism markers using the massARRAY genotyping platform. Of the 232 screened markers, 215 were assigned to 18 LG according to the haploid number of chromosomes in the species. The linkage map spanned a total of 3774.7 cM with an average distance of 17.6 cM between adjacent markers. This linkage map provides a framework for identification of important genes in breeding programmes.
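The reported average spacing can be checked with simple arithmetic. The sketch below reproduces the 17.6 cM figure as total map length divided by the number of mapped markers; the second value, which divides by the number of marker intervals instead, is an alternative convention shown for comparison and is not taken from the abstract.

```python
total_length_cm = 3774.7   # total map length from the abstract
mapped_markers = 215       # markers assigned to linkage groups
linkage_groups = 18        # number of linkage groups (LG)

# Convention that matches the reported figure: length per mapped marker.
cm_per_marker = total_length_cm / mapped_markers

# Alternative convention (an assumption, not from the abstract): each linkage
# group with k markers has k - 1 intervals, so there are 215 - 18 intervals.
cm_per_interval = total_length_cm / (mapped_markers - linkage_groups)

print(f"per marker:   {cm_per_marker:.1f} cM")
print(f"per interval: {cm_per_interval:.1f} cM")
```

The two conventions differ by under 2 cM here, but the gap grows for sparse maps with many linkage groups, so it is worth stating which one is used when comparing maps.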
Seedlessness, flavor, and color are top priorities for mandarin (Citrus reticulata Blanco) cultivar improvement. Given long juvenility, large tree size, and high breeding cost, marker-assisted selection (MAS) may be an expeditious and economical approach to these challenges. The objectives of this s...
Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Genetic diversity, population structure, and genome-wide marker-trait association analyses were conducted on a special collection of 298 homozygous lettuce (Lactuca sativa L.) lines. Each of these lines was derived from a single plant that had been genotyped with 384 SNP markers using LSGermOPA. They...
Pal, Lipika R; Kundu, Kunal; Yin, Yizhou; Moult, John
Understanding the basis of complex trait disease is a fundamental problem in human genetics. The CAGI Crohn's Exome challenges are providing insight into the adequacy of current disease models by requiring participants to identify which of a set of individuals has been diagnosed with the disease, given exome data. For the CAGI4 round, we developed a method that used the genotypes from exome sequencing data only to impute the status of genome wide association studies marker SNPs. We then used the imputed genotypes as input to several machine learning methods that had been trained to predict disease status from marker SNP information. We achieved the best performance using Naïve Bayes and with a consensus machine learning method, obtaining an area under the curve of 0.72, larger than other methods used in CAGI4. We also developed a model that incorporated the contribution from rare missense variants in the exome data, but this performed less well. Future progress is expected to come from the use of whole genome data rather than exomes. © 2017 Wiley Periodicals, Inc.
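The area under the ROC curve quoted above can be read as the probability that a randomly chosen affected individual receives a higher predicted score than a randomly chosen unaffected one. The sketch below computes it directly from that definition. The function name and the toy labels and scores in the usage note are hypothetical, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for affected, 0 for unaffected
    scores: predicted disease score for each individual (higher = more likely)
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For instance, `roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` ranks three of the four affected/unaffected pairs correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.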
Qionglin Huang; Zhonggang Duan; Jinfen Yang; Xinye Ma; Ruoting Zhan; Hui Xu; Weiwen Chen
Amomum villosum Lour., produced from Yangchun, Guangdong Province, China, is a Daodi medicinal material of Amomi Fructus in traditional Chinese medicine. This herb germplasm should be accurately identified and collected to ensure its quality and safety in medication. In the present study, single nucleotide polymorphism typing method was evaluated on the basis of DNA barcoding markers to identify the germplasm of Amomi Fructus. Genomic DNA was extracted from the leaves of 29 landraces represen...
Fitak, Robert R.; Naidu, Ashwin; Thompson, Ron W.; Culver, Melanie
Pumas Puma concolor are one of the most studied terrestrial carnivores because of their widespread distribution, substantial ecological impacts, and conflicts with humans. Over the past decade, managing pumas has involved extensive efforts including the use of genetic methods. Microsatellites have been the most commonly used genetic markers; however, technical artifacts and little overlap of frequently used loci render large-scale comparison of puma genetic data across studies challenging. Therefore, a panel of genetic markers that can produce consistent genotypes across studies without the need for extensive calibrations is essential for range-wide genetic management of puma populations. Here, we describe the development of PumaPlex, a high-throughput assay to genotype 25 single nucleotide polymorphisms in pumas. We validated PumaPlex in 748 North American pumas Puma concolor couguar, and demonstrated its ability to generate reproducible genotypes and accurately identify individuals. Furthermore, in a test using fecal deoxyribonucleic acid (DNA) samples, we found that PumaPlex produced significantly more genotypes with fewer errors than 12 microsatellite loci, 8 of which are commonly used. Our results demonstrate that PumaPlex is a valuable tool for the genetic monitoring and management of North American puma populations. Given the analytical simplicity, reproducibility, and high-throughput capability of single nucleotide polymorphisms, PumaPlex provides a standard panel of markers that promotes the comparison of genotypes across studies and independent of the genotyping technology used.
Kim, Sung Min; Yoo, Seong Yeon; Nam, Soo Hyun; Lee, Jae Moon; Chung, Ki Wha
Analysis of large numbers of single-nucleotide polymorphisms (SNPs) can increase individual discrimination power, and, particularly, it can supply important evidence for kinship or ethnic identification. We identified 300 Korean-specific SNPs from 306 Korean whole-exome sequencing (WES) data. Functionally significant SNPs (variants in splicing site, missense, nonsense, and exonic indels) were filtered out from the variant pool, and SNPs with minor allele frequencies (MAFs) of 0.3 in the Korean population were selected. Genotypes obtained from WES were confirmed by the Sanger sequencing method. The identified markers were evenly distributed throughout the autosomal chromosomes. All the SNPs were in the Hardy-Weinberg equilibrium with a mean MAF of 0.415 (0.161 in 1000G). The mean heterozygosities were 0.476 (observed) and 0.470 (experimental). The combined power of discrimination was very high. Korean MAFs in most SNPs were similar to those for the Chinese and Japanese populations, but were significantly higher than those for several other ethnic populations. These selected SNPs will be used to develop forensic markers and are expected to be widely used for additional individual identification, ethnic discrimination, and linkage analysis for kinship tests.
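To see why a panel of many intermediate-frequency SNPs gives a very high combined power of discrimination, the sketch below computes the per-locus power of discrimination (PD) for a biallelic SNP and combines it across loci. It assumes Hardy-Weinberg proportions and independence between loci; the function names are hypothetical and the calculation is illustrative rather than the authors' own.

```python
def power_of_discrimination(maf):
    """PD of one biallelic SNP: 1 minus the chance two random people match.

    Genotype frequencies are taken from Hardy-Weinberg proportions (assumed).
    """
    p, q = 1.0 - maf, maf
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return 1.0 - sum(f * f for f in genotype_freqs)


def combined_power_of_discrimination(mafs):
    """Combined PD over independent loci: 1 minus the product of match chances."""
    match_probability = 1.0
    for maf in mafs:
        match_probability *= 1.0 - power_of_discrimination(maf)
    return 1.0 - match_probability
```

A single SNP at a minor allele frequency of 0.5 has a PD of only 0.625, but because the match probabilities multiply, the combined value approaches 1 rapidly as loci are added.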
Wang, X Q; Zhao, L; Eaton, D A R; Li, D Z; Guo, Z H
Phylogenetic relationships among temperate species of bamboo are difficult to resolve, owing to both the challenge of detecting sufficiently variable markers and their polyploid history. Here, we use restriction site-associated DNA sequencing to identify candidate loci with fixed allelic differences segregating between and within two temperate species of bamboos: Arundinaria faberi and Yushania brevipaniculata. Approximately 27 million paired-end sequencing reads were generated across four samples. From pooled data, we assembled 67 685 and 70 668 de novo contigs from partial overlap among paired-end reads, with an average length of 240 and 241 bp for the two species, respectively, which were used to investigate functional classification of RAD tags in a blastx search. Analysed separately by population, we recovered 29 443 putatively orthologous RAD tags shared across the four sampled populations, containing 28 023 sequence variants, of which c. 13 000 are segregating between species, and c. 3000 segregating between populations within each species. Analyses based on these RAD tags yielded robust phylogenetic inferences, even with data set constructed from surprisingly few loci. This study illustrates the potential for reduced-representation genome data to resolve difficult phylogenetic relationships in temperate bamboos. © 2013 John Wiley & Sons Ltd.
Ong, P W; Maizura, I; Abdullah, N A P; Rafii, M Y; Ooi, L C L; Low, E T L; Singh, R
The genetic evaluation of oil palm germplasm collections is required for insight into the variability among populations. The information obtained is also useful for incorporating new genetic materials into current breeding programs. Single nucleotide polymorphisms (SNPs) have been widely used in many plant genetic studies due to the availability of large numbers of genomic sequences and expressed sequence tags. The present study examined 219 oil palms collected from two natural Angolan populations, a few hundred kilometers apart. A total of 62 SNPs were designed from oil palm genomic sequences and converted to cleaved amplified polymorphic sequence (CAPS) markers. Of these, nine were found to be informative across the two populations. The nine informative SNPs revealed a mean major allele frequency of 0.693. The average expected and observed heterozygosities were 0.398 and 0.400, respectively. The mean polymorphism information content was 0.315 (ranging between 0.223 and 0.375). None of the loci deviated from Hardy-Weinberg equilibrium and no rare alleles were detected. In cluster analysis using the unweighted pair group method with arithmetic mean (UPGMA), the 219 oil palms fell into two clusters. This was further supported by the population structure analysis result (K = 2), suggesting that the samples were divided into two main genetic groups. However, the two groups did not coincide with the geographic populations. Analysis of molecular variance indicated that within-population variation contributed 93% of the total genetic variation. This study showed that SNP-based CAPS markers are useful for studying the genetic diversity of oil palm and have potential application for marker-trait association studies.
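The diversity statistics quoted above follow from allele frequencies. The sketch below gives the standard formulas for expected heterozygosity and polymorphism information content (PIC) at one locus; the function names are hypothetical and the frequencies in the usage note are illustrative, not data from the study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(f * f for f in freqs)


def polymorphism_information_content(freqs):
    """PIC: expected heterozygosity minus a correction for uninformative matings.

    PIC = 1 - sum(p_i^2) - sum over pairs i < j of 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(f * f for f in freqs)
    correction = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            correction += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - correction
```

For a biallelic SNP with both alleles at frequency 0.5, the expected heterozygosity is 0.5 and the PIC is 0.375, which is the theoretical maximum for two alleles and equals the upper end of the PIC range reported above.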
Edea, Zewdu; Dadi, Hailu; Kim, Sang-Wook; Dessie, Tadelle; Lee, Taeheon; Kim, Heebal; Kim, Jong-Joo; Kim, Kwan-Suk
In total, 166 individuals from five indigenous Ethiopian cattle populations - Ambo (n = 27), Borana (n = 35), Arsi (n = 30), Horro (n = 36), and Danakil (n = 38) - were genotyped for 8773 single nucleotide polymorphism (SNP) markers to assess genetic diversity, population structure, and relationships. As a representative of taurine breeds, Hanwoo cattle (n = 40) were also included in the study for reference. Among Ethiopian cattle populations, the proportion of SNPs with minor allele frequencies (MAFs) ≥0.05 ranged from 81.63% in Borana to 85.30% in Ambo, with a mean of 83.96% across all populations. The Hanwoo breed showed the highest proportion of polymorphism, with MAFs ≥0.05, accounting for 95.21% of total SNPs. The mean expected heterozygosity varied from 0.370 in Danakil to 0.410 in Hanwoo. The mean genetic differentiation (FST; 1%) in Ethiopian cattle revealed that within-individual variation accounted for approximately 99% of the total genetic variation. As expected, FST and Reynolds genetic distance were greatest between Hanwoo and Ethiopian cattle populations, with average values of 17.62 and 18.50, respectively. The first and second principal components explained approximately 78.33% of the total variation and supported the clustering of the populations according to their historical origins. At K = 2 and 3, a considerable source of variation among cattle is the clustering of the populations into Hanwoo (taurine) and Ethiopian cattle populations. The low estimate of genetic differentiation (FST) among Ethiopian cattle populations indicated that differentiation among these populations is low, possibly owing to a common historical origin and high gene flow. Genetic distance, phylogenetic tree, principal component analysis, and population structure analyses clearly differentiated the cattle populations according to their historical origins, and confirmed that Ethiopian cattle populations are genetically distinct from the Hanwoo breed.
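The "proportion of SNPs with MAF ≥ 0.05" statistic can be computed from a genotype matrix as sketched below. The data layout (one list per SNP, genotypes coded as 0, 1 or 2 copies of an arbitrary reference allele, `None` for missing calls) and the function names are assumptions made for illustration.

```python
def minor_allele_frequency(genotypes):
    """MAF of one SNP from genotypes coded 0/1/2; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    allele_freq = sum(called) / (2.0 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(allele_freq, 1.0 - allele_freq)


def proportion_polymorphic(snp_matrix, threshold=0.05):
    """Fraction of SNPs whose MAF is at or above the threshold."""
    passing = sum(minor_allele_frequency(row) >= threshold for row in snp_matrix)
    return passing / len(snp_matrix)
```

Running this per population gives the kind of comparison reported above. Part of any difference between breeds can reflect which populations were used to discover the SNPs on the array, not only their underlying diversity.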
Hybrid zones are noteworthy systems for the study of environmental adaptation to fast-changing environments, as they constitute reservoirs of polymorphism and are key to the maintenance of biodiversity. They can move in relation to climate fluctuations, as temperature can affect both selection and migration, or remain trapped by environmental and physical barriers. There is therefore a very strong incentive to study the dynamics of hybrid zones subjected to climate variations. The infaunal bivalve Macoma balthica emerges as a noteworthy model species, as divergent lineages hybridize, and its native NE Atlantic range is currently contracting to the North. To investigate the dynamics and functioning of hybrid zones in M. balthica, we developed new molecular markers by sequencing the collective transcriptome of 30 individuals. Ten individuals were pooled for each of the three populations sampled at the margins of two hybrid zones. A single 454 run generated 277 Mb from which 17K SNPs were detected. SNP density averaged 1 polymorphic site every 14 to 19 bases, for mitochondrial and nuclear loci, respectively. An [Formula: see text] scan detected high genetic divergence among several hundred SNPs, some of them involved in energetic metabolism, cellular respiration and physiological stress. The high population differentiation, recorded for nuclear-encoded ATP synthase and NADH dehydrogenase as well as most mitochondrial loci, suggests cytonuclear genetic incompatibilities. Results from this study will help pave the way to a high-resolution study of hybrid zone dynamics in M. balthica, and the relative importance of endogenous and exogenous barriers to gene flow in this system.
Butler, John M.; Budowle, B.; Gill, P.;
Six scientists presented their views and experience with single nucleotide polymorphism (SNP) markers, multiplexes, and methods regarding their potential application in forensic identity and relationship testing. Benefits and limitations of SNPs were reviewed, as were different SNP marker categor...
Background: The eastern oyster, Crassostrea virginica (Gmelin 1791), is an economically important species cultured in many areas in North America. It is also ecologically important because of the impact of its filter feeding behaviour on water quality. Populations of C. virginica have been threatened by overfishing, habitat degradation, and diseases. Through genome research, strategies are being developed to reverse its population decline. However, large-scale expressed sequence tag (EST) resources have been lacking for this species. Efficient generation of EST resources from this species has been hindered by a high redundancy of transcripts. The objectives of this study were to construct a normalized cDNA library for efficient EST analysis, to generate thousands of ESTs, and to analyze the ESTs for microsatellites and potential single nucleotide polymorphisms (SNPs). Results: A normalized and subtracted C. virginica cDNA library was constructed from pooled RNA isolated from hemocytes, mantle, gill, gonad and digestive tract, muscle, and a whole juvenile oyster. A total of 6,528 clones were sequenced from this library generating 5,542 high-quality EST sequences. Cluster analysis indicated the presence of 635 contigs and 4,053 singletons, generating a total of 4,688 unique sequences. About 46% (2,174) of the unique ESTs had significant hits (E-value ≤ 1e-05) to the non-redundant protein database; 1,104 of which were annotated using Gene Ontology (GO) terms. A total of 35 microsatellites were identified from the ESTs, with 18 having sufficient flanking sequences for primer design. A total of 6,533 putative SNPs were also identified using all existing and the newly generated EST resources of the eastern oysters. Conclusion: A high quality normalized cDNA library was constructed. A total of 5,542 ESTs were generated representing 4,688 unique sequences. Putative microsatellite and SNP markers were identified. These genome resources provide the
Restriction-site associated DNA sequencing (RAD-seq) and related methods are revolutionizing the field of population genomics in non-model organisms as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD-seq data analyses rely on assumptions on nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under- or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as case study, here we explore the sensitivity of population structure inferences to two crucial aspects in RAD-seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel, but, most importantly, provides insights into the effects of alternative RAD-seq data analysis strategies on population structure inferences that are directly applicable to other species.
Gilbey, John; Cauwelier, Eef; Coulson, Mark W.; Stradmeyer, Lee; Sampayo, James N.; Armstrong, Anja; Verspoor, Eric; Corrigan, Laura; Shelley, Jonathan; Middlemas, Stuart
Understanding the habitat use patterns of migratory fish, such as Atlantic salmon (Salmo salar L.), and the natural and anthropogenic impacts on them, is aided by the ability to identify individuals to their stock of origin. Presented here are the results of an analysis of informative single nucleotide polymorphic (SNP) markers for detecting genetic structuring in Atlantic salmon in Scotland and NE England and their ability to allow accurate genetic stock identification. 3,787 fish from 147 sites covering 27 rivers were screened at 5,568 SNP markers. In order to identify a cost-effective subset of SNPs, they were ranked according to their ability to differentiate between fish from different rivers. A panel of 288 SNPs was used to examine both individual assignments and mixed stock fisheries and eighteen assignment units were defined. The results improved greatly on previously available methods and, for the first time, fish caught in the marine environment can be confidently assigned to geographically coherent units within Scotland and NE England, including individual rivers. As such, this SNP panel has the potential to aid understanding of the various influences acting upon Atlantic salmon on their marine migrations, be they natural environmental variations and/or anthropogenic impacts, such as mixed stock fisheries and interactions with marine power generation installations. PMID:27723810
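The abstract does not say which statistic was used to rank SNPs by their power to separate rivers. As a hedged illustration only, the sketch below ranks SNPs by Nei's GST computed from per-population allele frequencies with equal population weights, then keeps the top n. The function names, SNP labels, and frequencies are hypothetical.

```python
def nei_gst(pop_freqs):
    """Nei's GST for one biallelic SNP.

    pop_freqs holds the reference-allele frequency in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    p_bar = sum(pop_freqs) / len(pop_freqs)
    h_total = 2 * p_bar * (1 - p_bar)          # heterozygosity of the pooled sample
    if h_total == 0:
        return 0.0                             # monomorphic everywhere
    h_within = sum(2 * p * (1 - p) for p in pop_freqs) / len(pop_freqs)
    return (h_total - h_within) / h_total

def top_snps(freq_table, n):
    """Return the n SNP names with the highest GST, most differentiated first."""
    ranked = sorted(freq_table, key=lambda snp: nei_gst(freq_table[snp]), reverse=True)
    return ranked[:n]

# Hypothetical frequencies for three SNPs in two rivers.
freq_table = {
    "snp_a": [0.5, 0.5],   # no differentiation
    "snp_b": [0.0, 1.0],   # fixed difference
    "snp_c": [0.4, 0.6],   # weak differentiation
}
```

Here `top_snps(freq_table, 2)` keeps `snp_b` and `snp_c`. A real panel design would also need to account for sample sizes and linkage between markers, which this sketch ignores.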
Li, Feng; Hasegawa, Yoichi; Saito, Masako; Shirasawa, Sachiko; Fukushima, Aki; Ito, Toyoaki; Fujii, Hiroshi; Kishitani, Sachie; Kitashiba, Hiroyasu; Nishio, Takeshi
A linkage map of expressed sequence tag (EST)-based markers in radish (Raphanus sativus L.) was constructed using a low-cost and high-efficiency single-nucleotide polymorphism (SNP) genotyping method named multiplex polymerase chain reaction-mixed probe dot-blot analysis developed in this study. Seven hundred and forty-six SNP markers derived from EST sequences of R. sativus were assigned to nine linkage groups with a total length of 806.7 cM. By BLASTN, 726 markers were found to have homologous genes in Arabidopsis thaliana, and 72 syntenic regions, which have great potential for utilizing genomic information of the model species A. thaliana in basic and applied genetics of R. sativus, were identified. By construction and analysis of the genome structures of R. sativus based on the 24 genomic blocks within the Brassicaceae ancestral karyotype, 23 of the 24 genomic blocks were detected in the genome of R. sativus, and half of them were found to be triplicated. Comparison of the genome structure of R. sativus with those of the A, B, and C genomes of Brassica species and that of Sinapis alba L. revealed extensive chromosome homoeology among Brassiceae species, which would facilitate transfer of the genomic information from one Brassiceae species to another.
Xiao, Yong; Zhou, Lixia; Xia, Wei; Mason, Annaliese S; Yang, Yaodong; Ma, Zilong; Peng, Ming
The oil palm (Elaeis guineensis, 2n = 32) has the highest oil yield of any crop species and is the richest dietary source of provitamin A. For this tropical species, the optimal mean growth temperature is about 27°C, with a minimal growth temperature of 15°C. Hence, plantations are limited to latitudes between 10°N and 10°S. Enhancing cold tolerance will increase the total cultivation area and subsequently the oil productivity of this tropical species. Developing molecular markers related to cold tolerance would be helpful for molecular breeding of cold-tolerant Elaeis guineensis. In total, 5791 gene-based SSRs were identified in 51,452 expressed sequences from Elaeis guineensis transcriptome data: approximately one SSR was detected per 10 expressed sequences. Of these 5791 gene-based SSRs, 916 were derived from expressed sequences up- or down-regulated at least two-fold in response to cold stress. A total of 182 polymorphic markers were developed and characterized from 442 primer pairs flanking these cold-responsive SSR repeats. The polymorphic information content (PIC) of these polymorphic SSR markers across 24 lines of Elaeis guineensis varied from 0.08 to 0.65 (mean = 0.31 ± 0.12). Using in-silico mapping, 137 (75.3%) of the 182 polymorphic SSR markers were located onto the 16 Elaeis guineensis chromosomes. Total coverage of 473 Mbp was achieved, with an average physical distance of 3.4 Mbp between adjacent markers (range 96 bp - 20.8 Mbp). Meanwhile, comparative analysis of the transcriptome under cold stress revealed that one ICE1 putative ortholog, five CBF putative orthologs, 19 NAC transcription factors and four cold-induced orthologs were up-regulated at least two-fold in response to cold stress. Interestingly, the 5' untranslated regions of both Unigene21287 (ICE1) and CL2628.Contig1 (NAC) contained an SSR marker. In the present study, a series of SSR markers were developed based on sequences
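The polymorphic information content (PIC) reported above is commonly computed from allele frequencies with the formula of Botstein et al. (1980): PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not state which formula the authors used, so the following Python sketch is an assumption, and the function name is hypothetical.

```python
from itertools import combinations

def polymorphic_information_content(freqs):
    """PIC of one marker from its allele frequencies (Botstein et al. 1980).

    freqs must sum to 1; a ValueError is raised otherwise.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction
```

For a marker with two equally frequent alleles this gives 0.375, the maximum for a biallelic locus; values above that, such as the 0.65 upper bound quoted in the abstract, require more than two alleles.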
Genome-wide association studies (GWAS) have identified multiple single nucleotide polymorphisms (SNPs) associated with prostate cancer risk. However, whether these associations can be consistently replicated, vary with disease aggressiveness (tumor stage and grade) and/or interact with non-genetic potential risk factors or other SNPs is unknown. We therefore genotyped 39 SNPs from regions identified by several prostate cancer GWAS in 10,501 prostate cancer cases and 10,831 controls from the NCI Breast and Prostate Cancer Cohort Consortium (BPC3). We replicated 36 out of 39 SNPs (P-values ranging from 0.01 to 10⁻²⁸). Two SNPs located near KLK3 associated with PSA levels showed differential association with Gleason grade (rs2735839, P = 0.0001 and rs266849, P = 0.0004; case-only test), where the alleles associated with decreasing PSA levels were inversely associated with low-grade (as defined by Gleason grade < 8) tumors but positively associated with high-grade tumors. No other SNP showed differential associations according to disease stage or grade. We observed no effect modification by SNP for association with age at diagnosis, family history of prostate cancer, diabetes, BMI, height, smoking or alcohol intake. Moreover, we found no evidence of pair-wise SNP-SNP interactions. While these SNPs represent new independent risk factors for prostate cancer, we saw little evidence for effect modification by other SNPs or by the environmental factors examined.
Qi, L L; Talukder, Z I; Hulke, B S; Foley, M E
Diagnostic DNA markers are an invaluable resource in breeding programs for successful introgression and pyramiding of disease resistance genes. Resistance to downy mildew (DM) disease in sunflower is mediated by Pl genes which are known to be effective against the causal fungus, Plasmopara halstedii. Two DM resistance genes, PlArg and Pl8, are highly effective against P. halstedii races in the USA, and have been previously mapped to the sunflower linkage groups (LGs) 1 and 13, respectively, using simple sequence repeat (SSR) markers. In this study, we developed high-density single nucleotide polymorphism (SNP) maps encompassing the PlArg and Pl8 genes and identified diagnostic SNP markers closely linked to these genes. The specificity of the diagnostic markers was validated in a highly diverse panel of 548 sunflower lines. Dissection of a large marker cluster co-segregated with PlArg revealed that the closest SNP markers NSA_007595 and NSA_001835 delimited PlArg to an interval of 2.83 Mb on the LG1 physical map. The SNP markers SFW01497 and SFW06597 delimited Pl8 to an interval of 2.85 Mb on the LG13 physical map. We also developed sunflower lines with homozygous, three-gene pyramids carrying PlArg, Pl8, and the sunflower rust resistance gene R12 using the linked SNP markers from a segregating F2 population of RHA 340 (carrying Pl8)/RHA 464 (carrying PlArg and R12). The high-throughput diagnostic SNP markers developed in this study will facilitate marker-assisted selection breeding, and the pyramided sunflower lines will provide durable resistance to downy mildew and rust diseases.
Yu, Long-Xi; Zheng, Ping; Bhamidimarri, Suresh; Liu, Xiang-Ping; Main, Dorie
Verticillium wilt (VW) of alfalfa is a soilborne disease causing severe yield loss in alfalfa. To identify molecular markers associated with VW resistance, we used an integrated framework of genome-wide association study (GWAS) with high-throughput genotyping by sequencing (GBS) to identify loci associated with VW resistance in an F1 full-sib alfalfa population. Phenotyping was performed using manual inoculation of the pathogen to cloned plants of each individual and disease severity was scored using a standard scale. Genotyping was done by GBS, followed by genotype calling using three bioinformatics pipelines including the TASSEL-GBS pipeline (TASSEL), the Universal Network Enabled Analysis Kit (UNEAK), and the haplotype-based FreeBayes pipeline (FreeBayes). The resulting numbers of SNPs, marker density, minor allele frequency (MAF) and heterozygosity were compared among the pipelines. The TASSEL pipeline generated more markers with the highest density and MAF, whereas the highest heterozygosity was obtained by the UNEAK pipeline. The FreeBayes pipeline generated tetraploid genotypes, with the least number of markers. SNP markers generated from each pipeline were used independently for marker-trait association. Markers significantly associated with VW resistance identified by each pipeline were compared. Similar marker loci were found on chromosomes 5, 6, and 7, whereas different loci on chromosome 1, 2, 3, and 4 were identified by different pipelines. Most significant markers were located on chromosome 6 and they were identified by all three pipelines. Of those identified, several loci were linked to known genes whose functions are involved in the plants’ resistance to pathogens. Further investigation on these loci and their linked genes would provide insight into understanding molecular mechanisms of VW resistance in alfalfa. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance
Terracciano, Irma; Maccaferri, Marco; Bassi, Filippo; Mantovani, Paola; Sanguineti, Maria C; Salvi, Silvio; Simková, Hana; Doležel, Jaroslav; Massi, Andrea; Ammar, Karim; Kolmer, James; Tuberosa, Roberto
Leaf rust (Puccinia triticina Eriks. & Henn.) is a major disease affecting durum wheat production. The Lr14a-resistant gene present in the durum wheat cv. Creso and its derivative cv. Colosseo is one of the best characterized leaf-rust resistance sources deployed in durum wheat breeding. Lr14a has been mapped close to the simple sequence repeat markers gwm146, gwm344 and wmc10 in the distal portion of the chromosome arm 7BL, a gene-dense region. The objectives of this study were: (1) to enrich the Lr14a region with single nucleotide polymorphisms (SNPs) and high-resolution melting (HRM)-based markers developed from conserved ortholog set (COS) genes and from sequenced Diversity Array Technology (DArT®) markers; (2) to further investigate the gene content and colinearity of this region with the Brachypodium and rice genomes. Ten new COS-SNP and five HRM markers were mapped within an 8.0 cM interval spanning Lr14a. Two HRM markers pinpointed the locus in an interval of COS-SNPs were mapped 2.1-4.1 cM distal to Lr14a. Each marker was tested for its capacity to predict the state of Lr14a alleles (in particular, Lr14-Creso associated to resistance) in a panel of durum wheat elite germplasm including 164 accessions. Two of the most informative markers were converted into KASPar® markers. Single assay markers ubw14 and wPt-4038-HRM designed for agarose gel electrophoresis/KASPar® assays and high-resolution melting analysis, respectively, as well as the double-marker combinations ubw14/ubw18, ubw14/ubw35 and wPt-4038-HRM-ubw35 will be useful for germplasm haplotyping and for molecular-assisted breeding.
Ponomarenko, Petr; Chadaeva, Irina; Rasskazov, Dmitry A; Sharypova, Ekaterina; Kashina, Elena V; Drachkova, Irina; Zhechev, Dmitry; Ponomarenko, Mikhail P; Savinkova, Ludmila K; Kolchanov, Nikolay
While year after year, conditions, quality, and duration of human lives have been improving due to the progress in science, technology, education, and medicine, only eight diseases have been increasing in prevalence and shortening human lives because of premature deaths according to the retrospective official review on the state of US health, 1990-2010. These diseases are kidney cancer, chronic kidney diseases, liver cancer, diabetes, drug addiction, poisoning cases, consequences of falls, and Alzheimer's disease (AD) as one of the leading pathologies. There are familial AD of hereditary nature (~4% of cases) and sporadic AD of unclear etiology (remaining ~96% of cases; i.e., non-familial AD). Therefore, sporadic AD is no longer a purely medical problem, but rather a social challenge when someone asks oneself: "What can I do in my own adulthood to reduce the risk of sporadic AD at my old age to save the years of my lifespan from the destruction caused by it?" Here, we combine two computational approaches for regulatory SNPs: Web service SNP_TATA_Comparator for sequence analysis and a PubMed-based keyword search for articles on the biochemical markers of diseases. Our purpose was to try to find answers to the question: "What can be done in adulthood to reduce the risk of sporadic AD in old age to prevent the lifespan reduction caused by it?" As a result, we found 89 candidate SNP markers of familial and sporadic AD (e.g., rs562962093 is associated with sporadic AD in the elderly as a complication of stroke in adulthood, where natural marine diets can reduce risks of both diseases in case of the minor allele of this SNP). In addition, rs768454929, and rs761695685 correlate with sporadic AD as a comorbidity of short stature, where maximizing stature in childhood and adolescence as an integral indicator of health can minimize (or even eliminate) the risk of sporadic AD in the elderly. After validation by clinical protocols, these candidate SNP markers may become
Among SNP markers that become increasingly valuable in molecular breeding of crop plants are the CAP and dCAP markers derived from the genes of interest. To date, the number of such gene-based markers is small in polyploid crop plants such as tetraploid cotton that has A and D subgenomes. The obje...
Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar
Mango (Mangifera indica L.) is called “king of fruits” due to its sweetness, richness of taste, diversity, large production volume and a variety of end usage. Despite its huge economic importance genomic resources in mango are scarce and genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties ‘Neelam’, ‘Dashehari’ and their hybrid ‘Amrapali’ using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. Total 5,465 SSR loci were identified in 4,912 unigenes with 288 type I SSR (n ≥ 20 bp). One hundred type I SSR markers were randomly selected of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides useful genomic resource for genetic improvement of mango. PMID:27736892
de Miguel Marina
Background: Pinus pinaster Ait. is a major resin producing species in Spain. Genetic linkage mapping can facilitate marker-assisted selection (MAS) through the identification of Quantitative Trait Loci and selection of allelic variants of interest in breeding populations. In this study, we report annotated genetic linkage maps for two individuals (C14 and C15) belonging to a breeding program aiming to increase resin production. We use different types of DNA markers, including last-generation molecular markers. Results: We obtained 13 and 14 linkage groups for C14 and C15 maps, respectively. A total of 211 and 215 markers were positioned on each map and estimated genome length was between 1,870 and 2,166 cM respectively, which represents near 65% of genome coverage. Comparative mapping with previously developed genetic linkage maps for P. pinaster based on about 60 common markers enabled aligning linkage groups to this reference map. The comparison of our annotated linkage maps and linkage maps reporting QTL information revealed 11 annotated SNPs in candidate genes that co-localized with previously reported QTLs for wood properties and water use efficiency. Conclusions: This study provides genetic linkage maps from a Spanish population that shows high levels of genetic divergence with French populations from which segregating progenies have been previously mapped. These genetic maps will be of interest to construct a reliable consensus linkage map for the species. The importance of developing functional genetic linkage maps is highlighted, especially when working with breeding populations for its future application in MAS for traits of interest.
Wheat leaf rust is an important disease worldwide. Growing resistant cultivars is an effective means to control the disease. In the present study, 244 recombinant inbred lines from Zhou 8425B/Chinese Spring cross were phenotyped for leaf rust severities during the 2011–2012, 2012–2013, 2013–2014, and 2014–2015 cropping seasons at Baoding, Hebei province, and 2012–2013 and 2013–2014 cropping seasons in Zhoukou, Henan province. The population was genotyped using the high-density Illumina iSelect 90K SNP assay and SSR markers. Inclusive composite interval mapping identified eight QTL, designated as QLr.hebau-2AL, QLr.hebau-2BS, QLr.hebau-3A, QLr.hebau-3BS, QLr.hebau-4AL, QLr.hebau-4B, QLr.hebau-5BL, and QLr.hebau-7DS, respectively. QLr.hebau-2BS, QLr.hebau-3A, QLr.hebau-3BS, and QLr.hebau-5BL were derived from Zhou 8425B, whereas the other four were from Chinese Spring. Three stable QTL on chromosomes 2BS, 4B and 7DS explained 7.5–10.6%, 5.5–24.4%, and 11.2–20.9% of the phenotypic variance, respectively. QLr.hebau-2BS in Zhou 8425B might be the same as LrZH22 in Zhoumai 22; QLr.hebau-4B might be the residual resistance of Lr12, and QLr.hebau-7DS is Lr34. QLr.hebau-2AL, QLr.hebau-3BS, QLr.hebau-4AL, and QLr.hebau-5BL are likely to be novel QTL for leaf rust. These QTL and their closely linked SNP and SSR markers can be used for fine mapping, candidate gene discovery, and marker-assisted selection in wheat breeding.
SNP markers: basic concepts, applications in animal breeding and management and perspectives for the future
Alexandre Rodrigues Caetano
molecular markers to characterize genetic resources and generate tools for animal breeding and management date from the end of the 80s. In the last 20 years the technologies to generate molecular data went through several innovation cycles. The last wave of technological innovations represents a true revolution, bringing methods to identify and genotype SNP (Single Nucleotide Polymorphism) markers on a large scale. High density DNA chips were generated to genotype from tens of thousands to hundreds of thousands of SNPs in a single assay. Furthermore, other medium density technologies allow for the genotyping of tens to hundreds of markers, in high numbers of samples, with very high speed and automation. These new technologies allowed for the generation of new applications, such as the methods to genetically evaluate and select animals based on their Genomic Value (Genomic Estimated Breeding Value - GEBV). The statistical methods for genomic evaluation and selection are in full development, but the technology already became reality with the release of the first bull summary for the Holstein breed with GEBVs for milk production and quality traits in January 2009. In addition, these technologies brought new options for development of diagnostic tests for paternity testing, individual identification, traceability, etc. Also, these new technologies to genotype SNP markers facilitated the development of outsourcing companies to generate molecular data, allowing any group to conduct advanced experiments, always using the most advanced technologies, without the need for investment in equipment.
Hiremath, Pavana J; Kumar, Ashish; Penmetsa, Ramachandra Varma; Farmer, Andrew; Schlueter, Jessica A; Chamarthi, Siva K; Whaley, Adam M; Carrasquilla-Garcia, Noelia; Gaur, Pooran M; Upadhyaya, Hari D; Kavi Kishor, Polavarapu B; Shah, Trushar M; Cook, Douglas R; Varshney, Rajeev K
A set of 2486 single nucleotide polymorphisms (SNPs) were compiled in chickpea using four approaches, namely (i) Solexa/Illumina sequencing (1409), (ii) amplicon sequencing of tentative orthologous genes (TOGs) (604), (iii) mining of expressed sequence tags (ESTs) (286) and (iv) sequencing of candidate genes (187). Conversion of these SNPs to the cost-effective and flexible throughput Competitive Allele Specific PCR (KASPar) assays generated successful assays for 2005 SNPs. These marker assays have been designated as Chickpea KASPar Assay Markers (CKAMs). Screening of 70 genotypes including 58 diverse chickpea accessions and 12 BC3F2 lines showed 1341 CKAMs as being polymorphic. Genetic analysis of these data clustered chickpea accessions based on geographical origin. Genotyping data generated for 671 CKAMs on the reference mapping population (Cicer arietinum ICC 4958 × Cicer reticulatum PI 489777) were compiled with 317 unpublished TOG-SNPs and 396 published markers for developing the genetic map. As a result, a second-generation genetic map comprising 1328 marker loci including novel 625 CKAMs, 314 TOG-SNPs and 389 published marker loci with an average inter-marker distance of 0.59 cM was constructed. Detailed analyses of 1064 mapped loci of this second-generation chickpea genetic map showed a higher degree of synteny with genome of Medicago truncatula, followed by Glycine max, Lotus japonicus and least with Vigna unguiculata. Development of these cost-effective CKAMs for SNP genotyping will be useful not only for genetics research and breeding applications in chickpea, but also for utilizing genome information from other sequenced or model legumes. PMID:22703242
Wong, Quin Nee; Tanzi, Alberto Stefano; Ho, Wai Kuan; Malla, Sunir; Blythe, Martin; Karunaratne, Asha; Massawe, Festo; Mayes, Sean
Winged bean (Psophocarpus tetragonolobus) is an herbaceous multipurpose legume grown in hot and humid countries as a pulse, vegetable (leaves and pods), or root tuber crop depending on local consumption preferences. In addition to its different nutrient-rich edible parts, which could contribute to food and nutritional security, it is an efficient nitrogen fixer as a component of sustainable agricultural systems. Generating genetic resources and improved lines would help to accelerate the breeding improvement of this crop, as the lack of improved cultivars adapted to specific environments has been one of the limitations preventing wider use. A transcriptomic de novo assembly was constructed from four tissues (leaf, root, pod, and reproductive tissues) from Malaysian accessions, comprising 198,554 contigs with an N50 of 1462 bp. Of these, 138,958 (70.0%) could be annotated. Among 9682 genic simple sequence repeat (SSR) motifs identified (excluding monomer repeats), trinucleotide repeats were the most abundant (4855), followed by dinucleotide repeats (4500). A total of 18 SSR markers targeting di- and trinucleotide repeats have been validated as polymorphic markers based on an initial assessment of nine genotypes originating from five countries. A cluster analysis revealed provisional clusters among this limited, yet diverse selection of germplasm. The developed assembly and validated genic SSRs in this study provide a foundation for a better understanding of the plant breeding system for the genetic improvement of winged bean. PMID:28282950
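The SSR mining step described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `find_ssrs` and the repeat thresholds (at least 6 dinucleotide or 5 trinucleotide units) are assumptions chosen for illustration, and only perfect repeats on the given strand are reported.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect di- and trinucleotide SSRs."""
    # Hypothetical thresholds: >= 6 dinucleotide units or >= 5 trinucleotide units.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # ([ACGT]{n}) captures one motif; \1{k,} requires at least k further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip monomer runs such as AAAAAA, which the abstract excludes
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

example = "TTT" + "AG" * 7 + "CCGT" + "AAG" * 5 + "T"
print(find_ssrs(example))
```

On the toy sequence, the call reports one dinucleotide and one trinucleotide repeat together with their start positions and unit counts. Real SSR surveys typically also handle compound and imperfect repeats, which this sketch does not.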
Rice bean (Vigna umbellata (Thunb.) Ohwi & Ohashi) is a warm season annual legume mainly grown in East Asia. Only scarce genomic resources are currently available for this legume crop species and no simple sequence repeat (SSR) markers have been specifically developed for rice bean yet. In this study, approximately 26 million high quality cDNA sequence reads were obtained from rice bean using Illumina paired-end sequencing technology and assembled into 71,929 unigenes with an average length of 986 bp. Of these unigenes, 38,840 (33.2%) showed significant similarity to proteins in the NCBI non-redundant protein and nucleotide sequence databases. Furthermore, 30,170 (76.3%) could be classified into gene ontology categories, 25,451 (64.4%) into Swiss-Prot categories and 21,982 (55.6%) into KOG database categories (E-value < 1.0E-5). A total of 9,301 (23.5%) were mapped onto 118 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. A total of 3,011 genic SSRs were identified as potential molecular markers. AG/CT (30.3%), AAG/CTT (8.1%) and AGAA/TTCT (20.0%) are the three main repeat motifs. A total of 300 SSR loci were randomly selected for validation by PCR amplification. Of these loci, 23 primer pairs were polymorphic among 32 rice bean accessions. A UPGMA dendrogram revealed three major clusters among the 32 rice bean accessions. The large number of SSR-containing sequences and genic SSRs in this study will be valuable for the construction of high-resolution genetic linkage maps, association or comparative mapping and genetic analyses of various Vigna species.
The development of resources for genomic studies in Mangifera indica (mango) will allow marker-assisted selection and identification of genetically diverse germplasm, greatly aiding mango breeding programs. We report here a first step in developing such resources, our identification of thousands una...
Two methods of SNP pre-selection based on single-marker regression for the estimation of genomic breeding values (G-EBVs) were compared using simulated data provided by the XII QTL-MAS workshop: (i) Bonferroni correction of the significance threshold and (ii) a permutation test to obtain the reference distribution under the null hypothesis and identify significant markers at the P<0.01 and P<0.001 significance thresholds. From the set of markers significant at P<0.001, random subsets of 50% and 25% of the markers were extracted to evaluate the effect of further reducing the number of significant SNPs on G-EBV predictions. The Bonferroni correction method allowed the identification of 595 significant SNPs that gave the best G-EBV accuracies in the prediction generations (82.80%). The permutation methods gave slightly lower G-EBV accuracies even though a larger number of SNPs were significant (2,053 and 1,352 for the 0.01 and 0.001 significance thresholds, respectively). Interestingly, halving or quartering the number of SNPs significant at P<0.001 resulted in only a slight decrease in G-EBV accuracies. The genetic structure of the simulated population, with few QTL carrying large effects, might have favoured the Bonferroni method.
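The two pre-selection rules compared above can be sketched as follows. This is an illustration under stated assumptions, not the workshop analysis: the function names, the use of the absolute Pearson correlation as the single-marker statistic (for simple linear regression, ranking by |r| is equivalent to ranking by the slope test), the 999 permutations and the toy data are all hypothetical.

```python
import random

def pearson_r(x, y):
    """Pearson correlation between genotype codes and phenotypes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def bonferroni_select(pvalues, alpha=0.05):
    """Indices of markers whose p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(pvalues)
    return [i for i, p in enumerate(pvalues) if p < threshold]

def permutation_pvalue(genotypes, phenotypes, n_perm=999, seed=0):
    """Empirical p-value: share of phenotype shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if abs(pearson_r(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: genotypes coded 0/1/2 and a phenotype that rises with allele dosage.
geno = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
pheno = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
print(permutation_pvalue(geno, pheno))
print(bonferroni_select([0.00001, 0.02, 0.0004, 0.5]))
```

The Bonferroni rule only needs the per-marker p-values and the number of tests, whereas the permutation rule builds its null distribution from the data themselves, which is why the two can select different numbers of SNPs.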
Burt, Andrew J.; William, H. Manilal; Perry, Gregory; Khanal, Raja; Pauls, K. Peter; Kelly, James D.; Navabi, Alireza
Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08), where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with the Co-x and COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases. PMID:26431031
Huge efforts have been invested in the last two decades to dissect the genetic bases of complex traits, including yields of many crop plants, through quantitative trait locus (QTL) analyses. However, almost all the studies were based on linkage maps constructed using low-throughput molecular markers, e.g. restriction fragment length polymorphisms (RFLPs) and simple sequence repeats (SSRs); these maps are thus mostly of low density and unable to provide precise and complete information about the numbers and locations of the genes or QTLs controlling the traits. In this study, we constructed an ultra-high density genetic map based on high quality single nucleotide polymorphisms (SNPs) from low-coverage sequences of a recombinant inbred line (RIL) population of rice, generated using new sequencing technology. The quality of the map was assessed by validating the positions of several cloned genes including GS3 and GW5/qSW5, two major QTLs for grain length and grain width respectively, and OsC1, a qualitative trait locus for pigmentation. In all cases the loci could be precisely resolved to the bins where the genes are located, indicating high quality and accuracy of the map. The SNP map was used to perform QTL analysis for yield and three yield-component traits (number of tillers per plant, number of grains per panicle and grain weight) using data from field trials conducted over years, in comparison to QTL mapping based on RFLPs/SSRs. The SNP map detected more QTLs, especially for grain weight, with precise map locations, demonstrating advantages in detecting power and resolution relative to the RFLP/SSR map. Thus this study provided an example of ultra-high density map construction using sequencing technology. Moreover, the results obtained are helpful for understanding the genetic bases of the yield traits and for fine mapping and cloning of QTLs.
Geographic surveys of allozymes, microsatellites, nuclear DNA (nDNA) and mitochondrial DNA (mtDNA) have detected several genetic subdivisions among European anchovy populations. However, these studies have been limited in their power to detect some aspects of population structure by the use of a single or a few molecular markers, or by limited geographic sampling. We use a multi-marker approach, 47 nDNA and 15 mtDNA single nucleotide polymorphisms (SNPs), to analyze 626 European anchovies from the whole range of the species to resolve shallow and deep levels of population structure. Nuclear SNPs define 10 genetic entities within two larger genetically distinctive groups associated with oceanic variables and different life-history traits. MtDNA SNPs define two deep phylogroups that reflect ancient dispersals and colonizations. These markers define two ecological groups. One major group of Iberian-Atlantic populations is associated with upwelling areas on narrow continental shelves and includes populations spawning and overwintering in coastal areas. A second major group includes northern populations in the North East (NE) Atlantic (including the Bay of Biscay) and the Mediterranean and is associated with wide continental shelves with local larval retention currents. This group tends to spawn and overwinter in oceanic areas. These two groups encompass ten populations that differ from previously defined management stocks in the Alboran Sea, Iberian-Atlantic and Bay of Biscay regions. In addition, a new North Sea-English Channel stock is defined. SNPs indicate that some populations in the Bay of Biscay are genetically closer to North Western (NW) Mediterranean populations than to other populations in the NE Atlantic, likely due to colonizations of the Bay of Biscay and NW Mediterranean by migrants from a common ancestral population. Northern NE Atlantic populations were subsequently established by migrants from the Bay of Biscay. Populations along the Iberian
Within a sheep population, a small number of animals are aggressive and require special management, including specific housing and routine handling. The purpose of this study was to identify single nucleotide polymorphism (SNP) variation that could serve as a genetic marker for the aggressive trait in several sheep breeds. Point mutations in exon 8 of the MAO-A gene associated with aggressive behavior could be useful as DNA markers for the aggressive trait in sheep. Five sheep breeds were used: Barbados Blackbelly Cross (BC), Composite Garut (KG), Local Garut (LG), Composite Sumatra (KS) and St. Croix Cross (SC). The duration of ten behavior traits, blood serotonin concentrations and the DNA sequence of exon 8 of the MAO-A gene were observed in aggressive and nonaggressive sheep. PROC GLM of SAS Ver. 9.0 was used to analyze the behavior variables and blood serotonin concentrations. DNA polymorphism in exon 8 of the MAO-A gene was analyzed using MEGA software Ver. 4.0. The results show that the percentage of aggressive rams in each breed was less than 10 percent, except in KS sheep, where it was higher (23%). Based on the duration of behaviors, the aggressive group did not differ significantly from the nonaggressive group, except in the duration of care-giving and drinking behavior. Blood serotonin concentrations did not differ significantly between aggressive and nonaggressive rams. The aggressive trait in sheep appears to have a mechanism or cause different from that in mice and humans. In this study, aggressive behavior in sheep was not associated with a mutation in exon 8 of the MAO-A gene.
GUANG FENG CHEN; RU GANG WU; DONG MEI LI; HAI XIA YU; ZHIYING DENG; JI CHUN TIAN
Seeding emergence and tiller number are the most important traits for wheat (Triticum aestivum L.) yield, but the inheritance of seeding emergence and tillering is poorly understood. We conducted a genomewide association study focussing on seeding emergence and tiller number at different growth stages with a panel of 205 elite winter wheat accessions. The population was genotyped with a high-density Illumina iSelect 90K SNP assay. A total of 31 loci were found to be associated with seeding emergence rate (SER) and tiller number in different growth stages. Loci distributed among 12 chromosomes accounted for 5.35 to 11.33% of the observed phenotypic variation. With this information, 10 stable SNPs were identified for eventual development of cleaved amplified polymorphic sequence markers for SER and tiller number in different growth stages. Additionally, a set of elite alleles were identified, such as Ra_c14761_1348-T, which may increase SER by 13.35%, and Excalibur_c11045_236-A and BobWhite_c8436_391-T, which may increase the rate of available tillering by 14.78 and 8.47%, respectively. These results should provide valuable information for marker-assisted selection and parental selection in wheat breeding programmes.
Maroso, Francesco; Franch, Rafaella; Dalla Rovere, Giulia; Arculeo, Marco; Bargelloni, Luca
Dolphinfish is an important fish species for both commercial and sport fishing, but so far limited information is available on the genetic variability and pattern of differentiation of dolphinfish populations in the Mediterranean basin. Recently developed techniques allow genome-wide identification of genetic markers for a better understanding of population structure in species with limited genome information. Using restriction-site associated DNA analysis, we successfully genotyped 140 dolphinfish individuals from eight locations in the Mediterranean Sea at 3324 SNP loci. We identified 311 sex-related loci that were used to assess the sex ratio in dolphinfish populations. In addition, we identified a weak signature of genetic differentiation of the population closest to the Strait of Gibraltar in comparison to other Mediterranean populations, which might be related to introgression of individuals from the Atlantic. No further genetic differentiation could be detected in the other populations sampled, as expected considering the known high mobility of the species. The results obtained improve our knowledge of the species and can help manage dolphinfish stocks in the future.
Cui, Chengqi; Mei, Hongxian; Liu, Yanyang; Zhang, Haiyang; Zheng, Yongzhan
The characterization of genetic diversity and population structure can be used in tandem to detect reliable phenotype–genotype associations. In the present study, we genotyped a set of 366 sesame germplasm accessions by using 89,924 single-nucleotide polymorphisms (SNPs). The number of SNPs on each chromosome was consistent with the physical length of the respective chromosome, and the average marker density was approximately 2.67 kb/SNP. The genetic diversity analysis showed that the average nucleotide diversity of the panel was 1.1 × 10⁻³, with averages of 1.0 × 10⁻⁴, 2.7 × 10⁻⁴, and 3.6 × 10⁻⁴ obtained, respectively, for three identified subgroups of the panel: Pop 1, Pop 2, and the Mixed. The genetic structure analysis revealed that these sesame germplasm accessions were structured primarily on the basis of their geographic collection, and that extensive admixture occurred in the panel. The genome-wide linkage disequilibrium (LD) analysis showed that LD extended on average up to ∼99 kb. The genetic diversity and population structure revealed in this study should provide guidance to the future design of association studies and the systematic utilization of the genetic variation characterizing the sesame panel. PMID:28729877
Background The economic importance of grapevine has driven significant efforts in genomics to accelerate the exploitation of Vitis resources for the development of new cultivars. However, although a large number of clonally propagated accessions are maintained in grape germplasm collections worldwide, their use for crop improvement is limited by the scarcity of information on genetic diversity, population structure and proper phenotypic assessment. The identification of a representative and manageable subset of accessions would facilitate access to the diversity available in large collections. A genome-wide germplasm characterization using molecular markers can offer reliable tools for adjusting the quality and representativeness of such core samples. Results We investigated patterns of molecular diversity at 22 common microsatellite loci and 384 single nucleotide polymorphisms (SNPs) in 2273 accessions of domesticated grapevine V. vinifera ssp. sativa, its wild relative V. vinifera ssp. sylvestris, interspecific hybrid cultivars and rootstocks. Despite the large number of putative duplicates and extensive clonal relationships among the accessions, we observed a high level of genetic variation. In the total germplasm collection, the average genetic diversity, as quantified by the expected heterozygosity, was higher for SSR loci (0.81) than for SNPs (0.34). The analysis of the genetic structure in the grape germplasm collection revealed several levels of stratification. The primary division was between accessions of V. vinifera and non-vinifera, followed by the distinction between wild and domesticated grapevine. Intra-specific subgroups were detected within cultivated grapevine representing different eco-geographic groups. The comparison of a phenological core collection and genetic core collections showed that the latter retained more genetic diversity, while maintaining a similar phenotypic variability. Conclusions The comprehensive molecular characterization of our grape
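The gap between the SSR and SNP diversity values follows partly from how expected heterozygosity is defined, He = 1 − Σ pᵢ², where pᵢ is the frequency of allele i at a locus. A minimal sketch (the function name and toy allele lists are illustrative, and no small-sample correction is applied):

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """He = 1 - sum of squared allele frequencies at one locus (uncorrected form)."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

# A biallelic SNP cannot exceed 0.5, reached when both alleles are equally common.
print(expected_heterozygosity(["A", "A", "G", "G"]))
# A multi-allelic SSR can go higher, e.g. five equally frequent alleles (sizes in bp).
print(expected_heterozygosity([180, 182, 184, 186, 188]))
```

Because a biallelic locus is capped at 0.5 while a multi-allelic microsatellite is not, per-locus values for SSRs and SNPs are not directly comparable as measures of information content.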
Chen, Xin; Mei, Jie; Wu, Junjie; Jing, Jing; Ma, Wenge; Zhang, Jin; Dan, Cheng; Wang, Weimin; Gui, Jian-Fang
Sexually dimorphic growth patterns have significant theoretical and practical implications in fish. Recently, a Y- and X-specific allele marker-assisted sex control technique has been developed for mass production of all-male populations in yellow catfish (Pelteobagrus fulvidraco), but the genetic information for sex determination and sex control breeding has remained unclear. Here, we attempted to provide the first insight into a comprehensive transcriptome covering multiple tissues from XX females, XY males, and YY super-males of yellow catfish using the 454 GS-FLX platform, for a better assembly and gene coverage. A total of 1,202,933 high quality reads (about 540 Mbp) were obtained and assembled into 28,297 contigs and 141,951 singletons. BLASTX searches against the NCBI non-redundant protein database (nr) matched a total of 52,564 unique sequences, including 18,748 contigs and 33,816 singletons, to 25,669 known or predicted unique proteins. All of those with annotated function were categorized by gene ontology (GO) analysis, and 712 were assigned to reproduction and reproductive process. Some potential genes relevant to the reproductive system, including steroid hormone biosynthesis and the GnRH (gonadotropin-releasing hormone) signaling pathway, were further identified by Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis; and at least 21 sex determination and differentiation-related genes, such as Dmrt1, Sox9a/b, Cyp19b, WT1, and AMH, were identified and characterized. Additionally, a total of 82,794 simple sequence repeats (SSRs), 26,450 single nucleotide polymorphisms (SNPs), and 4,145 insertions and deletions (INDELs) were revealed from the transcriptome data. Therefore, the current transcriptome resources support further studies on sex-control breeding in yellow catfish and will benefit future studies on reproduction and sex determination in teleost fish.
Mikhail P. Ponomarenko
Some variations of the human genome (for example, single nucleotide polymorphisms [SNPs]) are markers of hereditary diseases and drug responses. Analysis of them can help to improve treatment. Computer-based analysis of millions of SNPs in the 1000 Genomes project makes a search for SNP markers more targeted. Here we combined two computer-based approaches: DNA sequence analysis and keyword search in databases. In the binding sites for TATA-binding protein (TBP) in human gene promoters, we found candidate SNP markers of gender-biased autoimmune diseases, including rs1143627 (cachexia in rheumatoid arthritis [double prevalence among women]); rs11557611 (demyelinating diseases [thrice more prevalent among young white women than among nonwhite individuals]); rs17231520 and rs569033466 (both: atherosclerosis comorbid with related diseases [double prevalence among women]); rs563763767 (Hughes syndrome-related thrombosis [lethal during pregnancy]); rs2814778 (autoimmune diseases [excluding multiple sclerosis and rheumatoid arthritis] underlying hypergammaglobulinemia in women); rs72661131 and rs562962093 (both: preterm delivery in pregnant diabetic women); and rs35518301, rs34166473, rs34500389, rs33981098, rs33980857, rs397509430, rs34598529, rs33931746, rs281864525, and rs63750953 (all: autoimmune diseases underlying hypergammaglobulinemia in women). Validation of these predicted candidate SNP markers using the clinical standards may advance personalized medicine.
张成才; 王丽鸳; 韦康; 成浩
To promote the application of SNP (single nucleotide polymorphism) technology in tea plant genetics and breeding, this paper explored the feasibility of converting tea plant SNPs into CAPS (cleaved amplified polymorphic sequence) markers for detection. First, candidate SNPs mined from tea plant ESTs were validated in genomic DNA by re-sequencing. Second, the confirmed SNPs were analyzed for restriction enzyme recognition sites using CodonCode Aligner software, and the SNPs that alter a recognition site were selected. Third, the fragments containing these SNPs were amplified from the tea plant genome with the corresponding primers, and the PCR products were digested with the restriction enzymes and checked by agarose gel electrophoresis, thereby converting the SNPs into CAPS markers. Finally, 14 CAPS markers were randomly selected for polymorphism analysis in 25 tea cultivars to validate the usefulness of the markers. Re-sequencing confirmed 162 SNPs in 8 tea cultivars, and 39 SNPs were successfully converted into CAPS markers. Of the 14 randomly selected CAPS markers, 13 were polymorphic among the 25 cultivars, with polymorphism information content (PIC) ranging from 0.043 to 0.497 (mean 0.311); the remaining marker was heterozygous in all 25 cultivars. The results indicate that the method established here for converting tea plant SNPs into CAPS markers can be conveniently used for SNP detection in tea plant and can promote the application of SNP technology in tea plant genetics and breeding research. The 39 CAPS markers reported here can be used for studies such as genetic polymorphism detection and genetic map construction in tea plant.
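The screening step, selecting SNPs that create or destroy a restriction enzyme recognition site, can be sketched in a few lines. This is only an illustration of the idea and not the CodonCode Aligner workflow used in the study; the function name `caps_candidates`, the three-enzyme table and the toy sequence are assumptions. The three recognition sequences listed are palindromic, so only one strand needs to be scanned.

```python
# Recognition sequences of three common enzymes (a tiny illustrative subset).
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

def caps_candidates(seq, snp_pos, ref, alt, enzymes=ENZYMES):
    """Return (enzyme, allele_cut) pairs where only one SNP allele carries the site."""
    if seq[snp_pos] != ref:
        raise ValueError("reference allele does not match the sequence")
    hits = []
    for name, site in enzymes.items():
        k = len(site)
        # Window covering every k-mer that overlaps the SNP position.
        lo, hi = max(0, snp_pos - k + 1), snp_pos + k
        ref_window = seq[lo:hi]
        alt_window = seq[lo:snp_pos] + alt + seq[snp_pos + 1:hi]
        cut_ref, cut_alt = site in ref_window, site in alt_window
        if cut_ref != cut_alt:  # site present for exactly one allele
            hits.append((name, "ref" if cut_ref else "alt"))
    return hits

# Toy amplicon: the A>G change at index 6 destroys the EcoRI site GAATTC.
print(caps_candidates("ACGTGAATTCTTGCA", 6, "A", "G"))
```

A SNP flagged this way yields different fragment patterns for the two alleles after digestion, which is what makes it scorable on a gel as a CAPS marker.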
Historically, association tests were limited to single variants, so that the allele was considered the basic unit for association testing. As marker density increases and indirect approaches are used to assess association through linkage disequilibrium, association is now frequently considered at the haplotypic level. We suggest that there are difficulties in replicating association findings at the single-nucleotide-polymorphism (SNP) or the haplotype level, and we propose a shift toward a gene-based approach in which all common variation within a candidate gene is considered jointly. Inconsistencies arising from population differences are more readily resolved by use of a gene-based approach rather than either a SNP-based or a haplotype-based approach. A gene-based approach captures all of the potential risk-conferring variations; thus, negative findings are subject only to the issue of power. In addition, chance findings due to multiple testing can be readily accounted for by use of a genewide-significance level. Meta-analysis procedures can be formalized for gene-based methods through the combination of P values. It is only a matter of time before all variation within genes is mapped, at which point the gene-based approach will become the natural end point for association analysis and will inform our search for functional variants relevant to disease etiology.
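The abstract does not say which procedure is meant by "combination of P values". One classical option is Fisher's method, sketched here as an assumption rather than as the authors' choice. It treats −2 Σ ln pᵢ as chi-square distributed with 2k degrees of freedom, which holds only if the k tests are independent (for example, one gene-level P value from each of k independent studies). For even degrees of freedom the chi-square tail has a closed form, so no statistics library is needed.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Fisher's method: combine k independent p-values (each in (0, 1]) into one."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # X / 2, where X = -2 * sum(ln p)
    # Chi-square survival function with 2k d.f.: exp(-X/2) * sum_{j<k} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total

# Two studies each reporting a gene-level p-value of 0.05 for the same gene.
print(fisher_combined_pvalue([0.05, 0.05]))
```

For two P values the result reduces to p₁p₂(1 − ln(p₁p₂)), which gives a quick hand check. Correlated inputs, such as linked SNPs within a single gene and sample, violate the independence assumption and would need a different combination rule.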
Muchero, Wellington; Ehlers, Jeffrey D; Close, Timothy J; Roberts, Philip A
Macrophomina phaseolina is an emerging and devastating fungal pathogen that causes significant losses in crop production under high temperatures and drought stress. An increasing number of disease incidence reports highlight the wide prevalence of the pathogen around the world and its contribution toward crop yield suppression. In cowpea [Vigna unguiculata (L) Walp.], limited sources of low-level host resistance have been identified, the genetic basis of which is unknown. In this study we report on the identification of strong sources of host resistance to M. phaseolina and the genetic mapping of putative resistance loci on a cowpea genetic map comprised of gene-derived single nucleotide polymorphisms (SNPs) and amplified fragment length polymorphisms (AFLPs). Nine quantitative trait loci (QTLs), accounting for between 6.1 and 40.0% of the phenotypic variance (R2), were identified using plant mortality data taken over three years in field experiments and disease severity scores taken from two greenhouse experiments. Based on annotated genic SNPs as well as synteny with soybean (Glycine max) and Medicago truncatula, candidate resistance genes were found within mapped QTL intervals. QTL Mac-2 explained the largest percent R2 and was identified in three field and one greenhouse experiments where the QTL peak co-located with a SNP marker derived from a pectin esterase inhibitor encoding gene. Maturity effects on the expression of resistance were indicated by the co-location of Mac-6 and Mac-7 QTLs with maturity-related senescence QTLs Mat-2 and Mat-1, respectively. Homologs of the ELF4 and FLK flowering genes were found in corresponding syntenic soybean regions. Only three Macrophomina resistance QTLs co-located with delayed drought-induced premature senescence QTLs previously mapped in the same population, suggesting that largely different genetic mechanisms mediate cowpea response to drought stress and Macrophomina infection. Effective sources of host resistance were
QTL analysis using SNP markers developed by next-generation sequencing for identification of candidate genes controlling 4-methylthio-3-butenyl glucosinolate contents in roots of radish, Raphanus sativus L.
Zou, Zhongwei; Ishida, Masahiko; Li, Feng; Kakizaki, Tomohiro; Suzuki, Sho; Kitashiba, Hiroyasu; Nishio, Takeshi
SNP markers for QTL analysis of 4MTB-GSL contents in radish roots were developed by determining nucleotide sequences of bulked PCR products using a next-generation sequencer. DNA fragments were amplified from two radish lines by multiplex PCR with six primer pairs, and those amplified by 2,880 primer pairs were mixed and sequenced. By assembling sequence data, 1,953 SNPs in 750 DNA fragments, 437 of which have been previously mapped in a linkage map, were identified. A linkage map of nine linkage groups was constructed with 188 markers, and five QTLs were detected in two F2 populations; three of them, accounting for more than 50% of the total phenotypic variance, were detected repeatedly. In the identified QTL regions, nine SNP markers were newly produced. By synteny analysis of the QTL regions with Arabidopsis thaliana and Brassica rapa genome sequences, three candidate genes were selected, i.e., RsMAM3 for production of aliphatic glucosinolates linked to GSL-QTL-4, RsIPMDH1 for leucine biosynthesis showing strong co-expression with glucosinolate biosynthesis genes linked to GSL-QTL-2, and RsBCAT4 for branched-chain amino acid aminotransferase linked to GSL-QTL-1. Nucleotide sequences and expression of these genes suggested their possible function in 4MTB-GSL biosynthesis in radish roots.
Aerts, J.; Wetzels, Y.; Cohen, N.; Aerssens, J.
Different strategies to search public single nucleotide polymorphism (SNP) databases for intragenic SNPs were evaluated. First, we assembled a strategy to annotate SNPs onto candidate genes based on a BLAST search of public SNP databases (Intragenic SNP Annotation by BLAST, ISAB). Only BLAST hits th
Song, F S; Ni, J L; Qian, Y L; Li, L; Ni, D H; Yang, J B
Molecular markers can increase both the efficiency and speed of breeding programs. Functional markers that detect the functional mutations causing phenotypic changes offer a precise method for genetic identification. In this study, we used newly derived cleaved amplified polymorphic sequence markers to detect the functional mutations of tms5, which is a male sterile gene that is widely used in rice production in China. In addition, restriction cutting sites were designed to specifically digest amplicons of tms5 but not wild type (TMS5), in order to avoid the risk of false positive results. By optimizing the condition of the polymerase chain reaction amplifications and restriction enzyme digestions, the newly designed markers could accurately distinguish between tms5 and TMS5. These markers can be applied in marker-assisted selection for breeding novel thermo-sensitive genic male sterile (TGMS) lines, as well as to rapidly identify the TGMS hybrid seed purity.
Dominant and co-dominant molecular markers are routinely used in plant genetic diversity research. In the present study we assessed the success-rate of three marker-systems for estimating genotypic diversity, clustering varieties into populations, and assigning a single variety into the expected pop...
Krag, Kristian; Janss, Luc; Mahdi Shariati, Mohammad;
Heritability is a central element in quantitative genetics. New molecular markers to assess genetic variance and heritability are continually under development. The availability of molecular single nucleotide polymorphism (SNP) markers can be applied for estimation of variance components and heri...
Verticillium wilt (VW) of alfalfa is a soilborne disease that causes severe yield loss in alfalfa. To identify molecular markers associated with VW resistance, an integrated framework of genome-wide association study (GWAS) with high-throughput genotyping by sequencing (GBS) was used for mapping lo...
Daša Jevšinek Skok
The objective of this preliminary study was to identify SNP markers within the FTO gene for evaluation of pedigree data accuracy and determination of haplotypes in paternal half-sib families of Slovenian Simmental cattle. Of the 23 polymorphic SNPs identified, the ten most informative were used to genotype 31 sires and 56 half-sib progeny. The ATLAS program was used for paternity testing. Haplotype analysis revealed three haplotype blocks. The effect of SNPs “ex2 T>C” and “int2 indel*>T” was significant on three correlated carcass traits: live weight at slaughter (P = 0.03), carcass weight (P = 0.038), and lean weight (P = 0.048). The FTO gene can thus be regarded as a candidate for marker-assisted selection programs in our and possibly other populations of cattle. Future studies in cattle might also reveal novel roles of the FTO gene in carcass traits in livestock species as well as fatness control in other mammals.
Terracciano, I.; Maccaferri, M.; Bassi, F; Mantovani, P; Šimková, H. (Hana); Doležel, J. (Jaroslav); Tuberosa, R
Leaf rust (Puccinia triticina Eriks. & Henn.) is a major disease affecting durum wheat production. The Lr14a-resistant gene present in the durum wheat cv. Creso and its derivative cv. Colosseo is one of the best characterized leaf-rust resistance sources deployed in durum wheat breeding. Lr14a has been mapped close to the simple sequence repeat markers gwm146, gwm344 and wmc10 in the distal portion of the chromosome arm 7BL, a gene-dense region. The objectives of this study were: (1) to enric...
Background: Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results: The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions: Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.
Genomewide Linkage Analysis of Bipolar Disorder by Use of a High-Density Single-Nucleotide–Polymorphism (SNP) Genotyping Assay: A Comparison with Microsatellite Marker Assays and Finding of Significant Linkage to Chromosome 6q22
Middleton, F. A.; Pato, M. T.; Gentile, K. L.; Morley, C. P.; Zhao, X.; Eisener, A. F.; Brown, A.; Petryshen, T. L.; Kirby, A. N.; Medeiros, H.; Carvalho, C.; Macedo, A.; Dourado, A.; Coelho, I.; Valente, J.; Soares, M. J.; Ferreira, C. P.; Lei, M.; Azevedo, M. H.; Kennedy, J. L.; Daly, M. J.; Sklar, P.; Pato, C. N.
We performed a linkage analysis on 25 extended multiplex Portuguese families segregating for bipolar disorder, by use of a high-density single-nucleotide–polymorphism (SNP) genotyping assay, the GeneChip Human Mapping 10K Array (HMA10K). Of these families, 12 were used for a direct comparison of the HMA10K with the traditional 10-cM microsatellite marker set and the more dense 4-cM marker set. This comparative analysis indicated the presence of significant linkage peaks in the SNP assay in chromosomal regions characterized by poor coverage and low information content on the microsatellite assays. The HMA10K provided consistently high information and enhanced coverage throughout these regions. Across the entire genome, the HMA10K had an average information content of 0.842 with 0.21-Mb intermarker spacing. In the 12-family set, the HMA10K-based analysis detected two chromosomal regions with genomewide significant linkage on chromosomes 6q22 and 11p11; both regions had failed to meet this strict threshold with the microsatellite assays. The full 25-family collection further strengthened the findings on chromosome 6q22, achieving genomewide significance with a maximum nonparametric linkage (NPL) score of 4.20 and a maximum LOD score of 3.56 at position 125.8 Mb. In addition to this highly significant finding, several other regions of suggestive linkage have also been identified in the 25-family data set, including two regions on chromosome 2 (57 Mb, NPL = 2.98; 145 Mb, NPL = 3.09), as well as regions on chromosomes 4 (91 Mb, NPL = 2.97), 16 (20 Mb, NPL = 2.89), and 20 (60 Mb, NPL = 2.99). We conclude that at least some of the linkage peaks we have identified may have been largely undetected in previous whole-genome scans for bipolar disorder because of insufficient coverage or information content, particularly on chromosomes 6q22 and 11p11. PMID:15060841
Lin, Hui-Yi; Chen, Dung-Tsa; Huang, Po-Yu
MOTIVATION: Testing SNP-SNP interactions is considered as a key for overcoming bottlenecks of genetic association studies. However, related statistical methods for testing SNP-SNP interactions are underdeveloped. RESULTS: We propose the SNP Interaction Pattern Identifier (SIPI), which tests 45...
Background Vitis vinifera L. is one of society’s most important agricultural crops with a broad genetic variability. The difficulty in recognizing grapevine genotypes based on ampelographic traits and secondary metabolites prompted the development of molecular markers suitable for achieving variety genetic identification. Findings Here, we propose a comparison between a multi-locus barcoding approach based on six chloroplast markers and a single-copy nuclear gene sequencing method using five coding regions combined with a character-based system with the aim of reconstructing cultivar-specific haplotypes and genotypes to be exploited for the molecular characterization of 157 V. vinifera accessions. The analysis of the chloroplast target regions proved the inadequacy of the DNA barcoding approach at the subspecies level, and hence further DNA genotyping analyses were targeted on the sequences of five nuclear single-copy genes amplified across all of the accessions. The sequencing of the coding region of the UFGT nuclear gene (UDP-glucose: flavonoid 3-0-glucosyltransferase, the key enzyme for the accumulation of anthocyanins in berry skins) enabled the discovery of discriminant SNPs (1/34 bp) and the reconstruction of 130 V. vinifera distinct genotypes. Most of the genotypes proved to be cultivar-specific, and only few genotypes were shared by more, although strictly related, cultivars. Conclusion On the whole, this technique was successful for inferring SNP-based genotypes of grapevine accessions suitable for assessing the genetic identity and ancestry of international cultivars and also useful for corroborating some hypotheses regarding the origin of local varieties, suggesting several issues of misidentification (synonymy/homonymy). PMID:24298902
张成才; 谭礼强; 王丽鸳; 韦康; 成浩
To improve the genotyping efficiency of tea plant SNPs and promote the application of SNPs in tea genetics and breeding, the feasibility of SNaPshot technology for SNP genotyping in tea plant was investigated. Ten SNPs confirmed in earlier laboratory work were selected as targets and genotyped in different tea cultivars using SNaPshot. The genotyping results were then compared and tallied to assess the accuracy and repeatability of SNaPshot for detecting tea plant SNPs and its usability for applications such as genetic diversity analysis. Six of the 10 SNPs gave genotyping results consistent with the sequencing results, an accuracy of 60%. Genotyping results for the target SNPs in 'Longjing43' and its replicate experiment were fully consistent. The number of alleles (NA) for all six SNPs was 2; expected heterozygosity (He) ranged from 0.37 to 0.52, observed heterozygosity (Ho) from 0.32 to 0.74, and polymorphism information content (PIC) from 0.36 to 0.50. The authors conclude that SNaPshot genotyping of tea plant SNPs is accurate and repeatable, and that it can be used for genetic diversity analysis and genetic linkage map construction in tea plant.
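The He and Ho statistics reported above can be illustrated with a minimal sketch. The genotypes below are hypothetical, and the abstract does not state which estimator the authors used (for example, whether a small-sample correction was applied), so this shows only the textbook definitions: Ho as the share of heterozygous individuals and He as one minus the sum of squared allele frequencies.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (Ho, He) for one locus, given diploid genotypes such as 'AG'."""
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for g in genotypes if g[0] != g[1]) / n
    # Allele frequencies are taken over all 2n allele copies.
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    # Expected heterozygosity (uncorrected): 1 - sum of squared allele frequencies.
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return ho, he

# Hypothetical genotypes for one bi-allelic SNP in eight cultivars (not study data).
sample = ["AA", "AG", "AG", "GG", "AG", "AA", "AG", "GG"]
ho, he = heterozygosity(sample)
```

With equal allele frequencies, as in this made-up sample, the uncorrected He of a bi-allelic locus reaches its maximum of 0.5.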
U.S. Department of Health & Human Services — dbSNP is a database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and...
Angiogenesis has been shown to be associated with prostate cancer development. The majority of prostate cancer studies have focused on individual single nucleotide polymorphisms (SNPs), while SNP-SNP interactions are suggested to have a great impact on unveiling the underlying mechanism of complex disease. Using 1,151 prostate cancer patients in the Cancer Genetic Markers of Susceptibility (CGEMS) dataset, 2,651 SNPs in the angiogenesis genes associated with prostate cancer aggressiveness were evaluated. SNP-SNP interactions were primarily assessed using the two-stage Random Forests plus Multivariate Adaptive Regression Splines (TRM) approach in the CGEMS group, and were then re-evaluated in the Moffitt group with 1,040 patients. For the identified gene pairs, cross-evaluation was applied to evaluate SNP interactions in both study groups. Five SNP-SNP interactions in three gene pairs (MMP16+ ROBO1, MMP16+ CSF1, and MMP16+ EGFR) were identified to be associated with aggressive prostate cancer in both groups. Three pairs of SNPs (rs1477908+ rs1387665, rs1467251+ rs7625555, and rs1824717+ rs7625555) were in MMP16 and ROBO1, one pair (rs2176771+ rs333970) in MMP16 and CSF1, and one pair (rs1401862+ rs6964705) in MMP16 and EGFR. The results suggest that MMP16 may play an important role in prostate cancer aggressiveness. By integrating our novel findings and available biomedical literature, a hypothetical gene interaction network was proposed. This network demonstrates that our identified SNP-SNP interactions are biologically relevant and shows that EGFR may be the hub for the interactions. The findings provide valuable information to identify genotype combinations at risk of developing aggressive prostate cancer and improve understanding of the genetic etiology of angiogenesis associated with prostate cancer aggressiveness.
This thesis describes investigations of various applications of genome-wide SNP (single nucleotide polymorphism) markers. The set of SNP markers was identified by a GBS (genotyping by sequencing) strategy. The resulting dataset of 129,156 SNPs across 83 tetraploid varieties was us
MicroRNAs (miRNAs) play an important role in carcinogenesis through the regulation of their target genes. miRNA-related single nucleotide polymorphisms (miR-SNPs) can affect miRNA biogenesis and target sites and can alter microRNA expression and functions. We examined 11 miR-SNPs, including 5 in microRNA genes, 3 in microRNA binding sites and 3 in microRNA-processing machinery components, and evaluated time to recurrence (TTR) according to miR-SNP genotypes in 175 surgically resected non-small-cell lung cancer (NSCLC) patients. Significant differences in TTR were found according to KRT81 rs3660 (median TTR: 20.3 months for the CC genotype versus 86.8 months for the CG or GG genotype; P = 0.003) and XPO5 rs11077 (median TTR: 24.7 months for the AA genotype versus 73.1 months for the AC or CC genotypes; P = 0.029). Moreover, when patients were divided according to stage, these differences were maintained for stage I patients (P = 0.002 for KRT81 rs3660; P<0.001 for XPO5 rs11077). When patients were divided into sub-groups according to histology, the effect of the KRT81 rs3660 genotype on TTR was significant in patients with squamous cell carcinoma (P = 0.004) but not in those with adenocarcinoma. In the multivariate analyses, the KRT81 rs3660 CC genotype (OR = 1.8; P = 0.023) and the XPO5 rs11077 AA genotype (OR = 1.77; P = 0.026) emerged as independent variables influencing TTR. Immunohistochemical analyses in 80 lung specimens showed that 95% of squamous cell carcinomas were positive for KRT81, compared to only 19% of adenocarcinomas (P<0.0001). In conclusion, miR-SNPs are a novel class of SNPs that can add useful prognostic information on the clinical outcome of resected NSCLC patients and may be a potential key tool for selecting high-risk stage I patients. Moreover, KRT81 has emerged as a promising immunohistochemical marker for the identification of squamous cell lung carcinoma.
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to whole genome sequence data analysis, SNP array data do not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present the SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083
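The allele-coding problem described above can be illustrated with a small sketch. This is not SNPConvert's interface, which the abstract does not describe. The function name, the "00" placeholder for missing calls, and the marker names are all hypothetical, and the example only assumes that each marker has a known pair of nucleotides behind its "A"/"B" labels.

```python
def ab_to_nucleotides(genotype, allele_a, allele_b, missing="--"):
    """Translate an 'AB'-coded genotype call into nucleotide alleles.

    Hypothetical helper for illustration; returns '00' for missing or
    unrecognized calls.
    """
    lookup = {"A": allele_a, "B": allele_b}
    if genotype == missing or any(ch not in lookup for ch in genotype):
        return "00"
    return "".join(lookup[ch] for ch in genotype)

# Hypothetical SNP map: marker name -> (nucleotide for allele A, nucleotide for allele B).
snp_map = {"snp1": ("A", "G"), "snp2": ("C", "T")}
# Hypothetical AB-coded calls for one sample.
calls = {"snp1": "AB", "snp2": "BB"}

converted = {name: ab_to_nucleotides(gt, *snp_map[name]) for name, gt in calls.items()}
```

Keeping the per-marker allele map separate from the genotype calls mirrors the split between map files and genotype files that the abstract identifies as a source of difficulty.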
李欧静; 张桂华; 兰青阔; 王永; 赵新; 朱珠; 陈锐; 郭永泽
Single nucleotide polymorphism (SNP) markers can be used to identify seed purity because of advantages such as high density, genetic stability and their bi-allelic nature. The SNP site CLA13 (C/G) was analyzed in 16 hybrid cucumber varieties by pyrosequencing. The polymorphism information content (PIC) of this site across the 16 hybrids was 0.556, indicating high polymorphism, and the site is suitable for seed purity testing in seven of the varieties. To improve the efficiency of seed purity identification, DNA pooling was introduced and a high-efficiency 3Pooling SNP Pyrosequencing assay model was established. The feasibility of the assay model was also verified.
Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C
Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies.
Li, Jin; Huang, Dongli; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Jiang, Yongshuai; Lv, Hongchao; Wang, Limei
Currently, most methods for detecting gene-gene interactions (GGIs) in genome-wide association studies are divided into SNP-based methods and gene-based methods. Generally, the gene-based methods can be more powerful than SNP-based methods. Some gene-based entropy methods can only capture the linear relationship between genes. We therefore proposed a nonparametric gene-based information gain method (GBIGM) that can capture both linear relationship and nonlinear correlation between genes. Through simulation with different odds ratio, sample size and prevalence rate, GBIGM was shown to be valid and more powerful than classic KCCU method and SNP-based entropy method. In the analysis of data from 17 genes on rheumatoid arthritis, GBIGM was more effective than the other two methods as it obtains fewer significant results, which was important for biological verification. Therefore, GBIGM is a suitable and powerful tool for detecting GGIs in case-control studies.
Fondevila, M; Børsting, C; Phillips, C
This review explores the key factors that influence the optimization, routine use, and profile interpretation of the SNaPshot single-base extension (SBE) system applied to forensic single-nucleotide polymorphism (SNP) genotyping. Despite being a mainly complementary DNA genotyping technique to routine STR profiling, use of SNaPshot is an important part of the development of SNP sets for a wide range of forensic applications with these markers, from genotyping highly degraded DNA with very short amplicons to the introduction of SNPs to ascertain the ancestry and physical characteristics of an unidentified contact trace donor. However, this technology, as resourceful as it is, displays several features that depart from the usual STR genotyping far enough to demand a certain degree of expertise from the forensic analyst before tackling the complex casework on which SNaPshot application provides...
Many genetic association studies have used single nucleotide polymorphism (SNP) data to identify genetic variants for complex diseases. Although SNP-based associations are most common in genome-wide association studies (GWAS), gene-based association analysis has received increasing attention in understanding genetic etiologies for complex diseases. While both methods have been used to analyze the same data, few genome-wide association studies compare the results or observe the connection between them. We performed a comprehensive analysis of the data from the Study of Addiction: Genetics and Environment (SAGE) and compared the results from the SNP-based and gene-based analyses. Our results suggest that the gene-based method complements the individual SNP-based analysis, and conceptually they are closely related. In terms of gene findings, our results validate many genes that were either reported from the analysis of the same dataset or based on animal studies for substance dependence.
Gupta, Shefali; Kumar, Tapan; Verma, Subodh; Bharadwaj, Chellapilla; Bhatia, Sabhyata
Seed weight and plant height are important agronomic traits and contribute to seed yield. The objective of this study was to identify QTLs underlying these traits using an intra-specific mapping population of chickpea. A F11 population of 177 recombinant inbred lines derived from a cross between SBD377 (100-seed weight--48 g and plant height--53 cm) and BGD112 (100-seed weight--15 g and plant height--65 cm) was used. A total of 367 novel EST-derived functional markers were developed which included 187 EST-SSRs, 130 potential intron polymorphisms (PIPs) and 50 expressed sequence tag polymorphisms (ESTPs). Along with these, 590 previously published markers including 385 EST-based markers and 205 genomic SSRs were utilized. Of the 957 markers tested for analysis of parental polymorphism between the two parents of the mapping population, 135 (14.64%) were found to be polymorphic. Of these, 131 polymorphic markers could be mapped to the 8 linkage groups. The linkage map had a total length of 1140.54 cM with an average marker density of 8.7 cM. The map was further used for QTL identification using composite interval mapping method (CIM). Two QTLs each for seed weight, qSW-1 and qSW-2 (explaining 11.54 and 19.24% of phenotypic variance, respectively) and plant height, qPH-1 and qPH-2 (explaining 13.98 and 12.17% of phenotypic variance, respectively) were detected. The novel set of genic markers, the intra-specific linkage map and the QTLs identified in the present study will serve as valuable genomic resources in improving the chickpea seed yield using marker-assisted selection (MAS) strategies.
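The average marker density quoted above follows directly from the reported map length and the number of mapped markers. The short check below uses only those two figures from the abstract; the variable names are illustrative.

```python
# Figures taken from the abstract above.
map_length_cm = 1140.54   # total length of the linkage map, in cM
mapped_markers = 131      # polymorphic markers placed on the 8 linkage groups

# Average spacing between markers, expressed in cM per marker.
density = map_length_cm / mapped_markers
density_rounded = round(density, 1)
```

The result rounds to the 8.7 cM stated in the abstract.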
To understand the genetic basis of tolerance to drought and heat stresses in chickpea, a comprehensive association mapping approach has been undertaken. Phenotypic data were generated on the reference set (300 accessions, including 211 mini-core collection accessions) for drought tolerance-related root traits, heat tolerance, yield and yield component traits from 1-7 seasons and 1-3 locations in India (Patancheru, Kanpur, Bangalore) and three locations in Africa (Nairobi and Egerton in Kenya, and Debre Zeit in Ethiopia). Diversity Array Technology (DArT) markers equally distributed across the chickpea genome were used to determine population structure, and three sub-populations were identified using the admixture model in STRUCTURE. The pairwise linkage disequilibrium (LD), estimated using the squared allele-frequency correlations (r2; when r2<0.20), was found to decay rapidly with the genetic distance of 5 cM. For establishing marker-trait associations (MTAs), both genome-wide and candidate gene-sequencing based association mapping approaches were conducted using 1,872 markers (1,072 DArTs, 651 single nucleotide polymorphisms [SNPs], 113 gene-based SNPs and 36 simple sequence repeats [SSRs]) and the phenotyping data mentioned above, employing mixed linear model (MLM) analysis with optimum compression with the P3D method and a kinship matrix. As a result, 312 significant MTAs were identified, and the maximum number of MTAs (70) was identified for 100-seed weight. A total of 18 SNPs from 5 genes (ERECTA, 11 SNPs; ASR, 4 SNPs; DREB, 1 SNP; CAP2 promoter, 1 SNP; and AMDH, 1 SNP) were significantly associated with different traits. This study provides significant MTAs for drought and heat tolerance in chickpea that can be used, after validation, in molecular breeding for developing superior varieties with enhanced drought and heat tolerance.
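The squared allele-frequency correlation (r2) used above as the LD measure can be sketched for a pair of bi-allelic loci. The example assumes phased haplotypes coded 0/1, which is a simplification chosen for illustration. The abstract does not describe how r2 was computed from the DArT and SNP data, and the haplotypes below are hypothetical.

```python
def ld_r2(haplotypes):
    """Squared allele-frequency correlation between two bi-allelic loci.

    haplotypes: list of (allele_at_locus1, allele_at_locus2) tuples,
    with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n                   # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n                   # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n      # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either locus is monomorphic")
    return d * d / denom

# Hypothetical haplotype samples (not study data).
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]   # alleles always travel together
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]         # alleles combine independently

r2_complete = ld_r2(complete_ld)
r2_none = ld_r2(no_ld)
```

Plotting such pairwise r2 values against genetic distance is the usual way an LD decay distance, such as the 5 cM mentioned above, is read off.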
Martin W Ganal; Andreas Polley; Eva-Maria Graner; Joerg Plieske; Ralf Wieseke; Hartmut Luerssen; Gregor Durstewitz
Genotyping with large numbers of molecular markers is now an indispensable tool within plant genetics and breeding. Especially through the identification of large numbers of single nucleotide polymorphism (SNP) markers using the novel high-throughput sequencing technologies, it is now possible to reliably identify many thousands of SNPs at many different loci in a given plant genome. For a number of important crop plants, SNP markers are now being used to design genotyping arrays containing thousands of markers spread over the entire genome and to analyse large numbers of samples. In this article, we discuss aspects that should be considered during the design of such large genotyping arrays and the analysis of individuals. The fact that crop plants are also often autopolyploid or allopolyploid is given due consideration. Furthermore, we outline some potential applications of large genotyping arrays including high-density genetic mapping, characterization (fingerprinting) of genetic material and breeding-related aspects such as association studies and genomic selection.
Background: Genome-wide single-nucleotide polymorphism (SNP) arrays containing hundreds of thousands of SNPs from the human genome have proven useful for studying important human genome questions. Data quality of SNP arrays plays a key role in the accuracy and precision of downstream data analyses. However, good indices for assessing data quality of SNP arrays have not yet been developed. Results: We developed new quality indices to measure the quality of SNP arrays and/or DNA samples and investigated their statistical properties. The indices quantify a departure of estimated individual-level allele frequencies (AFs) from expected frequencies via standardized distances. The proposed quality indices followed lognormal distributions in several large genomic studies that we empirically evaluated. AF reference data and quality index reference data for different SNP array platforms were established based on samples from various reference populations. Furthermore, a confidence interval method based on the underlying empirical distributions of quality indices was developed to identify poor-quality SNP arrays and/or DNA samples. Analyses of authentic biological data and simulated data show that this new method is sensitive and specific for the detection of poor-quality SNP arrays and/or DNA samples. Conclusions: This study introduces new quality indices, establishes references for AFs and quality indices, and develops a detection method for poor-quality SNP arrays and/or DNA samples. We have developed a new computer program that utilizes these methods, called SNP Array Quality Control (SAQC). SAQC software is written in R and R-GUI and was developed as a user-friendly tool for the visualization and evaluation of data quality of genome-wide SNP arrays. The program is available online (http://www.stat.sinica.edu.tw/hsinchou/genetics/quality/SAQC.htm).
Pootakham, Wirulda; Shearman, Jeremy R; Ruang-Areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke
Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10.
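As a rough worked check on the map statistics quoted above, the average marker spacing follows directly from the reported map length (1,785 cM), marker count (2,110) and number of linkage groups (19). The function name below is illustrative and not from the study; real spacing varies between linkage groups, so this is only a genome-wide mean.

```python
def mean_marker_spacing(map_length_cm: float, n_markers: int) -> float:
    """Average genetic distance per marker, in cM (genome-wide mean only)."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# Values reported in the abstract: 1,785 cM, 2,110 SNPs, 19 linkage groups.
spacing = mean_marker_spacing(1785.0, 2110)
markers_per_group = 2110 / 19

print(f"{spacing:.2f} cM per marker")          # about 0.85 cM
print(f"{markers_per_group:.0f} markers per linkage group on average")
```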
Ren, Jing; Chen, Liang; Sun, Daokun; You, Frank M; Wang, Jirui; Peng, Yunliang; Nevo, Eviatar; Beiles, Avigdor; Sun, Dongfa; Luo, Ming-Cheng; Peng, Junhua
.... However, few studies have been performed on the genetic structure and population divergence in wild emmer wheat using a large number of EST-related single nucleotide polymorphism (SNP) markers...
吴星波; 郝俊杰; 张晓艳; 万述伟; 李红卫; 邵阳; 孙吉禄
Eight SCAR primer combinations (SAU5, SS18, SF6Em3, SF12R9, SF13R10, SF18R7, SF18R15 and SMe1Em5) for common bean powdery mildew resistance genes were used to screen the genomic DNA of 78 common bean accessions. The SCAR primer pairs SF12R9, SF13R10, SF18R7, SF18R15 and SMe1Em5 did not amplify target bands in any accession; the SAU5 marker appeared in 52 accessions, the SF6Em3 marker in 66 accessions, and the SS18 marker in 76 accessions. Seventy-six accessions carried 2 to 3 SCAR markers each. The types of powdery mildew resistance genes in each of the 78 accessions were identified, and accessions pyramiding powdery mildew resistance genes were selected.
Peter P. Grimminger
Background. To further improve the screening, diagnosis, and therapy of patients with non-small cell lung cancer (NSCLC), additional diagnostic tools are urgently needed. Gene expression of cyclooxygenase-2 (COX-2) has been linked to prognosis in patients with NSCLC. The role of the COX-2 926G>C single nucleotide polymorphism (SNP) in patients with NSCLC remains unclear. The aim of this study was to investigate the potential of the COX-2 926G>C SNP as a molecular marker in this disease. Methods. The COX-2 926G>C SNP was analyzed in surgically resected tumor tissue of 85 patients with NSCLC using a PCR-based RFLP technique. Results. The COX-2 926G>C SNP genotypes were detected with the following frequencies: GG n=62 (73%), GC n=20 (23%), CC n=3 (4%). No associations were detectable between COX-2 SNP genotype and histology, grading or gender. The COX-2 SNP was significantly associated with tumor stage (P=.032) and lymph node status (P=.016, Chi-square test). With a median follow-up of 85.9 months, the median survival was 59.7 months. No associations were seen between the COX-2 SNP genotype and patient prognosis. Conclusions. The COX-2 926G>C SNP is detectable at a high frequency in patients with NSCLC. The COX-2 926G>C SNP genotype is not a prognostic molecular marker in this disease. However, patients with the GC or CC genotype seem more susceptible to lymph node metastases and higher tumor stage than patients with the GG genotype. The results suggest the COX-2 926G>C SNP as a molecular marker for lymph node involvement in this disease.
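The genotype percentages reported above can be reproduced from the raw counts, and the same counts give the allele frequencies, which the abstract does not state. The sketch below is only an illustration of that arithmetic; the function name is hypothetical, and it assumes each genotype label is a two-letter string with one letter per allele.

```python
def genotype_summary(counts):
    """Return (genotype frequencies, allele frequencies) from genotype counts.

    `counts` maps two-letter genotype labels such as "GC" to sample counts.
    """
    total = sum(counts.values())
    genotype_freq = {g: n / total for g, n in counts.items()}

    allele_counts = {}
    for genotype, n in counts.items():
        for allele in genotype:  # each sample contributes two alleles
            allele_counts[allele] = allele_counts.get(allele, 0) + n

    n_alleles = 2 * total
    allele_freq = {a: c / n_alleles for a, c in allele_counts.items()}
    return genotype_freq, allele_freq


# Counts reported in the abstract (85 patients).
genotype_freq, allele_freq = genotype_summary({"GG": 62, "GC": 20, "CC": 3})

for genotype, freq in genotype_freq.items():
    print(f"{genotype}: {freq:.1%}")
for allele, freq in allele_freq.items():
    print(f"{allele} allele: {freq:.3f}")
```

With these counts the C allele frequency works out to 26/170, roughly 0.15.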
Garcés-Claver, Ana; Fellman, Shanna Moore; Gil-Ortega, Ramiro; Jahn, Molly; Arnedo-Andrés, María S
A single nucleotide polymorphism (SNP) associated with pungency was detected within an expressed sequence tag (EST) of 307 bp. This fragment was identified after expression analysis of the EST clone SB2-66 in placenta tissue of Capsicum fruits. Sequence alignments corresponding to this new fragment allowed us to identify an SNP between pungent and non-pungent accessions. Two methods were chosen for the development of the SNP marker linked to pungency: tetra-primer amplification refractory mutation system-PCR (tetra-primer ARMS-PCR) and cleaved amplified polymorphic sequence. Results showed that both methods were successful in distinguishing genotypes. Nevertheless, tetra-primer ARMS-PCR was chosen for SNP genotyping because it was more rapid, more reliable and more cost-effective. The utility of this SNP marker for pungency was demonstrated by the ability to distinguish between 29 pungent and non-pungent cultivars of Capsicum annuum. In addition, the SNP was also associated with the pungent phenotype in the tested genotypes of C. chinense, C. baccatum, C. frutescens, C. galapagoense, C. eximium, C. tovarii and C. cardenasii. This SNP marker is a faster, cheaper and more reproducible method for identifying pungent peppers than other techniques such as panel tasting, and allows rapid screening of the trait in early growth stages.
Development of PCR-based markers for SNP detection is prerequisite for various genetic analyses. The use of restriction enzymes following PCR amplification is a common and relatively low cost method for SNP detection. Simple and cost-effective methodologies for SNP marker development that would en...
Minica, C.C.; Dolan, C.V.; Hottenga, J.J.; Pool, R.; Fedko, I.O; Mbarek, H.; Huppertz, C.; Bartels, M.; Boomsma, D.I.; Vink, J.M.
Prior searches for genetic variants (GVs) implicated in initiation of cannabis use have been limited to common single nucleotide polymorphisms (SNPs) typed in HapMap samples. Denser SNPs are now available with the completion of the 1000 Genomes and the Genome of the Netherlands projects. More densel
田义轲; 王彩虹; 白牡丹; 殷豪; 李节法
Functional markers based on gene sequences are ideal markers for genotyping, gene mapping and marker-assisted selection for associated traits. In this research, 5 PCR primer pairs were designed according to the cDNA sequence of the PpKO gene in pear, which encodes ent-kaurene oxidase (KO), one of the key enzymes in the gibberellin biosynthesis pathway. These primer pairs were tested by high resolution melting (HRM) analysis in the population derived from the cross 'Aishengli' × 'Chili', and one of them was selected for genotyping. Furthermore, the fine genotyping ability of this primer pair was confirmed in 9 cultivars. Sequencing and comparison of amplicons from the progenies and the 9 cultivars showed that the primer pair amplified a fragment 200 bp in length, located in the eighth exon of the PpKO gene, in which 2 non-synonymous SNPs (single nucleotide polymorphisms) were detected. The two SNPs are genetic markers suitable for high resolution analysis and are useful tools for genetic mapping, gene location and germplasm identification.
Background: Single nucleotide polymorphisms (SNPs) are the most common genetic variations in the human genome and are useful as genomic markers. Oligonucleotide SNP microarrays have been developed for high-throughput genotyping of up to 900,000 human SNPs and have been used widely in linkage and cancer genomics studies. We have previously used Hidden Markov Models (HMM) to analyze SNP array data for inferring copy numbers and loss-of-heterozygosity (LOH) from paired normal and tumor samples and unpaired tumor samples. Results: We proposed and implemented major copy proportion (MCP) analysis of oligonucleotide SNP array data. An HMM was constructed to infer unobserved MCP states from observed allele-specific signals through emission and transition distributions. We used 10 K, 100 K and 250 K SNP array datasets to compare MCP analysis with LOH and copy number analysis, and showed that MCP performs better than LOH analysis for allelic-imbalanced chromosome regions and normal-contaminated samples. The major and minor copy alleles can also be inferred from allelic-imbalanced regions by MCP analysis. Conclusion: MCP extends tumor LOH analysis to allelic imbalance analysis and supplies complementary information to total copy numbers. MCP analysis of mixed normal and tumor samples suggests the utility of MCP analysis for normal-contaminated tumor samples. The described analysis and visualization methods are readily available in the user-friendly dChip software.
Strain-specific genomic diversity in the Mycobacterium tuberculosis complex (MTBC) is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Several systems have been proposed to classify MTBC strains into distinct lineages and families. Here, we investigate single-nucleotide polymorphisms (SNPs) as robust (stable) markers of genetic variation for phylogenetic analysis. We identify ∼92k SNP across a global collection of 1,601 genomes. The SNP-based phylogeny is consistent with the gold-standard regions of difference (RD) classification system. Of the ∼7k strain-specific SNPs identified, 62 markers are proposed to discriminate known circulating strains. This SNP-based barcode is the first to cover all main lineages, and classifies a greater number of sublineages than current alternatives. It may be used to classify clinical isolates to evaluate tools to control the disease, including therapeutics and vaccines whose effectiveness may vary by strain type. © 2014 Macmillan Publishers Limited.
Background: Breast cancer predisposition genes identified to date (e.g., BRCA1 and BRCA2) are responsible for less than 5% of all breast cancer cases. Many studies have shown that the cancer risks associated with individual commonly occurring single nucleotide polymorphisms (SNPs) are incremental. However, polygenic models suggest that multiple commonly occurring low to modestly penetrant SNPs of cancer related genes might have a greater effect on a disease when considered in combination. Methods: In an attempt to identify the breast cancer risk conferred by SNP interactions, we have studied 19 SNPs from genes involved in major cancer related pathways. All SNPs were genotyped by TaqMan 5' nuclease assay. The association between the case-control status and each individual SNP, measured by the odds ratio and its corresponding 95% confidence interval, was estimated using unconditional logistic regression models. At the second stage, two-way interactions were investigated using multivariate logistic models. The robustness of the interactions, which were observed among SNPs with stronger functional evidence, was assessed using a bootstrap approach and correction for multiple testing based on the false discovery rate (FDR) principle. Results: None of these SNPs contributed to breast cancer risk individually. However, we have demonstrated evidence for gene-gene (SNP-SNP) interaction among these SNPs, which were associated with increased breast cancer risk. Our study suggests cross talk between the SNPs of the DNA repair and immune system (XPD-[Lys751Gln] and IL10-[G(-1082)A]), cell cycle and estrogen metabolism (CCND1-[Pro241Pro] and COMT-[Met108/158Val]), cell cycle and DNA repair (BARD1-[Pro24Ser] and XPD-[Lys751Gln]), and within carcinogen metabolism (GSTP1-[Ile105Val] and COMT-[Met108/158Val]) pathways. Conclusion: The importance of these pathways and their communication in breast cancer predisposition has been emphasized previously, but their
Background: Good genetic progress for pig reproduction traits has been achieved using a quantitative genetics-based multi-trait BLUP evaluation system. At present, whole-genome single nucleotide polymorphism (SNP) panels provide a new tool for pig selection. The purpose of this study was to identify SNP associated with reproduction traits in the Finnish Landrace pig breed using the Illumina PorcineSNP60 BeadChip. Methods: Association of each SNP with different traits was tested with a weighted linear model, using SNP genotype as a covariate and animal as a random variable. Deregressed estimated breeding values of the progeny tested boars were used as the dependent variable and weights were based on their reliabilities. Statistical significance of the associations was based on Bonferroni-corrected P-values. Results: Deregressed estimated breeding values were available for 328 genotyped boars. Of the 62 163 SNP on the chip, 57 868 SNP had a call rate > 0.9 and 7 632 SNP were monomorphic. Statistically significant results (P-value P-value P-value = 1.69E-08 more than unfavourable double homozygote animals. A region on chromosome 9 (66 Mb) was statistically significant for piglet mortality between birth and weaning in later parity (0.44 piglets between homozygotes, P-value = 6.94E-08). Conclusions: Three separate regions on chromosome 9 gave significant results for litter size and pig mortality. The frequencies of favourable alleles of the significant SNP are moderate in the Finnish Landrace population and these SNP are thus valuable candidates for possible marker-assisted selection.
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was the design of a 50-60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable for use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome, stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content, and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools, proved to be efficient in producing this GoatSNP50 chip. Advances in the use of molecular markers are expected to accelerate goat genomic studies in coming years.
Iliadis, Alexandros; Anastassiou, Dimitris; Wang, Xiaodong
Copy number variations (CNVs) are abundant in the human genome. They have been associated with complex traits in genome-wide association studies (GWAS) and are expected to continue playing an important role in identifying the etiology of disease phenotypes. As a result of current high-throughput whole-genome single-nucleotide polymorphism (SNP) arrays, we currently have datasets that simultaneously have integer copy numbers in CNV regions as well as SNP genotypes. At the same time, haplotypes, which have been shown to offer advantages over genotypes in identifying disease traits, are available for SNP genotypes but largely unavailable for CNV/SNP data owing to insufficient computational tools. We introduce a new framework for inferring haplotypes in CNV/SNP data using a sequential Monte Carlo sampling scheme, 'Tree-Based Deterministic Sampling CNV' (TDSCNV). We compare our method with polyHap (v2.0), the only currently available software able to perform inference in CNV/SNP genotypes, on datasets with varying numbers of markers. We have found that both algorithms show similar accuracy, but TDSCNV is an order of magnitude faster while scaling linearly with the number of markers and the number of individuals, and thus could be the method of choice for haplotype inference in such datasets. Our method is implemented in the TDSCNV package, which is available for download at http://www.ee.columbia.edu/~anastas/tdscnv.
Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A.; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L.; Alexandrov, Nickolai
We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. PMID:27899667
Gardner, S; Jaing, C
The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we described the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.
Janssen, Manoe J; Arcolino, Fanny O; Schoor, Perry; Kok, Robbert Jan; Mastrobattista, Enrico
In this review we provide an overview of the expanding molecular toolbox that is available for gene based therapies and how these therapies can be used for a large variety of kidney diseases. Gene based therapies range from restoring gene function in genetic kidney diseases to steering complex molec
Watson-Haigh Nathan S
Background: Whole genome association studies using highly dense single nucleotide polymorphisms (SNPs) are a set of methods to identify DNA markers associated with variation in a particular complex trait of interest. One of the main outcomes from these studies is a subset of statistically significant SNPs. Finding the potential biological functions of such SNPs can be an important step towards further use in human and agricultural populations (e.g., for identifying genes related to susceptibility to complex diseases or genes playing key roles in development or performance). The current challenge is that the information holding the clues to SNP functions is distributed across many different databases. Efficient bioinformatics tools are therefore needed to seamlessly integrate up-to-date functional information on SNPs. Many web services have arisen to meet the challenge but most work only within the framework of human medical research. Although we acknowledge the importance of human research, we see a need for SNP annotation tools for other organisms. Description: We introduce an R package called FunctSNP, which is the user interface to custom built species-specific databases. The local relational databases contain SNP data together with functional annotations extracted from online resources. FunctSNP provides a unified bioinformatics resource to link SNPs with functional knowledge (e.g., genes, pathways, ontologies). We also introduce dbAutoMaker, a suite of Perl scripts, which can be scheduled to run periodically to automatically create/update the customised SNP databases. We illustrate the use of FunctSNP with a livestock example, but the approach and software tools presented here can be applied also to human and other organisms. Conclusions: Finding the potential functional significance of SNPs is important when further using the outcomes from whole genome association studies. FunctSNP is unique in that it is the only R
Background: PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. Results: The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information about SNPs with multiple functions, such as SNP retrieval for multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. Conclusions: The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2.
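The core idea behind restriction enzyme mining for a SNP, finding an enzyme whose recognition site is present with one allele and absent with the other, can be sketched in a few lines. This is not the SNP-RFLPing 2 implementation: the function name, the five-enzyme table and the example flanking sequences are illustrative assumptions, and mutagenic primers are not modelled. All listed recognition sites are palindromic, so scanning one strand is sufficient here.

```python
# Small hand-picked subset of common enzymes and their recognition sites.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
    "MspI": "CCGG",
}


def discriminating_enzymes(left_flank, alleles, right_flank, enzymes=ENZYMES):
    """Map enzyme name -> alleles it cuts, for enzymes that cut some but not all alleles."""
    result = {}
    for name, site in enzymes.items():
        window = len(site) - 1
        cut_alleles = []
        for allele in alleles:
            # Only sequence within one site-length of the SNP can form a
            # site that overlaps the polymorphic base.
            local = left_flank[-window:] + allele + right_flank[:window]
            if site in local:
                cut_alleles.append(allele)
        if 0 < len(cut_alleles) < len(alleles):
            result[name] = cut_alleles
    return result


# Hypothetical A/G SNP: the A allele completes an EcoRI site (GAATTC).
hits = discriminating_enzymes("ACGTGA", ("A", "G"), "TTCAGT")
print(hits)
```

An enzyme returned by this check would digest the PCR product of one allele only, which is what makes the two genotypes distinguishable on a gel.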
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...
Fountain, Emily D; Pauli, Jonathan N; Reid, Brendan N; Palsbøll, Per J; Peery, M Zachariah
Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in non-model organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown
Helyar, Sarah J; Limborg, Morten; Bekkevold, Dorte
to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic...
Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A
Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool.
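The filtering step described above (keep homozygous SNPs; drop heterozygous calls, MNPs and InDels) can be illustrated on plain VCF text. The sketch below is not iBrowser's code: the function name is hypothetical, it handles a single sample column, and it assumes that only homozygous alternate calls are of interest, so homozygous reference calls are skipped as well.

```python
def homozygous_snps(vcf_lines, sample_index=0):
    """Return (chrom, pos, ref, alt) for homozygous-alternate SNP calls in one sample."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        alts = fields[4].split(",")

        # A SNP has a single-base REF and single-base ALT alleles;
        # anything longer is an MNP or an InDel and is dropped.
        if len(ref) != 1 or any(len(a) != 1 or a not in "ACGT" for a in alts):
            continue

        format_keys = fields[8].split(":")
        sample = fields[9 + sample_index].split(":")
        genotype = sample[format_keys.index("GT")].replace("|", "/").split("/")

        if "." in genotype or len(set(genotype)) != 1:
            continue  # missing or heterozygous call
        allele_index = int(genotype[0])
        if allele_index == 0:
            continue  # homozygous reference: no variant (assumption, see lead-in)
        kept.append((chrom, pos, ref, alts[allele_index - 1]))
    return kept


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1",        # homozygous SNP
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1",        # heterozygous SNP
    "chr1\t300\t.\tAT\tA\t50\tPASS\t.\tGT\t1/1",       # deletion
    "chr1\t400\t.\tAC\tGT\t50\tPASS\t.\tGT\t1/1",      # MNP
    "chr1\t500\t.\tG\tC\t50\tPASS\t.\tGT:DP\t1|1:12",  # phased homozygous SNP
]
print(homozygous_snps(example))
```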
Børsting, Claus; Fordyce, Sarah L; Olofsson, Jill Katharina;
The Ion Torrent™ HID SNP assay amplified 136 autosomal SNPs and 33 Y-chromosome markers in one PCR and the markers were subsequently typed using the Ion PGM™ second generation sequencing platform. A total of 51 of the autosomal SNPs were selected from the SNPforID panel that is routinely used...... allele balance among samples. These SNPs should be excluded from the panel. The optimal amount of DNA in the PCR seemed to be ≥0.5ng. Allele drop-outs were rare and only seen in experiments with ... of the heterozygote allele balances were between 0.6 and 1.6, which is comparable to the heterozygote balances of STRs typed with PCR-CE. The number of reads with base calls that differed from the genotype call was typically less than five. This allowed detection of 1:100 mixtures with a high degree of certainty...
Yoshinaga Yoshimura, Tomoko Ohtake, Hajime Okada, Takehiro Ami, Tadashi Tsukaguchi and Kenzo Fujimoto
We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.
Yoshimura, Yoshinaga; Ohtake, Tomoko; Okada, Hajime; Fujimoto, Kenzo [School of Materials Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292 (Japan); Ami, Takehiro [Innovation Plaza Ishikawa, Japan Science and Technology Agency, 2-13 Asahidai, Nomi, Ishikawa 923-1211 (Japan); Tsukaguchi, Tadashi, E-mail: email@example.com [Faculty of Bioresources and Environmental Sciences, Ishikawa Prefectural University, 1-308 Suematsu, Nonoichi, Ishikawa 921-8836 (Japan)
We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.
Engelsma, K.A.; Veerkamp, R.F.; Calus, M.P.L.; Bijma, P.; Windig, J.J.
Genetic diversity is often evaluated using pedigree information. Currently, diversity can be evaluated in more detail over the genome based on large numbers of SNP markers. Pedigree- and SNP-based diversity were compared for two small related groups of Holstein animals genotyped with the 50 k SNP
Full Text Available non-tagSNP -: SNP not included in LD bin calculation (MAF Best tagSNP The flag that indicates whether the SNP is the best... tagSNP or not. 1: Best tagSNP 0: non-best tagSNP -: SNP
Zhang, Jia; Li, Kai; Pardinas, Jose R; Liao, Duan F; Li, Hong J; Zhang, Xu
Single nucleotide polymorphisms (SNPs) are useful physical markers for genetic studies as well as the cause of some genetic diseases. To develop more reliable SNP assays, we examined the underlying molecular mechanisms by which deoxyribonucleic acid (DNA) polymerases with 3' exonuclease activity maintain the high fidelity of DNA replication. In addition to mismatch removal by proofreading, we have discovered a premature termination of polymerization mediated by a novel OFF-switch mechanism. Two SNP assays were developed, one based on proofreading using 3' end-labeled primer extension and the other based on the newly identified OFF-switch, respectively. These two new assays are well suited for conventional techniques, such as electrophoresis and microplates detection systems as well as the sophisticated microchips. Application of these reliable SNP assays will greatly facilitate genetic and biomedical studies in the postgenome era.
Background: Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results: We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion: Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita. This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research
Mukherjee, Shubhabrata; Kim, Sungeun; Ramanan, Vijay K; Gibbons, Laura E; Nho, Kwangsik; Glymour, M Maria; Ertekin-Taner, Nilüfer; Montine, Thomas J; Saykin, Andrew J; Crane, Paul K
Resilience in executive functioning (EF) is characterized by high EF measured by neuropsychological test performance despite structural brain damage from neurodegenerative conditions. We previously reported single nucleotide polymorphism (SNP) genome-wide association study (GWAS) results for EF resilience. Here, we report gene- and pathway-based analyses of the same resilience phenotype, using an optimal SNP-set (Sequence) Kernel Association Test (SKAT) for gene-based analyses (conservative threshold for genome-wide significance = 0.05/18,123 = 2.8 × 10^-6) and the gene-set enrichment package GSA-SNP for biological pathway analyses (False discovery rate (FDR) resilience (p = 1.33 × 10^-7). Genetic pathways involved with dendritic/neuron spine, presynaptic membrane, postsynaptic density, etc., were enriched with association to EF resilience. Although replication of these results is necessary, our findings indicate the potential value of gene- and pathway-based analyses in research on determinants of cognitive resilience.
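The conservative genome-wide threshold quoted above is a Bonferroni correction: the family-wise error rate divided by the number of gene-based tests. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Values quoted in the abstract: alpha = 0.05 across 18,123 gene-based tests.
threshold = bonferroni_threshold(0.05, 18_123)
print(f"{threshold:.2e}")  # 2.76e-06, reported in the abstract as 2.8 x 10^-6
```

A gene-level p-value is then declared genome-wide significant only if it falls below this threshold.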
Cregan Perry B
Abstract Background Single nucleotide polymorphisms (SNPs) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. The polymorphism detection software packages PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. Results The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNPs generated by sequencing 1973 STS (sequence tag sites; 12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNPs were generated similarly from 1359 additional STS (8 Mb). SNPs from both sets were classified by experts. After training, the ML classifier agreed with the experts on 97.3% of test data, compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV, i.e., the fraction of candidate SNPs that are real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. Conclusion A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study
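The positive predictive values reported above can be checked against the quoted false-positive counts. The sketch below assumes that all 18,390 test candidates count as PolyBayes positive calls, which the abstract implies but does not state outright; the ML true-positive count is back-calculated from the reported PPV and is not a figure reported in the study.

```python
def ppv(true_positives: float, false_positives: float) -> float:
    """Positive predictive value: fraction of positive calls that are real."""
    return true_positives / (true_positives + false_positives)


def implied_true_positives(ppv_value: float, false_positives: float) -> float:
    """Solve PPV = TP / (TP + FP) for TP."""
    return ppv_value * false_positives / (1.0 - ppv_value)


# PolyBayes: 18,390 test candidates, of which 16,955 were false positives.
candidates = 18_390
polybayes_fp = 16_955
polybayes_tp = candidates - polybayes_fp        # assumption: every candidate is a positive call
polybayes_ppv = ppv(polybayes_tp, polybayes_fp)
print(f"PolyBayes PPV: {polybayes_ppv:.1%}")    # 7.8%

# ML classifier: 249 false positives at a reported PPV of 84.8%.
ml_tp = implied_true_positives(0.848, 249)
print(f"Implied ML true positives: {ml_tp:.0f}")  # about 1389
```

Under these assumptions both methods recover roughly 1,400 true SNPs, which is consistent with the statement that they produced a similar number of true positives.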
Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh
Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine.
Abstract Background Identification of causal SNPs in most genome wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account. Hence, modern, computationally expensive regression methods that consider all markers simultaneously, and thus incorporate dependencies among SNPs, are increasingly employed for SNP selection. Results We develop a novel multivariate algorithm for large scale SNP selection using CAR score regression, a promising new approach for prioritizing biomarkers. Specifically, we propose a computationally efficient procedure for shrinkage estimation of CAR scores from high-dimensional data. Subsequently, we conduct a comprehensive comparison study including five advanced regression approaches (boosting, lasso, NEG, MCP, and CAR score) and a univariate approach (marginal correlation) to determine the effectiveness in finding true causal SNPs. Conclusions Simultaneous SNP selection is a challenging task. We demonstrate that our CAR score-based algorithm consistently outperforms all competing approaches, both uni- and multivariate, in terms of correctly recovered causal SNPs and SNP ranking. An R package implementing the approach as well as R code to reproduce the complete study presented here is available from http://strimmerlab.org/software/care/.
Gumus, Ergun; Gormez, Zeliha; Kursun, Olcay
Biomarker discovery is a challenging task of bioinformatics especially when targeting high dimensional problems such as SNP (single nucleotide polymorphism) datasets. Various types of feature selection methods can be applied to accomplish this task. Typically, using features versus class labels of samples in the training dataset, these methods aim at selecting feature subsets with maximal classification accuracies. Although finding such class-discriminative features is crucial, selection of relevant SNPs for maximizing other properties that exist in the nature of population genetics such as the correlation between genetic diversity and geographical distance of ethnic groups can also be equally important. In this work, a methodology using a multi objective optimization technique called Pareto Optimal is utilized for selecting SNP subsets offering both high classification accuracy and correlation between genomic and geographical distances. In this method, discriminatory power of an SNP is determined using mutual information and its contribution to the genomic-geographical correlation is estimated using its loadings on principal components. Combining these objectives, the proposed method identifies SNP subsets that can better discriminate ethnic groups than those obtained with sole mutual information and yield higher correlation than those obtained with sole principal components on the Human Genome Diversity Project (HGDP) SNP dataset.
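The discriminatory power of a single SNP, as described above, can be scored by the mutual information between its genotypes and the class labels. The sketch below is a plain empirical estimate in bits; the genotype coding (0, 1, 2 copies of an allele) and the toy population labels are assumptions for illustration, and the study's own estimator and its Pareto-optimal selection step are not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(genotypes, labels):
    """Empirical mutual information (in bits) between genotypes and class labels."""
    if len(genotypes) != len(labels) or not genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(genotypes)
    joint = Counter(zip(genotypes, labels))
    genotype_counts = Counter(genotypes)
    label_counts = Counter(labels)
    mi = 0.0
    for (g, l), count in joint.items():
        # p(g, l) * log2( p(g, l) / (p(g) * p(l)) ), written with raw counts
        mi += (count / n) * log2(count * n / (genotype_counts[g] * label_counts[l]))
    return mi


# Toy data: two hypothetical populations "A" and "B", four samples each SNP.
labels = ["A", "A", "B", "B"]
informative_snp = [0, 0, 2, 2]    # genotype tracks the population exactly
uninformative_snp = [0, 2, 0, 2]  # genotype independent of the population

print(mutual_information(informative_snp, labels))    # 1.0
print(mutual_information(uninformative_snp, labels))  # 0.0
```

Ranking SNPs by this score gives the first objective; the second objective in the abstract (contribution to the genomic-geographical correlation via principal component loadings) would be computed separately.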
Mooser, V; Waterworth, D M; Isenhour, T; Middleton, L
In the past pharmacological agents have contributed to a significant reduction in age-adjusted incidence of cardiovascular events. However, not all patients treated with these agents respond favorably, and some individuals may develop side-effects. With aging of the population and the growing prevalence of cardiovascular risk factors worldwide, it is expected that the demand for cardiovascular drugs will increase in the future. Accordingly, there is a growing need to identify the 'good' responders as well as the persons at risk for developing adverse events. Evidence is accumulating to indicate that responses to drugs are at least partly under genetic control. As such, pharmacogenetics - the study of variability in drug responses attributed to hereditary factors in different populations - may significantly assist in providing answers toward meeting this challenge. Pharmacogenetics mostly relies on associations between a specific genetic marker like single nucleotide polymorphisms (SNPs), either alone or arranged in a specific linear order on a certain chromosomal region (haplotypes), and a particular response to drugs. Numerous associations have been reported between selected genotypes and specific responses to cardiovascular drugs. Recently, for instance, associations have been reported between specific alleles of the apoE gene and the lipid-lowering response to statins, or the lipid-elevating effect of isotretinoin. Thus far, these types of studies have been mostly limited to a priori selected candidate genes due to restricted genotyping and analytical capacities. Thanks to the large number of SNPs now available in the public domain through the SNP Consortium and the newly developed technologies (high throughput genotyping, bioinformatics software), it is now possible to interrogate more than 200,000 SNPs distributed over the entire human genome. One pharmacogenetic study using this approach has been launched by GlaxoSmithKline to identify the approximately 4% of
McClure, Matthew C.; Sonstegard, Tad S.; Wiggans, George R.; Van Eenennaam, Alison L.; Weber, Kristina L.; Penedo, Cecilia T.; Berry, Donagh P.; Flynn, John; Garcia, Jose F.; Carmo, Adriana S.; Regitano, Luciana C. A.; Albuquerque, Milla; Silva, Marcos V. G. B.; Machado, Marco A.; Coffey, Mike; Moore, Kirsty; Boscher, Marie-Yvonne; Genestout, Lucie; Mazza, Raffaele; Taylor, Jeremy F.; Schnabel, Robert D.; Simpson, Barry; Marques, Elisa; McEwan, John C.; Cromie, Andrew; Coutinho, Luiz L.; Kuehn, Larry A.; Keele, John W.; Piper, Emily K.; Cook, Jim; Williams, Robert; Van Tassell, Curtis P.
To help cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification, we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes by MS genotyping animals whose imputed MS alleles fail parentage verification, and then incorporating those animals into the reference dataset. PMID:24065982
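The parentage check mentioned above rests on a simple rule: at every marker, an offspring must share at least one allele with a true parent. The sketch below counts violations of that rule and applies the "at most one conflict" tolerance quoted in the abstract. The marker names, allele sizes, and function names are hypothetical, and the study's actual verification software is not described in the abstract.

```python
def mendelian_conflicts(offspring: dict, parent: dict) -> int:
    """Count markers where offspring and parent share no allele.

    Each genotype maps a marker name to a pair of alleles. Markers missing
    in either animal are skipped rather than counted as conflicts.
    """
    conflicts = 0
    for marker, child_alleles in offspring.items():
        parent_alleles = parent.get(marker)
        if parent_alleles is None:
            continue
        if not set(child_alleles) & set(parent_alleles):
            conflicts += 1
    return conflicts


def passes_parentage(offspring: dict, parent: dict, max_conflicts: int = 1) -> bool:
    """Accept the parent if conflicts do not exceed the tolerance."""
    return mendelian_conflicts(offspring, parent) <= max_conflicts


# Hypothetical imputed MS alleles (fragment sizes in bp) for three markers.
calf = {"MS1": (180, 184), "MS2": (92, 96), "MS3": (210, 214)}
dam = {"MS1": (180, 188), "MS2": (96, 100), "MS3": (214, 218)}
unrelated_cow = {"MS1": (176, 190), "MS2": (88, 104), "MS3": (214, 218)}

print(mendelian_conflicts(calf, dam))           # 0
print(mendelian_conflicts(calf, unrelated_cow)) # 2
print(passes_parentage(calf, unrelated_cow))    # False
```

In the workflow the abstract describes, animals failing such a check would be MS genotyped directly and then added to the reference set.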
Telfer, Emily J; Stovold, Grahame T; Li, Yongjun; Silva-Junior, Orzenil B; Grattapaglia, Dario G; Dungey, Heidi S
Pedigree reconstruction using molecular markers enables efficient management of inbreeding in open-pollinated breeding strategies, replacing expensive and time-consuming controlled pollination. This is particularly useful in preferentially outcrossed, insect-pollinated eucalypts known to suffer considerable inbreeding depression from related matings. A single nucleotide polymorphism (SNP) marker panel consisting of 106 markers was selected for pedigree reconstruction from the recently developed high-density Eucalyptus Infinium SNP chip (EuCHIP60K). The performance of this SNP panel for pedigree reconstruction in open-pollinated progenies of two Eucalyptus nitens seed orchards was compared with that of two microsatellite panels with 13 and 16 markers respectively. The SNP marker panel out-performed one of the microsatellite panels in the resolution power to reconstruct pedigrees and out-performed both panels with respect to data quality. Parentage of all but one offspring in each clonal seed orchard was correctly matched to the expected seed parent using the SNP marker panel, whereas parentage assignment to less than a third of the expected seed parents was supported using the 13-microsatellite panel. The 16-microsatellite panel supported all but one of the recorded seed parents, one better than the SNP panel, although there was still a considerable level of missing and inconsistent data. SNP marker data were considerably superior to microsatellite data in accuracy, reproducibility and robustness. Although microsatellite and SNP data provide equivalent resolution for pedigree reconstruction, microsatellite analysis requires more time and experience to deal with the uncertainties of allele calling and faces challenges for data transferability across labs and over time. While microsatellite analysis will continue to be useful for some breeding tasks due to the high information content, existing infrastructure and low operating costs, the multi-species SNP resource
Wu, Xiaoping; Lund, Mogens S; Sahana, Goutam;
for mastitis traits: 54 k markers of a medium-density SNP (single nucleotide polymorphism) chip (MD), imputed 777 k markers of a high-density SNP chip (HD), and imputed whole-genome sequencing data (SEQ). Each dataset contained data for 4496 Danish Holstein cattle. Comparisons were performed using a linear...... when tested using the same statistical model. With the LM model, 120 (MD), 967 (HD), and 7209 (SEQ) SNPs were significantly associated with mastitis, whereas with the BVS model, 43 (MD), 131 (HD), and 1052 (SEQ) significant SNPs (Bayes factor > 3.2) were observed. A total of 26 (MD), 75 (HD), and 465......, LIFR, and EDN3 may be considered as candidate genes for mastitis susceptibility....
Talukder, Zahirul I; Gong, Li; Hulke, Brent S; Pegadaraju, Venkatramana; Song, Qijian; Schultz, Quentin; Qi, Lili
A high-resolution genetic map of sunflower was constructed by integrating SNP data from three F2 mapping populations (HA 89/RHA 464, B-line/RHA 464, and CR 29/RHA 468). The consensus map spanned a total length of 1443.84 cM, and consisted of 5,019 SNP markers derived from RAD tag sequencing and 118 publicly available SSR markers distributed in 17 linkage groups, corresponding to the haploid chromosome number of sunflower. The maximum interval between markers in the consensus map is 12.37 cM and the average distance is 0.28 cM between adjacent markers. Despite a few short-distance inversions in marker order, the consensus map showed high levels of collinearity among individual maps with an average Spearman's rank correlation coefficient of 0.972 across the genome. The order of the SSR markers on the consensus map was also in agreement with the order of the individual map and with previously published sunflower maps. Three individual and one consensus maps revealed the uneven distribution of markers across the genome. Additionally, we performed fine mapping and marker validation of the rust resistance gene R12, providing closely linked SNP markers for marker-assisted selection of this gene in sunflower breeding programs. This high resolution consensus map will serve as a valuable tool to the sunflower community for studying marker-trait association of important agronomic traits, marker assisted breeding, map-based gene cloning, and comparative mapping.
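The average spacing reported above follows directly from the map length and marker count. The sketch below reproduces it two ways, since the abstract does not say which denominator was used: total markers, or the number of adjacent-marker intervals (markers minus one per linkage group). It assumes every marker occupies a distinct map position, which may not hold if co-segregating markers share a position.

```python
def average_spacing_cm(map_length_cm: float, n_markers: int, n_groups: int = 0) -> float:
    """Average map distance per marker, or per interval when n_groups is given.

    With n_groups > 0 the denominator is the number of adjacent-marker
    intervals, i.e. n_markers - n_groups.
    """
    denominator = n_markers - n_groups
    if denominator <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / denominator


# Figures quoted in the abstract.
map_length = 1443.84          # cM
n_markers = 5_019 + 118       # SNP markers plus SSR markers
n_linkage_groups = 17

per_marker = average_spacing_cm(map_length, n_markers)
per_interval = average_spacing_cm(map_length, n_markers, n_linkage_groups)
print(f"{per_marker:.3f} cM per marker")      # 0.281
print(f"{per_interval:.3f} cM per interval")  # 0.282
```

Both conventions round to the 0.28 cM reported in the abstract.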
Wen, Weie; He, Zhonghu; Gao, Fengmei; Liu, Jindong; Jin, Hui; Zhai, Shengnan; Qu, Yanying; Xia, Xianchun
A high-density consensus map is a powerful tool for gene mapping, cloning and molecular marker-assisted selection in wheat breeding. The objective of this study was to construct a high-density, single nucleotide polymorphism (SNP)-based consensus map of common wheat (Triticum aestivum L.) by integrating genetic maps from four recombinant inbred line populations. The populations were each genotyped using the wheat 90K Infinium iSelect SNP assay. A total of 29,692 SNP markers were mapped on 21 linkage groups corresponding to 21 hexaploid wheat chromosomes, covering 2,906.86 cM, with an overall marker density of 10.21 markers/cM. Comparison with previous maps based on the wheat 90K SNP chip showed that 22,736 (76.6%) of the SNPs had consistent chromosomal locations, whereas 1,974 (6.7%) showed different chromosomal locations, and 4,982 (16.8%) were newly mapped. Alignment of the present consensus map and the wheat expressed sequence tags (ESTs) Chromosome Bin Map enabled assignment of 1,221 SNP markers to specific chromosome bins, and 819 ESTs were integrated into the consensus map. The marker orders of the consensus map were validated based on physical positions on the wheat genome with Spearman rank correlation coefficients ranging from 0.69 (4D) to 0.97 (1A, 4B, 5B, and 6A), and were also confirmed by comparison with genetic positions on the previous 40K SNP consensus map with Spearman rank correlation coefficients ranging from 0.84 (6D) to 0.99 (6A). Chromosomal rearrangements reported previously were confirmed in the present consensus map and new putative rearrangements were identified. In addition, an integrated consensus map was developed through the combination of five published maps with ours, containing 52,607 molecular markers. The consensus map described here provides a high-density SNP marker map and a reliable order of SNPs, representing a step forward in mapping and validation of chromosomal locations of SNPs on the wheat 90K array. Moreover, it can be
Background Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). Results The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. Conclusion This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and
Cytogenetic analysis is essential for the diagnosis and prognosis of hematopoietic neoplasms in current clinical practice. Many hematopoietic malignancies are characterized by structural chromosomal abnormalities such as specific translocations, inversions, deletions and/or numerical abnormalities that can be identified by karyotype analysis or fluorescence in situ hybridization (FISH) studies. Single nucleotide polymorphism (SNP) arrays offer high-resolution identification of copy number variants (CNVs) and acquired copy-neutral loss of heterozygosity (LOH)/uniparental disomy (UPD) that are usually not identifiable by conventional cytogenetic analysis and FISH studies. As a result, SNP arrays have been increasingly applied to hematopoietic neoplasms to search for clinically significant genetic abnormalities. A large number of CNVs and UPDs have been identified in a variety of hematopoietic neoplasms. CNVs detected by SNP array in some hematopoietic neoplasms are of prognostic significance. A few specific genes in the affected regions have been implicated in the pathogenesis and may be the targets for specific therapeutic agents in the future. In this review, we summarize the current findings of application of SNP arrays in a variety of hematopoietic malignancies with an emphasis on the clinically significant genetic variants.
Dassonneville, R; Brøndum, Rasmus Froberg; Druet, T
The purpose of this study was to investigate the imputation error and loss of reliability of direct genomic values (DGV) or genomically enhanced breeding values (GEBV) when using genotypes imputed from a 3,000-marker single nucleotide polymorphism (SNP) panel to a 50,000-marker SNP panel. Data co...
Cuenca, José; Aleza, Pablo; Navarro, Luis; Ollitrault, Patrick
Background Polyploidy is a major component of eukaryote evolution. Estimation of allele copy numbers for molecular markers has long been considered a challenge for polyploid species, while this process is essential for most genetic research. With the increasing availability and whole-genome coverage of single nucleotide polymorphism (SNP) markers, it is essential to implement a versatile SNP genotyping method to assign allelic configuration efficiently in polyploids. Scope This work evaluates the usefulness of the KASPar method, based on competitive allele-specific PCR, for the assignment of SNP allelic configuration. Citrus was chosen as a model because of its economic importance, the ongoing worldwide polyploidy manipulation projects for cultivar and rootstock breeding, and the increasing availability of SNP markers. Conclusions Fifteen SNP markers were successfully designed that produced clear allele signals that were in agreement with previous genotyping results at the diploid level. The analysis of DNA mixes between two haploid lines (Clementine and pummelo) at 13 different ratios revealed a very high correlation (average = 0.9796; s.d. = 0.0094) between the allele ratio and two parameters [θ angle = arctan(y/x) and y′ = y/(x + y)] derived from the two normalized allele signals (x and y) provided by KASPar. Separated cluster analysis and analysis of variance (ANOVA) from mixed DNA simulating triploid and tetraploid hybrids provided 99.71% correct allelic configuration. Moreover, triploid populations arising from 2n gametes and interploid crosses were easily genotyped and provided useful genetic information. This work demonstrates that the KASPar SNP genotyping technique is an efficient way to assign heterozygous allelic configurations within polyploid populations. This method is accurate, simple and cost-effective. Moreover, it may be useful for quantitative studies, such as relative allele-specific expression analysis and bulk segregant analysis
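The two parameters defined above, the θ angle and y′, are simple functions of the two normalized allele signals. The sketch below computes both; the function name is illustrative, the signal values are made up, and `atan2` is used in place of a literal arctan(y/x) so that a zero x signal does not raise a division error (the two agree whenever x > 0).

```python
from math import atan2, degrees


def kaspar_parameters(x: float, y: float):
    """Return (theta in radians, y_prime) for normalized allele signals x and y.

    theta   = arctan(y / x)
    y_prime = y / (x + y)
    """
    if x < 0 or y < 0 or x + y == 0:
        raise ValueError("signals must be non-negative and not both zero")
    theta = atan2(y, x)
    y_prime = y / (x + y)
    return theta, y_prime


# Made-up normalized signals for three samples.
for x, y in [(1.0, 0.0), (1.0, 1.0), (0.5, 1.0)]:
    theta, y_prime = kaspar_parameters(x, y)
    print(f"x={x} y={y} theta={degrees(theta):.1f} deg y'={y_prime:.3f}")
# x=1.0 y=0.0 theta=0.0 deg y'=0.000
# x=1.0 y=1.0 theta=45.0 deg y'=0.500
# x=0.5 y=1.0 theta=63.4 deg y'=0.667
```

Both parameters increase monotonically with the share of the y allele, which is why they can be correlated with allele ratio and clustered to separate dosage classes in triploid and tetraploid samples.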
Janssen, Manoe J; Arcolino, Fanny O; Schoor, Perry; Kok, Robbert Jan; Mastrobattista, Enrico
In this review we provide an overview of the expanding molecular toolbox that is available for gene based therapies and how these therapies can be used for a large variety of kidney diseases. Gene based therapies range from restoring gene function in genetic kidney diseases to steering complex molecular pathways in chronic kidney disorders, and can provide a treatment or cure for diseases that otherwise may not be targeted. This approach involves the delivery of recombinant DNA sequences harboring therapeutic genes to improve cell function and thereby promote kidney regeneration. Depending on the therapy, the recombinant DNA will express a gene that directly plays a role in the function of the cell (gene addition), that regulates the expression of an endogenous gene (gene regulation), or that even changes the DNA sequence of endogenous genes (gene editing). Some interventions involve permanent changes in the genome whereas others are only temporary and leave no trace. Efficient and safe delivery are important steps for all gene based therapies and also depend on the mode of action of the therapeutic gene. Here we provide examples on how the different methods can be used to treat various diseases, which technologies are now emerging (such as gene repair through CRISPR/Cas9) and what the opportunities, perspectives, potential and the limitations of these therapies are for the treatment of kidney diseases.
Khrustaleva, A.M.; Limborg, Morten; Seeb, J. E.
of the two northern Kamchatka rivers (Palana River and Pakhacha River) differed significantly from the other populations studied. We estimated the efficiency for both types of markers for individual assignment of fish taken in mixtures. Accuracy was generally higher for assignment with SNP data; however...
We will present an ultra-dense genetic linkage map for the octoploid, cultivated strawberry (Fragaria x ananassa) consisting of over 13K Axiom® based SNP markers and 150 previously mapped reference SSR loci. The high quality of the map is demonstrated by the short sizes of each of the 28 linkage gro...
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface, for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for members of the plant genetics and breeding community with no computational expertise who want to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a
Braun, Rosemary; Buetow, Kenneth
Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimers, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences in cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems-level.
Lakshmi K Matukumalli
The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in humans has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with a median interval of 37 kb and a maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNPs are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle.
This is the final version. It was first published by Wiley at http://onlinelibrary.wiley.com/doi/10.1002/gepi.21853/abstract. Pathway analysis can complement point-wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease-associated genes that can be candidate causal genes. We propose a straightforward methodology that can be used for conducting a gene-based pathway analysis using summary GWAS statistics in combina...
Liu, Yu-Xuan; Hu, Qing-Qing; Ma, Hong-Du; Huang, Dai-Xin
Single nucleotide polymorphism (SNP) refers to the single base sequence variation in specific location of the human genome. Phenotype informative SNP has gradually become one of the research hot spots in forensic science. In this paper, the forensic research situation and application prospect of phenotype informative SNP in the characteristics of hair, eye and skin color, height, and facial feature are reviewed.
Chao, Shiaoman; Jellen, Eric N.; Carson, Martin L.; Rines, Howard W.; Obert, Donald E.; Lutz, Joseph D.; Shackelford, Irene; Korol, Abraham B.; Wight, Charlene P.; Gardner, Kyle M.; Hattori, Jiro; Beattie, Aaron D.; Bjørnstad, Åsmund; Bonman, J. Michael; Jannink, Jean-Luc; Sorrells, Mark E.; Brown-Guedira, Gina L.; Mitchell Fetch, Jennifer W.; Harrison, Stephen A.; Howarth, Catherine J.; Ibrahim, Amir; Kolb, Frederic L.; McMullen, Michael S.; Murphy, J. Paul; Ohm, Herbert W.; Rossnagel, Brian G.; Yan, Weikai; Miclaus, Kelci J.; Hiller, Jordan; Maughan, Peter J.; Redman Hulse, Rachel R.; Anderson, Joseph M.; Islamovic, Emir
A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources. PMID:23533580
Yoo, Yun Joo; Kim, Sun Ah; Bull, Shelley B
Gene-based analysis of multiple single nucleotide polymorphisms (SNPs) in a gene region is an alternative to single SNP analysis. The multi-bin linear combination test (MLC) proposed in previous studies utilizes the correlation among SNPs within a gene to construct a gene-based global test. SNPs are partitioned into clusters of highly correlated SNPs, and the MLC test statistic quadratically combines linear combination statistics constructed for each cluster. The test has degrees of freedom equal to the number of clusters and can be more powerful than a fully quadratic or fully linear test statistic. In this study, we develop a new SNP clustering algorithm designed to find cliques, which are complete subnetworks of SNPs with all pairwise correlations above a threshold. We evaluate the performance of the MLC test using the clique-based CLQ algorithm versus using the tag-SNP-based LDSelect algorithm. In our numerical power calculations we observed that the two clustering algorithms produce identical clusters about 40~60% of the time, yielding similar power on average. However, because the CLQ algorithm tends to produce smaller clusters with stronger positive correlation, the MLC test is less likely to be affected by the occurrence of opposing signs in the individual SNP effect coefficients.
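The clique condition described above (every pair of SNPs in a cluster correlated above a threshold) can be illustrated with a small sketch. This is not the authors' CLQ algorithm, whose details the abstract does not give; it is a simple greedy partition written for illustration, and the function names, the toy correlation matrix and the threshold of 0.5 are all assumptions.

```python
from itertools import combinations


def is_clique(members, corr, threshold):
    """True if every pair of SNPs in `members` has correlation above `threshold`."""
    return all(corr[i][j] > threshold for i, j in combinations(members, 2))


def greedy_clique_partition(corr, threshold):
    """Partition SNP indices into clusters that each satisfy the clique condition.

    Greedy illustration only: each SNP joins the first existing cluster
    in which it is correlated above `threshold` with every member.
    """
    clusters = []
    for snp in range(len(corr)):
        for cluster in clusters:
            if all(corr[snp][other] > threshold for other in cluster):
                cluster.append(snp)
                break
        else:
            clusters.append([snp])
    return clusters


# Toy pairwise correlation matrix for four SNPs (hypothetical values).
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.8, 0.3, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
clusters = greedy_clique_partition(corr, 0.5)
print(clusters)       # SNP 2 correlates with SNP 0 but not SNP 1, so it stays separate
print(len(clusters))  # number of clusters, i.e. the degrees of freedom of the MLC test
```

In this toy case SNP 2 is strongly correlated with SNP 0 but not with SNP 1, so it cannot join their cluster; a tag-SNP approach anchored on SNP 0 could have grouped all three.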
Liu, Z; Goddard, M E; Reinhardt, F; Reents, R
Compared with the currently widely used multi-step genomic models for genomic evaluation, single-step genomic models can provide more accurate genomic evaluation by jointly analyzing phenotypes and genotypes of all animals and can properly correct for the effect of genomic preselection on genetic evaluations. The objectives of this study were to introduce a single-step genomic model, allowing a direct estimation of single nucleotide polymorphism (SNP) effects, and to develop efficient computing algorithms for solving equations of the single-step SNP model. We proposed an alternative to the current single-step genomic model based on the genomic relationship matrix by including an additional step for estimating the effects of SNP markers. Our single-step SNP model allowed flexible modeling of SNP effects in terms of the number and variance of SNP markers. Moreover, our single-step SNP model included a residual polygenic effect with trait-specific variance for reducing inflation in genomic prediction. A kernel calculation of the SNP model involved repeated multiplications of the inverse of the pedigree relationship matrix of genotyped animals with a vector, for which numerical methods such as preconditioned conjugate gradients can be used. For estimating SNP effects, a special updating algorithm was proposed to separate residual polygenic effects from the SNP effects. We extended our single-step SNP model to general multiple-trait cases. By taking advantage of a block-diagonal (co)variance matrix of SNP effects, we showed how to estimate multivariate SNP effects in an efficient way. A general prediction formula was derived for candidates without phenotypes, which can be used for frequent, interim genomic evaluations without running the whole genomic evaluation process. We discussed various issues related to implementation of the single-step SNP model in Holstein populations with an across-country genomic reference population.
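The abstract notes that products of an inverse relationship matrix with a vector can be obtained with preconditioned conjugate gradients rather than by forming the inverse. A minimal sketch of that idea follows, assuming a small symmetric positive-definite matrix and a Jacobi (diagonal) preconditioner; the matrix values and function names are hypothetical and this is not the authors' implementation.

```python
def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b (i.e. compute a^-1 b) for symmetric positive-definite `a`
    using conjugate gradients with a Jacobi preconditioner."""
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                  # residual b - a x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy stand-in for a relationship matrix (hypothetical values).
a = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = pcg(a, b)
residual = [ri - bi for ri, bi in zip(matvec(a, x), b)]
print(max(abs(v) for v in residual))
```

Only matrix-vector products with `a` are needed, which is what makes the approach attractive when the matrix is large and sparse.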
This study was undertaken to clarify the molecular basis for human skin color variation and the environmental adaptability to ultraviolet irradiation, with the ultimate goal of predicting the impact of changes in future environments on human health risk. One hundred twenty-two Caucasians living in Toledo, Ohio participated. Back and cheek skin were assayed for melanin as a quantitative trait marker. Buccal cell samples were collected and used for DNA extraction. DNA was used for SNP genotyping using the Masscode™ system, which entails two-step PCR amplification and a platform chemistry which allows cleavable mass spectrometry tags. The results show that gene-gene interaction between SNP alleles at multiple loci (not necessarily on the same chromosome) contributes to inter-individual skin color variation while suggesting a high probability of linkage disequilibrium. Confirmation of these findings requires further study with other ethnic groups to analyze the associations between SNP alleles at multiple loci and human skin color variation. Our overarching goal is to use remote sensing data to clarify the interaction between atmospheric environments and SNP allelic frequency and to investigate human adaptability to ultraviolet irradiation. Such information should greatly assist in the prediction of the health effects of future environmental changes such as ozone depletion and increased ultraviolet exposure. If such health effects are to some extent predictable, it might be possible to prepare for such changes in advance and thus reduce the extent of their impact.
Børsting, Claus; Sanchez, Juan J; Hansen, Hanna E; Hansen, Anders J; Bruun, Hanne Q; Morling, Niels
The performance of a multiplex assay with 52 autosomal single nucleotide polymorphisms (SNPs) developed for human identification was tested on 124 mother-child-father trios. The typical paternity indices (PIs) were 10^5-10^6 for the trios and 10^3-10^4 for the child-father duos. Using the SNP profiles from the randomly selected trios and 700 previously typed individuals, a total of 83,096 comparisons between mother, child and an unrelated man were performed. On average, 9-10 mismatches per comparison were detected. Four mismatches were genetic inconsistencies and 5-6 mismatches were opposite homozygosities. In only two of the 83,096 comparisons did an unrelated man match perfectly to a mother-child duo, and in both cases the PI of the true father was much higher than the PI of the unrelated man. The trios were also typed for 15 short tandem repeats (STRs) and seven variable number of tandem repeats (VNTRs). The typical PIs based on 15 STRs or seven VNTRs were 5-50 times higher than the typical PIs based on 52 SNPs. Six mutations in tandem repeats were detected among the randomly selected trios. In contrast, no mutations were found in the SNP loci. The results showed that the 52 SNP-plex assay is a very useful alternative to currently used methods in relationship testing. The usefulness of SNP markers with low mutation rates in paternity and immigration casework is discussed.
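The two kinds of mismatch mentioned above can be sketched for a single biallelic SNP. The definitions used here are assumptions about how the terms are applied: an "opposite homozygosity" is taken to mean that child and tested man are homozygous for different alleles, and a "genetic inconsistency" is any other case in which no maternal allele plus allele from the man can produce the child's genotype. Genotype encoding and function names are hypothetical.

```python
def is_homozygous(genotype):
    """Genotypes are two-character strings such as "AA", "AB" or "BB"."""
    return genotype[0] == genotype[1]


def classify_trio(mother, child, man):
    """Classify one SNP of a mother-child-man comparison."""
    if is_homozygous(child) and is_homozygous(man) and child[0] != man[0]:
        return "opposite_homozygosity"
    child_alleles = sorted(child)
    for maternal in set(mother):
        for paternal in set(man):
            if sorted(maternal + paternal) == child_alleles:
                return "consistent"
    return "genetic_inconsistency"


def count_mismatches(trios):
    """Tally classifications over all SNPs of one comparison."""
    counts = {"consistent": 0, "opposite_homozygosity": 0, "genetic_inconsistency": 0}
    for mother, child, man in trios:
        counts[classify_trio(mother, child, man)] += 1
    return counts


# Hypothetical four-SNP profile: (mother, child, tested man) at each locus.
profile = [("AB", "AA", "BB"), ("AA", "AB", "AA"), ("AB", "AB", "AB"), ("BB", "BB", "AB")]
counts = count_mismatches(profile)
print(counts)
```

Opposite homozygosities can be detected without the mother's genotype, whereas the second category depends on knowing which allele the child inherited maternally.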
Background: Until recently, only a small number of low- and mid-throughput methods have been used for single nucleotide polymorphism (SNP) discovery and genotyping in grapevine (Vitis vinifera L.). However, following completion of the sequence of the highly heterozygous genome of Pinot Noir, it has been possible to identify millions of electronic SNPs (eSNPs), thus providing a valuable source for high-throughput genotyping methods. Results: Herein we report the first application of the SNPlex™ genotyping system in grapevine, aimed at the anchoring of a eukaryotic genome. This approach combines robust SNP detection with automated assay readout and data analysis. 813 candidate eSNPs were developed from non-repetitive contigs of the assembled genome of Pinot Noir and tested in 90 progeny of a Syrah × Pinot Noir cross. 563 new SNP-based markers were obtained and mapped. The efficiency rate of 69% was enhanced to 80% when multiple displacement amplification (MDA) methods were used for preparation of genomic DNA for the SNPlex assay. Conclusion: Unlike other SNP genotyping methods used to investigate thousands of SNPs in a few genotypes, or a few SNPs in around a thousand genotypes, the SNPlex genotyping system represents a good compromise to investigate several hundred SNPs in a hundred or more samples simultaneously. Therefore, the use of the SNPlex assay, coupled with whole genome amplification (WGA), is a good solution for future applications in well-equipped laboratories.
Dunston Georgia M
Background: Admixture mapping is a powerful approach for identifying genetic variants involved in human disease that exploits the unique genomic structure in recently admixed populations. To use existing published panels of ancestry-informative markers (AIMs) for admixture mapping, markers have to be genotyped de novo for each admixed study sample and for samples representing the ancestral parental populations. The increased availability of dense marker data on commercial chips has made it feasible to develop panels wherein the markers need not be predetermined. Results: We developed two panels of AIMs (~2,000 markers each) based on the Affymetrix Genome-Wide Human SNP Array 6.0 for admixture mapping with African American samples. These two AIM panels had good map power that was higher than that of a denser panel of ~20,000 random markers as well as other published panels of AIMs. As a test case, we applied the panels in an admixture mapping study of hypertension in African Americans in the Washington, D.C. metropolitan area. Conclusions: Developing marker panels for admixture mapping from existing genome-wide genotype data offers two major advantages: (1) no de novo genotyping needs to be done, thereby saving costs, and (2) markers can be filtered for various quality measures and replacement markers (to minimize gaps) can be selected at no additional cost. Panels of carefully selected AIMs have two major advantages over panels of random markers: (1) the map power from sparser panels of AIMs is higher than that of ~10-fold denser panels of random markers, and (2) clusters can be labeled based on information from the parental populations. With current technology, chip-based genome-wide genotyping is less expensive than genotyping ~20,000 random markers. The major advantage of using random markers is the absence of ascertainment effects resulting from the process of selecting markers. The ability to develop marker panels informative for ancestry from
The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized.
The analysis of next-generation sequence (NGS) data is often a fragmented step-wise process. For example, multiple pieces of software are typically needed to map NGS reads, extract variant sites, and construct a DNA sequence matrix containing only single nucleotide polymorphisms (i.e., a SNP matrix) for a set of individuals. The management and chaining of these software pieces and their outputs can often be a cumbersome and difficult task. Here, we present CFSAN SNP Pipeline, which combines into a single package the mapping of NGS reads to a reference genome with Bowtie2, processing of those mapping (BAM) files using SAMtools, identification of variant sites using VarScan, and production of a SNP matrix using custom Python scripts. We also introduce a Python package (CFSAN SNP Mutator) that, when given a reference genome, will generate variants of known position against which we validate our pipeline. We created 1,000 simulated Salmonella enterica sp. enterica Serovar Agona genomes at 100× and 20× coverage, each containing 500 SNPs, 20 single-base insertions and 20 single-base deletions. For the 100× dataset, the CFSAN SNP Pipeline recovered 98.9% of the introduced SNPs and had a false positive rate of 1.04 × 10^-6; for the 20× dataset 98.8% of SNPs were recovered and the false positive rate was 8.34 × 10^-7. Based on these results, CFSAN SNP Pipeline is a robust and accurate tool, and it is among the first to combine into a single executable the myriad steps required to produce a SNP matrix from NGS data. Such a tool is useful to those working in an applied setting (e.g., food safety traceback investigations) as well as to those interested in evolutionary questions.
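Validation against simulated genomes with variants at known positions, as described above, amounts to comparing the set of called SNP positions with the set of introduced ones. The sketch below is illustrative only: the function name and toy positions are hypothetical, and the false positive rate is assumed here to be false calls divided by the number of reference positions that carry no introduced SNP, which the abstract does not state.

```python
def evaluate_calls(true_snps, called_snps, genome_length):
    """Return (recall, false_positive_rate) for one simulated genome.

    true_snps / called_snps: iterables of 1-based reference positions.
    genome_length: number of positions in the reference.
    """
    true_snps, called_snps = set(true_snps), set(called_snps)
    true_positives = len(true_snps & called_snps)
    false_positives = len(called_snps - true_snps)
    recall = true_positives / len(true_snps)
    # Assumed denominator: positions where no SNP was introduced.
    false_positive_rate = false_positives / (genome_length - len(true_snps))
    return recall, false_positive_rate


# Toy example with made-up positions, not data from the study.
truth = {10, 250, 999, 4000}
calls = {10, 250, 4000, 7777}
recall, fpr = evaluate_calls(truth, calls, genome_length=10_004)
print(recall, fpr)
```

Averaging these two quantities over all simulated genomes would give summary figures of the kind reported in the abstract.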
Van Loo, Peter; Nilsen, Gro; Nordgard, Silje H; Vollan, Hans Kristian Moen; Børresen-Dale, Anne-Lise; Kristensen, Vessela N; Lingjærde, Ole Christian
Single nucleotide polymorphism (SNP) arrays are powerful tools to delineate genomic aberrations in cancer genomes. However, the analysis of these SNP array data of cancer samples is complicated by three phenomena: (a) aneuploidy: due to massive aberrations, the total DNA content of a cancer cell can differ significantly from its normal two copies; (b) nonaberrant cell admixture: samples from solid tumors do not exclusively contain aberrant tumor cells, but always contain some portion of nonaberrant cells; (c) intratumor heterogeneity: different cells in the tumor sample may have different aberrations. We describe here how these phenomena impact the SNP array profile, and how these can be accounted for in the analysis. In an extended practical example, we apply our recently developed and further improved ASCAT (allele-specific copy number analysis of tumors) suite of tools to analyze SNP array data using data from a series of breast carcinomas as an example. We first describe the structure of the data, how it can be plotted and interpreted, and how it can be segmented. The core ASCAT algorithm next determines the fraction of nonaberrant cells and the tumor ploidy (the average number of DNA copies), and calculates an ASCAT profile. We describe how these ASCAT profiles visualize both copy number aberrations as well as copy-number-neutral events. Finally, we touch upon regions showing intratumor heterogeneity, and how they can be detected in ASCAT profiles. All source code and data described here can be found at our ASCAT Web site ( http://www.ifi.uio.no/forskning/grupper/bioinf/Projects/ASCAT/).
Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.
Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.
Sun, Guirong; Li, Ming; Li, Hong; Tian, Yadong; Chen, Qixin; Bai, Yichun; Kang, Xiangtao
The pre-melanin-concentrating hormone (PMCH) gene is functionally important in the regulation of body fat content, feeding behavior and energy balance. In this study, the full-length cDNA of the chicken PMCH gene was amplified by the SMART RACE method. Single nucleotide polymorphisms (SNPs) in the PMCH gene were screened by comparative sequence analysis. The non-synonymous coding SNP (ncSNP) obtained was first selected for genotyping. Its effects on growth, carcass characteristics and meat quality traits were investigated by AluI CRS-PCR-RFLP in the F2 resource population of Gushi chicken crossed with Anak broiler. Our results indicated that the cDNA of chicken PMCH shared 67.25 and 66.47% homology with that of human and bovine PMCH, respectively. The deduced amino acid sequence of chicken PMCH (163 amino acids) was 52.07 and 50.89% identical to those of human and bovine PMCH, respectively. The PMCH protein sequence is predicted to have several functional domains, including pro-MCH, CSP, IL7, XPGI and some low complexity sequence. It has 8 phosphorylation sites and no signal peptide sequence. Targeting sites for the microRNAs gga-miR-18a, gga-miR-18b and gga-miR-499 were predicted in the 3' untranslated region of chicken PMCH mRNA. In addition, a total of seven SNPs, including an ncSNP and a synonymous coding SNP, were identified in the PMCH gene. The ncSNP c.81 A>T was found to be in a moderately polymorphic state (polymorphic index=0.365), and the frequencies for genotypes AA, AB and BB were 0.3648, 0.4682 and 0.1670, respectively. Significant associations between the locus and shear force of breast and leg were observed. This polymorphic site may serve as a useful target for marker-assisted selection of growth and meat quality traits in chicken.
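The reported polymorphic index can be checked from the genotype frequencies. The sketch below assumes that "polymorphic index" refers to the polymorphism information content (PIC) in its usual biallelic form, PIC = 1 - (p^2 + q^2) - 2 p^2 q^2, and that values between 0.25 and 0.5 are conventionally called moderately polymorphic; the abstract does not state the formula, so this is an interpretation, and the function names are hypothetical.

```python
def allele_frequencies(f_aa, f_ab, f_bb):
    """Allele frequencies (p for A, q for B) from genotype frequencies."""
    p = f_aa + f_ab / 2
    q = f_bb + f_ab / 2
    return p, q


def pic_biallelic(p, q):
    """Polymorphism information content for a two-allele marker."""
    return 1 - (p ** 2 + q ** 2) - 2 * p ** 2 * q ** 2


# Genotype frequencies for AA, AB and BB as reported in the abstract.
p, q = allele_frequencies(0.3648, 0.4682, 0.1670)
pic = pic_biallelic(p, q)
print(round(p, 4), round(q, 4), round(pic, 3))
```

With p = 0.5989 and q = 0.4011 the formula gives a value that rounds to 0.365, which agrees with the index quoted in the abstract.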
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10...
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Esteras, Cristina; Formisano, Gelsomina; Roig, Cristina; Díaz, Aurora; Blanca, José; Garcia-Mas, Jordi; Gómez-Guillamón, María Luisa; López-Sesé, Ana Isabel; Lázaro, Almudena; Monforte, Antonio J; Picó, Belén
Novel sequencing technologies were recently used to generate sequences from multiple melon (Cucumis melo L.) genotypes, enabling the in silico identification of large single nucleotide polymorphism (SNP) collections. In order to optimize the use of these markers, SNP validation and large-scale genotyping are necessary. In this paper, we present the first validated design for a genotyping array with 768 SNPs that are evenly distributed throughout the melon genome. This customized Illumina GoldenGate assay was used to genotype a collection of 74 accessions, representing most of the botanical groups of the species. Of the assayed loci, 91 % were successfully genotyped. The array provided a large number of polymorphic SNPs within and across accessions. This set of SNPs detected high levels of variation in accessions from this crop's center of origin as well as from several other areas of melon diversification. Allele distribution throughout the genome revealed regions that distinguished between the two main groups of cultivated accessions (inodorus and cantalupensis). Population structure analysis showed a subdivision into five subpopulations, reflecting the history of the crop. A considerably low level of LD was detected, which decayed rapidly within a few kilobases. Our results show that the GoldenGate assay can be used successfully for high-throughput SNP genotyping in melon. Since many of the genotyped accessions are currently being used as the parents of breeding populations in various programs, this set of mapped markers could be used for future mapping and breeding efforts.
Canovas, Fernando; Mota, Catarina; Ferreira-Costa, Joana; Serrao, Ester; Coyer, Jim; Olsen, Jeanine; Pearson, Gareth
We characterized 35 single nucleotide polymorphism (SNP) markers for the brown alga Fucus vesiculosus. Based on existing Fucus Expressed Sequence Tag libraries for heat and desiccation-stressed tissue, SNPs were developed and confirmed by re-sequencing cDNA from a diverse panel of individuals. SNP l
Kilaru, V; Iyer, S V; Almli, L M; Stevens, J S; Lori, A; Jovanovic, T; Ely, T D; Bradley, B; Binder, E B; Koen, N; Stein, D J; Conneely, K N; Wingo, A P; Smith, A K; Ressler, K J
Post-traumatic stress disorder (PTSD) develops in only some people following trauma exposure, but the mechanisms differentially explaining risk versus resilience remain largely unknown. PTSD is heritable but candidate gene studies and genome-wide association studies (GWAS) have identified only a modest number of genes that reliably contribute to PTSD. New gene-based methods may help identify additional genes that increase risk for PTSD development or severity. We applied gene-based testing to GWAS data from the Grady Trauma Project (GTP), a primarily African American cohort, and identified two genes (NLGN1 and ZNRD1-AS1) that associate with PTSD after multiple test correction. Although the top SNP from NLGN1 did not replicate, we observed gene-based replication of NLGN1 with PTSD in the Drakenstein Child Health Study (DCHS) cohort from Cape Town. NLGN1 has previously been associated with autism, and it encodes neuroligin 1, a protein involved in synaptogenesis, learning, and memory. Within the GTP dataset, a single nucleotide polymorphism (SNP), rs6779753, underlying the gene-based association, associated with the intermediate phenotypes of higher startle response and greater functional magnetic resonance imaging activation of the amygdala, orbitofrontal cortex, right thalamus and right fusiform gyrus in response to fearful faces. These findings support a contribution of the NLGN1 gene pathway to the neurobiological underpinnings of PTSD.
Shen, Terry H; Tarczy-Hornoch, Peter; Detwiler, Landon T; Cadag, Eithon; Carlson, Christopher S
Genome wide association studies (GWAS) are an important approach to understanding the genetic mechanisms behind human diseases. Single nucleotide polymorphisms (SNPs) are the predominant markers used in genome wide association studies, and the ability to predict which SNPs are likely to be functional is important for both a priori and a posteriori analyses of GWA studies. This article describes the design, implementation and evaluation of a family of systems for the purpose of identifying SNPs that may cause a change in phenotypic outcomes. The methods described in this article characterize the feasibility of combinations of logical and probabilistic inference with federated data integration for both point and regional SNP annotation and analysis. Evaluations of the methods demonstrate the overall strong predictive value of logical, and logical with probabilistic, inference applied to the domain of SNP annotation.
Ren, Jing; Chen, Liang; Jin, Xiaoli; Zhang, Miaomiao; You, Frank M.; Wang, Jirui; Frenkel, Vladimir; Yin, Xuegui; Nevo, Eviatar; Sun, Dongfa; Luo, Ming-Cheng; Peng, Junhua
Whole-genome scans with large numbers of genetic markers provide the opportunity to investigate local adaptation in natural populations and identify candidate genes under positive selection. In the present study, adaptive genetic differentiation associated with solar radiation was investigated using 695 polymorphic SNP markers in wild emmer wheat originating from a micro-site at Yehudiyya, Israel. The test involved two solar radiation niches: (1) sun, in-between trees; and (2) shade, under tree canopy, separated by a distance of 2–4 m. Analysis of molecular variance showed a small (0.53%) but significant portion of overall variation between the sun and shade micro-niches, indicating a non-negligible genetic differentiation between sun and shade habitats. Fifty SNP markers showed a medium (0.05 ≤ FST ≤ 0.15) or high genetic differentiation (FST > 0.15). A total of 21 outlier loci under positive selection were identified by using four different FST-outlier testing algorithms. The markers and genome locations under positive selection are consistent with the known patterns of selection. These results suggested that genetic differentiation between sun and shade habitats is substantial, radiation-associated, and therefore ecologically determined. Hence, the results of this study reflected effects of natural selection through solar radiation on EST-related SNP genetic diversity, resulting presumably in different adaptive complexes at a micro-scale divergence. The present work highlights the theoretical and applied significance of solar radiation-driven natural selection for wheat improvement. PMID:28352272
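The FST thresholds quoted above can be illustrated with a minimal sketch. It assumes a single biallelic SNP, two equally weighted subpopulations, and Wright's definition FST = (HT − HS) / HT; the abstract does not say which estimator the authors used, and the function names and allele frequencies here are hypothetical.

```python
def fst_two_pops(p1, p2):
    """Wright's F_ST at one biallelic locus for two equally weighted
    subpopulations with allele frequencies p1 and p2 (an assumption;
    not necessarily the estimator used in the study)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                        # H_T
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # H_S
    if h_total == 0:
        return 0.0  # locus is monomorphic overall
    return (h_total - h_within) / h_total


def classify_fst(fst):
    """Bin a value using the cut-offs quoted in the abstract."""
    if fst > 0.15:
        return "high"
    if fst >= 0.05:
        return "medium"
    return "low"


# Hypothetical sun/shade allele frequencies for three markers.
for sun, shade in [(0.40, 0.60), (0.35, 0.65), (0.20, 0.80)]:
    value = fst_two_pops(sun, shade)
    print(sun, shade, round(value, 3), classify_fst(value))
```

Outlier tests such as those mentioned in the abstract go further by comparing each locus against a neutral expectation; this sketch only shows the per-locus statistic being binned.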
Edea, Zewdu; Dadi, Hailu; Kim, Sang Wook; Dessie, Tadelle; Kim, Kwan-Suk
Although a large number of single nucleotide polymorphisms (SNPs) have been identified from the bovine genome-sequencing project, few of these have been validated at large in Bos indicus breeds. We have genotyped 192 animals, representing 5 cattle populations of Ethiopia, with the Illumina Bovine 8K SNP BeadChip. These include 1 Sanga (Danakil), 3 zebu (Borana, Arsi and Ambo), and 1 zebu × Sanga intermediate (Horro) breeds. The Hanwoo (Bos taurus) was included for comparison purposes. Analysis of 7,045 SNP markers revealed that the mean minor allele frequency (MAF) was 0.23, 0.22, 0.21, 0.21, 0.23, and 0.29 for Ambo, Arsi, Borana, Danakil, Horro, and Hanwoo, respectively. Significant differences of MAF were observed between the indigenous Ethiopian cattle populations and Hanwoo breed (p < 0.001). Across the Ethiopian cattle populations, a common variant MAF (≥0.10 and ≤0.5) accounted for an overall estimated 73.79% of the 7,045 SNPs. The Hanwoo displayed a higher proportion of common variant SNPs (90%). Investigation within Ethiopian cattle populations showed that on average, 16.64% of the markers were monomorphic, but in the Hanwoo breed, only 6% of the markers were monomorphic. Across the sampled Ethiopian cattle populations, the mean observed and expected heterozygosities were 0.314 and 0.313, respectively. The level of SNP variation identified in this particular study highlights that these markers can be potentially used for genetic studies in African cattle breeds.
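As a rough illustration of the per-locus quantities reported above (minor allele frequency, observed and expected heterozygosity, and the monomorphic/common classes), the following sketch computes them from genotype counts. The function names and the example counts are hypothetical; only the MAF ≥ 0.10 cut-off for "common" variants is taken from the abstract.

```python
def locus_stats(n_aa, n_ab, n_bb):
    """Return (MAF, observed heterozygosity, expected heterozygosity)
    for one biallelic SNP given counts of the three genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    maf = min(p, 1 - p)
    h_obs = n_ab / n
    h_exp = 2 * p * (1 - p)           # Hardy-Weinberg expectation
    return maf, h_obs, h_exp


def maf_class(maf, common_cutoff=0.10):
    """Classify a marker; the 0.10 cut-off follows the abstract."""
    if maf == 0:
        return "monomorphic"
    return "common" if maf >= common_cutoff else "rare"


# Made-up genotype counts for one marker in 100 animals.
maf, h_obs, h_exp = locus_stats(50, 40, 10)
print(round(maf, 2), round(h_obs, 2), round(h_exp, 2), maf_class(maf))
```

Averaging `h_obs` and `h_exp` over all markers would give population-level figures comparable in kind to the 0.314 and 0.313 reported above.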
Rabbi, Ismail Yusuf; Kulembeka, Heneriko Philbert; Masumba, Esther; Marri, Pradeep Reddy; Ferguson, Morag
Cassava (Manihot esculenta Crantz) is one of the most important food security crops in the tropics and is increasingly being adopted for agro-industrial processing. Genetic improvement of cassava can be enhanced through marker-assisted breeding. For this, appropriate genomic tools are required to dissect the genetic architecture of economically important traits. Here, a genome-wide SNP-based genetic map of cassava anchored in SSRs is presented. An outbreeder full-sib (F1) family was genotyped on two independent SNP assay platforms: an array of 1,536 SNPs on Illumina's GoldenGate platform was used to genotype a first batch of 60 F1. Of the 1,358 successfully converted SNPs, 600 that were polymorphic in at least one of the parents were subsequently converted to KBiosciences' KASPar assay platform for genotyping 70 additional F1. High-precision genotyping of 163 informative SSRs using capillary electrophoresis was also carried out. Linkage analysis resulted in a final linkage map of 1,837 centi-Morgans (cM) containing 568 markers (434 SNPs and 134 SSRs) distributed across 19 linkage groups. The average distance between adjacent markers was 3.4 cM. About 94.2% of the mapped SNPs and SSRs have also been localized on scaffolds of the version 4.1 assembly of the cassava draft genome sequence. This more saturated genetic linkage map of cassava, which combines SSR and SNP markers, should find several applications in the improvement of cassava, including aligning scaffolds of the cassava genome sequence, genetic analyses of important agro-morphological traits, studying the linkage disequilibrium landscape and comparative genomics.
Abstract Background Performing high throughput sequencing on samples pooled from different individuals is a strategy to characterize genetic variability at a small fraction of the cost required for individual sequencing. In certain circumstances some variability estimators have even lower variance than those obtained with individual sequencing. SNP calling and estimating the frequency of the minor allele from pooled samples, though, is a subtle exercise for at least three reasons. First, sequencing errors may have a much larger relevance than in individual SNP calling: while their impact in individual sequencing can be reduced by setting a restriction on a minimum number of reads per allele, this would have a strong and undesired effect in pools because it is unlikely that alleles at low frequency in the pool will be read many times. Second, the prior allele frequency for heterozygous sites in individuals is usually 0.5 (assuming one is not analyzing sequences coming from, e.g., cancer tissues), but this is not true in pools: in fact, under the standard neutral model, singletons (i.e. alleles of minimum frequency) are the most common class of variants because P(f) ∝ 1/f and they occur more often as the sample size increases. Third, an allele appearing only once in the reads from a pool does not necessarily correspond to a singleton in the set of individuals making up the pool, and vice versa, there can be more than one read (or, more likely, none) from a true singleton. Results To improve upon existing theory and software packages, we have developed a Bayesian approach for minor allele frequency (MAF) computation and SNP calling in pools (and implemented it in a program called snape): the approach takes into account sequencing errors and allows users to choose different priors. We also set up a pipeline which can simulate the coalescence process giving rise to the SNPs, the pooling procedure and the sequencing. We used it to compare the
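The reasoning in the preceding abstract (a 1/f prior on allele frequency combined with a read model that allows sequencing errors) can be sketched as a small grid posterior. This is an illustrative toy model only, not the snape implementation: the binomial read model, the symmetric error rate, the `prior_mono` mass placed on "no variant", and all names are assumptions made for the example.

```python
from math import comb


def pooled_posterior(minor_reads, total_reads, n_chrom,
                     error=0.01, prior_mono=0.9):
    """Posterior over k = number of minor-allele copies among the
    n_chrom chromosomes in a pool, for k = 0 .. n_chrom - 1.

    Assumed prior: mass prior_mono on k = 0, remainder spread as 1/k.
    Assumed likelihood: each read shows the minor allele with
    probability q = f(1 - error) + (1 - f) * error, where f = k / n_chrom.
    """
    weights = [1.0 / k for k in range(1, n_chrom)]
    norm = sum(weights)
    prior = [prior_mono] + [(1 - prior_mono) * w / norm for w in weights]

    unnormalised = []
    for k, pr in enumerate(prior):
        f = k / n_chrom
        q = f * (1 - error) + (1 - f) * error
        like = (comb(total_reads, minor_reads)
                * q ** minor_reads
                * (1 - q) ** (total_reads - minor_reads))
        unnormalised.append(pr * like)

    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Made-up example: 3 minor-allele reads out of 100 from a pool of 20 chromosomes.
post = pooled_posterior(3, 100, 20)
best_k = max(range(len(post)), key=post.__getitem__)
print("most probable copy number:", best_k, "P(no variant):", round(post[0], 4))
```

The point of the sketch is that a few minor-allele reads are weighed against both the error rate and the prior, instead of being accepted or rejected by a fixed read-count threshold.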
Abstract Background With improvements in genotyping technologies, genome-wide association studies with hundreds of thousands of SNPs allow the identification of candidate genetic loci for multifactorial diseases in different populations. However, genotyping errors caused by genotyping platforms or genotype calling algorithms may lead to inflation of false associations between markers and phenotypes. In addition, the number of SNPs available for genome-wide association studies in the Japanese population has been investigated using only 45 samples in the HapMap project, which could lead to an inaccurate estimation of the number of SNPs with low minor allele frequencies. We genotyped 400 Japanese samples in order to estimate the number of SNPs available for genome-wide association studies in the Japanese population and to examine the performance of the current SNP Array 6.0 platform and the genotype calling algorithm "Birdseed". Results About 20% of the 909,622 SNP markers on the array were revealed to be monomorphic in the Japanese population. Consequently, 661,599 SNPs were available for genome-wide association studies in the Japanese population, after excluding the poorly behaving SNPs. The Birdseed algorithm accurately determined the genotype calls of each sample with a high overall call rate of over 99.5% and a high concordance rate of over 99.8% using more than 48 samples after removing low-quality samples by adjusting QC criteria. Conclusion Our results confirmed that the SNP Array 6.0 platform reached the level reported by the manufacturer, and thus genome-wide association studies using the SNP Array 6.0 platform have considerable potential to identify candidate susceptibility or resistance genetic factors for multifactorial diseases in the Japanese population, as well as in other populations.
Hand Melanie L
Abstract Background Single nucleotide polymorphisms (SNPs) provide essential tools for the advancement of research in plant genomics, and the development of SNP resources for many species has been accelerated by the capabilities of second-generation sequencing technologies. The current study aimed to develop and use a novel bioinformatic pipeline to generate a comprehensive collection of SNP markers within the agriculturally important pasture grass tall fescue, an outbreeding allopolyploid species displaying three distinct morphotypes: Continental, Mediterranean and rhizomatous. Results A bioinformatic pipeline was developed that successfully identified SNPs within genotypes from distinct tall fescue morphotypes, following the sequencing of 414 polymerase chain reaction (PCR)-generated amplicons using 454 GS FLX technology. Equivalent amplicon sets were derived from representative genotypes of each morphotype, including six Continental, five Mediterranean and one rhizomatous. A total of 8,584 and 2,292 SNPs were identified with high confidence within the Continental and Mediterranean morphotypes respectively. The success of the bioinformatic approach was demonstrated through validation (at a rate of 70%) of a subset of 141 SNPs using both SNaPshot™ and GoldenGate™ assay chemistries. Furthermore, the quantitative genotyping capability of the GoldenGate™ assay revealed that approximately 30% of the putative SNPs were accessible to co-dominant scoring, despite the hexaploid genome structure. The sub-genome-specific origin of each SNP validated from Continental tall fescue was predicted using a phylogenetic approach based on comparison with orthologous sequences from predicted progenitor species. Conclusions Using the appropriate bioinformatic approach, amplicon resequencing based on 454 GS FLX technology is an effective method for the identification of polymorphic SNPs within the genomes of Continental and Mediterranean tall fescue. The
Li, S.; Ma, L.; Li, H.
research. Using a user-friendly web interface, genes can be searched by name, description, position, SNP ID or clone name. Several public databases are integrated, including gene information from Ensembl, protein features from Uniprot/SWISS-PROT, Pfam and DAS-CBS. Gene relationships are fetched from BIND......, MINT, KEGG and are integrated with ortholog data from TreeFam to extend the current interaction networks. Integrated tools for primer-design and mis-splicing analysis have been developed to facilitate experimental analysis of individual genes with focus on their variation. Snap is available at http...
Su, Hai-Xiang; Zhou, Hai-Hong; Wang, Ming-Yu; Cheng, Jin; Zhang, Shi-Chao; Hui, Feng; Chen, Xue-Zhong; Liu, Shan-Hui; Liu, Qin-Jiang; Zhu, Zi-Jiang; Hu, Qing-Rong; Wu, Yi; Ji, Shang-Rong
C-reactive protein (CRP) is an established marker of inflammation with pattern-recognition receptor-like activities. Despite the close association of the serum level of CRP with the risk and prognosis of several types of cancer, it remains elusive whether CRP contributes directly to tumorigenesis or just represents a bystander marker. We have recently identified recurrent mutations at the SNP position -286 (rs3091244) in the promoter of the CRP gene in several tumor types, instead suggesting that locally produced CRP is a potential driver of tumorigenesis. However, it is unknown whether the -286 site is the sole SNP position of the CRP gene targeted for mutation and whether there is any association between CRP SNP mutations and other frequently mutated genes in tumors. Herein, we have examined the genotypes of three common CRP non-coding SNPs (rs7553007, rs1205, rs3093077) in tumor/normal sample pairs of 5 cancer types (n = 141). No recurrent somatic mutations are found at these SNP positions, indicating that the -286 SNP mutations are preferentially selected during the development of cancer. Further analysis reveals that the -286 SNP mutations of CRP tend to co-occur with mutated APC, particularly in rectal cancer (p = 0.04; n = 67). By contrast, mutations of CRP and p53 or K-ras appear to be unrelated. These results thus underscore the functional importance of the -286 mutation of CRP in tumorigenesis and imply an interaction between CRP and the Wnt signaling pathway.
Sherry, S T; Ward, M H; Kholodov, M; Baker, J; Phan, L; Smigielski, E M; Sirotkin, K
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K. Sirotkin (1999) Genome Res., 9, 677-679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.
Wang, Jingbo; Ronaghi, Mostafa; Chong, Samuel S; Lee, Caroline G L
Currently, >14,000,000 single nucleotide polymorphisms (SNPs) are reported. Identifying phenotype-affecting SNPs among these many SNPs poses significant challenges. Although several Web resources are available that can inform about the functionality of SNPs, these resources are mainly annotation databases and are not very comprehensive. In this article, we present a comprehensive, well-annotated, integrated pfSNP (potentially functional SNPs) Web resource (http://pfs.nus.edu.sg/), which aims to facilitate better hypothesis generation through knowledge syntheses mediated by better data integration and a user-friendly Web interface. pfSNP integrates >40 different algorithms/resources to interrogate >14,000,000 SNPs from the dbSNP database for SNPs of potential functional significance based on previous published reports, inferred potential functionality from genetic approaches as well as predicted potential functionality from sequence motifs. Its query interface has the user-friendly "auto-complete, prompt-as-you-type" feature and is highly customizable, facilitating different combinations of queries using Boolean logic. Additionally, to facilitate better understanding of the results and aid in hypothesis generation, gene/pathway-level information with text clouds highlighting enriched tissues/pathways as well as detailed related information are also provided on the results page. Hence, the pfSNP resource will be of great interest to scientists focusing on association studies as well as those interested in experimentally addressing the functionality of SNPs.
This report discusses marker development for radioactive waste disposal sites. The markers must be designed to last 10,000 years and place no undue burdens on future generations. Barriers cannot be constructed that preclude human intrusion. Design specifications for surface markers are discussed, and marker pictograms are also covered.
Amish, Stephen J.; Hohenlohe, Paul A.; Painter, Sally; Leary, Robb F.; Muhlfeld, Clint C.; Allendorf, Fred W.; Luikart, Gordon
Hybridization with introduced rainbow trout threatens most native westslope cutthroat trout populations. Understanding the genetic effects of hybridization and introgression requires a large set of high-throughput, diagnostic genetic markers to inform conservation and management. Recently, we identified several thousand candidate single-nucleotide polymorphism (SNP) markers based on RAD sequencing of 11 westslope cutthroat trout and 13 rainbow trout individuals. Here, we used flanking sequence for 56 of these candidate SNP markers to design high-throughput genotyping assays. We validated the assays on a total of 92 individuals from 22 populations and seven hatchery strains. Forty-six assays (82%) amplified consistently and allowed easy identification of westslope cutthroat and rainbow trout alleles as well as heterozygote controls. The 46 SNPs will provide high power for early detection of population admixture and improved identification of hybrid and nonhybridized individuals. This technique shows promise as a very low-cost, reliable and relatively rapid method for developing and testing SNP markers for nonmodel organisms with limited genomic resources.
Nicholas A. Tinker
Recognizing a need in cultivated hexaploid oat (Avena sativa L.) for a reliable set of reference single nucleotide polymorphisms (SNPs), we have developed a 6000 (6K) BeadChip design containing 257 Infinium I and 5486 Infinium II designs corresponding to 5743 SNPs. Of those, 4975 SNPs yielded successful assays after array manufacturing. These SNPs were discovered based on a variety of bioinformatics pipelines in complementary DNA (cDNA) and genomic DNA originating from 20 or more diverse oat cultivars. The array was validated in 1100 samples from six recombinant inbred line (RIL) mapping populations and sets of diverse oat cultivars and breeding lines, and provided approximately 3500 discernible Mendelian polymorphisms. Here, we present an annotation of these SNPs, including methods of discovery, gene identification and orthology, population-genetic characteristics, and tentative positions on an oat consensus map. We also evaluate a new cluster-based method of calling SNPs. The SNP design sequences are made publicly available, and the full SNP genotyping platform is available for commercial purchase from an independent third party.
Abstract Background Mitochondrial single nucleotide polymorphisms (mtSNPs) constitute important data when trying to shed some light on human diseases and cancers. Unfortunately, providing relevant mtSNP genotyping information in mtDNA databases in a neatly organized and transparent visual manner still remains a challenge. Amongst the many methods reported for SNP genotyping, determining the restriction fragment length polymorphisms (RFLPs) is still one of the most convenient and cost-saving methods. In this study, we prepared a visualization of the mtDNA genome that integrates the RFLP genotyping information with mitochondria-related cancers and diseases in a user-friendly, intuitive and interactive manner. The inherent problem associated with mtDNA sequences in BLAST of the NCBI database was also solved. Description V-MitoSNP provides complete mtSNP information for four different kinds of inputs: (1) color-coded visual input by selecting genes of interest on the genome graph, (2) keyword search by locus, disease and mtSNP rs# ID, (3) visualized input of nucleotide range by clicking the selected region of the mtDNA sequence, and (4) sequence input via mtBLAST. The V-MitoSNP output provides 500 bp (base pairs) of flanking sequence for each SNP coupled with the RFLP enzyme and the corresponding natural or mismatched primer sets. The output format enables users to see the SNP genotype pattern of the RFLP by virtual electrophoresis of each mtSNP. The rate of successful design of enzymes and primers for RFLPs in all mtSNPs was 99.1%. The RFLP information was validated by actual agarose electrophoresis and showed successful results for all mtSNPs tested. The mtBLAST function in V-MitoSNP provides the gene information within the input sequence rather than providing the complete mitochondrial chromosome as in the NCBI BLAST database. All mtSNPs with rs number entries in NCBI are integrated in the corresponding SNP in V-MitoSNP. Conclusion V-MitoSNP is a web
Abstract Background Several million single nucleotide polymorphisms (SNPs) have already been collected and deposited in public databases and these are important resources not only for use as markers to identify disease-associated genes, but also to understand the mechanisms that underlie genome diversification. Results A spectrum analysis of SNP density distribution in the genomic regions around transcription start sites (TSSs) revealed a remarkable periodicity of 146 nucleotides. This periodicity was observed in the regions that were associated with CpG islands (CGIs), but not in the regions without CpG islands (nonCGIs). An analysis of the sequence divergence of the same genomic regions between humans and chimpanzees also revealed a similar periodical pattern in CGI. The occurrences of any mono- or di-nucleotide sequences in these regions did not reveal such a periodicity, thus indicating that an interpretation of this periodicity solely based on the sequence-dependent susceptibility to mutation is highly unlikely. Conclusion The periodical patterns of nucleotide variability suggest the location of nucleosomes that are phased at TSS, and can be viewed as the genetic footprint of the chromatin state that has been maintained throughout mammalian evolutionary history. The results suggest the possible involvement of the nucleosome structure in the promoter function, and also a fundamental functional/structural difference between the two promoter classes, i.e., those with and without CGIs.
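To make the idea of a spectrum analysis concrete, here is a minimal sketch that scans candidate periods in a one-dimensional density profile using a direct Fourier sum. The abstract does not describe the authors' actual procedure; the synthetic profile, the candidate range of 100 to 200 nucleotides and the function names are assumptions chosen so that the 146-nucleotide signal is recoverable.

```python
import cmath
import math


def power_at_period(signal, period):
    """Squared magnitude of the Fourier component with the given period
    (in positions), after removing the mean of the signal."""
    mean = sum(signal) / len(signal)
    acc = sum((x - mean) * cmath.exp(-2j * math.pi * i / period)
              for i, x in enumerate(signal))
    return abs(acc) ** 2


def dominant_period(signal, candidates):
    """Return the candidate period with the largest spectral power."""
    return max(candidates, key=lambda p: power_at_period(signal, p))


# Synthetic stand-in for a SNP-density profile around TSSs (made-up data):
# a flat background plus a weak oscillation repeating every 146 positions.
profile = [1.0 + 0.3 * math.cos(2 * math.pi * i / 146) for i in range(2920)]
best = dominant_period(profile, range(100, 201))
print("strongest period:", best)
```

With real data the profile would be noisy, so the peak would need to be compared against a background such as shuffled positions before being called a periodicity.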
Molinari Laura M
Abstract Background CHARGE syndrome is a complex of birth defects including coloboma, choanal atresia, ear malformations and deafness, cardiac defects, and growth delay. We have previously hypothesized that CHARGE syndrome could be caused by unidentified genomic microdeletion, but no such deletion was detected using short tandem repeat (STR) markers spaced an average of 5 cM apart. Recently, microdeletion at the 8q12 locus was reported in two patients with CHARGE, although point mutation in CHD7 on chromosome 8 was the underlying etiology in most of the affected patients. Methods We have extended our previous study by employing a much higher density of SNP markers (3258) with an average spacing of approximately 800 kb. These SNP markers are diallelic and, therefore, have much different properties for detection of deletions than STRs. Results A global error rate estimate was produced based on Mendelian inconsistency. One marker, rs431722, exceeded the expected frequency of inconsistencies, but no deletion could be demonstrated after retesting the 4 inconsistent pedigrees with local flanking markers or by FISH with the corresponding BAC clone. Expected deletion detection (EDD) was used to assess the coverage of specific intervals over the genome by deriving the probability of detecting a common loss of heterozygosity event over each genomic interval. This analysis estimated the fraction of unobserved deletions, taking into account the allele frequencies at the SNPs, the known marker spacing and sample size. Conclusions The results of our genotyping indicate that more than 35% of the genome is included in regions with very low probability of a deletion of at least 2 Mb.
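The Mendelian-inconsistency check that underlies the error-rate estimate above can be sketched for a single diallelic SNP in parent-child trios. This is an illustrative sketch, not the authors' pipeline: genotypes are represented as two-letter strings, and the function names and example trios are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of
    each parent. Genotypes are two-character strings such as 'AG'."""
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def inconsistency_rate(trios):
    """Fraction of (father, mother, child) trios that violate
    Mendelian inheritance at this marker."""
    bad = sum(not is_mendelian_consistent(f, m, c) for f, m, c in trios)
    return bad / len(trios)


# Made-up trios at one marker. In the second trio the child appears to
# carry no maternal allele, the pattern a hemizygous deletion (or a
# genotyping error) would produce.
trios = [("AG", "AG", "GG"), ("AA", "GG", "AA"),
         ("AG", "GG", "AG"), ("AA", "AG", "AA")]
print(inconsistency_rate(trios))
```

A marker whose inconsistency rate exceeds the global estimate is then a candidate either for a deletion or for a poorly performing assay, which is why follow-up with flanking markers or FISH is needed to distinguish the two.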
Ali, Shahin S; Shao, Jonathan; Strem, Mary D; Phillips-Mora, Wilberth; Zhang, Dapeng; Meinhardt, Lyndel W; Bailey, Bryan A
Moniliophthora roreri is the fungal pathogen that causes frosty pod rot (FPR) disease of Theobroma cacao L., the source of chocolate. FPR occurs in most of the cacao producing countries in the Western Hemisphere, causing yield losses up to 80%. Genetic diversity within the FPR pathogen population may allow the population to adapt to changing environmental conditions and adapt to enhanced resistance in the host plant. The present study developed single nucleotide polymorphism (SNP) markers from RNASeq results for 13 M. roreri isolates and validated the markers for their ability to reveal genetic diversity in an international M. roreri collection. The SNP resources reported herein represent the first study of RNA sequencing (RNASeq)-derived SNP validation in M. roreri and demonstrates the utility of RNASeq as an approach for de novo SNP identification in M. roreri. A total of 88 polymorphic SNPs were used to evaluate the genetic diversity of 172 M. roreri cacao isolates resulting in 37 distinct genotypes (including 14 synonymous groups). Absence of heterozygosity for the 88 SNP markers indicates reproduction in M. roreri is clonal and likely due to a homothallic life style. The upper Magdalena Valley of Colombia showed the highest levels of genetic diversity with 20 distinct genotypes of which 13 were limited to this region, and indicates this region as the possible center of origin for M. roreri.
HE Zhi-zhou; XIE Fang-ming; CHEN Li-yun; Madonna Angelita DELA PAZ
Investigation of genetic diversity and relationships among breeding lines is of great importance to facilitate parent selection in hybrid rice breeding programs. In this study, we characterized 168 hybrid rice parents from the International Rice Research Institute with 207 simple sequence repeat (SSR) and 353 single nucleotide polymorphism (SNP) markers. A total of 1,267 SSR and 706 SNP alleles were detected, with averages of 6.1 (SSR) and 2.0 (SNP) alleles per locus respectively across all lines. Based on the genetic distances estimated from the SSR and SNP markers separately and combined, the unrooted neighbor-joining cluster and STRUCTURE analyses consistently separated the 168 hybrid rice parents into two major groups, B-line and R-line, which is consistent with known parent pedigree information. The genetic distance matrices derived from the SSR and SNP genotyping were highly correlated (r = 0.81, P < 0.001), indicating that both the SSR and SNP markers have sufficient power to detect polymorphism and are appropriate for genetic diversity analysis among tropical hybrid rice parents. A subset of 60 SSR markers with 368 alleles was also chosen by Core Hunter, and the cluster analyses based on the total and subset of SSR markers corresponded closely (r = 0.91, P < 0.001), suggesting that fewer SSR markers can be used to classify and evaluate genetic diversity among parental lines.
Abstract Background Single nucleotide polymorphism (SNP) genotyping provides the means to develop a practical, rapid, inexpensive assay that will uniquely identify any Plasmodium falciparum parasite using a small amount of DNA. Such an assay could be used to distinguish recrudescence from re-infection in drug trials, to monitor the frequency and distribution of specific parasites in a patient population undergoing drug treatment or vaccine challenge, or for tracking samples and determining purity of isolates in the laboratory during culture adaptation and sub-cloning, as well as routine passage. Methods A panel of twenty-four SNP markers has been identified that exhibit a high minor allele frequency (average MAF > 35%), for which robust TaqMan genotyping assays were constructed. All SNPs were identified through whole genome sequencing and MAF was estimated through Affymetrix array-based genotyping of a worldwide collection of parasites. These assays create a "molecular barcode" to uniquely identify a parasite genome. Results Using 24 such markers no two parasites known to be of independent origin have yet been found to have the same allele signature. The TaqMan genotyping assays can be performed on a variety of samples including cultured parasites, frozen whole blood, or whole blood spotted onto filter paper with a success rate > 99%. Less than 5 ng of parasite DNA is needed to complete a panel of 24 markers. The ability of this SNP panel to detect and identify parasites was compared to the standard molecular methods, MSP-1 and MSP-2 typing. Conclusion This work provides a facile field-deployable genotyping tool that can be used without special skills with standard lab equipment, and at reasonable cost that will unambiguously identify and track P. falciparum parasites both from patient samples and in the laboratory.
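A back-of-the-envelope sketch shows why 24 high-MAF markers can act as a barcode. It assumes haploid parasite genomes, independent (unlinked) markers and a uniform MAF of 0.35, none of which is stated as a model in the abstract; the function names, the 'X' code for a missing call and the example barcodes are hypothetical.

```python
def match_probability(maf, n_markers):
    """Chance that two unrelated haploid genomes carry the same allele
    at every marker, assuming independent loci with equal MAF."""
    per_locus = maf ** 2 + (1 - maf) ** 2
    return per_locus ** n_markers


def barcode_distance(a, b, missing="X"):
    """Number of positions at which two barcodes differ, skipping
    positions where either call is missing."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b)
               if x != missing and y != missing)


print(f"{match_probability(0.35, 24):.2e}")

# Made-up 24-position barcodes: a distance of 0 would suggest the same
# parasite (recrudescence); a larger distance suggests a new infection.
day0 = "ACGTTGCAACGTTGCAACGTTGCA"
day28 = "ACGTTGCAACGATGCAACGTXGCA"
print(barcode_distance(day0, day28))
```

Under these assumptions the chance of an accidental full match is well below one in a million, which is in line with the observation that no two independent parasites shared a signature.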
Matsuda, Ryusuke; Iehisa, Julio C M; Takumi, Shigeo
Available information on genetically assigned molecular markers is not sufficient for efficient construction of a high-density linkage map in wheat. Here, we report on application of high resolution melting (HRM) analysis using a real-time PCR apparatus to develop single nucleotide polymorphism (SNP) markers linked to a hybrid necrosis gene, Net2, located on wheat chromosome 2D. Based on genomic information on barley chromosome 2H and wheat expressed sequence tag libraries, we selected wheat cDNA sequences presumed to be located near the Net2 chromosomal region, and then found SNPs between the parental Ae. tauschii accessions of the synthetic wheat mapping population. HRM analysis of the PCR products from F2 individuals' DNA enabled us to assign 44.4% of the SNP-representing cDNAs to chromosome 2D despite the presence of the A and B genomes. In addition, the designed SNP markers were assigned to chromosome 2D of Ae. tauschii. The order of the assigned SNP markers in synthetic hexaploid wheat was confirmed by comparison with the markers in barley and Ae. tauschii. Thus, the SNP-genotyping method based on HRM analysis is a useful tool for development of molecular markers at target loci in wheat.
Charlier, Carole; Coppieters, Wouter; Rollin, Frédéric;
The widespread use of elite sires by means of artificial insemination in livestock breeding leads to the frequent emergence of recessive genetic defects, which cause significant economic and animal welfare concerns. Here we show that the availability of genome-wide, high-density SNP panels, combi...... (CMD) types 1 and 2 in Belgian Blue cattle and ichthyosis fetalis in Italian Chianina cattle. Identification of these causative mutations has an immediate translation into breeding practice, allowing marker-assisted selection against the defects through avoidance of at-risk matings....
Chekanov, N.; Boulygina, E.; Beletskiy, A.; Prokhortchouk, E.; Skryabin, K.
A somatic cell genome was recently resequenced for a patient with renal cancer. The data were submitted to the NCBI Sequence Read Archive under the accession number SRA012240. Here, we have performed SNP calling for the genome and compared it with several published genomes. We have found 2,921,724 SNPs, including 1,472,679 newly described ones. Among them, 63,462 SNPs have been mapped to the Y chromosome and, based on 18 markers, the genome has been ascribed to the R1a1a haplogroup predo...
More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In this chapter, hierarchical clustering algorithms are proposed for efficient tag SNP selection.
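The idea of picking tag SNPs by hierarchical clustering can be sketched as follows. This is not the chapter's algorithm, whose details the abstract does not give; it is a minimal illustration that assumes genotypes coded 0/1/2, squared genotype correlation (r²) as the similarity measure, complete linkage, and a hypothetical merging threshold of r² ≥ 0.8. SNP identifiers and genotypes are toy values.

```python
from itertools import combinations

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)

def select_tag_snps(genotypes, min_r2=0.8):
    """Agglomerative complete-linkage clustering on r2, then one tag per cluster."""
    ids = list(genotypes)
    r2 = {frozenset(p): r_squared(genotypes[p[0]], genotypes[p[1]])
          for p in combinations(ids, 2)}

    def linkage(c1, c2):
        # Complete linkage: similarity of the least-correlated pair.
        return min(r2[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [[s] for s in ids]
    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) < min_r2:
            break  # no remaining pair of clusters is in strong enough LD
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    # Tag = member with the highest summed r2 to the rest of its cluster.
    tags = [max(c, key=lambda s: sum(r2[frozenset((s, t))] for t in c if t != s))
            for c in clusters]
    return clusters, tags

# Toy genotypes for six individuals at four hypothetical SNPs.
toy = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],   # identical to rs1
    "rs3": [2, 1, 0, 2, 1, 0],   # mirror image of rs1, so r2 = 1
    "rs4": [1, 0, 1, 1, 2, 1],   # uncorrelated with the others
}
clusters, tags = select_tag_snps(toy)
```

With these toy data, rs1 to rs3 collapse into one cluster and rs4 stays alone, so two tag SNPs stand in for four genotyped SNPs.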
Ferchaud, Anne-Laure; Pedersen, Susanne H.; Bekkevold, Dorte
have developed a cost-efficient low-density SNP array that allows for rapid screening of polymorphisms in threespine stickleback. The array provides a valuable tool for analyzing adaptive divergence between freshwater and marine stickleback populations beyond the well-established candidate gene...... stickleback populations at the phenotypic trait of lateral plate morphology and the underlying candidate gene Ectodysplasin (EDA). Many studies have focused on this trait and candidate gene, although other genes involved in marine-freshwater adaptation may be equally important. In order to develop a resource...... for rapid and cost-efficient analysis of genetic divergence between freshwater and marine sticklebacks, we generated a low-density SNP (single nucleotide polymorphism) array encompassing markers of chromosome regions under putative directional selection, along with neutral markers for background. Results...
With ever-increasing numbers of microbial genomes being sequenced, efficient tools are needed to perform strain-level identification of any newly sequenced genome. Here, we present the SNP identification for strain typing (SNIT) pipeline, a fast and accurate software system that compares a newly sequenced bacterial genome with other genomes of the same species to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels). Based on this information, the pipeline analyzes the polymorphic loci present in all input genomes to identify the genome that has the fewest differences with the newly sequenced genome. Similarly, for each of the other genomes, SNIT identifies the input genome with the fewest differences. Results from five bacterial species show that the SNIT pipeline identifies the correct closest neighbor with 75% to 100% accuracy. The SNIT pipeline is available for download at http://www.bhsai.org/snit.html
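The "fewest differences" criterion can be pictured with a small sketch. It is not the SNIT implementation; it assumes the polymorphic loci have already been called and aligned, so that each genome is reduced to a string with one allele per locus. Strain names and alleles are hypothetical.

```python
def count_differences(a, b):
    """Number of polymorphic loci at which two genomes carry different alleles."""
    if len(a) != len(b):
        raise ValueError("allele strings must cover the same loci")
    return sum(1 for x, y in zip(a, b) if x != y)

def closest_neighbor(query, references):
    """Name of the reference genome with the fewest differences from the query."""
    return min(references, key=lambda name: count_differences(query, references[name]))

# Alleles at six shared polymorphic loci (toy data).
references = {
    "strainA": "ACGTAC",
    "strainB": "ACGTTT",
    "strainC": "TTGTAC",
}
new_genome = "ACGAAC"
best = closest_neighbor(new_genome, references)
```

Here strainA differs from the new genome at one locus and the other two strains at three loci each, so strainA is reported as the closest neighbor. Ties are resolved by dictionary order in this sketch, which a real tool would need to handle explicitly.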
Bachlava, Eleni; Taylor, Christopher A; Tang, Shunxue; Bowers, John E; Mandel, Jennifer R; Burke, John M; Knapp, Steven J
Recent advances in next-generation DNA sequencing technologies have made possible the development of high-throughput SNP genotyping platforms that allow for the simultaneous interrogation of thousands of single-nucleotide polymorphisms (SNPs). Such resources have the potential to facilitate the rapid development of high-density genetic maps, and to enable genome-wide association studies as well as molecular breeding approaches in a variety of taxa. Herein, we describe the development of a SNP genotyping resource for use in sunflower (Helianthus annuus L.). This work involved the development of a reference transcriptome assembly for sunflower, the discovery of thousands of high quality SNPs based on the generation and analysis of ca. 6 Gb of transcriptome re-sequencing data derived from multiple genotypes, the selection of 10,640 SNPs for inclusion in the genotyping array, and the use of the resulting array to screen a diverse panel of sunflower accessions as well as related wild species. The results of this work revealed a high frequency of polymorphic SNPs and relatively high level of cross-species transferability. Indeed, greater than 95% of successful SNP assays revealed polymorphism, and more than 90% of these assays could be successfully transferred to related wild species. Analysis of the polymorphism data revealed patterns of genetic differentiation that were largely congruent with the evolutionary history of sunflower, though the large number of markers allowed for finer resolution than has previously been possible.
Ochiai, Eriko; Minaguchi, Kiyoshi; Nambiar, Phrabhakaran; Kakimoto, Yu; Satoh, Fumiko; Nakatome, Masato; Miyashita, Keiko; Osawa, Motoki
The Y chromosomal haplogroup determined from single nucleotide polymorphism (SNP) combinations is a valuable genetic marker to study ancestral male lineage and ethnic distribution. Next-generation sequencing has been developed for widely diverse genetics fields. In this study, we demonstrate typing of 34 Y-SNPs using the Ion PGM™ system to perform haplogrouping. DNA libraries were constructed using the HID-Ion AmpliSeq™ Identity Panel. Emulsion PCR was performed, then DNA sequences were analyzed on the Ion 314 and 316 Chip Kit v2. Some difficulties became apparent during the analytic processes. No-calls were reported at rs2032599 and M479 in six samples, with the lowest coverage observed at M479. A minor misreading occurred at rs2032631 and M479. A real-time PCR experiment using other pairs of oligonucleotide primers showed that these events might result from the flanking sequence. Finally, the Y haplogroup was determined completely for 81 unrelated males including Japanese (n=59) and Malay (n=22) subjects. The allelic divergence differed between the two populations. In comparison with the conventional Sanger method, next-generation sequencing provides a comprehensive SNP analysis with convenient procedures, but further system improvement is necessary. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Watson, Christopher M; Crinnion, Laura A; Gurgel-Gianetti, Juliana; Harrison, Sally M; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F; Pena, Sergio D J; Bonthron, David T; Carr, Ian M
Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease-causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome-wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution.
Ollitrault, Patrick; Terol, Javier; Garcia-Lor, Andres; Bérard, Aurélie; Chauveau, Aurélie; Froelicher, Yann; Belzile, Caroline; Morillon, Raphaël; Navarro, Luis; Brunel, Dominique; Talon, Manuel
With the increasing availability of EST databases and whole genome sequences, SNPs have become the most abundant and powerful polymorphic markers. However, SNP chip data generally suffers from ascertainment biases caused by the SNP discovery and selection process in which a small number of individuals are used as discovery panels. The ongoing International Citrus Genome Consortium sequencing project of the highly heterozygous Clementine and sweet orange genomes will soon result in the release of several hundred thousand SNPs. The primary goals of this study were: (i) to estimate the transferability within the genus Citrus of SNPs discovered from Clementine BACend sequencing (BES), (ii) to estimate bias associated with the very narrow discovery panel, and (iii) to evaluate the usefulness of the Clementine-derived SNP markers for diversity analysis and comparative mapping studies between the different cultivated Citrus species. Fifty-four accessions covering the main Citrus species and 52 interspecific hybrids between pummelo and Clementine were genotyped on a GoldenGate array platform using 1,457 SNPs mined from Clementine BES and 37 SNPs identified between and within C. maxima, C. medica, C. reticulata and C. micrantha. Consistent results were obtained from 622 SNP loci. Of these markers, 116 displayed incomplete transferability primarily in C. medica, C. maxima and wild Citrus species. The two primary biases associated with the SNP mining in Clementine were an overestimation of the C. reticulata diversity and an underestimation of the interspecific differentiation. However, the genetic stratification of the gene pool was high, with very frequent significant linkage disequilibrium. Furthermore, the shared intraspecific polymorphism and accession heterozygosity were generally enough to perform interspecific comparative genetic mapping. A set of 622 SNP markers providing consistent results was selected. Of the markers mined from Clementine, 80.5% were successfully
Gorlov, Ivan P.; Moore, Jason H.; Peng, Bo; Jin, Jennifer L.; Gorlova, Olga Y.; Amos, Christopher I.
Successful independent replication is the most direct approach for distinguishing real genotype-disease associations from false discoveries in Genome Wide Association Studies (GWAS). Selecting SNPs for replication has been primarily based on p-values from the discovery stage, although additional characteristics of SNPs may be used to improve replication success. We used disease-associated SNPs from more than 2,000 published GWASs to identify predictors of SNP reproducibility. SNP reproducibility was defined as the proportion of successful replications among all replication attempts. The study reporting an association for the first time was considered to be the discovery study, and all subsequent studies targeting the same phenotype were considered replications. We found that −Log(P), where P is a p-value from the discovery study, is the strongest predictor of SNP reproducibility. Other significant predictors include the type of SNP (e.g. missense vs intronic SNPs) and minor allele frequency. Features of the genes linked to the disease-associated SNP also predict SNP reproducibility. Based on empirically defined rules, we developed a reproducibility score (RS) to predict SNP reproducibility independently of −Log(P). We used data from two lung cancer GWASs as well as recently reported disease-associated SNPs to validate RS. −Log(P) outperforms RS when the very top SNPs are selected, while RS works better with relaxed selection criteria. In conclusion, we propose an empirical model to predict SNP reproducibility, which can be used to select SNPs for validation and prioritization. PMID:25273843
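The two quantities defined in this abstract are simple to compute. The sketch below assumes a base-10 logarithm for −Log(P), which is the usual GWAS convention but is not stated in the abstract; the SNP records are hypothetical, and the empirical reproducibility score (RS) itself is not reconstructed here.

```python
from math import log10

def reproducibility(successes, attempts):
    """Proportion of successful replications among all replication attempts."""
    if attempts <= 0:
        raise ValueError("at least one replication attempt is required")
    return successes / attempts

def minus_log_p(p_value):
    """-log10 of the discovery-stage p-value (base 10 is an assumption)."""
    return -log10(p_value)

# Hypothetical discovery results: (SNP id, discovery p-value, successes, attempts).
records = [
    ("snp_a", 1e-12, 5, 5),
    ("snp_b", 3e-8, 2, 4),
    ("snp_c", 5e-6, 0, 3),
]

# Rank candidates for replication by the strongest predictor named in the abstract.
ranked = sorted(records, key=lambda rec: minus_log_p(rec[1]), reverse=True)
ranked_ids = [rec[0] for rec in ranked]
observed = {rec[0]: reproducibility(rec[2], rec[3]) for rec in records}
```

Sorting on −Log(P) mirrors the default practice the authors describe; their RS is intended as a complementary ranking when selection criteria are relaxed.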
Børsting, Claus; Fordyce, Sarah L; Olofsson, Jill Katharina
in our ISO 17025 accredited laboratory. Concordance between the Ion Torrent™ HID SNP assay and the SNPforID assay was tested by typing 44 Iraqis twice with the Ion Torrent™ HID SNP assay. The same samples were previously typed with the SNPforID assay and the Y-chromosome haplogroups of the individuals...
Calus, Mario P L; Mulder, Han A; Bastiaansen, John W M
Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNPs with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Tests to remove Mendelian inconsistencies between sibs should
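The counting step shared by the PAR-OFF and SIBCOUNT tests can be sketched in a few lines. The sketch assumes genotypes coded as 0, 1 or 2 copies of one allele, with None for a missing call; this coding is a convention chosen here, not taken from the abstract, and the thresholds and iterative removal scheme of the paper are not reproduced. Genotype vectors are toy data.

```python
def opposing_homozygotes(genotypes_1, genotypes_2):
    """Count loci where one animal is homozygous 0 and the other homozygous 2.

    Loci with a missing call (None) in either animal are skipped.
    """
    return sum(
        1
        for a, b in zip(genotypes_1, genotypes_2)
        if a is not None and b is not None and {a, b} == {0, 2}
    )

# Toy genotypes at six SNPs for a recorded parent and its putative offspring.
parent = [0, 2, 1, 2, 0, None]
offspring = [2, 2, 1, 0, 1, 2]
n_conflicts = opposing_homozygotes(parent, offspring)
```

A true parent-offspring pair should show no opposing homozygotes apart from genotyping errors, so even a small count is suspicious. True sibs can legitimately differ in this way at some loci, which is why a sib test needs an expected distribution rather than a zero tolerance.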
Jamshidi, Maral; Nevanlinna, Heli; Van Dyck, Laurien
In breast cancer, constitutive activation of NF-κB has been reported; however, the impact of genetic variation in the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-...
Zahirul I Talukder
A high-resolution genetic map of sunflower was constructed by integrating SNP data from three F2 mapping populations (HA 89/RHA 464, B-line/RHA 464, and CR 29/RHA 468). The consensus map spanned a total length of 1443.84 cM, and consisted of 5,019 SNP markers derived from RAD tag sequencing and 118 publicly available SSR markers distributed in 17 linkage groups, corresponding to the haploid chromosome number of sunflower. The maximum interval between markers in the consensus map is 12.37 cM and the average distance is 0.28 cM between adjacent markers. Despite a few short-distance inversions in marker order, the consensus map showed high levels of collinearity among individual maps, with an average Spearman's rank correlation coefficient of 0.972 across the genome. The order of the SSR markers on the consensus map was also in agreement with the order on the individual maps and with previously published sunflower maps. The three individual maps and the consensus map revealed an uneven distribution of markers across the genome. Additionally, we performed fine mapping and marker validation of the rust resistance gene R12, providing closely linked SNP markers for marker-assisted selection of this gene in sunflower breeding programs. This high-resolution consensus map will serve as a valuable tool to the sunflower community for studying marker-trait association of important agronomic traits, marker-assisted breeding, map-based gene cloning, and comparative mapping.
Rao, Kiran Prabhaker; Belogolovkin, Victoria
Marker chromosomes are a morphologically heterogeneous group of structurally abnormal chromosomes that pose a significant challenge in prenatal diagnosis. Phenotypes associated with marker chromosomes are highly variable and range from normal to severely abnormal. Clinical outcomes are very difficult to predict when marker chromosomes are detected prenatally. In this review, we outline the classification, etiology, cytogenetic characterization, and clinical consequences of marker chromosomes, as well as practical approaches to prenatal diagnosis and genetic counseling.
Manaffar, R; Zare, S; Agh, N; Abdolahzadeh, N; Soltanian, S; Sorgeloos, P; Bossier, P; Van Stappen, G
In order to find a marker for differentiating between a bisexual and a parthenogenetic Artemia strain, exon 7 of the Na/K ATPase α1 subunit gene was screened by the RFLP technique. The results revealed a constant synonymous SNP (single nucleotide polymorphism), detected by digestion with the Tru1I enzyme, that was consistent with these two types of Artemia. This SNP was identified as an accurate molecular marker for discrimination between bisexual and parthenogenetic Artemia. According to Nei's genetic distance (1973), the lowest genetic distance was found between individuals from Artemia urmiana Günther 1890 and parthenogenetic populations, making the described marker the first marker to easily distinguish between these two co-occurring species.
曾治君; 刘晨龙; 杨慧; 杨斌; 杨竹青; 陈从英
[Objective] Serum glucose (GLU) and glycosylated serum protein (GSP) contents in a Sutai population and a large-scale White Duroc × Erhualian F2 intercross at the age of 240 days were measured. A genome-wide association study was carried out to identify the SNPs or chromosomal regions associated with GLU and GSP. The aim of the study is to establish a foundation for identification of causative genes influencing serum GLU and GSP, and to provide clues for genetic analysis of human hypoglycemia and diabetes. [Method] The experimental pigs used in this study included 435 Sutai pigs that were bought from the Sutai Pig Breeding Center in Suzhou city and 760 F2 individuals from a White Duroc × Erhualian intercross that was constructed by the Key Laboratory for Animal Biotechnology of Jiangxi Agricultural University. All experimental pigs were fed under the same farm conditions and slaughtered at the age of 240 days at the Guohong abattoir. The collected blood samples were kept at room temperature for 5 hours, then centrifuged at 4 ℃ and 3,000 r/min for 20 min. Serum GLU and GSP were determined with commercial kits. Genomic DNA was extracted from ear tissues using a standard phenol/chloroform method, and its concentration and quality were determined with a NanoDrop 1000 analyzer. All DNA samples were diluted to 20 ng·μL-1 and then stored at -20 ℃ until used. All experimental animals were genotyped with the Illumina porcine 60K SNP chip. Quality control of genotyping results was carried out using PLINK software. The genome-wide association studies were performed with a mixed linear model on the SNPs that passed quality control, using the GenABEL package in R, to identify the significant SNPs associated with GLU and GSP at 240 days in the Sutai and White Duroc × Erhualian F2 populations. Possible candidate genes were chosen for each significant region according to gene annotations on the Ensembl or NCBI websites. [Result] A total of 5 SNPs that significantly associated with
蔡明成; 杨魁; 王玲; 左福元
Single nucleotide polymorphisms (SNPs), as third-generation molecular markers, have many useful characteristics, such as abundant sites, strong representativeness and genetic stability, and have become a focus of molecular marker research. This paper reviews the characteristics and detection methods of SNPs and their application to growth traits, reproductive traits, and carcass and meat quality traits in beef cattle, as candidate-gene markers for molecular marker-assisted selection, in order to accelerate beef cattle breeding.
Eduardoff, M; Gross, T E; Santos, C
The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ ...
Background: Technological advances have led to a rapid increase in the availability of single nucleotide polymorphisms (SNPs) in a range of organisms, and there is a general optimism that SNPs will become the marker of choice for a range of evolutionary applications. Here, comparisons between 300 polymorphic SNPs and 14 short tandem repeats (STRs) were conducted on a data set consisting of approximately 500 Atlantic salmon arranged in 10 samples/populations. Results: Global FST ranged from 0.033-0.115 and -0.002-0.316 for the 14 STR and 300 SNP loci respectively. Global FST was similar among 28 linkage groups when averaging data from mapped SNPs. With the exception of selecting a panel of SNPs taking the locus displaying the highest global FST for each of the 28 linkage groups, which inflated estimation of genetic differentiation among the samples, inferred genetic relationships were highly similar between SNP and STR data sets and variants thereof. The best 15 SNPs (30 alleles) gave a similar level of self-assignment to the best 4 STR loci (83 alleles); however, addition of further STR loci did not lead to a notable increase in assignment, whereas addition of up to 100 SNP loci increased assignment. Conclusion: Whilst the optimal combinations of SNPs identified in this study are linked to the samples from which they were selected, this study demonstrates that identification of highly informative SNP loci from larger panels will provide researchers with a powerful approach to delineate genetic relationships at the individual and population levels. PMID:20051144
Rudy M Jonker
Migratory birds are of particular interest for population genetics because of the high connectivity between habitats and populations. A high degree of connectivity requires using many genetic markers to achieve the required statistical power, and a genome-wide SNP set can fit this purpose. Here we present the development of a genome-wide SNP set for the Barnacle Goose Branta leucopsis, a model species for the study of bird migration. We used the genome of a different waterfowl species, Mallard Anas platyrhynchos, as a reference to align Barnacle Goose second-generation sequence reads from an RRL library and detected 2188 SNPs genome-wide. Furthermore, we used chimeric flanking sequences, merged from both Mallard and Barnacle Goose DNA sequence information, to create primers for validation by genotyping. Validation with a 384-SNP genotyping set resulted in 374 (97%) successfully typed SNPs in the assay, of which 358 (96%) were polymorphic. Additionally, we validated our SNPs on relatively old (30 years) museum samples, which resulted in a success rate of at least 80%. This shows that museum samples could be used in standard SNP genotyping assays. Our study also shows that the genome of a related species can be used as a reference to detect genome-wide SNPs in birds, because genomes of birds are highly conserved. This is illustrated by the use of chimeric flanking sequences, which showed that the incorporation of flanking nucleotides from Mallard into Barnacle Goose sequences led to equal genotyping performance when compared to flanking sequences solely composed of Barnacle Goose sequence.
Trebbi, Daniele; Maccaferri, Marco; de Heer, Peter; Sørensen, Anker; Giuliani, Silvia; Salvi, Silvio; Sanguineti, Maria Corinna; Massi, Andrea; van der Vossen, Edwin Andries Gerard; Tuberosa, Roberto
We describe the application of complexity reduction of polymorphic sequences (CRoPS®) technology for the discovery of SNP markers in tetraploid durum wheat (Triticum durum Desf.). A next-generation sequencing experiment was carried out on reduced representation libraries obtained from four durum cultivars. SNP validation and minor allele frequency (MAF) estimation were carried out on a panel of 12 cultivars, and the feasibility of genotyping these SNPs in segregating populations was tested using the Illumina GoldenGate (GG) technology. A total of 2,659 SNPs were identified on 1,206 consensus sequences. Among the 768 SNPs that were chosen irrespective of their genomic repetitiveness level and assayed on the Illumina BeadExpress genotyping system, 275 (35.8%) SNPs matched the expected genotypes observed in the SNP discovery phase. MAF data indicated that the overall SNP informativeness was high: a total of 196 (71.3%) SNPs had MAF >0.2, of which 76 (27.6%) showed MAF >0.4. Of these SNPs, 157 were mapped in one of two mapping populations (Meridiano × Claudio and Colosseo × Lloyd) and integrated into a common genetic map. Despite the relatively low genotyping efficiency of the GG assay, the validated CRoPS-derived SNPs showed valuable features for genomics and breeding applications such as a uniform distribution across the wheat genome, a prevailing single-locus codominant nature and a high polymorphism. Here, we report a new set of 275 highly robust genome-wide Triticum SNPs that are readily available for breeding purposes.
Zhen-lin ZHANG; Jin-wei HE; Yue-juan QIN; Yun-qiu HU; Miao LI; Yu-juan LIU; Hao ZHANG; Wei-wei HU
Aim: To assess the contribution of single nucleotide polymorphisms (SNP) and haplotypes in the peroxisome proliferator-activated receptor-γ co-activator-1 (PPARGC1) and adiponectin genes to normal bone mineral density (BMD) variation in healthy Chinese women and men. Methods: We performed population-based (ANOVA) and family-based (quantitative trait locus transmission disequilibrium test) association studies of PPARGC1 and adiponectin genes. SNP in the 2 genes were genotyped. BMD was measured using dual-energy X-ray absorptiometry in the lumbar spine and hip in 401 nuclear families with a total of 1260 subjects, including 458 premenopausal women, 20-40 years of age; 401 postmenopausal women (mothers), 43-74 years of age; and 401 men (fathers), 49-76 years of age. Results: Significant within-family association was found between the Thr394Thr polymorphism in the PPARGC1 gene and peak BMD in the femoral neck (P=0.026). Subsequent permutations were in agreement with this significant within-family association result (P=0.016), but Thr394Thr SNP only accounted for 0.7% of the variation in femoral neck peak BMD. However, no significant within-family association was detected between each SNP in the adiponectin gene and peak BMD. Although no significant association was found between BMD and SNP in the PPARGC1 and adiponectin genes in both men and postmenopausal women, haplotype 2 (T-T) in the adiponectin gene was associated with lumbar spine BMD in postmenopausal women (P=0.019). Conclusion: Our findings suggest that Thr394Thr SNP in the PPARGC1 gene was associated with peak BMD in the femoral neck in Chinese women. Confirmation of our results is needed in other populations and with more functional markers within and flanking the PPARGC1 or adiponectin genes region.
Schalkwyk Leonard C
Abstract Background Genetic influences underpinning complex traits are thought to involve multiple quantitative trait loci (QTLs) of small effect size. Detection of such QTL associations requires systematic screening of large numbers of DNA markers within large sample populations. Using pooled DNA on SNP microarrays to screen for allelic frequency differences between groups such as cases and controls (called SNP Microarray and Pooling, or SNP-MaP) has been validated as an efficient solution on both 10 k and 100 k platforms. We demonstrate that this approach can be effectively applied to the truly genomewide Affymetrix GeneChip® Mapping 500 K Array. Results In comparisons between five independent DNA pools (N ~200 per pool) on separate Affymetrix GeneChip® Mapping 500 K Array sets, we show that, for SNPs with minor allele frequencies > 0.05, the reliability of the rank order of estimated allele frequencies, assessed as the average correlation between allele frequency estimates across the DNA pools, was 0.948 (average mean difference across the five pools = 0.069). Similarly, validity of the SNP-MaP approach was demonstrated by a rank-order correlation of 0.937 (average mean difference = 0.095) between the average DNA pool allele frequency estimates and the allele frequencies of an independent (CEPH) sample of 60 unrelated individually genotyped subjects. Conclusion We conclude that SNP-MaP can be extended for use on the Affymetrix GeneChip® Mapping 500 K Array, providing a cost-effective, reliable and valid initial screen of 500 K SNP microarrays in genomewide association scans.
Smith, Edward M.; Littrell, Jack; Olivier, Michael
High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.
Edriss, Vahid; Guldbrandtsen, Bernt; Lund, Mogens Sandø
Genomic selection is a method to predict breeding values using genome-wide single-nucleotide polymorphism (SNP) markers. High-quality marker data are necessary for genomic selection. The aim of this study was to investigate the effect of marker-editing criteria on the accuracy of genomic predictions in the Nordic Holstein and Jersey populations. Data included 4429 Holstein and 1071 Jersey bulls. In total, 48 222 SNP for Holstein and 44 305 SNP for Jersey were polymorphic. The SNP data were edited based on (i) minor allele frequencies (MAF) with thresholds of no limit, 0.001, 0.01, 0.02, 0.05 and 0.10, (ii) deviations from Hardy–Weinberg proportions (HWP) with thresholds of no limit, chi-squared p-values of 0.001, 0.02, 0.05 and 0.10, and (iii) GenCall (GC) scores with thresholds of 0.15, 0.55, 0.60, 0.65 and 0.70. The marker data sets edited with different criteria were used for genomic...
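The first two editing criteria named in this abstract (a MAF threshold and a chi-squared test for deviation from Hardy–Weinberg proportions) can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the genotype-count input format, and the default thresholds are assumptions chosen for illustration, and GenCall scores are omitted because they come from the genotyping platform rather than from the counts.

```python
from math import erfc, sqrt


def maf(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1 - p)


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of the 1-df chi-squared test against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic marker: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))


def keep_marker(counts, maf_min=0.01, hwe_p_min=0.001) -> bool:
    """Hypothetical filter: keep a SNP only if it passes both thresholds."""
    return maf(*counts) >= maf_min and hwe_chi2_pvalue(*counts) >= hwe_p_min
```

For example, `keep_marker((25, 50, 25))` keeps a marker in perfect Hardy–Weinberg proportions, while `keep_marker((50, 0, 50))` rejects one with no heterozygotes at all.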
Abstract Background The increasing number of genomic sequences of bacteria makes it possible to select unique SNPs of a particular strain/species at the whole genome level and thus design specific primers based on the SNPs. The high similarity of genomic sequences among phylogenetically-related bacteria requires the identification of the few loci in the genome that can serve as unique markers for strain differentiation. PrimerSNP attempts to identify reliable strain-specific markers, on which specific primers are designed for pathogen detection purposes. Results PrimerSNP is an online tool to design primers based on strain-specific SNPs for multiple strains/species of microorganisms at the whole genome level. The allele-specific primers could distinguish query sequences of one strain from other homologous sequences by standard PCR reaction. Additionally, PrimerSNP provides a feature for designing common primers that can amplify all the homologous sequences of multiple strains/species of microorganisms. PrimerSNP is freely available at http://cropdisease.ars.usda.gov/~primer. Conclusion PrimerSNP is a high-throughput specific primer generation tool for the differentiation of phylogenetically-related strains/species. Experimental validation showed that this software had a successful prediction rate of 80.4–100% for strain-specific primer design.
Gorham, James D; Ranson, Matthew S; Smith, Janebeth C; Gorham, Beverly J; Muirhead, Kristen-Ashley
State-of-the-art, genome-wide assessment of mouse genetic background uses single nucleotide polymorphism (SNP) PCR. As SNP analysis can use multiplex testing, it is amenable to high-throughput analysis and is the preferred method for shared resource facilities that offer genetic background assessment of mouse genomes. However, a typical individual SNP query yields only two alleles (A vs. B), limiting the application of this methodology to distinguishing contributions from no more than two inbred mouse strains. By contrast, simple sequence length polymorphism (SSLP) analysis yields multiple alleles but is not amenable to high-throughput testing. We sought to devise a SNP-based technique to identify donor strain origins when three distinct mouse strains potentially contribute to the genetic makeup of an individual mouse. A computational approach was used to devise a three-strain analysis (3SA) algorithm that would permit identification of three genetic backgrounds while still using a binary-output SNP platform. A panel of 15 mosaic mice with contributions from BALB/c, C57Bl/6, and DBA/2 genetic backgrounds was bred and analyzed using a genome-wide SNP panel using 1449 markers. The 3SA algorithm was applied and then validated using SSLP. The 3SA algorithm assigned 85% of 1449 SNPs as informative for the C57Bl/6, BALB/c, or DBA/2 backgrounds, respectively. Testing the panel of 15 F2 mice, the 3SA algorithm predicted donor strain origins genome-wide. Donor strain origins predicted by the 3SA algorithm correlated perfectly with results from individual SSLP markers located on five different chromosomes (n=70 tests). We have established and validated an analysis algorithm based on binary SNP data that can successfully identify the donor strain origins of chromosomal regions in mice that are bred from three distinct inbred mouse strains.
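The core difficulty described here, that a binary SNP can only separate one strain from the other two, can be shown with a toy classifier. This is only an illustration of the idea of strain-informative SNPs, not the published 3SA algorithm; the function name and the allele dictionary format are assumptions.

```python
from typing import Dict, Optional


def informative_strain(alleles: Dict[str, str]) -> Optional[str]:
    """Return the strain whose allele differs from the other two, else None.

    `alleles` maps exactly three strain names to their allele ("A" or "B")
    at one binary SNP. With two alleles and three strains, at most one
    strain can carry the odd allele.
    """
    strains = list(alleles)
    for strain in strains:
        others = [alleles[s] for s in strains if s != strain]
        if others[0] == others[1] and others[0] != alleles[strain]:
            return strain
    return None  # all three strains share the allele: uninformative SNP


snp = {"BALB/c": "A", "C57Bl/6": "B", "DBA/2": "A"}
print(informative_strain(snp))  # the strain carrying the odd allele
```

A genome-wide panel would apply such a test to every marker and then combine neighbouring informative SNPs to assign chromosomal regions to a donor strain.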
Luo Peng; Hu Chaoqun; Ren Chunhua; Zhang Lvping
Using PCR-denaturing gradient gel electrophoresis (DGGE) targeting the RNA polymerase beta subunit (rpoB) gene, a simultaneous detection method for Vibrio species was established. rpoB gene-based PCR-DGGE was carried out with eight Vibrio Reference strains (each from a different species), a mixed sample (including these Vibrio Reference strains), two non-Vibrio strains, four environmental Vibrio strains, and three unidentified environmental strains. For comparison, 16S rRNA gene-based PCR-DGGE of the eight Vibrio Reference strains was performed with universal primers. In addition, three unidentified strains were identified by 16S rRNA and gyrB gene sequencing and the API 20E system in order to confirm the accuracy of rpoB gene-based PCR-DGGE detection. Results revealed that rpoB-based PCR-DGGE could discriminate the eight Vibrio Reference strains well but could not discriminate different strains within the same species. The bands derived from the two non-Vibrio strains did not match any bands in the Reference marker. Meanwhile, 16S rRNA gene-based DGGE failed to distinguish these Reference strains. Furthermore, four out of eight Vibrio species exhibited heterogeneous bands in 16S rRNA gene-based DGGE. Sequencing and API 20E identification of unidentified strains coincided with the detection by rpoB gene-based PCR-DGGE. The results demonstrated that rpoB-based PCR-DGGE provided a rapid and efficient method for simultaneous detection of multiple Vibrio species, which can avoid the limitations inherent in 16S rRNA gene-based PCR-DGGE.
Xu, Pei; Wu, Xiaohua; Wang, Baogen; Liu, Yonghua; Ehlers, Jeffery D; Close, Timothy J; Roberts, Philip A; Diop, Ndeye-Ndack; Qin, Dehui; Hu, Tingting; Lu, Zhongfu; Li, Guojing
Asparagus bean (Vigna unguiculata ssp. sesquipedialis) is a distinctive subspecies of cowpea [Vigna unguiculata (L.) Walp.] that apparently originated in East Asia and is characterized by extremely long and thin pods and an aggressive climbing growth habit. The crop is widely cultivated throughout Asia for the production of immature pods known as 'long beans' or 'asparagus beans'. While the genome of cowpea ssp. unguiculata has been characterized recently by high-density genetic mapping and partial sequencing, little is known about the genome of asparagus bean. We report here the first genetic map of asparagus bean based on SNP and SSR markers. The current map consists of 375 loci mapped onto 11 linkage groups (LGs), with 191 loci detected by SNP markers and 184 loci by SSR markers. The overall map length is 745 cM, with an average marker distance of 1.98 cM. There are four high marker-density blocks distributed on three LGs and three regions of segregation distortion (SDRs) identified on two other LGs, two of which co-locate in chromosomal regions syntenic to SDRs in soybean. Synteny between asparagus bean and the model legume Lotus japonicus was also established. This work provides the basis for mapping and functional analysis of genes/QTLs of particular interest in asparagus bean, as well as for comparative genomics study of cowpea at the subspecies level.
Peach was domesticated in China more than four millennia ago and from there it spread world-wide. Since the middle of the last century, peach breeding programs have been very dynamic, generating hundreds of new commercial varieties; however, in most cases such varieties derive from a limited collection of parental lines (founders). This is one reason for the observed low levels of variability of the commercial gene pool, implying that knowledge of the extent and distribution of genetic variability in peach is critical to allow the choice of adequate parents to confer enhanced productivity, adaptation and quality to improved varieties. With this aim we genotyped 1,580 peach accessions (including a few closely related Prunus species) maintained and phenotyped in five germplasm collections (four European and one Chinese) with the International Peach SNP Consortium 9K SNP peach array. The study of population structure revealed the subdivision of the panel in three main populations, one mainly made up of Occidental varieties from breeding programs (POP1OCB), one of Occidental landraces (POP2OCT) and the third of Oriental accessions (POP3OR). Analysis of linkage disequilibrium (LD) identified differential patterns of genome-wide LD blocks in each of the populations. Phenotypic data for seven monogenic traits were integrated in a genome-wide association study (GWAS). The significantly associated SNPs were always in the regions predicted by linkage analysis, forming haplotypes of markers. These diagnostic haplotypes could be used for marker-assisted selection (MAS) in modern breeding programs.
Chun Ming Wang
Jatropha curcas is a potential plant species for biodiesel production. However, its seed yield is too low for profitable production of biodiesel. To improve the productivity, genetic improvement through breeding is essential. A linkage map is an important component in molecular breeding. We established a first-generation linkage map using a mapping panel containing two backcross populations with 93 progeny. We mapped 506 markers (216 microsatellites and 290 SNPs from ESTs) onto 11 linkage groups. The total length of the map was 1440.9 cM with an average marker space of 2.8 cM. Blasting of 222 Jatropha ESTs containing polymorphic SSR or SNP markers against EST databases revealed that 91.0%, 86.5% and 79.2% of Jatropha ESTs were homologous to counterparts in castor bean, poplar and Arabidopsis, respectively. Mapping 192 orthologous markers to the assembled whole genome sequence of Arabidopsis thaliana identified 38 syntenic blocks and revealed that small linkage blocks were well conserved, but often shuffled. The first-generation linkage map and the data of comparative mapping could lay a solid foundation for QTL mapping of agronomic traits, marker-assisted breeding and cloning genes responsible for phenotypic variation.
Endelman Jeffrey B
Abstract Background The need to integrate information from multiple linkage maps is a long-standing problem in genetics. One way to visualize the complex ordinal relationships is with a directed graph, where each vertex in the graph is a bin of markers. When there are no ordering conflicts between the linkage maps, the result is a directed acyclic graph, or DAG, which can then be linearized to produce a consensus map. Results New algorithms for the simplification and linearization of consensus graphs have been implemented as a package for the R computing environment called DAGGER. The simplified consensus graphs produced by DAGGER exactly capture the ordinal relationships present in a series of linkage maps. Using either linear or quadratic programming, DAGGER generates a consensus map with minimum error relative to the linkage maps while remaining ordinally consistent with them. Both linearization methods produce consensus maps that are compressed relative to the mean of the linkage maps. After rescaling, however, the consensus maps had higher accuracy (and higher marker density) than the individual linkage maps in genetic simulations. When applied to four barley linkage maps genotyped at nearly 3000 SNP markers, DAGGER produced a consensus map with improved fine structure compared to the existing barley consensus SNP map. The root-mean-squared error between the linkage maps and the DAGGER map was 0.82 cM per marker interval compared to 2.28 cM for the existing consensus map. Examination of the barley hardness locus at the 5HS telomere, for which there is a physical map, confirmed that the DAGGER output was more accurate for fine structure analysis. Conclusions The R package DAGGER is an effective, freely available resource for integrating the information from a set of consistent linkage maps.
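The ordinal part of this idea, merging several marker orders into a directed graph and reading off an ordering consistent with all of them, can be sketched with the Python standard library. This is not DAGGER (which is an R package and assigns map positions by linear or quadratic programming); the sketch treats each marker as its own bin, uses hypothetical marker names, and only returns one valid ordering.

```python
from graphlib import CycleError, TopologicalSorter


def consensus_order(maps):
    """Merge marker orders from several linkage maps into one ordering.

    Each map is a list of marker names in map order. Adjacent markers
    contribute a directed edge left -> right. If the maps conflict, the
    graph has a cycle and graphlib raises CycleError.
    """
    predecessors = {}
    for order in maps:
        for marker in order:
            predecessors.setdefault(marker, set())
        for left, right in zip(order, order[1:]):
            predecessors[right].add(left)
    return list(TopologicalSorter(predecessors).static_order())


maps = [["m1", "m2", "m4"], ["m1", "m3", "m4"]]
print(consensus_order(maps))  # m1 first, m4 last; m2 and m3 in either order

try:
    consensus_order([["a", "b"], ["b", "a"]])
except CycleError:
    print("ordering conflict between maps")
```

In the example, the maps do not say whether m2 precedes m3, so either placement is ordinally consistent; resolving such ties with distances is where an optimization step would come in.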
Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A; Michailidou, Kyriaki; Bolla, Manjeet K; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A; Loehberg, Christian R; Burwinkel, Barbara; Marme, Frederik; Hopper, John L; Southey, Melissa C; Bojesen, Stig E; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Van Dyck, Laurien; Nevelsteen, Ines; Couch, Fergus J; Olson, Janet E; Giles, Graham G; McLean, Catriona; Haiman, Christopher A; Henderson, Brian E; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A E M; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J; Martens, John W M; Cox, Angela; Cross, Simon S; Simard, Jacques; Dunning, Alison M; Easton, Douglas F; Pharoah, Paul D P; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K; Nevanlinna, Heli
In breast cancer, constitutive activation of NF-κB has been reported, however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox' regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI=3.3-14.4, P=1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI=0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas, rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses.
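The likelihood-ratio comparison of nested models mentioned in this abstract reduces to a short calculation once both models have been fitted. The sketch below assumes the two maximized log (partial) likelihoods are already available and that the models differ by a single interaction term (one degree of freedom); the function name and the numeric inputs are hypothetical, not values from the study.

```python
from math import erfc, sqrt


def lrt_pvalue_1df(loglik_without: float, loglik_with: float) -> float:
    """Likelihood-ratio test for one extra parameter (e.g. an interaction term).

    The statistic 2 * (loglik_with - loglik_without) is compared against a
    chi-squared distribution with 1 degree of freedom.
    """
    stat = 2.0 * (loglik_with - loglik_without)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    return erfc(sqrt(stat / 2.0))  # chi-squared (1 df) survival function


# Hypothetical log-likelihoods for models without and with the SNP-SNP term.
print(lrt_pvalue_1df(-100.0, -98.0792705))
```

When many SNP pairs are screened this way, the resulting p-values still need correction for multiple testing.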
Single nucleotide polymorphisms (SNPs) play important roles as molecular markers in plant genomics and breeding studies. Although onion (Allium cepa L.) is an important crop globally, relatively few molecular marker resources have been reported due to its large genome and high heterozygosity. Genotyping-by-sequencing (GBS) offers a greater degree of complexity reduction followed by concurrent SNP discovery and genotyping for species with complex genomes. In this study, GBS was employed for SNP mining in onion, which currently lacks a reference genome. A segregating F2 population, derived from a cross between ‘NW-001’ and ‘NW-002,’ as well as multiple parental lines were used for GBS analysis. A total of 56.15 Gbp of raw sequence data were generated and 1,851,428 SNPs were identified from the de novo assembled contigs. Stringent filtering resulted in 10,091 high-fidelity SNP markers. Robust SNPs that satisfied the segregation ratio criteria and with even distribution in the mapping population were used to construct an onion genetic map. The final map contained eight linkage groups and spanned a genetic length of 1,383 centiMorgans (cM), with an average marker interval of 8.08 cM. These robust SNPs were further analyzed using the high-throughput Fluidigm platform for marker validation. This is the first study in onion to develop genome-wide SNPs using GBS. The resulting SNP markers and developed linkage map will be valuable tools for genetic mapping of important agronomic traits and marker-assisted selection in onion breeding programs.
Xu, Cheng; Ren, Yonghong; Jian, Yinqiao; Guo, Zifeng; Zhang, Yan; Xie, Chuanxiao; Fu, Junjie; Wang, Hongwu; Wang, Guoying; Xu, Yunbi; Li, Ping; Zou, Cheng
With the decrease of cost in genotyping, single nucleotide polymorphisms (SNPs) have gained wide acceptance because of their abundance, even distribution throughout the maize (Zea mays L.) genome, and suitability for high-throughput analysis. In this study, a maize 55 K SNP array with improved genome coverage for molecular breeding was developed on an Affymetrix® Axiom® platform with 55,229 SNPs evenly distributed across the genome, including 22,278 exonic and 19,425 intronic SNPs. This array contains 451 markers that are associated with 368 known genes and two traits of agronomic importance (drought tolerance and kernel oil biosynthesis), 4067 markers that are not covered by the current reference genome, 734 markers that are differentiated significantly between heterotic groups, and 132 markers that are tags for important transgenic events. To evaluate the performance of 55 K array, we genotyped 593 inbred lines with diverse genetic backgrounds. Compared with the widely-used Illumina® MaizeSNP50 BeadChip, our 55 K array has lower missing and heterozygous rates and more SNPs with lower minor allele frequency (MAF) in tropical maize, facilitating in-depth dissection of rare but possibly valuable variation in tropical germplasm resources. Population structure and genetic diversity analysis revealed that this 55 K array is also quite efficient in resolving heterotic groups and performing fine fingerprinting of germplasm. Therefore, this maize 55 K SNP array is a potentially powerful tool for germplasm evaluation (including germplasm fingerprinting, genetic diversity analysis, and heterotic grouping), marker-assisted breeding, and primary quantitative trait loci (QTL) mapping and genome-wide association study (GWAS) for both tropical and temperate maize.
Elliott, Timothy P; Spithill, Terry W
Triclabendazole (TCBZ) is widely used for control of Fasciola hepatica (liver fluke) in animals and humans and resistance to this drug is now widespread. However, the mechanism of resistance to TCBZ is not known. A T687G single nucleotide polymorphism (SNP) in a P-glycoprotein gene was proposed as a molecular marker for TCBZ resistance in F. hepatica (Wilkinson et al., 2012). We analyzed this Pgp gene from TCBZ-susceptible and TCBZ-resistant populations from Australia to determine if the SNP was a marker for TCBZ resistance. From the 21 parasites studied we observed 27 individual haplotypes in the Pgp sequences which comprised seven haplotypic groups (A-G), with haplotypes A and B representing 81% of the total observed. The T687G SNP was not observed in either of the resistant or susceptible populations. We conclude that the T687G SNP in this Pgp gene is not associated with TCBZ resistance in these Australian F. hepatica populations and therefore unlikely to be a universal molecular marker for TCBZ resistance.
Smith Richard JH
Abstract Background The identification of disease-associated genes using single nucleotide polymorphisms (SNPs) has been increasingly reported. In particular, the Affymetrix Mapping 10 K SNP microarray platform uses one PCR primer to amplify the DNA samples and determine the genotype of more than 10,000 SNPs in the human genome. This provides the opportunity for large scale, rapid and cost-effective genotyping assays for linkage analysis. However, the analysis of such datasets is nontrivial because of the large number of markers, and visualizing the linkage scores in the context of genome maps remains less automated using the current linkage analysis software packages. For example, the haplotyping results are commonly represented in the text format. Results Here we report the development of a novel software tool called CompareLinkage for automated formatting of the Affymetrix Mapping 10 K genotype data into the "Linkage" format and the subsequent analysis with multi-point linkage software programs such as Merlin and Allegro. The new software has the ability to visualize the results for all these programs in dChip in the context of genome annotations and cytoband information. In addition we implemented a variant of the Lander-Green algorithm in the dChipLinkage module of dChip software (V1.3) to perform parametric linkage analysis and haplotyping of SNP array data. These functions are integrated with the existing modules of dChip to visualize SNP genotype data together with LOD score curves. We have analyzed three families with recessive and dominant diseases using the new software programs and the comparison results are presented and discussed. Conclusions The CompareLinkage and dChipLinkage software packages are freely available. They provide the visualization tools for high-density oligonucleotide SNP array data, as well as the automated functions for formatting SNP array data for the linkage analysis programs Merlin and Allegro and calling
Abstract Background High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays under different circumstances. Results We created a program (ClusterA) for calculating "Silhouette scores", and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found Silhouettes to provide a relevant measure for the quality of SNP assays under different reaction conditions, illustrated by the four DNA polymerases here. According to our results, the genotypes can be unequivocally assigned without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system. Conclusion "Silhouette scores" for assessing the quality of SNP genotyping clusters is convenient for evaluating the quality of SNP genotype
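The silhouette of a data point has a standard definition: with a the mean distance to the other points in its own cluster and b the smallest mean distance to the points of any other cluster, s = (b - a) / max(a, b). The sketch below applies that definition to genotype clusters in a two-signal scatter plot. It is not the ClusterA program; the function names, the Euclidean distance, and the choice to summarise an assay by the mean silhouette are assumptions for illustration.

```python
from math import dist
from statistics import mean


def silhouette_scores(points, labels):
    """Per-point silhouettes s = (b - a) / max(a, b) for labelled 2-D points."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster.
        own = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
               if l == lab and j != i]
        other_labels = set(labels) - {lab}
        if not own or not other_labels:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        a = mean(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(mean(dist(p, q) for q, l in zip(points, labels) if l == other)
                for other in other_labels)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def assay_score(points, labels):
    """Assumed summary: mean silhouette over all samples of one SNP assay."""
    return mean(silhouette_scores(points, labels))


signals = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(assay_score(signals, ["AA", "AA", "BB", "BB"]))  # well separated clusters
print(assay_score(signals, ["AA", "BB", "AA", "BB"]))  # poor genotype assignment
```

With the well-separated labelling the score is close to 1 and clears the 0.65 cut-off quoted in the abstract, while the scrambled labelling gives a negative score.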
Jul 30, 2014 ... Simple sequence repeats (SSRs) are the most widely used marker system for plant variety characterization and ... gene tagging in marker assisted breeding and gene cloning in .... PLS-2 and PAU Selection Long) to 1.00 (between PC. 2062 and .... Comparative analyses of genetic diversities within tomato.
Apr 3, 2012 ... seeded and black-seeded cultivars and breeding lines. The group B included 70 ... maize, rice and tomatoes (Reif et al., 2006; Vigouroux et al., 2005; Warburton et ..... development of molecular markers for marker-assisted breeding. .... Selection under domestication: evidence for a sweep in the rice Waxy ...
Labate, Joanne A; Robertson, Larry D; Wu, Feinan; Tanksley, Steven D; Baldo, Angela M
Because cultivated tomato (Solanum lycopersicum L.) is low in genetic diversity, public, verified single nucleotide polymorphism (SNP) markers within the species are in demand. To promote marker development we resequenced approximately 23 kb in a diverse set of 31 tomato lines including TA496. Three classes of markers were sampled: (1) 26 expressed-sequence tag (EST), all of which were predicted to be polymorphic based on TA496, (2) 14 conserved ortholog set II (COSII) or unigene, and (3) ten published sequences, composed of nine fruit quality genes and one anonymous RFLP marker. The latter two types contained mostly noncoding DNA. In total, 154 SNPs and 34 indels were observed. The distributions of nucleotide diversity estimates among marker types were not significantly different from each other. Ascertainment bias of SNPs was evaluated for the EST markers. Despite the fact that the EST markers were developed using SNP prediction within a sample consisting of only one TA496 allele and one additional allele, the majority of polymorphisms in the 26 EST markers were represented among the other 30 tomato lines. Fifteen EST markers with published SNPs were more closely examined for bias. Mean SNP diversity observations were not significantly different between the original discovery sample of two lines (53 SNPs) and the 31 line diversity panel (56 SNPs). Furthermore, TA496 shared its haplotype with at least one other line at 11 of the 15 markers. These data demonstrate that public EST databases and noncoding regions are a valuable source of unbiased SNP markers in tomato.
Claudia D. Gherman
Objectives. We hypothesized that adiponectin gene SNP+45 (rs2241766) and SNP+276 (rs1501299) would be associated with atherosclerotic peripheral arterial disease (PAD). Furthermore, the association between circulating adiponectin levels, fetuin-A, and tumor necrosis factor-alpha (TNF-α) in patients with atherosclerotic peripheral arterial disease was investigated. Method. Several blood parameters (such as adiponectin, fetuin-A, and TNF-α) were measured in 346 patients, 226 with atherosclerotic peripheral arterial disease (PAD) and 120 without symptomatic PAD (non-PAD). Two common SNPs of the ADIPOQ gene represented by +45T/G 2 and +276G/T were also investigated. Results. Adiponectin concentrations showed lower circulating levels in the PAD patients compared to non-PAD patients (P0.05. Conclusion. The results of our study demonstrated that neither adiponectin SNP+45 nor SNP+276 is associated with the risk of PAD.
Three SNP markers were developed that are completely diagnostic in distinguishing the two fire ant species Solenopsis invicta and S. richteri. Although a fourth marker we developed is not fully diagnostic, it is still useful given one of the variants is confined to S. richteri. Joint use of these ma...
With the advent of next generation sequencing (NGS) technologies, single nucleotide polymorphisms (SNPs) have become the major type of marker for genotyping in many crops. However, the availability of SNP markers for important traits of bread wheat (Triticum aestivum L.) that can be effectively used...
Sargent, D J; Yang, Y; Šurbanovski, N; Bianco, L; Buti, M; Velasco, R; Giongo, L; Davis, T M
The cultivated strawberry, Fragaria × ananassa, possesses a genetically complex allo-octoploid genome. Advances in genomics research in Fragaria, including the release of a genome sequence for F. vesca, have permitted the development of a high throughput whole genome genotyping array for strawberry, which promises to facilitate genetics and genomics research. In this investigation, we used the Axiom® IStraw90® array for linkage map development, and produced a linkage map containing 8,407 SNP markers spanning 1,820 cM. Whilst the linkage map provides good coverage of the genome of both parental genotypes, the map of 'Monterey' contained significantly fewer mapped markers than did that of 'Darselect'. The array contains a novel marker class known as haploSNPs, which exploit homoeologous sequence variants as probe destabilization sites to effectively reduce marker ploidy. We examined these sites as potential indicators of subgenomic identities by using comparisons to allele states in two ancestral diploids. On this basis, haploSNP loci could be inferred to be derived from F. vesca, F. iinumae, or from an unknown source. When the identity classifications of haploSNPs were considered in conjunction with their respective linkage map positions, it was possible to define two discrete subgenomes, while the remaining homoeologues of each chromosome could not be partitioned into two discrete subgenomic groupings. These findings suggested a novel hypothesis regarding octoploid strawberry subgenome structure and evolutionary origins.
Wei, Wei; Ayub, Qasim; Xue, Yali; Tyler-Smith, Chris
We have compared phylogenies and time estimates for Y-chromosomal lineages based on resequencing ∼9 Mb of DNA and applying the program GENETREE to similar analyses based on the more standard approach of genotyping 26 Y-SNPs plus 21 Y-STRs and applying the programs NETWORK and BATWING. We find that deep phylogenetic structure is not adequately reconstructed after Y-SNP plus Y-STR genotyping, and that times estimated using observed Y-STR mutation rates are several-fold too recent. In contrast, an evolutionary mutation rate gives times that are more similar to the resequencing data. In principle, systematic comparisons of this kind can in future studies be used to identify the combinations of Y-SNP and Y-STR markers, and time estimation methodologies, that correspond best to resequencing data. PMID:23768990
in the stratified analysis by p53 mutation status (GG vs TT: OR = 1.17, 95% CI = 0.75-1.82 and TG vs TT: OR = 1.09, 95% CI = 0.89-1.34 for positive p53 mutation status; GG vs TT: OR = 0.95, 95% CI = 0.72-1.25 and TG vs TT: OR = 1.06, 95% CI = 0.85-1.30 for negative p53 mutation status). Conclusions. The analyses indicate that MDM2 SNP309 serves as a tumor susceptibility marker, and that there is an association between MDM2 SNP309 and p53 Arg72Pro regarding tumor susceptibility. Further studies that take into consideration environmental stresses and functional genetic variants in the p53-MDM2-related genes are warranted.
Shahid, Muhammad Qasim; Çiftçi, Vahdettin; E. Sáenz de Miera, Luis; Aasim, Muhammad; Nadeem, Muhammad Azhar; Aktaş, Husnu; Özkan, Hakan; Hatipoğlu, Rüştü
Until now, little attention has been paid to the geographic distribution and evaluation of genetic diversity of durum wheat from the Central Fertile Crescent (modern-day Turkey and Syria). Turkey and Syria are considered primary centers of wheat diversity, and thousands of locally adapted wheat landraces are still present in farmers' small fields. We planned this study to evaluate the genetic diversity of durum wheat landraces from the Central Fertile Crescent by genotyping based on DArTseq and SNP analysis. A total of 39,568 DArTseq and 20,661 SNP markers were used to characterize the genetic characteristics of 91 durum wheat landraces. Clustering based on neighbor-joining analysis, principal coordinate analysis, and the Bayesian model implemented in STRUCTURE clearly showed that the grouping pattern is not associated with the geographical distribution of the durum wheat, owing to the mixing of the Turkish and Syrian landraces. A significant correlation between DArTseq and SNP markers was observed in the Mantel test. However, we detected a non-significant relationship between geographical coordinates and DArTseq (r = -0.085) and SNP (r = -0.039) loci. These results suggest that unconscious farmer selection and the lack of commercial varieties might have resulted in the exchange of genetic material, and this was apparent in the genetic structure of durum wheat in Turkey and Syria. The genomic characterization presented here is an essential step towards future exploitation of the available durum wheat genetic resources in genomic and breeding programs. The results of this study also provide clear insight into the genetic diversity of wheat accessions from the Central Fertile Crescent. PMID:28099442
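The Mantel test mentioned in this abstract compares two distance matrices computed over the same set of samples. The following is a minimal sketch of a permutation-based Mantel test, not the authors' implementation; the toy matrices, the number of permutations, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p) for the correlation between two distance matrices.

    The p-value is two-sided and obtained by jointly permuting the rows
    and columns of d2, which keeps the matrix a valid distance matrix.
    """
    rng = random.Random(seed)
    n = len(d1)
    flat1 = upper_triangle(d1)
    observed = pearson(flat1, upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(flat1, upper_triangle(permuted))) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four samples placed on a line, and a second
# matrix that is exactly twice the first (perfectly correlated).
d_geo = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
d_gen = [[2 * v for v in row] for row in d_geo]
r, p = mantel(d_geo, d_gen, permutations=199)
```

With only four samples the permutation distribution is very coarse, so the p-value from this toy input is not meaningful; real marker data sets such as the 91 landraces described here would give a far finer null distribution.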
... markers may be seen in conditions such as: osteoporosis; Paget disease; cancer that has spread to the bone (metastatic bone disease); hyperparathyroidism; hyperthyroidism; osteomalacia in adults and rickets in children (lack of bone mineralization), ...
Dato, S; Soerensen, M; Lagani, V; Montesanto, A; Passarino, G; Christensen, K; Tan, Q; Christiansen, L
Preservation of functional ability is a well-recognized marker of longevity. At a molecular level, a major determinant of the physiological decline occurring with aging is the imbalance between production and accumulation of oxidative damage to macromolecules, together with a decreased efficiency of stress response to avoid or repair such damage. In this paper we investigated the association of 38 genes (311 SNPs) belonging to the pro-antioxidant pathways with physical and cognitive performances, by analyzing single SNP and gene-based associations with Hand Grip strength (HG), Activities of Daily Living (ADL), Walking Speed (WS), Mini Mental State Examination (MMSE) and Composite Cognitive Score (CCS) in a Cohort of 1089 Danish nonagenarians. Moreover, for each gene analyzed in the pro-antioxidant pathway, we tested the influence on longitudinal survival. In the whole sample, nominal associations were found for TXNRD1 variability with ADL and WS, NDUFS1 and UCP3 with HG and WS, GCLC and UCP2 with WS (p<0.05). Stronger associations although not holding the multiple comparison correction, were observed between MMSE and NDUFV1, MT1A and GSTP1 variability (p<0.009). Moreover, we found that association between genetic variability in the pro-antioxidant pathway and functional status at old age is influenced by sex. In particular, most significant associations were observed in nonagenarian females, between HG scores and GLRX and UCP3 variability, between ADL levels and TXNRD1, MMSE and MT1A genetic variability. In males, a borderline statistically significant association with ADL level was found for UQCRFS1 gene. Nominally significant associations in relation to survival were found in the female sample only with SOD2, NDUFS1, UCP3 and TXNRD1 variability, the latter two confirming previous observations reported in the same cohort. Overall, our work supports the evidence that genes belonging to the pro-anti-oxidant pathway are able to modulate physical and cognitive
C-reactive protein (CRP) is an established marker of inflammation with pattern-recognition receptor-like activities. Despite the close association of the serum level of CRP with the risk and prognosis of several types of cancer, it remains elusive whether CRP contributes directly to tumorigenesis or just represents a bystander marker. We have recently identified recurrent mutations at the SNP position -286 (rs3091244) in the promoter of the CRP gene in several tumor types, instead suggesting that locally produced CRP is a potential driver of tumorigenesis. However, it is unknown whether the -286 site is the sole SNP position of the CRP gene targeted for mutation and whether there is any association between CRP SNP mutations and other frequently mutated genes in tumors. Herein, we have examined the genotypes of three common CRP non-coding SNPs (rs7553007, rs1205, rs3093077) in tumor/normal sample pairs of 5 cancer types (n = 141). No recurrent somatic mutations are found at these SNP positions, indicating that the -286 SNP mutations are preferentially selected during the development of cancer. Further analysis reveals that the -286 SNP mutations of CRP tend to co-occur with mutated APC, particularly in rectal cancer (p = 0.04; n = 67). By contrast, mutations of CRP and p53 or K-ras appear to be unrelated. These results thus underscore the functional importance of the -286 mutation of CRP in tumorigenesis and imply an interaction between CRP and the Wnt signaling pathway.
Antonio M Ramos
BACKGROUND: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. METHODOLOGY/PRINCIPAL FINDINGS: A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic, yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. CONCLUSIONS/SIGNIFICANCE: Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs.
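The conversion success rate quoted in this abstract can be checked directly from the reported counts, and the minor allele frequency (MAF) of a biallelic SNP follows from its genotype counts. The sketch below is illustrative only: the function names and the genotype counts in the MAF example are assumptions, while 58,994 and 62,621 are the figures given in the abstract.

```python
def conversion_rate(polymorphic_loci, scorable_loci):
    """Fraction of reliably scored loci that turned out to be polymorphic."""
    return polymorphic_loci / scorable_loci


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from counts of AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


# Counts reported in the abstract.
rate = conversion_rate(58_994, 62_621)

# Hypothetical genotype counts for a single SNP (not from the study).
maf_example = minor_allele_frequency(n_aa=10, n_ab=20, n_bb=70)
```

Rounded to two decimals, `rate` agrees with the 94% stated in the abstract.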
Chen, Jin-Bor; Chuang, Li-Yeh; Lin, Yu-Da; Liou, Chia-Wei; Lin, Tsu-Kung; Lee, Wen-Chin; Cheng, Ben-Chung; Chang, Hsueh-Wei; Yang, Cheng-Hong
Single nucleotide polymorphism (SNP) interaction analysis can simultaneously evaluate the complex SNP interactions present in complex diseases. However, it is less commonly applied to evaluate the predisposition of chronic dialysis and its computational analysis remains challenging. In this study, we aimed to improve the analysis of SNP-SNP interactions within the mitochondrial D-loop in chronic dialysis. The SNP-SNP interactions between 77 reported SNPs within the mitochondrial D-loop in chronic dialysis study were evaluated in terms of SNP barcodes (different SNP combinations with their corresponding genotypes). We propose a genetic algorithm (GA) to generate SNP barcodes. The χ² values were then calculated by the occurrences of the specific SNP barcodes and their non-specific combinations between cases and controls. Each SNP barcode (2- to 7-SNP) with the highest value in the χ² test was regarded as the best SNP barcode (11.304 to 23.310; p algorithm to address the SNP-SNP interactions and demonstrated that many non-significant SNPs within the mitochondrial D-loop may play a role in jointed effects to chronic dialysis susceptibility.
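The scoring step described in this abstract, a χ² value computed from how often a given SNP barcode occurs in cases versus controls, can be sketched for a single candidate barcode as a 2×2 contingency test. This is only an illustration under stated assumptions: the SNP names, genotypes, and sample records are invented, the genetic algorithm that proposes barcodes is not shown, and the abstract does not say whether a continuity correction was applied (none is used here).

```python
def matches(genotypes, barcode):
    """True if an individual's genotypes agree with every SNP in the barcode."""
    return all(genotypes.get(snp) == g for snp, g in barcode.items())


def contingency(cases, controls, barcode):
    """Return (cases with, cases without, controls with, controls without)."""
    a = sum(matches(g, barcode) for g in cases)
    c = sum(matches(g, barcode) for g in controls)
    return a, len(cases) - a, c, len(controls) - c


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table, no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical two-SNP barcode and toy genotype records.
barcode = {"snp1": "AA", "snp2": "CT"}
cases = [
    {"snp1": "AA", "snp2": "CT"},
    {"snp1": "AA", "snp2": "CT"},
    {"snp1": "AG", "snp2": "CT"},
    {"snp1": "AA", "snp2": "CC"},
]
controls = [
    {"snp1": "AA", "snp2": "CT"},
    {"snp1": "AG", "snp2": "CC"},
    {"snp1": "GG", "snp2": "CT"},
    {"snp1": "AG", "snp2": "TT"},
]
table = contingency(cases, controls, barcode)
chi2 = chi_square_2x2(*table)
```

A search procedure such as the genetic algorithm the authors propose would call a scoring function like `chi_square_2x2` repeatedly while varying the barcode, keeping the highest-scoring combination for each barcode size.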
Mehta, Bhavik; Daniel, Runa; Phillips, Chris; McNevin, Dennis
Short tandem repeats are the gold standard for human identification but are not informative for forensic DNA phenotyping (FDP). Single-nucleotide polymorphisms (SNPs) as genetic markers can be applied to both identification and FDP. The concept of DNA intelligence emerged with the potential for SNPs to infer biogeographical ancestry (BGA) and externally visible characteristics (EVCs), which together enable the FDP process. For more than a decade, the SNaPshot® technique has been utilised to analyse identity and FDP-associated SNPs in forensic DNA analysis. SNaPshot is a single-base extension (SBE) assay with capillary electrophoresis as its detection system. This multiplexing technique offers the advantage of easy integration into operational forensic laboratories without the requirement for any additional equipment. Further, the SNP panels from SNaPshot® assays can be incorporated into customised panels for massively parallel sequencing (MPS). Many SNaPshot® assays are available for identity, BGA and EVC profiling with examples including the well-known SNPforID 52-plex identity assay, the SNPforID 34-plex BGA assay and the HIrisPlex EVC assay. This review lists the major forensically relevant SNaPshot® assays for human DNA SNP analysis and can be used as a guide for selecting the appropriate assay for specific identity and FDP applications.
James W Kijas
The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.
Brooks, Ashley; Creighton, Erica K; Gandolfi, Barbara; Khan, Razib; Grahn, Robert A; Lyons, Leslie A
Phenotypic and genotypic characteristics of the cat can be obtained from single nucleotide polymorphisms (SNPs) analyses of fur. This study developed miniplexes using SNPs with high discriminating power for random-bred domestic cats, focusing on individual and phenotypic identification. Seventy-eight SNPs were investigated using a multiplex PCR followed by a fluorescently labeled single base extension (SBE) technique (SNaPshot®). The SNP miniplexes were evaluated for reliability, reproducibility, sensitivity, species specificity, detection limitations, and assignment accuracy. Six SNPplexes were developed containing 39 intergenic SNPs and 26 phenotypic SNPs, including a sex identification marker, ZFXY. The combined random match probability (cRMP) was 6.58 × 10⁻¹⁹ across all Western cat populations and the likelihood ratio was 1.52 × 10¹⁸. These SNPplexes can distinguish individual cats and their phenotypic traits, which could provide insight into crime reconstructions. A SNP database of 237 cats from 13 worldwide populations is now available for forensic applications.
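A combined random match probability of the kind reported here is conventionally obtained by multiplying per-locus genotype match probabilities. The sketch below assumes Hardy-Weinberg proportions at each biallelic SNP and independence between loci; the abstract does not state which formula the authors used, and the allele frequencies in the example are invented.

```python
from math import prod


def snp_match_probability(p):
    """Chance that two random individuals share a genotype at a biallelic SNP.

    Assumes Hardy-Weinberg proportions with allele frequencies p and 1 - p:
    the sum of squared genotype frequencies for AA, AB and BB.
    """
    q = 1 - p
    return p ** 4 + (2 * p * q) ** 2 + q ** 4


def combined_rmp(allele_frequencies):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(snp_match_probability(p) for p in allele_frequencies)


# Hypothetical panel of two SNPs, each with allele frequency 0.5.
crmp_example = combined_rmp([0.5, 0.5])

# Reciprocal of the cRMP value reported in the abstract.
reported_lr = 1 / 6.58e-19
```

The reciprocal of the reported cRMP comes out close to the reported likelihood ratio of 1.52 × 10¹⁸, which suggests the two figures are related in this way, although the abstract does not say so explicitly.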
Liu, Jun-Jun; Sniezko, Richard A; Zamany, Arezoo; Williams, Holly; Wang, Ning; Kegley, Angelia; Savin, Douglas P; Chen, Hao; Sturrock, Rona N
Molecular breeding incorporates efficient tools to increase rust resistance in five-needle pines. Susceptibility of native five-needle pines to white pine blister rust (WPBR), caused by the non-native invasive fungus Cronartium ribicola (J.C. Fisch.), has significantly reduced wild populations of these conifers in North America. Major resistance (R) genes against specific avirulent pathotypes have been found in several five-needle pine species. In this study, we screened genic SNP markers by comparative transcriptome and genetic association analyses and constructed saturated linkage maps for the western white pine (Pinus monticola) R locus (Cr2). Phenotypic segregation was measured by a hypersensitive reaction (HR)-like response on the needles and disease symptoms of cankered stems post inoculation by the C. ribicola avcr2 race. SNP genotypes were determined by HRM- and TaqMan-based SNP genotyping. Saturated maps of the Cr2-linkage group (LG) were constructed in three seed families using a total of 34 SNP markers within 21 unique genes. Cr2 was consistently flanked by contig_2142 (encoding a ruvb-like protein) and contig_3772 (encoding a delta-fatty acid desaturase) across the three seed families. Cr2 was anchored to the Pinus consensus LG-1, which differs from LGs where other R loci of Pinus species were mapped. GO annotation identified a set of NBS-LRR and other resistance-related genes as R candidates in the Cr2 region. Association of one nonsynonymous SNP locus of an NBS-LRR gene with Cr2-mediated phenotypes provides a valuable tool for marker-assisted selection (MAS), which will shorten the breeding cycle of resistance screening and aid in the restoration of WPBR-disturbed forest ecosystems.
Background. The recent development of new high-throughput technologies for SNP genotyping has opened the possibility of taking a genome-wide linkage approach to the search for new candidate genes involved in hereditary diseases. The two major breast cancer susceptibility genes BRCA1 and BRCA2 are involved in 30% of hereditary breast cancer cases, but the discovery of additional breast cancer predisposition genes for the non-BRCA1/2 breast cancer families has so far been unsuccessful. Results. In order to evaluate the power improvement provided by using SNP markers in a real situation, we have performed a whole genome screen of 19 non-BRCA1/2 breast cancer families using 4720 genomewide SNPs with Illumina technology (Illumina's Linkage III Panel), with an average distance of 615 Kb/SNP. We identified six regions on chromosomes 2, 3, 4, 7, 11 and 14 as candidates to contain genes involved in breast cancer susceptibility, and additional fine mapping genotyping using microsatellite markers around linkage peaks confirmed five of them, excluding the region on chromosome 3. These results were consistent in analyses that excluded SNPs in high linkage disequilibrium. The results were compared with those obtained previously using a 10 cM microsatellite scan (STR-GWS), and we found lower or non-significant linkage signals with STR-GWS data compared to SNP data in all cases. Conclusion. Our results show the power increase that SNPs can supply in linkage studies.
Albrechtsen, Anders; Nielsen, Finn Cilius; Nielsen, Rasmus
Chip-based high-throughput genotyping has facilitated genome-wide studies of genetic diversity. Many studies have utilized these large data sets to make inferences about the demographic history of human populations using measures of genetic differentiation such as F_ST or principal component analyses. However, the single nucleotide polymorphism (SNP) chip data suffer from ascertainment biases caused by the SNP discovery process in which a small number of individuals from selected populations are used as discovery panels. In this study, we investigate the effect of the ascertainment bias on inferences regarding genetic differentiation among populations in one of the common genome-wide genotyping platforms. We generate SNP genotyping data for individuals that previously have been subject to partial genome-wide Sanger sequencing and compare inferences based on genotyping data to inferences based...
Fadista, João; Bendixen, Christian
The field of genetics has come to rely heavily on commercial genotyping arrays and accompanying annotations for insights into genotype-phenotype associations. However, in order to avoid errors and false leads, it is imperative that the annotation of SNP chromosomal positions is accurate and unambiguous. We report on genomic positional discrepancies of various SNP chips for human, cattle and mouse species, and discuss their causes and consequences.
Cornelis, Senne; Gansemans, Yannick; Deleye, Lieselot; Deforce, Dieter; Van Nieuwerburgh, Filip
One of the latest developments in next generation sequencing is the Oxford Nanopore Technologies’ (ONT) MinION nanopore sequencer. We studied the applicability of this system to perform forensic genotyping of the forensic female DNA standard 9947 A using the 52 SNP-plex assay developed by the SNPforID consortium. All but one of the loci were correctly genotyped. Several SNP loci were identified as problematic for correct and robust genotyping using nanopore sequencing. All these loci contained homopolymers in the sequence flanking the forensic SNP and most of them were already reported as problematic in studies using other sequencing technologies. When these problematic loci are avoided, correct forensic genotyping using nanopore sequencing is technically feasible. PMID:28155888
Engelsma, K A; Veerkamp, R F; Calus, M P L; Bijma, P; Windig, J J
Genetic diversity is often evaluated using pedigree information. Currently, diversity can be evaluated in more detail over the genome based on large numbers of SNP markers. Pedigree- and SNP-based diversity were compared for two small related groups of Holstein animals genotyped with the 50 k SNP chip, genome-wide, per chromosome and for part of the genome examined. Diversity was estimated with coefficient of kinship (pedigree) and expected heterozygosity (SNP). SNP-based diversity at chromosome regions was determined using 5-Mb sliding windows, and significance of difference between groups was determined by bootstrapping. Both pedigree- and SNP-based diversity indicated more diversity in one of the groups; 26 of the 30 chromosomes showed significantly more diversity for the same group, as did 25.9% of the chromosome regions. Even in small populations that are genetically close, differences in diversity can be detected. Pedigree- and SNP-based diversity give comparable differences, but SNP-based diversity shows on which chromosome regions these differences are based. For maintaining diversity in a gene bank, SNP-based diversity gives a more detailed picture than pedigree-based diversity. © 2012 Blackwell Verlag GmbH.
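The SNP-based measure used in this abstract, expected heterozygosity averaged within 5-Mb sliding windows, can be sketched as follows. The 1-Mb step size, the input layout, and the example positions and frequencies are assumptions (the abstract gives only the window width), and the bootstrap comparison between groups is not shown.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2 * p * (1 - p)


def window_diversity(snps, window=5_000_000, step=1_000_000):
    """Mean expected heterozygosity per sliding window on one chromosome.

    snps: list of (position_bp, allele_frequency) tuples sorted by position.
    Returns a list of (window_start_bp, mean_heterozygosity); windows that
    contain no SNPs are skipped.
    """
    if not snps:
        return []
    last_position = snps[-1][0]
    results = []
    start = 0
    while start <= last_position:
        end = start + window
        values = [expected_heterozygosity(p) for pos, p in snps if start <= pos < end]
        if values:
            results.append((start, sum(values) / len(values)))
        start += step
    return results


# Hypothetical SNPs on a single chromosome: (position in bp, allele frequency).
toy_snps = [(500_000, 0.5), (2_000_000, 0.25), (6_000_000, 0.5)]
windows = window_diversity(toy_snps)
```

Running the same function on each group of animals and comparing the per-window values is one way to see which chromosome regions drive an overall difference in diversity, which is the advantage over pedigree-based kinship that the abstract highlights.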
Supakankul, Pantaporn; Kumchoo, Tanavadee; Mekchay, Supamit
This study was conducted to identify and evaluate the effective single nucleotide polymorphism (SNP) markers for fat deposition in the longissimus dorsi muscles of pigs using the amplified fragment length polymorphism (AFLP) approach. Sixty-four selective primer combinations were used to identify the AFLP markers in the 20 highest- and 20 lowest-intramuscular fat (IMF) content phenotypes. Five AFLP fragments were converted into simple codominant SNP markers. These SNP markers were tested in terms of their association with IMF content and fatty acid (FA) composition traits in 620 commercially crossbred pigs. The SSC7 g.4937240C>G marker showed an association with IMF content (pIMF content and arachidonic levels (pA marker revealed an association with palmitoleic and ω9 FA levels (pT marker showed a significant association with IMF content and FA levels of palmitoleic, eicosenoic, arachidonic, monounsaturated fatty acids, and ω9 FA levels. However, no significant association of SSC8 g.47338181G>A was observed with any IMF and FA levels in this study. Four SNP markers (SSC7 g.4937240C>G, SSC9 g.5496647_5496662insdel, SSC10 g.71225134G>A, and SSC17 g.61976696G>T) were found to be associated with IMF and/or FA content traits in commercially crossbred pigs. These findings provide evidence of the novel SNP markers as being potentially useful for selecting pigs with the desirable IMF content and FA composition.
Talukder, Zahirul I; Seiler, Gerald J; Song, Qijian; Ma, Guojia; Qi, Lili
Basal stalk rot (BSR), caused by the ascomycete fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a serious disease of sunflower (Helianthus annuus L.) in the cool and humid production areas of the world. Quantitative trait loci (QTL) for BSR resistance were identified in a sunflower recombinant inbred line (RIL) population derived from the cross HA 441 × RHA 439. A genotyping-by-sequencing (GBS) approach was adapted to discover single nucleotide polymorphism (SNP) markers. A genetic linkage map was developed comprising 1053 SNP markers on 17 linkage groups (LGs) spanning 1401.36 cM. The RILs were tested in five environments (locations and years) for resistance to BSR. Quantitative trait loci were identified in each environment separately and also with integrated data across environments. A total of six QTL were identified in all five environments: one of each on LGs 4, 9, 10, 11, 16, and 17. The most significant QTL, and , were identified at multiple environments on LGs 10 and 17, explaining 31.6 and 20.2% of the observed phenotypic variance, respectively. The remaining four QTL, , , , and , were detected in only one environment on LGs 4, 9, 11, and 16, respectively. Each of these QTL explains between 6.4 and 10.5% of the observed phenotypic variation in the RIL population. Alleles conferring increased resistance were contributed by both parents. The potential of the and in marker-assisted selection (MAS) breeding are discussed. Copyright © 2016 Crop Science Society of America.
Zahirul I. Talukder
Basal stalk rot (BSR), caused by the ascomycete fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a serious disease of sunflower (Helianthus annuus L.) in the cool and humid production areas of the world. Quantitative trait loci (QTL) for BSR resistance were identified in a sunflower recombinant inbred line (RIL) population derived from the cross HA 441 × RHA 439. A genotyping-by-sequencing (GBS) approach was adapted to discover single nucleotide polymorphism (SNP) markers. A genetic linkage map was developed comprising 1053 SNP markers on 17 linkage groups (LGs) spanning 1401.36 cM. The RILs were tested in five environments (locations and years) for resistance to BSR. Quantitative trait loci were identified in each environment separately and also with integrated data across environments. A total of six QTL were identified in all five environments: one of each on LGs 4, 9, 10, 11, 16, and 17. The most significant QTL, and , were identified at multiple environments on LGs 10 and 17, explaining 31.6 and 20.2% of the observed phenotypic variance, respectively. The remaining four QTL, , , , and , were detected in only one environment on LGs 4, 9, 11, and 16, respectively. Each of these QTL explains between 6.4 and 10.5% of the observed phenotypic variation in the RIL population. Alleles conferring increased resistance were contributed by both parents. The potential of the and in marker-assisted selection (MAS) breeding are discussed.
Kalogianni, Despina P; Bazakos, Christos; Boutsika, Lemonia M; Targem, Mehdi Ben; Christopoulos, Theodore K; Kalaitzis, Panagiotis; Ioannou, Penelope C
Olive oil cultivar verification is of primary importance for the competitiveness of the product and the protection of consumers and producers from fraudulence. Single-nucleotide polymorphisms (SNPs) have emerged as excellent DNA markers for authenticity testing. This paper reports the first multiplex SNP genotyping assay for olive oil cultivar identification that is performed on a suspension of fluorescence-encoded microspheres. Up to 100 sets of microspheres, with unique "fluorescence signatures", are available. Allele discrimination was accomplished by primer extension reaction. The reaction products were captured via hybridization on the microspheres and analyzed, within seconds, by a flow cytometer. The "fluorescence signature" of each microsphere is assigned to a specific allele, whereas the signal from a reporter fluorophore denotes the presence of the allele. As a model, a panel of three SNPs was chosen that enabled identification of five common Greek olive cultivars (Adramytini, Chondrolia Chalkidikis, Kalamon, Koroneiki, and Valanolia).
Chekanov, N N; Boulygina, E S; Beletskiy, A V; Prokhortchouk, E B; Skryabin, K G
A somatic cell genome was recently resequenced for a patient with renal cancer. The data were submitted to the NCBI Sequence Read Archive under the accession number SRA012240. Here, we have performed SNP calling for the genome and compared it with several published genomes. We have found 2,921,724 SNPs, including 1,472,679 newly described ones. Among them, 63,462 SNPs have been mapped to the Y chromosome and, based on 18 markers, the genome has been ascribed to the R1a1a haplogroup predominant in Russian males. The mitochondrial haplogroup has been determined as U5a, which is also common in the European part of Russia. Short reads unmapped to the human genome were used for the de novo assembly of DNA sequences. This resulted in genome-specific contigs (more than 100 bp in length) with an overall length of 154 kbp (for GAII) and 4.7 kbp (for SOLiD).
Badano, I; Schurr, T G; Stietz, S M; Dulik, M C; Mampaey, M; Quintero, I M; Zinovich, J B; Campos, R H; Liotta, D J
The aim of this study is to describe genetic variation in the TNF promoter in the ethnically diverse population of Misiones, north-eastern Argentina. We analysed 210 women including 66 Amerindians of the Mbya-Guarani ethnic group and 144 white-admixed individuals from urban and rural areas of Misiones. Their DNA samples were surveyed for TNF polymorphisms -376 A/G, -308 A/G, -244 A/G and -238 A/G by PCR amplification and direct sequencing and for the Amerindian marker -857 C/T by real-time PCR. Our main findings are as follows: (i) a distinctive pattern of single nucleotide polymorphism (SNP) distribution among these groups, (ii) genetic differentiation between the Mbya-Guarani and the white-admixed populations (P Misiones.
Yun Joo Yoo
Multi-marker methods for genetic association analysis can be performed for common and low frequency SNPs to improve power. Regression models are an intuitive way to formulate multi-marker tests. In previous studies we evaluated regression-based multi-marker tests for common SNPs, and through identification of bins consisting of correlated SNPs, developed a multi-bin linear combination (MLC) test that is a compromise between a 1df linear combination test and a multi-df global test. Bins of SNPs in high linkage disequilibrium (LD) are identified, and a linear combination of individual SNP statistics is constructed within each bin. Then association with the phenotype is represented by an overall statistic with df as many or few as the number of bins. In this report we evaluate multi-marker tests for SNPs that occur at low frequencies. There are many linear and quadratic multi-marker tests that are suitable for common or low frequency variant analysis. We compared the performance of the MLC tests with various linear and quadratic statistics in joint or marginal regressions. For these comparisons, we performed a simulation study of genotypes and quantitative traits for 85 genes with many low frequency SNPs based on HapMap Phase III. We compared the tests using (1) the set of all SNPs in a gene, (2) the set of common SNPs in a gene (MAF≥5%), (3) the set of low frequency SNPs (1%≤MAF
Ribas, Gloria; González-Neira, Anna; Salas, Antonio; Milne, Roger L; Vega, Ana; Carracedo, Begoña; González, Emilio; Barroso, Eva; Fernández, Lara P; Yankilevich, Patricio; Robledo, Mercedes; Carracedo, Angel; Benítez, Javier
One of the many potential uses of the HapMap project is its application to the investigation of complex disease aetiology among a wide range of populations. This study aims to assess the transferability of HapMap SNP data to the Spanish population in the context of cancer research. We have carried out a genotyping study in Spanish subjects involving 175 candidate cancer genes using an indirect gene-based approach and compared results with those for HapMap CEU subjects. Allele frequencies were very consistent between the two samples, with a high positive correlation (R) of 0.91 (PHapMap CEU data using pairwise r² thresholds of 0.8 and 0.5 was assessed by applying these to the Spanish and current HapMap data for 66 genes. In general, the HapMap tagSNPs performed very well. Our results show generally high concordance with HapMap data in allele frequencies and haplotype distributions and confirm the applicability of HapMap SNP data to the study of complex diseases among the Spanish population.
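The pairwise r² thresholds mentioned above (0.8 and 0.5) refer to the standard linkage disequilibrium measure between two biallelic SNPs. The following is a minimal sketch of that calculation from phased haplotypes, not the study's own pipeline; the function name, the 0/1 allele coding and the toy haplotypes are assumptions made for illustration.

```python
def pairwise_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    each allele coded 0 or 1 (hypothetical coding for this sketch).
    Returns None if either SNP is monomorphic, since r^2 is undefined.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


def is_tagged(haplotypes, threshold=0.8):
    """True if SNP B would be considered tagged by SNP A at the given r^2 threshold."""
    r2 = pairwise_r2(haplotypes)
    return r2 is not None and r2 >= threshold


# Toy data: the two SNPs are perfectly correlated in the first set
# and statistically independent in the second.
perfect_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
```

With a threshold of 0.8, a tag SNP stands in for any SNP whose r² with it is at least 0.8; lowering the threshold to 0.5 needs fewer tags at the cost of weaker proxies.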
Schneider, J F; Rempel, L A; Snelling, W M; Wiedmann, R T; Nonneman, D J; Rohrer, G A
Reproductive efficiency has a great impact on the economic success of pork (Sus scrofa) production. Number born alive (NBA) and average piglet birth weight (ABW) contribute greatly to reproductive efficiency. To better understand the underlying genetics of birth traits, a genome-wide association study (GWAS) was undertaken. Samples of DNA were collected from 1,152 first parity gilts and tested using the Illumina PorcineSNP60 BeadChip. Traits included total number born (TNB), NBA, number born dead (NBD), number stillborn (NSB), number of mummies (MUM), total litter birth weight (LBW), and ABW. A total of 41,151 SNP were tested using a Bayesian approach. Beginning with the first 5 SNP on SSC1 and ending with the last 5 SNP on SSCX, SNP were assigned to groups of 5 consecutive SNP by chromosome-position order and analyzed again using a Bayesian approach. From that analysis, 5-SNP groups were selected that had no overlap with another 5-SNP group and no overlap across chromosomes. These selected non-overlapping 5-SNP groups were defined as QTL. Of the available 8,814 QTL, 124 were found to be statistically significant (P ABW, 9 on SSC1, 3 on SSC2, 9 on SSC5, 5 on SSC6, 1 on SSC7, 2 on SSC8, 2 on SSC9, 3 on SSC10, 1 on SSC11, 3 on SSC12, 2 on SSC13, 8 on SSC14, 8 on SSC15, 1 on SSC17, and 8 on SSC18. Several candidate genes have been identified that overlap QTL locations among TNB, NBA, NBD, and ABW. These QTL, when combined with information on genes found in the same regions, should provide useful information that could be used for marker assisted selection, marker assisted management, or genomic selection applications in commercial pig populations.
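The grouping step described above, consecutive SNPs assigned to 5-SNP groups in chromosome-position order with no group crossing a chromosome boundary, can be sketched as follows. This is a simplified illustration that cuts each chromosome directly into non-overlapping blocks and drops any incomplete trailing block; it is an assumption, not the authors' selection procedure, and the function name and toy SNP identifiers are hypothetical.

```python
from collections import defaultdict


def non_overlapping_groups(snps, size=5):
    """Split SNPs into consecutive, non-overlapping groups per chromosome.

    `snps` is an iterable of (chromosome, position, snp_id) tuples.
    Groups never span two chromosomes; a trailing block shorter than
    `size` is discarded (a simplifying assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, pos, snp_id in snps:
        by_chrom[chrom].append((pos, snp_id))

    groups = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])            # order by position
        for start in range(0, len(ordered) - size + 1, size):
            block = ordered[start:start + size]
            groups.append((chrom, [snp_id for _, snp_id in block]))
    return groups


# Toy input: 12 SNPs on SSC1 and 5 SNPs on SSC2.
toy_snps = [("SSC1", p, f"a{p}") for p in range(1, 13)]
toy_snps += [("SSC2", p, f"b{p}") for p in range(1, 6)]
toy_groups = non_overlapping_groups(toy_snps)
```

Each resulting group would then be passed to the association model as a single unit in place of an individual SNP.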
Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua
Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large-scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty non-synonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those among the non-synonymous nucleotides, suggesting negative selection. The non-synonymous to synonymous substitution (Ka/Ks) ratio ranges from 0 to 4.01 (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
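The split of predicted variants into transitions, transversions and indels follows from the chemistry of the bases: a transition swaps purine for purine (A/G) or pyrimidine for pyrimidine (C/T), and any other single-base substitution is a transversion. A minimal sketch of that classification is given below; the function names and the use of "-" for a gap are assumptions of this example, and it is not the pipeline used in the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_variant(ref, alt):
    """Classify a variant as 'transition', 'transversion' or 'indel'.

    Alleles of unequal or multi-base length, or a '-' gap character
    (assumed gap symbol), are treated as indels.
    """
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != 1 or len(alt) != 1 or "-" in (ref, alt):
        return "indel"
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def ts_tv_ratio(variants):
    """Return the transition/transversion ratio for (ref, alt) pairs, or None."""
    counts = Counter(classify_variant(ref, alt) for ref, alt in variants)
    if counts["transversion"] == 0:
        return None
    return counts["transition"] / counts["transversion"]


toy_variants = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C"), ("G", "T"), ("A", "-")]
toy_ratio = ts_tv_ratio(toy_variants)
```

Applied to the counts reported in the abstract, the same ratio is 9 546 / 5 124, or roughly 1.86 transitions per transversion.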
Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M
The advances in, and decreasing cost of, whole genome sequencing (WGS) will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic trees from WGS data. Here we introduce snpTree, a server for online, automatic SNP analysis. This tool is composed of different SNP analysis suites and Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS data as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA, while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on their position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a Perl script. The online server was implemented with HTML, Java and Python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the last case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy-to-use option for rapid, standardised and automatic SNP analysis in epidemiological studies, also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
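The concatenation step described above can be illustrated with a short sketch: every position where at least one isolate differs from the reference becomes one column of a SNP alignment, and isolates without a call at that position receive the reference base. This is an assumed, simplified reading of the step and not snpTree's actual code; the function names, the 0-based coordinates and the toy sequences are hypothetical.

```python
def concatenate_snps(reference, calls):
    """Build a SNP-only alignment ordered by reference position.

    `reference` is the reference sequence as a string (0-based positions).
    `calls` maps isolate name -> {position: variant base}.
    Isolates with no call at a variable position get the reference base
    (an assumption; real pipelines may also mask low-coverage sites).
    """
    positions = sorted({pos for variants in calls.values() for pos in variants})
    alignment = {
        isolate: "".join(variants.get(pos, reference[pos]) for pos in positions)
        for isolate, variants in calls.items()
    }
    return positions, alignment


def to_fasta(alignment):
    """Serialise the alignment as FASTA text, one record per isolate."""
    return "".join(f">{name}\n{seq}\n" for name, seq in sorted(alignment.items()))


toy_reference = "ACGTACGTAC"
toy_calls = {"iso1": {2: "T"}, "iso2": {2: "T", 7: "C"}, "iso3": {}}
toy_positions, toy_alignment = concatenate_snps(toy_reference, toy_calls)
toy_fasta = to_fasta(toy_alignment)
```

The resulting FASTA text is the kind of multiple alignment that a tree-building program such as FastTree takes as input.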
Liu, Weiqiang; Zhang, Rui; Wei, Jun; Zhang, Huimin; Yu, Guojiu; Li, Zhihua; Chen, Min; Sun, Xiaofang
Imprinting disorders, such as Beckwith-Wiedemann syndrome (BWS), Prader-Willi syndrome (PWS) and Angelman syndrome (AS), can be detected via methylation analysis, methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA), or other methods. In this study, we applied single nucleotide polymorphism (SNP)-based chromosomal microarray analysis to detect copy number variations (CNVs) and uniparental disomy (UPD) events in patients with suspected imprinting disorders. Of 4 patients, 2 had a 5.25-Mb microdeletion in the 15q11.2q13.2 region, 1 had a 38.4-Mb mosaic UPD in the 11p15.4 region, and 1 had a 60-Mb detectable UPD between regions 14q13.2 and 14q32.13. Although the 14q32.2 region was classified as normal by SNP array for the 14q13 UPD patient, it turned out to be a heterodisomic UPD by short tandem repeat marker analysis. MS-MLPA analysis was performed to validate the variations. In conclusion, SNP-based microarray is an efficient alternative method for quickly and precisely diagnosing PWS, AS, BWS, and other imprinted gene-associated disorders when considering aberrations due to CNVs and most types of UPD.
Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W.; Esko, Tonu; Pers, Tune H.; Locke, Adam E.; Berndt, Sonja I.; Justice, Anne E.; Kahali, Bratati; Siemelink, Marten A.; Pasterkamp, Gerard; Strachan, David P.; Speliotes, Elizabeth K.; North, Kari E.; Loos, Ruth J.F.; Hirschhorn, Joel N.; Pawitan, Yudi; Ingelsson, Erik
To date, genome-wide association studies (GWASs) have identified >100 loci with single variants associated with body mass index (BMI). This approach may miss loci with high allelic heterogeneity; therefore, the aim of the present study was to use gene-based meta-analysis to identify regions with high allelic heterogeneity to discover additional obesity susceptibility loci. We included GWAS data from 123 865 individuals of European descent from 46 cohorts in Stage 1 and Metabochip data from an additional 103 046 individuals from 43 cohorts in Stage 2, all within the Genetic Investigation of ANthropometric Traits (GIANT) consortium. Each cohort was tested for association between ∼2.4 million (Stage 1) or ∼200 000 (Stage 2) imputed or genotyped single variants and BMI, and summary statistics were subsequently meta-analyzed in 17 941 genes. We used the ‘VErsatile Gene-based Association Study’ (VEGAS) approach to assign variants to genes and to calculate gene-based P-values based on simulations. The VEGAS method was applied to each cohort separately before a gene-based meta-analysis was performed. In Stage 1, two known (FTO and TMEM18) and six novel (PEX2, MTFR2, SSFA2, IARS2, CEP295 and TXNDC12) loci were associated with BMI (P gene tests). We confirmed all loci, and six of them were gene-wide significant in Stage 2 alone. We provide biological support for the loci by pathway, expression and methylation analyses. Our results indicate that gene-based meta-analysis of GWAS provides a useful strategy to find loci of interest that were not identified in standard single-marker analyses due to high allelic heterogeneity. PMID:26376864
许家磊; 王宇; 后猛; 李强
As the most promising of the third-generation molecular markers, the single nucleotide polymorphism (SNP) has been widely used in genetic analysis in recent years. At present, SNP detection methods can be roughly divided into two types. One type comprises the classical detection methods based on gel electrophoresis, such as single-strand conformational polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), cleaved amplified polymorphic sequence (CAPS), and allele-specific PCR (AS-PCR). The other comprises high-throughput, automated detection methods, such as direct sequencing, DNA chips, denaturing high-performance liquid chromatography (DHPLC), mass spectrometry detection technology, and high resolution melting (HRM). This paper summarizes the principles and applications of the main detection methods and analyzes their advantages and disadvantages.
Abstract Background Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs, are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform, resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered, with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from
Duarte Delgado, Diana Lucia
A high content of sucrose (a non-reducing sugar) and of glucose and fructose (reducing sugars) in potato tubers is an undesirable trait for the frying industry, because reducing sugars lead to darkening of the fried potato and to the production of toxic compounds such as acrylamide, which reduce consumer acceptance and pose risks to human health. Genetic association analysis is a strategy for studying the molec...
Andersen, J R; Asp, T; Lu, Y C
Laccases, EC 1.10.3.2 or p-diphenol:dioxygen oxidoreductases, have been proposed to be involved in the oxidative polymerization of monolignols into lignins in plants. While 17 laccases have been identified in Arabidopsis, only five (ZmLac1-5) have so far been identified in maize. By a bioinform...
Pineapple (Ananas comosus [L.] Merr.) is the third most important tropical fruit in the world after banana and mango and a major agricultural commodity in Hawaii. As a crop with vegetative propagation, genetic redundancy is a major challenge for efficient genebank management and in breeding. Using E...
Bourke, Peter M.; Voorrips, Roeland E.; Kranenburg, Twan; Jansen, Hans; Visser, Richard G.F.; Maliepaard, Chris
Key message: Linkage mapping can help unravel the complexities of polyploid genomes. Here, we integrate haplotype-specific linkage maps in autotetraploid potato and explore the possibilities for mapping in other polyploid species. Abstract: High-density linkage mapping in autopolyploid species has
Henshall, John M; Dierens, Leanne; Sellars, Melony J
While much attention has focused on the development of high-density single nucleotide polymorphism (SNP) assays, the costs of developing and running low-density assays have fallen dramatically. This makes it feasible to develop and apply SNP assays for agricultural species beyond the major livestock species. Although low-cost low-density assays may not have the accuracy of the high-density assays widely used in human and livestock species, we show that when combined with statistical analysis approaches that use quantitative instead of discrete genotypes, their utility may be improved. The data used in this study are from a 63-SNP marker Sequenom® iPLEX Platinum panel for the Black Tiger shrimp, for which high-density SNP assays are not currently available. For quantitative genotypes that could be estimated, in 5% of cases the most likely genotype for an individual at a SNP had a probability of less than 0.99. Matrix formulations of maximum likelihood equations for parentage assignment were developed for the quantitative genotypes and also for discrete genotypes perturbed by an assumed error term. Assignment rates that were based on maximum likelihood with quantitative genotypes were similar to those based on maximum likelihood with perturbed genotypes but, for more than 50% of cases, the two methods resulted in individuals being assigned to different families. Treating genotypes as quantitative values allows the same analysis framework to be used for pooled samples of DNA from multiple individuals. Resulting correlations between allele frequency estimates from pooled DNA and individual samples were consistently greater than 0.90, and as high as 0.97 for some pools. Estimates of family contributions to the pools based on quantitative genotypes in pooled DNA had a correlation of 0.85 with estimates of contributions from DNA-derived pedigree. Even with low numbers of SNPs of variable quality, parentage testing and family assignment from pooled samples are
Nielsen, Rasmus; Williamson, Scott; Kim, Yuseob
of the selection coefficient. To illustrate the method, we apply our approach to data from the Seattle SNP project and to Chromosome 2 data from the HapMap project. In Chromosome 2, the most extreme signal is found in the lactase gene, which previously has been shown to be undergoing positive selection. Evidence...
Børsting, Claus; Sanchez Sanchez, Juan Jose; Morling, Niels
We describe a single nucleotide polymorphism (SNP) typing protocol developed for the NanoChip electronic microarray. The NanoChip array consists of 100 electrodes covered by a thin hydrogel layer containing streptavidin. An electric current can be applied to one, several, or all electrodes...
Andersen, Jeppe Dyrberg; Tvedebrink, Torben; Mogensen, Helle Smidt
-out of true alleles is possible. As part of the validation of the IrisPlex assay in our ISO17025 accredited, forensic genetic laboratory, we estimated the probability of drop-out of specific SNP alleles using 29 and 30 PCR cycles and 25, 50 and 100 Single Base Extension (SBE) cycles. We observed no drop...
Prokhorenko, Igor A.; Astakhova, Irina V.; Momynaliev, Kuvat T.
Excimer formation is a unique feature of some fluorescent dyes (e.g., pyrene) which can be used for probing the proximity of biomolecules. Pyrene excimer fluorescence has previously been used for homogeneous detection of single nucleotide polymorphism (SNP) on DNA. 1-Phenylethynylpyrene (1-1-PEPy...
Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Bellato, Cláudia M; Motilal, Lambert; Zhang, Dapeng
Cacao (Theobroma cacao L.), the source of cocoa, is an economically important tropical crop. One problem in the premium cacao market is the adulteration of raw premium material with off-types. Accurate determination of the genetic identity of single cacao beans is essential for ensuring cocoa authentication. Using nanofluidic single nucleotide polymorphism (SNP) genotyping with 48 SNP markers, we generated SNP fingerprints for small quantities of DNA extracted from the seed coat of single cacao beans. On the basis of the SNP profiles, we identified an assumed adulterant variety, which was unambiguously distinguished from the authentic beans by multilocus matching. Assignment tests based on both Bayesian clustering analysis and allele frequency clearly separated all 30 authentic samples from the non-authentic samples. Distance-based principal coordinate analysis further supported these results. The nanofluidic SNP protocol, together with forensic statistical tools, is sufficiently robust to establish authentication and to verify gourmet cacao varieties. This method shows significant potential for practical application.
Arnaiz-Villena, Antonio; Fernández-Honrado, Mercedes; Rey, Diego; Enríquez-de-Salamanca, Mercedes; Abd-El-Fatah-Khalil, Sedeka; Arribas, Ignacio; Coca, Carmen; Algora, Manuel; Areces, Cristina
Adiponectin gene polymorphisms SNP45 and SNP276 have been related to metabolic syndrome (MS) and related pathologies, including obesity. However, reported associations are contradictory and depend on which population is studied. In the present study, these adiponectin SNPs are studied for the first time in Amerindians. Allele frequencies are obtained, and comparisons with obesity and other MS-related parameters are performed. Amerindians were also defined by characteristic HLA genes. Our main results are: (1) SNP276 T is associated with low diastolic blood pressure in Amerindians, (2) the SNP45 G allele is correlated with obesity in female but not in male Amerindians, (3) the SNP45/SNP276 T/G haplotype in total obese/non-obese subjects tends to show a linkage with non-obese Amerindians, (4) the SNP45/SNP276 T/T haplotype is linked to obese Amerindian males. Also, a world population study is carried out, finding that the SNP45 T and SNP276 T alleles are most frequent in African Blacks and are found at significantly lower frequencies in Europeans and Asians. This, together with the linkage of this haplotype to obese Amerindian males, suggests that evolutionary forces related to famine (or population density in relation to available food) may have shaped world population adiponectin polymorphism frequencies.
Gaurav Kumar Srivastava; Nidhi Rajput; Kajal Kumar Jadav; Avadh Bihari Shrivastav; Himanshu R. Joshi
Aim: A partial fragment of the D-loop region extending from 35 to 770 was compared with the corresponding sequences of 16 wild pigs and 9 domestic pig breeds from different parts of the world to detect single nucleotide polymorphism (SNP) markers in the region. The paper also reappraises SNP markers from two fragments of the cytochrome b gene and a fragment of the 12S rRNA gene that distinguish the Indian wild pig from other pig species of the world. Materials and Methods: Deoxyribonucleic acid (DNA) was is...
Abstract Background The diploid Solanum caripense, a wild relative of potato and tomato, possesses valuable resistance to potato late blight, and we are interested in the genetic basis of this resistance. Due to extremely low levels of genetic variation within the S. caripense genome it proved impossible to generate a dense genetic map and to assign individual Solanum chromosomes through the use of conventional chromosome-specific SSR, RFLP, AFLP, as well as gene- or locus-specific markers. The ease of detection of DNA polymorphisms depends on both frequency and form of sequence variation. The narrow genetic background of close relatives and inbreds complicates the detection of persisting, reduced polymorphism and is a challenge to the development of reliable molecular markers. Nonetheless, monomorphic DNA fragments representing not directly usable conventional markers can contain considerable variation at the level of single nucleotide polymorphisms (SNPs). This can be used for the design of allele-specific molecular markers. The reproducible detection of allele-specific markers based on SNPs has been a technical challenge. Results We present a fast and cost-effective protocol for the detection of allele-specific SNPs by applying Sequence Polymorphism-Derived (SPD) markers. These markers proved highly efficient for fingerprinting of individuals possessing a homogeneous genetic background. SPD markers are obtained from within non-informative, conventional molecular marker fragments that are screened for SNPs to design allele-specific PCR primers. The method makes use of primers containing a single, 3'-terminal Locked Nucleic Acid (LNA) base. We demonstrate the applicability of the technique by successful genetic mapping of allele-specific SNP markers derived from monomorphic Conserved Ortholog Set II (COSII) markers mapped to Solanum chromosomes, in S. caripense. By using SPD markers it was possible for the first time to map the S. caripense alleles
Yin, Hao; Kanasty, Rosemary L; Eltoukhy, Ahmed A; Vegas, Arturo J; Dorkin, J Robert; Anderson, Daniel G
Gene-based therapy is the intentional modulation of gene expression in specific cells to treat pathological conditions. This modulation is accomplished by introducing exogenous nucleic acids such as DNA, mRNA, small interfering RNA (siRNA), microRNA (miRNA) or antisense oligonucleotides. Given the large size and the negative charge of these macromolecules, their delivery is typically mediated by carriers or vectors. In this Review, we introduce the biological barriers to gene delivery in vivo and discuss recent advances in material sciences, nanotechnology and nucleic acid chemistry that have yielded promising non-viral delivery systems, some of which are currently undergoing testing in clinical trials. The diversity of these systems highlights the recent progress of gene-based therapy using non-viral approaches.
Elizabeth T Cirulli
In recent years, genome and exome sequencing studies have implicated a plethora of new disease genes with rare causal variants. Here, I review 150 exome sequencing studies that claim to have discovered that a disease can be caused by different rare variants in the same gene, and I determine whether their methods followed the current best-practice guidelines in the interpretation of their data. Specifically, I assess whether studies appropriately assess controls for rare variants throughout the entire gene or implicated region as opposed to only investigating the specific rare variants identified in the cases, and I assess whether studies present sufficient co-segregation data for statistically significant linkage. I find that the proportion of studies performing gene-based analyses has increased with time, but that even in 2015 fewer than 40% of the reviewed studies used this method, and only 10% presented statistically significant co-segregation data. Furthermore, I find that the genes reported in these papers are explaining a decreasing proportion of cases as the field moves past most of the low-hanging fruit, with 50% of the genes from studies in 2014 and 2015 having variants in fewer than 5% of cases. As more studies focus on genes explaining relatively few cases, the importance of performing appropriate gene-based analyses is increasing. It is becoming increasingly important for journal editors and reviewers to require stringent gene-based evidence to avoid an avalanche of misleading disease gene discovery papers.
Sarup, Pernille Merete; Jensen, Just; Edwards, Stefan McKinnon
Genetic variance for complex traits in animal breeding is often estimated using linear mixed models that incorporate information from SNP markers through a realized genomic relationship matrix. In these models, individual genetic markers are weighted equally and the variation in the genome...... is treated as a “black box”. While this approach has proved useful in selecting animals with high genetic potential, it does not generate insight into the biological mechanisms underlying trait variation. We propose to build a linear mixed model approach to evaluate the collective effects of sets of SNPs...
Jing Fan; Jennifer G. Dy; Chung-Che Chang; Xiaobo Zhou
Myelodysplastic syndromes have increased in frequency and incidence in the American population, but patient prognosis has not significantly improved over the last decade. Such improvements could be realized if biomarkers for accurate diagnosis and prognostic stratification were successfully identified. In this study, we propose a method that associates two state-of-the-art array technologies, the single nucleotide polymorphism (SNP) array and the gene expression array, with gene motifs considered transcription factor-binding sites (TFBS). We are particularly interested in SNP-containing motifs introduced by genetic variation and mutation as TFBS. The potential regulation by SNP-containing motifs takes effect only when certain mutations occur. These motifs can be identified from a group of co-expressed genes with copy number variation. We then used a sliding window to identify motif candidates near SNPs on gene sequences. The candidates were filtered by coarse thresholding and fine statistical testing. Using the regression-based LARS-EN algorithm and a level-wise sequence combination procedure, we identified 28 SNP-containing motifs as candidate TFBS. We confirmed 21 of the 28 motifs with ChIP-chip fragments in the TRANSFAC database. Another six motifs were validated by TRANSFAC via searching binding fragments on coregulated genes. The identified motifs and the genes in which they are located can be considered potential biomarkers for myelodysplastic syndromes. Thus, our proposed method, a novel strategy for associating two data categories, is capable of integrating information from different sources to identify reliable candidate regulatory SNP-containing motifs introduced by genetic variation and mutation.
Khor, S-S; Yang, W; Kawashima, M; Kamitsuji, S; Zheng, X; Nishida, N; Sawai, H; Toyoda, H; Miyagawa, T; Honda, M; Kamatani, N; Tokunaga, K
Statistical imputation of classical human leukocyte antigen (HLA) alleles is becoming an indispensable tool for fine-mapping of disease association signals from case-control genome-wide association studies. However, most currently available HLA imputation tools are based on European reference populations and are not suitable for direct application to non-European populations. Among the HLA imputation tools, the HIBAG R package is a flexible tool that is equipped with a wide range of population-based classifiers; moreover, HIBAG R enables individual researchers to build custom classifiers. Here, two data sets of different sample sizes, each comprising data from healthy Japanese individuals, were used to build custom classifiers. HLA imputation accuracy in five HLA classes (HLA-A, HLA-B, HLA-DRB1, HLA-DQB1 and HLA-DPB1) increased from the 82.5-98.8% obtained with the original HIBAG references to 95.2-99.5% with our custom classifiers. A call threshold (CT) of 0.4 is recommended for our Japanese classifiers; in contrast, HIBAG references recommend a CT of 0.5. Finally, our classifiers could be used to identify the risk haplotypes for Japanese narcolepsy with cataplexy, HLA-DRB1*15:01 and HLA-DQB1*06:02, with 100% and 99.7% accuracy, respectively; therefore, these classifiers can be used to supplement the current lack of HLA genotyping data in widely available genome-wide association study data sets.
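The effect of a call threshold can be illustrated in a few lines: imputed calls whose posterior probability falls below the threshold are set aside, trading call rate for accuracy. The sketch below is a generic illustration under that assumption and does not use the HIBAG API; the function name, the tuple layout and the probabilities are hypothetical.

```python
def apply_call_threshold(predictions, call_threshold=0.4):
    """Keep imputed calls whose posterior probability meets the threshold.

    `predictions` is a list of (sample_id, imputed_allele, posterior_probability)
    tuples (hypothetical layout). Returns the retained calls and the call rate.
    """
    kept = [
        (sample_id, allele)
        for sample_id, allele, prob in predictions
        if prob >= call_threshold
    ]
    call_rate = len(kept) / len(predictions) if predictions else 0.0
    return kept, call_rate


# Illustrative values only, not results from the study.
toy_predictions = [
    ("s1", "HLA-DRB1*15:01", 0.93),
    ("s2", "HLA-DRB1*15:01", 0.45),
    ("s3", "HLA-DRB1*04:05", 0.31),
]
kept_04, rate_04 = apply_call_threshold(toy_predictions, 0.4)
kept_05, rate_05 = apply_call_threshold(toy_predictions, 0.5)
```

In this toy example the lower threshold of 0.4 retains one more call than 0.5, which is the kind of trade-off a population-specific recommendation has to weigh against accuracy.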
and exotic agrestis melons from India and Africa as compared to commercial cultivars, cultigens and landraces from Eastern Europe, Western Asia and the Mediterranean basin is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes. Conclusions This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. This data provides a valuable resource for creating a catalog of allelic variants of melon genes and it will aid in future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties.
There is increasing evidence that strain variation in Mycobacterium tuberculosis complex (MTBC) might influence the outcome of tuberculosis infection and disease. To assess genotype-phenotype associations, phylogenetically robust molecular markers and appropriate genotyping tools are required. Most current genotyping methods for MTBC are based on mobile or repetitive DNA elements. Because these elements are prone to convergent evolution, the corresponding genotyping techniques are suboptimal for phylogenetic studies and strain classification. By contrast, single nucleotide polymorphisms (SNPs) are ideal markers for classifying MTBC into phylogenetic lineages, as they exhibit very low degrees of homoplasy. In this study, we developed two complementary SNP-based genotyping methods to classify strains into the six main human-associated lineages of MTBC, the "Beijing" sublineage, and the clade comprising Mycobacterium bovis and Mycobacterium caprae. Phylogenetically informative SNPs were obtained from 22 MTBC whole-genome sequences. The first assay, referred to as MOL-PCR, is a ligation-dependent PCR with signal detection by fluorescent microspheres and a Luminex flow cytometer, which simultaneously interrogates eight SNPs. The second assay is based on six individual TaqMan real-time PCR assays for singleplex SNP-typing. We compared MOL-PCR and TaqMan results in two panels of clinical MTBC isolates. Both methods agreed fully when assigning 36 well-characterized strains into the main phylogenetic lineages. The sensitivity in allele-calling was 98.6% and 98.8% for MOL-PCR and TaqMan, respectively. Typing of an additional panel of 78 unknown clinical isolates revealed 99.2% and 100% sensitivity in allele-calling, respectively, and 100% agreement in lineage assignment between both methods. While MOL-PCR and TaqMan are both highly sensitive and specific, MOL-PCR is ideal for classification of isolates with no previous information, whereas TaqMan is faster
The effects of selection on genome variation were investigated and visualized in tomato using a high-density single nucleotide polymorphism (SNP) array. A total of 7,720 SNPs were genotyped on a collection of 426 tomato accessions (410 inbreds and 16 hybrids), and over 97% of the markers were polymorphic in the entire collection. Principal component analysis (PCA) and pairwise estimates of F(st) supported that the inbred accessions represented seven sub-populations including processing, large-fruited fresh market, large-fruited vintage, cultivated cherry, landrace, wild cherry, and S. pimpinellifolium. Further divisions were found within both the contemporary processing and fresh market sub-populations. These sub-populations showed higher levels of genetic diversity relative to the vintage sub-population. The array provided a large number of polymorphic SNP markers across each sub-population, ranging from 3,159 in the vintage accessions to 6,234 in the cultivated cherry accessions. Visualization of minor allele frequency revealed regions of the genome that distinguished three representative sub-populations of cultivated tomato (processing, fresh market, and vintage), particularly on chromosomes 2, 4, 5, 6, and 11. The PCA loadings and F(st) outlier analysis between these three sub-populations identified a large number of candidate loci under positive selection on chromosomes 4, 5, and 11. The extent of linkage disequilibrium (LD) was examined within each chromosome for these sub-populations. LD decay varied between chromosomes and sub-populations, with large differences reflective of breeding history. For example, on chromosome 11, decay occurred over 0.8 cM for processing accessions and over 19.7 cM for fresh market accessions. The observed SNP variation and LD decay suggest that different patterns of genetic variation in cultivated tomato are due to introgression from wild species and selection for market specialization.
Bennett, G L; Shackelford, S D; Wheeler, T L; King, D A; Casas, E; Smith, T P L
Genetic markers in casein (CSN1S1) and thyroglobulin (TG) genes have previously been associated with fat distribution in cattle. Determining the nature of these genetic associations (additive, recessive, or dominant) has been difficult, because both markers have small minor allele frequencies in most beef cattle populations. This results in few animals homozygous for the minor alleles. Selection to increase the frequencies of the minor alleles for 2 SNP markers in these genes was undertaken in a composite population. The objective was to obtain better estimates of genetic effects associated with these markers and determine if there were epistatic interactions. Selection increased the frequencies of minor alleles for both SNP from meat tenderness predicted at the abattoir by visible and near-infrared reflectance spectroscopy (P 0.10). Additive, dominance, and epistatic SNP association effects were estimated from genotypic effects for adjusted fat thickness and predicted meat tenderness. Adjusted fat thickness showed a dominance association with TG SNP (P meat tenderness, heterozygous TG meat was more tender than meat from either homozygote (P < 0.002). Dominance and epistatic associations can result in different SNP allele substitution effects in populations where SNP have the same linkage disequilibrium with causal mutations but have different frequencies. Although the complex associations estimated in this study would contribute little to within-population selection response, they could be important for marker-assisted management or reciprocal selection schemes.
Norman, Anita J; Street, Nathaniel R; Spong, Göran
Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly-informative SNPs. We tested 150 putative SNPs of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia.
Ramirez-Gonzalez, Ricardo H; Uauy, Cristobal; Caccamo, Mario
The design of genetic markers is of particular relevance in crop breeding programs. Despite many economically important crops being polyploid organisms, the current primer design tools are tailored for diploid species. Bread wheat, for instance, is a hexaploid comprising three related genomes, and the performance of genetic markers is diminished if the primers are not genome specific. PolyMarker is a pipeline that generates SNP markers by selecting candidate primers for a specified genome using local alignments and standard primer design tools to test the viability of the primers. A command line tool and a web interface are available to the community. PolyMarker is available as a Ruby BioGem: bio-polyploid-tools. Web interface: http://polymarker.tgac.ac.uk. © The Author 2015. Published by Oxford University Press.
Dimauro, C; Cellesi, M; Pintus, M A; Macciotta, N P P
In genomic selection (GS) programmes, direct genomic values (DGV) are evaluated using information provided by high-density SNP chips. Because DGV accuracy depends strictly on SNP density, an increase in the number of markers per chip is likely to have severe computational consequences. The aim of the present work was to test the effectiveness of principal component analysis (PCA) carried out by chromosome in reducing the marker dimensionality for GS purposes. A simulated data set of 5700 individuals with an equal number of SNP distributed over six chromosomes was used. PCs were extracted both genome-wide (ALL) and separately by chromosome (CHR) and used to predict DGVs. In the ALL scenario, the SNP variance-covariance matrix (S) was singular, positive semi-definite and contained null information, which introduces 'spuriousness' in the derived results. By contrast, the S matrix for each chromosome (CHR scenario) had full rank. The DGV accuracies obtained were always better for CHR than for ALL. Moreover, in the ALL scenario, DGV accuracies soon became unstable as the number of animals decreased, whereas in CHR they remained stable down to 900-1000 individuals. In real applications where a 54k SNP chip is used, the largest number of markers per chromosome is approximately 2500. Thus, around 3000 genotyped animals could lead to reliable results when the original SNP variables are replaced by a reduced number of PCs. © 2011 Blackwell Verlag GmbH.
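The per-chromosome extraction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the 0/1/2 genotype coding and the 90% explained-variance cut-off are assumptions made only for illustration, and the sketch stops at PC extraction (it does not predict DGVs).

```python
import numpy as np

def pcs_by_chromosome(genotypes, chrom_of_snp, var_explained=0.9):
    """Extract principal-component scores separately for each chromosome.

    genotypes:     (n_animals, n_snps) array, allele counts coded 0/1/2 (assumed coding).
    chrom_of_snp:  length n_snps array of chromosome labels.
    var_explained: fraction of variance to retain per chromosome (hypothetical cut-off).
    Returns a dict mapping chromosome label -> (n_animals, k) score matrix.
    """
    genotypes = np.asarray(genotypes)
    chrom_of_snp = np.asarray(chrom_of_snp)
    scores = {}
    for chrom in np.unique(chrom_of_snp):
        # Select this chromosome's SNP columns and centre each column.
        X = genotypes[:, chrom_of_snp == chrom].astype(float)
        X -= X.mean(axis=0)
        # SVD of the centred matrix; squared singular values are
        # proportional to the variance captured by each component.
        U, s, _ = np.linalg.svd(X, full_matrices=False)
        explained = np.cumsum(s ** 2) / np.sum(s ** 2)
        # Smallest number of components reaching the requested fraction.
        k = int(np.searchsorted(explained, var_explained) + 1)
        scores[chrom] = U[:, :k] * s[:k]
    return scores

# Toy usage: 50 animals, 30 random SNPs spread over 3 chromosomes.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(50, 30))
toy_chroms = np.repeat([1, 2, 3], 10)
toy_scores = pcs_by_chromosome(toy_genotypes, toy_chroms)
```

Working chromosome by chromosome keeps each matrix small relative to the number of animals, which is the property the abstract links to a full-rank covariance matrix.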
Genome-wide association studies (GWASs) have identified low-penetrance common variants (i.e., single nucleotide polymorphisms, SNPs) associated with breast cancer susceptibility. Although GWASs are primarily focused on single-locus effects, gene-gene interactions (i.e., epistasis) are also assumed to contribute to the genetic risks for complex diseases including breast cancer. While it has been hypothesized that moderately ranked (P value based) weak single-locus effects in GWASs could potentially harbor valuable information for evaluating epistasis, we lack systematic efforts to investigate SNPs showing consistent associations with weak statistical significance across independent discovery and replication stages. The objectives of this study were (i) to select SNPs showing single-locus effects with weak statistical significance for breast cancer in a GWAS and/or candidate-gene studies; (ii) to replicate these SNPs in an independent set of breast cancer cases and controls; and (iii) to explore their potential SNP-SNP interactions contributing to breast cancer susceptibility. A total of 17 SNPs related to DNA repair, modification and metabolism pathway genes were selected since these pathways offer a priori knowledge for potential epistatic interactions and an overall role in breast carcinogenesis. The study design included predominantly Caucasian women (2,795 cases and 4,505 controls) from Alberta, Canada. We observed two two-way SNP-SNP interactions (APEX1-rs1130409 and RPAP1-rs2297381; MLH1-rs1799977 and MDM2-rs769412) in logistic regression that conferred elevated risks for breast cancer (P(interaction) < 7.3 × 10^-3). Logic regression identified an interaction involving four SNPs (MBD2-rs4041245, MLH1-rs1799977, MDM2-rs769412, BRCA2-rs1799943) (P(permutation) = 2.4 × 10^-3). SNPs involved in SNP-SNP interactions also showed single-locus effects with weak statistical significance, while BRCA2-rs1799943 showed stronger statistical significance (P
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
Identification of single nucleotide polymorphisms (SNPs) and mutations is important for the discovery of genetic predisposition to complex diseases. PCR resequencing is the method of choice for de novo SNP discovery. However, manual curation of putative SNPs has been a major bottleneck in the application of this method to high-throughput screening. Therefore it is critical to develop a more sensitive and accurate computational method for automated SNP detection. We developed a software tool, SNPdetector, for automated identification of SNPs and mutations in fluorescence-based resequencing reads. SNPdetector was designed to model the process of human visual inspection and has a very low false positive and false negative rate. We demonstrate the superior performance of SNPdetector in SNP and mutation analysis by comparing its results with those derived by human inspection, PolyPhred (a popular SNP detection tool), and independent genotype assays in three large-scale investigations. The first study identified and validated inter- and intra-subspecies variations in 4,650 traces of 25 inbred mouse strains that belong to either the Mus musculus species or the M. spretus species. Unexpected heterozygosity in CAST/Ei strain was observed in two out of 1,167 mouse SNPs. The second study identified 11,241 candidate SNPs in five ENCODE regions of the human genome covering 2.5 Mb of genomic sequence. Approximately 50% of the candidate SNPs were selected for experimental genotyping; the validation rate exceeded 95%. The third study detected ENU-induced mutations (at 0.04% allele frequency) in 64,896 traces of 1,236 zebrafish. Our analysis of three large and diverse test datasets demonstrated that SNPdetector is an effective tool for genome-scale research and for large-sample clinical studies. SNPdetector runs on Unix/Linux platforms and is available publicly (http://lpg.nci.nih.gov).
Background: Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on LD between SNP pairs. We use Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. Essentially, this model is reduced to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm. Results: In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. Our algorithm performed better than previously reported models in a case-control association study aimed at mapping a single locus trait, based on simulation results that were evaluated by a block-based statistical test. Compared to methods of haplotype block partitioning, we performed best on detection of recombination hotspots. Conclusion: Our proposed method divides chromosomes into the regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.
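As a rough illustration of testing LD between one SNP pair with Fisher's exact test, the sketch below computes the classical LD coefficient D and a two-sided exact p-value from a 2x2 table of haplotype counts. The function names and the haplotype-count input are assumptions for illustration; the abstract does not give the rule that separates "associated" from "independent" pairs, so the sketch reports only D and the p-value.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def ld_and_pvalue(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, p) for two biallelic SNPs given the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n          # frequency of allele B at the second SNP
    D = n_AB / n - p_A * p_B         # classical LD coefficient
    return D, fisher_exact_two_sided(n_AB, n_Ab, n_aB, n_ab)

# Toy usage: a perfectly coupled pair and a perfectly balanced pair.
D_linked, p_linked = ld_and_pvalue(10, 0, 0, 10)
D_free, p_free = ld_and_pvalue(5, 5, 5, 5)
```

A pair would then be labelled by comparing the p-value with a chosen significance level; that threshold, and how significant pairs are split into associated versus independent, are left open here.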
Katherine A Dick Krueger
The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped, there has been a trend away from studying single-gene disorders. In part, this is due to the fact that many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold-mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid-cell lines to determine if high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid-cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families.
Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC companies offer the analysis of a large number of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, this data is not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide area of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under MIT-license at http://github.com/gedankenstuecke/snpr.
The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV, and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.
Van, Kyujung; Kang, Yang Jae; Han, Kwang-Soo; Lee, Yeong-Ho; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Suk-Ha
Mungbean [Vigna radiata (L.) Wilczek], a self-pollinated diploid plant with 2n = 22 chromosomes, is an important legume crop with a high-quality amino acid profile. Sequence variation at the whole-genome level was examined by comparing two mungbean cultivars, Sunhwanokdu and Gyeonggijaerae 5, using Illumina HiSeq sequencing data. More than 40 billion bp from both mungbean cultivars were sequenced to a depth of 72×. After de novo assembly of Sunhwanokdu contigs by ABySS 1.3.2 (N50 = 9,958 bp), those longer than 10 kb were aligned with Gyeonggijaerae 5 reads using the Burrows-Wheeler Aligner. SAMTools was used for retrieving single nucleotide polymorphisms (SNPs) between Sunhwanokdu and Gyeonggijaerae 5, defining the lowest and highest depths as 5 and 100, respectively, and the sequence quality as 100. Of the 305,504 single-base changes identified, 40,503 SNPs were considered heterozygous in Gyeonggijaerae 5. Among the remaining 265,001 SNPs, 65.9 % (174,579 cases) were transitions and 34.1 % (90,422 cases) were transversions. For SNP validation, a total of 42 SNPs were chosen among Sunhwanokdu contigs longer than 10 kb and sharing at least 80 % sequence identity with common bean expressed sequence tags as determined with est2genome. Using seven mungbean cultivars from various origins in addition to Sunhwanokdu and Gyeonggijaerae 5, most of the SNPs identified by bioinformatics tools were confirmed by Sanger sequencing. These genome-wide SNP markers could enrich the current molecular resources and might be of value for the construction of a mungbean genetic map and the investigation of genetic diversity.
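The transition/transversion split reported in this abstract follows from a simple rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and anything else is a transversion. The short sketch below shows that rule and rechecks the reported percentages from the counts given in the abstract. The function name is hypothetical, and the Ts/Tv ratio is derived here; it is not a figure stated in the source.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Counts taken from the abstract (homozygous SNPs only).
transitions, transversions = 174_579, 90_422
total = transitions + transversions

ts_pct = 100 * transitions / total      # share of transitions, in percent
tv_pct = 100 * transversions / total    # share of transversions, in percent
ts_tv_ratio = transitions / transversions
```

The counts sum to the 265,001 SNPs that remain after the 40,503 heterozygous calls are set aside from the 305,504 single-base changes.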
Lentil (Lens culinaris Medik.) is a self-pollinating, diploid, annual, cool-season, food legume crop that is cultivated throughout the world. Ascochyta blight (AB), caused by Ascochyta lentis Vassilievsky, is an economically important and widespread disease of lentil. Development of cultivars with high levels of durable resistance provides an environmentally acceptable and economically feasible method for AB control. A detailed understanding of the genetic basis of AB resistance is hence highly desirable, in order to obtain insight into the number and influence of resistance genes. Genetic linkage maps based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers have been developed from three recombinant inbred line (RIL) populations. The IH x NF map contained 460 loci across 1461.6 cM, while the IH x DIG map contained 329 loci across 1302.5 cM and the third map, NF x DIG, contained 330 loci across 1914.1 cM. Data from these maps were combined with a map from a previously published study through use of bridging markers to generate a consensus linkage map containing 689 loci distributed across 7 linkage groups (LGs), with a cumulative length of 2429.61 cM at an average density of one marker per 3.5 cM. Trait dissection of AB resistance was performed for the RIL populations, identifying totals of two and three quantitative trait loci (QTLs) explaining 52% and 69% of phenotypic variation for resistance to infection in the IH x DIG and IH x NF populations, respectively. Presence of common markers in the vicinity of the AB_IH1- and AB_IH2.1/AB_IH2.2-containing regions on both maps supports the inference that a common genomic region is responsible for conferring resistance and is associated with the resistant parent, Indianhead. The third QTL was derived from Northfield. Evaluation of markers associated with AB resistance across a diverse lentil germplasm panel revealed that the identity of alleles associated with AB_IH1 predicted
Peng, Qian; Chen, Chang-Hui; Wu, Qing; Yang, Yuan
To investigate the association of rs72689236, a new functional single nucleotide polymorphism (SNP) of the gene encoding caspase-3 (CASP3), with the occurrence and development of Kawasaki disease by a meta-analysis. A literature search was performed using domestic and international databases according to inclusion and exclusion criteria, to acquire studies on the relationship between rs72689236 and Kawasaki disease published up to November 2012, including case-control studies and transmission disequilibrium tests. An integrated meta-analysis was performed using RevMan 5.1 software after the studies were screened and evaluated. Six studies were extracted for systematic review of the association between rs72689236 and Kawasaki disease. The frequency of allele A of the SNP was significantly higher in patients with Kawasaki disease than in the controls (OR=1.34, 95%CI=1.24-1.46, PKawasaki disease in children with allele A (AA+AG) increased by approximately 44% compared with children with GG (OR=1.44, 95%CI=1.27-1.65, PKawasaki disease patients with coronary artery lesions than in those without coronary artery lesions (OR=1.51, 95%CI=1.10-2.07, P=0.01); the risk for coronary artery lesions in Kawasaki disease patients with allele A (AA+AG) increased by approximately 59% compared with Kawasaki disease patients with GG (OR=1.59, 95%CI=1.00-2.53, P=0.05). No association between this SNP and the therapeutic effect of intravenous immunoglobulin (IVIG) was found in patients with Kawasaki disease. The allele A of functional SNP rs72689236 of CASP3 increases the risk for Kawasaki disease, and it may be used as the genetic marker for susceptibility to coronary artery lesions as a complication of Kawasaki disease. Currently, there is still insufficient evidence that this SNP has an impact on the therapeutic effect of IVIG in patients with Kawasaki disease, and more studies are needed to investigate the feasibility of its application in individualized treatment.
In metazoans, miRNAs regulate gene expression primarily through binding to target sites in the 3' UTRs (untranslated regions) of messenger RNAs (mRNAs). Cis-acting variants within, or close to, a gene are crucial in explaining the variability of gene expression measures. Single nucleotide polymorphisms (SNPs) in the 3' UTRs of genes can affect the base-pairing between miRNAs and mRNAs, and hence disrupt existing target sites (in the reference sequence) or create novel target sites, suggesting a possible mechanism for cis regulation of gene expression. Moreover, because the alleles of different SNPs within a DNA sequence of limited length tend to be in strong linkage disequilibrium (LD), we hypothesize the variants of miRNA target sites caused by SNPs potentially function as bridges linking the documented cis-SNP markers to the expression of the associated genes. A large-scale analysis was herein performed to test this hypothesis. By systematically integrating multiple latest information sources, we found 21 significant gene-level SNP-involved miRNA-mediated post-transcriptional regulation modules (SNP-MPRMs) in the form of SNP-miRNA-mRNA triplets in lymphocyte cell lines for the CEU and YRI populations. Among the cognate genes, six including ALG8, DGKE, GNA12, KLF11, LRPAP1, and MMAB are related to multiple genetic diseases such as depressive disorder and Type-II diabetes. Furthermore, we found that ~35% of the documented transcript intensity-related cis-SNPs (~950 in a recent publication) are identical to, or in significant linkage disequilibrium (LD) (p<0.01) with, one or multiple SNPs located in miRNA target sites. Based on these associations (or identities), 69 significant exon-level SNP-MPRMs and 12 disease genes were further determined for two populations. These results provide concrete in silico evidence for the proposed hypothesis. The discovered modules warrant additional follow-up in independent laboratory studies.
González, Jorge; Fuentes, Glenda; Alarcón, Diego; Ruiz, Eduardo
Within a woody plant species, environmental heterogeneity has the potential to influence the distribution of genetic variation among populations through several evolutionary processes. In some species, a relationship between environmental characteristics and the distribution of genotypes can be detected, showing the importance of natural selection as the main source of differentiation. Nothofagus dombeyi (Mirb.) Oerst. (Nothofagaceae) is an endemic tree species occurring both in Chile and in Argentina temperate forests. Postglacial history has been studied with chloroplast DNA and evolutionary forces shaping genetic variation patterns have been analysed with isozymes but fine-scale genetic diversity studies are needed. The study of demographic and selection histories in Nothofagus dombeyi requires more informative markers such as single nucleotide polymorphisms (SNP). Genotyping-by-Sequencing tools now allow studying thousands of SNP markers at reasonable prices in nonmodel species. We investigated more than 10 K SNP loci for signatures of local adaptation and showed that interrogation of genomic resources can identify shifts in genetic diversity and putative adaptive signals in this nonmodel woody species. PMID:27446942
Shepherd, Ross K; Meuwissen, Theo H E; Woolliams, John A
The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time.
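The abstract describes a per-SNP posterior probability of being in LD with a QTL under a mixture prior. The following is a simplified sketch of that idea only, and not the emBayesB algorithm itself: it assumes a two-component normal mixture (a point mass at zero plus a normal "slab"), whereas the abstract does not specify the prior used. The function names and the parameters `gamma`, `slab_var`, and `se2` are illustrative assumptions.

```python
import math

def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior_in_ld(b_hat, se2, gamma, slab_var):
    """Posterior probability that a SNP has a nonzero effect.

    b_hat:    estimated marker effect
    se2:      sampling variance of b_hat
    gamma:    prior proportion of SNPs with a nonzero effect
    slab_var: prior variance of nonzero effects (normal slab, an assumption)
    """
    # Marginal density of b_hat if the true effect is drawn from the slab.
    slab = gamma * normal_pdf(b_hat, slab_var + se2)
    # Marginal density of b_hat if the true effect is exactly zero.
    spike = (1 - gamma) * normal_pdf(b_hat, se2)
    return slab / (slab + spike)

# Illustrative values: a larger estimated effect gives a larger posterior.
for b_hat in (0.0, 1.0, 3.0):
    print(b_hat, round(posterior_in_ld(b_hat, se2=1.0, gamma=0.05, slab_var=4.0), 4))
```

In an EM scheme, probabilities of this kind would be recomputed in each E-step, and `gamma` and the slab variance would be updated in the M-step.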
Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela
High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.
Minarik, Marek; Benesova, Lucie; Fantova, Lucie; Horacek, Jiri; Heracek, Jiri; Loukola, Anu
Increasing importance of single-nucleotide polymorphisms (SNPs) in determination of disease susceptibility or in prediction of therapy response brings attention of many molecular diagnostic laboratories to simple and low-cost SNP genotyping methodologies. We have recently introduced a mutation detection technique based on analysis of homo- and heteroduplex PCR fragments resolved in cycling temperature gradient conditions on a conventional multicapillary-array DNA sequencer. The main advantage of this technique is in its simplicity with no requirement for sample cleanup prior to the analysis. In this report we present a practical application of the technology for genotyping of SNP markers in two separate clinical projects resulting in a combined set of 44 markers screened in over 500 patients. Initially, a design of PCR primers and conditions was performed for each SNP marker. Then, optimization of CE running conditions (limited just to the proper selection of temperature cycling) was performed on pools of 20 DNA samples to increase the probability of having each of the two allele types represented in the sample. After selecting the optimum conditions, screening of markers in patients was performed using a multiple-injection approach for further acceleration of the sample throughput. The rate of successful optimization of experimental conditions without any pre-selection based on the SNP sequence or melting characteristics was 80% from the initial SNP marker candidates. By studying the failed markers, we attempt to identify critical factors enabling successful typing. The presented technique is very useful for low to medium sized SNP genotyping projects mostly applied in pharmacogenomic research as well as in clinical diagnostics. The main advantages include low cost, simple setup and validation of SNP markers.
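The abstract notes that optimization was run on pools of 20 DNA samples so that both alleles were likely to be present. A rough way to see why pooling helps is the sketch below, which assumes diploid samples, independently sampled chromosomes, and a known minor allele frequency; these assumptions and the function name are illustrative and do not come from the paper.

```python
def prob_both_alleles_in_pool(maf, n_samples):
    """Probability that a pool of diploid samples carries both alleles.

    Assumes 2 * n_samples independently drawn chromosomes, each carrying
    the minor allele with probability maf.
    """
    n_chrom = 2 * n_samples
    # Subtract the two ways the pool can be monomorphic.
    return 1.0 - maf ** n_chrom - (1.0 - maf) ** n_chrom

# Compare a single sample with a pool of 20 for a few allele frequencies.
for maf in (0.01, 0.05, 0.20):
    single = prob_both_alleles_in_pool(maf, 1)
    pooled = prob_both_alleles_in_pool(maf, 20)
    print(f"MAF={maf:.2f}: single={single:.3f}, pool of 20={pooled:.3f}")
```

Under these assumptions, rare alleles remain the limiting case: the lower the minor allele frequency, the larger the pool needed to see both homo- and heteroduplex fragments.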
Wheat yield can be enhanced by modifying the spike morphology and the plant height. In this study, a population of 191 F9 recombinant inbred lines (RILs) was developed from a cross between two winter cultivars, Yumai 8679 and Jing 411. A dense genetic linkage map with 10,816 markers was constructed by incorporating single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) marker information. Five spike morphological traits and plant height were evaluated under nine environments for the RILs and parental lines, and the numbers of environmentally stable QTLs detected were 18 and 3, respectively. The 1RS/1BL (rye) translocation increased both spike length and spikelet number with constant spikelet compactness. QPht.cau-2D.1 was identical to the gene Rht8, which decreased spike length without modifying spikelet number. Notably, four novel QTLs located on chromosomes 1AS (QSc.cau-1A.1), 2DS (QSc.cau-2D.1) and 7BS (QSl.cau-7B.1 and QSl.cau-7B.2) were first identified in this study, providing further insight into the genetic factors that shaped spike morphology in wheat. Moreover, SNP markers tightly linked to previously reported QTLs will facilitate future studies, including positional cloning and marker-assisted selection.
Keith R. Merrill; Craig E. Coleman; Susan E. Meyer; Elizabeth A. Leger; Katherine A. Collins
Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the...
Zhang, Tiejun; Yu, Long-Xi; McCord, Per; Miller, David; Bhamidimarri, Suresh; Johnson, David; Monteros, Maria J; Ho, Julie; Reisen, Peter; Samac, Deborah A
Verticillium wilt, caused by the soilborne fungus Verticillium alfalfae, is one of the most serious diseases of alfalfa (Medicago sativa L.) worldwide. To identify loci associated with resistance to Verticillium wilt, a bulk segregant analysis was conducted in susceptible or resistant pools constructed from 13 synthetic alfalfa populations, followed by association mapping in two F1 populations consisting of 352 individuals. Simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers were used for genotyping. Phenotyping was done by manual inoculation of the pathogen onto replicated cloned plants of each individual, and disease severity was scored using a standard scale. Marker-trait association was analyzed with TASSEL. Seventeen SNP markers significantly associated with Verticillium wilt resistance were identified, located on chromosomes 1, 2, 4, 7 and 8. SNP markers identified on chromosomes 2, 4 and 7 co-locate with regions of Verticillium wilt resistance loci reported in M. truncatula. Additional markers identified on chromosomes 1 and 8 were located in regions where no Verticillium resistance locus has been reported. This study highlights the value of SNP genotyping by high resolution melting for identifying disease resistance loci in tetraploid alfalfa. With further validation, the markers identified in this study could be used for improving resistance to Verticillium wilt in alfalfa breeding programs.
BACKGROUND: Genetic polymorphisms in the human MDM2 gene are suggested to be a tumor susceptibility marker and a prognostic factor for cancer. It has been reported that a single nucleotide polymorphism (SNP), c.309T>G, in the MDM2 gene attenuates the tumor suppressor activity of p53 and accelerates tumor formation in humans. METHODOLOGY: To detect the SNP c.309T>G in the MDM2 gene, we developed a new SNP detection method, named "Duplex SmartAmp," which enabled us to detect both the 309T and 309G alleles simultaneously in one tube. To develop this method, we introduced new primers, i.e., nBP and oBPs, as well as two different fluorescent dyes that separately detect the two alleles. RESULTS AND CONCLUSIONS: With the Duplex SmartAmp method, the genetic polymorphisms of the MDM2 gene were detected directly from a small amount of genomic DNA or blood samples. We used 96 genomic DNA and 24 blood samples to validate the Duplex SmartAmp method by comparison with the conventional PCR-RFLP method; the Duplex SmartAmp results agreed completely with those of the PCR-RFLP method. Thus, the new SNP detection method is considered useful for detecting the SNP c.309T>G in the MDM2 gene so as to judge cancer susceptibility against some cellular stress in the clinical setting, and also for handling a large number of samples and enabling rapid clinical diagnosis.
Brito, Luiz F; McEwan, John C; Miller, Stephen P; Pickering, Natalie K; Bain, Wendy E; Dodds, Ken G; Schenkel, Flávio S; Clarke, Shannon M
Knowledge about the genetic diversity of a population is a crucial parameter for the implementation of successful genomic selection and conservation of genetic resources. The aim of this research was to establish the scientific basis for the implementation of genomic selection in a composite Terminal sheep breeding scheme by providing consolidated linkage disequilibrium (LD) measures across SNP markers, estimating consistency of gametic phase between breed-groups, and assessing genetic diversity measures, such as effective population size (Ne), and population structure parameters, using a large number of animals (n = 14,845) genotyped with a high density SNP chip (606,006 markers). Information generated in this research will be useful for optimizing molecular breeding values predictions and managing the available genetic resources. Overall, as expected, levels of pairwise LD decreased with increasing distance between SNP pairs. The mean LD r(2) between adjacent SNP was 0.26 ± 0.10. The most recent effective population size for all animals (687) and separately per breed-groups: Primera (974), Lamb Supreme (380), Texel (227) and Dual-Purpose (125) was quite variable. The genotyped animals were outbred or had an average low level of inbreeding. Consistency of gametic phase was higher than 0.94 for all breed pairs at the average distance between SNP on the chip (~4.74 kb). Moreover, there was not a clear separation between the breed-groups based on principal component analysis, suggesting that a mixed-breed training population for calculation of molecular breeding values would be beneficial. This study reports, for the first time, estimates of linkage disequilibrium, genetic diversity and population structure parameters from a genome-wide perspective in New Zealand Terminal Sire composite sheep breeds. The levels of linkage disequilibrium indicate that genomic selection could be implemented with the high density SNP panel. The moderate to high consistency of
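The LD measure reported in this abstract is the squared correlation between alleles at two loci. As a minimal sketch of that quantity, the following computes r² from phased haplotype counts for one SNP pair. It assumes phased haplotypes are available, which the abstract does not state, and the function name and counts are hypothetical.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    Arguments are counts of the four phased haplotypes, where A/a are the
    alleles at the first SNP and B/b the alleles at the second.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    # D is the deviation of the haplotype frequency from independence.
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom

# Hypothetical haplotype counts for one adjacent SNP pair.
print(round(ld_r2(n_AB=60, n_Ab=15, n_aB=10, n_ab=15), 3))
```

Averaging such values over SNP pairs binned by physical distance gives the kind of LD decay summary the abstract reports, and correlating the signed r (rather than r²) for the same pairs across two breed groups is one common way to express consistency of gametic phase.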
Cervical cancer is the most fatal disease among Indonesian women. In recognition of the substantial variation in the intrinsic response of individuals to radiation, efforts have been made to identify genetic markers, primarily single nucleotide polymorphisms (SNPs), that are associated with the responsiveness of cancer cells to radiation therapy. One gene carrying such SNPs is X-ray repair cross-complementing protein 1 (XRCC1), one of the most important genes in deoxyribonucleic acid (DNA) repair pathways. Meta-analysis of the association of XRCC1 polymorphisms with cervical cancer revealed the potential role of XRCC1 polymorphisms in predicting cell response to radiotherapy. Our preliminary study with real-time polymerase chain reaction (RT-PCR) showed that radiotherapy affected the XRCC1 gene analyzed in the blood of cervical cancer patients. Another published study found three SNPs of XRCC1 (Arg194Trp, Arg280His, and Arg399Gln) that cause amino acid substitutions. Of these, only Arg194Trp was associated with a high risk of cervical cancer. Additionally, the structure and function of this protein can be altered by functional SNPs, which may lead to susceptibility of individuals to cancers. Another study found G399A polymorphisms. We conclude that SNPs of this DNA repair gene have been found to be good predictors of the efficacy of radiotherapy.
Rizzi, Giovanni; Østerberg, Frederik Westergaard; Dufva, Martin
We present a magnetoresistive sensor platform for hybridization assays and demonstrate its applicability on single nucleotide polymorphism (SNP) genotyping. The sensor relies on anisotropic magnetoresistance in a new geometry with a local negative reference and uses the magnetic field from the sensor bias current to magnetize magnetic beads in the vicinity of the sensor. The method allows for real-time measurements of the specific bead binding to the sensor surface during DNA hybridization and washing. Compared to other magnetic biosensing platforms, our approach eliminates the need for external electromagnets and thus allows for miniaturization of the sensor platform....
Li, Ruiqiang; Li, Yingrui; Fang, Xiaodong
...-genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36x coverage and assembled a high-quality nonrepetitive consensus sequence for 92...
Genomic selection (GS) simultaneously incorporates dense SNP marker genotypes with phenotypic data from related animals to predict animal-specific genomic breeding value (GEBV), which circumvents the need to measure the disease phenotype in potential breeders. Marker assisted selection (MAS) involv...
Mikheecheva, Natalya E.; Zaychikova, Marina V.; Melerzanov, Alexander V.
Mycobacterium tuberculosis is divided into several distinct lineages, and various genetic markers such as IS-elements, VNTR, and SNPs are used for lineage identification. We propose an M. tuberculosis classification approach based on functional polymorphisms in virulence genes. An M. tuberculosis virulence genes catalog has been established, including 319 genes from various protein groups, such as proteases, cell wall proteins, fatty acid and lipid metabolism proteins, sigma factors, toxin–antitoxin systems. Another catalog of 1,573 M. tuberculosis isolates of different lineages has been developed. The developed SNP-calling program has identified 3,563 nonsynonymous SNPs. The constructed SNP-based phylogeny reflected the evolutionary relationship between lineages and detected new sublineages. SNP analysis of sublineage F15/LAM4/KZN revealed four lineage-specific mutations in cyp125, mce3B, vapC25, and vapB34. The Ural lineage has been divided into two geographical clusters based on different SNPs in virulence genes. A new sublineage, B0/N-90, was detected inside the Beijing-B0/W-148 by SNPs in irtB, mce3F and vapC46. We have found 27 members of B0/N-90 among the 227 available genomes of the Beijing-B0/W-148 sublineage. Whole-genome sequencing of strain B9741, isolated from an HIV-positive patient, was demonstrated to belong to the new B0/N-90 group. A primer set for PCR detection of B0/N-90 lineage-specific mutations has been developed. The prospective use of mce3 mutant genes as genetically engineered vaccine is discussed. PMID:28338924
Birolo, Giovanni; Prazzoli, Maria Lucia; Lorenzi, Silvia; Valle, Giorgio; Grando, Maria Stella
Whole-genome comparisons of Vitis vinifera subsp. sativa and V. vinifera subsp. sylvestris are expected to provide a better estimate of the valuable genetic diversity still present in grapevine, and help to reconstruct the evolutionary history of a major crop worldwide. To this aim, the increase of molecular marker density across the grapevine genome is fundamental. Here we describe the SNP discovery in a grapevine germplasm collection of 51 cultivars and 44 wild accessions through a novel protocol of restriction-site associated DNA (RAD) sequencing. By resequencing 1.1% of the grapevine genome at a high coverage, we recovered 34K BamHI unique restriction sites, of which 6.8% were absent in the ‘PN40024’ reference genome. Moreover, we identified 37,748 single nucleotide polymorphisms (SNPs), 93% of which belonged to the 19 assembled chromosomes with an average of 1.8K SNPs per chromosome. Nearly half of the SNPs fell in genic regions mostly assigned to the functional categories of metabolism and regulation, whereas some nonsynonymous variants were identified in genes related with the detection and response to environmental stimuli. SNP validation was carried-out, showing the ability of RAD-seq to accurately determine genotypes in a highly heterozygous species. To test the usefulness of our SNP panel, the main diversity statistics were evaluated, highlighting how the wild grapevine retained less genetic variability than the cultivated form. Furthermore, the analysis of Linkage Disequilibrium (LD) in the two subspecies separately revealed how the LD decays faster within the domesticated grapevine compared to its wild relative. Being the first application of RAD-seq in a diverse grapevine germplasm collection, our approach holds great promise for exploiting the genetic resources available in one of the most economically important fruit crops. PMID:28125640
Mikheecheva, Natalya E; Zaychikova, Marina V; Melerzanov, Alexander V; Danilenko, Valery N
Mycobacterium tuberculosis is divided into several distinct lineages, and various genetic markers such as IS-elements, VNTR, and SNPs are used for lineage identification. We propose an M. tuberculosis classification approach based on functional polymorphisms in virulence genes. An M. tuberculosis virulence genes catalog has been established, including 319 genes from various protein groups, such as proteases, cell wall proteins, fatty acid and lipid metabolism proteins, sigma factors, toxin-antitoxin systems. Another catalog of 1,573 M. tuberculosis isolates of different lineages has been developed. The developed SNP-calling program has identified 3,563 nonsynonymous SNPs. The constructed SNP-based phylogeny reflected the evolutionary relationship between lineages and detected new sublineages. SNP analysis of sublineage F15/LAM4/KZN revealed four lineage-specific mutations in cyp125, mce3B, vapC25, and vapB34. The Ural lineage has been divided into two geographical clusters based on different SNPs in virulence genes. A new sublineage, B0/N-90, was detected inside the Beijing-B0/W-148 by SNPs in irtB, mce3F and vapC46. We have found 27 members of B0/N-90 among the 227 available genomes of the Beijing-B0/W-148 sublineage. Whole-genome sequencing of strain B9741, isolated from an HIV-positive patient, was demonstrated to belong to the new B0/N-90 group. A primer set for PCR detection of B0/N-90 lineage-specific mutations has been developed. The prospective use of mce3 mutant genes as genetically engineered vaccine is discussed. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Jose A Seoane
Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and in testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants, respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. To do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and of several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea, and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.
Carole F S Koning-Boucoiran
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total, 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these, 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
Edriss, V; Guldbrandtsen, B; Lund, M S; Su, G
Genomic selection is a method to predict breeding values using genome-wide single-nucleotide polymorphism (SNP) markers. High-quality marker data are necessary for genomic selection. The aim of this study was to investigate the effect of marker-editing criteria on the accuracy of genomic predictions in the Nordic Holstein and Jersey populations. Data included 4429 Holstein and 1071 Jersey bulls. In total, 48,222 SNP for Holstein and 44,305 SNP for Jersey were polymorphic. The SNP data were edited based on (i) minor allele frequencies (MAF) with thresholds of no limit, 0.001, 0.01, 0.02, 0.05 and 0.10, (ii) deviations from Hardy-Weinberg proportions (HWP) with thresholds of no limit, chi-squared p-values of 0.001, 0.02, 0.05 and 0.10, and (iii) GenCall (GC) scores with thresholds of 0.15, 0.55, 0.60, 0.65 and 0.70. The marker data sets edited with different criteria were used for genomic prediction of protein yield, fertility and mastitis using a Bayesian variable selection and a GBLUP model. De-regressed EBV were used as response variables. The result showed little difference between prediction accuracies based on marker data sets edited with MAF and deviation from HWP. However, accuracy decreased with more stringent thresholds of GC score. According to the results of this study, it would be appropriate to edit data with restriction of MAF being between 0.01 and 0.02, a p-value of deviation from HWP being 0.05, and keeping all individual SNP genotypes having a GC score over 0.15. © 2012 Blackwell Verlag GmbH.
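Two of the marker-editing criteria in this abstract, minor allele frequency and deviation from Hardy-Weinberg proportions, can be computed directly from genotype counts. The sketch below is one plausible reading of such a filter and is not the authors' pipeline: the function names are hypothetical, the defaults borrow the thresholds the abstract recommends (MAF of 0.01, HWP p-value of 0.05), and the per-genotype GenCall score filter is not modeled. The 1-df chi-squared p-value is obtained from the identity P(X > x) = erfc(sqrt(x/2)).

```python
import math

def maf_and_hwe_p(n_aa, n_ab, n_bb):
    """Minor allele frequency and 1-df chi-squared HWE p-value for one SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    maf = min(p, q)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    if min(expected) == 0:
        # Monomorphic SNP: no deviation can be tested.
        return maf, 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return maf, p_value

def keep_snp(n_aa, n_ab, n_bb, min_maf=0.01, min_hwe_p=0.05):
    """Keep a SNP if it passes both the MAF and the HWE thresholds."""
    maf, p_value = maf_and_hwe_p(n_aa, n_ab, n_bb)
    return maf >= min_maf and p_value >= min_hwe_p

# Hypothetical genotype counts (AA, AB, BB) for two SNPs.
print(keep_snp(250, 500, 250))  # genotype counts match HWE expectations
print(keep_snp(500, 0, 500))    # no heterozygotes at all
```

Because the abstract reports little change in prediction accuracy across MAF and HWP thresholds, a filter like this mainly serves as a quality check rather than as a way to improve accuracy.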
You Frank M
Abstract Background A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). Results The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other types of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar ‘Chandler’ were mapped to 48,661 ‘Chandler’ bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the
Abstract Background The turkey (Meleagris gallopavo) is an important agricultural species that is the second largest contributor to the world's poultry meat production. The genomic resources of turkey provide turkey breeders with tools needed for the genetic improvement of commercial breeds of turkey for economically important traits. A linkage map of turkey is essential not only for the mapping of quantitative trait loci, but also as a framework to enable the assignment of sequence contigs to specific chromosomes. Comparative genomics with chicken provides insight into mechanisms of genome evolution and helps in identifying rare genomic events such as genomic rearrangements and duplications/deletions. Results Eighteen full sib families, comprising 1008 (35 F1 and 973 F2) birds, were genotyped for 775 single nucleotide polymorphisms (SNPs). Of the 775 SNPs, 570 were informative and used to construct a linkage map in turkey. The final map contains 531 markers in 28 linkage groups. The total genetic distance covered by these linkage groups is 2,324 centimorgans (cM), with the largest linkage group (81 loci) measuring 326 cM. Average marker interval for all markers across the 28 linkage groups is 4.6 cM. Comparative mapping of turkey and chicken revealed two inter-, and 57 intrachromosomal rearrangements between these two species. Conclusion Our turkey genetic map of 531 markers reveals a genome length of 2,324 cM. Our linkage map provides an improvement of previously published maps because of the more even distribution of the markers and because the map is completely based on SNP markers enabling easier and faster genotyping assays than the microsatellite markers used in previous linkage maps. Turkey and chicken are shown to have a highly conserved genomic structure with a relatively low number of inter-, and intrachromosomal rearrangements.
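The abstract does not say how the average marker interval was computed. One common convention, sketched below as an assumption, divides the total map length by the number of intervals, which is the number of markers minus the number of linkage groups. With the figures quoted above this convention is consistent with the reported 4.6 cM; the function name is hypothetical.

```python
def average_marker_interval(total_cm, n_markers, n_linkage_groups):
    """Mean distance between adjacent markers on a linkage map.

    Each linkage group with k markers has k - 1 intervals, so the whole
    map has n_markers - n_linkage_groups intervals.
    """
    n_intervals = n_markers - n_linkage_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / n_intervals


# Figures taken from the turkey map described above.
turkey_interval = average_marker_interval(2324, 531, 28)
print(round(turkey_interval, 1))
```

Dividing by the number of markers instead (2,324 / 531) would give about 4.4 cM, so the interval-based convention is the one that matches the published value here.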
Yuan Yuan Shi
BACKGROUND: The Eastern honey bee, Apis cerana Fabricius, is distributed in southern and eastern Asia, from India and China to Korea and Japan and southeast to the Moluccas. Besides Apis mellifera, this species is also widely kept for honey production. Apis cerana is also a model organism for studying social behavior, caste determination, mating biology, sexual selection, and host-parasite interactions. Few resources are available for molecular research in this species, and a linkage map has never been constructed. A linkage map is a prerequisite for quantitative trait loci mapping and for analyzing genome structure. We used the Chinese honey bee, Apis cerana cerana, to construct the first linkage map in the Eastern honey bee. RESULTS: F2 workers (N = 103) were genotyped for 126,990 single nucleotide polymorphisms (SNPs). After filtering out low-quality SNPs and those not passing the Mendel test, we obtained 3,000 SNPs; 1,535 of these were informative and were used to construct a linkage map. The preliminary map contains 19 linkage groups; we then mapped the 19 linkage groups to 16 chromosomes by comparing the markers to the genome of A. mellifera. The final map contains 16 linkage groups with a total of 1,535 markers. The total genetic distance is 3,942.7 centimorgans (cM), with the largest linkage group (180 loci) measuring 574.5 cM. Average marker interval for all markers across the 16 linkage groups is 2.6 cM. CONCLUSION: We constructed a high density linkage map for A. c. cerana with 1,535 markers. Because the map is based on SNP markers, it will enable easier and faster genotyping assays than randomly amplified polymorphic DNA or microsatellite based maps used in A. mellifera.
SHEN Li; WANG Lei; YANG Hai-Feng; LIU Xiao-Jun; LIU Hong-Ping
We present a simple and efficient method for measuring atomic lifetimes on the order of tens of microseconds and demonstrate it in the lifetime determination of barium Rydberg states. This method extracts the lifetime information directly from the time-of-flight spectrum, which is much more efficient than other methods such as time-delayed field ionization and traditional laser-induced fluorescence. The lifetimes determined with our method for the barium Rydberg 6snp (n=37-59) series agree well with the values deduced from the experimentally determined absolute oscillator strengths of barium given in the literature [J. Phys. B 14 (1981) 4489, 29 (1996) 655].
Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B
We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.
Abstract Background With the availability of large-scale genome-wide association study (GWAS) data, choosing an optimal set of SNPs for disease susceptibility prediction is a challenging task. This study aimed to use single nucleotide polymorphisms (SNPs) to predict psoriasis by searching GWAS data. Methods In total, we had 2,798 samples and 451,724 SNPs. The process of searching for a set of SNPs to predict susceptibility to psoriasis consisted of two steps. The first was to search the GWAS dataset for the top 1,000 SNPs with high accuracy for prediction of psoriasis. The second was to search for an optimal SNP subset for predicting psoriasis. The sequential information bottleneck (sIB) method was compared with classical linear discriminant analysis (LDA) for classification performance. Results The best test harmonic mean of sensitivity and specificity for predicting psoriasis by sIB was 0.674 (95% CI: 0.650-0.698), while only 0.520 (95% CI: 0.472-0.524) was reported for predicting disease by LDA. Our results indicate that the new classifier sIB performs better than LDA in the study. Conclusions The fact that a small set of SNPs can predict disease status with average accuracy of 68% makes it possible to use SNP data for psoriasis prediction.
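The performance measure quoted above, the harmonic mean of sensitivity and specificity, can be computed from a confusion matrix as in the sketch below. The function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def sensitivity_specificity_hmean(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity for a binary classifier.

    tp/fn are the correctly/incorrectly classified cases,
    tn/fp are the correctly/incorrectly classified controls.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


# Hypothetical counts: sensitivity 0.8, specificity 0.6.
score = sensitivity_specificity_hmean(tp=80, fn=20, tn=60, fp=40)
```

Unlike plain accuracy, this measure is pulled toward the weaker of the two rates, so a classifier cannot score well by favoring only cases or only controls.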
The metabolic adaptation of dairy cows during the transition period has been studied intensively in the last decades. However, until now, only few studies have paid attention to the genetic aspects of this process. Here, we present the results of a gene-based mapping and pathway analysis with the measurements of three key metabolites, (1) non-esterified fatty acids (NEFA), (2) beta-hydroxybutyrate (BHBA) and (3) glucose, characterizing the metabolic adaptability of dairy cows before and after calving. In contrast to the conventional single-marker approach, we identify 99 significant and biologically sensible genes associated with at least one of the considered phenotypes, thus giving evidence for a genetic basis of the metabolic adaptability. Moreover, our results strongly suggest that three pathways involved in the metabolism of steroids and lipids are potential candidates for the adaptive regulation of dairy cows in their early lactation. From our perspective, a closer investigation of our findings will lead to a step forward in understanding the variability in the metabolic adaptability of dairy cows in their early lactation.
Wang, Qian; Fu, Lihong; Zhang, Xiaojing; Dai, Xinyu; Bai, Mei; Fu, Guangping; Cong, Bin; Li, Shujin
A previously developed multiplex assay with 44 individual identification SNPs was expanded to a 55plex assay. Fifty-four highly informative SNPs and an amelogenin sex marker were amplified in one PCR reaction and then detected with two SNaPshot reactions using CE. PCR primers for four loci, 28 single-base extension primers, and the reaction conditions were altered to improve the robustness of the method. A detailed approach for allele calling was developed to guide analysis of the electropherogram. One hundred and eighty unrelated individuals and 100 father-child-mother trios of the Han population in Hebei, China were analyzed. No mutation was found in the SNP loci. The combined mean match probability and cumulative probability of exclusion were 1.327 × 10^-22 and 0.999932, respectively. Analysis of the 54 SNPs and 26 STRs (included in the AmpFLSTR Identifiler and Investigator HDplex kits) showed no significant linkage disequilibrium. Our research shows that the expanded SNP multiplex assay is an easily performed and valuable method to supplement STR analysis.
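The two summary statistics reported above are usually obtained by combining per-locus values across independent loci. The sketch below assumes biallelic SNPs in Hardy-Weinberg equilibrium, independence between loci, and the textbook trio formula pq(1 - pq) for the per-locus probability of exclusion; the abstract does not state which formulas the authors used, and the function names are hypothetical.

```python
from math import prod


def locus_match_probability(p):
    """Chance that two unrelated people share a genotype at a biallelic SNP."""
    q = 1 - p
    # Sum of squared genotype frequencies under Hardy-Weinberg equilibrium.
    return p ** 4 + (2 * p * q) ** 2 + q ** 4


def locus_exclusion_probability(p):
    """Average exclusion probability for a biallelic marker in trio testing."""
    q = 1 - p
    return p * q * (1 - p * q)


def combined_statistics(allele_freqs):
    """Combined match probability and cumulative probability of exclusion."""
    cmp_value = prod(locus_match_probability(p) for p in allele_freqs)
    cpe_value = 1 - prod(1 - locus_exclusion_probability(p)
                         for p in allele_freqs)
    return cmp_value, cpe_value
```

Both per-locus values are most favorable at an allele frequency of 0.5, which is why panels of this kind select SNPs with balanced allele frequencies in the target population.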
Holt, Kathryn E
Abstract Background Salmonella Typhi (S. Typhi) causes typhoid fever, which remains an important public health issue in many developing countries. Kathmandu, the capital of Nepal, is an area of high incidence and the pediatric population appears to be at high risk of exposure and infection. Methods We recently defined the population structure of S. Typhi, using new sequencing technologies to identify nearly 2,000 single nucleotide polymorphisms (SNPs) that can be used as unequivocal phylogenetic markers. Here we have used the GoldenGate (Illumina) platform to simultaneously type 1,500 of these SNPs in 62 S. Typhi isolates causing severe typhoid in children admitted to Patan Hospital in Kathmandu. Results Eight distinct S. Typhi haplotypes were identified during the 20-month study period, with 68% of isolates belonging to a subclone of the previously defined H58 S. Typhi. This subclone was closely associated with resistance to nalidixic acid, with all isolates from this group demonstrating a resistant phenotype and harbouring the same resistance-associated SNP in GyrA (Phe83). A secondary clone, comprising 19% of isolates, was observed only during the second half of the study. Conclusions Our data demonstrate the utility of SNP typing for monitoring bacterial populations over a defined period in a single endemic setting. We provide evidence for genotype introduction and define a nalidixic acid resistant subclone of S. Typhi, which appears to be the dominant cause of severe pediatric typhoid in Kathmandu during the study period.
Crooks, Lucy; Carlborg, Örjan; Marklund, Stefan; Johansson, Anna M
We analyzed genotypes from ~10K single-nucleotide polymorphisms (SNPs) in two families of an F2 intercross between Red Junglefowl and White Leghorn chickens. Possible null alleles were found by patterns of incompatible and missing genotypes. We estimated that 2.6% of SNPs had null alleles compared with 2.3% with genotyping errors and that 40% of SNPs in which a parent and offspring were genotyped as different homozygotes had null alleles. Putative deletions were identified by null alleles at adjacent markers. We found two candidate deletions that were supported by fluorescence intensity data from a 60K SNP chip. One of the candidate deletions was from the Red Junglefowl, and one was present in both the Red Junglefowl and White Leghorn. Both candidate deletions spanned protein-coding regions and were close to a previously detected quantitative trait locus affecting body weight in this population. This study demonstrates that the ~50K SNP genotyping arrays now available for several agricultural species can be used to identify null alleles and deletions in data from large families. We suggest that our approach could be a useful complement to linkage analysis in experimental crosses.
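One of the patterns described above, a parent and its offspring genotyped as different homozygotes, is straightforward to screen for. The sketch below is a simplified illustration rather than the authors' method: the genotype coding ("AA", "AB", "BB", `None` for missing) and the function names are assumptions, and flagged SNPs are only candidates, since the abstract estimates that 40% of such SNPs carried null alleles.

```python
def opposite_homozygotes(parent, offspring):
    """True if parent and offspring are called as different homozygotes."""
    return {parent, offspring} == {"AA", "BB"}


def flag_candidate_null_snps(pairs_by_snp):
    """Count incompatible parent-offspring pairs per SNP.

    pairs_by_snp maps a SNP name to a list of (parent, offspring) genotype
    calls. Only SNPs with at least one incompatible pair are returned.
    """
    flagged = {}
    for snp, pairs in pairs_by_snp.items():
        n_bad = sum(opposite_homozygotes(par, off) for par, off in pairs)
        if n_bad > 0:
            flagged[snp] = n_bad
    return flagged


# Hypothetical calls for two SNPs.
example = {
    "snp1": [("AA", "AB"), ("AA", "BB"), ("BB", "AA")],
    "snp2": [("AB", "BB"), ("AA", None)],
}
candidates = flag_candidate_null_snps(example)
```

A parent carrying a null allele (A/null) is called AA and can transmit the null allele to an offspring that is then called BB, which is what makes this Mendelian inconsistency informative.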
Edriss, Vahid; Guldbrandtsen, Bernt; Lund, Mogens Sandø
contained 1071 Jersey bulls that were genotyped with the Illumina Bovine 50K chip. After preliminary editing, 39227 SNP remained in the dataset. Four methods to handle missing genotypes were: 1) BEAGLE: missing markers were imputed using Beagle 3.3 software, 2) COMMON: missing genotypes at a locus were...... that missing genotypes should be imputed in order to improve genomic prediction. Editing the marker data with stringent threshold on GenCall (GC) scores and then imputing the discarded genotypes did not lead to higher accuracy. All marker genotypes with a GC score over 0.15 should be retained for genomic...
Fan, Wei; Zong, Jie; Luo, Zhijing; Chen, Mingjiao; Zhao, Xiangxiang; Zhang, Dabing; Qi, Yiping; Yuan, Zheng
Rapid and accurate genome-wide marker detection is essential to the marker-assisted breeding and functional genomics studies. In this work, we developed an integrated software, AgroMarker Finder (AMF: http://erp.novelbio.com/AMF), for providing graphical user interface (GUI) to facilitate the recently developed restriction-site associated DNA (RAD) sequencing data analysis in rice. By application of AMF, a total of 90,743 high-quality markers (82,878 SNPs and 7,865 InDels) were detected between rice varieties JP69 and Jiaoyuan5A. The density of the identified markers is 0.2 per Kb for SNP markers, and 0.02 per Kb for InDel markers. Sequencing validation revealed that the accuracy of genome-wide marker detection by AMF is 93%. In addition, a validated subset of 82 SNPs and 31 InDels were found to be closely linked to 117 important agronomic trait genes, providing a basis for subsequent marker-assisted selection (MAS) and variety identification. Furthermore, we selected 12 markers from 31 validated InDel markers to identify seed authenticity of variety Jiaoyuanyou69, and we also identified 10 markers closely linked to the fragrant gene BADH2 to minimize linkage drag for Wuxiang075 (BADH2 donor)/Jiachang1 recombinants selection. Therefore, this software provides an efficient approach for marker identification from RAD-seq data, and it would be a valuable tool for plant MAS and variety protection.
Hedrich Hans J
Abstract Background The laboratory rat (Rattus norvegicus) is an important model for studying many aspects of human health and disease. Detailed knowledge on genetic variation between strains is important from a biomedical, particularly pharmacogenetic, point of view and useful for marker selection for genetic cloning and association studies. Results We show that Single Nucleotide Polymorphisms (SNPs) in commonly used rat strains are surprisingly well represented in wild rat isolates. Shotgun sequencing of 814 Kbp in one wild rat resulted in the identification of 485 SNPs as compared with the Brown Norway genome sequence. Genotyping 36 commonly used inbred rat strains showed that 84% of these alleles are also polymorphic in a representative set of laboratory rat strains. Conclusion We postulate that shotgun sequencing in a wild rat sample and subsequent genotyping in multiple laboratory or domesticated strains, rather than direct shotgun sequencing of multiple strains, could be the most efficient SNP discovery approach. For the rat, laboratory strains still harbor a large portion of the haplotypes present in wild isolates, suggesting a relatively recent common origin and supporting the idea that rat inbred strains, in contrast to mouse inbred strains, originate from a single species, R. norvegicus.
Hurgobin, Bhavna; Edwards, David
Increasing evidence suggests that a single individual is insufficient to capture the genetic diversity within a species due to gene presence absence variation. In order to understand the extent to which genomic variation occurs in a species, the construction of its pangenome is necessary. The pangenome represents the complete set of genes of a species; it is composed of core genes, which are present in all individuals, and variable genes, which are present only in some individuals. Aside from variations at the gene level, single nucleotide polymorphisms (SNPs) are also an important form of genetic variation. The advent of next-generation sequencing (NGS) coupled with the heritability of SNPs make them ideal markers for genetic analysis of human, animal, and microbial data. SNPs have also been extensively used in crop genetics for association mapping, quantitative trait loci (QTL) analysis, analysis of genetic diversity, and phylogenetic analysis. This review focuses on the use of pangenomes for SNP discovery. It highlights the advantages of using a pangenome rather than a single reference for this purpose. This review also demonstrates how extra information not captured in a single reference alone can be used to provide additional support for linking genotypic data to phenotypic data.
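The definitions of core and variable genes given above map directly onto set operations. The sketch below assumes that gene presence has already been called per individual; the function name and the toy gene identifiers are hypothetical.

```python
def split_pangenome(genes_by_individual):
    """Split a pangenome into core and variable genes.

    genes_by_individual maps an individual (or accession) name to the
    collection of gene identifiers found in it.
    """
    gene_sets = [set(genes) for genes in genes_by_individual.values()]
    if not gene_sets:
        raise ValueError("need at least one individual")
    pangenome = set().union(*gene_sets)      # genes present in any individual
    core = set.intersection(*gene_sets)      # genes present in all individuals
    variable = pangenome - core              # genes present in only some
    return pangenome, core, variable


# Toy presence calls for three individuals.
pan, core, variable = split_pangenome({
    "ind_a": {"g1", "g2", "g3"},
    "ind_b": {"g1", "g2"},
    "ind_c": {"g1", "g4"},
})
```

A SNP located in a variable gene such as `g4` in this toy example could not be discovered by mapping reads to a single reference built from `ind_a`, which is the practical argument for pangenome-based SNP discovery.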
BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software. RESULTS: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. CONCLUSIONS: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data.
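The general idea of holding genotypes from several studies in relational tables and filtering them with a join can be illustrated as below. This is not SNPpy's schema or API: it swaps in Python's standard-library `sqlite3` module instead of SQLAlchemy with PostgreSQL so that the sketch is self-contained, and the table names, column names and the `call_rate` filter are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE snp (snp_id TEXT PRIMARY KEY, call_rate REAL);
        CREATE TABLE genotype (
            study TEXT, sample_id TEXT, snp_id TEXT, genotype TEXT
        );
    """)
    conn.executemany("INSERT INTO snp VALUES (?, ?)",
                     [("rs1", 0.99), ("rs2", 0.80)])
    conn.executemany("INSERT INTO genotype VALUES (?, ?, ?, ?)", [
        ("study_a", "s1", "rs1", "AG"),
        ("study_a", "s1", "rs2", "CC"),
        ("hapmap", "h1", "rs1", "AA"),
    ])
    return conn


def merged_genotypes(conn, min_call_rate):
    """Genotypes from all studies, restricted to SNPs passing a filter."""
    query = """
        SELECT g.study, g.sample_id, g.snp_id, g.genotype
        FROM genotype AS g
        JOIN snp AS s ON s.snp_id = g.snp_id
        WHERE s.call_rate >= ?
        ORDER BY g.study, g.sample_id, g.snp_id
    """
    return conn.execute(query, (min_call_rate,)).fetchall()


rows = merged_genotypes(build_demo_db(), min_call_rate=0.95)
```

Because every study shares the same `snp` table, merging across studies reduces to selecting from one `genotype` table, and marker-level filters are applied once in the join.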
vonHoldt, Bridgett M; Pollinger, John P; Earl, Dent A; Parker, Heidi G; Ostrander, Elaine A; Wayne, Robert K
The ability to detect recent hybridization between dogs and wolves is important for conservation and legal actions, which often require accurate and rapid resolution of ancestry. The availability of a genetic test for dog-wolf hybrids would greatly support federal and legal enforcement efforts, particularly when the individual in question lacks prior ancestry information. We have developed a panel of 100 unlinked ancestry-informative SNP markers that can detect mixed ancestry within up to four generations of dog-wolf hybridization based on simulations of seven genealogical classes constructed following the rules of Mendelian inheritance. We establish 95 % confidence regions around the spatial clustering of each genealogical class using a tertiary plot of allele dosage and heterozygosity. The first- and second-backcrossed-generation hybrids were the most distinct from parental populations, with >90 % correctly assigned to genealogical class. In this article we provide a tool kit with population-level statistical quantification that can detect recent dog-wolf hybridization using a panel of dog-wolf ancestry-informative SNPs with divergent allele frequency distributions.
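The two axes used above to separate genealogical classes, allele dosage and heterozygosity, can be computed per individual as in this sketch. The coding is an assumption made for illustration: each genotype is the count (0, 1 or 2) of the wolf-associated allele at one ancestry-informative SNP, with `None` for a missing call, and the function name is hypothetical.

```python
def dosage_and_heterozygosity(genotypes):
    """Allele dosage and heterozygosity for one individual.

    genotypes: per-SNP counts of the wolf-associated allele (0, 1, 2),
    with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Fraction of all allele copies that are the wolf-associated allele.
    dosage = sum(called) / (2 * len(called))
    # Fraction of called SNPs that are heterozygous.
    heterozygosity = sum(1 for g in called if g == 1) / len(called)
    return dosage, heterozygosity


# With fully diagnostic markers, an F1 hybrid is heterozygous everywhere.
f1_like = dosage_and_heterozygosity([1, 1, 1, 1])
backcross_like = dosage_and_heterozygosity([2, 1, 2, 1, None])
```

In this idealized case an F1 sits at dosage 0.5 and heterozygosity 1.0, while a first backcross to wolf moves toward dosage 0.75 and heterozygosity 0.5. Real panels use markers with divergent rather than fixed allele frequencies, which is why the study relies on simulated confidence regions instead of these exact points.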
Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
The concurrent development of high-throughput genotyping platforms and next generation sequencing (NGS) has increased the number and density of genetic markers, the efficiency of constructing detailed linkage maps, and our ability to overlay recombination and physical maps of the genome. We developed an array for tomato with 8,784 Single Nucleotide Polymorphisms (SNPs) mainly discovered based on NGS-derived transcriptome sequences. Of the SNPs, 7,720 (88%) passed manufacturing quality control and could be scored in tomato germplasm. The array was used to generate high-density linkage maps for three interspecific F2 populations: EXPEN 2000 (Solanum lycopersicum LA0925 x S. pennellii LA0716, 79 individuals), EXPEN 2012 (S. lycopersicum Moneymaker x S. pennellii LA0716, 160 individuals), and EXPIM 2012 (S. lycopersicum Moneymaker x S. pimpinellifolium LA0121, 183 individuals). The EXPEN 2000-SNP and EXPEN 2012 maps consisted of 3,503 and 3,687 markers representing 1,076 and 1,229 unique map positions (genetic bins), respectively. The EXPEN 2000-SNP map had an average marker bin interval of 1.6 cM, while the EXPEN 2012 map had an average bin interval of 0.9 cM. The EXPIM 2012 map was constructed with 4,491 markers (1,358 bins) and an average bin interval of 0.8 cM. All three linkage maps revealed an uneven distribution of markers across the genome. The dense EXPEN 2012 and EXPIM 2012 maps showed high levels of colinearity across all 12 chromosomes, and also revealed evidence of small inversions between LA0716 and LA0121. Physical positions of 7,666 SNPs were identified relative to the tomato genome sequence. The genetic and physical positions were mostly consistent. Exceptions were observed for chromosomes 3, 10 and 12. Comparing genetic positions relative to physical positions revealed that genomic regions with high recombination rates were consistent with the known distribution of euchromatin across the 12 chromosomes, while very low recombination rates
Zhao Patrick X
Abstract Background Single nucleotide polymorphisms (SNPs) are the most common type of sequence variation among plants and are often functionally important. We describe the use of 454 technology and high resolution melting analysis (HRM) for high throughput SNP discovery in tetraploid alfalfa (Medicago sativa L.), a species with high economic value but limited genomic resources. Results The alfalfa genotypes selected from M. sativa subsp. sativa var. 'Chilean' and M. sativa subsp. falcata var. 'Wisfal', which differ in water stress sensitivity, were used to prepare cDNA from tissue of clonally-propagated plants grown under either well-watered or water-stressed conditions, and then pooled for 454 sequencing. Based on 125.2 Mb of raw sequence, a total of 54,216 unique sequences were obtained including 24,144 tentative consensus (TCs) sequences and 30,072 singletons, ranging from 100 bp to 6,662 bp in length, with an average length of 541 bp. We identified 40,661 candidate SNPs distributed throughout the genome. A sample of candidate SNPs was evaluated and validated using high resolution melting (HRM) analysis. A total of 3,491 TCs harboring 20,270 candidate SNPs were located on the M. truncatula (MT 3.5.1) chromosomes. Gene Ontology assignments indicate that sequences obtained cover a broad range of GO categories. Conclusions We describe an efficient method to identify thousands of SNPs distributed throughout the alfalfa genome covering a broad range of GO categories. Validated SNPs represent valuable molecular marker resources that can be used to enhance marker density in linkage maps, identify potential factors involved in heterosis and genetic variation, and as tools for association mapping and genomic selection in alfalfa.
Grandell, Ida; Samara, Raed; Tillmar, Andreas O
Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases for both identity testing and complex kinship issues. One major disadvantage with current capillary electrophoresis (CE) methods is the limitation in DNA marker multiplex capability. By utilizing massive parallel sequencing (MPS) technology, this capability can, however, be increased. We have designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published autosomal forensically relevant identity SNPs for analysis using MPS. One single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and the heterozygous balance had less than 0.5 % outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of the performance of degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful to gain information in forensic casework, for both identity testing and in order to solve complex kinship issues.
Zeng, Qifan; Fu, Qiang; Li, Yun; Waldbieser, Geoff; Bosworth, Brian; Liu, Shikai; Yang, Yujia; Bao, Lisui; Yuan, Zihao; Li, Ning; Liu, Zhanjiang
Single nucleotide polymorphisms (SNPs) are capable of providing the highest level of genome coverage for genomic and genetic analysis because of their abundance and relatively even distribution in the genome. Such a capacity, however, cannot be achieved without an efficient genotyping platform such as SNP arrays. In this work, we developed a high-density SNP array with 690,662 unique SNPs (herein 690 K array) that were relatively evenly distributed across the entire genome, and covered 98.6% of the reference genome sequence. Here we also report linkage mapping using the 690 K array, which allowed mapping of over 250,000 SNPs on the linkage map, the highest marker density among all the constructed linkage maps. These markers were mapped to 29 linkage groups (LGs) with 30,591 unique marker positions. This linkage map anchored 1,602 scaffolds of the reference genome sequence to LGs, accounting for over 97% of the total genome assembly. A total of 1,007 previously unmapped scaffolds were placed to LGs, allowing validation and in few instances correction of the reference genome sequence assembly. This linkage map should serve as a valuable resource for various genetic and genomic analyses, especially for GWAS and QTL mapping for genes associated with economically important traits. PMID:28079141
Ismail, Nor Asiah; Rafii, M Y; Mahmud, T M M; Hanafi, M M; Miah, Gous
Ginger is an economically important and valuable plant around the world. Ginger is used as a food, spice, condiment, medicine and ornament. There is available information on biochemical aspects of ginger, but few studies have been reported on its molecular aspects. The main objective of this review is to accumulate the available molecular marker information and its application in diverse ginger studies. This review article was prepared by combing material from published articles and our own research. Molecular markers allow the identification and characterization of plant genotypes through direct access to hereditary material. In crop species, molecular markers are applied in different aspects and are useful in breeding programs. In ginger, molecular markers are commonly used to identify genetic variation and classify the relatedness among varieties, accessions, and species. Consequently, it provides important input in determining resourceful management strategies for ginger improvement programs. Alternatively, a molecular marker could function as a harmonizing tool for documenting species. This review highlights the application of molecular markers (isozyme, RAPD, AFLP, SSR, ISSR and others such as RFLP, SCAR, NBS and SNP) in genetic diversity studies of ginger species. Some insights on the advantages of the markers are discussed. The detection of genetic variation among promising cultivars of ginger has significance for ginger improvement programs. This update of recent literature will help researchers and students select the appropriate molecular markers for ginger-related research.
Murmann, Andrea E; Conrad, Donald F; Mashek, Heather; Curtis, Chris A; Nicolae, Raluca I; Ober, Carole; Schwartz, Stuart
Acentric inverted duplication (inv dup) markers, the largest group of chromosomal abnormalities with neocentromere formation, are found in patients both with idiopathic mental retardation and with cancer. The mechanism of their formation has been investigated by analyzing the breakpoints and the genotypes of 12 inv dup marker cases (three trisomic, six tetrasomic, two polysomic and one X chromosome derived marker) using a combination of fluorescence in situ hybridization, quantitative SNP array and microsatellite analysis. Inv dup markers were found to form either symmetrically with one breakpoint or asymmetrically with two distinct breakpoints. Genotype analyses revealed that all inv dup markers formed from one single chromatid end. This observation is incompatible with the previously suggested model by which the acentric inv dup markers form through inter-chromosomal U-type exchange. On the basis of the identification of DNA sequence motifs with inverted homologies within all observed breakpoint regions, a new general mechanism is proposed for acentric inv dup marker formation: following a double-strand break, an acentric fragment forms during either meiosis or mitosis. The open DNA end of the acentric fragment is stabilized by the formation of an intra-chromosomal loop promoted by the presence of sequences with inverted homologies. Likely coinciding with neocentromere formation, this stabilized fragment is duplicated during an early mitotic event, ensuring the marker's survival during cell division and its presence in all cells.
BACKGROUND: Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO is not guaranteed to find the best result in every run, especially when high-dimensional data are investigated for SNP-SNP interactions. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we propose the IPSO algorithm to improve the reliability of PSO for the identification of the best protective SNP barcodes (SNP combinations and genotypes with maximum difference between cases and controls) associated with breast cancer. SNP barcodes containing different numbers of SNPs were computed. The top five SNP barcode results are retained for computing the next SNP barcode, with a one-SNP increase at each processing step. The performance of the proposed IPSO algorithm is evaluated on simulated data for 23 SNPs of six steroid hormone metabolism and signalling-related genes. Among the 23 SNPs, 13 displayed significant odds ratio (OR) values (1.268 to 0.848; p<0.05) for breast cancer. With the IPSO algorithm, the joint effects of SNP barcodes with two to seven SNPs show significantly decreasing OR values (0.84 to 0.57; p<0.05 to 0.001). With the PSO algorithm, barcodes with two to four SNPs show significantly decreasing OR values (0.84 to 0.77; p<0.05 to 0.001). Based on the results of 20 simulations, the medians of the maximum differences for each SNP barcode generated by IPSO are higher than those generated by PSO. The interquartile ranges of the boxplot, as well as the upper and lower hinges for each n-SNP barcode (n = 3∼10), are narrower in IPSO than in PSO, suggesting that IPSO is highly reliable for SNP barcode identification. CONCLUSIONS/SIGNIFICANCE: Overall, the proposed IPSO algorithm is robust in identifying the best protective SNP barcodes for breast cancer.
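The "maximum difference between cases and controls" criterion above can be illustrated with a minimal sketch. It assumes that each sample is a mapping from SNP name to genotype code and that a barcode is scored as the number of controls carrying it minus the number of cases carrying it; the function names, the data layout and the scoring rule are illustrative assumptions, not the published IPSO implementation, and no swarm search is performed here.

```python
# Illustrative only: the data layout and the scoring rule are assumptions,
# not the published IPSO implementation.

def count_matches(samples, barcode):
    """Count samples whose genotypes match every (SNP, genotype) pair in the barcode."""
    return sum(
        all(sample.get(snp) == genotype for snp, genotype in barcode.items())
        for sample in samples
    )


def barcode_difference(cases, controls, barcode):
    """Controls minus cases carrying the barcode; larger values suggest a more protective barcode."""
    return count_matches(controls, barcode) - count_matches(cases, barcode)


# Toy data with hypothetical SNP names and genotype codes.
cases = [
    {"snp1": "AA", "snp2": "AB"},
    {"snp1": "AB", "snp2": "BB"},
]
controls = [
    {"snp1": "AA", "snp2": "AB"},
    {"snp1": "AA", "snp2": "AB"},
    {"snp1": "AB", "snp2": "AB"},
]
barcode = {"snp1": "AA", "snp2": "AB"}
difference = barcode_difference(cases, controls, barcode)
```

A search procedure such as PSO would evaluate a score of this kind for many candidate barcodes and keep the best ones before adding a further SNP.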
Teh, Soon Li; Fresnedo-Ramírez, Jonathan; Clark, Matthew D; Gadoury, David M; Sun, Qi; Cadle-Davidson, Lance; Luby, James J
Quantitative trait locus (QTL) identification in perennial fruit crops is impeded largely by their lengthy generation time, resulting in costly and labor-intensive maintenance of breeding programs. In a grapevine (genus Vitis) breeding program, although experimental families are typically unreplicated, the genetic backgrounds may contain similar progenitors previously selected due to their contribution of favorable alleles. In this study, we investigated the utility of joint QTL identification provided by analyzing half-sib families. The genetic control of powdery mildew was studied using two half-sib F1 families, namely GE0711/1009 (MN1264 × MN1214; N = 147) and GE1025 (MN1264 × MN1246; N = 125) with multiple species in their ancestry. Maternal genetic maps consisting of 1077 and 1641 single nucleotide polymorphism (SNP) markers, respectively, were constructed using a pseudo-testcross strategy. Ratings of field resistance to powdery mildew were obtained based on whole-plant evaluation of disease severity. This 2-year analysis uncovered two QTLs that were validated on a consensus map in these half-sib families with improved precision relative to the parental maps. Examination of haplotype combinations based on the two QTL regions identified strong association of haplotypes inherited from 'Seyval blanc', through MN1264, with powdery mildew resistance. This investigation also encompassed the use of microsatellite markers to establish a correlation between 206-bp (UDV-015b) and 357-bp (VViv67) fragment sizes with resistance-carrying haplotypes. Our work is one of the first reports in grapevine demonstrating the use of SNP-based maps and haplotypes for QTL identification and tagging of powdery mildew resistance in half-sib families.
Yang, Shun-Fa; Wu, Tzu-Fan; Tsai, Hsiu-Ting; Lin, Long-Yau; Wang, Po-Hui
Pelvic inflammatory disease (PID) is a common infection in women of reproductive age. However, diagnosis of PID can be difficult due to the wide variation in the symptoms and signs, ranging from subtle or mild symptoms to severe pain in the lower abdomen. Clinical diagnosis alone has only 87% sensitivity and 50% specificity. Therefore, identifying biological factors that are useful for early diagnosis and correlating their expression with the severity of PID could provide significant benefits to women suffering from PID. Pentraxin 3 (PTX3), E-cadherin, myeloperoxidase, stromal cell-derived factor 1 (SDF-1) and the matrix metalloproteinase-9 (MMP-9)/MMP-2 ratio are potential candidates for detecting PID reliably. As PID is often subtle, highly sensitive PID detection methods are needed to promote the prevention of severe sequelae. Growth arrest-specific 6 (Gas6), in combination with its soluble tyrosine kinase receptor, sAxl, could elevate the sensitivity to 92%, which was higher than all other markers tested. Moreover, PTX3, D-dimer and YKL-40 concentrations can predict the clinical course of PID. Although single nucleotide polymorphisms of biomarker genes are not associated with the development of PID, myeloperoxidase SNP -463 G/A and SDF-1 SNP 801 G/A may affect the aggravated expression of their biomarkers in PID. Copyright © 2014. Published by Elsevier B.V.
Geraldes, A; Difazio, S P; Slavov, G T; Ranjan, P; Muchero, W; Hannemann, J; Gunter, L E; Wymore, A M; Grassa, C J; Farzaneh, N; Porth, I; McKown, A D; Skyba, O; Li, E; Fujita, M; Klápště, J; Martin, J; Schackwitz, W; Pennacchio, C; Rokhsar, D; Friedmann, M C; Wasteneys, G O; Guy, R D; El-Kassaby, Y A; Mansfield, S D; Cronk, Q C B; Ehlting, J; Douglas, C J; Tuskan, G A
Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost-effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids.
Pavan, Márcio G.; Mesquita, Rafael D.; Lawrence, Gena G.; Lazoski, Cristiano; Dotson, Ellen M.; Abubucker, Sahar; Mitreva, Makedonka; Randall-Maher, Jennifer; Monteiro, Fernando A.
The design and application of rational strategies that rely on accurate species identification are pivotal for effective vector control. When morphological identification of the target vector species is impractical, the use of molecular markers is required. Here we describe a non-coding, single-copy nuclear DNA fragment that contains a single-nucleotide polymorphism (SNP) with the potential to distinguish the important domestic Chagas disease vector, Rhodnius prolixus, from members of the four sylvatic Rhodnius robustus cryptic species complex. A total of 96 primer pairs obtained from whole genome shotgun sequencing of the R. prolixus genome (12,626 random reads) were tested on 43 R. prolixus and R. robustus s.l. samples. One of the seven amplicons selected (AmpG) presented a SNP, potentially diagnostic for R. prolixus, at the 280th site. The diagnostic nature of this SNP was then tested on 154 R. prolixus and R. robustus s.l. samples chosen to achieve the widest possible geographic coverage. The results of a 60% majority rule Bayesian consensus tree and a median-joining network constructed from the observed genetic variability reveal the paraphyletic nature of the R. robustus species complex with respect to R. prolixus. The AmpG region is located in the fourth intron of the Transmembrane protein 165 gene, which seems to be on the R. prolixus X chromosome. Other possible chromosomal locations of the AmpG region in the R. prolixus genome are also presented and discussed. PMID:23219914
Lee, Jea-Young; Lee, Jong-Hyeong; Yeo, Jung-Sou; Kim, Jong-Joo
The purpose of this study was to investigate interaction effects of genes using a Harvester method. A sample of Hanwoo (Korean cattle) steers (n = 476), sired by 50 Korean proven bulls, was chosen from the National Livestock Research Institute (NLRI) of Korea. The steers were born between the spring of 1998 and the autumn of 2002 and reared under a progeny-testing program at the Daekwanryeong and Namwon branches of NLRI. The steers were slaughtered at approximately 24 months of age and carcass quality traits were measured. A SNP Harvester method was applied with a support vector machine (SVM) to detect significant SNPs in the CCDC158 gene and interaction effects between the SNPs that were associated with average daily gains, cold carcass weight, longissimus dorsi muscle area, and marbling scores. The statistical significance of the major SNP combinations was evaluated with χ²-statistics. The genotype combinations of three SNPs, g.34425+102 A>T(AA), g.4102636T>G(GT), and g.11614+19G>T(GG), had a greater effect than the rest of the SNP combinations, e.g. 0.82 vs. 0.75 kg, 343 vs. 314 kg, 80.4 vs. 74.7 cm², and 7.35 vs. 5.01, for the four respective traits (p [...]). The SNP Harvester method is a good option when multiple SNPs and interaction effects are tested. The significant SNPs could be applied to improve meat quality of Hanwoo via marker-assisted selection.
Evangelina López de Maturana
The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest (AUC-RF), a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. Thirteen SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study, we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factors for this tumor. A 9-SNP signature was detected by BTL. Here we report, for the first time, a set of SNPs in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk.
Background: Arrayed primer extension (APEX) is a microarray-based rapid minisequencing methodology that may have utility in 'personalized medicine' applications that involve genetic diagnostics of single nucleotide polymorphisms (SNPs). However, to date there have been few reports that objectively evaluate the assay completion rate, call rate and accuracy of APEX. We have further developed robust assay design, chemistry and analysis methodologies, and have sought to determine how effective APEX is in comparison to leading 'gold-standard' genotyping platforms. Our methods have been tested against industry-leading technologies in two blinded experiments based on Coriell DNA samples and SNP genotype data from the International HapMap Project. Results: In the first experiment, we genotyped 50 SNPs across the entire 270 HapMap Coriell DNA sample set. For each Coriell sample, DNA template was amplified in a total of 7 multiplex PCRs prior to genotyping. We obtained good results for 41 of the SNPs, with 99.8% genotype concordance with HapMap data, at an automated call rate of 94.9% (not including the 9 failed SNPs). In the second experiment, involving modifications to the initial DNA amplification so that a single 50-plex PCR could be achieved, genotyping of the same 50 SNPs across each of 49 randomly chosen Coriell DNA samples allowed extremely robust 50-plex genotyping from as little as 5 ng of DNA, with 100% assay completion rate, 100% call rate and >99.9% accuracy. Conclusion: We have shown our methods to be effective for robust multiplex SNP genotyping using APEX, with 100% call rate and >99.9% accuracy. We believe that such methodology may be useful in future point-of-care clinical diagnostic applications where accuracy and call rate are both paramount.
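The call rate and concordance figures quoted above can be made concrete with a short sketch. It assumes the common definitions (call rate as the fraction of assays that yield a genotype; concordance as the fraction of called genotypes that agree with a reference such as HapMap); the abstract does not spell out its exact definitions, and the function names and toy genotypes below are hypothetical.

```python
# Assumed definitions; the abstract does not state how its rates were computed.

def call_rate(calls):
    """Fraction of assays that produced a genotype call (None marks a no-call)."""
    if not calls:
        raise ValueError("no assays supplied")
    return sum(call is not None for call in calls) / len(calls)


def concordance(calls, reference):
    """Fraction of called genotypes that agree with the reference genotypes."""
    pairs = [
        (call, ref)
        for call, ref in zip(calls, reference)
        if call is not None and ref is not None
    ]
    if not pairs:
        raise ValueError("no comparable genotype pairs")
    return sum(call == ref for call, ref in pairs) / len(pairs)


# Toy genotypes (hypothetical): one no-call and one discordant call.
calls = ["AA", "AG", None, "GG"]
reference = ["AA", "AG", "GG", "AG"]
rate = call_rate(calls)
agreement = concordance(calls, reference)
```

Keeping the two measures separate matters because a platform can reach high concordance simply by declining to call difficult genotypes, which lowers the call rate.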
Background: Copy number variation (CNV) is essential to understand the pathology of many complex diseases at the DNA level. Affymetrix SNP arrays, which are widely used for CNV studies, significantly depend on accurate copy number (CN) estimation. Nevertheless, CN estimation may be biased by several factors, including cross-hybridization and training sample batch, as well as genomic waves of intensities induced by sequence-dependent hybridization rate and amplification efficiency. Since many available algorithms only address one or two of the three factors, a high false discovery rate (FDR) often results when identifying CNV. Therefore, we have developed a new CNV detection pipeline which is based on hybridization and amplification rate correction (CNVhac). Methods: CNVhac first estimates the allelic concentrations (ACs) of target sequences by using the sample-independent parameters trained through physicochemical hybridization law. Then the raw CN is estimated by taking the ratio of AC to the corresponding average AC from a reference sample set for one specific site. Finally, a hidden Markov model (HMM) segmentation process is implemented to detect CNV regions. Results: Based on public HapMap data, the results show that CNVhac effectively smooths the genomic waves and facilitates more accurate raw CN estimates compared to other methods. Moreover, CNVhac alleviates, to a certain extent, the sample dependence of inference and enables CNV calling with appreciably low FDRs. Conclusion: CNVhac is an effective approach to address the common difficulties in SNP array analysis, and the working principles of CNVhac can be easily extended to other platforms.
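The ratio step described in the Methods can be sketched as follows. The abstract only states that the raw CN is the ratio of a site's AC to the average AC of a reference sample set; scaling that ratio by a normal diploid copy number of 2 is an added assumption, and the function name and toy values are hypothetical rather than taken from CNVhac.

```python
from statistics import mean


def raw_copy_number(sample_ac, reference_acs, normal_copies=2.0):
    """Ratio of the sample AC to the mean reference AC at one site.

    Multiplying by normal_copies (assumed diploid, 2) is an illustrative
    choice; the abstract describes only the ratio itself.
    """
    reference_mean = mean(reference_acs)
    if reference_mean <= 0:
        raise ValueError("reference allelic concentrations must be positive")
    return normal_copies * sample_ac / reference_mean


# Toy values (hypothetical): a site with 1.5 times the reference concentration.
estimate = raw_copy_number(1.5, [1.0, 1.0, 1.0])
```

In the pipeline as described, per-site estimates of this kind would then be passed to the HMM segmentation step, which is not sketched here.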
Glover, Kevin A.; Hansen, Michael Møller; Lien, Sigbjørn
between SNP and STR data sets and variants thereof. The best 15 SNPs (30 alleles) gave a similar level of self-assignment to the best 4 STR loci (83 alleles), however, addition of further STR loci did not lead to a notable increase assignment whereas addition of up to 100 SNP loci increased assignment...
Jyh-Der Leu; I-Feng Lin; Ying-Fang Sun; Su-Mei Chen; Chih-Chao Liu; Yi-Jang Lee
AIM: To investigate the risk association and compare the onset age of hepatocellular carcinoma (HCC) patients in Taiwan with different genotypes of MDM2-SNP309. METHODS: We analyzed MDM2-SNP309 genotypes from 58 patients with HCC and 138 cancer-free healthy controls consecutively. Genotyping of MDM2-SNP309 was conducted by restriction fragment length polymorphism assay. RESULTS: The proportion of homozygous MDM2-SNP309 genotype (G/G) in cases and cancer-free healthy controls was similar (17.2% vs 16.7%). Multivariate analysis showed that the risk of G/G genotype of MDM2-SNP309 vs wild-type T/T genotype in patients with HCC was not significant (OR = 1.265, 95% CI = 0.074-21.77) after adjustment for sex, hepatitis B or C virus infection, age, and cardiovascular disease/diabetes. Nevertheless, there was a trend that GG genotype of MDM2-SNP309 might increase the risk in HCC patients infected with hepatitis virus (OR = 2.568, 95% CI = 0.054-121.69). Besides, the homozygous MDM2-SNP309 genotype did not exhibit a significantly earlier age of onset for HCC. CONCLUSION: Current data suggest that the association between MDM2-SNP309 GG genotype and HCC is not significant, while the risk may be enhanced in patients infected by hepatitis virus in Taiwan.
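For readers unfamiliar with how an odds ratio and its confidence interval are obtained from a 2×2 table, a crude, unadjusted calculation is sketched below. The counts are back-calculated from the reported percentages (17.2% of 58 ≈ 10 G/G cases; 16.7% of 138 ≈ 23 G/G controls) and are therefore assumptions. The comparison is G/G versus all other genotypes, so it does not reproduce the adjusted G/G versus T/T estimates reported above.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald 95% confidence interval from a 2x2 table."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    standard_error = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Assumed counts: 10 of 58 cases and 23 of 138 controls carry G/G.
crude_or, ci_lower, ci_upper = odds_ratio_wald_ci(10, 48, 23, 115)
```

With these assumed counts the crude odds ratio is close to 1 and the interval spans 1, which is in line with the abstract's conclusion that the association is not significant.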
A custom 60K SNP panel, extracted from Bovine HD SNP chip was used to evaluate genotypic frequency changes in Braford (BF, a composite breed) when compared to progenitor breeds: Hereford (HF), Brahman (BR), and Nelore (NE). Samples from both the U. S. and Brazil were used. The new panel differentiat...
Shi, Shanshan; Lin, Shaobin; Liao, Yanfen; Li, Weijing
To analyze a case with Angelman syndrome (AS) using single nucleotide polymorphism array (SNP array) and explore its genotype-phenotype correlation. G-banded karyotyping and SNP array were performed on a child featuring congenital malformations, intellectual disability and developmental delay. Mendelian error checking based on the SNP information was used to delineate the parental origin of detected abnormality. Result of the SNP array was validated with fluorescence in situ hybridization (FISH). The SNP array has detected a 6.053 Mb deletion at 15q11.2q13.1 (22,770,421- 28,823,722) which overlapped with the critical region of AS (type 1). The parents of the child showed no abnormal results for G-banded karyotyping, SNP array and FISH analysis, indicating a de novo origin of the deletion. Mendelian error checking based on the SNP information suggested that the 15q11.2q13.1 deletion was of maternal origin. SNP array can accurately define the size, location and parental origin of chromosomal microdeletions, which may facilitate the diagnosis of AS due to 15q11q13 deletion and better understanding of its genotype-phenotype correlation.
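Two steps in this abstract lend themselves to a short sketch: checking the deletion size against the reported coordinates, and the logic of Mendelian error checking inside a deleted region. The coordinates come from the abstract; the genotype encoding ("AA", "AB", "BB") and the function name are hypothetical, and the rule shown is a simplified illustration rather than the authors' procedure.

```python
# Deletion size from the coordinates reported in the abstract.
start_bp, end_bp = 22_770_421, 28_823_722
deletion_size_mb = (end_bp - start_bp) / 1_000_000  # 6.053301 Mb


def missing_parental_allele(child, mother, father):
    """Infer which parent's allele is absent at one SNP inside a deletion.

    Simplified assumption: within a heterozygous deletion the child is
    hemizygous, so the array reports an apparently homozygous genotype.
    If that allele cannot have come from one parent, the child's remaining
    allele came from the other parent and the deleted copy is the first
    parent's.
    """
    allele = child[0]
    if allele not in mother and allele in father:
        return "maternal"
    if allele not in father and allele in mother:
        return "paternal"
    return "uninformative"


# Toy trio (hypothetical genotypes): the child shows only B, the mother has no B.
origin = missing_parental_allele("BB", "AA", "AB")
```

In practice many SNPs across the region would be examined, and only the informative ones contribute to the call of parental origin.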
Tomas Mas, Carmen; Børsting, Claus; Morling, Niels
SNPs are being increasingly used by forensic laboratories. Different platforms have been developed for SNP typing. We describe the GenPlex™ HID system protocol, a new SNP-typing platform developed by Applied Biosystems where 48 of the 52 SNPforID SNPs and amelogenin are included. The GenPlex™ HID...
Background: Comparative teleost studies are of great interest since they are important in aquaculture and in evolutionary issues. Comparing genomes of fully sequenced model fish species with those of farmed fish species through comparative mapping offers shortcuts for quantitative trait loci (QTL) detections and for studying genome evolution through the identification of regions of conserved synteny in teleosts. Here a comparative mapping study is presented by radiation hybrid (RH) mapping genes of the gilthead sea bream Sparus aurata, a non-model teleost fish of commercial and evolutionary interest, as it represents the worldwide distributed species-rich family of Sparidae. Results: An additional 74 microsatellite markers and 428 gene-based markers appropriate for comparative mapping studies were mapped on the existing RH map of Sparus aurata. The anchoring of the RH map to the genetic linkage map resulted in 24 groups matching the karyotype of Sparus aurata. Homologous sequences to Tetraodon were identified for 301 of the gene-based markers positioned on the RH map of Sparus aurata. Comparison between Sparus aurata RH groups and Tetraodon chromosomes (karyotype of Tetraodon consists of 21 chromosomes) in this study reveals an unambiguous one-to-one relationship suggesting that three Tetraodon chromosomes correspond to six Sparus aurata radiation hybrid groups. The exploitation of this conserved synteny relationship is furthermore demonstrated by in silico mapping of gilthead sea bream expressed sequence tags (EST) that give a significant similarity hit to Tetraodon. Conclusion: The addition of primarily gene-based markers increased substantially the density of the existing RH map and facilitated comparative analysis. The anchoring of this gene-based radiation hybrid map to the genome maps of model species broadened the pool of candidate genes that mainly control growth, disease resistance, sex determination and reversal, reproduction as well
Doron, Shany; Shweiki, Dorit
SNP-based research strongly affects our biomedical and clinically associated knowledge. Nonunique and false-positive SNP existence in commonly used datasets may thus lead to biased, inaccurate clinically associated conclusions. We designed a computational study to reveal the degree of nonunique/false-positive SNPs in the HapMap dataset. Two sets of SNP flanking sequences were used as queries for BLAT analysis against the human genome. 4.2% and 11.9% of HapMap SNPs align to the genome nonuniquely (long and short, respectively). Furthermore, an average of 7.9% nonunique SNPs are included in common commercial genotyping arrays (according to our designed probes). Nonunique SNPs identified in this study are represented to various degrees in clinically associated databases, stressing the consequence of inaccurate SNP annotation and hence SNP utilization. Unfortunately, our results question some disease-related genotyping analyses, raising a worrisome concern on their validity.
Wong, Melissa M L; Cannon, Charles H; Wickneswari, Ratnam
Next Generation Sequencing has provided comprehensive, affordable and high-throughput DNA sequences for Single Nucleotide Polymorphism (SNP) discovery in Acacia auriculiformis and Acacia mangium. Like other non-model species, SNP detection and genotyping in Acacia are challenging due to lack of genome sequences. The main objective of this study is to develop the first high-throughput SNP genotyping assay for linkage map construction of A. auriculiformis x A. mangium hybrids. We identified a total of 37,786 putative SNPs by aligning short read transcriptome data from four parents of two Acacia hybrid mapping populations using Bowtie against 7,839 de novo transcriptome contigs. Given a set of 10 validated SNPs from two lignin genes, our in silico SNP detection approach is highly accurate (100%) compared to the traditional in vitro approach (44%). Further validation of 96 SNPs using Illumina GoldenGate Assay gave an overall assay success rate of 89.6% and conversion rate of 37.5%. We explored possible factors lowering assay success rate by predicting exon-intron boundaries and paralogous genes of Acacia contigs using Medicago truncatula genome as reference. This assessment revealed that presence of exon-intron boundary is the main cause (50%) of assay failure. Subsequent SNPs filtering and improved assay design resulted in assay success and conversion rate of 92.4% and 57.4%, respectively based on 768 SNPs genotyping. Analysis of clustering patterns revealed that 27.6% of the assays were not reproducible and flanking sequence might play a role in determining cluster compression. In addition, we identified a total of 258 and 319 polymorphic SNPs in A. auriculiformis and A. mangium natural germplasms, respectively. We have successfully discovered a large number of SNP markers in A. auriculiformis x A. mangium hybrids using next generation transcriptome sequencing. By using a reference genome from the most closely related species, we converted most SNPs to successful
BACKGROUND: Sickle cell anemia is caused by a single type of mutation, a homozygous A→T substitution in the β globin gene. Clinical severity is diverse, partially due to additional, disease-modifying genetic factors. We are studying one such modifier locus, HMIP (HBS1L-MYB intergenic polymorphism, chromosome 6q23.3). Working with a genetically admixed patient population, we have encountered the necessity to generate haplotype signatures of genetic markers to label genomic fragments with distinct genealogical origin at this locus. With the goal to generate haplotype signatures from patients experimentally, we have investigated the suitability of an existing nanofluidic assay platform to perform phase alignment with single-nucleotide polymorphism alleles. METHODOLOGY/PRINCIPAL FINDINGS: Patient DNA samples were loaded onto Fluidigm Digital Arrays and individual DNA molecules were assayed with allele-specific probes for SNP markers. Here we present data showing the utility of the nanofluidic approach, yielding haplotype data identical to those obtained with a family-based method. We then determined haplotype composition in a group of patients with sickle cell disease, including in those where a mathematical inference approach gave ambiguous or misleading results. Experimental phasing of genotypes across 3.8 kb for rs9399137, rs9402685, and rs11759553 created unequivocal haplotype signatures for each of the patients. In 68 patients, we found 8 copies of a haplotype signature ('C-C-T'), which is known to be prevalent in Europeans but to be absent in West African populations. We have confirmed the identity of our phased allele pairs by single-molecule sequencing and have demonstrated, in principle, that three-allele phasing (using three colors) is a potential extension to this method. CONCLUSIONS/SIGNIFICANCE: Phased haplotypes yield more information than the individual marker genotypes. Procedures such as the one described here would therefore
Alonso, Santos; Boyano, M. Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A.; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M.; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L.; Gardeazabal, Jesus; de Lizarduy, Iñigo Martinez; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria
As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date. PMID:21559390
Fischer, Carina; Trajanoski, Slave; Papić, Lea; Windpassinger, Christian; Bernert, Günther; Freilinger, Michael; Schabhüttl, Maria; Arslan-Kirchner, Mine; Javaher-Haghighi, Poupak; Plecko, Barbara; Senderek, Jan; Rauscher, Christian; Löscher, Wolfgang N; Pieber, Thomas R; Janecke, Andreas R; Auer-Grumbach, Michaela
Considerable non-allelic heterogeneity for autosomal recessively inherited Charcot-Marie-Tooth (ARCMT) disease has challenged molecular testing, which often requires a large amount of DNA sequencing and data interpretation or remains impractical. This study tested the value of SNP array-based whole-genome homozygosity mapping as a first step in the molecular genetic diagnosis of sporadic CMT or ARCMT in patients from inbred families or from outbred populations whose ancestors originate from the same geographic area. Using 10 K 2.0 and 250 K Nsp Affymetrix SNP arrays, 15 (63%) of 24 CMT patients received an accurate genetic diagnosis. We used our Java-based script eHoPASA CMT (easy Homozygosity Profiling of SNP arrays for CMT patients) to display the location of homozygous regions and their extent in marker count and base pairs throughout the whole genome. CMT4C was the most common genetic subtype with mutations detected in SH3TC2, one (p.E632Kfs13X) appearing to be a novel founder mutation. A sporadic patient with severe CMT was homozygous for the c.250G > C (p.G84R) HSPB1 mutation, which has previously been reported to cause autosomal dominant dHMN. Two distantly related CMT1 patients with early disease onset were found to carry a novel homozygous mutation in MFN2 (p.N131S). We conclude that SNP array-based homozygosity mapping is a fast, powerful, and economic tool to guide molecular genetic testing in ARCMT and in selected sporadic CMT patients.
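The idea of reporting homozygous regions by marker count and base-pair extent can be illustrated with a minimal Python sketch. This is not the eHoPASA CMT script, which is Java-based and whose internals are not described in the abstract. The genotype coding (`AA`, `AB`, `BB`, `NC` for no-call), the function name `homozygous_runs`, and the `min_markers` threshold are all assumptions made for illustration, and a no-call simply ends a run here.

```python
def homozygous_runs(genotypes, positions, min_markers=3):
    """Return (start_bp, end_bp, marker_count) for each run of consecutive
    homozygous calls that contains at least `min_markers` markers.

    genotypes: calls along one chromosome, e.g. "AA", "AB", "BB", "NC"
               (hypothetical coding; "NC" = no-call, treated as a break).
    positions: base-pair position of each marker, same order and length.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have equal length")

    runs = []
    start = None  # index of the first marker in the current run, if any
    for i, call in enumerate(genotypes):
        is_hom = call in ("AA", "BB")
        if is_hom and start is None:
            start = i
        elif not is_hom and start is not None:
            count = i - start
            if count >= min_markers:
                runs.append((positions[start], positions[i - 1], count))
            start = None
    # Close a run that extends to the last marker.
    if start is not None:
        count = len(genotypes) - start
        if count >= min_markers:
            runs.append((positions[start], positions[-1], count))
    return runs


if __name__ == "__main__":
    calls = ["AA", "BB", "AA", "AB", "BB", "BB", "NC", "AA", "AA", "AA", "BB"]
    pos = [100 * (i + 1) for i in range(len(calls))]
    for start_bp, end_bp, n in homozygous_runs(calls, pos):
        print(f"{start_bp}-{end_bp} bp: {n} markers, {end_bp - start_bp} bp long")
```

A real analysis would also tolerate occasional genotyping errors inside a run and set thresholds by array density; the sketch only shows the bookkeeping.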
Selinski, Silvia; Blaszkewicz, Meinolf; Lehmann, Marie-Louise; Ovsiannikov, Daniel; Moormann, Oliver; Guballa, Christoph; Kress, Alexander; Truss, Michael C; Gerullis, Holger; Otto, Thomas; Barski, Dimitri; Niegisch, Günter; Albers, Peter; Frees, Sebastian; Brenner, Walburgis; Thüroff, Joachim W; Angeli-Greaves, Miriam; Seidel, Thilo; Roth, Gerhard; Dietrich, Holger; Ebbinghaus, Rainer; Prager, Hans M; Bolt, Hermann M; Falkenstein, Michael; Zimmermann, Anna; Klein, Torsten; Reckwitz, Thomas; Roemer, Hermann C; Löhlein, Dietrich; Weistenhöfer, Wobbeke; Schöps, Wolfgang; Hassan Rizvi, Syed Adibul; Aslam, Muhammad; Bánfi, Gergely; Romics, Imre; Steffens, Michael; Ekici, Arif B; Winterpacht, Andreas; Ickstadt, Katja; Schwender, Holger; Hengstler, Jan G; Golka, Klaus
Genotyping N-acetyltransferase 2 (NAT2) is of high relevance for individualized dosing of antituberculosis drugs and bladder cancer epidemiology. In this study we compared a recently published tagging single nucleotide polymorphism (SNP) (rs1495741) to the conventional 7-SNP genotype (G191A, C282T, T341C, C481T, G590A, A803G and G857A haplotype pairs) and systematically analysed whether novel SNP combinations outperform the latter. For this purpose, we genotyped 3177 individuals by PCR and phenotyped 344 individuals by the caffeine test. Although the tagSNP and the 7-SNP genotype showed a high degree of correlation (R=0.933, P<0.0001), the 7-SNP genotype nevertheless outperformed the tagging SNP with respect to specificity (1.0 vs. 0.9444, P=0.0065). Considering all possible SNP combinations in a receiver operating characteristic analysis, we identified a 2-SNP genotype (C282T, T341C) that outperformed the tagging SNP and was equivalent to the 7-SNP genotype. The 2-SNP genotype predicted the correct phenotype with a sensitivity of 0.8643 and a specificity of 1.0. In addition, it predicted the 7-SNP genotype with sensitivity and specificity of 0.9993 and 0.9880, respectively. The prediction of the NAT2 genotype by the 2-SNP genotype performed similarly in populations of Caucasian, Venezuelan and Pakistani background. A 2-SNP genotype predicts NAT2 phenotypes with similar sensitivity and specificity as the conventional 7-SNP genotype. This procedure facilitates individualized dosing of NAT2 substrates without losing sensitivity or specificity.
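Sensitivity and specificity figures such as those quoted above are computed from paired predicted and observed phenotypes. The following Python sketch shows the calculation under stated assumptions: the labels `"slow"` and `"rapid"`, the choice of `"slow"` as the positive class, and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(predicted, observed, positive="slow"):
    """Return (sensitivity, specificity) of genotype-based predictions
    against measured phenotypes.

    `positive` names the class counted as a positive result; treating
    "slow" acetylators as positive is an assumption of this sketch.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")

    tp = fn = tn = fp = 0
    for pred, obs in zip(predicted, observed):
        if obs == positive:
            if pred == positive:
                tp += 1  # positive phenotype, correctly predicted
            else:
                fn += 1  # positive phenotype, missed
        else:
            if pred == positive:
                fp += 1  # negative phenotype, wrongly called positive
            else:
                tn += 1  # negative phenotype, correctly predicted
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    predicted = ["slow", "slow", "rapid", "rapid", "slow"]
    observed = ["slow", "slow", "slow", "rapid", "rapid"]
    sens, spec = sensitivity_specificity(predicted, observed)
    print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```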
Single nucleotide polymorphisms (SNPs) are the third generation of DNA genetic markers, following microsatellites. In recent years a large number of disease-susceptibility genes and loci have been identified on the basis of SNPs. However, the shortcomings of traditional statistical methods increase the probability of false-positive conclusions. To address this problem, researchers have applied techniques including plasmid transfection, electrophoretic mobility shift assay (EMSA) and chromatin immunoprecipitation (ChIP) to identify the truly functional SNPs among the many candidate susceptibility loci. This review classifies SNPs by their position in structural genes and summarizes the methods used in recent years to study SNP function, from the perspectives of effects on gene expression and changes in protein function.
Johansen, Peter; Andersen, Jeppe Dyrberg; Madsen, Linnea Nørgård;
multiple mole and melanoma (FAMMM) syndrome. We typed 32 pigmentary SNP markers and sequenced MC1R in 246 healthy individuals and 116 individuals attending periodic control for malignant melanoma development, 50 of which were diagnosed with FAMMM. It was observed that individuals with any two grouped MC1R...
Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc
This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected...... itself. Depending on the trait’s economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage...... was observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from...
Zhang, Jun-yi; Zhu, Bing-chuan; Xu, Chao; Ding, Xiao; Li, Jun-feng; Zhang, Xue-gong; Lu, Zu-hong
The advent of next generation sequencing technology enables parallel analysis of the whole microbial community from multiple samples. In particular, sequencing 16S rRNA hypervariable tags has become the most efficient and cost-effective method for assessing microbial diversity. Because the short reads of second-generation sequencing methods cannot cover the full 16S rRNA gene, specific hypervariable regions (V-regions) must be selected to act as a proxy. Over the past decade, the selection of V-regions for assessing microbial diversity has not been consistent. Here we evaluated the current strategies for selecting 16S rRNA hypervariable regions for surveying microbial diversity. The environmental condition was considered one of the important factors in the selection of 16S rRNA hypervariable regions. We suggest that a pilot study testing different V-regions is required in bacterial diversity studies based on 16S rRNA genes.
Background: Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results: We used Illumina paired end technology to sequence mRNA (RNAseq) from near isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty-nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM, defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions: This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes.
You, Frank M; Huo, Naxin; Gu, Yong Q; Lazo, Gerard R; Dvorak, Jan; Anderson, Olin D
In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences that are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species, such as wheat, for which a large number of ESTs are available but for which genome sequences and intron/exon boundaries are not available. To date, no suitable high-throughput tool is available for this purpose. We have developed the ConservedPrimers 2.0 pipeline for designing intron-flanking primers for large-scale SNP discovery and marker development, and demonstrated its utility in wheat. This tool uses non-redundant wheat EST sequences, such as wheat contigs and singleton ESTs, and related genomic sequences, such as those of rice, as inputs. It aligns the ESTs to the genomic sequences to identify unique colinear exon blocks and predicts intron lengths. Intron-flanking primers are then designed based on the intron/exon information using the Primer3 core program or BatchPrimer3. Finally, a tab-delimited file containing intron-flanking primer pair sequences and their primer properties is generated for primer ordering and their PCR applications. Using this tool, 1,922 bin-mapped wheat ESTs (31.8% of the 6,045 in total) were found to have unique colinear exon blocks suitable for primer design and 1,821 primer pairs were designed from these single- or low-copy genes for PCR amplification and SNP discovery. With these primers and subsequently designed genome-specific primers, a total of 1,527 loci were found to contain one or more genome-specific SNPs. The ConservedPrimers 2.0 pipeline for designing intron-flanking primers was developed and its
Muñoz, Irene; Henriques, Dora; Johnston, J Spencer; Chávez-Galarza, Julio; Kryger, Per; Pinto, M Alice
Beekeeping activities, especially queen trading, have shaped the distribution of honey bee (Apis mellifera) subspecies in Europe, and have resulted in extensive introductions of two eastern European C-lineage subspecies (A. m. ligustica and A. m. carnica) into the native range of the M-lineage A. m. mellifera subspecies in Western Europe. As a consequence, replacement and gene flow between native and commercial populations have occurred at varying levels across western European populations. Genetic identification and introgression analysis using molecular markers is an important tool for management and conservation of honey bee subspecies. Previous studies have monitored introgression by using microsatellites, PCR-RFLP markers and, most recently, high density assays using single nucleotide polymorphism (SNP) markers. While the latter are almost prohibitively expensive, the information gained to date can be exploited to create a reduced panel containing the most ancestry-informative markers (AIMs) for those purposes with very little loss of information. The objective of this study was to design reduced panels of AIMs to verify the origin of A. m. mellifera individuals and to provide accurate estimates of the level of C-lineage introgression into their genome. The discriminant power of the SNPs was evaluated using a variety of metrics and approaches including Weir & Cockerham's FST, an FST-based outlier test, Delta, informativeness (In), and PCA. This study shows that reduced AIMs panels assign individuals to the correct origin and calculate the admixture level with a high degree of accuracy. These panels provide an essential tool in Europe for genetic stock identification and estimation of admixture levels, which can assist management strategies and monitor honey bee conservation programs.
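Two of the ranking metrics named above, Delta and informativeness (In), are simple functions of allele frequencies. The Python sketch below assumes a biallelic SNP and two equally weighted source populations, with Delta taken as the absolute allele-frequency difference and In following Rosenberg's informativeness for assignment. The abstract does not give the exact settings used in the study, so the function names and this two-population simplification are illustrative assumptions.

```python
import math


def delta(p1, p2):
    """Absolute difference in the frequency of one allele between two populations."""
    return abs(p1 - p2)


def informativeness(p1, p2):
    """Informativeness for assignment (In) of a biallelic SNP, for two
    equally weighted populations with allele frequencies p1 and p2.

    In = sum over alleles of [ -mean*ln(mean) + average of p_i*ln(p_i) ],
    ranging from 0 (identical frequencies) to ln(2) (fixed difference).
    """
    def xlogx(x):
        # Convention: 0 * ln(0) = 0.
        return x * math.log(x) if x > 0 else 0.0

    total = 0.0
    # Loop over the two alleles: frequencies (p1, p2) and (1 - p1, 1 - p2).
    for a, b in ((p1, p2), (1.0 - p1, 1.0 - p2)):
        mean = (a + b) / 2.0
        total += -xlogx(mean) + (xlogx(a) + xlogx(b)) / 2.0
    return total


if __name__ == "__main__":
    # Hypothetical allele frequencies in an M-lineage and a C-lineage sample.
    for p_m, p_c in [(1.0, 0.0), (0.9, 0.2), (0.5, 0.5)]:
        print(p_m, p_c, round(delta(p_m, p_c), 3), round(informativeness(p_m, p_c), 4))
```

Ranking SNPs by either value and keeping the top markers is one plausible way to assemble a reduced panel; the study itself combined several metrics.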
Yang, Cheng-Hong; Cheng, Yu-Huei; Yang, Cheng-Huei; Chuang, Li-Yeh
Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) is useful in small-scale basic research studies of complex genetic diseases that are associated with single nucleotide polymorphisms (SNPs). Designing a feasible primer pair is an important task before performing PCR-RFLP for SNP genotyping. However, in many cases no restriction enzyme is available to discriminate the target SNP, so conventional primer design is not applicable. A mutagenic primer is introduced to solve this problem. GA-based Mismatch PCR-RFLP Primers Design (GAMPD) provides a method that uses a genetic algorithm to search for optimal mutagenic primers and available restriction enzymes from REBASE. In order to improve the efficiency of the proposed method, a mutagenic matrix is employed to judge whether a hypothetical mutagenic primer can discriminate the target SNP by digestion with available restriction enzymes. The available restriction enzymes for the target SNP are mined by the updated core of SNP-RFLPing. GAMPD has been used to simulate the SNPs in the human SLC6A4 gene under different parameter settings and compared with SNP Cutter for mismatch PCR-RFLP primer design. The in silico simulation of the proposed GAMPD program showed that it designs mismatch PCR-RFLP primers. The GAMPD program is implemented in JAVA and is freely available at http://bio.kuas.edu.tw/gampd/.
Montero-Pau, Javier; Blanca, José; Esteras, Cristina; Martínez-Pérez, Eva Ma; Gómez, Pedro; Monforte, Antonio J; Cañizares, Joaquín; Picó, Belén
Cucurbita pepo is a cucurbit with growing economic importance worldwide. Zucchini morphotype is the most important within this highly variable species. Recently, transcriptome and Simple Sequence Repeat (SSR)- and Single Nucleotide Polymorphism (SNP)-based medium density maps have been reported, however further genomic tools are needed for efficient molecular breeding in the species. Our objective is to combine currently available complete transcriptomes and the Zucchini genome sequence with high throughput genotyping methods, mapping population development and extensive phenotyping to facilitate the advance of genomic research in this species. We report the Genotyping-by-sequencing analysis of a RIL population developed from the inter subspecific cross Zucchini x Scallop (ssp. pepo x ssp. ovifera). Several thousands of SNP markers were identified and genotyped, followed by the construction of a high-density linkage map based on 7,718 SNPs (average of 386 markers/linkage group) covering 2,817.6 cM of the whole genome, which is a great improvement with respect to previous maps. A QTL analysis was performed using phenotypic data obtained from the RIL population from three environments. In total, 48 consistent QTLs for vine, flowering and fruit quality traits were detected on the basis of a multiple-environment analysis, distributed in 33 independent positions in 15 LGs, and each QTL explained 1.5-62.9% of the phenotypic variance. Eight major QTLs, which could explain greater than 20% of the phenotypic variation were detected and the underlying candidate genes identified. Here we report the first SNP saturated map in the species, anchored to the physical map. Additionally, several consistent QTLs related to early flowering, fruit shape and length, and rind and flesh color are reported as well as candidate genes for them. This information will enhance molecular breeding in C. pepo and will assist the gene cloning underlying the studied QTLs, helping to reveal the
Avila, C M; Atienza, S G; Moreno, M T; Torres, A M
Faba bean varieties with determinacy of the apical meristem are relevant to green production. A diagnostic CAPS (cleaved amplified polymorphic sequence) marker for determinate growth habit (ti) in faba bean was previously developed by Avila et al. (Mol Breed 17:185-190, 2006) but was effective only on a limited range of cultivars or genotypes. In this study, we investigated the reasons for this limited application and developed a new marker useful for most faba bean-breeding programs. By designing a new set of primers, the complete genomic Vf_TFL1 sequences from different genotypes contrasting for the character were obtained and additional base changes associated with the ti phenotype were identified. The comparison among faba bean sequences showed that the previous CAPS marker was based on a SNP (single nucleotide polymorphism) at position 469 in the intron 2-3, a silent mutation. In contrast, a SNP at position 26 that distinguishes determinate and indeterminate growth habit genotypes leads to an amino acid change (Leu-9 to Arg) in the determinate growth habit genotypes that could account for the ti phenotype. A dCAPS marker based on this SNP that creates a TaqI site in the ti allele was developed. The marker was 100% successful in predicting ti phenotypes in a broad range of faba bean germplasm representing all major cultivars historically grown in Europe. The outcome confirms the utility of the new dCAPS in worldwide marker-assisted selection programs.
Fernandez I Marti, Angel; Athanson, Blessing; Koepke, Tyson; Font I Forcada, Carolina; Dhingra, Amit; Oraguzie, Nnadozie
Most previous studies on genetic fingerprinting and cultivar relatedness in sweet cherry were based on isoenzyme, RAPD, and simple sequence repeat (SSR) markers. This study was carried out to assess the utility of single nucleotide polymorphism (SNP) markers generated from 3' untranslated regions (UTR) for genetic fingerprinting in sweet cherry. A total of 114 sweet cherry accessions representing advanced selections, commercial cultivars, and old cultivars imported from different parts of the world were screened with seven SSR markers developed from other Prunus species and with 40 SNPs obtained from 3' UTR sequences of Rainier and Bing sweet cherry cultivars. The two marker studies had 99 accessions in common. The SSR data were used to validate the SNP results. Results showed that the average number of alleles per locus, mean observed heterozygosity, expected heterozygosity, and polymorphic information content values were higher in SSRs than in SNPs, although both sets of markers were similar in their grouping of the sweet cherry accessions as shown in the dendrogram. SNPs were able to distinguish sport mutants from their wild type germplasm. For example, "Stella" was separated from "Compact Stella." This demonstrates the greater power of SNPs than of SSRs for discriminating mutants from their original parents. In addition, SNP markers confirmed parentage and also determined relationships of the accessions in a manner consistent with their pedigree relationships. We would recommend the use of 3' UTR SNPs for genetic fingerprinting, parentage verification, gene mapping, and study of genetic diversity in sweet cherry.
Marjolein van Gent
To monitor changes in Bordetella pertussis populations, two typing methods are mainly used: Pulsed-Field Gel Electrophoresis (PFGE) and Multiple-Locus Variable-Number Tandem Repeat Analysis (MLVA). In this study, a single nucleotide polymorphism (SNP) typing method, based on 87 SNPs, was developed and compared with PFGE and MLVA. The discriminatory indices of SNP typing, PFGE and MLVA were found to be 0.85, 0.95 and 0.83, respectively. Phylogenetic analysis, using SNP typing as Gold Standard, revealed false homoplasies in the PFGE and MLVA trees. Further, in contrast to the SNP-based tree, the PFGE- and MLVA-based trees did not reveal a positive correlation between root-to-tip distance and the isolation year of strains. Thus PFGE and MLVA do not allow an estimation of the relative age of the selected strains. In conclusion, SNP typing was found to be phylogenetically more informative than PFGE and more discriminative than MLVA. Further, in contrast to PFGE, it is readily standardized, allowing interlaboratory comparisons. We applied SNP typing to study strains with a novel allele for the pertussis toxin promoter, ptxP3, which have a worldwide distribution and which have replaced the resident ptxP1 strains in the last 20 years. Previously, we showed that ptxP3 strains showed increased pertussis toxin expression and that their emergence was associated with increased notification in The Netherlands. SNP typing showed that the ptxP3 strains isolated in the Americas, Asia, Australia and Europe formed a monophyletic branch which recently diverged from ptxP1 strains. Two predominant ptxP3 SNP types were identified which spread worldwide. The widespread use of SNP typing will enhance our understanding of the evolution and global epidemiology of B. pertussis.
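A discriminatory index for a typing method is commonly computed as Simpson's index of diversity in the Hunter-Gaston form, the probability that two strains drawn at random without replacement receive different types. The abstract does not state which formula was used, so the Python sketch below is an assumption in that respect, and the type labels are invented.

```python
from collections import Counter


def discriminatory_index(types):
    """Hunter-Gaston discriminatory index for a list of type assignments.

    D = 1 - sum_j n_j * (n_j - 1) / (N * (N - 1)),
    where n_j is the number of strains of type j and N the total number.
    """
    n_total = len(types)
    if n_total < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(types)
    # Number of ordered strain pairs that share the same type.
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


if __name__ == "__main__":
    # Invented type labels for four strains.
    print(discriminatory_index(["A", "A", "B", "C"]))
```

Values near 1 mean that almost every strain receives its own type, and values near 0 mean that the method rarely separates strains.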
Zheng, Jie; Gaunt, Tom R; Day, Ian N M
Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data. © 2012 Blackwell Publishing Ltd/University College London.
Tea is an important cash crop, representing a $40 billion-a-year global market. Differentiation of the tea market has resulted in increasing demand for tea products that are sustainably and responsibly produced. Tea authentication is important because of growing concerns about fraud involving premium tea products. Analytical technologies are needed for protection and value enhancement of high-quality brands. For loose-leaf teas, the challenge is that the authentication needs to be established on the basis of a single leaf, so that the products can be traced back to the original varieties. A new generation of molecular markers offers an ideal solution for authentication of processed agricultural products. Using a nanofluidic array to identify variant SNP sequences, we tested genetic identities using DNA extracted from single leaves of 14 processed commercial tea products. Based on the profiles of 60 SNP markers, the genetic identity of each tea sample was unambiguously identified by multilocus matching and ordination analysis. Results for repeated samples of multiple tea leaves from the same products (using three independent DNA extractions) showed 100% concordance, showing that the nanofluidic system is a reliable platform for generating tea DNA fingerprints with high accuracy. The method worked well on green, oolong, and black teas, and can handle a large number of samples in a short period of time. It is robust and cost-effective, thus showing high potential for practical application in the value chain of the tea industry.
Leaché, Adam D.; Banbury, Barbara L.; Felsenstein, Joseph; de Oca, Adrián nieto-Montes; Stamatakis, Alexandros
Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the
Fu, Yanfeng; Li, Lan; Li, Bixia; Fang, Xiaomin; Ren, Shouwen
To ascertain whether the long form leptin receptor (LEPR) affects the regulation of embryo attachment and whether LEPR single nucleotide polymorphisms (SNPs) are associated with reproductive traits in pigs, real-time qPCR was used to measure the relative abundance of LEPR mRNA in different tissues of Suzhong sows during the period of embryo attachment to the uterus (pregnancy days 13, 18 and 24), and PCR-RFLP and PCR sequencing were used to screen the LEPR coding sequence for SNPs in a population of 512 Suzhong sows. Real-time qPCR results indicated that LEPR mRNA was present in all 22 tissues of pigs with differences in relative abundance of the LEPR mRNA (P [...] attachment site (P [...] attachment periods, LEPR mRNA was greatest on Day 18 (attachment; P [...] attachment), and relative abundance was least on Day 13 (pre-attachment). The prevalence of the LEPR mRNA in pregnant sows was greater than in non-pregnant sows (P [...] T locus of LEPR, chi-square test results demonstrated that allele and genotype frequencies were in Hardy-Weinberg disequilibrium at this locus, PCR-RFLP results revealed that Genotype TT was greater than Genotype CC (P [...] T locus has advantageous effects on litter size and litter weight in Suzhong pigs. In conclusion, the expression of the LEPR gene might be involved in the regulation of embryo attachment mechanisms in pigs, and the LEPR SNP c.2856C>T could be a molecular marker for improving litter size and litter weight in pig breeding.
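A chi-square test of Hardy-Weinberg proportions at a biallelic locus such as c.2856C>T compares the observed genotype counts with those expected from the allele frequencies. The Python sketch below shows the standard calculation; the genotype counts are invented for illustration and are not the counts from the 512 sows, and the function name is hypothetical.

```python
def hwe_chi_square(n_cc, n_ct, n_tt):
    """Chi-square statistic (1 degree of freedom) for departure from
    Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, expected_counts) with expected counts for CC, CT, TT.
    """
    n = n_cc + n_ct + n_tt
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of allele C
    q = 1.0 - p                      # frequency of allele T
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")

    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


if __name__ == "__main__":
    # Invented counts with a strong heterozygote deficit.
    chi2, expected = hwe_chi_square(40, 20, 40)
    # 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
    print(f"chi2={chi2:.2f}, expected={expected}, reject HWE: {chi2 > 3.841}")
```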
Binder, Harald; Müller, Tina; Schwender, Holger; Golka, Klaus; Steffens, Michael; Hengstler, Jan G; Ickstadt, Katja; Schumacher, Martin
The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, these groups of individuals are identified using a clustering approach, where each group may be defined via different SNPs. This allows for representing complex interaction patterns, such as compositional epistasis, that might not be detected by a single main effects model. In a simulation study, the CLR approach results in improved prediction performance, compared to the main effects approach, and identification of important SNPs in several scenarios. Improved prediction performance is also obtained for an application example considering urinary bladder cancer. Some of the identified SNPs are predictive for all individuals, while others are only relevant for a specific group. Together with the sets of SNPs that define the groups, potential interaction patterns are uncovered.
Chen, Nancy; Van Hout, Cristopher V; Gottipati, Srikanth; Clark, Andrew G
Restriction site-associated DNA sequencing or genotyping-by-sequencing (GBS) approaches allow for rapid and cost-effective discovery and genotyping of thousands of single-nucleotide polymorphisms (SNPs) in multiple individuals. However, rigorous quality control practices are needed to avoid high levels of error and bias with these reduced representation methods. We developed a formal statistical framework for filtering spurious loci, using Mendelian inheritance patterns in nuclear families, that accommodates variable-quality genotype calls and missing data--both rampant issues with GBS data--and for identifying sex-linked SNPs. Simulations predict excellent performance of both the Mendelian filter and the sex-linkage assignment under a variety of conditions. We further evaluate our method by applying it to real GBS data and validating a subset of high-quality SNPs. These results demonstrate that our metric of Mendelian inheritance is a powerful quality filter for GBS loci that is complementary to standard coverage and Hardy-Weinberg filters. The described method, implemented in the software MendelChecker, will improve quality control during SNP discovery in nonmodel as well as model organisms. Copyright © 2014 by the Genetics Society of America.
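The idea of filtering loci by Mendelian inheritance in nuclear families can be illustrated with a simple hard-call check. This sketch is not the statistical framework implemented in MendelChecker, which accommodates variable-quality genotype calls. It assumes biallelic autosomal SNPs coded as alternate-allele counts (0, 1, 2), with `None` for missing calls, and all names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Genotype = Optional[int]  # alternate-allele count 0, 1 or 2; None = missing call

# Alleles (0 = reference, 1 = alternate) that a parent of each genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_consistent(father: Genotype, mother: Genotype, child: Genotype) -> Optional[bool]:
    """True/False for Mendelian consistency of a trio; None if any call is missing."""
    if father is None or mother is None or child is None:
        return None
    possible = {a + b for a in GAMETES[father] for b in GAMETES[mother]}
    return child in possible


def mendelian_error_rate(trios: Sequence[Tuple[Genotype, Genotype, Genotype]]) -> Optional[float]:
    """Fraction of informative (fully genotyped) trios that violate Mendelian inheritance."""
    checks = [is_consistent(*trio) for trio in trios]
    informative = [c for c in checks if c is not None]
    if not informative:
        return None
    return informative.count(False) / len(informative)
```

A locus whose error rate across families exceeds a chosen threshold would be flagged as spurious. The threshold itself is an analysis choice and is not specified in the abstract.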
Vukcevic, Damjan; Traherne, James A; Næss, Sigrid; Ellinghaus, Eva; Kamatani, Yoichiro; Dilthey, Alexander; Lathrop, Mark; Karlsen, Tom H; Franke, Andre; Moffatt, Miriam; Cookson, William; Trowsdale, John; McVean, Gil; Sawcer, Stephen; Leslie, Stephen
Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR∗IMP, a method for imputation of KIR copy number. We show that KIR∗IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease.
Douglas S Goodin
BACKGROUND: Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association. METHODOLOGY/PRINCIPAL FINDINGS: Here we develop a method to identify the two SNP-haplotypes, which combine to produce each person's SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled; DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ∼200 kb of DNA. The SNP-information was converted into an ordered-set of eleven-numbers (subject-vectors) based on whether a person had zero, one, or two copies of a particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered-combinations of eleven-numbers (0 or 1), representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself
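The relationship between a subject-vector and the SNP-strings that can combine to form it can be sketched by brute-force enumeration. This is only an illustration of the combinatorial step: the abstract states that subject-vectors were resolved with probabilistic methods, which this sketch does not attempt. The function name `compatible_haplotype_pairs` is hypothetical.

```python
from itertools import product


def compatible_haplotype_pairs(subject_vector):
    """List every unordered pair of 0/1 strings whose element-wise sum is the subject-vector.

    subject_vector holds the variant copy number (0, 1 or 2) at each SNP position.
    """
    if any(g not in (0, 1, 2) for g in subject_vector):
        raise ValueError("copy numbers must be 0, 1 or 2")
    # Only heterozygous positions (copy number 1) are ambiguous.
    het = [i for i, g in enumerate(subject_vector) if g == 1]
    # Homozygous positions are fixed: 0 -> (0, 0), 2 -> (1, 1).
    base = [g // 2 for g in subject_vector]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        # Sort within the pair so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)
```

A subject-vector with k heterozygous positions (k ≥ 1) admits 2^(k-1) candidate pairs, so an eleven-SNP vector has at most 1,024 candidates. This is why additional information, such as how often each SNP-string occurs in the population, is needed to choose among them.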
Martinez, Pierre; Kimberley, Christopher; Birkbak, Nicolai Juul
standard genomic data such as SNP-arrays, that could be implemented routinely. We designed two novel scores S and R, respectively based on the Shannon diversity index and Ripley's L statistic of spatial homogeneity, to quantify ITH in single SNP-array samples. We created in-silico and in-vitro mixtures...... sequencing data but heterogeneity in the fraction of tumour cells present across samples hampered accurate quantification. The prognostic potential of both scores was moderate but significantly predictive of survival in several tumour types (corrected p = 0.03). Our work thus shows how individual SNP...
Sedighi, Abootaleb; Li, Paul C H
Here, we describe detection of single nucleotide polymorphism (SNP) in genomic DNA samples using a NanoBioArray (NBA) chip. Fast DNA hybridization is achieved in the chip when target DNAs are introduced to the surface-arrayed probes using centrifugal force. Gold nanoparticles (AuNPs) are used to assist SNP detection at room temperature. The parallel setting of sample introduction in the spiral channels of the NBA chip enables multiple analyses on many samples, resulting in a technique appropriate for high-throughput SNP detection. The experimental procedure, including chip fabrication, probe array printing, DNA amplification, hybridization, signal detection, and data analysis, is described in detail.
Hansen, Thomas V. O.; Vikesaa, Jonas; Buhl, Sine S
) arrays can provide additional diagnostic power to assess HER2 gene status. METHODS: DNA from 65 breast tumor samples previously diagnosed by HER2 IHC and FISH analysis were blinded and examined for HER2 copy number variation employing SNP array analysis. RESULTS: SNP array analysis identified 24 (37......%) samples with selective amplification or imbalance of the HER2 region in the q-arm of chromosome 17. In contrast, only 15 (23%) tumors were found to have HER2 amplification by IHC and FISH analysis. In total, there was a discrepancy in 19 (29%) samples between SNP array and IHC/FISH analysis. In 12...
Lukens, C.E. [Rockwell International Corp., Richland, WA (United States). Rockwell Hanford Operations
The client submitted 5 sets of porcelain and stoneware subsurface (radioactive site) marker prototypes (31 markers each set). The following were determined: compressive strength, thermal shock resistance, thermal crazing resistance, alkali resistance, color retention, and chemical resistance.
Fernández Ana I
Abstract Background The traditional strategy to map QTL is to use linkage analysis employing a limited number of markers. These analyses report wide QTL confidence intervals, making it very difficult to identify the gene and polymorphisms underlying the QTL effects. The arrival of genome-wide panels of SNPs makes available thousands of markers, increasing the information content and therefore the likelihood of detecting and fine mapping QTL regions. The aims of the current study are to confirm previous QTL regions for growth and body composition traits in different generations of an Iberian x Landrace intercross (IBMAP) and especially to identify new ones with narrow confidence intervals by employing the PorcineSNP60 BeadChip in linkage analyses. Results Three generations (F3, Backcross 1 and Backcross 2) of the IBMAP and their related animals were genotyped with the PorcineSNP60 BeadChip. A total of 8,417 SNPs equidistantly distributed across autosomes were selected after filtering by quality, position and frequency to perform the QTL scan. The joint and separate analyses of the different IBMAP generations confirmed QTL regions previously identified in chromosomes 4 and 6 and revealed new ones, mainly for backfat thickness in chromosomes 4, 5, 11, 14 and 17 and shoulder weight in chromosomes 1, 2, 9 and 13, along with many others at the chromosome-wide significance level. In addition, most of the detected QTLs displayed narrow confidence intervals, making the selection of positional candidate genes easier. Conclusions The use of a higher density of markers has made it possible to confirm results obtained in previous QTL scans carried out with microsatellites. Moreover, several new QTL regions have now been identified in regions probably not covered by markers in previous scans; most of these QTLs displayed narrow confidence intervals. Finally, prominent putative biological and positional candidate genes underlying those QTL effects are listed based on recent porcine
Next-generation sequencing technology is now frequently being used to develop genomic tools for non-model organisms, which are generally important for advancing studies of evolutionary ecology. One such species, the marine annelid Streblospio benedicti, is an ideal system to study the evolutionary consequences of larval life history mode because the species displays a rare offspring dimorphism termed poecilogony, where females can produce either many small offspring or a few large ones. To further develop S. benedicti as a model system for studies of life history evolution, we apply 454 sequencing to characterize the transcriptome for embryos, larvae, and juveniles of this species, for which no genomic resources are currently available. Here we performed a de novo alignment of 336,715 reads generated by a quarter GS-FLX (Roche 454) run, which produced 7,222 contigs. We developed a novel approach for evaluating the site frequency spectrum across the transcriptome to identify potential signatures of selection. We also developed 84 novel single nucleotide polymorphism (SNP) markers for this species that are used to distinguish coastal populations of S. benedicti. We validated the SNPs by genotyping individuals of different developmental modes using the BeadXPress Golden Gate assay (Illumina). This allowed us to evaluate markers that may be associated with life-history mode.
Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.
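The N50 statistic reported for the contig assembly above can be computed as in the following sketch. The function name `n50` and the contig lengths in the usage note are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point rounding at the halfway point.
        if 2 * running >= total:
            return length
```

For instance, contigs of 2, 3, 4, 5 and 6 bp total 20 bp. The two longest (6 and 5 bp) already cover 11 bp, so `n50([2, 3, 4, 5, 6])` returns 5. An N50 above the mean contig length, as reported in the abstract, reflects a length distribution skewed by many short contigs.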
Alves Aflitos, S.; Sanchez-Perez, G.; de Ridder, D.; Fransz, P.; Schranz, M.E.; de Jong, H.; Peters, S.A.
Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not
de los Campos, Gustavo; Sorensen, Daniel
, imperfect marker–causal loci linkage disequilibrium (LD). Consequently, the marker-based model may largely misrepresent the data-generating process; this is exacerbated with unrelated individuals5. Under these conditions, it is not clear that the finite sample estimate of h2G-BLUP is an unbiased estimate...