WorldWideScience

Sample records for polymorphism genomic arrays

  1. Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp using a soybean genome array

    Directory of Open Access Journals (Sweden)

    Wanamaker Steve

    2008-02-01

    Full Text Available Abstract Background Cowpea (Vigna unguiculata L. Walp is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Results Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max genome array. Robustified projection pursuit (RPP was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. Conclusion We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.

  2. Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Directory of Open Access Journals (Sweden)

    Zhang Wei

    2009-09-01

    Full Text Available Abstract Background Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results This paper presents a new probe selection algorithm (PanArray that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on

  3. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  4. Development and validation of a 20K single nucleotide polymorphism (SNP whole genome genotyping array for apple (Malus × domestica Borkh.

    Directory of Open Access Journals (Sweden)

    Luca Bianco

    Full Text Available High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus. A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs. Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  5. Development and Validation of a 20K Single Nucleotide Polymorphism (SNP) Whole Genome Genotyping Array for Apple (Malus × domestica Borkh)

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs. PMID:25303088

  6. Novel applications of array comparative genomic hybridization in molecular diagnostics.

    Science.gov (United States)

    Cheung, Sau W; Bi, Weimin

    2018-05-31

    In 2004, the implementation of array comparative genomic hybridization (array comparative genome hybridization [CGH]) into clinical practice marked a new milestone for genetic diagnosis. Array CGH and single-nucleotide polymorphism (SNP) arrays enable genome-wide detection of copy number changes in a high resolution, and therefore microarray has been recognized as the first-tier test for patients with intellectual disability or multiple congenital anomalies, and has also been applied prenatally for detection of clinically relevant copy number variations in the fetus. Area covered: In this review, the authors summarize the evolution of array CGH technology from their diagnostic laboratory, highlighting exonic SNP arrays developed in the past decade which detect small intragenic copy number changes as well as large DNA segments for the region of heterozygosity. The applications of array CGH to human diseases with different modes of inheritance with the emphasis on autosomal recessive disorders are discussed. Expert commentary: An exonic array is a powerful and most efficient clinical tool in detecting genome wide small copy number variants in both dominant and recessive disorders. However, whole-genome sequencing may become the single integrated platform for detection of copy number changes, single-nucleotide changes as well as balanced chromosomal rearrangements in the near future.

  7. A high-density Diversity Arrays Technology (DArT microarray for genome-wide genotyping in Eucalyptus

    Directory of Open Access Journals (Sweden)

    Myburg Alexander A

    2010-06-01

    Full Text Available Abstract Background A number of molecular marker technologies have allowed important advances in the understanding of the genetics and evolution of Eucalyptus, a genus that includes over 700 species, some of which are used worldwide in plantation forestry. Nevertheless, the average marker density achieved with current technologies remains at the level of a few hundred markers per population. Furthermore, the transferability of markers produced with most existing technology across species and pedigrees is usually very limited. High throughput, combined with wide genome coverage and high transferability are necessary to increase the resolution, speed and utility of molecular marker technology in eucalypts. We report the development of a high-density DArT genome profiling resource and demonstrate its potential for genome-wide diversity analysis and linkage mapping in several species of Eucalyptus. Findings After testing several genome complexity reduction methods we identified the PstI/TaqI method as the most effective for Eucalyptus and developed 18 genomic libraries from PstI/TaqI representations of 64 different Eucalyptus species. A total of 23,808 cloned DNA fragments were screened and 13,300 (56% were found to be polymorphic among 284 individuals. After a redundancy analysis, 6,528 markers were selected for the operational array and these were supplemented with 1,152 additional clones taken from a library made from the E. grandis tree whose genome has been sequenced. Performance validation for diversity studies revealed 4,752 polymorphic markers among 174 individuals. Additionally, 5,013 markers showed segregation when screened using six inter-specific mapping pedigrees, with an average of 2,211 polymorphic markers per pedigree and a minimum of 859 polymorphic markers that were shared between any two pedigrees. Conclusions This operational DArT array will deliver 1,000-2,000 polymorphic markers for linkage mapping in most eucalypt pedigrees

  8. Detection of Hereditary 1,25-Hydroxyvitamin D-Resistant Rickets Caused by Uniparental Disomy of Chromosome 12 Using Genome-Wide Single Nucleotide Polymorphism Array.

    Directory of Open Access Journals (Sweden)

    Mayuko Tamura

    Full Text Available Hereditary 1,25-dihydroxyvitamin D-resistant rickets (HVDRR is an autosomal recessive disease caused by biallelic mutations in the vitamin D receptor (VDR gene. No patients have been reported with uniparental disomy (UPD.Using genome-wide single nucleotide polymorphism (SNP array to confirm whether HVDRR was caused by UPD of chromosome 12.A 2-year-old girl with alopecia and short stature and without any family history of consanguinity was diagnosed with HVDRR by typical laboratory data findings and clinical features of rickets. Sequence analysis of VDR was performed, and the origin of the homozygous mutation was investigated by target SNP sequencing, short tandem repeat analysis, and genome-wide SNP array.The patient had a homozygous p.Arg73Ter nonsense mutation. Her mother was heterozygous for the mutation, but her father was negative. We excluded gross deletion of the father's allele or paternal discordance. Genome-wide SNP array of the family (the patient and her parents showed complete maternal isodisomy of chromosome 12. She was successfully treated with high-dose oral calcium.This is the first report of HVDRR caused by UPD, and the third case of complete UPD of chromosome 12, in the published literature. Genome-wide SNP array was useful for detecting isodisomy and the parental origin of the allele. Comprehensive examination of the homozygous state is essential for accurate genetic counseling of recurrence risk and appropriate monitoring for other chromosome 12 related disorders. Furthermore, oral calcium therapy was effective as an initial treatment for rickets in this instance.

  9. Hapsembler: An Assembler for Highly Polymorphic Genomes

    Science.gov (United States)

    Donmez, Nilgun; Brudno, Michael

    As whole genome sequencing has become a routine biological experiment, algorithms for assembly of whole genome shotgun data has become a topic of extensive research, with a plethora of off-the-shelf methods that can reconstruct the genomes of many organisms. Simultaneously, several recently sequenced genomes exhibit very high polymorphism rates. For these organisms genome assembly remains a challenge as most assemblers are unable to handle highly divergent haplotypes in a single individual. In this paper we describe Hapsembler, an assembler for highly polymorphic genomes, which makes use of paired reads. Our experiments show that Hapsembler produces accurate and contiguous assemblies of highly polymorphic genomes, while performing on par with the leading tools on haploid genomes. Hapsembler is available for download at http://compbio.cs.toronto.edu/hapsembler.

  10. A simple optimization can improve the performance of single feature polymorphism detection by Affymetrix expression arrays

    Directory of Open Access Journals (Sweden)

    Fujisawa Hironori

    2010-05-01

    Full Text Available Abstract Background High-density oligonucleotide arrays are effective tools for genotyping numerous loci simultaneously. In small genome species (genome size: Results We compared the single feature polymorphism (SFP detection performance of whole-genome and transcript hybridizations using the Affymetrix GeneChip® Rice Genome Array, using the rice cultivars with full genome sequence, japonica cultivar Nipponbare and indica cultivar 93-11. Both genomes were surveyed for all probe target sequences. Only completely matched 25-mer single copy probes of the Nipponbare genome were extracted, and SFPs between them and 93-11 sequences were predicted. We investigated optimum conditions for SFP detection in both whole genome and transcript hybridization using differences between perfect match and mismatch probe intensities of non-polymorphic targets, assuming that these differences are representative of those between mismatch and perfect targets. Several statistical methods of SFP detection by whole-genome hybridization were compared under the optimized conditions. Causes of false positives and negatives in SFP detection in both types of hybridization were investigated. Conclusions The optimizations allowed a more than 20% increase in true SFP detection in whole-genome hybridization and a large improvement of SFP detection performance in transcript hybridization. Significance analysis of the microarray for log-transformed raw intensities of PM probes gave the best performance in whole genome hybridization, and 22,936 true SFPs were detected with 23.58% false positives by whole genome hybridization. For transcript hybridization, stable SFP detection was achieved for highly expressed genes, and about 3,500 SFPs were detected at a high sensitivity (> 50% in both shoot and young panicle transcripts. High SFP detection performances of both genome and transcript hybridizations indicated that microarrays of a complex genome (e.g., of Oryza sativa can be

  11. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

    Science.gov (United States)

    Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

    2014-11-01

    Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  12. Genome-wide DNA polymorphism analyses using VariScan

    Directory of Open Access Journals (Sweden)

    Vilella Albert J

    2006-09-01

    Full Text Available Abstract Background DNA sequence polymorphisms analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The recent ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Current available methods and software, however, are unsatisfactory for such genome-wide analysis. Results We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on a coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have also incorporated a graphical-user interface. The main features of this software are: i exhaustive population-genetic analyses including those based on the coalescent theory; ii analysis adapted to the shallow data generated by the high-throughput genome projects; iii use of genome annotations to conduct a comprehensive analyses separately for different functional regions; iv identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.

  13. Comprehensive survey of SNPs in the Affymetrix exon array using the 1000 Genomes dataset.

    Directory of Open Access Journals (Sweden)

    Eric R Gamazon

    Full Text Available Microarray gene expression data has been used in genome-wide association studies to allow researchers to study gene regulation as well as other complex phenotypes including disease risks and drug response. To reach scientifically sound conclusions from these studies, however, it is necessary to get reliable summarization of gene expression intensities. Among various factors that could affect expression profiling using a microarray platform, single nucleotide polymorphisms (SNPs in target mRNA may lead to reduced signal intensity measurements and result in spurious results. The recently released 1000 Genomes Project dataset provides an opportunity to evaluate the distribution of both known and novel SNPs in the International HapMap Project lymphoblastoid cell lines (LCLs. We mapped the 1000 Genomes Project genotypic data to the Affymetrix GeneChip Human Exon 1.0ST array (exon array, which had been used in our previous studies and for which gene expression data had been made publicly available. We also evaluated the potential impact of these SNPs on the differentially spliced probesets we had identified previously. Though the 1000 Genomes Project data allowed a comprehensive survey of the SNPs in this particular array, the same approach can certainly be applied to other microarray platforms. Furthermore, we present a detailed catalogue of SNP-containing probesets (exon-level and transcript clusters (gene-level, which can be considered in evaluating findings using the exon array as well as benefit the design of follow-up experiments and data re-analysis.

  14. Genome-Wide Association of Copy Number Polymorphisms and Kidney Function.

    Directory of Open Access Journals (Sweden)

    Man Li

    Full Text Available Genome-wide association studies (GWAS using single nucleotide polymorphisms (SNPs have identified more than 50 loci associated with estimated glomerular filtration rate (eGFR, a measure of kidney function. However, significant SNPs account for a small proportion of eGFR variability. Other forms of genetic variation have not been comprehensively evaluated for association with eGFR. In this study, we assess whether changes in germline DNA copy number are associated with GFR estimated from serum creatinine, eGFRcrea. We used hidden Markov models (HMMs to identify copy number polymorphic regions (CNPs from high-throughput SNP arrays for 2,514 African (AA and 8,645 European ancestry (EA participants in the Atherosclerosis Risk in Communities (ARIC study. Separately for the EA and AA cohorts, we used Bayesian Gaussian mixture models to estimate copy number at regions identified by the HMM or previously reported in the HapMap Project. We identified 312 and 464 autosomal CNPs among individuals of EA and AA, respectively. Multivariate models adjusted for SNP-derived covariates of population structure identified one CNP in the EA cohort near genome-wide statistical significance (Bonferroni-adjusted p = 0.067 located on chromosome 5 (876-880kb. Overall, our findings suggest a limited role of CNPs in explaining eGFR variability.

  15. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Full Text Available Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  16. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  17. Development and Applications of a High Throughput Genotyping Tool for Polyploid Crops: Single Nucleotide Polymorphism (SNP Array

    Directory of Open Access Journals (Sweden)

    Qian You

    2018-02-01

    Full Text Available Polypoid species play significant roles in agriculture and food production. Many crop species are polyploid, such as potato, wheat, strawberry, and sugarcane. Genotyping has been a daunting task for genetic studies of polyploid crops, which lags far behind the diploid crop species. Single nucleotide polymorphism (SNP array is considered to be one of, high-throughput, relatively cost-efficient and automated genotyping approaches. However, there are significant challenges for SNP identification in complex, polyploid genomes, which has seriously slowed SNP discovery and array development in polyploid species. Ploidy is a significant factor impacting SNP qualities and validation rates of SNP markers in SNP arrays, which has been proven to be a very important tool for genetic studies and molecular breeding. In this review, we (1 discussed the pros and cons of SNP array in general for high throughput genotyping, (2 presented the challenges of and solutions to SNP calling in polyploid species, (3 summarized the SNP selection criteria and considerations of SNP array design for polyploid species, (4 illustrated SNP array applications in several different polyploid crop species, then (5 discussed challenges, available software, and their accuracy comparisons for genotype calling based on SNP array data in polyploids, and finally (6 provided a series of SNP array design and genotype calling recommendations. This review presents a complete overview of SNP array development and applications in polypoid crops, which will benefit the research in molecular breeding and genetics of crops with complex genomes.

  18. Templated sequence insertion polymorphisms in the human genome

    Science.gov (United States)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  19. Array-based techniques for fingerprinting medicinal herbs

    Directory of Open Access Journals (Sweden)

    Xue Charlie

    2011-05-01

    Full Text Available Abstract Poor quality control of medicinal herbs has led to instances of toxicity, poisoning and even deaths. The fundamental step in quality control of herbal medicine is accurate identification of herbs. Array-based techniques have recently been adapted to authenticate or identify herbal plants. This article reviews the current array-based techniques, eg oligonucleotides microarrays, gene-based probe microarrays, Suppression Subtractive Hybridization (SSH-based arrays, Diversity Array Technology (DArT and Subtracted Diversity Array (SDA. We further compare these techniques according to important parameters such as markers, polymorphism rates, restriction enzymes and sample type. The applicability of the array-based methods for fingerprinting depends on the availability of genomics and genetics of the species to be fingerprinted. For the species with few genome sequence information but high polymorphism rates, SDA techniques are particularly recommended because they require less labour and lower material cost.

  20. Generation of a genomic tiling array of the human Major Histocompatibility Complex (MHC and its application for DNA methylation analysis

    Directory of Open Access Journals (Sweden)

    Ottaviani Diego

    2008-05-01

    Full Text Available Abstract Background The major histocompatibility complex (MHC is essential for human immunity and is highly associated with common diseases, including cancer. While the genetics of the MHC has been studied intensively for many decades, very little is known about the epigenetics of this most polymorphic and disease-associated region of the genome. Methods To facilitate comprehensive epigenetic analyses of this region, we have generated a genomic tiling array of 2 Kb resolution covering the entire 4 Mb MHC region. The array has been designed to be compatible with chromatin immunoprecipitation (ChIP, methylated DNA immunoprecipitation (MeDIP, array comparative genomic hybridization (aCGH and expression profiling, including of non-coding RNAs. The array comprises 7832 features, consisting of two replicates of both forward and reverse strands of MHC amplicons and appropriate controls. Results Using MeDIP, we demonstrate the application of the MHC array for DNA methylation profiling and the identification of tissue-specific differentially methylated regions (tDMRs. Based on the analysis of two tissues and two cell types, we identified 90 tDMRs within the MHC and describe their characterisation. Conclusion A tiling array covering the MHC region was developed and validated. Its successful application for DNA methylation profiling indicates that this array represents a useful tool for molecular analyses of the MHC in the context of medical genomics.

  1. Whole genome DNA copy number changes identified by high density oligonucleotide arrays

    Directory of Open Access Journals (Sweden)

    Huang Jing

    2004-05-01

    Full Text Available Abstract Changes in DNA copy number are one of the hallmarks of the genetic instability common to most human cancers. Previous micro-array-based methods have been used to identify chromosomal gains and losses; however, they are unable to genotype alleles at the level of single nucleotide polymorphisms (SNPs. Here we describe a novel algorithm that uses a recently developed high-density oligonucleotide array-based SNP genotyping method, whole genome sampling analysis (WGSA, to identify genome-wide chromosomal gains and losses at high resolution. WGSA simultaneously genotypes over 10,000 SNPs by allele-specific hybridisation to perfect match (PM and mismatch (MM probes synthesised on a single array. The copy number algorithm jointly uses PM intensity and discrimination ratios between paired PM and MM intensity values to identify and estimate genetic copy number changes. Values from an experimental sample are compared with SNP-specific distributions derived from a reference set containing over 100 normal individuals to gain statistical power. Genomic regions with statistically significant copy number changes can be identified using both single point analysis and contiguous point analysis of SNP intensities. We identified multiple regions of amplification and deletion using a panel of human breast cancer cell lines. We verified these results using an independent method based on quantitative polymerase chain reaction and found that our approach is both sensitive and specific and can tolerate samples which contain a mixture of both tumour and normal DNA. In addition, by using known allele frequencies from the reference set, statistically significant genomic intervals can be identified containing contiguous stretches of homozygous markers, potentially allowing the detection of regions undergoing loss of heterozygosity (LOH without the need for a matched normal control sample. The coupling of LOH analysis, via SNP genotyping, with copy number

  2. A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing

    Directory of Open Access Journals (Sweden)

    Guangtu Gao

    2018-04-01

    Full Text Available Single-nucleotide polymorphisms (SNPs are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss, SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD libraries, reduced representation libraries (RRL and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway that we previously used for SNP discovery. Of the 49 new samples, 11 were double-haploid lines from Washington State University (WSU and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1 which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the double haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs and multi-sequence variants (MSVs. Overall, 30,302,087 SNPs were identified on the rainbow trout genome 29 chromosomes and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25. The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-Population polymorphism assessment revealed high level of polymorphism and

  3. The polydeoxyadenylate tract of Alu repetitive elements is polymorphic in the human genome

    International Nuclear Information System (INIS)

    Economou, E.P.; Bergen, A.W.; Warren, A.C.; Antonarakis, S.E.

    1990-01-01

    To identify DNA polymorphisms that are abundant in the human genome and are detectable by polymerase chain reaction amplification of genomic DNA, the authors hypothesize that the polydeoxyadenylate tract of the Alu family of repetitive elements is polymorphic among human chromosomes. Analysis of the 3' ends of three specific Alu sequences showed two occurrences, one in the adenosine deaminase gene and other in the β-globin pseudogene, were polymorphic. This novel class of polymorphism, termed AluVpA [Alu variable poly(A)] may represent one of the most useful and informative group of DNA markers in the human genome

  4. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH. One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  5. High-throughput single nucleotide polymorphism genotyping using nanofluidic Dynamic Arrays

    Directory of Open Access Journals (Sweden)

    Crenshaw Andrew

    2009-01-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs have emerged as the genetic marker of choice for mapping disease loci and candidate gene association studies, because of their high density and relatively even distribution in the human genomes. There is a need for systems allowing medium multiplexing (ten to hundreds of SNPs with high throughput, which can efficiently and cost-effectively generate genotypes for a very large sample set (thousands of individuals. Methods that are flexible, fast, accurate and cost-effective are urgently needed. This is also important for those who work on high throughput genotyping in non-model systems where off-the-shelf assays are not available and a flexible platform is needed. Results We demonstrate the use of a nanofluidic Integrated Fluidic Circuit (IFC - based genotyping system for medium-throughput multiplexing known as the Dynamic Array, by genotyping 994 individual human DNA samples on 47 different SNP assays, using nanoliter volumes of reagents. Call rates of greater than 99.5% and call accuracies of greater than 99.8% were achieved from our study, which demonstrates that this is a formidable genotyping platform. The experimental set up is very simple, with a time-to-result for each sample of about 3 hours. Conclusion Our results demonstrate that the Dynamic Array is an excellent genotyping system for medium-throughput multiplexing (30-300 SNPs, which is simple to use and combines rapid throughput with excellent call rates, high concordance and low cost. The exceptional call rates and call accuracy obtained may be of particular interest to those working on validation and replication of genome- wide- association (GWA studies.

  6. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495, 833, 423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037

  7. Tandemly Arrayed Genes in Vertebrate Genomes

    Directory of Open Access Journals (Sweden)

    Deng Pan

    2008-01-01

    Full Text Available Tandemly arrayed genes (TAGs are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94% have parallel transcription orientation (i.e., they are encoded on the same strand in contrast to the genome, which has about 50% of its genes in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mechanism of duplication, especially in large families. Finally, several species have a higher proportion of large tandem arrays that are species-specific than random expectation.

  8. Evaluation of SNP Data from the Malus Infinium Array Identifies Challenges for Genetic Analysis of Complex Genomes of Polyploid Origin.

    Directory of Open Access Journals (Sweden)

    Michela Troggio

    Full Text Available High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432, but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the 'Golden Delicious' genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies.

  9. Genetic mapping using the Diversity Arrays Technology (DArT) : application and validation using the whole-genome sequences of Arabidopsis thaliana and the fungal wheat pathogen Mycosphaerella graminicola

    NARCIS (Netherlands)

    Wittenberg, A.H.J.

    2007-01-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds- to thousands of restriction site based polymorphisms between genotypes and does not require DNA sequence

  10. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

    Science.gov (United States)

    Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

    2010-04-27

    To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7 x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions decreasing genetic diversity which might be

  11. Uninformative polymorphisms bias genome scans for signatures of selection

    Directory of Open Access Journals (Sweden)

    Roesti Marius

    2012-06-01

    Full Text Available Abstract Background With the establishment of high-throughput sequencing technologies and new methods for rapid and extensive single nucleotide (SNP discovery, marker-based genome scans in search of signatures of divergent selection between populations occupying ecologically distinct environments are becoming increasingly popular. Methods and Results On the basis of genome-wide SNP marker data generated by RAD sequencing of lake and stream stickleback populations, we show that the outcome of such studies can be systematically biased if markers with a low minor allele frequency are included in the analysis. The reason is that these ‘uninformative’ polymorphisms lack the adequate potential to capture signatures of drift and hitchhiking, the focal processes in ecological genome scans. Bias associated with uninformative polymorphisms is not eliminated by just avoiding technical artifacts in the data (PCR and sequencing errors, as a high proportion of SNPs with a low minor allele frequency is a general biological feature of natural populations. Conclusions We suggest that uninformative markers should be excluded from genome scans based on empirical criteria derived from careful inspection of the data, and that these criteria should be reported explicitly. Together, this should increase the quality and comparability of genome scans, and hence promote our understanding of the processes driving genomic differentiation.

  12. Genome-to-genome analysis highlights the impact of the human innate and adaptive immune systems on the hepatitis C virus

    Science.gov (United States)

    Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C. A.

    2018-01-01

    Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. We use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals chronically infected with HCV, predominately genotype 3. We show that both HLA alleles and interferon lambda innate immune system genes drive viral genome polymorphism, and that IFNL4 genotypes determine HCV viral load through a mechanism that is dependent on a specific polymorphism in the HCV polyprotein. We highlight the interplay between innate immune responses and the viral genome in HCV control. PMID:28394351

  13. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    Science.gov (United States)

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  14. Mosaic maternal uniparental disomy of chromosome 15 in Prader-Willi syndrome: utility of genome-wide SNP array.

    Science.gov (United States)

    Izumi, Kosuke; Santani, Avni B; Deardorff, Matthew A; Feret, Holly A; Tischler, Tanya; Thiel, Brian D; Mulchandani, Surabhi; Stolle, Catherine A; Spinner, Nancy B; Zackai, Elaine H; Conlin, Laura K

    2013-01-01

    Prader-Willi syndrome is caused by the loss of paternal gene expression on 15q11.2-q13.2, and one of the mechanisms resulting in Prader-Willi syndrome phenotype is maternal uniparental disomy of chromosome 15. Various mechanisms including trisomy rescue, monosomy rescue, and post fertilization errors can lead to uniparental disomy, and its mechanism can be inferred from the pattern of uniparental hetero and isodisomy. Detection of a mosaic cell line provides a unique opportunity to understand the mechanism of uniparental disomy; however, mosaic uniparental disomy is a rare finding in patients with Prader-Willi syndrome. We report on two infants with Prader-Willi syndrome caused by mosaic maternal uniparental disomy 15. Patient 1 has mosaic uniparental isodisomy of the entire chromosome 15, and Patient 2 has mosaic uniparental mixed iso/heterodisomy 15. Genome-wide single-nucleotide polymorphism array was able to demonstrate the presence of chromosomally normal cell line in the Patient 1 and trisomic cell line in Patient 2, and provide the evidence that post-fertilization error and trisomy rescue as a mechanism of uniparental disomy in each case, respectively. Given its ability of detecting small percent mosaicism as well as its capability of identifying the loss of heterozygosity of chromosomal regions, genome-wide single-nucleotide polymorphism array should be utilized as an adjunct to the standard methylation analysis in the evaluation of Prader-Willi syndrome. Copyright © 2012 Wiley Periodicals, Inc.

  15. Sequence based polymorphic (SBP marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Abstract Background Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs for any genomic regions. Here a sequence based polymorphic (SBP marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype, Niederzenz (Nd-0 was obtained by applying Illumina's sequencing by synthesis (Solexa technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0 genome sequence identified putative single nucleotide polymorphisms (SNPs throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species, whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for

  16. High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

    Science.gov (United States)

    2011-01-01

    Background High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. The development of a much larger array of informative SNPs across

  17. High-density single nucleotide polymorphism (SNP) array mapping in Brassica oleracea: identification of QTL associated with carotenoid variation in broccoli florets.

    Science.gov (United States)

    Brown, Allan F; Yousef, Gad G; Chebrolu, Kranthi K; Byrd, Robert W; Everhart, Koyt W; Thomas, Aswathy; Reid, Robert W; Parkin, Isobel A P; Sharpe, Andrew G; Oliver, Rebekah; Guzman, Ivette; Jackson, Eric W

    2014-09-01

    A high-resolution genetic linkage map of B. oleracea was developed from a B. napus SNP array. The work will facilitate genetic and evolutionary studies in Brassicaceae. A broccoli population, VI-158 × BNC, consisting of 150 F2:3 families was used to create a saturated Brassica oleracea (diploid: CC) linkage map using a recently developed rapeseed (Brassica napus) (tetraploid: AACC) Illumina Infinium single nucleotide polymorphism (SNP) array. The map consisted of 547 non-redundant SNP markers spanning 948.1 cM across nine chromosomes with an average interval size of 1.7 cM. As the SNPs are anchored to the genomic reference sequence of the rapid cycling B. oleracea TO1000, we were able to estimate that the map provides 96 % coverage of the diploid genome. Carotenoid analysis of 2 years data identified 3 QTLs on two chromosomes that are associated with up to half of the phenotypic variation associated with the accumulation of total or individual compounds. By searching the genome sequences of the two related diploid species (B. oleracea and B. rapa), we further identified putative carotenoid candidate genes in the region of these QTLs. This is the first description of the use of a B. napus SNP array to rapidly construct high-density genetic linkage maps of one of the constituent diploid species. The unambiguous nature of these markers with regard to genomic sequences provides evidence to the nature of genes underlying the QTL, and demonstrates the value and impact this resource will have on Brassica research.

  18. SAQC: SNP Array Quality Control

    Directory of Open Access Journals (Sweden)

    Li Ling-Hui

    2011-04-01

    Full Text Available Abstract Background Genome-wide single-nucleotide polymorphism (SNP arrays containing hundreds of thousands of SNPs from the human genome have proven useful for studying important human genome questions. Data quality of SNP arrays plays a key role in the accuracy and precision of downstream data analyses. However, good indices for assessing data quality of SNP arrays have not yet been developed. Results We developed new quality indices to measure the quality of SNP arrays and/or DNA samples and investigated their statistical properties. The indices quantify a departure of estimated individual-level allele frequencies (AFs from expected frequencies via standardized distances. The proposed quality indices followed lognormal distributions in several large genomic studies that we empirically evaluated. AF reference data and quality index reference data for different SNP array platforms were established based on samples from various reference populations. Furthermore, a confidence interval method based on the underlying empirical distributions of quality indices was developed to identify poor-quality SNP arrays and/or DNA samples. Analyses of authentic biological data and simulated data show that this new method is sensitive and specific for the detection of poor-quality SNP arrays and/or DNA samples. Conclusions This study introduces new quality indices, establishes references for AFs and quality indices, and develops a detection method for poor-quality SNP arrays and/or DNA samples. We have developed a new computer program that utilizes these methods called SNP Array Quality Control (SAQC. SAQC software is written in R and R-GUI and was developed as a user-friendly tool for the visualization and evaluation of data quality of genome-wide SNP arrays. The program is available online (http://www.stat.sinica.edu.tw/hsinchou/genetics/quality/SAQC.htm.

  19. Diversity arrays technology (DArT) markers in apple for genetic linkage maps

    NARCIS (Netherlands)

    Schouten, H.J.; Weg, van de W.E.; Carling, J.; Khan, S.A.; McKay, S.J.; Kaauwen, van M.P.W.

    2012-01-01

    Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for

  20. Bioinformatics analysis of SARS coronavirus genome polymorphism

    Directory of Open Access Journals (Sweden)

    Pavlović-Lažetić Gordana M

    2004-05-01

    Full Text Available Abstract Background We have compared 38 isolates of the SARS-CoV complete genome. The main goal was twofold: first, to analyze and compare nucleotide sequences and to identify positions of single nucleotide polymorphism (SNP, insertions and deletions, and second, to group them according to sequence similarity, eventually pointing to phylogeny of SARS-CoV isolates. The comparison is based on genome polymorphism such as insertions or deletions and the number and positions of SNPs. Results The nucleotide structure of all 38 isolates is presented. Based on insertions and deletions and dissimilarity due to SNPs, the dataset of all the isolates has been qualitatively classified into three groups each having their own subgroups. These are the A-group with "regular" isolates (no insertions / deletions except for 5' and 3' ends, the B-group of isolates with "long insertions", and the C-group of isolates with "many individual" insertions and deletions. The isolate with the smallest average number of SNPs, compared to other isolates, has been identified (TWH. The density distribution of SNPs, insertions and deletions for each group or subgroup, as well as cumulatively for all the isolates is also presented, along with the gene map for TWH. Since individual SNPs may have occurred at random, positions corresponding to multiple SNPs (occurring in two or more isolates are identified and presented. This result revises some previous results of a similar type. Amino acid changes caused by multiple SNPs are also identified (for the annotated sequences, as well as presupposed amino acid changes for non-annotated ones. Exact SNP positions for the isolates in each group or subgroup are presented. Finally, a phylogenetic tree for the SARS-CoV isolates has been produced using the CLUSTALW program, showing high compatibility with former qualitative classification. Conclusions The comparative study of SARS-CoV isolates provides essential information for genome

  1. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    Science.gov (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variations have been well-studied in some bacteria there seems a lot of other species of prokaryotes yet to be investigated for SSR mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes and extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.

  2. Intra-strain polymorphisms are detected but no genomic alteration is found in cloned mice

    International Nuclear Information System (INIS)

    Gotoh, Koshichi; Inoue, Kimiko; Ogura, Atsuo; Oishi, Michio

    2006-01-01

    In-gel competitive reassociation (IGCR) is a method for differential subtraction of polymorphic (RFLP) DNA fragments between two DNA samples of interest without probes or specific sequence information. Here, we applied the IGCR procedure to two cloned mice derived from an F1 hybrid of the C57BL/6Cr and DBA/2 strains, in order to investigate the possibility of genomic alteration in the cloned mouse genomes. Each of the five of the genomic alterations we detected between the two cloned mice corresponded to the 'intra-strain' polymorphisms in the C57BL/6Cr and DBA/2 mouse strains. Our result suggests that no severe aberration of genome sequences occurs due to somatic cell nuclear transfer

  3. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Science.gov (United States)

    Krishnan S, Gopala; Waters, Daniel L E; Henry, Robert J

    2014-01-01

    Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represents a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  4. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Directory of Open Access Journals (Sweden)

    Gopala Krishnan S

    Full Text Available BACKGROUND: Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represents a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. RESULTS: We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts. Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. CONCLUSIONS: Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  5. Fitness consequences of polymorphic inversions in the zebra finch genome.

    Science.gov (United States)

    Knief, Ulrich; Hemmrich-Stanisak, Georg; Wittig, Michael; Franke, Andre; Griffith, Simon C; Kempenaers, Bart; Forstmeier, Wolfgang

    2016-09-29

    Inversion polymorphisms constitute an evolutionary puzzle: they should increase embryo mortality in heterokaryotypic individuals but still they are widespread in some taxa. Some insect species have evolved mechanisms to reduce the cost of embryo mortality but humans have not. In birds, a detailed analysis is missing although intraspecific inversion polymorphisms are regarded as common. In Australian zebra finches (Taeniopygia guttata), two polymorphic inversions are known cytogenetically and we set out to detect these two and potentially additional inversions using genomic tools and study their effects on embryo mortality and other fitness-related and morphological traits. Using whole-genome SNP data, we screened 948 wild zebra finches for polymorphic inversions and describe four large (12-63 Mb) intraspecific inversion polymorphisms with allele frequencies close to 50 %. Using additional data from 5229 birds and 9764 eggs from wild and three captive zebra finch populations, we show that only the largest inversions increase embryo mortality in heterokaryotypic males, with surprisingly small effect sizes. We test for a heterozygote advantage on other fitness components but find no evidence for heterosis for any of the inversions. Yet, we find strong additive effects on several morphological traits. The mechanism that has carried the derived inversion haplotypes to such high allele frequencies remains elusive. It appears that selection has effectively minimized the costs associated with inversions in zebra finches. The highly skewed distribution of recombination events towards the chromosome ends in zebra finches and other estrildid species may function to minimize crossovers in the inverted regions.

  6. Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar)

    Science.gov (United States)

    2014-01-01

    Background Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. Results SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. Conclusions This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in

  7. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Full Text Available Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19(th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09 and dw3 (SBI-07 had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated, however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.

  8. Streptococcus pneumoniae Supragenome Hybridization Arrays for Profiling of Genetic Content and Gene Expression.

    Science.gov (United States)

    Kadam, Anagha; Janto, Benjamin; Eutsey, Rory; Earl, Joshua P; Powell, Evan; Dahlgren, Margaret E; Hu, Fen Z; Ehrlich, Garth D; Hiller, N Luisa

    2015-02-02

    There is extensive genomic diversity among Streptococcus pneumoniae isolates. Approximately half of the comprehensive set of genes in the species (the supragenome or pangenome) is present in all the isolates (core set), and the remaining is unevenly distributed among strains (distributed set). The Streptococcus pneumoniae Supragenome Hybridization (SpSGH) array provides coverage for an extensive set of genes and polymorphisms encountered within this species, capturing this genomic diversity. Further, the capture is quantitative. In this manner, the SpSGH array allows for both genomic and transcriptomic analyses of diverse S. pneumoniae isolates on a single platform. In this unit, we present the SpSGH array, and describe in detail its design and implementation for both genomic and transcriptomic analyses. The methodology can be applied to construction and modification of SpSGH array platforms, as well to other bacterial species as long as multiple whole-genome sequences are available that collectively capture the vast majority of the species supragenome. Copyright © 2015 John Wiley & Sons, Inc.

  9. Insertion and deletion polymorphisms of the ancient AluS family in the human genome.

    Science.gov (United States)

    Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M

    2017-01-01

    Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion

  10. High-resolution SNP array analysis of patients with developmental disorder and normal array CGH results

    Directory of Open Access Journals (Sweden)

    Siggberg Linda

    2012-09-01

    Full Text Available Abstract Background Diagnostic analysis of patients with developmental disorders has improved over recent years largely due to the use of microarray technology. Array methods that facilitate copy number analysis have enabled the diagnosis of up to 20% more patients with previously normal karyotyping results. A substantial number of patients remain undiagnosed, however. Methods and Results Using the Genome-Wide Human SNP array 6.0, we analyzed 35 patients with a developmental disorder of unknown cause and normal array comparative genomic hybridization (array CGH results, in order to characterize previously undefined genomic aberrations. We detected no seemingly pathogenic copy number aberrations. Most of the vast amount of data produced by the array was polymorphic and non-informative. Filtering of this data, based on copy number variant (CNV population frequencies as well as phenotypically relevant genes, enabled pinpointing regions of allelic homozygosity that included candidate genes correlating to the phenotypic features in four patients, but results could not be confirmed. Conclusions In this study, the use of an ultra high-resolution SNP array did not contribute to further diagnose patients with developmental disorders of unknown cause. The statistical power of these results is limited by the small size of the patient cohort, and interpretation of these negative results can only be applied to the patients studied here. We present the results of our study and the recurrence of clustered allelic homozygosity present in this material, as detected by the SNP 6.0 array.

  11. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome.

    Science.gov (United States)

    Sharp, Andrew J; Hansen, Sierra; Selzer, Rebecca R; Cheng, Ze; Regan, Regina; Hurst, Jane A; Stewart, Helen; Price, Sue M; Blair, Edward; Hennekam, Raoul C; Fitzpatrick, Carrie A; Segraves, Rick; Richmond, Todd A; Guiver, Cheryl; Albertson, Donna G; Pinkel, Daniel; Eis, Peggy S; Schwartz, Stuart; Knight, Samantha J L; Eichler, Evan E

    2006-09-01

    Genomic disorders are characterized by the presence of flanking segmental duplications that predispose these regions to recurrent rearrangement. Based on the duplication architecture of the genome, we investigated 130 regions that we hypothesized as candidates for previously undescribed genomic disorders. We tested 290 individuals with mental retardation by BAC array comparative genomic hybridization and identified 16 pathogenic rearrangements, including de novo microdeletions of 17q21.31 found in four individuals. Using oligonucleotide arrays, we refined the breakpoints of this microdeletion, defining a 478-kb critical region containing six genes that were deleted in all four individuals. We mapped the breakpoints of this deletion and of four other pathogenic rearrangements in 1q21.1, 15q13, 15q24 and 17q12 to flanking segmental duplications, suggesting that these are also sites of recurrent rearrangement. In common with the 17q21.31 deletion, these breakpoint regions are sites of copy number polymorphism in controls, indicating that these may be inherently unstable genomic regions.

  12. Methylation-Sensitive Amplification Length Polymorphism (MS-AFLP) Microarrays for Epigenetic Analysis of Human Genomes.

    Science.gov (United States)

    Alonso, Sergio; Suzuki, Koichi; Yamamoto, Fumiichiro; Perucho, Manuel

    2018-01-01

    Somatic, and in a minor scale also germ line, epigenetic aberrations are fundamental to carcinogenesis, cancer progression, and tumor phenotype. DNA methylation is the most extensively studied and arguably the best understood epigenetic mechanisms that become altered in cancer. Both somatic loss of methylation (hypomethylation) and gain of methylation (hypermethylation) are found in the genome of malignant cells. In general, the cancer cell epigenome is globally hypomethylated, while some regions-typically gene-associated CpG islands-become hypermethylated. Given the profound impact that DNA methylation exerts on the transcriptional profile and genomic stability of cancer cells, its characterization is essential to fully understand the complexity of cancer biology, improve tumor classification, and ultimately advance cancer patient management and treatment. A plethora of methods have been devised to analyze and quantify DNA methylation alterations. Several of the early-developed methods relied on the use of methylation-sensitive restriction enzymes, whose activity depends on the methylation status of their recognition sequences. Among these techniques, methylation-sensitive amplification length polymorphism (MS-AFLP) was developed in the early 2000s, and successfully adapted from its original gel electrophoresis fingerprinting format to a microarray format that notably increased its throughput and allowed the quantification of the methylation changes. This array-based platform interrogates over 9500 independent loci putatively amplified by the MS-AFLP technique, corresponding to the NotI sites mapped throughout the human genome.

  13. Evaluation of multiple approaches to identify genome-wide polymorphisms in closely related genotypes of sweet cherry (Prunus avium L.

    Directory of Open Access Journals (Sweden)

    Seanna Hewitt

    Full Text Available Identification of genetic polymorphisms and subsequent development of molecular markers is important for marker assisted breeding of superior cultivars of economically important species. Sweet cherry (Prunus avium L. is an economically important non-climacteric tree fruit crop in the Rosaceae family and has undergone a genetic bottleneck due to breeding, resulting in limited genetic diversity in the germplasm that is utilized for breeding new cultivars. Therefore, it is critical to recognize the best platforms for identifying genome-wide polymorphisms that can help identify, and consequently preserve, the diversity in a genetically constrained species. For the identification of polymorphisms in five closely related genotypes of sweet cherry, a gel-based approach (TRAP, reduced representation sequencing (TRAPseq, a 6k cherry SNParray, and whole genome sequencing (WGS approaches were evaluated in the identification of genome-wide polymorphisms in sweet cherry cultivars. All platforms facilitated detection of polymorphisms among the genotypes with variable efficiency. In assessing multiple SNP detection platforms, this study has demonstrated that a combination of appropriate approaches is necessary for efficient polymorphism identification, especially between closely related cultivars of a species. The information generated in this study provides a valuable resource for future genetic and genomic studies in sweet cherry, and the insights gained from the evaluation of multiple approaches can be utilized for other closely related species with limited genetic diversity in the breeding germplasm. Keywords: Polymorphisms, Prunus avium, Next-generation sequencing, Target region amplification polymorphism (TRAP, Genetic diversity, SNParray, Reduced representation sequencing, Whole genome sequencing (WGS

  14. Balanced into array : genome-wide array analysis in 54 patients with an apparently balanced de novo chromosome rearrangement and a meta-analysis

    NARCIS (Netherlands)

    Feenstra, Ilse; Hanemaaijer, Nicolien; Sikkema-Raddatz, Birgit; Yntema, Helger; Dijkhuizen, Trijnie; Lugtenberg, Dorien; Verheij, Joke; Green, Andrew; Hordijk, Roel; Reardon, William; de Vries, Bert; Brunner, Han; Bongers, Ernie; de Leeuw, Nicole; van Ravenswaaij-Arts, Conny

    2011-01-01

    High-resolution genome-wide array analysis enables detailed screening for cryptic and submicroscopic imbalances of microscopically balanced de novo rearrangements in patients with developmental delay and/or congenital abnormalities. In this report, we added the results of genome-wide array analysis

  15. Probing genomic diversity and evolution of Streptococcus suis serotype 2 by NimbleGen tiling arrays

    Directory of Open Access Journals (Sweden)

    Liao Hui

    2011-05-01

    Full Text Available Abstract Background Our previous studies revealed that a new disease form of streptococcal toxic shock syndrome (STSS is associated with specific Streptococcus suis serotype 2 (SS2 strains. To achieve a better understanding of the pathogenicity and evolution of SS2 at the whole-genome level, comparative genomic analysis of 18 SS2 strains, selected on the basis of virulence and geographic origin, was performed using NimbleGen tiling arrays. Results Our results demonstrate that SS2 isolates have highly divergent genomes. The 89K pathogenicity island (PAI, which has been previously recognized as unique to the Chinese epidemic strains causing STSS, was partially included in some other virulent and avirulent strains. The ABC-type transport systems, encoded by 89K, were hypothesized to greatly contribute to the catastrophic features of STSS. Moreover, we identified many polymorphisms in genes encoding candidate or known virulence factors, such as PlcR, lipase, sortases, the pilus-associated proteins, and the response regulator RevS and CtsR. On the basis of analysis of regions of differences (RDs across the entire genome for the 18 selected SS2 strains, a model of microevolution for these strains is proposed, which provides clues into Streptococcus pathogenicity and evolution. Conclusions Our deep comparative genomic analysis of the 89K PAI present in the genome of SS2 strains revealed details into how some virulent strains acquired genes that may contribute to STSS, which may lead to better environmental monitoring of epidemic SS2 strains.

  16. Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.

    Science.gov (United States)

    Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary

    2012-03-01

    Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.

  17. A Universal Genome Array and Transcriptome Atlas for Brachypodium Distachyon

    Energy Technology Data Exchange (ETDEWEB)

    Mockler, Todd [Oregon State Univ., Corvallis, OR (United States)

    2017-04-17

    Brachypodium distachyon is the premier experimental model grass platform and is related to candidate feedstock crops for bioethanol production. Based on the DOE-JGI Brachypodium Bd21 genome sequence and annotation we designed a whole genome DNA microarray platform. The quality of this array platform is unprecedented due to the exceptional quality of the Brachypodium genome assembly and annotation and the stringent probe selection criteria employed in the design. We worked with members of the international community and the bioinformatics/design team at Affymetrix at all stages in the development of the array. We used the Brachypodium arrays to interrogate the transcriptomes of plants grown in a variety of environmental conditions including diurnal and circadian light/temperature conditions and under a variety of environmental conditions. We examined the transciptional responses of Brachypodium seedlings subjected to various abiotic stresses including heat, cold, salt, and high intensity light. We generated a gene expression atlas representing various organs and developmental stages. The results of these efforts including all microarray datasets are published and available at online public databases.

  18. DELISHUS: an efficient and exact algorithm for genome-wide detection of deletion polymorphism in autism

    Science.gov (United States)

    Aguiar, Derek; Halldórsson, Bjarni V.; Morrow, Eric M.; Istrail, Sorin

    2012-01-01

    Motivation: The understanding of the genetic determinants of complex disease is undergoing a paradigm shift. Genetic heterogeneity of rare mutations with deleterious effects is more commonly being viewed as a major component of disease. Autism is an excellent example where research is active in identifying matches between the phenotypic and genomic heterogeneities. A considerable portion of autism appears to be correlated with copy number variation, which is not directly probed by single nucleotide polymorphism (SNP) array or sequencing technologies. Identifying the genetic heterogeneity of small deletions remains a major unresolved computational problem partly due to the inability of algorithms to detect them. Results: In this article, we present an algorithmic framework, which we term DELISHUS, that implements three exact algorithms for inferring regions of hemizygosity containing genomic deletions of all sizes and frequencies in SNP genotype data. We implement an efficient backtracking algorithm—that processes a 1 billion entry genome-wide association study SNP matrix in a few minutes—to compute all inherited deletions in a dataset. We further extend our model to give an efficient algorithm for detecting de novo deletions. Finally, given a set of called deletions, we also give a polynomial time algorithm for computing the critical regions of recurrent deletions. DELISHUS achieves significantly lower false-positive rates and higher power than previously published algorithms partly because it considers all individuals in the sample simultaneously. DELISHUS may be applied to SNP array or sequencing data to identify the deletion spectrum for family-based association studies. Availability: DELISHUS is available at http://www.brown.edu/Research/Istrail_Lab/. Contact: Eric_Morrow@brown.edu and Sorin_Istrail@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22689755

  19. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple.

    Directory of Open Access Journals (Sweden)

    David Chagné

    Full Text Available As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of 'Golden Delicious', SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional, and genomic selection in apple.

  20. Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple

    Science.gov (United States)

    Chagné, David; Crowhurst, Ross N.; Troggio, Michela; Davey, Mark W.; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P.; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D.; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E.; Bassil, Nahla; Peace, Cameron

    2012-01-01

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple. PMID:22363718

  1. Oligonucleotide array discovery of polymorphisms in cultivated tomato (Solanum lycopersicum L. reveals patterns of SNP variation associated with breeding

    Directory of Open Access Journals (Sweden)

    Zhu Tong

    2009-10-01

    Full Text Available Abstract Background Cultivated tomato (Solanum lycopersicum L. has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP discovery as a high-throughput approach for marker development in cultivated tomato. Results Three varieties, FL7600 (fresh-market, OH9242 (processing, and PI114490 (cherry were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α used to define the confidence interval (CI, and ranged from 76% for polymorphisms identified at α ≤ 10-6 to 60% for those identified at α ≤ 10-2. Validation percentage reached a plateau between α ≤ 10-4 and α ≤ 10-7, but failure to identify known SFPs (Type II error increased dramatically at α ≤ 10-6. Trough sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace, and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka to synonymous substitutions (Ks for 20 loci with multiple SNPs (≥ 4 per

  2. Microarray-based genomic surveying of gene polymorphisms in Chlamydia trachomatis

    OpenAIRE

    Brunelle, Brian W; Nicholson, Tracy L; Stephens, Richard S

    2004-01-01

    By comparing two fully sequenced genomes of Chlamydia trachomatis using competitive hybridization on DNA microarrays, a logarithmic correlation was demonstrated between the signal ratio of the arrays and the 75-99% range of nucleotide identities of the genes. Variable genes within 14 uncharacterized strains of C. trachomatis were identified by array analysis and verified by DNA sequencing. These genes may be crucial for understanding chlamydial virulence and pathogenesis.

  3. Genomic Relatedness of Chlamydia Isolates Determined by Amplified Fragment Length Polymorphism Analysis

    OpenAIRE

    Meijer, Adam; Morré, Servaas A.; Van Den Brule, Adriaan J. C.; Savelkoul, Paul H. M.; Ossewaarde, Jacobus M.

    1999-01-01

    The genomic relatedness of 19 Chlamydia pneumoniae isolates (17 from respiratory origin and 2 from atherosclerotic origin), 21 Chlamydia trachomatis isolates (all serovars from the human biovar, an isolate from the mouse biovar, and a porcine isolate), 6 Chlamydia psittaci isolates (5 avian isolates and 1 feline isolate), and 1 Chlamydia pecorum isolate was studied by analyzing genomic amplified fragment length polymorphism (AFLP) fingerprints. The AFLP procedure was adapted from a previously...

  4. Concentrating and labeling genomic DNA in a nanofluidic array

    DEFF Research Database (Denmark)

    Marie, Rodolphe; Pedersen, Jonas Nyvold; Mir, Kalim U.

    2018-01-01

    , however, hinder the polymerase activity. We demonstrate a device and a protocol for the enzymatic labeling of genomic DNA arranged in a dense array of single molecules without attaching the enzyme or the DNA to a surface. DNA molecules accumulate in a dense array of pits embedded within a nanoslit due...... to entropic trapping. We then perform ϕ29 polymerase extension from single-strand nicks created on the trapped molecules to incorporate fluorescent nucleotides into the DNA. The array of entropic traps can be loaded with λ-DNA molecules to more than 90% of capacity at a flow rate of 10 pL min-1. The final...

  5. Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio).

    Science.gov (United States)

    Xu, Jian; Zhao, Zixia; Zhang, Xiaofeng; Zheng, Xianhu; Li, Jiongtang; Jiang, Yanliang; Kuang, Youyi; Zhang, Yan; Feng, Jianxin; Li, Chuangju; Yu, Juhua; Li, Qiang; Zhu, Yuanyuan; Liu, Yuanyuan; Xu, Peng; Sun, Xiaowen

    2014-04-24

    A large number of single nucleotide polymorphisms (SNPs) have been identified in common carp (Cyprinus carpio) but, as yet, no high-throughput genotyping platform is available for this species. C. carpio is an important aquaculture species that accounts for nearly 14% of freshwater aquaculture production worldwide. We have developed an array for C. carpio with 250,000 SNPs and evaluated its performance using samples from various strains of C. carpio. The SNPs used on the array were selected from two resources: the transcribed sequences from RNA-seq data of four strains of C. carpio, and the genome re-sequencing data of five strains of C. carpio. The 250,000 SNPs on the resulting array are distributed evenly across the reference C.carpio genome with an average spacing of 6.6 kb. To evaluate the SNP array, 1,072 C. carpio samples were collected and tested. Of the 250,000 SNPs on the array, 185,150 (74.06%) were found to be polymorphic sites. Genotyping accuracy was checked using genotyping data from a group of full-siblings and their parents, and over 99.8% of the qualified SNPs were found to be reliable. Analysis of the linkage disequilibrium on all samples and on three domestic C.carpio strains revealed that the latter had the longer haplotype blocks. We also evaluated our SNP array on 80 samples from eight species related to C. carpio, with from 53,526 to 71,984 polymorphic SNPs. An identity by state analysis divided all the samples into three clusters; most of the C. carpio strains formed the largest cluster. The Carp SNP array described here is the first high-throughput genotyping platform for C. carpio. Our evaluation of this array indicates that it will be valuable for farmed carp and for genetic and population biology studies in C. carpio and related species.

  6. Three gangliogliomas: results of GTG-banding, SKY, genome-wide high resolution SNP-array, gene expression and review of the literature.

    Science.gov (United States)

    Xu, Li-Xin; Holland, Heidrun; Kirsten, Holger; Ahnert, Peter; Krupp, Wolfgang; Bauer, Manfred; Schober, Ralf; Mueller, Wolf; Fritzsch, Dominik; Meixensberger, Jürgen; Koschny, Ronald

    2015-04-01

    According to the World Health Organization gangliogliomas are classified as well-differentiated and slowly growing neuroepithelial tumors, composed of neoplastic mature ganglion and glial cells. It is the most frequent tumor entity observed in patients with long-term epilepsy. Comprehensive cytogenetic and molecular cytogenetic data including high-resolution genomic profiling (single nucleotide polymorphism (SNP)-array) of gangliogliomas are scarce but necessary for a better oncological understanding of this tumor entity. For a detailed characterization at the single cell and cell population levels, we analyzed genomic alterations of three gangliogliomas using trypsin-Giemsa banding (GTG-banding) and by spectral karyotyping (SKY) in combination with SNP-array and gene expression array experiments. By GTG and SKY, we could confirm frequently detected chromosomal aberrations (losses within chromosomes 10, 13 and 22; gains within chromosomes 5, 7, 8 and 12), and identify so far unknown genetic aberrations like the unbalanced non-reciprocal translocation t(1;18)(q21;q21). Interestingly, we report on the second so far detected ganglioglioma with ring chromosome 1. Analyses of SNP-array data from two of the tumors and respective germline DNA (peripheral blood) identified few small gains and losses and a number of copy-neutral regions with loss of heterozygosity (LOH) in germline and in tumor tissue. In comparison to germline DNA, tumor tissues did not show substantial regions with significant loss or gain or with newly developed LOH. Gene expression analyses of tumor-specific genes revealed similarities in the profile of the analyzed samples regarding different relevant pathways. Taken together, we describe overlapping but also distinct and novel genetic aberrations of three gangliogliomas. © 2014 Japanese Society of Neuropathology.

  7. Comparing genetic variants detected in the 1000 genomes project ...

    Indian Academy of Sciences (India)

    Single-nucleotide polymorphisms (SNPs) determined based on SNP arrays from the international HapMap consortium (HapMap) and the genetic variants detected in the 1000 genomes project (1KGP) can serve as two references for genomewide association studies (GWAS). We conducted comparative analyses to provide ...

  8. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex changes and multiple forms of chromosomal instability in colorectal cancers

    DEFF Research Database (Denmark)

    Gaasenbeek, Michelle; Howarth, Kimberley; Rowan, Andrew J

    2006-01-01

    Cancers with chromosomal instability (CIN) are held to be aneuploid/polyploid with multiple large-scale gains/deletions, but the processes underlying CIN are unclear and different types of CIN might exist. We investigated colorectal cancer cell lines using array-comparative genomic hybridization...

  9. Spatially Defined Oligonucleotide Arrays. Technical Report for Phase II; FINAL

    International Nuclear Information System (INIS)

    None

    2000-01-01

    The goal of the Human Genome Project is to sequence all 3 billion base pairs of the human genome. Progress in this has been rapid; GenBank(reg s ign) finished 1994 with 286 million bases of sequence and grew by 2470 in the first quarter of 1995. The challenge to the scientific community is to understand the biological relevance of this genetic information. In most cases the sequence being generated for any single region of the genome represents the genotype of a single individual. A complete understanding of the function of specific genes and other regions of the genome and their role in human disease and development will only become apparent when the sequence of many more individuals is known. Access to genetic information is ultimately limited by the ability to screen DNA sequence. Although the pioneering sequencing methods of Sanger et al. (15) and Maxam and Gilbert (11) have become standard in virtually all molecular biology laboratories, the basic protocols remain largely unchanged. The throughput of this sequencing technology is now becoming the rate-limiting step in both large-scale sequencing projects such as the Human Genome Project and the subsequent efforts to understand genetic diversity. This has inspired the development of advanced DNA sequencing technologies (9), Incremental improvements to Sanger sequencing have been made in DNA labeling and detection. High-speed electrophoresis methods using ultrathin gels or capillary arrays are now being more widely employed. However, these methods are throughput-limited by their sequential nature and the speed and resolution of separations. This limitation will become more pronounced as the need to rapidly screen newly discovered genes for biologically relevant polymorphisms increases. An alternative to gel-based sequencing is to use high-density oligonucleotide probe arrays. Oligonucleotide probe arrays display specific oligonucleotide probes at precise locations in a high density, information-rich format (5

  10. Whole-Genome Sequencing and iPLEX MassARRAY Genotyping Map an EMS-Induced Mutation Affecting Cell Competition in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Chang-Hyun Lee

    2016-10-01

    Full Text Available Cell competition, the conditional loss of viable genotypes only when surrounded by other cells, is a phenomenon observed in certain genetic mosaic conditions. We conducted a chemical mutagenesis and screen to recover new mutations that affect cell competition between wild-type and RpS3 heterozygous cells. Mutations were identified by whole-genome sequencing, making use of software tools that greatly facilitate the distinction between newly induced mutations and other sources of apparent sequence polymorphism, thereby reducing false-positive and false-negative identification rates. In addition, we utilized iPLEX MassARRAY for genotyping recombinant chromosomes. These approaches permitted the mapping of a new mutation affecting cell competition when only a single allele existed, with a phenotype assessed only in genetic mosaics, without the benefit of complementation with existing mutations, deletions, or duplications. These techniques expand the utility of chemical mutagenesis and whole-genome sequencing for mutant identification. We discuss mutations in the Atm and Xrp1 genes identified in this screen.

  11. Genome-wide patterns of nucleotide polymorphism in domesticated rice

    DEFF Research Database (Denmark)

    Caicedo, Ana L; Williamson, Scott H; Hernandez, Ryan D

    2007-01-01

    Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments......, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models...... to explain contemporary patterns of polymorphisms in rice, including a (i) selectively neutral population bottleneck model, (ii) bottleneck plus migration model, (iii) multiple selective sweeps model, and (iv) bottleneck plus selective sweeps model. We find that a simple bottleneck model, which has been...

  12. Genome-wide association study for ovarian cancer susceptibility using pooled DNA.

    NARCIS (Netherlands)

    Lu, Y.; Chen, X.; Beesley, J.; Johnatty, S.E.; Defazio, A.; Lambrechts, S.; Lambrechts, D.; Despierre, E.; Vergotes, I.; Chang-Claude, J.; Hein, R.; Nickels, S.; Wang-Gohrke, S.; Dork, T.; Durst, M.; Antonenkova, N.; Bogdanova, N.; Goodman, M.T.; Lurie, G.; Wilkens, L.R.; Carney, M.E.; Butzow, R.; Nevanlinna, H.; Heikkinen, T.; Leminen, A.; Kiemeney, L.A.L.M.; Massuger, L.F.A.G.; Altena, A.M. van; Aben, K.K.H.; Kjaer, S.K.; Hogdall, E.; Jensen, A.; Brooks-Wilson, A.; Le, N.; Cook, L.; Earp, M.; Kelemen, L.; Easton, D.; Pharoah, P.; Song, H.; Tyrer, J.; Ramus, S.; Menon, U.; Gentry-Maharaj, A.; Gayther, S.A.; Bandera, E.V.; Olson, S.H.; Orlow, I.; Rodriguez-Rodriguez, L.; MacGregor, S.; Chenevix-Trench, G.

    2012-01-01

    Recent Genome-Wide Association Studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate- or low-penetrance variants exist among the subset of single-nucleotide polymorphisms (SNPs) not well tagged by the genotyping arrays used in

  13. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    Full Text Available The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR mediated by low-copy repeats (LCRs. Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  14. Effects of As2O3 on DNA methylation, genomic instability, and LTR retrotransposon polymorphism in Zea mays.

    Science.gov (United States)

    Erturk, Filiz Aygun; Aydin, Murat; Sigmaz, Burcu; Taspinar, M Sinan; Arslan, Esra; Agar, Guleray; Yagci, Semra

    2015-12-01

    Arsenic is a well-known toxic substance on the living organisms. However, limited efforts have been made to study its DNA methylation, genomic instability, and long terminal repeat (LTR) retrotransposon polymorphism causing properties in different crops. In the present study, effects of As2O3 (arsenic trioxide) on LTR retrotransposon polymorphism and DNA methylation as well as DNA damage in Zea mays seedlings were investigated. The results showed that all of arsenic doses caused a decreasing genomic template stability (GTS) and an increasing Random Amplified Polymorphic DNAs (RAPDs) profile changes (DNA damage). In addition, increasing DNA methylation and LTR retrotransposon polymorphism characterized a model to explain the epigenetically changes in the gene expression were also found. The results of this experiment have clearly shown that arsenic has epigenetic effect as well as its genotoxic effect. Especially, the increasing of polymorphism of some LTR retrotransposon under arsenic stress may be a part of the defense system against the stress.

  15. A hidden Markov model approach for determining expression from genomic tiling micro arrays

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Gardner, P. P.; Arctander, Peter

    2006-01-01

    Background Genomic tiling micro arrays have great potential for identifying previously undiscovered coding as well as non-coding transcription. To-date, however, analyses of these data have been performed in an ad hoc fashion. Results We present a probabilistic procedure, ExpressHMM, that adaptiv......Background Genomic tiling micro arrays have great potential for identifying previously undiscovered coding as well as non-coding transcription. To-date, however, analyses of these data have been performed in an ad hoc fashion. Results We present a probabilistic procedure, Express...

  16. Marker-assisted introgression of drought tolerance from wild ancestors into popular Indian rice varieties using a 7K Infinium SNP array

    Directory of Open Access Journals (Sweden)

    Ravindra Donde

    2017-10-01

    Full Text Available Recent advances in the area of genomics have led to the development of high throughput genotyping platforms that have immensely contributed to molecular breeding programs. Custom-designed single nucleotide polymorphism (SNP arrays provide an efficient, cost effective, high throughput genotyping tool for QTL/gene mapping, variety identification, marker-assisted selection, etc. In the current study, two interspecific libraries of Chromosome Segment Substitution Lines (CSSLs were evaluated under both drought and control conditions to identify lines with superior yield under drought. The CSSL libraries consisted of 48 BC4F3 lines derived from O. sativa cv. Curinga (tropical japonica x O. rufipogon, and 32 BC4F3 lines derived from O. sativa cv. Curinga (tropical japonica x O. meridionalis. The phenotypic screening of these 80 CSSLs led to the identification of three lines, MER-20, RUF-16, and RUF-44, that yielded well under drought stress. This line was backcrossed with popular rice variety of India, Swarna-Sub1 to introgress wild chromosome segments responsible for reproductive stage drought tolerance. During backcrossing, tracking of wild introgressions and monitoring of recurrent parent genome recovery was facilitated by the use of the Cornell 6K and 7K Infinium rice SNP arrays. The 6K and 7K SNP arrays assayed 5275 SNPs and 7099 SNPs, respectively, distributed across the 12 chromosomes. In our populations of (MER-20X Swarna sub1 BC2F1 lines, 1775 SNPs were polymorphic using the 6K array. The percentage of recurrent parent genome in these backcrossed lines ranged from 33-92% and the percentage of wild donor genome ranged from 8-67%. Using genotypic selection, 5% of plants were identified for further marker assisted backcrossing, based on the presence of the target donor (wild segment and maximum recovery of recurrent parent background. In the next generation, BC3F1 lines were genotyped using the 7K SNP array, which identified 2521 polymorphic SNPs

  17. Diversity arrays technology (DArT) markers in apple for genetic linkage maps

    OpenAIRE

    Schouten, H.J.; Weg, van de, W.E.; Carling, J.; Khan, S.A.; McKay, S.J.; Kaauwen, van, M.P.W.

    2012-01-01

    Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerf...

  18. Diversity arrays technology (DArT) markers in apple for genetic linkage maps

    OpenAIRE

    Schouten, Henk J.; van de Weg, W. Eric; Carling, Jason; Khan, Sabaz Ali; McKay, Steven J.; van Kaauwen, Martijn P. W.; Wittenberg, Alexander H. J.; Koehorst-van Putten, Herma J. J.; Noordijk, Yolanda; Gao, Zhongshan; Rees, D. Jasper G.; Van Dyk, Maria M.; Jaccoud, Damian; Considine, Michael J.; Kilian, Andrzej

    2011-01-01

    Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerf...

  19. Genomic relations among 31 species of Mammillaria haworth (Cactaceae) using random amplified polymorphic DNA.

    Science.gov (United States)

    Mattagajasingh, Ilwola; Mukherjee, Arup Kumar; Das, Premananda

    2006-01-01

    Thirty-one species of Mammillaria were selected to study the molecular phylogeny using random amplified polymorphic DNA (RAPD) markers. High amount of mucilage (gelling polysaccharides) present in Mammillaria was a major obstacle in isolating good quality genomic DNA. The CTAB (cetyl trimethyl ammonium bromide) method was modified to obtain good quality genomic DNA. Twenty-two random decamer primers resulted in 621 bands, all of which were polymorphic. The similarity matrix value varied from 0.109 to 0.622 indicating wide variability among the studied species. The dendrogram obtained from the unweighted pair group method using arithmetic averages (UPGMA) analysis revealed that some of the species did not follow the conventional classification. The present work shows the usefulness of RAPD markers for genetic characterization to establish phylogenetic relations among Mammillaria species.

  20. A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection

    Directory of Open Access Journals (Sweden)

    Tetreau Guillaume

    2008-10-01

    Full Text Available Abstract Background For most organisms, developing hundreds of genetic markers spanning the whole genome still requires excessive if not unrealistic efforts. In this context, there is an obvious need for methodologies allowing the low-cost, fast and high-throughput genotyping of virtually any species, such as the Diversity Arrays Technology (DArT. One of the crucial steps of the DArT technique is the genome complexity reduction, which allows obtaining a genomic representation characteristic of the studied DNA sample and necessary for subsequent genotyping. In this article, using the mosquito Aedes aegypti as a study model, we describe a new genome complexity reduction method taking advantage of the abundance of miniature inverted repeat transposable elements (MITEs in the genome of this species. Results Ae. aegypti genomic representations were produced following a two-step procedure: (1 restriction digestion of the genomic DNA and simultaneous ligation of a specific adaptor to compatible ends, and (2 amplification of restriction fragments containing a particular MITE element called Pony using two primers, one annealing to the adaptor sequence and one annealing to a conserved sequence motif of the Pony element. Using this protocol, we constructed a library comprising more than 6,000 DArT clones, of which at least 5.70% were highly reliable polymorphic markers for two closely related mosquito strains separated by only a few generations of artificial selection. Within this dataset, linkage disequilibrium was low, and marker redundancy was evaluated at 2.86% only. Most of the detected genetic variability was observed between the two studied mosquito strains, but individuals of the same strain could still be clearly distinguished. Conclusion The new complexity reduction method was particularly efficient to reveal genetic polymorphisms in Ae. egypti. Overall, our results testify of the flexibility of the DArT genotyping technique and open new

  1. Genome-wide survey of allele-specific splicing in humans

    Directory of Open Access Journals (Sweden)

    Scheffler Konrad

    2008-06-01

    Full Text Available Abstract Background Accurate mRNA splicing depends on multiple regulatory signals encoded in the transcribed RNA sequence. Many examples of mutations within human splice regulatory regions that alter splicing qualitatively or quantitatively have been reported and allelic differences in mRNA splicing are likely to be a common and important source of phenotypic diversity at the molecular level, in addition to their contribution to genetic disease susceptibility. However, because the effect of a mutation on the efficiency of mRNA splicing is often difficult to predict, many mutations that cause disease through an effect on splicing are likely to remain undiscovered. Results We have combined a genome-wide scan for sequence polymorphisms likely to affect mRNA splicing with analysis of publicly available Expressed Sequence Tag (EST and exon array data. The genome-wide scan uses published tools and identified 30,977 SNPs located within donor and acceptor splice sites, branch points and exonic splicing enhancer elements. For 1,185 candidate splicing polymorphisms the difference in splicing between alternative alleles was corroborated by publicly available exon array data from 166 lymphoblastoid cell lines. We developed a novel probabilistic method to infer allele-specific splicing from EST data. The method uses SNPs and alternative mRNA isoforms mapped to EST sequences and models both regulated alternative splicing as well as allele-specific splicing. We have also estimated heritability of splicing and report that a greater proportion of genes show evidence of splicing heritability than show heritability of overall gene expression level. Our results provide an extensive resource that can be used to assess the possible effect on splicing of human polymorphisms in putative splice-regulatory sites. Conclusion We report a set of genes showing evidence of allele-specific splicing from an integrated analysis of genomic polymorphisms, EST data and exon array

  2. Investigation of inversion polymorphisms in the human genome using principal components analysis.

    Science.gov (United States)

    Ma, Jianzhong; Amos, Christopher I

    2012-01-01

    Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.

  3. Genetic profiles of gastroesophageal cancer: combined analysis using expression array and tiling array--comparative genomic hybridization

    DEFF Research Database (Denmark)

    Isinger-Ekstrand, Anna; Johansson, Jan; Ohlsson, Mattias

    2010-01-01

    15, 13q34, and 12q13, whereas different profiles with gains at 5p15, 7p22, 2q35, and 13q34 characterized gastric cancers. CDK6 and EGFR were identified as putative target genes in cancers of the esophagus and the gastroesophageal junction, with upregulation in one quarter of the tumors. Gains......We aimed to characterize the genomic profiles of adenocarcinomas in the gastroesophageal junction in relation to cancers in the esophagus and the stomach. Profiles of gains/losses as well as gene expression profiles were obtained from 27 gastroesophageal adenocarcinomas by means of 32k high......-resolution array-based comparative genomic hybridization and 27k oligo gene expression arrays, and putative target genes were validated in an extended series. Adenocarcinomas in the distal esophagus and the gastroesophageal junction showed strong similarities with the most common gains at 20q13, 8q24, 1q21-23, 5p...

  4. Genome-based polymorphic microsatellite development and validation in the mosquito Aedes aegypti and application to population genetics in Haiti

    Directory of Open Access Journals (Sweden)

    Streit Thomas G

    2009-12-01

    Full Text Available Abstract Background Microsatellite markers have proven useful in genetic studies in many organisms, yet microsatellite-based studies of the dengue and yellow fever vector mosquito Aedes aegypti have been limited by the number of assayable and polymorphic loci available, despite multiple independent efforts to identify them. Here we present strategies for efficient identification and development of useful microsatellites with broad coverage across the Aedes aegypti genome, development of multiplex-ready PCR groups of microsatellite loci, and validation of their utility for population analysis with field collections from Haiti. Results From 79 putative microsatellite loci representing 31 motifs identified in 42 whole genome sequence supercontig assemblies in the Aedes aegypti genome, 33 microsatellites providing genome-wide coverage amplified as single copy sequences in four lab strains, with a range of 2-6 alleles per locus. The tri-nucleotide motifs represented the majority (51% of the polymorphic single copy loci, and none of these was located within a putative open reading frame. Seven groups of 4-5 microsatellite loci each were developed for multiplex-ready PCR. Four multiplex-ready groups were used to investigate population genetics of Aedes aegypti populations sampled in Haiti. Of the 23 loci represented in these groups, 20 were polymorphic with a range of 3-24 alleles per locus (mean = 8.75. Allelic polymorphic information content varied from 0.171 to 0.867 (mean = 0.545. Most loci met Hardy-Weinberg expectations across populations and pairwise FST comparisons identified significant genetic differentiation between some populations. No evidence for genetic isolation by distance was observed. Conclusion Despite limited success in previous reports, we demonstrate that the Aedes aegypti genome is well-populated with single copy, polymorphic microsatellite loci that can be uncovered using the strategy developed here for rapid and efficient

  5. Genome-to-genome analysis highlights the effect of the human innate and adaptive immune systems on the hepatitis C virus.

    Science.gov (United States)

    Ansari, M Azim; Pedergnana, Vincent; L C Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C A

    2017-05-01

    Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. Here we use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals who were chronically infected with HCV, predominantly genotype 3. We show that both alleles of genes encoding human leukocyte antigen molecules and genes encoding components of the interferon lambda innate immune system drive viral polymorphism. Additionally, we show that IFNL4 genotypes determine HCV viral load through a mechanism dependent on a specific amino acid residue in the HCV NS5A protein. These findings highlight the interplay between the innate immune system and the viral genome in HCV control.

  6. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  7. Whole-Genome Sequencing and iPLEX MassARRAY Genotyping Map an EMS-Induced Mutation Affecting Cell Competition in Drosophila melanogaster.

    Science.gov (United States)

    Lee, Chang-Hyun; Rimesso, Gerard; Reynolds, David M; Cai, Jinlu; Baker, Nicholas E

    2016-10-13

    Cell competition, the conditional loss of viable genotypes only when surrounded by other cells, is a phenomenon observed in certain genetic mosaic conditions. We conducted a chemical mutagenesis and screen to recover new mutations that affect cell competition between wild-type and RpS3 heterozygous cells. Mutations were identified by whole-genome sequencing, making use of software tools that greatly facilitate the distinction between newly induced mutations and other sources of apparent sequence polymorphism, thereby reducing false-positive and false-negative identification rates. In addition, we utilized iPLEX MassARRAY for genotyping recombinant chromosomes. These approaches permitted the mapping of a new mutation affecting cell competition when only a single allele existed, with a phenotype assessed only in genetic mosaics, without the benefit of complementation with existing mutations, deletions, or duplications. These techniques expand the utility of chemical mutagenesis and whole-genome sequencing for mutant identification. We discuss mutations in the Atm and Xrp1 genes identified in this screen. Copyright © 2016 Lee et al.

  8. Comparison between fluorescent in-situ hybridisation and array comparative genomic hybridisation in preimplantation genetic diagnosis in translocation carriers.

    Science.gov (United States)

    Lee, Vivian C Y; Chow, Judy F C; Lau, Estella Y L; Yeung, William S B; Ho, P C; Ng, Ernest H Y

    2015-02-01

    To compare the pregnancy outcome of the fluorescent in-situ hybridisation and array comparative genomic hybridisation in preimplantation genetic diagnosis of translocation carriers. Historical cohort. A teaching hospital in Hong Kong. All preimplantation genetic diagnosis treatment cycles performed for translocation carriers from 2001 to 2013. Overall, 101 treatment cycles for preimplantation genetic diagnosis in translocation were included: 77 cycles for reciprocal translocation and 24 cycles for Robertsonian translocation. Fluorescent in-situ hybridisation and array comparative genomic hybridisation were used in 78 and 11 cycles, respectively. The ongoing pregnancy rate per initiated cycle after array comparative genomic hybridisation was significantly higher than that after fluorescent in-situ hybridisation in all translocation carriers (36.4% vs 9.0%; P=0.010). The miscarriage rate was comparable with both techniques. The testing method (array comparative genomic hybridisation or fluorescent in-situ hybridisation) was the only significant factor affecting the ongoing pregnancy rate after controlling for the women's age, type of translocation, and clinical information of the preimplantation genetic diagnosis cycles by logistic regression (odds ratio=1.875; P=0.023; 95% confidence interval, 1.090-3.226). This local retrospective study confirmed that comparative genomic hybridisation is associated with significantly higher pregnancy rates versus fluorescent in-situ hybridisation in translocation carriers. Array comparative genomic hybridisation should be the technique of choice in preimplantation genetic diagnosis cycles in translocation carriers.

  9. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Full Text Available Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs. Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.

  10. Characterization of genomic alterations in radiation-associated breast cancer among childhood cancer survivors, using comparative genomic hybridization (CGH arrays.

    Directory of Open Access Journals (Sweden)

    Xiaohong R Yang

    Full Text Available Ionizing radiation is an established risk factor for breast cancer. Epidemiologic studies of radiation-exposed cohorts have been primarily descriptive; molecular events responsible for the development of radiation-associated breast cancer have not been elucidated. In this study, we used array comparative genomic hybridization (array-CGH to characterize genome-wide copy number changes in breast tumors collected in the Childhood Cancer Survivor Study (CCSS. Array-CGH data were obtained from 32 cases who developed a second primary breast cancer following chest irradiation at early ages for the treatment of their first cancers, mostly Hodgkin lymphoma. The majority of these cases developed breast cancer before age 45 (91%, n = 29, had invasive ductal tumors (81%, n = 26, estrogen receptor (ER-positive staining (68%, n = 19 out of 28, and high proliferation as indicated by high Ki-67 staining (77%, n = 17 out of 22. Genomic regions with low-copy number gains and losses and high-level amplifications were similar to what has been reported in sporadic breast tumors, however, the frequency of amplifications of the 17q12 region containing human epidermal growth factor receptor 2 (HER2 was much higher among CCSS cases (38%, n = 12. Our findings suggest that second primary breast cancers in CCSS were enriched for an "amplifier" genomic subgroup with highly proliferative breast tumors. Future investigation in a larger irradiated cohort will be needed to confirm our findings.

  11. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    Directory of Open Access Journals (Sweden)

    Xiao-Lin Wu

    Full Text Available Low-density (LD single nucleotide polymorphism (SNP arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD or high-density (HD SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE or haplotype-averaged Shannon entropy (HASE and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus

  12. Genome-wide DNA polymorphism in the indica rice varieties RGD-7S and Taifeng B as revealed by whole genome re-sequencing.

    Science.gov (United States)

    Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng

    2016-03-01

    Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp pair-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites including 2,408,845 SNPs and 349,895 InDels were detected in RGD-7S and Taifeng B, respectively. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs were overlapped in 218 genes, and 1231 of 2010 CNVs were annotated in 175 genes in Taifeng B. In addition, we verified a subset of InDels in the interval of hybrid weakness genes, Hw3 and Hw4, and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.

  13. DNA suspension arrays: silencing discrete artifacts for high-sensitivity applications.

    Directory of Open Access Journals (Sweden)

    Matthew S Lalonde

    2010-11-01

    Full Text Available Detection of low frequency single nucleotide polymorphisms (SNPs has important implications in early screening for tumorgenesis, genetic disorders and pathogen drug resistance. Nucleic acid arrays are a powerful tool for genome-scale SNP analysis, but detection of low-frequency SNPs in a mixed population on an array is problematic. We demonstrate a model assay for HIV-1 drug resistance mutations, wherein ligase discrimination products are collected on a suspension array. In developing this system, we discovered that signal from multiple polymorphisms was obscured by two discrete hybridization artifacts. Specifically: 1 tethering of unligated probes on the template DNA elicited false signal and 2 unpredictable probe secondary structures impaired probe capture and suppressed legitimate signal from the array. Two sets of oligonucleotides were used to disrupt these structures; one to displace unligated reporter labels from the bead-bound species and another to occupy sequences which interfered with array hybridization. This artifact silencing system resulted in a mean 21-fold increased sensitivity for 29 minority variants of 17 codons in our model assay for mutations most commonly associated with HIV-1 drug resistance. Furthermore, since the artifacts we characterized are not unique to our system, their specific inhibition might improve the quality of data from solid-state microarrays as well as from the growing number of multiple analyte suspension arrays relying on sequence-specific nucleic acid target capture.

  14. A hidden Markov model approach for determining expression from genomic tiling micro arrays

    Directory of Open Access Journals (Sweden)

    Krogh Anders

    2006-05-01

    Full Text Available Abstract Background Genomic tiling micro arrays have great potential for identifying previously undiscovered coding as well as non-coding transcription. To-date, however, analyses of these data have been performed in an ad hoc fashion. Results We present a probabilistic procedure, ExpressHMM, that adaptively models tiling data prior to predicting expression on genomic sequence. A hidden Markov model (HMM is used to model the distributions of tiling array probe scores in expressed and non-expressed regions. The HMM is trained on sets of probes mapped to regions of annotated expression and non-expression. Subsequently, prediction of transcribed fragments is made on tiled genomic sequence. The prediction is accompanied by an expression probability curve for visual inspection of the supporting evidence. We test ExpressHMM on data from the Cheng et al. (2005 tiling array experiments on ten Human chromosomes 1. Results can be downloaded and viewed from our web site 2. Conclusion The value of adaptive modelling of fluorescence scores prior to categorisation into expressed and non-expressed probes is demonstrated. Our results indicate that our adaptive approach is superior to the previous analysis in terms of nucleotide sensitivity and transfrag specificity.

  15. A SNP Genotyping Array for Hexaploid Oat

    Directory of Open Access Journals (Sweden)

    Nicholas A. Tinker

    2014-11-01

    Full Text Available Recognizing a need in cultivated hexaploid oat ( L. for a reliable set of reference single nucleotide polymorphisms (SNPs, we have developed a 6000 (6K BeadChip design containing 257 Infinium I and 5486 Infinium II designs corresponding to 5743 SNPs. Of those, 4975 SNPs yielded successful assays after array manufacturing. These SNPs were discovered based on a variety of bioinformatics pipelines in complementary DNA (cDNA and genomic DNA originating from 20 or more diverse oat cultivars. The array was validated in 1100 samples from six recombinant inbred line (RIL mapping populations and sets of diverse oat cultivars and breeding lines, and provided approximately 3500 discernible Mendelian polymorphisms. Here, we present an annotation of these SNPs, including methods of discovery, gene identification and orthology, population-genetic characteristics, and tentative positions on an oat consensus map. We also evaluate a new cluster-based method of calling SNPs. The SNP design sequences are made publicly available, and the full SNP genotyping platform is available for commercial purchase from an independent third party.

  16. Methylation-sensitive amplified polymorphism-based genome-wide analysis of cytosine methylation profiles in Nicotiana tabacum cultivars.

    Science.gov (United States)

    Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y

    2015-11-26

    This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.

  17. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    Science.gov (United States)

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference

  18. Genomic characterization of DArT markers based on high-density linkage analysis and physical mapping to the Eucalyptus genome.

    Directory of Open Access Journals (Sweden)

    César D Petroli

    Full Text Available Diversity Arrays Technology (DArT provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for

  19. Characterization of the Gray Whale Eschrichtius robustus Genome and a Genotyping Array Based on Single-Nucleotide Polymorphisms in Candidate Genes.

    Science.gov (United States)

    DeWoody, J Andrew; Fernandez, Nadia B; Brüniche-Olsen, Anna; Antonides, Jennifer D; Doyle, Jacqueline M; San Miguel, Phillip; Westerman, Rick; Vertyankin, Vladimir V; Godard-Codding, Céline A J; Bickham, John W

    2017-06-01

    Genetic and genomic approaches have much to offer in terms of ecology, evolution, and conservation. To better understand the biology of the gray whale Eschrichtius robustus (Lilljeborg, 1861), we sequenced the genome and produced an assembly that contains ∼95% of the genes known to be highly conserved among eukaryotes. From this assembly, we annotated 22,711 genes and identified 2,057,254 single-nucleotide polymorphisms (SNPs). Using this assembly, we generated a curated list of candidate genes potentially subject to strong natural selection, including genes associated with osmoregulation, oxygen binding and delivery, and other aspects of marine life. From these candidate genes, we queried 92 autosomal protein-coding markers with a panel of 96 SNPs that also included 2 sexing and 2 mitochondrial markers. Genotyping error rates, calculated across loci and across 69 intentional replicate samples, were low (0.021%), and observed heterozygosity was 0.33 averaged over all autosomal markers. This level of variability provides substantial discriminatory power across loci (mean probability of identity of 1.6 × 10 -25 and mean probability of exclusion >0.999 with neither parent known), indicating that these markers provide a powerful means to assess parentage and relatedness in gray whales. We found 29 unique multilocus genotypes represented among our 36 biopsies (indicating that we inadvertently sampled 7 whales twice). In total, we compiled an individual data set of 28 western gray whales (WGSs) and 1 presumptive eastern gray whale (EGW). The lone EGW we sampled was no more or less related to the WGWs than expected by chance alone. The gray whale genomes reported here will enable comparative studies of natural selection in cetaceans, and the SNP markers should be highly informative for future studies of gray whale evolution, population structure, demography, and relatedness.

  20. Development of highly polymorphic simple sequence repeat markers using genome-wide microsatellite variant analysis in Foxtail millet [Setaria italica (L.) P. Beauv].

    Science.gov (United States)

    Zhang, Shuo; Tang, Chanjuan; Zhao, Qiang; Li, Jing; Yang, Lifang; Qie, Lufeng; Fan, Xingke; Li, Lin; Zhang, Ning; Zhao, Meicheng; Liu, Xiaotong; Chai, Yang; Zhang, Xue; Wang, Hailong; Li, Yingtao; Li, Wen; Zhi, Hui; Jia, Guanqing; Diao, Xianmin

    2014-01-28

    Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei's genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species.

  1. Genome-wide DNA polymorphisms in Kavuni, a traditional rice cultivar with nutritional and therapeutic properties.

    Science.gov (United States)

    Rathinasabapathi, Pasupathi; Purushothaman, Natarajan; Parani, Madasamy

    2016-05-01

    Although rice genome was sequenced in the year 2002, efforts in resequencing the large number of available accessions, landraces, traditional cultivars, and improved varieties of this important food crop are limited. We have initiated resequencing of the traditional cultivars from India. Kavuni is an important traditional rice cultivar from South India that attracts premium price for its nutritional and therapeutic properties. Whole-genome sequencing of Kavuni using Illumina platform and SNPs analysis using Nipponbare reference genome identified 1 150 711 SNPs of which 377 381 SNPs were located in the genic regions. Non-synonymous SNPs (62 708) were distributed in 19 251 genes, and their number varied between 1 and 115 per gene. Large-effect DNA polymorphisms (7769) were present in 3475 genes. Pathway mapping of these polymorphisms revealed the involvement of genes related to carbohydrate metabolism, translation, protein-folding, and cell death. Analysis of the starch biosynthesis related genes revealed that the granule-bound starch synthase I gene had T/G SNPs at the first intron/exon junction and a two-nucleotide combination, which were reported to favour high amylose content and low glycemic index. The present study provided a valuable genomics resource to study the rice varieties with nutritional and medicinal properties.

  2. Construction and evaluation of a high-density SNP array for the Pacific oyster (Crassostrea gigas.

    Directory of Open Access Journals (Sweden)

    Haigang Qi

    Full Text Available Single nucleotide polymorphisms (SNPs are widely used in genetics and genomics research. The Pacific oyster (Crassostrea gigas is an economically and ecologically important marine bivalve, and it possesses one of the highest levels of genomic DNA variation among animal species. Pacific oyster SNPs have been extensively investigated; however, the mechanisms by which these SNPs may be used in a high-throughput, transferable, and economical manner remain to be elucidated. Here, we constructed an oyster 190K SNP array using Affymetrix Axiom genotyping technology. We designed 190,420 SNPs on the chip; these SNPs were selected from 54 million SNPs identified through re-sequencing of 472 Pacific oysters collected in China, Japan, Korea, and Canada. Our genotyping results indicated that 133,984 (70.4% SNPs were polymorphic and successfully converted on the chip. The SNPs were distributed evenly throughout the oyster genome, located in 3,595 scaffolds with a length of ~509.4 million; the average interval spacing was 4,210 bp. In addition, 111,158 SNPs were distributed in 21,050 coding genes, with an average of 5.3 SNPs per gene. In comparison with genotypes obtained through re-sequencing, ~69% of the converted SNPs had a concordance rate of >0.971; the mean concordance rate was 0.966. Evaluation based on genotypes of full-sib family individuals revealed that the average genotyping accuracy rate was 0.975. Carrying 133 K polymorphic SNPs, our oyster 190K SNP array is the first commercially available high-density SNP chip for mollusks, with the highest throughput. It represents a valuable tool for oyster genome-wide association studies, fine linkage mapping, and population genetics.

  3. Cross-platform array comparative genomic hybridization meta-analysis separates hematopoietic and mesenchymal from epithelial tumors

    NARCIS (Netherlands)

    Jong, C.; Marchiori, E.; van der Vaart, A.W.; Chin, S.F.; Carvalho, B; Tijssen, M.; Eijk, P.P.; van den IJssel, P.; Grabsch, H.; Quirke, P.; Oudejans, J.J.; Meijer, G.J.; Caldas, C.; Ylstra, B.

    2007-01-01

    A series of studies have been published that evaluate the chromosomal copy number changes of different tumor classes using array comparative genomic hybridization (array CGH); however, the chromosomal aberrations that distinguish the different tumor classes have not been fully characterized.

  4. Population Genomics of Inversion Polymorphisms in Drosophila melanogaster

    Science.gov (United States)

    Corbett-Detig, Russell B.; Hartl, Daniel L.

    2012-01-01

    Chromosomal inversions have been an enduring interest of population geneticists since their discovery in Drosophila melanogaster. Numerous lines of evidence suggest powerful selective pressures govern the distributions of polymorphic inversions, and these observations have spurred the development of many explanatory models. However, due to a paucity of nucleotide data, little progress has been made towards investigating selective hypotheses or towards inferring the genealogical histories of inversions, which can inform models of inversion evolution and suggest selective mechanisms. Here, we utilize population genomic data to address persisting gaps in our knowledge of D. melanogaster's inversions. We develop a method, termed Reference-Assisted Reassembly, to assemble unbiased, highly accurate sequences near inversion breakpoints, which we use to estimate the age and the geographic origins of polymorphic inversions. We find that inversions are young, and most are African in origin, which is consistent with the demography of the species. The data suggest that inversions interact with polymorphism not only in breakpoint regions but also chromosome-wide. Inversions remain differentiated at low levels from standard haplotypes even in regions that are distant from breakpoints. Although genetic exchange appears fairly extensive, we identify numerous regions that are qualitatively consistent with selective hypotheses. Finally, we show that In(1)Be, which we estimate to be ∼60 years old (95% CI 5.9 to 372.8 years), has likely achieved high frequency via sex-ratio segregation distortion in males. With deeper sampling, it will be possible to build on our inferences of inversion histories to rigorously test selective models—particularly those that postulate that inversions achieve a selective advantage through the maintenance of co-adapted allele complexes. PMID:23284285

  5. Concept and design of a genome-wide association genotyping array tailored for transplantation-specific studies

    DEFF Research Database (Denmark)

    Li, Yun R.; van Setten, Jessica; Verma, Shefali S.

    2015-01-01

    genome-wide genotyping array, the 'TxArray', comprising approximately 782,000 markers with tailored content for deeper capture of variants across HLA, KIR, pharmacogenomic, and metabolic loci important in transplantation. To test concordance and genotyping quality, we genotyped 85 HapMap samples...

  6. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Science.gov (United States)

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines, however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principle components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  7. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Directory of Open Access Journals (Sweden)

    Theresa A Hill

    Full Text Available The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs. Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP. Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines, however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principle components analysis (PCA and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and

  8. Clinical utility of an array comparative genomic hybridization analysis for Williams syndrome.

    Science.gov (United States)

    Yagihashi, Tatsuhiko; Torii, Chiharu; Takahashi, Reiko; Omori, Mikimasa; Kosaki, Rika; Yoshihashi, Hiroshi; Ihara, Masahiro; Minagawa-Kawai, Yasuyo; Yamamoto, Junichi; Takahashi, Takao; Kosaki, Kenjiro

    2014-11-01

    To reveal the relation between intellectual disability and the deleted intervals in Williams syndrome, we performed an array comparative genomic hybridization analysis and standardized developmental testing for 11 patients diagnosed as having Williams syndrome based on fluorescent in situ hybridization testing. One patient had a large 4.2-Mb deletion spanning distally beyond the common 1.5-Mb intervals observed in 10/11 patients. We formulated a linear equation describing the developmental age of the 10 patients with the common deletion; the developmental age of the patient with the 4.2-Mb deletion was significantly below the expectation (developmental age = 0.51 × chronological age). The large deletion may account for the severe intellectual disability; therefore, the use of array comparative genomic hybridization may provide practical information regarding individuals with Williams syndrome. © 2014 Japanese Teratology Society.

  9. Genomic polymorphism, recombination, and linkage disequilibrium in human major histocompatibility complex-encoded antigen-processing genes.

    Science.gov (United States)

    van Endert, P M; Lopez, M T; Patel, S D; Monaco, J J; McDevitt, H O

    1992-01-01

    Recently, two subunits of a large cytosolic protease and two putative peptide transporter proteins were found to be encoded by genes within the class II region of the major histocompatibility complex (MHC). These genes have been suggested to be involved in the processing of antigenic proteins for presentation by MHC class I molecules. Because of the high degree of polymorphism in MHC genes, and previous evidence for both functional and polypeptide sequence polymorphism in the proteins encoded by the antigen-processing genes, we tested DNA from 27 consanguineous human cell lines for genomic polymorphism by restriction fragment length polymorphism (RFLP) analysis. These studies demonstrate a strong linkage disequilibrium between TAP1 and LMP2 RFLPs. Moreover, RFLPs, as well as a polymorphic stop codon in the telomeric TAP2 gene, appear to be in linkage disequilibrium with HLA-DR alleles and RFLPs in the HLA-DO gene. A high rate of recombination, however, seems to occur in the center of the complex, between the TAP1 and TAP2 genes. Images PMID:1360671

  10. OpenADAM: an open source genome-wide association data management system for Affymetrix SNP arrays

    Directory of Open Access Journals (Sweden)

    Sham P C

    2008-12-01

    Full Text Available Abstract Background Large scale genome-wide association studies have become popular since the introduction of high throughput genotyping platforms. Efficient management of the vast array of data generated poses many challenges. Description We have developed an open source web-based data management system for the large amount of genotype data generated from the Affymetrix GeneChip® Mapping Array and Affymetrix Genome-Wide Human SNP Array platforms. The database supports genotype calling using DM, BRLMM, BRLMM-P or Birdseed algorithms provided by the Affymetrix Power Tools. The genotype and corresponding pedigree data are stored in a relational database for efficient downstream data manipulation and analysis, such as calculation of allele and genotype frequencies, sample identity checking, and export of genotype data in various file formats for analysis using commonly-available software. A novel method for genotyping error estimation is implemented using linkage disequilibrium information from the HapMap project. All functionalities are accessible via a web-based user interface. Conclusion OpenADAM provides an open source database system for management of Affymetrix genome-wide association SNP data.

  11. Human Xq28 Inversion Polymorphism: From Sex Linkage to Genomics--A Genetic Mother Lode

    Science.gov (United States)

    Kirby, Cait S.; Kolber, Natalie; Salih Almohaidi, Asmaa M.; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to…

  12. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    Full Text Available The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in-silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777 of an inter-specific reference mapping population. High amplification efficiency (87%, experimental validation success rate (81% and polymorphic potential (55% of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48% detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%. An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777 having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. It was for the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To best of our knowledge, in the presently constructed genetic map, we mapped

  13. High resolution array-based comparative genomic hybridisation of medulloblastomas and supra-tentorial primitive neuroectodermal tumours

    Science.gov (United States)

    McCabe, Martin Gerard; Ichimura, Koichi; Liu, Lu; Plant, Karen; Bäcklund, L Magnus; Pearson, Danita M; Collins, Vincent Peter

    2010-01-01

    Medulloblastomas and supratentorial primitive neuroectodermal tumours are aggressive childhood tumours. We report our findings using array comparative genomic hybridisation (CGH) on a whole-genome BAC/PAC/cosmid array with a median clone separation of 0.97Mb to study 34 medulloblastomas and 7 supratentorial primitive neuroectodermal tumours. Array CGH allowed identification and mapping of numerous novel small regions of copy number change to genomic sequence, in addition to the large regions already known from previous studies. Novel amplifications were identified, some encompassing oncogenes, MYCL1, PDGFRA, KIT and MYB, not previously reported to show amplification in these tumours. In addition, one supratentorial primitive neuroectodermal tumour had lost both copies of the tumour suppressor genes CDKN2A & CDKN2B. Ten medulloblastomas had findings suggestive of isochromosome 17q. In contrast to previous reports using conventional CGH, array CGH identified three distinct breakpoints in these cases: Ch 17: 17940393-19251679 (17p11.2, n=6), Ch 17: 20111990-23308272 (17p11.2-17q11.2, n=4) and Ch 17: 38425359-39091575 (17q21.31, n=1). Significant differences were found in the patterns of copy number change between medulloblastomas and supratentorial primitive neuroectodermal tumours, providing further evidence that these tumours are genetically distinct despite their morphological and behavioural similarities. PMID:16783165

  14. Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs and an online database.

    Directory of Open Access Journals (Sweden)

    Christopher A Raistrick

    2010-10-01

    Full Text Available Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS discovered a polymorphism in a 5' non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5' untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants--splice translational efficiency polymorphisms (STEPs--may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL using publicly available data.Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs. 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5' non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison.Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.

  15. Genomic profiling of oral squamous cell carcinoma by array-based comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Shunichi Yoshioka

    Full Text Available We designed a study to investigate genetic relationships between primary tumors of oral squamous cell carcinoma (OSCC and their lymph node metastases, and to identify genomic copy number aberrations (CNAs related to lymph node metastasis. For this purpose, we collected a total of 42 tumor samples from 25 patients and analyzed their genomic profiles by array-based comparative genomic hybridization. We then compared the genetic profiles of metastatic primary tumors (MPTs with their paired lymph node metastases (LNMs, and also those of LNMs with non-metastatic primary tumors (NMPTs. Firstly, we found that although there were some distinctive differences in the patterns of genomic profiles between MPTs and their paired LNMs, the paired samples shared similar genomic aberration patterns in each case. Unsupervised hierarchical clustering analysis grouped together 12 of the 15 MPT-LNM pairs. Furthermore, similarity scores between paired samples were significantly higher than those between non-paired samples. These results suggested that MPTs and their paired LNMs are composed predominantly of genetically clonal tumor cells, while minor populations with different CNAs may also exist in metastatic OSCCs. Secondly, to identify CNAs related to lymph node metastasis, we compared CNAs between grouped samples of MPTs and LNMs, but were unable to find any CNAs that were more common in LNMs. Finally, we hypothesized that subpopulations carrying metastasis-related CNAs might be present in both the MPT and LNM. Accordingly, we compared CNAs between NMPTs and LNMs, and found that gains of 7p, 8q and 17q were more common in the latter than in the former, suggesting that these CNAs may be involved in lymph node metastasis of OSCC. In conclusion, our data suggest that in OSCCs showing metastasis, the primary and metastatic tumors share similar genomic profiles, and that cells in the primary tumor may tend to metastasize after acquiring metastasis-associated CNAs.

  16. Whole-genome single-nucleotide polymorphism (SNP marker discovery and association analysis with the eicosapentaenoic acid (EPA and docosahexaenoic acid (DHA content in Larimichthys crocea

    Directory of Open Access Journals (Sweden)

    Shijun Xiao

    2016-12-01

    Full Text Available Whole-genome single-nucleotide polymorphism (SNP markers are valuable genetic resources for the association and conservation studies. Genome-wide SNP development in many teleost species are still challenging because of the genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS provided an efficient reduced representative method to squeeze cost for SNP detection; however, most of recent GBS applications were reported on plant organisms. In this work, we used an EcoRI-NlaIII based GBS protocol to teleost large yellow croaker, an important commercial fish in China and East-Asia, and reported the first whole-genome SNP development for the species. 69,845 high quality SNP markers that evenly distributed along genome were detected in at least 80% of 500 individuals. Nearly 95% randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. The association studies with the muscle eicosapentaenoic acid (EPA and docosahexaenoic acid (DHA content discovered 39 significant SNP markers, contributing as high up to ∼63% genetic variance that explained by all markers. Functional genes that involved in fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, PPT2 Gene, previously identified in the association study of the plasma n-3 and n-6 polyunsaturated fatty acid level in human, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS could produce quality SNP markers in a cost-efficient manner in teleost genome. The developed SNP markers and the EPA and DHA associated SNP loci provided invaluable resources for the population structure, conservation genetics and genomic selection of large yellow croaker and other fish organisms.

  17. Comparative study of single-nucleotide polymorphism array and next generation sequencing based strategies on triploid identification in preimplantation genetic diagnosis and screen.

    Science.gov (United States)

    Xu, Jiawei; Niu, Wenbin; Peng, Zhaofeng; Bao, Xiao; Zhang, Meixiang; Wang, Linlin; Du, Linqing; Zhang, Nan; Sun, Yingpu

    2016-12-06

    Triploidy occurred about 2-3% in human pregnancies and contributed to approximately 15% of chromosomally caused human early miscarriage. It is essential for preimplantation genetic diagnosis and screen to distinct triploidy sensitively. Here, we performed comparative investigations between MALBAC-NGS and MDA-SNP array sensitivity on triploidy detection. Self-correction and reference-correction algorism were used to analyze the NGS data. We identified 5 triploid embryos in 1198 embryos of 218 PGD and PGS cycles using MDA-SNP array, the rate of tripoidy was 4.17‰ in PGS and PGD patients. Our results indicated that the MDA-SNP array was sensitive to digyny and diandry triploidy, MALBAC-NGS combined with self and reference genome correction strategies analyze were not sensitive to detect triploidy. Our study demonstrated that triploidy occurred at 4.17‰ in PGD and PGS, MDA-SNP array could successfully identify triploidy in PGD and PGS and genomic DNA. MALBAC-NGS combined with self and reference genome correction strategies were not sensitive to triploidy.

  18. Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas.

    Science.gov (United States)

    Mohapatra, Gayatry; Engler, David A; Starbuck, Kristen D; Kim, James C; Bernay, Derek C; Scangas, George A; Rousseau, Audrey; Batchelor, Tracy T; Betensky, Rebecca A; Louis, David N

    2011-04-01

    Array comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA). Because diffuse malignant gliomas are often sampled by small biopsies, formalin-fixed paraffin-embedded (FFPE) blocks are often the only tissue available for genetic analysis; FFPE tissues are also needed to study the intratumoral heterogeneity that characterizes these neoplasms. In this paper, we present a combination of evaluations and technical advances that provide strong support for the ready use of oligonucleotide aCGH on FFPE diffuse gliomas. We first compared aCGH using bacterial artificial chromosome (BAC) arrays in 45 paired frozen and FFPE gliomas, and demonstrate a high concordance rate between FFPE and frozen DNA in an individual clone-level analysis of sensitivity and specificity, assuring that under certain array conditions, frozen and FFPE DNA can perform nearly identically. However, because oligonucleotide arrays offer advantages to BAC arrays in genomic coverage and practical availability, we next developed a method of labeling DNA from FFPE tissue that allows efficient hybridization to oligonucleotide arrays. To demonstrate utility in FFPE tissues, we applied this approach to biphasic anaplastic oligoastrocytomas and demonstrate CNA differences between DNA obtained from the two components. Therefore, BAC and oligonucleotide aCGH can be sensitive and specific tools for detecting CNAs in FFPE DNA, and novel labeling techniques enable the routine use of oligonucleotide arrays for FFPE DNA. In combination, these advances should facilitate genome-wide analysis of rare, small and/or histologically heterogeneous gliomas from FFPE tissues.

  19. High-resolution genomic fingerprinting of Campylobacter jejuni and Campylobacter coli by analysis of amplified fragment length polymorphisms

    DEFF Research Database (Denmark)

    Kokotovic, Branko; On, Stephen L.W.

    1999-01-01

    A method for high-resolution genomic fingerprinting of the enteric pathogens Campylobacter jejuni and Campylobacter coli, based on the determination of amplified fragment length polymorphism, is described. The potential of this method for molecular epidemiological studies of these species...... is evaluated with 50 type, reference, and well-characterised field strains. Amplified fragment length polymorphism fingerprints comprised over 60 bands detected in the size range 35-500 bp. Groups of outbreak strains, replicate subcultures, and 'genetically identical' strains from humans, poultry and cattle......, proved indistinguishable by amplified fragment length polymorphism fingerprinting, but were differentiated fi-om unrelated isolates. Previously unknown relationships between three hippurate-negative C. jejuni strains, and two C. coil var, hyoilei strains, were identified. These relationships corresponded...

  20. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.

    2013-01-01

    Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the Eu...... 425 loci and 376 918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome...

  1. Development and validation of the Axiom(®) Apple480K SNP genotyping array.

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Linsmith, Gareth; Muranty, Hélène; Denancé, Caroline; Théron, Anthony; Poncet, Charles; Micheletti, Diego; Kerschbamer, Emanuela; Di Pierro, Erica A; Larger, Simone; Pindo, Massimo; Van de Weg, Eric; Davassi, Alessandro; Laurens, François; Velasco, Riccardo; Durel, Charles-Eric; Troggio, Michela

    2016-04-01

    Cultivated apple (Malus × domestica Borkh.) is one of the most important fruit crops in temperate regions, and has great economic and cultural value. The apple genome is highly heterozygous and has undergone a recent duplication which, combined with a rapid linkage disequilibrium decay, makes it difficult to perform genome-wide association (GWA) studies. Single nucleotide polymorphism arrays offer highly multiplexed assays at a relatively low cost per data point and can be a valid tool for the identification of the markers associated with traits of interest. Here, we describe the development and validation of a 487K SNP Affymetrix Axiom(®) genotyping array for apple and discuss its potential applications. The array has been built from the high-depth resequencing of 63 different cultivars covering most of the genetic diversity in cultivated apple. The SNPs were chosen by applying a focal points approach to enrich genic regions, but also to reach a uniform coverage of non-genic regions. A total of 1324 apple accessions, including the 92 progenies of two mapping populations, have been genotyped with the Axiom(®) Apple480K to assess the effectiveness of the array. A large majority of SNPs (359 994 or 74%) fell in the stringent class of poly high resolution polymorphisms. We also devised a filtering procedure to identify a subset of 275K very robust markers that can be safely used for germplasm surveys in apple. The Axiom(®) Apple480K has now been commercially released both for public and proprietary use and will likely be a reference tool for GWA studies in apple. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  2. Genetic analysis of glucosinolate variability in broccoli florets using genome-anchored single nucleotide polymorphisms.

    Science.gov (United States)

    Brown, Allan F; Yousef, Gad G; Reid, Robert W; Chebrolu, Kranthi K; Thomas, Aswathy; Krueger, Christopher; Jeffery, Elizabeth; Jackson, Eric; Juvik, John A

    2015-07-01

    The identification of genetic factors influencing the accumulation of individual glucosinolates in broccoli florets provides novel insight into the regulation of glucosinolate levels in Brassica vegetables and will accelerate the development of vegetables with glucosinolate profiles tailored to promote human health. Quantitative trait loci analysis of glucosinolate (GSL) variability was conducted with a B. oleracea (broccoli) mapping population, saturated with single nucleotide polymorphism markers from a high-density array designed for rapeseed (Brassica napus). In 4 years of analysis, 14 QTLs were associated with the accumulation of aliphatic, indolic, or aromatic GSLs in floret tissue. The accumulation of 3-carbon aliphatic GSLs (2-propenyl and 3-methylsulfinylpropyl) was primarily associated with a single QTL on C05, but common regulation of 4-carbon aliphatic GSLs was not observed. A single locus on C09, associated with up to 40 % of the phenotypic variability of 2-hydroxy-3-butenyl GSL over multiple years, was not associated with the variability of precursor compounds. Similarly, QTLs on C02, C04, and C09 were associated with 4-methylsulfinylbutyl GSL concentration over multiple years but were not significantly associated with downstream compounds. Genome-specific SNP markers were used to identify candidate genes that co-localized to marker intervals and previously sequenced Brassica oleracea BAC clones containing known GSL genes (GSL-ALK, GSL-PRO, and GSL-ELONG) were aligned to the genomic sequence, providing support that at least three of our 14 QTLs likely correspond to previously identified GSL loci. The results demonstrate that previously identified loci do not fully explain GSL variation in broccoli. The identification of additional genetic factors influencing the accumulation of GSL in broccoli florets provides novel insight into the regulation of GSL levels in Brassicaceae and will accelerate development of vegetables with modified or enhanced GSL

  3. Detecting single DNA copy number variations in complex genomes using one nanogram of starting DNA and BAC-array CGH.

    Science.gov (United States)

    Guillaud-Bataille, Marine; Valent, Alexander; Soularue, Pascal; Perot, Christine; Inda, Maria Mar; Receveur, Aline; Smaïli, Sadek; Roest Crollius, Hugues; Bénard, Jean; Bernheim, Alain; Gidrol, Xavier; Danglot, Gisèle

    2004-07-29

    Comparative genomic hybridization to bacterial artificial chromosome (BAC)-arrays (array-CGH) is a highly efficient technique, allowing the simultaneous measurement of genomic DNA copy number at hundreds or thousands of loci, and the reliable detection of local one-copy-level variations. We report a genome-wide amplification method allowing the same measurement sensitivity, using 1 ng of starting genomic DNA, instead of the classical 1 microg usually necessary. Using a discrete series of DNA fragments, we defined the parameters adapted to the most faithful ligation-mediated PCR amplification and the limits of the technique. The optimized protocol allows a 3000-fold DNA amplification, retaining the quantitative characteristics of the initial genome. Validation of the amplification procedure, using DNA from 10 tumour cell lines hybridized to BAC-arrays of 1500 spots, showed almost perfectly superimposed ratios for the non-amplified and amplified DNAs. Correlation coefficients of 0.96 and 0.99 were observed for regions of low-copy-level variations and all regions, respectively (including in vivo amplified oncogenes). Finally, labelling DNA using two nucleotides bearing the same fluorophore led to a significant increase in reproducibility and to the correct detection of one-copy gain or loss in >90% of the analysed data, even for pseudotriploid tumour genomes.

  4. Genomic diversity among Danish field strains of Mycoplasma hyosynoviae assessed by amplified fragment length polymorphism analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, Niels F.; Nielsen, Elisabeth O.

    2002-01-01

    Genomic diversity among strains of Mycoplasma hyosynoviae isolated in Denmark was assessed by using amplified fragment length polymorphism (AFLP) analysis. Ninety-six strains, obtained from different specimens and geographical locations during 30 years and the type strain of M. hyosynoviae S16(T......) were concurrently examined for variance in BglII-MfeI and EcoRI-Csp6I-A AFLP markers. A total of 56 different genomic fingerprints having an overall similarity between 77 and 96% were detected. No correlation between AFLP variability and period of isolation or anatomical site of isolation could...

  5. The pattern of polymorphism in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    2005-07-01

    Full Text Available We resequenced 876 short fragments in a sample of 96 individuals of Arabidopsis thaliana that included stock center accessions as well as a hierarchical sample from natural populations. Although A. thaliana is a selfing weed, the pattern of polymorphism in general agrees with what is expected for a widely distributed, sexually reproducing species. Linkage disequilibrium decays rapidly, within 50 kb. Variation is shared worldwide, although population structure and isolation by distance are evident. The data fail to fit standard neutral models in several ways. There is a genome-wide excess of rare alleles, at least partially due to selection. There is too much variation between genomic regions in the level of polymorphism. The local level of polymorphism is negatively correlated with gene density and positively correlated with segmental duplications. Because the data do not fit theoretical null distributions, attempts to infer natural selection from polymorphism data will require genome-wide surveys of polymorphism in order to identify anomalous regions. Despite this, our data support the utility of A. thaliana as a model for evolutionary functional genomics.

  6. Genovar: a detection and visualization tool for genomic variants.

    Science.gov (United States)

    Jung, Kwang Su; Moon, Sanghoon; Kim, Young Jin; Kim, Bong-Jo; Park, Kiejung

    2012-05-08

    Along with single nucleotide polymorphisms (SNPs), copy number variation (CNV) is considered an important source of genetic variation associated with disease susceptibility. Despite the importance of CNV, the tools currently available for its analysis often produce false positive results due to limitations such as low resolution of array platforms, platform specificity, and the type of CNV. To resolve this problem, spurious signals must be separated from true signals by visual inspection. None of the previously reported CNV analysis tools support this function and the simultaneous visualization of comparative genomic hybridization arrays (aCGH) and sequence alignment. The purpose of the present study was to develop a useful program for the efficient detection and visualization of CNV regions that enables the manual exclusion of erroneous signals. A JAVA-based stand-alone program called Genovar was developed. To ascertain whether a detected CNV region is a novel variant, Genovar compares the detected CNV regions with previously reported CNV regions using the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation) and the Single Nucleotide Polymorphism Database (dbSNP). The current version of Genovar is capable of visualizing genomic data from sources such as the aCGH data file and sequence alignment format files. Genovar is freely accessible and provides a user-friendly graphic user interface (GUI) to facilitate the detection of CNV regions. The program also provides comprehensive information to help in the elimination of spurious signals by visual inspection, making Genovar a valuable tool for reducing false positive CNV results. http://genovar.sourceforge.net/.

  7. Analysis of Chinese women with primary ovarian insufficiency by high resolution array-comparative genomic hybridization.

    Science.gov (United States)

    Liao, Can; Fu, Fang; Yang, Xin; Sun, Yi-Min; Li, Dong-Zhi

    2011-06-01

    Primary ovarian insufficiency (POI) is defined as a primary ovarian defect characterized by absent menarche (primary amenorrhea) or premature depletion of ovarian follicles before the age of 40 years. The etiology of primary ovarian insufficiency in human female patients is still unclear. The purpose of this study is to investigate the potential genetic causes in primary amenorrhea patients by high resolution array based comparative genomic hybridization (array-CGH) analysis. Following the standard karyotyping analysis, genomic DNA from whole blood of 15 primary amenorrhea patients and 15 normal control women was hybridized with Affymetrix cytogenetic 2.7M arrays following the standard protocol. Copy number variations identified by array-CGH were confirmed by real time polymerase chain reaction. All the 30 samples were negative by conventional karyotyping analysis. Microdeletions on chromosome 17q21.31-q21.32 with approximately 1.3 Mb were identified in four patients by high resolution array-CGH analysis. This included the female reproductive secretory pathway related factor N-ethylmaleimide-sensitive factor (NSF) gene. The results of the present study suggest that there may be critical regions regulating primary ovarian insufficiency in women with a 17q21.31-q21.32 microdeletion. This effect might be due to the loss of function of the NSF gene/genes within the deleted region or to effects on contiguous genes.

  8. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    Science.gov (United States)

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  9. Identification and insertion polymorphisms of short interspersed nuclear elements (SINEs) in Brassica genomes

    International Nuclear Information System (INIS)

    Nouroz, F.; Naveed, M.

    2018-01-01

    The non-LTR retrotransposons (retroposons) are abundant in plant genomes including members of Brassicaceae. Of the retroposons, long interspersed nuclear elements (LINEs) are more copious followed by short interspersed nuclear elements (SINEs) in sequenced eukaryotic genomes. The SINEs are short elements and ranged from 100-500 bps flanked by variable sized target site duplications, 5' tRNA region with polymerase III promoter, internal tRNA unrelated region, 3' LINEs derived region and a poly adenosine tail. Different computational approaches were used for the identification and characterization of SINEs, while PCR was used to detect the SINEs insertion polymorphisms in various Brassica genotypes. Ten previously unidentified families of SINEs were identified and characterized from Brassica genomes. The structural features of these SINEs were studied in detail, which showed typical SINE features displaying small sizes, target site duplications, head regions, internal regions (body) of variable sizes and a poly (A) tail at the 3' terminus. The elements from various families ranged from 206-558 bp, where BoSINE2 family displayed smallest SINE element (206 bp), while larger members belonged to BoSINE9 family (524-558 bp). The distribution and abundance of SINEs in various Brassica species and genotypes (40) at a particular site/locus were investigated by SINEs based PCR markers. Various SINE insertion polymorphisms were detected from different genotypes, where higher PCR bands amplified the SINE insertions, while lower bands amplified the pre-insertion sites (flanking regions). The analysis of Brassica SINEs copy numbers from 10 identified families revealed that around 860 and 1712 copies of SINEs were calculated from B. rapa and B. oleracea Whole-genome shotgun contigs (WGS) respectively. Analysis of insertion sites of Brassica SINEs revealed that the members from all 10 SINE families had shown an insertion preference in AT rich regions. The present

  10. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Full Text Available Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude.We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion.Using an invasive (W83 and the only available non-invasive P. gingivalis strain (AJW4 and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH analysis.We identified 68 annotated and 51 hypothetical open reading frames (ORFs that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982 was tested. The PG0185 (ragA and PG0186 (ragB mutants had 5.1×103-fold and 3.6×103-fold decreased in vitro invasion ability, respectively.The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for P. gingivalis non-invasive phenotype. Access the supplementary material to this article: Supplement, table (see Supplementary files under Reading Tools online.

  11. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    Science.gov (United States)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it is based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most of GP models used linear methods that ignore effects of interaction among genes and effects of higher order nonlinearities. Deep belief network (DBN), one of the architectural in deep learning methods, is able to model data in high level of abstraction that involves nonlinearities effects of the data. This study implemented DBN for developing a GP model utilizing whole-genome Single Nucleotide Polymorphisms (SNPs) as data for training and testing. The case study was a set of traits in maize. The maize dataset was acquisitioned from CIMMYT’s (International Maize and Wheat Improvement Center) Global Maize program. Based on Pearson correlation, DBN is outperformed than other methods, kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), best linear unbiased predictor (BLUP), in case allegedly non-additive traits. DBN achieves correlation of 0.579 within -1 to 1 range.

  12. Identification of chromosome aberrations in sporadic microsatellite stable and unstable colorectal cancers using array comparative genomic hybridization

    DEFF Research Database (Denmark)

    Jensen, Thomas Dyrsø; Li, Jian; Wang, Kai

    2011-01-01

    cancers constitute approximately 85% of sporadic cases, whereas microsatellite unstable (MSI) cases constitute the remaining 15%. In this study, we used array comparative genomic hybridization (aCGH) to identify genomic hotspot regions that harbor recurrent copy number changes. The study material...

  13. Polymorphism at codon 36 of the p53 gene.

    Science.gov (United States)

    Felix, C A; Brown, D L; Mitsudomi, T; Ikagaki, N; Wong, A; Wasserman, R; Womer, R B; Biegel, J A

    1994-01-01

    A polymorphism at codon 36 in exon 4 of the p53 gene was identified by single strand conformation polymorphism (SSCP) analysis and direct sequencing of genomic DNA PCR products. The polymorphic allele, present in the heterozygous state in genomic DNAs of four of 100 individuals (4%), changes the codon 36 CCG to CCA, eliminates a FinI restriction site and creates a BccI site. Including this polymorphism there are four known polymorphisms in the p53 coding sequence.

  14. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    Science.gov (United States)

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  15. A comparison between genotyping-by-sequencing and array-based scoring of SNPs for genomic prediction accuracy in winter wheat.

    Science.gov (United States)

    Elbasyoni, Ibrahim S; Lorenz, A J; Guttieri, M; Frels, K; Baenziger, P S; Poland, J; Akhunov, E

    2018-05-01

    The utilization of DNA molecular markers in plant breeding to maximize selection response via marker-assisted selection (MAS) and genomic selection (GS) has revolutionized plant breeding. A key factor affecting GS applicability is the choice of molecular marker platform. Genotyping-by-sequencing scored SNPs (GBS-scored SNPs) provides a large number of markers, albeit with high rates of missing data. Array scored SNPs are of high quality, but the cost per sample is substantially higher. The objectives of this study were 1) compare GBS-scored SNPs, and array scored SNPs for genomic selection applications, and 2) compare estimates of genomic kinship and population structure calculated using the two marker platforms. SNPs were compared in a diversity panel consisting of 299 hard winter wheat (Triticum aestivum L.) accessions that were part of a multi-year, multi-environments association mapping study. The panel was phenotyped in Ithaca, Nebraska for heading date, plant height, days to physiological maturity and grain yield in 2012 and 2013. The panel was genotyped using GBS-scored SNPs, and array scored SNPs. Results indicate that GBS-scored SNPs is comparable to or better than Array-scored SNPs for genomic prediction application. Both platforms identified the same genetic patterns in the panel where 90% of the lines were classified to common genetic groups. Overall, we concluded that GBS-scored SNPs have the potential to be the marker platform of choice for genetic diversity and genomic selection in winter wheat. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. SNP Discovery and Development of a High-Density Genotyping Array for Sunflower

    Science.gov (United States)

    Bachlava, Eleni; Taylor, Christopher A.; Tang, Shunxue; Bowers, John E.; Mandel, Jennifer R.; Burke, John M.; Knapp, Steven J.

    2012-01-01

    Recent advances in next-generation DNA sequencing technologies have made possible the development of high-throughput SNP genotyping platforms that allow for the simultaneous interrogation of thousands of single-nucleotide polymorphisms (SNPs). Such resources have the potential to facilitate the rapid development of high-density genetic maps, and to enable genome-wide association studies as well as molecular breeding approaches in a variety of taxa. Herein, we describe the development of a SNP genotyping resource for use in sunflower (Helianthus annuus L.). This work involved the development of a reference transcriptome assembly for sunflower, the discovery of thousands of high quality SNPs based on the generation and analysis of ca. 6 Gb of transcriptome re-sequencing data derived from multiple genotypes, the selection of 10,640 SNPs for inclusion in the genotyping array, and the use of the resulting array to screen a diverse panel of sunflower accessions as well as related wild species. The results of this work revealed a high frequency of polymorphic SNPs and relatively high level of cross-species transferability. Indeed, greater than 95% of successful SNP assays revealed polymorphism, and more than 90% of these assays could be successfully transferred to related wild species. Analysis of the polymorphism data revealed patterns of genetic differentiation that were largely congruent with the evolutionary history of sunflower, though the large number of markers allowed for finer resolution than has previously been possible. PMID:22238659

  17. Development and Molecular Characterization of Novel Polymorphic Genomic DNA SSR Markers in Lentinula edodes.

    Science.gov (United States)

    Moon, Suyun; Lee, Hwa-Yong; Shim, Donghwan; Kim, Myungkil; Ka, Kang-Hyeon; Ryoo, Rhim; Ko, Han-Gyu; Koo, Chang-Duck; Chung, Jong-Wook; Ryu, Hojin

    2017-06-01

    Sixteen genomic DNA simple sequence repeat (SSR) markers of Lentinula edodes were developed from 205 SSR motifs present in 46.1-Mb long L. edodes genome sequences. The number of alleles ranged from 3-14 and the major allele frequency was distributed from 0.17-0.96. The values of observed and expected heterozygosity ranged from 0.00-0.76 and 0.07-0.90, respectively. The polymorphic information content value ranged from 0.07-0.89. A dendrogram, based on 16 SSR markers clustered by the paired hierarchical clustering' method, showed that 33 shiitake cultivars could be divided into three major groups and successfully identified. These SSR markers will contribute to the efficient breeding of this species by providing diversity in shiitake varieties. Furthermore, the genomic information covered by the markers can provide a valuable resource for genetic linkage map construction, molecular mapping, and marker-assisted selection in the shiitake mushroom.

  18. Polymorphisms in AHI1 are not associated with type 2 diabetes or related phenotypes in Danes: non-replication of a genome-wide association result

    DEFF Research Database (Denmark)

    Holmkvist, J; Anthonsen, S; Wegner, L

    2008-01-01

    AIMS/HYPOTHESIS: A genome-wide association study recently identified an association between common variants, rs1535435 and rs9494266, in the AHI1 gene and type 2 diabetes. The aim of the present study was to investigate the putative association between these polymorphisms and type 2 diabetes or t...... the importance of independent and well-powered replication studies of the recent genome-wide association scans before a locus is robustly validated as being associated with type 2 diabetes.......AIMS/HYPOTHESIS: A genome-wide association study recently identified an association between common variants, rs1535435 and rs9494266, in the AHI1 gene and type 2 diabetes. The aim of the present study was to investigate the putative association between these polymorphisms and type 2 diabetes...... or type 2 diabetes-related metabolic traits in Danish individuals. METHODS: The previously associated polymorphisms were genotyped in the population-based Inter99 cohort (n=6162), the Danish ADDITION study (n=8428), a population-based sample of young healthy participants (n=377) and in additional type 2...

  19. SNP array analysis reveals novel genomic abnormalities including copy neutral loss of heterozygosity in anaplastic oligodendrogliomas.

    Directory of Open Access Journals (Sweden)

    Ahmed Idbaih

    Full Text Available Anaplastic oligodendrogliomas (AOD are rare glial tumors in adults with relative homogeneous clinical, radiological and histological features at the time of diagnosis but dramatically various clinical courses. Studies have identified several molecular abnormalities with clinical or biological relevance to AOD (e.g. t(1;19(q10;p10, IDH1, IDH2, CIC and FUBP1 mutations.To better characterize the clinical and biological behavior of this tumor type, the creation of a national multicentric network, named "Prise en charge des OLigodendrogliomes Anaplasiques (POLA," has been supported by the Institut National du Cancer (InCA. Newly diagnosed and centrally validated AOD patients and their related biological material (tumor and blood samples were prospectively included in the POLA clinical database and tissue bank, respectively.At the molecular level, we have conducted a high-resolution single nucleotide polymorphism array analysis, which included 83 patients. Despite a careful central pathological review, AOD have been found to exhibit heterogeneous genomic features. A total of 82% of the tumors exhibited a 1p/19q-co-deletion, while 18% harbor a distinct chromosome pattern. Novel focal abnormalities, including homozygously deleted, amplified and disrupted regions, have been identified. Recurring copy neutral losses of heterozygosity (CNLOH inducing the modulation of gene expression have also been discovered. CNLOH in the CDKN2A locus was associated with protein silencing in 1/3 of the cases. In addition, FUBP1 homozygous deletion was detected in one case suggesting a putative tumor suppressor role of FUBP1 in AOD.Our study showed that the genomic and pathological analyses of AOD are synergistic in detecting relevant clinical and biological subgroups of AOD.

  20. Array comparative genomic hybridisation analysis of boys with X linked hypopituitarism identifies a 3.9 Mb duplicated critical region at Xq27 containing SOX3.

    NARCIS (Netherlands)

    Solomon, N.M.; Ross, S.; Morgan, T.; Belsky, J.L.; Hol, F.A.; Karnes, P.; Hopwood, N.J.; Myers, S.E.; Tan, A.; Warne, G.L.; Forrest, S.M.; Thomas, P.Q.

    2004-01-01

    INTRODUCTION: Array comparative genomic hybridisation (array CGH) is a powerful method that detects alteration of gene copy number with greater resolution and efficiency than traditional methods. However, its ability to detect disease causing duplications in constitutional genomic DNA has not been

  1. Ascertainment bias in studies of human genome-wide polymorphism

    DEFF Research Database (Denmark)

    Clark, Andrew G.; Hubisz, Melissa J.; Bustamente, Carlos D.

    2005-01-01

    of the SNPs that are found are influenced by the discovery sampling effort. The International HapMap project relied on nearly any piece of information available to identify SNPs-including BAC end sequences, shotgun reads, and differences between public and private sequences-and even made use of chimpanzee...... was a resequencing-by-hybridization effort using the 24 people of diverse origin in the Polymorphism Discovery Resource. Here we take these two data sets and contrast two basic summary statistics, heterozygosity and FST, as well as the site frequency spectra, for 500-kb windows spanning the genome. The magnitude...... of disparity between these samples in these measures of variability indicates that population genetic analysis on the raw genotype data is ill advised. Given the knowledge of the discovery samples, we perform an ascertainment correction and show how the post-correction data are more consistent across...

  2. Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections.

    Science.gov (United States)

    Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J

    2016-05-12

    In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (F ws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide F ws fixation index (r = -0.88, P 10 % had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (F ws).

  3. Genome-wide loss of heterozygosity and copy number alteration in esophageal squamous cell carcinoma using the Affymetrix GeneChip Mapping 10 K array

    Directory of Open Access Journals (Sweden)

    Goldstein Alisa M

    2006-11-01

    Full Text Available Abstract Background Esophageal squamous cell carcinoma (ESCC is a common malignancy worldwide. Comprehensive genomic characterization of ESCC will further our understanding of the carcinogenesis process in this disease. Results Genome-wide detection of chromosomal changes was performed using the Affymetrix GeneChip 10 K single nucleotide polymorphism (SNP array, including loss of heterozygosity (LOH and copy number alterations (CNA, for 26 pairs of matched germ-line and micro-dissected tumor DNA samples. LOH regions were identified by two methods – using Affymetrix's genotype call software and using Affymetrix's copy number alteration tool (CNAT software – and both approaches yielded similar results. Non-random LOH regions were found on 10 chromosomal arms (in decreasing order of frequency: 17p, 9p, 9q, 13q, 17q, 4q, 4p, 3p, 15q, and 5q, including 20 novel LOH regions (10 kb to 4.26 Mb. Fifteen CNA-loss regions (200 kb to 4.3 Mb and 36 CNA-gain regions (200 kb to 9.3 Mb were also identified. Conclusion These studies demonstrate that the Affymetrix 10 K SNP chip is a valid platform to integrate analyses of LOH and CNA. The comprehensive knowledge gained from this analysis will enable improved strategies to prevent, diagnose, and treat ESCC.

  4. Rapid Genome-wide Single Nucleotide Polymorphism Discovery in Soybean and Rice via Deep Resequencing of Reduced Representation Libraries with the Illumina Genome Analyzer

    Directory of Open Access Journals (Sweden)

    Stéphane Deschamps

    2010-07-01

    Full Text Available Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with the 4bp-restriction endonuclease This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the digestion with serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0%, in rice ( and soybean (, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.

  5. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    Science.gov (United States)

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515

  6. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Full Text Available Abstract Background The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software are available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results We have created a user-friendly java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  7. SNPchiMp v.3: integrating and standardizing single nucleotide polymorphism data for livestock species.

    Science.gov (United States)

    Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra

    2015-04-10

    In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species: ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there is limited or no effort to standardize and integrate array- specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need of additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.

  8. Development and validation of cross-transferable and polymorphic DNA markers for detecting alien genome introgression in Oryza sativa from Oryza brachyantha.

    Science.gov (United States)

    Ray, Soham; Bose, Lotan K; Ray, Joshitha; Ngangkham, Umakanta; Katara, Jawahar L; Samantaray, Sanghamitra; Behera, Lambodar; Anumalla, Mahender; Singh, Onkar N; Chen, Meingsheng; Wing, Rod A; Mohapatra, Trilochan

    2016-08-01

    African wild rice Oryza brachyantha (FF), a distant relative of cultivated rice Oryza sativa (AA), carries genes for pests and disease resistance. Molecular marker assisted alien gene introgression from this wild species to its domesticated counterpart is largely impeded due to the scarce availability of cross-transferable and polymorphic molecular markers that can clearly distinguish these two species. Availability of the whole genome sequence (WGS) of both the species provides a unique opportunity to develop markers, which are cross-transferable. We observed poor cross-transferability (~0.75 %) of O. sativa specific sequence tagged microsatellite (STMS) markers to O. brachyantha. By utilizing the genome sequence information, we developed a set of 45 low cost PCR based co-dominant polymorphic markers (STS and CAPS). These markers were found cross-transferrable (84.78 %) between the two species and could distinguish them from each other and thus allowed tracing alien genome introgression. Finally, we validated a Monosomic Alien Addition Line (MAAL) carrying chromosome 1 of O. brachyantha in O. sativa background using these markers, as a proof of concept. Hence, in this study, we have identified a set molecular marker (comprising of STMS, STS and CAPS) that are capable of detecting alien genome introgression from O. brachyantha to O. sativa.

  9. Draft genome of the sea cucumber Apostichopus japonicus and genetic polymorphism among color variants.

    Science.gov (United States)

    Jo, Jihoon; Oh, Jooseong; Lee, Hyun-Gwan; Hong, Hyun-Hee; Lee, Sung-Gwon; Cheon, Seongmin; Kern, Elizabeth M A; Jin, Soyeong; Cho, Sung-Jin; Park, Joong-Ki; Park, Chungoo

    2017-01-01

    The Japanese sea cucumber (Apostichopus japonicus Selenka 1867) is an economically important species as a source of seafood and ingredient in traditional medicine. It is mainly found off the coasts of northeast Asia. Recently, substantial exploitation and widespread biotic diseases in A. japonicus have generated increasing conservation concern. However, the genomic knowledge base and resources available for researchers to use in managing this natural resource and to establish genetically based breeding systems for sea cucumber aquaculture are still in a nascent stage. A total of 312 Gb of raw sequences were generated using the Illumina HiSeq 2000 platform and assembled to a final size of 0.66 Gb, which is about 80.5% of the estimated genome size (0.82 Gb). We observed nucleotide-level heterozygosity within the assembled genome to be 0.986%. The resulting draft genome assembly comprising 132 607 scaffolds with an N50 value of 10.5 kb contains a total of 21 771 predicted protein-coding genes. We identified 6.6-14.5 million heterozygous single nucleotide polymorphisms in the assembled genome of the three natural color variants (green, red, and black), resulting in an estimated nucleotide diversity of 0.00146. We report the first draft genome of A. japonicus and provide a general overview of the genetic variation in the three major color variants of A. japonicus. These data will help provide a comprehensive view of the genetic, physiological, and evolutionary relationships among color variants in A. japonicus, and will be invaluable resources for sea cucumber genomic research. © The Author 2017. Published by Oxford University Press.

  10. A response to Yu et al. "A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array", BMC Bioinformatics 2007, 8: 145.

    Science.gov (United States)

    Rueda, Oscar M; Diaz-Uriarte, Ramon

    2007-10-16

    Yu et al. (BMC Bioinformatics 2007,8: 145+) have recently compared the performance of several methods for the detection of genomic amplification and deletion breakpoints using data from high-density single nucleotide polymorphism arrays. One of the methods compared is our non-homogenous Hidden Markov Model approach. Our approach uses Markov Chain Monte Carlo for inference, but Yu et al. ran the sampler for a severely insufficient number of iterations for a Markov Chain Monte Carlo-based method. Moreover, they did not use the appropriate reference level for the non-altered state. We rerun the analysis in Yu et al. using appropriate settings for both the Markov Chain Monte Carlo iterations and the reference level. Additionally, to show how easy it is to obtain answers to additional specific questions, we have added a new analysis targeted specifically to the detection of breakpoints. The reanalysis shows that the performance of our method is comparable to that of the other methods analyzed. In addition, we can provide probabilities of a given spot being a breakpoint, something unique among the methods examined. Markov Chain Monte Carlo methods require using a sufficient number of iterations before they can be assumed to yield samples from the distribution of interest. Running our method with too small a number of iterations cannot be representative of its performance. Moreover, our analysis shows how our original approach can be easily adapted to answer specific additional questions (e.g., identify edges).

  11. A genome-wide association analysis of a broad psychosis phenotype identifies three loci for further investigation

    OpenAIRE

    Psychosis Endophenotypes International Consortium; Wellcome Trust Case-Control Consortium; Bramon, E.; Pirinen, M.; Strange, A.; Lin, K.; Freeman, C.; Bellenguez, C.; Su, Z.; Band, G.; Pearson, R.; Vukcevic, D.; Langford, C.; Deloukas, P.; Hunt, S.

    2014-01-01

    BACKGROUND: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. METHODS: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 69...

  12. A Genome-wide Association Analysis of a Broad Psychosis Phenotype Identifies Three Loci for Further Investigation

    OpenAIRE

    Tosato, Sarah; Myin-germeys, Inez; Barroso, Ines; Bender, Stephan; Giegling, Ina; Arranz, Maria J.; Donnelly, Peter; Bellenguez, Celine; Brown, Matthew A.; Lawrie, Stephen; Kalaydjieva, Luba; Vukcevic, Damjan; Kahn, Rene S.; Dronov, Serge; Walshe, Muriel

    2014-01-01

    Background: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories.Methods: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,19...

  13. Whole-genome sequencing of a laboratory-evolved yeast strain

    Directory of Open Access Journals (Sweden)

    Dunham Maitreya J

    2010-02-01

    Full Text Available Abstract Background Experimental evolution of microbial populations provides a unique opportunity to study evolutionary adaptation in response to controlled selective pressures. However, until recently it has been difficult to identify the precise genetic changes underlying adaptation at a genome-wide scale. New DNA sequencing technologies now allow the genome of parental and evolved strains of microorganisms to be rapidly determined. Results We sequenced >93.5% of the genome of a laboratory-evolved strain of the yeast Saccharomyces cerevisiae and its ancestor at >28× depth. Both single nucleotide polymorphisms and copy number amplifications were found, with specific gains over array-based methodologies previously used to analyze these genomes. Applying a segmentation algorithm to quantify structural changes, we determined the approximate genomic boundaries of a 5× gene amplification. These boundaries guided the recovery of breakpoint sequences, which provide insights into the nature of a complex genomic rearrangement. Conclusions This study suggests that whole-genome sequencing can provide a rapid approach to uncover the genetic basis of evolutionary adaptations, with further applications in the study of laboratory selections and mutagenesis screens. In addition, we show how single-end, short read sequencing data can provide detailed information about structural rearrangements, and generate predictions about the genomic features and processes that underlie genome plasticity.

  14. The polymorphic integumentary mucin B.1 from Xenopus laevis contains the short consensus repeat.

    Science.gov (United States)

    Probst, J C; Hauser, F; Joba, W; Hoffmann, W

    1992-03-25

    The frog integumentary mucin B.1 (FIM-B.1), discovered by molecular cloning, contains a cysteine-rich C-terminal domain which is homologous with von Willebrand factor. With the help of the polymerase chain reaction, we now characterize a contiguous region 5' to the von Willebrand factor domain containing the short consensus repeat typical of many proteins from the complement system. Multiple transcripts have been cloned, which originate from a single animal and differ by a variable number of tandem repeats (rep-33 sequences). These different transcripts probably originate solely from two genes and are generated presumably by alternative splicing of an huge array of functional cassettes. This model is supported by analysis of genomic FIM-B.1 sequences from Xenopus laevis. Here, rep-33 sequences are arranged in an interrupted array of individual units. Additionally, results of Southern analysis revealed genetic polymorphism between different animals which is predicted to be within the tandem repeats. A first investigation of the predicted mucins with the help of a specific antibody against a synthetic peptide determined the molecular mass of FIM-B.1 to greater than 200 kDa. Here again, genetic polymorphism between different animals is detected.

  15. Specialization of Generic Array Accesses After Inlining

    Directory of Open Access Journals (Sweden)

    Ryohei Tokuda

    2017-02-01

    Full Text Available We have implemented an optimization that specializes type-generic array accesses after inlining of polymorphic functions in the native-code OCaml compiler. Polymorphic array operations (read and write in OCaml require runtime type dispatch because of ad hoc memory representations of integer and float arrays. It cannot be removed even after being monomorphized by inlining because the intermediate language is mostly untyped. We therefore extended it with explicit type application like System F (while keeping implicit type abstraction by means of unique identifiers for type variables. Our optimization has achieved up to 21% speed-up of numerical programs.

  16. Genome-wide association study for milking speed in French Holstein cows

    DEFF Research Database (Denmark)

    Marete, Andrew Gitahi; Sahana, Goutam; Fritz, Sebastian

    2018-01-01

    Using a combination of data from the BovineSNP50 BeadChip SNP array (Illumina, San Diego, CA) and a EuroGenomics (Amsterdam, the Netherlands) custom single nucleotide polymorphism (SNP) chip with SNP pre-selected from whole genome sequence data, we carried out an association study of milking speed...... associated with milking speed. As clinical mastitis and somatic cell score have an unfavorable genetic correlation with milking speed, we tested whether the most significant SNP on these 22 chromosomes associated with milking speed were also associated with clinical mastitis or somatic cell score. Nine...... hundred seventy-one genome-wide significant SNP were associated with milking speed. Of these, 86 were associated with clinical mastitis and 198 with somatic cell score. The most significant association signals for milking speed were observed on chromosomes 7, 8, 10, 14, and 18. The most significant signal...

  17. Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array.

    Science.gov (United States)

    Antanaviciute, Laima; Fernández-Fernández, Felicidad; Jansen, Johannes; Banchi, Elisa; Evans, Katherine M; Viola, Roberto; Velasco, Riccardo; Dunwell, Jim M; Troggio, Michela; Sargent, Daniel J

    2012-05-25

    A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been

  18. Oligonucleotide arrays vs. metaphase-comparative genomic hybridisation and BAC arrays for single-cell analysis: first applications to preimplantation genetic diagnosis for Robertsonian translocation carriers.

    Science.gov (United States)

    Ramos, Laia; del Rey, Javier; Daina, Gemma; García-Aragonés, Manel; Armengol, Lluís; Fernandez-Encinas, Alba; Parriego, Mònica; Boada, Montserrat; Martinez-Passarell, Olga; Martorell, Maria Rosa; Casagran, Oriol; Benet, Jordi; Navarro, Joaquima

    2014-01-01

    Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH) and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈ 20 kb). Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14)(q10;q10). Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers.

  19. Oligonucleotide arrays vs. metaphase-comparative genomic hybridisation and BAC arrays for single-cell analysis: first applications to preimplantation genetic diagnosis for Robertsonian translocation carriers.

    Directory of Open Access Journals (Sweden)

    Laia Ramos

    Full Text Available Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈ 20 kb. Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14(q10;q10. Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers.

  20. Oligonucleotide Arrays vs. Metaphase-Comparative Genomic Hybridisation and BAC Arrays for Single-Cell Analysis: First Applications to Preimplantation Genetic Diagnosis for Robertsonian Translocation Carriers

    Science.gov (United States)

    Ramos, Laia; del Rey, Javier; Daina, Gemma; García-Aragonés, Manel; Armengol, Lluís; Fernandez-Encinas, Alba; Parriego, Mònica; Boada, Montserrat; Martinez-Passarell, Olga; Martorell, Maria Rosa; Casagran, Oriol; Benet, Jordi; Navarro, Joaquima

    2014-01-01

    Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH) and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈20 kb). Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14)(q10;q10). Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers. PMID:25415307

  1. Partial digestion with restriction enzymes of ultraviolet-irradiated human genomic DNA: a method for identifying restriction site polymorphisms

    International Nuclear Information System (INIS)

    Nobile, C.; Romeo, G.

    1988-01-01

    A method for partial digestion of total human DNA with restriction enzymes has been developed on the basis of a principle already utilized by P.A. Whittaker and E. Southern for the analysis of phage lambda recombinants. Total human DNA irradiated with uv light of 254 nm is partially digested by restriction enzymes that recognize sequences containing adjacent thymidines because of TT dimer formation. The products resulting from partial digestion of specific genomic regions are detected in Southern blots by genomic-unique DNA probes with high reproducibility. This procedure is rapid and simple to perform because the same conditions of uv irradiation are used for different enzymes and probes. It is shown that restriction site polymorphisms occurring in the genomic regions analyzed are recognized by the allelic partial digest patterns they determine

  2. saSNP Approach for Scalable SNP Analyses of Multiple Bacterial or Viral Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, Shea [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Slezak, Tom [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2010-07-27

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs. The method is fast to compute, finding SNPs and building a SNP phylogeny in seconds to hours. We use it to identify thousands of putative SNPs from all publicly available Filoviridae, Poxviridae, foot-and-mouth disease virus, Bacillus, and Escherichia coli genomes and plasmids. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle as input hundreds of gigabases of sequence in a single run. The algorithm is based on k-mer analysis using a suffix array, so we call it saSNP.

  3. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc

    2015-01-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected...... with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index...... itself. Depending on the trait’s economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage...

  4. Genome-wide association study for ovarian cancer susceptibility using pooled DNA

    DEFF Research Database (Denmark)

    Lu, Yi; Chen, Xiaoqing; Beesley, Jonathan

    2012-01-01

    stage 1 GWAS rather than due to problems with the pooling approach. We conclude that there are unlikely to be any moderate or large effects on ovarian cancer risk untagged by less dense arrays. However, our study lacked power to make clear statements on the existence of hitherto untagged small......Recent Genome-Wide Association Studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate- or low-penetrance variants exist among the subset of single-nucleotide polymorphisms (SNPs) not well tagged by the genotyping arrays used...... in the previous studies, which would account for some of the remaining risk. We therefore conducted a time- and cost-effective stage 1 GWAS on 342 invasive serous cases and 643 controls genotyped on pooled DNA using the high-density Illumina 1M-Duo array. We followed up 20 of the most significantly associated...

  5. Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms.

    Science.gov (United States)

    Taillon-Miller, P; Gu, Z; Li, Q; Hillier, L; Kwok, P Y

    1998-07-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21-7q22, and 13q12-13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.

  6. Genome-wide development and deployment of informative intron-spanning and intron-length polymorphism markers for genomics-assisted breeding applications in chickpea.

    Science.gov (United States)

    Srivastava, Rishi; Bajaj, Deepak; Sayal, Yogesh K; Meher, Prabina K; Upadhyaya, Hari D; Kumar, Rajendra; Tripathi, Shailesh; Bharadwaj, Chellapilla; Rao, Atmakuri R; Parida, Swarup K

    2016-11-01

    The discovery and large-scale genotyping of informative gene-based markers is essential for rapid delineation of genes/QTLs governing stress tolerance and yield component traits in order to drive genetic enhancement in chickpea. A genome-wide 119169 and 110491 ISM (intron-spanning markers) from 23129 desi and 20386 kabuli protein-coding genes and 7454 in silico InDel (insertion-deletion) (1-45-bp)-based ILP (intron-length polymorphism) markers from 3283 genes were developed that were structurally and functionally annotated on eight chromosomes and unanchored scaffolds of chickpea. A much higher amplification efficiency (83%) and intra-specific polymorphic potential (86%) detected by these markers than that of other sequence-based genetic markers among desi and kabuli chickpea accessions was apparent even by a cost-effective agarose gel-based assay. The genome-wide physically mapped 1718 ILP markers assayed a wider level of functional genetic diversity (19-81%) and well-defined phylogenetics among domesticated chickpea accessions. The gene-derived 1424 ILP markers were anchored on a high-density (inter-marker distance: 0.65cM) desi intra-specific genetic linkage map/functional transcript map (ICC 4958×ICC 2263) of chickpea. This reference genetic map identified six major genomic regions harbouring six robust QTLs mapped on five chromosomes, which explained 11-23% seed weight trait variation (7.6-10.5 LOD) in chickpea. The integration of high-resolution QTL mapping with differential expression profiling detected six including one potential serine carboxypeptidase gene with ILP markers (linked tightly to the major seed weight QTLs) exhibiting seed-specific expression as well as pronounced up-regulation especially in seeds of high (ICC 4958) as compared to low (ICC 2263) seed weight mapping parental accessions. The marker information generated in the present study was made publicly accessible through a user-friendly web-resource, "Chickpea ISM-ILP Marker Database

  7. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity.

    Directory of Open Access Journals (Sweden)

    Nicolas Heslot

    Full Text Available Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L. genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.

  8. Impact of Marker Ascertainment Bias on Genomic Selection Accuracy and Estimates of Genetic Diversity

    Science.gov (United States)

    Heslot, Nicolas; Rutkoski, Jessica; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

    2013-01-01

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis. PMID:24040295

  9. Diversity arrays technology (DArT) markers in apple for genetic linkage maps.

    Science.gov (United States)

    Schouten, Henk J; van de Weg, W Eric; Carling, Jason; Khan, Sabaz Ali; McKay, Steven J; van Kaauwen, Martijn P W; Wittenberg, Alexander H J; Koehorst-van Putten, Herma J J; Noordijk, Yolanda; Gao, Zhongshan; Rees, D Jasper G; Van Dyk, Maria M; Jaccoud, Damian; Considine, Michael J; Kilian, Andrzej

    2012-03-01

    Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerful high-throughput method for obtaining accurate and reproducible marker data, despite the low cost per data point. This method appears to be suitable for aligning the genetic maps of different segregating populations. The standard complexity reduction method, based on the methylation-sensitive PstI restriction enzyme, resulted in a high frequency of markers, although there was 52-54% redundancy due to the repeated sampling of highly similar sequences. Sequencing of the marker clones showed that they are significantly enriched for low-copy, genic regions. The genome coverage using the standard method was 55-76%. For improved genome coverage, an alternative complexity reduction method was examined, which resulted in less redundancy and additional segregating markers. The DArT markers proved to be of high quality and were very suitable for genetic mapping at low cost for the apple, providing moderate genome coverage. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11032-011-9579-5) contains supplementary material, which is available to authorized users.

  10. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the function such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions;3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  11. Prospects for Genomic Research in Forestry

    Directory of Open Access Journals (Sweden)

    K. V. Krutovsky

    2014-08-01

    Full Text Available Conifers are keystone species of boreal forests. Their whole genome sequencing, assembly and annotation will allow us to understand the evolution of the complex ancient giant conifer genomes that are 4 times larger in larch and 7–9 times larger in pines than the human genome. Genomic studies will allow also to obtain important whole genome sequence data and develop highly polymorphic and informative genetic markers, such as microsatellites and single nucleotide polymorphisms (SNPs that can be efficiently used in timber origin identification, for genetic variation monitoring, to study local and climate change adaptation and in tree improvement and conservation programs.

  12. Transposable element activity, genome regulation and human health.

    Science.gov (United States)

    Wang, Lu; Jordan, I King

    2018-03-02

    A convergence of novel genome analysis technologies is enabling population genomic studies of human transposable elements (TEs). Population surveys of human genome sequences have uncovered thousands of individual TE insertions that segregate as common genetic variants, i.e. TE polymorphisms. These recent TE insertions provide an important source of naturally occurring human genetic variation. Investigators are beginning to leverage population genomic data sets to execute genome-scale association studies for assessing the phenotypic impact of human TE polymorphisms. For example, the expression quantitative trait loci (eQTL) analytical paradigm has recently been used to uncover hundreds of associations between human TE insertion variants and gene expression levels. These include population-specific gene regulatory effects as well as coordinated changes to gene regulatory networks. In addition, analyses of linkage disequilibrium patterns with previously characterized genome-wide association study (GWAS) trait variants have uncovered TE insertion polymorphisms that are likely causal variants for a variety of common complex diseases. Gene regulatory mechanisms that underlie specific disease phenotypes have been proposed for a number of these trait associated TE polymorphisms. These new population genomic approaches hold great promise for understanding how ongoing TE activity contributes to functionally relevant genetic variation within and between human populations. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas

    Science.gov (United States)

    Mohapatra, Gayatry; Engler, David A.; Starbuck, Kristen D.; Kim, James C.; Bernay, Derek C.; Scangas, George A.; Rousseau, Audrey; Batchelor, Tracy T.; Betensky, Rebecca A.; Louis, David N.

    2010-01-01

    Molecular genetic analysis of cancer is rapidly evolving as a result of improvement in genomic technologies and the growing applicability of such analyses to clinical oncology. Array based comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA), particularly in solid tumors, and has been applied to the study of malignant gliomas. In the clinical setting, however, gliomas are often sampled by small biopsies and thus formalin-fixed paraffin-embedded (FFPE) blocks are often the only tissue available for genetic analysis, especially for rare types of gliomas. Moreover, the biological basis for the marked intratumoral heterogeneity in gliomas is most readily addressed in FFPE material. Therefore, for gliomas, the ability to use DNA from FFPE tissue is essential for both clinical and research applications. In this study, we have constructed a custom bacterial artificial chromosome (BAC) array and show excellent sensitivity and specificity for detecting CNAs in a panel of paired frozen and FFPE glioma samples. Our study demonstrates a high concordance rate between CNAs detected in FFPE compared to frozen DNA. We have also developed a method of labeling DNA from FFPE tissue that allows efficient hybridization to oligonucleotide arrays. This labeling technique was applied to a panel of biphasic anaplastic oligoastrocytomas (AOA) to identify genetic changes unique to each component. Together, results from these studies suggest that BAC and oligonucleotide aCGH are sensitive tools for detecting CNAs in FFPE DNA, and can enable genome-wide analysis of rare, small and/or histologically heterogeneous gliomas. PMID:21080181

  14. Identification of a novel FGFRL1 MicroRNA target site polymorphism for bone mineral density in meta-analyses of genome-wide association studies

    NARCIS (Netherlands)

    T. Niu (Tianhua); N. Liu (Ning); M. Zhao (Ming); G. Xie (Guie); L. Zhang (Lei); J. Li (Jian); Y.-F. Pei (Yu-Fang); H. Shen (Hui); X. Fu (Xiaoying); H. He (Hao); S. Lu (Shan); X. Chen (Xiangding); L. Tan (Lijun); T.-L. Yang (Tie-Lin); Y. Guo (Yan); P.J. Leo (Paul); E.L. Duncan (Emma); J. Shen (Jie); Y.-F. Guo (Yan-fang); G.C. Nicholson (Geoffrey); R.L. Prince (Richard L.); J.A. Eisman (John); G. Jones (Graeme); P.N. Sambrook (Philip); X. Hu (Xiang); P.M. Das (Partha M.); Q. Tian (Qing); X.-Z. Zhu (Xue-Zhen); C.J. Papasian (Christopher J.); M.A. Brown (Matthew); A.G. Uitterlinden (André); Y.-P. Wang (Yu-Ping); S. Xiang (Shuanglin); H.-W. Deng

    2015-01-01

    textabstractMicroRNAs (miRNAs) are critical post-transcriptional regulators. Based on a previous genome-wide association (GWA) scan, we conducted a polymorphism in microRNAs' Target Sites (poly-miRTS)-centric multistage meta-analysis for lumbar spine (LS)-, total hip (HIP)-, and femoral neck

  15. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    Science.gov (United States)

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.

  16. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    Full Text Available The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers were established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  17. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Science.gov (United States)

    Luo, Huaiyong; Wang, Xiaojie; Zhan, Gangming; Wei, Guorong; Zhou, Xinli; Zhao, Jing; Huang, Lili; Kang, Zhensheng

    2015-01-01

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers were established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  18. Genomic profiling of plasmablastic lymphoma using array comparative genomic hybridization (aCGH: revealing significant overlapping genomic lesions with diffuse large B-cell lymphoma

    Directory of Open Access Journals (Sweden)

    Lu Xin-Yan

    2009-11-01

    Full Text Available Abstract Background Plasmablastic lymphoma (PL is a subtype of diffuse large B-cell lymphoma (DLBCL. Studies have suggested that tumors with PL morphology represent a group of neoplasms with clinopathologic characteristics corresponding to different entities including extramedullary plasmablastic tumors associated with plasma cell myeloma (PCM. The goal of the current study was to evaluate the genetic similarities and differences among PL, DLBCL (AIDS-related and non AIDS-related and PCM using array-based comparative genomic hybridization. Results Examination of genomic data in PL revealed that the most frequent segmental gain (> 40% include: 1p36.11-1p36.33, 1p34.1-1p36.13, 1q21.1-1q23.1, 7q11.2-7q11.23, 11q12-11q13.2 and 22q12.2-22q13.3. This correlated with segmental gains occurring in high frequency in DLBCL (AIDS-related and non AIDS-related cases. There were some segmental gains and some segmental loss that occurred in PL but not in the other types of lymphoma suggesting that these foci may contain genes responsible for the differentiation of this lymphoma. Additionally, some segmental gains and some segmental loss occurred only in PL and AIDS associated DLBCL suggesting that these foci may be associated with HIV infection. Furthermore, some segmental gains and some segmental loss occurred only in PL and PCM suggesting that these lesions may be related to plasmacytic differentiation. Conclusion To the best of our knowledge, the current study represents the first genomic exploration of PL. The genomic aberration pattern of PL appears to be more similar to that of DLBCL (AIDS-related or non AIDS-related than to PCM. Our findings suggest that PL may remain best classified as a subtype of DLBCL at least at the genome level.

  19. Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean.

    Science.gov (United States)

    Galeano, Carlos H; Cortés, Andrés J; Fernández, Andrea C; Soler, Álvaro; Franco-Herrera, Natalia; Makunde, Godwill; Vanderleyden, Jos; Blair, Matthew W

    2012-06-26

    In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. In short, this study illustrates the power of intron-based markers for linkage and association mapping in

  20. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Full Text Available Human endogenous retroviruses (HERV sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphic test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  1. Genome-wide association study of Lp-PLA(2 activity and mass in the Framingham Heart Study.

    Directory of Open Access Journals (Sweden)

    Sunil Suchindran

    2010-04-01

    Full Text Available Lipoprotein-associated phospholipase A(2 (Lp-PLA(2 is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA(2 activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA(2 activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA(2 activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6 x 10(-24; CELSR2/PSRC1 on chromosome 1 (p = 3 x 10(-15; SCARB1 on chromosome 12 (p = 1x10(-8 and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4 x 10(-8. All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA(2 mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA(2 activity and mass.

  2. Novel candidate genes and regions for childhood apraxia of speech identified by array comparative genomic hybridization.

    Science.gov (United States)

    Laffin, Jennifer J S; Raca, Gordana; Jackson, Craig A; Strand, Edythe A; Jakielski, Kathy J; Shriberg, Lawrence D

    2012-11-01

    The goal of this study was to identify new candidate genes and genomic copy-number variations associated with a rare, severe, and persistent speech disorder termed childhood apraxia of speech. Childhood apraxia of speech is the speech disorder segregating with a mutation in FOXP2 in a multigenerational London pedigree widely studied for its role in the development of speech-language in humans. A total of 24 participants who were suspected to have childhood apraxia of speech were assessed using a comprehensive protocol that samples speech in challenging contexts. All participants met clinical-research criteria for childhood apraxia of speech. Array comparative genomic hybridization analyses were completed using a customized 385K Nimblegen array (Roche Nimblegen, Madison, WI) with increased coverage of genes and regions previously associated with childhood apraxia of speech. A total of 16 copy-number variations with potential consequences for speech-language development were detected in 12 or half of the 24 participants. The copy-number variations occurred on 10 chromosomes, 3 of which had two to four candidate regions. Several participants were identified with copy-number variations in two to three regions. In addition, one participant had a heterozygous FOXP2 mutation and a copy-number variation on chromosome 2, and one participant had a 16p11.2 microdeletion and copy-number variations on chromosomes 13 and 14. Findings support the likelihood of heterogeneous genomic pathways associated with childhood apraxia of speech.

  3. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  4. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca. The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing and Ion Torrent (RRL runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  5. SNP Arrays

    Directory of Open Access Journals (Sweden)

    Jari Louhelainen

    2016-10-01

    Full Text Available The papers published in this Special Issue “SNP arrays” (Single Nucleotide Polymorphism Arrays focus on several perspectives associated with arrays of this type. The range of papers vary from a case report to reviews, thereby targeting wider audiences working in this field. The research focus of SNP arrays is often human cancers but this Issue expands that focus to include areas such as rare conditions, animal breeding and bioinformatics tools. Given the limited scope, the spectrum of papers is nothing short of remarkable and even from a technical point of view these papers will contribute to the field at a general level. Three of the papers published in this Special Issue focus on the use of various SNP array approaches in the analysis of three different cancer types. Two of the papers concentrate on two very different rare conditions, applying the SNP arrays slightly differently. Finally, two other papers evaluate the use of the SNP arrays in the context of genetic analysis of livestock. The findings reported in these papers help to close gaps in the current literature and also to give guidelines for future applications of SNP arrays.

  6. Polymorphic microsatellites in the human bloodfluke, Schistosoma japonicum, identified using a genomic resource

    Directory of Open Access Journals (Sweden)

    Spear Robert

    2011-02-01

    Full Text Available Abstract Re-emergence of schistosomiasis in regions of China where control programs have ceased requires development of molecular-genetic tools to track gene flow and assess genetic diversity of Schistosoma populations. We identified many microsatellite loci in the draft genome of Schistosoma japonicum using defined search criteria and selected a subset for further analysis. From an initial panel of 50 loci, 20 new microsatellites were selected for eventual optimization and application to a panel of worms from endemic areas. All but one of the selected microsatellites contain simple tri-nucleotide repeats. Moderate to high levels of polymorphism were detected. Numbers of alleles ranged from 6 to 14 and observed heterozygosity was always >0.6. The loci reported here will facilitate high resolution population-genetic studies on schistosomes in re-emergent foci.

  7. [Analysis of clinical outcomes of different embryo stage biopsy in array comparative genomic hybridization based preimplantation genetic diagnosis and screening].

    Science.gov (United States)

    Shen, J D; Wu, W; Shu, L; Cai, L L; Xie, J Z; Ma, L; Sun, X P; Cui, Y G; Liu, J Y

    2017-12-25

    Objective: To evaluate the efficiency of the application of array comparative genomic hybridization (array-CGH) in preimplantation genetic diagnosis or screening (PGD/PGS), and compare the clinical outcomes of different stage embryo biopsy. Methods: The outcomes of 381 PGD/PGS cycles referred in the First Affiliated Hospital of Nanjing Medical University from July 2011 to August 2015 were retrospectively analyzed. There were 320 PGD cycles with 156 cleavage-stage-biopsy cycles and 164 trophectoderm-biopsy cycles, 61 PGS cycles with 23 cleavage-stage-biopsy cycles and 38 trophectoderm-biopsy cycles. Chromosomal analysis was performed by array-CGH technology combined with whole genome amplification. Single embryo transfer was performed in all transfer cycles. Live birth rate was calculated as the main clinical outcomes. Results: The embryo diagnosis rate of PGD/PGS by array-CGH were 96.9%-99.1%. In PGD biopsy cycles, the live birth rate per embryo transfer cycle and live birth rate per embryo biopsy cycle were 50.0%(58/116) and 37.2%(58/156) in cleavage-stage-biopsy group, 67.5%(85/126) and 51.8%(85/164) in trophectoderm-biopsy group (both P 0.05). Conclusions: High diagnosis rate and idea live birth rate are achieved in PGD/PGS cycles based on array-CGH technology. The live birth rate of trophectoderm-biopsy group is significantly higher than that of cleavage-stage-biopsy group in PGD cycles; the efficiency of trophectoderm-biopsy is better.

  8. Using genomic DNA-based probe-selection to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species

    Directory of Open Access Journals (Sweden)

    Townsend Henrik J

    2005-11-01

    Full Text Available Abstract High-density oligonucleotide (oligo arrays are a powerful tool for transcript profiling. Arrays based on GeneChip® technology are amongst the most widely used, although GeneChip® arrays are currently available for only a small number of plant and animal species. Thus, we have developed a method to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species and tested the method by analysing the transcriptome of Brassica oleracea L., a species for which no GeneChip® array is available, using a GeneChip® array designed for Arabidopsis thaliana (L. Heynh. Genomic DNA from B. oleracea was labelled and hybridised to the ATH1-121501 GeneChip® array. Arabidopsis thaliana probe-pairs that hybridised to the B. oleracea genomic DNA on the basis of the perfect-match (PM probe signal were then selected for subsequent B. oleracea transcriptome analysis using a .cel file parser script to generate probe mask files. The transcriptional response of B. oleracea to a mineral nutrient (phosphorus; P stress was quantified using probe mask files generated for a wide range of gDNA hybridisation intensity thresholds. An example probe mask file generated with a gDNA hybridisation intensity threshold of 400 removed > 68 % of the available PM probes from the analysis but retained >96 % of available A. thaliana probe-sets. Ninety-nine of these genes were then identified as significantly regulated under P stress in B. oleracea, including the homologues of P stress responsive genes in A. thaliana. Increasing the gDNA hybridisation intensity thresholds up to 500 for probe-selection increased the sensitivity of the GeneChip® array to detect regulation of gene expression in B. oleracea under P stress by up to 13-fold. Our open-source software to create probe mask files is freely available http://affymetrix.arabidopsis.info/xspecies/ and may be used to facilitate transcriptomic analyses of a wide range of plant and animal

  9. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare . However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  10. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Directory of Open Access Journals (Sweden)

    Karolina Chwialkowska

    2017-11-01

    Full Text Available Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq. We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation

  11. Preimplantation genetic diagnosis for chromosomal rearrangements with the use of array comparative genomic hybridization at the blastocyst stage.

    Science.gov (United States)

    Christodoulou, Christodoulos; Dheedene, Annelies; Heindryckx, Björn; van Nieuwerburgh, Filip; Deforce, Dieter; De Sutter, Petra; Menten, Björn; Van den Abbeel, Etienne

    2017-01-01

    To establish the value of array comparative genomic hybridization (CGH) for preimplantation genetic diagnosis (PGD) in embryos of translocation carriers in combination with vitrification and frozen embryo transfer in nonstimulated cycles. Retrospective data analysis study. Academic centers for reproductive medicine and genetics. Thirty-four couples undergoing PGD for chromosomal rearrangements from October 2013 to December 2015. Trophectoderm biopsy at day 5 or day 6 of embryo development and subsequently whole genome amplification and array CGH were performed. This approach revealed a high occurrence of aneuploidies and structural rearrangements unrelated to the parental rearrangement. Nevertheless, we observed a benefit in pregnancy rates of these couples. We detected chromosomal abnormalities in 133/207 embryos (64.2% of successfully amplified), and 74 showed a normal microarray profile (35.7%). In 48 of the 133 abnormal embryos (36.1%), an unbalanced rearrangement originating from the parental translocation was identified. Interestingly, 34.6% of the abnormal embryos (46/133) harbored chromosome rearrangements that were not directly linked to the parental translocation in question. We also detected a combination of unbalanced parental-derived rearrangements and aneuploidies in 27 of the 133 abnormal embryos (20.3%). The use of trophectoderm biopsy at the blastocyst stage is less detrimental to the survival of the embryo and leads to a more reliable estimate of the genomic content of the embryo than cleavage-stage biopsy. In this small cohort PGD study, we describe the successful implementation of array CGH analysis of blastocysts in patients with a chromosomal rearrangement to identify euploid embryos for transfer. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  12. Genome-wide macrosynteny among Fusarium species in the Gibberella fujikuroi complex revealed by amplified fragment length polymorphisms.

    Directory of Open Access Journals (Sweden)

    Lieschen De Vos

    Full Text Available The Gibberella fujikuroi complex includes many Fusarium species that cause significant losses in yield and quality of agricultural and forestry crops. Due to their economic importance, whole-genome sequence information has rapidly become available for species including Fusarium circinatum, Fusarium fujikuroi and Fusarium verticillioides, each of which represent one of the three main clades known in this complex. However, no previous studies have explored the genomic commonalities and differences among these fungi. In this study, a previously completed genetic linkage map for an interspecific cross between Fusarium temperatum and F. circinatum, together with genomic sequence data, was utilized to consider the level of synteny between the three Fusarium genomes. Regions that are homologous amongst the Fusarium genomes examined were identified using in silico and pyrosequenced amplified fragment length polymorphism (AFLP fragment analyses. Homology was determined using BLAST analysis of the sequences, with 777 homologous regions aligned to F. fujikuroi and F. verticillioides. This also made it possible to assign the linkage groups from the interspecific cross to their corresponding chromosomes in F. verticillioides and F. fujikuroi, as well as to assign two previously unmapped supercontigs of F. verticillioides to probable chromosomal locations. We further found evidence of a reciprocal translocation between the distal ends of chromosome 8 and 11, which apparently originated before the divergence of F. circinatum and F. temperatum. Overall, a remarkable level of macrosynteny was observed among the three Fusarium genomes, when comparing AFLP fragments. This study not only demonstrates how in silico AFLPs can aid in the integration of a genetic linkage map to the physical genome, but it also highlights the benefits of using this tool to study genomic synteny and architecture.

  13. Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

    Science.gov (United States)

    Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

    2018-04-19

    Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N 50  = 22 kb), maps > 95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. Satar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.

  14. Tandem repeat regions within the Burkholderia pseudomallei genome and their application for high resolution genotyping

    Directory of Open Access Journals (Sweden)

    Harvey Steven P

    2007-03-01

    Full Text Available Abstract Background The facultative, intracellular bacterium Burkholderia pseudomallei is the causative agent of melioidosis, a serious infectious disease of humans and animals. We identified and categorized tandem repeat arrays and their distribution throughout the genome of B. pseudomallei strain K96243 in order to develop a genetic typing method for B. pseudomallei. We then screened 104 of the potentially polymorphic loci across a diverse panel of 31 isolates including B. pseudomallei, B. mallei and B. thailandensis in order to identify loci with varying degrees of polymorphism. A subset of these tandem repeat arrays were subsequently developed into a multiple-locus VNTR analysis to examine 66 B. pseudomallei and 21 B. mallei isolates from around the world, as well as 95 lineages from a serial transfer experiment encompassing ~18,000 generations. Results B. pseudomallei contains a preponderance of tandem repeat loci throughout its genome, many of which are duplicated elsewhere in the genome. The majority of these loci are composed of repeat motif lengths of 6 to 9 bp with 4 to 10 repeat units and are predominately located in intergenic regions of the genome. Across geographically diverse B. pseudomallei and B.mallei isolates, the 32 VNTR loci displayed between 7 and 28 alleles, with Nei's diversity values ranging from 0.47 and 0.94. Mutation rates for these loci are comparable (>10-5 per locus per generation to that of the most diverse tandemly repeated regions found in other less diverse bacteria. Conclusion The frequency, location and duplicate nature of tandemly repeated regions within the B. pseudomallei genome indicate that these tandem repeat regions may play a role in generating and maintaining adaptive genomic variation. Multiple-locus VNTR analysis revealed extensive diversity within the global isolate set containing B. pseudomallei and B. mallei, and it detected genotypic differences within clonal lineages of both species that were

  15. Single-nucleotide polymorphism discovery in Leptographium longiclavatum, a mountain pine beetle-associated symbiotic fungus, using whole-genome resequencing.

    Science.gov (United States)

    Ojeda, Dario I; Dhillon, Braham; Tsui, Clement K M; Hamelin, Richard C

    2014-03-01

    Single-nucleotide polymorphisms (SNPs) are rapidly becoming the standard markers in population genomics studies; however, their use in nonmodel organisms is limited due to the lack of cost-effective approaches to uncover genome-wide variation, and the large number of individuals needed in the screening process to reduce ascertainment bias. To discover SNPs for population genomics studies in the fungal symbionts of the mountain pine beetle (MPB), we developed a road map to discover SNPs and to produce a genotyping platform. We undertook a whole-genome sequencing approach of Leptographium longiclavatum in combination with available genomics resources of another MPB symbiont, Grosmannia clavigera. We sequenced 71 individuals pooled into four groups using the Illumina sequencing technology. We generated between 27 and 30 million reads of 75 bp that resulted in a total of 1, 181 contigs longer than 2 kb and an assembled genome size of 28.9 Mb (N50 = 48 kb, average depth = 125x). A total of 9052 proteins were annotated, and between 9531 and 17,266 SNPs were identified in the four pools. A subset of 206 genes (containing 574 SNPs, 11% false positives) was used to develop a genotyping platform for this species. Using this roadmap, we developed a genotyping assay with a total of 147 SNPs located in 121 genes using the Illumina(®) Sequenom iPLEX Gold. Our preliminary genotyping (success rate = 85%) of 304 individuals from 36 populations supports the utility of this approach for population genomics studies in other MPB fungal symbionts and other fungal nonmodel species. © 2013 John Wiley & Sons Ltd.

  16. Prognostic Impact of Array-based Genomic Profiles in Esophageal Squamous Cell Cancer

    International Nuclear Information System (INIS)

    Carneiro, Ana; Isinger, Anna; Karlsson, Anna; Johansson, Jan; Jönsson, Göran; Bendahl, Pär-Ola; Falkenback, Dan; Halvarsson, Britta; Nilbert, Mef

    2008-01-01

    Esophageal squamous cell carcinoma (ESCC) is a genetically complex tumor type and a major cause of cancer related mortality. Although distinct genetic alterations have been linked to ESCC development and prognosis, the genetic alterations have not gained clinical applicability. We applied array-based comparative genomic hybridization (aCGH) to obtain a whole genome copy number profile relevant for identifying deranged pathways and clinically applicable markers. A 32 k aCGH platform was used for high resolution mapping of copy number changes in 30 stage I-IV ESCC. Potential interdependent alterations and deranged pathways were identified and copy number changes were correlated to stage, differentiation and survival. Copy number alterations affected median 19% of the genome and included recurrent gains of chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q and losses of 3p, 5q, 8p, 9p and 11q. High-level amplifications were observed in 30 regions and recurrently involved 7p11 (EGFR), 11q13 (MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTS and SHANK2) and 11q22 (PDFG). Gain of 7p22.3 predicted nodal metastases and gains of 1p36.32 and 19p13.3 independently predicted poor survival in multivariate analysis. aCGH profiling verified genetic complexity in ESCC and herein identified imbalances of multiple central tumorigenic pathways. Distinct gains correlate with clinicopathological variables and independently predict survival, suggesting clinical applicability of genomic profiling in ESCC

  17. Single nucleotide polymorphism array analysis of bone marrow failure patients reveals characteristic patterns of genetic changes.

    Science.gov (United States)

    Babushok, Daria V; Xie, Hongbo M; Roth, Jacquelyn J; Perdigones, Nieves; Olson, Timothy S; Cockroft, Joshua D; Gai, Xiaowu; Perin, Juan C; Li, Yimei; Paessler, Michele E; Hakonarson, Hakon; Podsakoff, Gregory M; Mason, Philip J; Biegel, Jaclyn A; Bessler, Monica

    2014-01-01

    The bone marrow failure syndromes (BMFS) are a heterogeneous group of rare blood disorders characterized by inadequate haematopoiesis, clonal evolution, and increased risk of leukaemia. Single nucleotide polymorphism arrays (SNP-A) have been proposed as a tool for surveillance of clonal evolution in BMFS. To better understand the natural history of BMFS and to assess the clinical utility of SNP-A in these disorders, we analysed 124 SNP-A from a comprehensively characterized cohort of 91 patients at our BMFS centre. SNP-A were correlated with medical histories, haematopathology, cytogenetic and molecular data. To assess clonal evolution, longitudinal analysis of SNP-A was performed in 25 patients. We found that acquired copy number-neutral loss of heterozygosity (CN-LOH) was significantly more frequent in acquired aplastic anaemia (aAA) than in other BMFS (odds ratio 12·2, P < 0·01). Homozygosity by descent was most common in congenital BMFS, frequently unmasking autosomal recessive mutations. Copy number variants (CNVs) were frequently polymorphic, and we identified CNVs enriched in neutropenia and aAA. Our results suggest that acquired CN-LOH is a general phenomenon in aAA that is probably mechanistically and prognostically distinct from typical CN-LOH of myeloid malignancies. Our analysis of clinical utility of SNP-A shows the highest yield of detecting new clonal haematopoiesis at diagnosis and at relapse. © 2013 John Wiley & Sons Ltd.

  18. Comparative mapping of Brassica juncea and Arabidopsis thaliana using Intron Polymorphism (IP markers: homoeologous relationships, diversification and evolution of the A, B and C Brassica genomes

    Directory of Open Access Journals (Sweden)

    Gupta Vibha

    2008-03-01

    Full Text Available Abstract Background Extensive mapping efforts are currently underway for the establishment of comparative genomics between the model plant, Arabidopsis thaliana and various Brassica species. Most of these studies have deployed RFLP markers, the use of which is a laborious and time-consuming process. We therefore tested the efficacy of PCR-based Intron Polymorphism (IP markers to analyze genome-wide synteny between the oilseed crop, Brassica juncea (AABB genome and A. thaliana and analyzed the arrangement of 24 (previously described genomic block segments in the A, B and C Brassica genomes to study the evolutionary events contributing to karyotype variations in the three diploid Brassica genomes. Results IP markers were highly efficient and generated easily discernable polymorphisms on agarose gels. Comparative analysis of the segmental organization of the A and B genomes of B. juncea (present study with the A and B genomes of B. napus and B. nigra respectively (described earlier, revealed a high degree of colinearity suggesting minimal macro-level changes after polyploidization. The ancestral block arrangements that remained unaltered during evolution and the karyotype rearrangements that originated in the Oleracea lineage after its divergence from Rapa lineage were identified. Genomic rearrangements leading to the gain or loss of one chromosome each between the A-B and A-C lineages were deciphered. Complete homoeology in terms of block organization was found between three linkage groups (LG each for the A-B and A-C genomes. Based on the homoeology shared between the A, B and C genomes, a new nomenclature for the B genome LGs was assigned to establish uniformity in the international Brassica LG nomenclature code. Conclusion IP markers were highly effective in generating comparative relationships between Arabidopsis and various Brassica species. Comparative genomics between the three Brassica lineages established the major rearrangements

  19. High-throughput multiplex HLA-typing by ligase detection reaction (LDR) and universal array (UA) approach.

    Science.gov (United States)

    Consolandi, Clarissa

    2009-01-01

    One major goal of genetic research is to understand the role of genetic variation in living systems. In humans, by far the most common type of such variation involves differences in single DNA nucleotides, and is thus termed single nucleotide polymorphism (SNP). The need for improvement in throughput and reliability of traditional techniques makes it necessary to develop new technologies. Thus the past few years have witnessed an extraordinary surge of interest in DNA microarray technology. This new technology offers the first great hope for providing a systematic way to explore the genome. It permits a very rapid analysis of thousands genes for the purpose of gene discovery, sequencing, mapping, expression, and polymorphism detection. We generated a series of analytical tools to address the manufacturing, detection and data analysis components of a microarray experiment. In particular, we set up a universal array approach in combination with a PCR-LDR (polymerase chain reaction-ligation detection reaction) strategy for allele identification in the HLA gene.

  20. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Recently there was great expansion of genomics resources about grapevine genome, thus providing increasing efforts for molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs on plant genomes, few data are available on copy number variation (CNV. Furthermore association between structural variations and phenotypes has been described in only a few cases. We combined high throughput biotechnologies and bioinformatics tools, to reveal the first inter-varietal atlas of structural variation (SV for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected roughly 8% of the grapevine genome affected by genomic variations. Taken into account phenotypic differences existing among the studied varieties we performed comparison of SVs among them and the reference and next we performed an in-depth analysis of gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  1. Overlapping Genomic Sequences: A Treasure Trove of Single-Nucleotide Polymorphisms

    Science.gov (United States)

    Taillon-Miller, Patricia; Gu, Zhijie; Li, Qun; Hillier, LaDeana; Kwok, Pui-Yan

    1998-01-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21–7q22, and 13q12–13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AC003015 (for GS113423), AC002380 (GS330J10), AC000066 (RG293F11), AC003086 (RG104F04), AC002525 (257C22A), and U73331 (96A18A).] PMID:9685323

  2. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    Science.gov (United States)

    Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos

    2015-08-01

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  3. CGI: Java software for mapping and visualizing data from array-based comparative genomic hybridization and expression profiling.

    Science.gov (United States)

    Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H; Lau, Ching C; Behl, Sanjiv; Man, Tsz-Kwong

    2007-10-06

    With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  4. CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

    Directory of Open Access Journals (Sweden)

    Joyce Xiuweu-Xu Gu

    2007-01-01

    Full Text Available With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator that matches the BAC clones from array-based comparative genomic hybridization (aCGH to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specifi c BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.

  5. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  6. Microdeletion and microduplication analysis of chinese conotruncal defects patients with targeted array comparative genomic hybridization.

    Directory of Open Access Journals (Sweden)

    Xiaohui Gong

    Full Text Available OBJECTIVE: The current study aimed to develop a reliable targeted array comparative genomic hybridization (aCGH to detect microdeletions and microduplications in congenital conotruncal defects (CTDs, especially on 22q11.2 region, and for some other chromosomal aberrations, such as 5p15-5p, 7q11.23 and 4p16.3. METHODS: Twenty-seven patients with CTDs, including 12 pulmonary atresia (PA, 10 double-outlet right ventricle (DORV, 3 transposition of great arteries (TGA, 1 tetralogy of Fallot (TOF and one ventricular septal defect (VSD, were enrolled in this study and screened for pathogenic copy number variations (CNVs, using Agilent 8 x 15K targeted aCGH. Real-time quantitative polymerase chain reaction (qPCR was performed to test the molecular results of targeted aCGH. RESULTS: Four of 27 patients (14.8% had 22q11.2 CNVs, 1 microdeletion and 3 microduplications. qPCR test confirmed the microdeletion and microduplication detected by the targeted aCGH. CONCLUSION: Chromosomal abnormalities were a well-known cause of multiple congenital anomalies (MCA. This aCGH using arrays with high-density coverage in the targeted regions can detect genomic imbalances including 22q11.2 and other 10 kinds CNVs effectively and quickly. This approach has the potential to be applied to detect aneuploidy and common microdeletion/microduplication syndromes on a single microarray.

  7. Characterisation of genetic markers in Mungbean using direct amplification of length polymorphisms (DALP)

    International Nuclear Information System (INIS)

    Kumar, S.V.; Tan, S.G.; Quah, S.C.

    2000-01-01

    A newly developed technique, Direct Amplification of Length Polymorphisms (DALP), developed by Desmarais and co-workers in 1998 was successfully used to identify and characterise new genetic markers in mungbean (Vigyia radiata). DALP uses an arbitrarily primed PCR (AP-PCR) to produce genomic fingerprints and is specifically designed to enable direct sequencing of polymorphic bands. In this study, an oligonucleotide pair DALP235 and DAPLR were tested on four varieties of mungbean (V3476, P4281, V5973 and V5784) and produced, through PCR, specific multibanded fingerprints which showed polymorphisms. These polymorphic bands are the result of length polymorphisms as well as absence and presence of bands. Some of the polymorphic zones may be codominantly inherited and may be potential microsatellites. The success of DALP in characterising new polymorphic loci and its ability to discover microsatellites without the use of priori knowledge of the mungbean genome is revolutionary. This would greatly facilitate the breeding and improvement of the crop. (author)

  8. Genome-wide association study of multiplex schizophrenia pedigrees

    DEFF Research Database (Denmark)

    Levinson, Douglas F; Shi, Jianxin; Wang, Kai

    2012-01-01

    The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs).......The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs)....

  9. High-resolution genetic maps of Eucalyptus improve Eucalyptus grandis genome assembly.

    Science.gov (United States)

    Bartholomé, Jérôme; Mandrou, Eric; Mabiala, André; Jenkins, Jerry; Nabihoudine, Ibouniyamine; Klopp, Christophe; Schmutz, Jeremy; Plomion, Christophe; Gion, Jean-Marc

    2015-06-01

    Genetic maps are key tools in genetic research as they constitute the framework for many applications, such as quantitative trait locus analysis, and support the assembly of genome sequences. The resequencing of the two parents of a cross between Eucalyptus urophylla and Eucalyptus grandis was used to design a single nucleotide polymorphism (SNP) array of 6000 markers evenly distributed along the E. grandis genome. The genotyping of 1025 offspring enabled the construction of two high-resolution genetic maps containing 1832 and 1773 markers with an average marker interval of 0.45 and 0.5 cM for E. grandis and E. urophylla, respectively. The comparison between genetic maps and the reference genome highlighted 85% of collinear regions. A total of 43 noncollinear regions and 13 nonsynthetic regions were detected and corrected in the new genome assembly. This improved version contains 4943 scaffolds totalling 691.3 Mb of which 88.6% were captured by the 11 chromosomes. The mapping data were also used to investigate the effect of population size and number of markers on linkage mapping accuracy. This study provides the most reliable linkage maps for Eucalyptus and version 2.0 of the E. grandis genome. © 2014 CIRAD. New Phytologist © 2014 New Phytologist Trust.

  10. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori

    2015-01-01

    . Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all

  11. Practical Calling Approach for Exome Array-Based Genome-Wide Association Studies in Korean Population

    Directory of Open Access Journals (Sweden)

    Tae-Joon Park

    2015-01-01

    Full Text Available Exome-based genotyping arrays are cost-effective and have recently been used as alternative platforms to whole-exome sequencing. However, the automated clustering algorithm in an exome array has a genotype calling problem in accuracy for identifying rare and low-frequency variants. To address these shortcomings, we present a practical approach for accurate genotype calling using the Illumina Infinium HumanExome BeadChip. We present comparison results and a statistical summary of our genotype data sets. Our data set comprises 14,647 Korean samples. To solve the limitation of automated clustering, we performed manual genotype clustering for the targeted identification of 46,076 variants that were identified using GenomeStudio software. To evaluate the effects of applying custom cluster files, we tested cluster files using 804 independent Korean samples and the same platform. Our study firstly suggests practical guidelines for exome chip quality control in Asian populations and provides valuable insight into an association study using exome chip.

  12. Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

    Science.gov (United States)

    Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

    2011-11-01

    Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human refseq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequenct FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic abberations which may be important for OS progression. Large scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.

  13. Development of cleaved amplified polymorphic sequence markers and a CAPS-based genetic linkage map in watermelon (Citrullus lanatus [Thunb.] Matsum. and Nakai) constructed using whole-genome re-sequencing data.

    Science.gov (United States)

    Liu, Shi; Gao, Peng; Zhu, Qianglong; Luan, Feishi; Davis, Angela R; Wang, Xiaolu

    2016-03-01

    Cleaved amplified polymorphic sequence (CAPS) markers are useful tools for detecting single nucleotide polymorphisms (SNPs). This study detected and converted SNP sites into CAPS markers based on high-throughput re-sequencing data in watermelon, for linkage map construction and quantitative trait locus (QTL) analysis. Two inbred lines, Cream of Saskatchewan (COS) and LSW-177 had been re-sequenced and analyzed by Perl self-compiled script for CAPS marker development. 88.7% and 78.5% of the assembled sequences of the two parental materials could map to the reference watermelon genome, respectively. Comparative assembled genome data analysis provided 225,693 and 19,268 SNPs and indels between the two materials. 532 pairs of CAPS markers were designed with 16 restriction enzymes, among which 271 pairs of primers gave distinct bands of the expected length and polymorphic bands, via PCR and enzyme digestion, with a polymorphic rate of 50.94%. Using the new CAPS markers, an initial CAPS-based genetic linkage map was constructed with the F2 population, spanning 1836.51 cM with 11 linkage groups and 301 markers. 12 QTLs were detected related to fruit flesh color, length, width, shape index, and brix content. These newly CAPS markers will be a valuable resource for breeding programs and genetic studies of watermelon.

  14. Array comparative genomic hybridization and cytogenetic analysis in pediatric acute leukemias

    Science.gov (United States)

    Dawson, A.J.; Yanofsky, R.; Vallente, R.; Bal, S.; Schroedter, I.; Liang, L.; Mai, S.

    2011-01-01

    Most patients with acute lymphocytic leukemia (all) are reported to have acquired chromosomal abnormalities in their leukemic bone marrow cells. Many established chromosome rearrangements have been described, and their associations with specific clinical, biologic, and prognostic features are well defined. However, approximately 30% of pediatric and 50% of adult patients with all do not have cytogenetic abnormalities of clinical significance. Despite significant improvements in outcome for pediatric all, therapy fails in approximately 25% of patients, and these failures often occur unpredictably in patients with a favorable prognosis and “good” cytogenetics at diagnosis. It is well known that karyotype analysis in hematologic malignancies, although genome-wide, is limited because of altered cell kinetics (mitotic rate), a propensity of leukemic blasts to undergo apoptosis in culture, overgrowth by normal cells, and chromosomes of poor quality in the abnormal clone. Array comparative genomic hybridization (acgh—“microarray”) has a greatly increased genomic resolution over classical cytogenetics. Cytogenetic microarray, which uses genomic dna, is a powerful tool in the analysis of unbalanced chromosome rearrangements, such as copy number gains and losses, and it is the method of choice when the mitotic index is low and the quality of metaphases is suboptimal. The copy number profile obtained by microarray is often called a “molecular karyotype.” In the present study, microarray was applied to 9 retrospective cases of pediatric all either with initial high-risk features or with at least 1 relapse. The conventional karyotype was compared to the “molecular karyotype” to assess abnormalities as interpreted by classical cytogenetics. Not only were previously undetected chromosome losses and gains identified by microarray, but several karyotypes interpreted by classical cytogenetics were shown to be discordant with the microarray results. The

  15. Human β satellite DNA: Genomic organization and sequence definition of a class of highly repetitive tandem DNA

    International Nuclear Information System (INIS)

    Waye, J.S.; Willard, H.F.

    1989-01-01

    The authors describe a class of human repetitive DNA, called β satellite, that, at a most fundamental level, exists as tandem arrays of diverged ∼68-base-pair monomer repeat units. The monomer units are organized as distinct subsets, each characterized by a multimeric higher-order repeat unit that is tandemly reiterated and represents a recent unit of amplification. They have cloned, characterized, and determined the sequence of two β satellite higher-order repeat units: one located on chromosome 9, the other on the acrocentric chromosomes (13, 14, 15, 21, and 22) and perhaps other sites in the genome. Analysis by pulsed-field gel electrophoresis reveals that these tandem arrays are localized in large domains that are marked by restriction fragment length polymorphisms. In total, β-satellite sequences comprise several million base pairs of DNA in the human genome. Analysis of this DNA family should permit insights into the nature of chromosome-specific and nonspecific modes of satellite DNA evolution and provide useful tools for probing the molecular organization and concerted evolution of the acrocentric chromosomes

  16. Use of allele-specific FAIRE to determine functional regulatory polymorphism using large-scale genotyping arrays.

    Directory of Open Access Journals (Sweden)

    Andrew J P Smith

    Full Text Available Following the widespread use of genome-wide association studies (GWAS, focus is turning towards identification of causal variants rather than simply genetic markers of diseases and traits. As a step towards a high-throughput method to identify genome-wide, non-coding, functional regulatory variants, we describe the technique of allele-specific FAIRE, utilising large-scale genotyping technology (FAIRE-gen to determine allelic effects on chromatin accessibility and regulatory potential. FAIRE-gen was explored using lymphoblastoid cells and the 50,000 SNP Illumina CVD BeadChip. The technique identified an allele-specific regulatory polymorphism within NR1H3 (coding for LXR-α, rs7120118, coinciding with a previously GWAS-identified SNP for HDL-C levels. This finding was confirmed using FAIRE-gen with the 200,000 SNP Illumina Metabochip and verified with the established method of TaqMan allelic discrimination. Examination of this SNP in two prospective Caucasian cohorts comprising 15,000 individuals confirmed the association with HDL-C levels (combined beta = 0.016; p = 0.0006, and analysis of gene expression identified an allelic association with LXR-α expression in heart tissue. Using increasingly comprehensive genotyping chips and distinct tissues for examination, FAIRE-gen has the potential to aid the identification of many causal SNPs associated with disease from GWAS.

  17. Signatures of selection in the Iberian honey bee (Apis mellifera iberiensis) revealed by a genome scan analysis of single nucleotide polymorphisms.

    Science.gov (United States)

    Chávez-Galarza, Julio; Henriques, Dora; Johnston, J Spencer; Azevedo, João C; Patton, John C; Muñoz, Irene; De la Rúa, Pilar; Pinto, M Alice

    2013-12-01

    Understanding the genetic mechanisms of adaptive population divergence is one of the most fundamental endeavours in evolutionary biology and is becoming increasingly important as it will allow predictions about how organisms will respond to global environmental crisis. This is particularly important for the honey bee, a species of unquestionable ecological and economical importance that has been exposed to increasing human-mediated selection pressures. Here, we conducted a single nucleotide polymorphism (SNP)-based genome scan in honey bees collected across an environmental gradient in Iberia and used four FST -based outlier tests to identify genomic regions exhibiting signatures of selection. Additionally, we analysed associations between genetic and environmental data for the identification of factors that might be correlated or act as selective pressures. With these approaches, 4.4% (17 of 383) of outlier loci were cross-validated by four FST -based methods, and 8.9% (34 of 383) were cross-validated by at least three methods. Of the 34 outliers, 15 were found to be strongly associated with one or more environmental variables. Further support for selection, provided by functional genomic information, was particularly compelling for SNP outliers mapped to different genes putatively involved in the same function such as vision, xenobiotic detoxification and innate immune response. This study enabled a more rigorous consideration of selection as the underlying cause of diversity patterns in Iberian honey bees, representing an important first step towards the identification of polymorphisms implicated in local adaptation and possibly in response to recent human-mediated environmental changes. © 2013 John Wiley & Sons Ltd.

  18. Analysis of three polymorphisms in Bidayuh ethnic of Sarawak ...

    African Journals Online (AJOL)

    Insertion/deletion polymorphism of YAP (DYS287), M96 and M120 polymorphisms in Bidayuh ethnic populations of Sarawak, Malaysia were analyzed in this study. Genomic DNA was extracted from 180 buccal samples and amplified by Hot-Start PCR method. The amplified PCR products were separated by using 2% ...

  19. Localizing recent adaptive evolution in the human genome

    DEFF Research Database (Denmark)

    Williamson, Scott H; Hubisz, Melissa J; Clark, Andrew G

    2007-01-01

    , clusters of olfactory receptors, genes involved in nervous system development and function, immune system genes, and heat shock genes. We also observe consistent evidence of selective sweeps in centromeric regions. In general, we find that recent adaptation is strikingly pervasive in the human genome......-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American, and Chinese samples. We identify 101 regions of the human genome...

  20. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.).

    Science.gov (United States)

    Stoffel, Kevin; van Leeuwen, Hans; Kozik, Alexander; Caldwell, David; Ashrafi, Hamid; Cui, Xinping; Tan, Xiaoping; Hill, Theresa; Reyes-Chin-Wo, Sebastian; Truco, Maria-Jose; Michelmore, Richard W; Van Deynze, Allen

    2012-05-14

    genomic DNA to a custom oligonucleotide array designed for maximum gene coverage, we were able to identify polymorphisms using two approaches for pair-wise comparisons, as well as a highly parallel method that compared all 52 genotypes simultaneously.

  1. Development and characterization of polymorphic genomic-SSR markers in Asian long-horned beetle (Anoplophora glabripennis).

    Science.gov (United States)

    Liu, Zhaoyang; Tao, Jing; Luo, Youqing

    2017-12-01

    The Asian long-horned beetle (ALB), Anoplophora glabripennis (Motschulsky) (Coleoptera: Cerambycidae: Lamiinae), is a wood-borer and polyphagous xylophage that is native to Asia. It infests and seriously harms healthy trees, and therefore is a cause for considerable environmental concern. The analysis of population genetic structure of ALB and sibling species Anoplophora nobilis (Ganglbauer) will not only help to clarify the relationship between environmental variables and mechanisms of speciation, but also will enhance our understanding of evolutionary processes. However, the known genetic markers, particularly microsatellites, are limited for this species. SSRLocator software was used to analyze the distribution and frequencies of genomic simple sequence repeat (SSR), to infer the basic characteristics of repeat motifs, and to design primers. We developed SSR loci of 2-6 repeated units, including 10,650 perfect SSRs, and found 140 types of repeat motifs. A total of 2621 SSR markers were discovered in ALB whole-genome shotgun sequences. 48 pairs of SSR primers were randomly chosen from 2621 SSR markers, and half of these 48 pairs were polymorphic containing 4 di-, 7 tri-, 2 tetra-, and 11-hexamer SSRs. Four populations test the effectiveness of the primers. These results suggest that our method for whole-genome SSR screening is feasible and efficient, and the SSR markers developed in this study are suitable for further population genetics studies of ALB. Moreover, they may also be useful for the development of SSRs for other Coleoptera.

  2. A Mismatch EndoNuclease Array-Based Methodology (MENA) for Identifying Known SNPs or Novel Point Mutations.

    Science.gov (United States)

    Comeron, Josep M; Reed, Jordan; Christie, Matthew; Jacobs, Julia S; Dierdorff, Jason; Eberl, Daniel F; Manak, J Robert

    2016-04-05

    Accurate and rapid identification or confirmation of single nucleotide polymorphisms (SNPs), point mutations and other human genomic variation facilitates understanding the genetic basis of disease. We have developed a new methodology (called MENA (Mismatch EndoNuclease Array)) pairing DNA mismatch endonuclease enzymology with tiling microarray hybridization in order to genotype both known point mutations (such as SNPs) as well as identify previously undiscovered point mutations and small indels. We show that our assay can rapidly genotype known SNPs in a human genomic DNA sample with 99% accuracy, in addition to identifying novel point mutations and small indels with a false discovery rate as low as 10%. Our technology provides a platform for a variety of applications, including: (1) genotyping known SNPs as well as confirming newly discovered SNPs from whole genome sequencing analyses; (2) identifying novel point mutations and indels in any genomic region from any organism for which genome sequence information is available; and (3) screening panels of genes associated with particular diseases and disorders in patient samples to identify causative mutations. As a proof of principle for using MENA to discover novel mutations, we report identification of a novel allele of the beethoven (btv) gene in Drosophila, which encodes a ciliary cytoplasmic dynein motor protein important for auditory mechanosensation.

  3. Development of cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers from the chloroplast genome of Glycyrrhiza species.

    Science.gov (United States)

    Jo, Ick-Hyun; Sung, Jwakyung; Hong, Chi-Eun; Raveendar, Sebastin; Bang, Kyong-Hwan; Chung, Jong-Wook

    2018-05-01

    Licorice ( Glycyrrhiza glabra ) is an important medicinal crop often used as health foods or medicine worldwide. The molecular genetics of licorice is under scarce owing to lack of molecular markers. Here, we have developed cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers based on single nucleotide polymorphisms (SNP) by comparing the chloroplast genomes of two Glycyrrhiza species ( G. glabra and G. lepidota ). The CAPS and HRM markers were tested for diversity analysis with 24 Glycyrrhiza accessions. The restriction profiles generated with CAPS markers classified the accessions (2-4 genotypes) and melting curves (2-3) were obtained from the HRM markers. The number of alleles and major allele frequency were 2-6 and 0.31-0.92, respectively. The genetic distance and polymorphism information content values were 0.16-0.76 and 0.15-0.72, respectively. The phylogenetic relationships among the 24 accessions were estimated using a dendrogram, which classified them into four clades. Except clade III, the remaining three clades included the same species, confirming interspecies genetic correlation. These 18 CAPS and HRM markers might be helpful for genetic diversity assessment and rapid identification of licorice species.

  4. Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

    Directory of Open Access Journals (Sweden)

    Settles Matthew L

    2009-05-01

    Full Text Available Abstract Background Natural antisense transcripts (NATs are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded or a different locus (trans-encoded. They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation. NATs give rise to sense-antisense transcript pairs and the number of these identified has escalated greatly with the availability of DNA sequencing resources and public databases. Traditionally, NATs were identified by the alignment of full-length cDNAs or expressed sequence tags to genome sequences, but an alternative method for large-scale detection of sense-antisense transcript pairs involves the use of microarrays. In this study we developed a novel protocol to assay sense- and antisense-strand transcription on the 55 K Affymetrix GeneChip Wheat Genome Array, which is a 3' in vitro transcription (3'IVT expression array. We selected five different tissue types for assay to enable maximum discovery, and used the 'Chinese Spring' wheat genotype because most of the wheat GeneChip probe sequences were based on its genomic sequence. This study is the first report of using a 3'IVT expression array to discover the expression of natural sense-antisense transcript pairs, and may be considered as proof-of-concept. Results By using alternative target preparation schemes, both the sense- and antisense-strand derived transcripts were labeled and hybridized to the Wheat GeneChip. Quality assurance verified that successful hybridization did occur in the antisense-strand assay. A stringent threshold for positive hybridization was applied, which resulted in the identification of 110 sense-antisense transcript pairs, as well as 80 potentially antisense-specific transcripts. Strand-specific RT-PCR validated the microarray observations, and showed that antisense transcription is likely to be tissue specific. For the annotated sense

  5. Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Science.gov (United States)

    Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

    2009-01-01

    Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494

  6. Sequence- vs. chip-assisted genomic selection: accurate biological information is advised.

    Science.gov (United States)

    Pérez-Enciso, Miguel; Rincón, Juan C; Legarra, Andrés

    2015-05-09

    The development of next-generation sequencing technologies (NGS) has made the use of whole-genome sequence data for routine genetic evaluations possible, which has triggered a considerable interest in animal and plant breeding fields. Here, we investigated whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies by simulation using a mixed coalescence - gene-dropping approach. We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density (HD, 17 k) and sequence data (335 k). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also dramatically increase by ~40%. However, this criterion was highly sensitive to either misspecification (including wrong genes) or to the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model. Our study shows that, unless an accurate prior estimate on the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. As a result, use of whole-genome sequence data may not result in a highly increased selection response over high

  7. Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

    Directory of Open Access Journals (Sweden)

    Joon-Ho Lee

    2014-09-01

    Full Text Available Genetics is important for breeding and selection of horses but there is a lack of well-established horse-related browsers or databases. In order to better understand horses, more variants and other integrated information are needed. Thus, we construct a horse genomic variants database including expression and other information. Horse Single Nucleotide Polymorphism and Expression Database (HSDB (http://snugenome2.snu.ac.kr/HSDB provides the number of unexplored genomic variants still remaining to be identified in the horse genome including rare variants by using population genome sequences of eighteen horses and RNA-seq of four horses. The identified single nucleotide polymorphisms (SNPs were confirmed by comparing them with SNP chip data and variants of RNA-seq, which showed a concordance level of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies of Thoroughbreds.

  8. ChIP on SNP-chip for genome-wide analysis of human histone H4 hyperacetylation

    Directory of Open Access Journals (Sweden)

    Porter Christopher J

    2007-09-01

    Full Text Available Abstract Background SNP microarrays are designed to genotype Single Nucleotide Polymorphisms (SNPs. These microarrays report hybridization of DNA fragments and therefore can be used for the purpose of detecting genomic fragments. Results Here, we demonstrate that a SNP microarray can be effectively used in this way to perform chromatin immunoprecipitation (ChIP on chip as an alternative to tiling microarrays. We illustrate this novel application by mapping whole genome histone H4 hyperacetylation in human myoblasts and myotubes. We detect clusters of hyperacetylated histone H4, often spanning across up to 300 kilobases of genomic sequence. Using complementary genome-wide analyses of gene expression by DNA microarray we demonstrate that these clusters of hyperacetylated histone H4 tend to be associated with expressed genes. Conclusion The use of a SNP array for a ChIP-on-chip application (ChIP on SNP-chip will be of great value to laboratories whose interest is the determination of general rules regarding the relationship of specific chromatin modifications to transcriptional status throughout the genome and to examine the asymmetric modification of chromatin at heterozygous loci.

  9. Comparative genomics of Bacillus anthracis from the wool industry highlights polymorphisms of lineage A.Br.Vollum.

    Science.gov (United States)

    Derzelle, Sylviane; Aguilar-Bultet, Lisandra; Frey, Joachim

    2016-12-01

    With the advent of affordable next-generation sequencing (NGS) technologies, major progress has been made in the understanding of the population structure and evolution of the B. anthracis species. Here we report the use of whole genome sequencing and computer-based comparative analyses to characterize six strains belonging to the A.Br.Vollum lineage. These strains were isolated in Switzerland, in 1981, during iterative cases of anthrax involving workers in a textile plant processing cashmere wool from the Indian subcontinent. We took advantage of the hundreds of currently available B. anthracis genomes in public databases, to investigate the genetic diversity existing within the A.Br.Vollum lineage and to position the six Swiss isolates into the worldwide B. anthracis phylogeny. Thirty additional genomes related to the A.Br.Vollum group were identified by whole-genome single nucleotide polymorphism (SNP) analysis, including two strains forming a new evolutionary branch at the basis of the A.Br.Vollum lineage. This new phylogenetic lineage (termed A.Br.H9401) splits off the branch leading to the A.Br.Vollum group soon after its divergence to the other lineages of the major A clade (i.e. 6 SNPs). The available dataset of A.Br.Vollum genomes were resolved into 2 distinct groups. Isolates from the Swiss wool processing facility clustered together with two strains from Pakistan and one strain of unknown origin isolated from yarn. They were clearly differentiated (69 SNPs) from the twenty-five other A.Br.Vollum strains located on the branch leading to the terminal reference strain A0488 of the lineage. Novel analytic assays specific to these new subgroups were developed for the purpose of rapid molecular epidemiology. Whole genome SNP surveys greatly expand upon our knowledge on the sub-structure of the A.Br.Vollum lineage. Possible origin and route of spread of this lineage worldwide are discussed. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights

  10. Analysis of genomic alterations in neuroblastoma by multiplex ligation-dependent probe amplification and array comparative genomic hybridization: a comparison of results.

    Science.gov (United States)

    Combaret, Valérie; Iacono, Isabelle; Bréjon, Stéphanie; Schleiermacher, Gudrun; Pierron, Gäelle; Couturier, Jérôme; Bergeron, Christophe; Blay, Jean-Yves

    2012-12-01

    In cases of neuroblastoma, recurring genetic alterations--losses of the 1p, 3p, 4p, and 11q and/or gains of 1q, 2p, and 17q chromosome arms--are currently used to define the therapeutic strategy in therapeutic protocols for low- and intermediate-risk patients. Different genome-wide analysis techniques, such as array comparative genomic hybridization (aCGH) or multiplex ligation-dependent probe amplification (MLPA), have been suggested for detecting chromosome segmental abnormalities. In this study, we compared the results of the two technologies in the analyses of the DNA of tumor samples from 91 neuroblastoma patients. Similar results were obtained with the two techniques for 75 samples (82%). In five cases (5.5%), the MLPA results were not interpretable. Discrepancies between the aCGH and MLPA results were observed in 11 cases (12%). Among the discrepancies, a 18q21.2-qter gain and 16p11.2 and 11q14.1-q14.3 losses were detected only by aCGH. The MLPA results showed that the 7p, 7q, and 14q chromosome arms were affected in six cases, while in two cases, 2p and 17q gains were observed; these results were confirmed by neither aCGH nor fluorescence in situ hybridization (FISH) analysis. Because of the higher sensitivity and specificity of genome-wide information, reasonable cost, and shorter time of aCGH analysis, we recommend the aCGH procedure for the analysis of genomic alterations in neuroblastoma. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. Physical mapping of QTL for tuber yield, starch content and starch yield in tetraploid potato (Solanum tuberosum L.) by means of genome wide genotyping by sequencing and the 8.3 K SolCAP SNP array.

    Science.gov (United States)

    Schönhals, Elske Maria; Ding, Jia; Ritter, Enrique; Paulo, Maria João; Cara, Nicolás; Tacke, Ekhard; Hofferbert, Hans-Reinhard; Lübeck, Jens; Strahwald, Josef; Gebhardt, Christiane

    2017-08-22

    Tuber yield and starch content of the cultivated potato are complex traits of decisive importance for breeding improved varieties. Natural variation of tuber yield and starch content depends on the environment and on multiple, mostly unknown genetic factors. Dissection and molecular identification of the genes and their natural allelic variants controlling these complex traits will lead to the development of diagnostic DNA-based markers, by which precision and efficiency of selection can be increased (precision breeding). Three case-control populations were assembled from tetraploid potato cultivars based on maximizing the differences between high and low tuber yield (TY), starch content (TSC) and starch yield (TSY, arithmetic product of TY and TSC). The case-control populations were genotyped by restriction-site associated DNA sequencing (RADseq) and the 8.3 k SolCAP SNP genotyping array. The allele frequencies of single nucleotide polymorphisms (SNPs) were compared between cases and controls. RADseq identified, depending on data filtering criteria, between 6664 and 450 genes with one or more differential SNPs for one, two or all three traits. Differential SNPs in 275 genes were detected using the SolCAP array. A genome wide association study using the SolCAP array on an independent, unselected population identified SNPs associated with tuber starch content in 117 genes. Physical mapping of the genes containing differential or associated SNPs, and comparisons between the two genome wide genotyping methods and two different populations identified genome segments on all twelve potato chromosomes harboring one or more quantitative trait loci (QTL) for TY, TSC and TSY. Several hundred genes control tuber yield and starch content in potato. They are unequally distributed on all potato chromosomes, forming clusters between 0.5-4 Mbp width. The largest fraction of these genes had unknown function, followed by genes with putative signalling and regulatory functions. The

  12. A comparison of rice chloroplast genomes

    DEFF Research Database (Denmark)

    Tang, Jiabin; Xia, Hong'ai; Cao, Mengliang

    2004-01-01

    Using high quality sequence reads extracted from our whole genome shotgun repository, we assembled two chloroplast genome sequences from two rice (Oryza sativa) varieties, one from 93-11 (a typical indica variety) and the other from PA64S (an indica-like variety with maternal origin of japonica......), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S...

  13. How to interpret Methylation Sensitive Amplified Polymorphism (MSAP) profiles?

    OpenAIRE

    Fulneček, Jaroslav; Kovařík, Aleš

    2014-01-01

    Background DNA methylation plays a key role in development, contributes to genome stability, and may also respond to external factors supporting adaptation and evolution. To connect different types of stimuli with particular biological processes, identifying genome regions with altered 5-methylcytosine distribution at a genome-wide scale is important. Many researchers are using the simple, reliable, and relatively inexpensive Methylation Sensitive Amplified Polymorphism (MSAP) method that is ...

  14. Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy.

    Science.gov (United States)

    Johnson, Eric O; Hancock, Dana B; Levy, Joshua L; Gaddis, Nathan C; Saccone, Nancy L; Bierut, Laura J; Page, Grier P

    2013-05-01

    A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.

  15. Molecular Identification of Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) Markers.

    Science.gov (United States)

    Al-Khalifah, Nasser S; Shanavaskhan, A E

    2017-01-01

    Ambiguity in the total number of date palm cultivars across the world is pointing toward the necessity for an enumerative study using standard morphological and molecular markers. Among molecular markers, DNA markers are more suitable and ubiquitous to most applications. They are highly polymorphic in nature, frequently occurring in genomes, easy to access, and highly reproducible. Various molecular markers such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), inter-simple sequence repeats (ISSR), and random amplified polymorphic DNA (RAPD) markers have been successfully used as efficient tools for analysis of genetic variation in date palm. This chapter explains a stepwise protocol for extracting total genomic DNA from date palm leaves. A user-friendly protocol for RAPD analysis and a table showing the primers used in different molecular techniques that produce polymorphisms in date palm are also provided.

  16. Eighteen polymorphic microsatellites for domestic pigeon Columba ...

    Indian Academy of Sciences (India)

    certain parasites which cause health problems in humans and domestic animals ... The genomic DNA was isolated using standard protocol as described by ..... panel of polymorphic microsatellite markers in Himalayan monal. Lophophorus ...

  17. CRISPRDetect: A flexible algorithm to define CRISPR arrays.

    Science.gov (United States)

    Biswas, Ambarish; Staals, Raymond H J; Morales, Sergio E; Fineran, Peter C; Brown, Chris M

    2016-05-17

    CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.

  18. Search for methylation-sensitive amplification polymorphisms in mutant figs.

    Science.gov (United States)

    Rodrigues, M G F; Martins, A B G; Bertoni, B W; Figueira, A; Giuliatti, S

    2013-07-08

    Fig (Ficus carica) breeding programs that use conventional approaches to develop new cultivars are rare, owing to limited genetic variability and the difficulty in obtaining plants via gamete fusion. Cytosine methylation in plants leads to gene repression, thereby affecting transcription without changing the DNA sequence. Previous studies using random amplification of polymorphic DNA and amplified fragment length polymorphism markers revealed no polymorphisms among select fig mutants that originated from gamma-irradiated buds. Therefore, we conducted methylation-sensitive amplified polymorphism analysis to verify the existence of variability due to epigenetic DNA methylation among these mutant selections compared to the main cultivar 'Roxo-de-Valinhos'. Samples of genomic DNA were double-digested with either HpaII (methylation sensitive) or MspI (methylation insensitive) and with EcoRI. Fourteen primer combinations were tested, and on an average, non-methylated CCGG, symmetrically methylated CmCGG, and hemimethylated hmCCGG sites accounted for 87.9, 10.1, and 2.0%, respectively. MSAP analysis was effective in detecting differentially methylated sites in the genomic DNA of fig mutants, and methylation may be responsible for the phenotypic variation between treatments. Further analyses such as polymorphic DNA sequencing are necessary to validate these differences, standardize the regions of methylation, and analyze reads using bioinformatic tools.

  19. Detection and correction of false segmental duplications caused by genome mis-assembly

    Science.gov (United States)

    2010-01-01

    Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes. PMID:20219098

  20. Human Retrotransposon Insertion Polymorphisms Are Associated with Health and Disease via Gene Regulatory Phenotypes

    Directory of Open Access Journals (Sweden)

    Lu Wang

    2017-08-01

    Full Text Available The human genome hosts several active families of transposable elements (TEs, including the Alu, LINE-1, and SVA retrotransposons that are mobilized via reverse transcription of RNA intermediates. We evaluated how insertion polymorphisms generated by human retrotransposon activity may be related to common health and disease phenotypes that have been previously interrogated through genome-wide association studies (GWAS. To address this question, we performed a genome-wide screen for retrotransposon polymorphism disease associations that are linked to TE induced gene regulatory changes. Our screen first identified polymorphic retrotransposon insertions found in linkage disequilibrium (LD with single nucleotide polymorphisms that were previously associated with common complex diseases by GWAS. We further narrowed this set of candidate disease associated retrotransposon polymorphisms by identifying insertions that are located within tissue-specific enhancer elements. We then performed expression quantitative trait loci analysis on the remaining set of candidates in order to identify polymorphic retrotransposon insertions that are associated with gene expression changes in B-cells of the human immune system. This progressive and stringent screen yielded a list of six retrotransposon insertions as the strongest candidates for TE polymorphisms that lead to disease via enhancer-mediated changes in gene regulation. For example, we found an SVA insertion within a cell-type specific enhancer located in the second intron of the B4GALT1 gene. B4GALT1 encodes a glycosyltransferase that functions in the glycosylation of the Immunoglobulin G (IgG antibody in such a way as to convert its activity from pro- to anti-inflammatory. The disruption of the B4GALT1 enhancer by the SVA insertion is associated with down-regulation of the gene in B-cells, which would serve to keep the IgG molecule in a pro-inflammatory state. Consistent with this idea, the B4GALT1 enhancer

  1. Indel Group in Genomes (IGG) Molecular Genetic Markers1[OPEN

    Science.gov (United States)

    Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

    2016-01-01

    Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831

  2. Analysis of the genetic variation in Mycobacterium tuberculosis strains by multiple genome alignments

    Directory of Open Access Journals (Sweden)

    Morales Juan

    2008-11-01

    Full Text Available Abstract Background The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C, along with the genomes of laboratory strains (H37Rv and H37Ra, provides new insights on the mechanisms of adaptation of this bacterium to the human host. Findings The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using a Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicates that the different strains contain a limited number of unique polymorphisms. Conclusion The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives into the evolution and the understanding of the pathogenesis of this bacterium.

  3. Characterizing the cancer genome in lung adenocarcinoma

    Science.gov (United States)

    Weir, Barbara A.; Woo, Michele S.; Getz, Gad; Perner, Sven; Ding, Li; Beroukhim, Rameen; Lin, William M.; Province, Michael A.; Kraja, Aldi; Johnson, Laura A.; Shah, Kinjal; Sato, Mitsuo; Thomas, Roman K.; Barletta, Justine A.; Borecki, Ingrid B.; Broderick, Stephen; Chang, Andrew C.; Chiang, Derek Y.; Chirieac, Lucian R.; Cho, Jeonghee; Fujii, Yoshitaka; Gazdar, Adi F.; Giordano, Thomas; Greulich, Heidi; Hanna, Megan; Johnson, Bruce E.; Kris, Mark G.; Lash, Alex; Lin, Ling; Lindeman, Neal; Mardis, Elaine R.; McPherson, John D.; Minna, John D.; Morgan, Margaret B.; Nadel, Mark; Orringer, Mark B.; Osborne, John R.; Ozenberger, Brad; Ramos, Alex H.; Robinson, James; Roth, Jack A.; Rusch, Valerie; Sasaki, Hidefumi; Shepherd, Frances; Sougnez, Carrie; Spitz, Margaret R.; Tsao, Ming-Sound; Twomey, David; Verhaak, Roel G. W.; Weinstock, George M.; Wheeler, David A.; Winckler, Wendy; Yoshizawa, Akihiko; Yu, Soyoung; Zakowski, Maureen F.; Zhang, Qunyuan; Beer, David G.; Wistuba, Ignacio I.; Watson, Mark A.; Garraway, Levi A.; Ladanyi, Marc; Travis, William D.; Pao, William; Rubin, Mark A.; Gabriel, Stacey B.; Gibbs, Richard A.; Varmus, Harold E.; Wilson, Richard K.; Lander, Eric S.; Meyerson, Matthew

    2008-01-01

    Somatic alterations in cellular DNA underlie almost all human cancers1. The prospect of targeted therapies2 and the development of high-resolution, genome-wide approaches3–8 are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumors (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in ~12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. PMID:17982442

  4. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.

    2014-06-13

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (?92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. 2014 Macmillan Publishers Limited. All rights reserved.

  5. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.; Campino, Susana; Assefa, Samuel A.; Echeverry, Diego F.; Ocholla, Harold; Amambua-Ngwa, Alfred; Stewart, Lindsay B.; Conway, David J.; Borrmann, Steffen; Michon, Pascal; Zongo, Issaka; Oué draogo, Jean-Bosco; Djimde, Abdoulaye A.; Doumbo, Ogobara K.; Nosten, Francois; Pain, Arnab; Bousema, Teun; Drakeley, Chris J.; Fairhurst, Rick M.; Sutherland, Colin J.; Roper, Cally; Clark, Taane G.

    2014-01-01

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (?92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. 2014 Macmillan Publishers Limited. All rights reserved.

  6. Estimating additive and non-additive genetic variances and predicting genetic merits using genome-wide dense single nucleotide polymorphism markers.

    Directory of Open Access Journals (Sweden)

    Guosheng Su

    Full Text Available Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in human, wild life, model organisms or farm animals. However, non-additive genetic effects may have an important contribution to total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relation matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP markers. In addition, this study for the first time proposed a method to construct dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was implemented to investigate the amounts of additive genetic, dominance and epistatic variations, and assessed the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1 a simple additive genetic model (MA, 2 a model including both additive and additive by additive epistatic genetic effects (MAE, 3 a model including both additive and dominance genetic effects (MAD, and 4 a full model including all three genetic components (MAED. Estimates of narrow-sense heritability were 0.397, 0.373, 0.379 and 0.357 for models MA, MAE, MAD and MAED, respectively. Estimated dominance variance and additive by additive epistatic variance accounted for 5.6% and 9.5% of the total phenotypic variance, respectively. Based on model MAED, the estimate of broad-sense heritability was 0.506. Reliabilities of genomic predicted breeding values for the animals without performance records were 28.5%, 28.8%, 29.2% and 29.5% for models MA, MAE, MAD and MAED, respectively. In addition, models including non-additive genetic effects improved unbiasedness of genomic predictions.

  7. Single Nucleotide Polymorphism Identification, Characterization, and Linkage Mapping in Quinoa

    Directory of Open Access Journals (Sweden)

    P. J. Maughan

    2012-11-01

    Full Text Available Quinoa ( Willd. is an important seed crop throughout the Andean region of South America. It is important as a regional food security crop for millions of impoverished rural inhabitants of the Andean Altiplano (high plains. Efforts to improve the crop have led to an increased focus on genetic research. We report the identification of 14,178 putative single nucleotide polymorphisms (SNPs using a genomic reduction protocol as well as the development of 511 functional SNP assays. The SNP assays are based on KASPar genotyping chemistry and were detected using the Fluidigm dynamic array platform. A diversity screen of 113 quinoa accessions showed that the minor allele frequency (MAF of the SNPs ranged from 0.02 to 0.50, with an average MAF of 0.28. Structure analysis of the quinoa diversity panel uncovered the two major subgroups corresponding to the Andean and coastal quinoa ecotypes. Linkage mapping of the SNPs in two recombinant inbred line populations produced an integrated linkage map consisting of 29 linkage groups with 20 large linkage groups, spanning 1404 cM with a marker density of 3.1 cM per SNP marker. The SNPs identified here represent important genomic tools needed in emerging plant breeding programs for advanced genetic analysis of agronomic traits in quinoa.

  8. Analyses of Genotypes and Phenotypes of Ten Chinese Patients with Wolf-Hirschhorn Syndrome by Multiplex Ligation-dependent Probe Amplification and Array Comparative Genomic Hybridization.

    Science.gov (United States)

    Yang, Wen-Xu; Pan, Hong; Li, Lin; Wu, Hai-Rong; Wang, Song-Tao; Bao, Xin-Hua; Jiang, Yu-Wu; Qi, Yu

    2016-03-20

    Wolf-Hirschhorn syndrome (WHS) is a contiguous gene syndrome that is typically caused by a deletion of the distal portion of the short arm of chromosome 4. However, there are few reports about the features of Chinese WHS patients. This study aimed to characterize the clinical and molecular cytogenetic features of Chinese WHS patients using the combination of multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (array CGH). Clinical information was collected from ten patients with WHS. Genomic DNA was extracted from the peripheral blood of the patients. The deletions were analyzed by MLPA and array CGH. All patients exhibited the core clinical symptoms of WHS, including severe growth delay, a Greek warrior helmet facial appearance, differing degrees of intellectual disability, and epilepsy or electroencephalogram anomalies. The 4p deletions ranged from 2.62 Mb to 17.25 Mb in size and included LETM1, WHSC1, and FGFR3. The combined use of MLPA and array CGH is an effective and specific means to diagnose WHS and allows for the precise identification of the breakpoints and sizes of deletions. The deletion of genes in the WHS candidate region is closely correlated with the core WHS phenotype.

  9. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics

    Directory of Open Access Journals (Sweden)

    Zhang Dapeng

    2012-06-01

    Full Text Available Abstract Background Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis. Results Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX, and the poorly characterized “Photorhabdus virulence cassettes (PVC”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltranferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere between a single to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. Gene-neighborhood-analysis of

  10. Clinical biochemistry and laboratory medicine in the post-genome era

    International Nuclear Information System (INIS)

    Efremov, Georgi D.

    2001-01-01

    The last decades of the 20th century were a period of outstanding scientific achievements. The most significant discovery was the decoding of the human genome (Venter, J. et al., 2001; Dennis, C. et al., 2001; Baltimore, D., 2001). In this article the present view of the post genomic era is presented. The new analytical methods, such as micro arrays, bio chips, and nano technology, the discovery of SNPs, and the analysis of the proteome will lead to a greater understanding of the pathogenesis of inherited and acquired diseases. Their use in clinical chemistry and laboratory medicine, and the future of technological innovations are discussed. In the post genomic era the greatest interest will be devoted to the application of these scientific achievements in the diagnosis, prevention and therapy of human diseases. The advances in human genetics that have occurred during the past 20 years have revolutionized our knowledge of the role played by inheritance in health and disease. It is clear that our DNA determines not only single gene disorders but also interacts with environments to predispose individuals to cancer, allergy, hypertension, heart disease, diabetes, psychiatric disorders and even to some infectious diseases. The study of longevity and the demonstration of genes favouring a long lifespan suggest that such protective systems exist. The study of genetic polymorphisms has made clear that some alleles have beneficial effects. These discoveries will be of great help in our understanding of the interactions between genetics and environment. Gene array analysis has become the method of choice for identifying genes expressed at different levels in different samples. The mRNA expression profiles of normal and tumor tissues, treated and untreated cell cultures, and developmental stages of an organism can be compared quickly and easily with an appropriate array analysis system. A major task after a genome has been fully sequenced is to understand the functions

  11. Conclusive evidence for hexasomic inheritance in chrysanthemum based on analysis of a 183 k SNP array.

    Science.gov (United States)

    van Geest, Geert; Voorrips, Roeland E; Esselink, Danny; Post, Aike; Visser, Richard Gf; Arens, Paul

    2017-08-07

    Cultivated chrysanthemum is an outcrossing hexaploid (2n = 6× = 54) with a disputed mode of inheritance. In this paper, we present a single nucleotide polymorphism (SNP) selection pipeline that was used to design an Affymetrix Axiom array with 183 k SNPs from RNA sequencing data (1). With this array, we genotyped four bi-parental populations (with sizes of 405, 53, 76 and 37 offspring plants respectively), and a cultivar panel of 63 genotypes. Further, we present a method for dosage scoring in hexaploids from signal intensities of the array based on mixture models (2) and validation of selection steps in the SNP selection pipeline (3). The resulting genotypic data is used to draw conclusions on the mode of inheritance in chrysanthemum (4), and to make an inference on allelic expression bias (5). With use of the mixture model approach, we successfully called the dosage of 73,936 out of 183,130 SNPs (40.4%) that segregated in any of the bi-parental populations. To investigate the mode of inheritance, we analysed markers that segregated in the large bi-parental population (n = 405). Analysis of segregation of duplex x nulliplex SNPs resulted in evidence for genome-wide hexasomic inheritance. This evidence was substantiated by the absence of strong linkage between markers in repulsion, which indicated absence of full disomic inheritance. We present the success rate of SNP discovery out of RNA sequencing data as affected by different selection steps, among which SNP coverage over genotypes and use of different types of sequence read mapping software. Genomic dosage highly correlated with relative allele coverage from the RNA sequencing data, indicating that most alleles are expressed according to their genomic dosage. The large population, genotyped with a very large number of markers, is a unique framework for extensive genetic analyses in hexaploid chrysanthemum. As starting point, we show conclusive evidence for genome-wide hexasomic inheritance.

  12. Genome architecture enables local adaptation of Atlantic cod despite high connectivity

    DEFF Research Database (Denmark)

    Barth, Julia M I; Berg, Paul R; Jonsson, Per R.

    2017-01-01

    Adaptation to local conditions is a fundamental process in evolution; however, mechanisms maintaining local adaptation despite high gene flow are still poorly understood. Marine ecosystems provide a wide array of diverse habitats that frequently promote ecological adaptation even in species...... characterized by strong levels of gene flow. As one example, populations of the marine fish Atlantic cod (Gadus morhua) are highly connected due to immense dispersal capabilities but nevertheless show local adaptation in several key traits. By combining population genomic analyses based on 12K single......-nucleotide polymorphisms with larval dispersal patterns inferred using a biophysical ocean model, we show that Atlantic cod individuals residing in sheltered estuarine habitats of Scandinavian fjords mainly belong to offshore oceanic populations with considerable connectivity between these diverse ecosystems. Nevertheless...

  13. Single Nucleotide Polymorphism

    DEFF Research Database (Denmark)

    Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification...... of SNPs. This will allow acquisition of more information from the sample materials and open up for new possibilities as well as new challenges....

  14. Genome Plasticity and Polymorphisms in Critical Genes Correlate with Increased Virulence of Dutch Outbreak-Related Coxiella burnetii Strains

    Directory of Open Access Journals (Sweden)

    Runa Kuley

    2017-08-01

    Full Text Available Coxiella burnetii is an obligate intracellular bacterium and the etiological agent of Q fever. During 2007–2010 the largest Q fever outbreak ever reported occurred in The Netherlands. It is anticipated that strains from this outbreak demonstrated an increased zoonotic potential as more than 40,000 individuals were assumed to be infected. The acquisition of novel genetic factors by these C. burnetii outbreak strains, such as virulence-related genes, has frequently been proposed and discussed, but is not proved yet. In the present study, the whole genome sequence of several Dutch strains (CbNL01 and CbNL12 genotypes, a few additionally selected strains from different geographical locations and publicly available genome sequences were used for a comparative bioinformatics approach. The study focuses on the identification of specific genetic differences in the outbreak related CbNL01 strains compared to other C. burnetii strains. In this approach we investigated the phylogenetic relationship and genomic aspects of virulence and host-specificity. Phylogenetic clustering of whole genome sequences showed a genotype-specific clustering that correlated with the clustering observed using Multiple Locus Variable-number Tandem Repeat Analysis (MLVA. Ortholog analysis on predicted genes and single nucleotide polymorphism (SNP analysis of complete genome sequences demonstrated the presence of genotype-specific gene contents and SNP variations in C. burnetii strains. It also demonstrated that the currently used MLVA genotyping methods are highly discriminatory for the investigated outbreak strains. In the fully reconstructed genome sequence of the Dutch outbreak NL3262 strain of the CbNL01 genotype, a relatively large number of transposon-linked genes were identified as compared to the other published complete genome sequences of C. burnetii. Additionally, large numbers of SNPs in its membrane proteins and predicted virulence-associated genes were identified

  15. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    OpenAIRE

    Karolina Chwialkowska; Urszula Korotko; Joanna Kosinska; Iwona Szarejko; Miroslaw Kwasniewski

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing ...

  16. Genomic characterization of large heterochromatic gaps in the human genome assembly.

    Directory of Open Access Journals (Sweden)

    Nicolas Altemose

    2014-05-01

    Full Text Available The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3. The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.

  17. Genomic Variation in Natural Populations of Drosophila melanogaster

    Science.gov (United States)

    Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

    2012-01-01

    This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804

  18. GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies.

    Science.gov (United States)

    Sulovari, Arvis; Li, Dawei

    2014-07-19

    Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires same allele definition (nomenclature) and genome build among individual studies. Similarly, imputation, commonly-used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in use of different genome builds and allele definitions. Incorrect assumptions of identical allele definition among combined GWAS lead to a large portion of discarded genotypes or incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) showed no significant decrease of imputation quality (even significantly higher when compared to the imputation with singletons in the reference), especially for rare SNPs. GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP-arrays or deep

  19. High-Resolution Genome-Wide Linkage Mapping Identifies Susceptibility Loci for BMI in the Chinese Population

    DEFF Research Database (Denmark)

    Zhang, Dong Feng; Pang, Zengchang; Li, Shuxia

    2012-01-01

    The genetic loci affecting the commonly used BMI have been intensively investigated using linkage approaches in multiple populations. This study aims at performing the first genome-wide linkage scan on BMI in the Chinese population in mainland China with hypothesis that heterogeneity in genetic...... linkage could exist in different ethnic populations. BMI was measured from 126 dizygotic twins in Qingdao municipality who were genotyped using high-resolution Affymetrix Genome-Wide Human SNP arrays containing about 1 million single-nucleotide polymorphisms (SNPs). Nonparametric linkage analysis...... in western countries. Multiple loci showing suggestive linkage were found on chromosome 1 (lod score 2.38 at 242 cM), chromosome 8 (2.48 at 95 cM), and chromosome 14 (2.2 at 89.4 cM). The strong linkage identified in the Chinese subjects that is consistent with that found in populations of European origin...

  20. Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias

    Directory of Open Access Journals (Sweden)

    Didion John P

    2012-01-01

    Full Text Available Abstract Background High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. Results We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. Conclusion The problems of ascertainment bias and missing

  1. Genome-Wide Association Study to Identify Single Nucleotide Polymorphisms (SNPs) Associated With the Development of Erectile Dysfunction in African-American Men After Radiotherapy for Prostate Cancer

    International Nuclear Information System (INIS)

    Kerns, Sarah L.; Ostrer, Harry; Stock, Richard; Li, William; Moore, Julian; Pearlman, Alexander; Campbell, Christopher; Shao Yongzhao; Stone, Nelson; Kusnetz, Lynda; Rosenstein, Barry S.

    2010-01-01

    Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 x 10 -8 , Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value -6 . Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates

  2. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.

    Directory of Open Access Journals (Sweden)

    Stoffel Kevin

    2012-05-01

    distinguished morphological types. Conclusion By hybridizing genomic DNA to a custom oligonucleotide array designed for maximum gene coverage, we were able to identify polymorphisms using two approaches for pair-wise comparisons, as well as a highly parallel method that compared all 52 genotypes simultaneously.

  3. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    Science.gov (United States)

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619

  4. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

    Science.gov (United States)

    Thorvaldsdóttir, Helga; Robinson, James T; Mesirov, Jill P

    2013-03-01

    Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.

  5. Genomic Prediction of Barley Hybrid Performance

    Directory of Open Access Journals (Sweden)

    Norman Philipp

    2016-07-01

    Full Text Available Hybrid breeding in barley ( L. offers great opportunities to accelerate the rate of genetic improvement and to boost yield stability. A crucial requirement consists of the efficient selection of superior hybrid combinations. We used comprehensive phenotypic and genomic data from a commercial breeding program with the goal of examining the potential to predict the hybrid performances. The phenotypic data were comprised of replicated grain yield trials for 385 two-way and 408 three-way hybrids evaluated in up to 47 environments. The parental lines were genotyped using a 3k single nucleotide polymorphism (SNP array based on an Illumina Infinium assay. We implemented ridge regression best linear unbiased prediction modeling for additive and dominance effects and evaluated the prediction ability using five-fold cross validations. The prediction ability of hybrid performances based on general combining ability (GCA effects was moderate, amounting to 0.56 and 0.48 for two- and three-way hybrids, respectively. The potential of GCA-based hybrid prediction requires that both parental components have been evaluated in a hybrid background. This is not necessary for genomic prediction for which we also observed moderate cross-validated prediction abilities of 0.51 and 0.58 for two- and three-way hybrids, respectively. This exemplifies the potential of genomic prediction in hybrid barley. Interestingly, prediction ability using the two-way hybrids as training population and the three-way hybrids as test population or vice versa was low, presumably, because of the different genetic makeup of the parental source populations. Consequently, further research is needed to optimize genomic prediction approaches combining different source populations in barley.

  6. The impact of genome triplication on tandem gene evolution in Brassica rapa

    Directory of Open Access Journals (Sweden)

    Lu eFang

    2012-11-01

    Full Text Available Whole genome duplication (WGD and tandem duplication (TD are both important modes of gene expansion. However, how whole genome duplication influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional genome triplication (WGT and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751 and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata and T. parvula respectively. Among them, 414 conserved tandem arrays are shared by the 3 species without WGT, which were also considered as existing in the diploid ancestor of B. rapa. Thus, after genome triplication, B. rapa should have 1,242 tandem arrays according to the 414 conserved tandems. Here, we found 400 out of the 414 tandems had at least one syntenic ortholog in the genome of B. rapa. Furthermore, 294 out of the 400 shared syntenic orthologs maintain tandem arrays (more than one gene for each syntenic hit in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole-genome polyploidization event.

  7. Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

    Science.gov (United States)

    Vicente-Salvador, David; Puig, Marta; Gayà-Vidal, Magdalena; Pacheco, Sarai; Giner-Delgado, Carla; Noguera, Isaac; Izquierdo, David; Martínez-Fundichely, Alexander; Ruiz-Herrera, Aurora; Estivill, Xavier; Aguado, Cristina; Lucas-Lledó, José Ignacio; Cáceres, Mario

    2017-02-01

    The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF = 0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  8. Contributions of IKZF1, DDC, CDKN2A, CEBPE, and LMO1 Gene Polymorphisms to Acute Lymphoblastic Leukemia in a Yemeni Population.

    Science.gov (United States)

    Al-Absi, Boshra; Razif, Muhammad F M; Noor, Suzita M; Saif-Ali, Riyadh; Aqlan, Mohammed; Salem, Sameer D; Ahmed, Radwan H; Muniandy, Sekaran

    2017-10-01

    Genome-wide and candidate gene association studies have previously revealed links between a predisposition to acute lymphoblastic leukemia (ALL) and genetic polymorphisms in the following genes: IKZF1 (7p12.2; ID: 10320), DDC (7p12.2; ID: 1644), CDKN2A (9p21.3; ID: 1029), CEBPE (14q11.2; ID: 1053), and LMO1 (11p15; ID: 4004). In this study, we aimed to conduct an investigation into the possible association between polymorphisms in these genes and ALL within a sample of Yemeni children of Arab-Asian descent. Seven single-nucleotide polymorphisms (SNPs) in IKZF1, three SNPs in DDC, two SNPs in CDKN2A, two SNPs in CEBPE, and three SNPs in LMO1 were genotyped in 289 Yemeni children (136 cases and 153 controls), using the nanofluidic Dynamic Array (Fluidigm 192.24 Dynamic Array). Logistic regression analyses were used to estimate ALL risk, and the strength of association was expressed as odds ratios with 95% confidence intervals. We found that the IKZF1 SNP rs10235796 C allele (p = 0.002), the IKZF1 rs6964969 A>G polymorphism (p = 0.048, GG vs. AA), the CDKN2A rs3731246 G>C polymorphism (p = 0.047, GC+CC vs. GG), and the CDKN2A SNP rs3731246 C allele (p = 0.007) were significantly associated with ALL in Yemenis of Arab-Asian descent. In addition, a borderline association was found between IKZF1 rs4132601 T>G variant and ALL risk. No associations were found between the IKZF1 SNPs (rs11978267; rs7789635), DDC SNPs (rs3779084; rs880028; rs7809758), CDKN2A SNP (rs3731217), the CEBPE SNPs (rs2239633; rs12434881) and LMO1 SNPs (rs442264; rs3794012; rs4237770) with ALL in Yemeni children. The IKZF1 SNPs, rs10235796 and rs6964969, and the CDKN2A SNP rs3731246 (previously unreported) could serve as risk markers for ALL susceptibility in Yemeni children.

  9. Medicina genómica: Aplicaciones del polimorfismo de un nucleótido y micromatrices de ADN Genomic Medicine: Polymorphisms and microarray applications

    Directory of Open Access Journals (Sweden)

    Monica P. Spalvieri

    2004-12-01

    Full Text Available Esta actualización tiene por objeto difundir un nuevo enfoque de las variaciones del ADN entre individuos y comentar las nuevas tecnologías para su detección. La secuenciación total del genoma humano es el comienzo para conocer la diversidad genética. La unidad de medida reconocida de esta variabilidad es el polimorfismo de un solo nucleótido (single nucleotide polymorphism o SNP. El estudio de los SNPs está restringido a la investigación pero las numerosas publicaciones sobre el tema hacen vislumbrar su entrada en la práctica clínica. Se presentan ejemplos del uso de SNPs como marcadores moleculares en la genotipificación étnica, la expresión génica de enfermedades y como potenciales blancos farmacológicos. Se comenta la técnica de las matrices (arrays que facilita el estudio de múltiples secuencias de genes mediante chips de diseño específico. Los métodos convencionales analizan hasta un máximo de 20 genes, mientras que una sola micromatriz provee información sobre decenas de miles de genes simultáneamente con una genotipificación rápida y exacta. Los avances de la biotecnología permitirán conocer, además de la secuencia de cada gen, la frecuencia y ubicación exacta de los SNPs y su influencia en los comportamientos celulares. Si bien la validez de los resultados y la eficiencia de las micromatrices son aún controvertidos, el conocimiento y caracterización del perfil genético de un paciente impulsará seguramente un cambio radical en la prevención, diagnóstico, pronóstico y tratamiento de las enfermedades humanas.This update shows new concepts related to the significance of DNA variations among individuals, as well as to their detection by using a new technology. The sequencing of the human genome is only the beginning of what will enable us to understand genetic diversity. The unit of DNA variability is the polymorphism of a single nucleotide (SNP. At present, studies on SNPs are restricted to basic research

  10. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

    NARCIS (Netherlands)

    J. Huang (Jie); B. Howie (Bryan); S. McCarthy (Shane); Y. Memari (Yasin); K. Walter (Klaudia); J.L. Min (Josine L.); P. Danecek (Petr); G. Malerba (Giovanni); E. Trabetti (Elisabetta); H.-F. Zheng (Hou-Feng); G. Gambaro (Giovanni); J.B. Richards (Brent); R. Durbin (Richard); N.J. Timpson (Nicholas); J. Marchini (Jonathan); N. Soranzo (Nicole); S.H. Al Turki (Saeed); A. Amuzu (Antoinette); C. Anderson (Carl); R. Anney (Richard); D. Antony (Dinu); M.S. Artigas; M. Ayub (Muhammad); S. Bala (Senduran); J.C. Barrett (Jeffrey); I.E. Barroso (Inês); P.L. Beales (Philip); M. Benn (Marianne); J. Bentham (Jamie); S. Bhattacharya (Shoumo); E. Birney (Ewan); D.H.R. Blackwood (Douglas); M. Bobrow (Martin); E. Bochukova (Elena); P.F. Bolton (Patrick F.); R. Bounds (Rebecca); C. Boustred (Chris); G. Breen (Gerome); M. Calissano (Mattia); K. Carss (Keren); J.P. Casas (Juan Pablo); J.C. Chambers (John C.); R. Charlton (Ruth); K. Chatterjee (Krishna); L. Chen (Lu); A. Ciampi (Antonio); S. Cirak (Sebahattin); P. Clapham (Peter); G. Clement (Gail); G. Coates (Guy); M. Cocca (Massimiliano); D.A. Collier (David); C. Cosgrove (Catherine); T. Cox (Tony); N.J. Craddock (Nick); L. Crooks (Lucy); S. Curran (Sarah); D. Curtis (David); A. Daly (Allan); I.N.M. Day (Ian N.M.); A.G. Day-Williams (Aaron); G.V. Dedoussis (George); T. Down (Thomas); Y. Du (Yuanping); C.M. van Duijn (Cornelia); I. Dunham (Ian); T. Edkins (Ted); R. Ekong (Rosemary); P. Ellis (Peter); D.M. Evans (David); I.S. Farooqi (I. Sadaf); D.R. Fitzpatrick (David R.); P. Flicek (Paul); J. Floyd (James); A.R. Foley (A. Reghan); C.S. Franklin (Christopher S.); M. Futema (Marta); L. Gallagher (Louise); P. Gasparini (Paolo); T.R. Gaunt (Tom); M. Geihs (Matthias); D. Geschwind (Daniel); C.M.T. Greenwood (Celia); H. Griffin (Heather); D. Grozeva (Detelina); X. Guo (Xiaosen); X. Guo (Xueqin); H. Gurling (Hugh); D. Hart (Deborah); A.E. Hendricks (Audrey E.); P.A. Holmans (Peter A.); L. Huang (Liren); T. Hubbard (Tim); S.E. Humphries (Steve E.); M.E. Hurles (Matthew); P.G. Hysi (Pirro); V. Iotchkova (Valentina); A. Isaacs (Aaron); D.K. Jackson (David K.); Y. Jamshidi (Yalda); J. Johnson (Jon); C. Joyce (Chris); K.J. Karczewski (Konrad); J. Kaye (Jane); T. Keane (Thomas); J.P. Kemp (John); K. Kennedy (Karen); A. Kent (Alastair); J. Keogh (Julia); F. Khawaja (Farrah); M.E. Kleber (Marcus); M. Van Kogelenberg (Margriet); A. Kolb-Kokocinski (Anja); J.S. Kooner (Jaspal S.); G. Lachance (Genevieve); C. Langenberg (Claudia); C. Langford (Cordelia); D. Lawson (Daniel); I. Lee (Irene); E.M. van Leeuwen (Elisa); M. Lek (Monkol); R. Li (Rui); Y. Li (Yingrui); J. Liang (Jieqin); H. Lin (Hong); R. Liu (Ryan); J. Lönnqvist (Jouko); L.R. Lopes (Luis R.); M.C. Lopes (Margarida); J. Luan; D.G. MacArthur (Daniel G.); M. Mangino (Massimo); G. Marenne (Gaëlle); W. März (Winfried); J. Maslen (John); A. Matchan (Angela); I. Mathieson (Iain); P. McGuffin (Peter); A.M. McIntosh (Andrew); A.G. McKechanie (Andrew G.); A. McQuillin (Andrew); S. Metrustry (Sarah); N. Migone (Nicola); H.M. Mitchison (Hannah M.); A. Moayyeri (Alireza); J. Morris (James); R. Morris (Richard); D. Muddyman (Dawn); F. Muntoni; B.G. Nordestgaard (Børge G.); K. Northstone (Kate); M.C. O'donovan (Michael); S. O'Rahilly (Stephen); A. Onoufriadis (Alexandros); K. Oualkacha (Karim); M.J. Owen (Michael J.); A. Palotie (Aarno); K. Panoutsopoulou (Kalliope); V. Parker (Victoria); J.R. Parr (Jeremy R.); L. Paternoster (Lavinia); T. Paunio (Tiina); F. Payne (Felicity); S.J. Payne (Stewart J.); J.R.B. Perry (John); O.P.H. Pietiläinen (Olli); V. Plagnol (Vincent); R.C. Pollitt (Rebecca C.); S. Povey (Sue); M.A. Quail (Michael A.); L. Quaye (Lydia); L. Raymond (Lucy); K. Rehnström (Karola); C.K. Ridout (Cheryl K.); S.M. Ring (Susan); G.R.S. Ritchie (Graham R.S.); N. Roberts (Nicola); R.L. Robinson (Rachel L.); D.B. Savage (David); P.J. Scambler (Peter); S. Schiffels (Stephan); M. Schmidts (Miriam); N. Schoenmakers (Nadia); R.H. Scott (Richard H.); R.A. Scott (Robert); R.K. Semple (Robert K.); E. Serra (Eva); S.I. Sharp (Sally I.); A.C. Shaw (Adam C.); H.A. Shihab (Hashem A.); S.-Y. Shin (So-Youn); D. Skuse (David); K.S. Small (Kerrin); C. Smee (Carol); G.D. Smith; L. Southam (Lorraine); O. Spasic-Boskovic (Olivera); T.D. Spector (Timothy); D. St. Clair (David); B. St Pourcain (Beate); J. Stalker (Jim); E. Stevens (Elizabeth); J. Sun (Jianping); G. Surdulescu (Gabriela); J. Suvisaari (Jaana); P. Syrris (Petros); I. Tachmazidou (Ioanna); R. Taylor (Rohan); J. Tian (Jing); M.D. Tobin (Martin); D. Toniolo (Daniela); M. Traglia (Michela); A. Tybjaerg-Hansen; A.M. Valdes; A.M. Vandersteen (Anthony M.); A. Varbo (Anette); P. Vijayarangakannan (Parthiban); P.M. Visscher (Peter); L.V. Wain (Louise); J.T. Walters (James); G. Wang (Guangbiao); J. Wang (Jun); Y. Wang (Yu); K. Ward (Kirsten); E. Wheeler (Eleanor); P.H. Whincup (Peter); T. Whyte (Tamieka); H.J. Williams (Hywel J.); K.A. Williamson (Kathleen); C. Wilson (Crispian); S.G. Wilson (Scott); K. Wong (Kim); C. Xu (Changjiang); J. Yang (Jian); G. Zaza (Gianluigi); E. Zeggini (Eleftheria); F. Zhang (Feng); P. Zhang (Pingbo); W. Zhang (Weihua)

    2015-01-01

    textabstractImputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced

  11. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using whole genome re-sequencing approach. A total of 192.19 Gb data, generated on 35 genotypes of chickpea, comprising 973.13 million reads, with an average sequencing depth of ~10 X for each line. On an average 92.18 % reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. Highest number of SNPs were identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  12. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    Science.gov (United States)

    Shi, Jinming

    In this study, cytosine methylation on CCGG site and genomic DNA sequence changes of adult leaves of rice after seeds space flight were detected by methylation-sensitive amplification polymorphism (MSAP) and Amplified fragment length polymorphism (AFLP) technique respectively. Rice seeds were planted in the trial field after 4 days space flight on the shenzhou-6 Spaceship of China. Adult leaves of space-treated rice including 8 plants chosen randomly and 2 plants with phenotypic mutation were used for AFLP and MSAP analysis. Polymorphism of both DNA sequence and cytosine methylation were detected. For MSAP analysis, the average polymorphic frequency of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11% respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9%and 8%respectively. Total 27 and 22 polymorphic fragments were cloned sequenced from MSAP and AFLP analysis respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. For the 22 polymorphic fragments from AFLP analysis, no one shows homology to mRNA sequence and eight fragments show homology to repeat region or retrotransposon sequence. These results suggest that although both genomic DNA sequence and cytosine methylation status can be effected by space flight, the genomic region homology to the fragments from genome DNA and cytosine methylation analysis were different.

  13. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  14. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    Science.gov (United States)

    2011-09-01

    SNP Array v2. A ‘proof-of-concept’ advanced data mining algorithm for unsupervised analysis of genome-wide association study (GWAS) dataset was... Opal F AUS Yes U141 Peggs F AUS Yes U142 Taxi F AUS Yes U143 Riso MI MAL Yes U144 Szarik MI GSD Yes U145 Astor MI MAL Yes U146 Roy MC MAL Yes... mining of genetic studies in general, and especially GWAS. As a proof-of-concept, a classification analysis of the WG SNP typing dataset of a

  15. Genome-Wide Association Study of Antiphospholipid Antibodies

    Directory of Open Access Journals (Sweden)

    M. Ilyas Kamboh

    2013-01-01

    Full Text Available Background. The persistent presence of antiphospholipid antibodies (APA may lead to the development of primary or secondary antiphospholipid syndrome. Although the genetic basis of APA has been suggested, the identity of the underlying genes is largely unknown. In this study, we have performed a genome-wide association study (GWAS in an effort to identify susceptibility loci/genes for three main APA: anticardiolipin antibodies (ACL, lupus anticoagulant (LAC, and anti-β2 glycoprotein I antibodies (anti-β2GPI. Methods. DNA samples were genotyped using the Affymetrix 6.0 array containing 906,600 single-nucleotide polymorphisms (SNPs. Association of SNPs with the antibody status (positive/negative was tested using logistic regression under the additive model. Results. We have identified a number of suggestive novel loci with Pgenome-wide significance, many of the suggestive loci are potential candidates for the production of APA. We have replicated the previously reported associations of HLA genes and APOH with APA but these were not the top loci. Conclusions. We have identified a number of suggestive novel loci for APA that will stimulate follow-up studies in independent and larger samples to replicate our findings.

  16. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

    DEFF Research Database (Denmark)

    Huang, Jie; Howie, Bryan; Mccarthy, Shane

    2015-01-01

    Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low de...

  17. Characterization of polymorphic SSRs among Prunus chloroplast genomes

    Science.gov (United States)

    An in silico mining process yielded 80, 75, and 78 microsatellites in the chloroplast genome of Prunus persica, P. kansuensis, and P. mume. A and T repeats were predominant in the three genomes, accounting for 67.8% on average and most of them were successful in primer design. For the 80 P. persica ...

  18. Genome landscape and evolutionary plasticity of chromosomes in malaria mosquitoes.

    Directory of Open Access Journals (Sweden)

    Ai Xia

    2010-05-01

    Full Text Available Nonrandom distribution of rearrangements is a common feature of eukaryotic chromosomes that is not well understood in terms of genome organization and evolution. In the major African malaria vector Anopheles gambiae, polymorphic inversions are highly nonuniformly distributed among five chromosomal arms and are associated with epidemiologically important adaptations. However, it is not clear whether the genomic content of the chromosomal arms is associated with inversion polymorphism and fixation rates.To better understand the evolutionary dynamics of chromosomal inversions, we created a physical map for an Asian malaria mosquito, Anopheles stephensi, and compared it with the genome of An. gambiae. We also developed and deployed novel Bayesian statistical models to analyze genome landscapes in individual chromosomal arms An. gambiae. Here, we demonstrate that, despite the paucity of inversion polymorphisms on the X chromosome, this chromosome has the fastest rate of inversion fixation and the highest density of transposable elements, simple DNA repeats, and GC content. The highly polymorphic and rapidly evolving autosomal 2R arm had overrepresentation of genes involved in cellular response to stress supporting the role of natural selection in maintaining adaptive polymorphic inversions. In addition, the 2R arm had the highest density of regions involved in segmental duplications that clustered in the breakpoint-rich zone of the arm. In contrast, the slower evolving 2L, 3R, and 3L, arms were enriched with matrix-attachment regions that potentially contribute to chromosome stability in the cell nucleus.These results highlight fundamental differences in evolutionary dynamics of the sex chromosome and autosomes and revealed the strong association between characteristics of the genome landscape and rates of chromosomal evolution. We conclude that a unique combination of various classes of genes and repetitive DNA in each arm, rather than a single type

  19. Correction for Measurement Error from Genotyping-by-Sequencing in Genomic Variance and Genomic Prediction Models

    DEFF Research Database (Denmark)

    Ashraf, Bilal; Janss, Luc; Jensen, Just

    sample). The GBSeq data can be used directly in genomic models in the form of individual SNP allele-frequency estimates (e.g., reference reads/total reads per polymorphic site per individual), but is subject to measurement error due to the low sequencing depth per individual. Due to technical reasons....... In the current work we show how the correction for measurement error in GBSeq can also be applied in whole genome genomic variance and genomic prediction models. Bayesian whole-genome random regression models are proposed to allow implementation of large-scale SNP-based models with a per-SNP correction...... for measurement error. We show correct retrieval of genomic explained variance, and improved genomic prediction when accounting for the measurement error in GBSeq data...

  20. [Study of Chloroplast DNA Polymorphism in the Sunflower (Helianthus L.)].

    Science.gov (United States)

    Markina, N V; Usatov, A V; Logacheva, M D; Azarin, K V; Gorbachenko, C F; Kornienko, I V; Gavrilova, V A; Tihobaeva, V E

    2015-08-01

    The polymorphism of microsatellite loci of chloroplast genome in six Helianthus species and 46 lines of cultivated sunflower H. annuus (17 CMS lines and 29 Rf-lines) were studied. The differences between species are confined to four SSR loci. Within cultivated forms of the sunflower H. annuus, the polymorphism is absent. A comparative analysis was performed on sequences of the cpDNA inbred line 3629, line 398941 of the wild sunflower, and the American line HA383 H. annuus. As a result, 52 polymorphic loci represented by 27 SSR and 25 SNP were found; they can be used for genotyping of H. annuus samples, including cultural varieties: twelve polymorphic positions, of which eight are SSR and four are SNP.

  1. Nested Inversion Polymorphisms Predispose Chromosome 22q11.2 to Meiotic Rearrangements

    NARCIS (Netherlands)

    Demaerel, Wolfram; Hestand, Matthew S.; Vergaelen, Elfi; Swillen, Ann; López-Sánchez, Marcos; Pérez-Jurado, Luis A.; McDonald-Mcginn, Donna M.; Zackai, Elaine; Emanuel, Beverly S.; Morrow, Bernice E.; Breckpot, Jeroen; Devriendt, Koenraad; Vermeesch, Joris R.; Antshel, Kevin M.; Arango, Celso; Armando, Marco; Bassett, Anne S.; Bearden, Carrie E.; Boot, Erik; Bravo-Sanchez, Marta; Breetvelt, Elemi; Busa, Tiffany; Butcher, Nancy J.; Campbell, Linda E.; Carmel, Miri; Chow, Eva W C; Crowley, T. Blaine; Cubells, Joseph; Cutler, David; Demaerel, Wolfram; Digilio, Maria Cristina; Duijff, Sasja; Eliez, Stephan; Emanuel, Beverly S.; Epstein, Michael P.; Evers, Rens; Fernandez Garcia-Moya, Luis; Fiksinski, Ania; Fraguas, David; Fremont, Wanda; Fritsch, Rosemarie; Garcia-Minaur, Sixto; Golden, Aaron; Gothelf, Doron; Guo, Tingwei; Gur, Ruben C.; Gur, Raquel E.; Heine-Suner, Damian; Hestand, Matthew; Hooper, Stephen R.; Kates, Wendy R.; Kushan, Leila; Laorden-Nieto, Alejandra; Maeder, Johanna; Marino, Bruno; Marshall, Christian R.; McCabe, Kathryn; McDonald-Mcginn, Donna M.; Michaelovosky, Elena; Morrow, Bernice E.; Moss, Edward; Mulle, Jennifer; Murphy, Declan; Murphy, Kieran C.; Murphy, Clodagh M.; Niarchou, Maria; Ornstein, Claudia; Owen, Michael J; Philip, Nicole; Repetto, Gabriela M.; Schneider, Maude; Shashi, Vandana; Simon, Tony J.; Swillen, Ann; Tassone, Flora; Unolt, Marta; Van Amelsvoort, Therese; van den Bree, Marianne B M; Van Duin, Esther; Vergaelen, Elfi; Vermeesch, Joris R.; Vicari, Stefano; Vingerhoets, Claudia; Vorstman, Jacob; Warren, Steve; Weinberger, Ronnie; Weisman, Omri; Weizman, Abraham; Zackai, Elaine; Zhang, Zhengdong; Zwick, Michael

    2017-01-01

    Inversion polymorphisms between low-copy repeats (LCRs) might predispose chromosomes to meiotic non-allelic homologous recombination (NAHR) events and thus lead to genomic disorders. However, for the 22q11.2 deletion syndrome (22q11.2DS), the most common genomic disorder, no such inversions have

  2. Genome sequence of herpes simplex virus 1 strain KOS.

    Science.gov (United States)

    Macdonald, Stuart J; Mostafa, Heba H; Morrison, Lynda A; Davido, David J

    2012-06-01

    Herpes simplex virus type 1 (HSV-1) strain KOS has been extensively used in many studies to examine HSV-1 replication, gene expression, and pathogenesis. Notably, strain KOS is known to be less pathogenic than the first sequenced genome of HSV-1, strain 17. To understand the genotypic differences between KOS and other phenotypically distinct strains of HSV-1, we sequenced the viral genome of strain KOS. When comparing strain KOS to strain 17, there are at least 1,024 small nucleotide polymorphisms (SNPs) and 172 insertions/deletions (indels). The polymorphisms observed in the KOS genome will likely provide insights into the genes, their protein products, and the cis elements that regulate the biology of this HSV-1 strain.

  3. Genomic Prediction Accuracy for Resistance Against Piscirickettsia salmonis in Farmed Rainbow Trout

    Directory of Open Access Journals (Sweden)

    Grazyella M. Yoshida

    2018-02-01

    Full Text Available Salmonid rickettsial syndrome (SRS, caused by the intracellular bacterium Piscirickettsia salmonis, is one of the main diseases affecting rainbow trout (Oncorhynchus mykiss farming. To accelerate genetic progress, genomic selection methods can be used as an effective approach to control the disease. The aims of this study were: (i to compare the accuracy of estimated breeding values using pedigree-based best linear unbiased prediction (PBLUP with genomic BLUP (GBLUP, single-step GBLUP (ssGBLUP, Bayes C, and Bayesian Lasso (LASSO; and (ii to test the accuracy of genomic prediction and PBLUP using different marker densities (0.5, 3, 10, 20, and 27 K for resistance against P. salmonis in rainbow trout. Phenotypes were recorded as number of days to death (DD and binary survival (BS from 2416 fish challenged with P. salmonis. A total of 1934 fish were genotyped using a 57 K single-nucleotide polymorphism (SNP array. All genomic prediction methods achieved higher accuracies than PBLUP. The relative increase in accuracy for different genomic models ranged from 28 to 41% for both DD and BS at 27 K SNP. Between different genomic models, the highest relative increase in accuracy was obtained with Bayes C (∼40%, where 3 K SNP was enough to achieve a similar accuracy to that of the 27 K SNP for both traits. For resistance against P. salmonis in rainbow trout, we showed that genomic predictions using GBLUP, ssGBLUP, Bayes C, and LASSO can increase accuracy compared with PBLUP. Moreover, it is possible to use relatively low-density SNP panels for genomic prediction without compromising accuracy predictions for resistance against P. salmonis in rainbow trout.

  4. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Abstract Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  5. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper.

    Science.gov (United States)

    Manivannan, Abinaya; Kim, Jin-Hee; Yang, Eun-Young; Ahn, Yul-Kyun; Lee, Eun-Su; Choi, Sena; Kim, Do-Sun

    2018-01-01

    Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. Owing to meet the growing demand for pepper with high quality, organoleptic property, nutraceutical contents, and disease tolerance, genomics assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has reformed the plant breeding technology especially in the area of molecular marker assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, the NGS methods facilitate the genome-wide discovery of DNA based markers linked to key genes involved in important biological phenomenon. Among the molecular markers, single nucleotide polymorphism (SNP) indulges various benefits in comparison with other existing DNA based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in future.

  6. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper

    Directory of Open Access Journals (Sweden)

    Abinaya Manivannan

    2018-01-01

    Full Text Available Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in worldwide cuisines. Therefore, the domestication of pepper has been carried out since antiquity. Owing to meet the growing demand for pepper with high quality, organoleptic property, nutraceutical contents, and disease tolerance, genomics assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS approaches has reformed the plant breeding technology especially in the area of molecular marker assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind the vital physiological processes. In addition, the NGS methods facilitate the genome-wide discovery of DNA based markers linked to key genes involved in important biological phenomenon. Among the molecular markers, single nucleotide polymorphism (SNP indulges various benefits in comparison with other existing DNA based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided in the current endeavor can be utilized for the betterment of pepper breeding in future.

  7. Spatial normalization of array-CGH data

    Directory of Open Access Journals (Sweden)

    Brennetot Caroline

    2006-05-01

    Full Text Available Abstract Background Array-based comparative genomic hybridization (array-CGH is a recently developed technique for analyzing changes in DNA copy number. As in all microarray analyses, normalization is required to correct for experimental artifacts while preserving the true biological signal. We investigated various sources of systematic variation in array-CGH data and identified two distinct types of spatial effect of no biological relevance as the predominant experimental artifacts: continuous spatial gradients and local spatial bias. Local spatial bias affects a large proportion of arrays, and has not previously been considered in array-CGH experiments. Results We show that existing normalization techniques do not correct these spatial effects properly. We therefore developed an automatic method for the spatial normalization of array-CGH data. This method makes it possible to delineate and to eliminate and/or correct areas affected by spatial bias. It is based on the combination of a spatial segmentation algorithm called NEM (Neighborhood Expectation Maximization and spatial trend estimation. We defined quality criteria for array-CGH data, demonstrating significant improvements in data quality with our method for three data sets coming from two different platforms (198, 175 and 26 BAC-arrays. Conclusion We have designed an automatic algorithm for the spatial normalization of BAC CGH-array data, preventing the misinterpretation of experimental artifacts as biologically relevant outliers in the genomic profile. This algorithm is implemented in the R package MANOR (Micro-Array NORmalization, which is described at http://bioinfo.curie.fr/projects/manor and available from the Bioconductor site http://www.bioconductor.org. It can also be tested on the CAPweb bioinformatics platform at http://bioinfo.curie.fr/CAPweb.

  8. Endothelial nitric oxide synthase gene polymorphisms associated ...

    African Journals Online (AJOL)

    Endothelial nitric oxide synthase (NOS3) is involved in key steps of immune response. Genetic factors predispose individuals to periodontal disease. This study's aim was to explore the association between NOS3 gene polymorphisms and clinical parameters in patients with periodontal disease. Genomic DNA was obtained ...

  9. Detection of human DNA polymorphisms with a simplified denaturing gradient gel electrophoresis technique

    International Nuclear Information System (INIS)

    Noll, W.W.; Collins, M.

    1987-01-01

    Single base pair differences between otherwise identical DNA molecules can result in altered melting behavior detectable by denaturing gradient gel electrophoresis. The authors have developed a simplified procedure for using denaturing gradient gel electrophoresis to detect base pair changes in genomic DNA. Genomic DNA is digested with restriction enzymes and hybridized in solution to labeled single-stranded probe DNA. The excess probe is then hybridized to complementary phage M13 template DNA, and the reaction mixture is electrophoresed on a denaturing gradient gel. Only the genomic DNA probe hybrids migrate into the gel. Differences in hybrid mobility on the gel indicate base pair changes in the genomic DNA. They have used this technique to identify two polymorphic sites within a 1.2-kilobase region of human chromosome 20. This approach should greatly facilitate the identification of DNA polymorphisms useful for gene linkage studies and the diagnosis of genetic diseases

  10. A genome-wide association study by ImmunoChip reveals potential modifiers in myelodysplastic syndromes.

    Science.gov (United States)

    Danjou, Fabrice; Fozza, Claudio; Zoledziewska, Magdalena; Mulas, Antonella; Corda, Giovanna; Contini, Salvatore; Dore, Fausto; Galleu, Antonio; Di Tucci, Anna Angela; Caocci, Giovanni; Gaviano, Eleonora; Latte, Giancarlo; Gabbas, Attilio; Casula, Paolo; Delogu, Lucia Gemma; La Nasa, Giorgio; Angelucci, Emanuele; Cucca, Francesco; Longinotti, Maurizio

    2016-11-01

    Because different findings suggest that an immune dysregulation plays a role in the pathogenesis of myelodysplastic syndrome (MDS), we analyzed a large cohort of patients from a homogeneous Sardinian population using ImmunoChip, a genotyping array exploring 147,954 single-nucleotide polymorphisms (SNPs) localized in genomic regions displaying some degree of association with immune-mediated diseases or pathways. The population studied included 133 cases and 3,894 controls, and a total of 153,978 autosomal markers and 971 non-autosomal markers were genotyped. After association analysis, only one variant passed the genome-wide significance threshold: rs71325459 (p = 1.16 × 10 -12 ), which is situated on chromosome 20. The variant is in high linkage disequilibrium with rs35640778, an untested missense variant situated in the RTEL1 gene, an interesting candidate that encodes for an ATP-dependent DNA helicase implicated in telomere-length regulation, DNA repair, and maintenance of genomic stability. The second most associated signal is composed of five variants that fall slightly below the genome-wide significance threshold but point out another interesting gene candidate. These SNPs, with p values between 2.53 × 10 -6 and 3.34 × 10 -6 , are situated in the methylene tetrahydrofolate reductase (MTHFR) gene. The most associated of these variants, rs1537514, presents an increased frequency of the derived C allele in cases, with 11.4% versus 4.4% in controls. MTHFR is the rate-limiting enzyme in the methyl cycle and genetic variations in this gene have been strongly associated with the risk of neoplastic diseases. The current understanding of the MDS biology, which is based on the hypothesis of the sequential development of multiple subclonal molecular lesions, fits very well with the demonstration of a possible role for RTEL1 and MTHFR gene polymorphisms, both of which are related to a variable risk of genomic instability. Copyright © 2016 ISEH - International

  11. Informative genomic microsatellite markers for efficient genotyping applications in sugarcane.

    Science.gov (United States)

    Parida, Swarup K; Kalia, Sanjay K; Kaul, Sunita; Dalal, Vivek; Hemaprabha, G; Selvi, Athiappan; Pandit, Awadhesh; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; Srivastava, Prem Shankar; Singh, Nagendra K; Mohapatra, Trilochan

    2009-01-01

    Genomic microsatellite markers are capable of revealing high degree of polymorphism. Sugarcane (Saccharum sp.), having a complex polyploid genome requires more number of such informative markers for various applications in genetics and breeding. With the objective of generating a large set of microsatellite markers designated as Sugarcane Enriched Genomic MicroSatellite (SEGMS), 6,318 clones from genomic libraries of two hybrid sugarcane cultivars enriched with 18 different microsatellite repeat-motifs were sequenced to generate 4.16 Mb high-quality sequences. Microsatellites were identified in 1,261 of the 5,742 non-redundant clones that accounted for 22% enrichment of the libraries. Retro-transposon association was observed for 23.1% of the identified microsatellites. The utility of the microsatellite containing genomic sequences were demonstrated by higher primer designing potential (90%) and PCR amplification efficiency (87.4%). A total of 1,315 markers including 567 class I microsatellite markers were designed and placed in the public domain for unrestricted use. The level of polymorphism detected by these markers among sugarcane species, genera, and varieties was 88.6%, while cross-transferability rate was 93.2% within Saccharum complex and 25% to cereals. Cloning and sequencing of size variant amplicons revealed that the variation in the number of repeat-units was the main source of SEGMS fragment length polymorphism. High level of polymorphism and wide range of genetic diversity (0.16-0.82 with an average of 0.44) assayed with the SEGMS markers suggested their usefulness in various genotyping applications in sugarcane.

  12. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  13. sY116, a human Y-linked polymorphic STS

    Indian Academy of Sciences (India)

    3Laboratoire d'ImmunogeÂneÂtique, Faculte de Sciences de Tunis, Tunis, Tunisia ... studying genomic instabilities in some types of cancer is discussed. Materials ..... polymorphisms by denaturing high-performance liquid chroma- tography.

  14. Utilization of complete chloroplast genomes for phylogenetic studies

    NARCIS (Netherlands)

    Ramlee, Shairul Izan Binti

    2016-01-01

    Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from

  15. Comprehensive search for intra- and inter-specific sequence polymorphisms among coding envelope genes of retroviral origin found in the human genome: genes and pseudogenes

    Directory of Open Access Journals (Sweden)

    Vasilescu Alexandre

    2005-09-01

    Full Text Available Abstract Background The human genome carries a high load of proviral-like sequences, called Human Endogenous Retroviruses (HERVs, which are the genomic traces of ancient infections by active retroviruses. These elements are in most cases defective, but open reading frames can still be found for the retroviral envelope gene, with sixteen such genes identified so far. Several of them are conserved during primate evolution, having possibly been co-opted by their host for a physiological role. Results To characterize further their status, we presently sequenced 12 of these genes from a panel of 91 Caucasian individuals. Genomic analyses reveal strong sequence conservation (only two non synonymous Single Nucleotide Polymorphisms [SNPs] for the two HERV-W and HERV-FRD envelope genes, i.e. for the two genes specifically expressed in the placenta and possibly involved in syncytiotrophoblast formation. We further show – using an ex vivo fusion assay for each allelic form – that none of these SNPs impairs the fusogenic function. The other envelope proteins disclose variable polymorphisms, with the occurrence of a stop codon and/or frameshift for most – but not all – of them. Moreover, the sequence conservation analysis of the orthologous genes that can be found in primates shows that three env genes have been maintained in a fully coding state throughout evolution including envW and envFRD. Conclusion Altogether, the present study strongly suggests that some but not all envelope encoding sequences are bona fide genes. It also provides new tools to elucidate the possible role of endogenous envelope proteins as susceptibility factors in a number of pathologies where HERVs have been suspected to be involved.

  16. Zinc finger arrays binding human papillomavirus types 16 and 18 genomic DNA: precursors of gene-therapeutics for in-situ reversal of associated cervical neoplasia

    Directory of Open Access Journals (Sweden)

    Wayengera Misaki

    2012-07-01

    Full Text Available Abstract Background Human papillomavirus (HPV types 16 and 18 are the high-risk, sexually transmitted infectious causes of most cervical intraepithelial neoplasias (CIN or cancers. While efficacious vaccines to reduce the sexual acquisition of these high-risk HPVs have recently been introduced, no virus-targeted therapies exist for those already exposed and infected. Considering the oncogenic role of the transforming (E6 and E7 genes of high-risk HPVs in the slow pathogenesis of cervical cancer, we hypothesize that timely disruption or abolition of HPV genome expression within pre-cancerous lesions identified at screening may reverse neoplasia. We aimed to derive model zinc finger nucleases (ZFNs for mutagenesis of the genomes of two high-risk HPV (types 16 & 18. Methods and results Using ZiFiT software and the complete genomes of HPV types16 and 18, we computationally generated the consensus amino acid sequences of the DNA-binding domains (F1, F2, & F3 of (i 296 & 327 contextually unpaired (or single three zinc-finger arrays (sZFAs and (ii 9 & 13 contextually paired (left and right three- zinc-finger arrays (pZFAs that bind genomic DNA of HPV-types 16 and 18 respectively, inclusive of the E7 gene (s/pZFAHpV/E7. In the absence of contextually paired three-zinc-finger arrays (pZFAs that bind DNA corresponding to the genomic context of the E6 gene of either HPV type, we derived the DNA binding domains of another set of 9 & 14 contextually unpaired E6 gene-binding ZFAs (sZFAE6 to aid the future quest for paired ZFAs to target E6 gene sequences in both HPV types studied (pZFAE6. This paper presents models for (i synthesis of hybrid ZFNs that cleave within the genomic DNA of either HPV type, by linking the gene sequences of the DNA-cleavage domain of the FokI endonuclease FN to the gene sequences of a member of the paired-HPV-binding ZFAs (pZFAHpV/E7 + FN, and (ii delivery of the same into precancerous lesions using HPV-derived viral plasmids or

  17. Single nucleotide polymorphism array karyotyping: a diagnostic and prognostic tool in myelodysplastic syndromes with unsuccessful conventional cytogenetic testing.

    Science.gov (United States)

    Arenillas, Leonor; Mallo, Mar; Ramos, Fernando; Guinta, Kathryn; Barragán, Eva; Lumbreras, Eva; Larráyoz, María-José; De Paz, Raquel; Tormo, Mar; Abáigar, María; Pedro, Carme; Cervera, José; Such, Esperanza; José Calasanz, María; Díez-Campelo, María; Sanz, Guillermo F; Hernández, Jesús María; Luño, Elisa; Saumell, Sílvia; Maciejewski, Jaroslaw; Florensa, Lourdes; Solé, Francesc

    2013-12-01

    Cytogenetic aberrations identified by metaphase cytogenetics (MC) have diagnostic, prognostic, and therapeutic implications in myelodysplastic syndromes (MDS). However, in some MDS patients MC study is unsuccesful. Single nucleotide polymorphism array (SNP-A) based karyotyping could be helpful in these cases. We performed SNP-A in 62 samples from bone marrow or peripheral blood of primary MDS with an unsuccessful MC study. SNP-A analysis enabled the detection of aberrations in 31 (50%) patients. We used the copy number alteration information to apply the International Prognostic Scoring System (IPSS) and we observed differences in survival between the low/intermediate-1 and intermediate-2/high risk patients. We also saw differences in survival between very low/low/intermediate and the high/very high patients when we applied the revised IPSS (IPSS-R). In conclusion, SNP-A can be used successfully in PB samples and the identification of CNA by SNP-A improve the diagnostic and prognostic evaluation of this group of MDS patients. Copyright © 2013 Wiley Periodicals, Inc.

  18. Identification of copy number variants defining genomic differences among major human groups.

    Directory of Open Access Journals (Sweden)

    Lluís Armengol

    Full Text Available BACKGROUND: Understanding the genetic contribution to phenotype variation of human groups is necessary to elucidate differences in disease predisposition and response to pharmaceutical treatments in different human populations. METHODOLOGY/PRINCIPAL FINDINGS: We have investigated the genome-wide profile of structural variation on pooled samples from the three populations studied in the HapMap project by comparative genome hybridization (CGH in different array platforms. We have identified and experimentally validated 33 genomic loci that show significant copy number differences from one population to the other. Interestingly, we found an enrichment of genes related to environment adaptation (immune response, lipid metabolism and extracellular space within these regions and the study of expression data revealed that more than half of the copy number variants (CNVs translate into gene-expression differences among populations, suggesting that they could have functional consequences. In addition, the identification of single nucleotide polymorphisms (SNPs that are in linkage disequilibrium with the copy number alleles allowed us to detect evidences of population differentiation and recent selection at the nucleotide variation level. CONCLUSIONS: Overall, our results provide a comprehensive view of relevant copy number changes that might play a role in phenotypic differences among major human populations, and generate a list of interesting candidates for future studies.

  19. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset

    Directory of Open Access Journals (Sweden)

    Elena V. Ignatieva

    2014-03-01

    Full Text Available The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors, which are activated by olfactory stimuli (ligands. Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter (a region of DNA about 100–1000 base pairs long located upstream of the transcription start site. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.. In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  20. Arrays in Postnatal and Prenatal Diagnosis : An Exploration of the Ethics of Consent

    NARCIS (Netherlands)

    Dondorp, Wybo; Sikkema-Raddatz, Birgit; de Die-Smulders, Christine; de Wert, Guido

    The introduction of genome-wide arrays in postnatal and prenatal diagnosis raises challenging ethical issues. Here, we explore questions with regard to the ethics of consent. One important issue is whether informed consent for genome-wide array-based testing is in fact feasible, given the wide range

  1. Development of Chloroplast Genomic Resources in Chinese Yam (Dioscorea polystachya

    Directory of Open Access Journals (Sweden)

    Junling Cao

    2018-01-01

    Full Text Available Chinese yam has been used both as a food and in traditional herbal medicine. Developing more effective genetic markers in this species is necessary to assess its genetic diversity and perform cultivar identification. In this study, new chloroplast genomic resources were developed using whole chloroplast genomes from six genotypes originating from different geographical locations. The Dioscorea polystachya chloroplast genome is a circular molecule consisting of two single-copy regions separated by a pair of inverted repeats. Comparative analyses of six D. polystachya chloroplast genomes revealed 141 single nucleotide polymorphisms (SNPs. Seventy simple sequence repeats (SSRs were found in the six genotypes, including 24 polymorphic SSRs. Forty-three common indels and five small inversions were detected. Phylogenetic analysis based on the complete chloroplast genome provided the best resolution among the genotypes. Our evaluation of chloroplast genome resources among these genotypes led us to consider the complete chloroplast genome sequence of D. polystachya as a source of reliable and valuable molecular markers for revealing biogeographical structure and the extent of genetic variation in wild populations and for identifying different cultivars.

  2. Random amplified polymorphic DNA (RAPD) markers reveal genetic ...

    African Journals Online (AJOL)

    The present study evaluated genetic variability of superior bael genotypes collected from different parts of Andaman Islands, India using fruit characters and random amplified polymorphic DNA (RAPD) markers. Genomic DNA extracted from leaf material using cetyl trimethyl ammonium bromide (CTAB) method was ...

  3. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    International Nuclear Information System (INIS)

    Wang, Xuting; Tomso, Daniel J.; Liu Xuemei; Bell, Douglas A.

    2005-01-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; (3) prioritize the candidate polymorphic RE containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes

  4. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    Science.gov (United States)

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  5. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    Directory of Open Access Journals (Sweden)

    Fagen Li

    Full Text Available Dense genetic maps, along with quantitative trait loci (QTLs detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR, expressed sequence tag (EST derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS, and diversity arrays technology (DArT markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age and wood density (56 months were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  6. Association of MTHFR polymorphisms with nsCL/P in Chinese ...

    African Journals Online (AJOL)

    Xianrong Xu

    2016-04-26

    Apr 26, 2016 ... Aim: In this study, we aim to investigate the association between the polymorphism in MTHFR .... DNA extraction, library preparation, and sequencing. Genomic ..... comparative study in Mexican, West African, and European.

  7. [Genome similarity of Baikal omul and sig].

    Science.gov (United States)

    Bychenko, O S; Sukhanova, L V; Ukolova, S S; Skvortsov, T A; Potapov, V K; Azhikina, T L; Sverdlov, E D

    2009-01-01

    Two members of the Baikal sig family, a lake sig (Coregonus lavaretus baicalensis Dybovsky) and omul (C. autumnalis migratorius Georgi), are close relatives that diverged from the same ancestor 10-20 thousand years ago. In this work, we studied genomic polymorphism of these two fish species. The method of subtraction hybridization (SH) did not reveal the presence of extended sequences in the sig genome and their absence in the omul genome. All the fragments found by SH corresponded to polymorphous noncoding genome regions varying in mononucleotide substitutions and short deletions. Many of them are mapped close to genes of the immune system and have regions identical to the Tc-1-like transposons abundant among fish, whose transcription activity may affect the expression of adjacent genes. Thus, we showed for the first time that genetic differences between Baikal sig family members are extremely small and cannot be revealed by the SH method. This is another endorsement of the hypothesis on the close relationship between Baikal sig and omul and their evolutionarily recent divergence from a common ancestor.

  8. Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.

    Science.gov (United States)

    Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima

    2017-10-16

    Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  9. Report for Development of a Census Array and Evaluation of the Array to Detect Biothreat Agents and Environmental Samples for DHS

    Energy Technology Data Exchange (ETDEWEB)

    Jaing, C; Jackson, P

    2011-04-14

    The objective of this project is to provide DHS a comprehensive evaluation of the current genomic technologies including genotyping, Taqman PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. This report focuses on the design, testing and results of samples on the Census Array. We designed a Census/Detection Array to detect all sequenced viruses (including phage), bacteria (eubacteria), and plasmids. Family-specific probes were selected for all sequenced viral and bacterial complete genomes, segments, and plasmids. Probes were designed to tolerate some sequence variation to enable detection of divergent species with homology to sequenced organisms, and to be unique relative to the human genome. A combination of 'detection' probes with high levels of conservation within a family plus 'census' probes targeting strain/isolate specific regions enabled detection and taxonomic classification from the level of family down to the strain. The array has wider coverage of bacterial and viral targets based on more recent sequence data and more probes per target than other microbial detection/discovery arrays in the literature. We tested the array with purified bacterial and viral DNA/RNA samples, artificial mixes of known bacterial/viral samples, spiked DNA against complex background including BW aerosol samples and soil samples, and environmental samples to evaluate the array's sensitivity and forensic capability. The data were analyzed using our novel maximum likelihood software. For most of the organisms tested, we have achieved at least species level discrimination.

  10. High-density rhesus macaque oligonucleotide microarray design using early-stage rhesus genome sequence information and human genome annotations

    Directory of Open Access Journals (Sweden)

    Magness Charles L

    2007-01-01

    Full Text Available Abstract Background Until recently, few genomic reagents specific for non-human primate research have been available. To address this need, we have constructed a macaque-specific high-density oligonucleotide microarray by using highly fragmented low-pass sequence contigs from the rhesus genome project together with the detailed sequence and exon structure of the human genome. Using this method, we designed oligonucleotide probes to over 17,000 distinct rhesus/human gene orthologs and increased by four-fold the number of available genes relative to our first-generation expressed sequence tag (EST-derived array. Results We constructed a database containing 248,000 exon sequences from 23,000 human RefSeq genes and compared each human exon with its best matching sequence in the January 2005 version of the rhesus genome project list of 486,000 DNA contigs. Best matching rhesus exon sequences for each of the 23,000 human genes were then concatenated in the proper order and orientation to produce a rhesus "virtual transcriptome." Microarray probes were designed, one per gene, to the region closest to the 3' untranslated region (UTR of each rhesus virtual transcript. Each probe was compared to a composite rhesus/human transcript database to test for cross-hybridization potential yielding a final probe set representing 18,296 rhesus/human gene orthologs, including transcript variants, and over 17,000 distinct genes. We hybridized mRNA from rhesus brain and spleen to both the EST- and genome-derived microarrays. Besides four-fold greater gene coverage, the genome-derived array also showed greater mean signal intensities for genes present on both arrays. Genome-derived probes showed 99.4% identity when compared to 4,767 rhesus GenBank sequence tag site (STS sequences indicating that early stage low-pass versions of complex genomes are of sufficient quality to yield valuable functional genomic information when combined with finished genome information from

  11. Genome-wide association study identifies polymorphisms associated with the analgesic effect of fentanyl in the preoperative cold pressor-induced pain test

    Directory of Open Access Journals (Sweden)

    Kaori Takahashi

    2018-03-01

    Full Text Available Opioid analgesics are widely used for the treatment of moderate to severe pain. The analgesic effects of opioids are well known to vary among individuals. The present study focused on the genetic factors that are associated with interindividual differences in pain and opioid sensitivity. We conducted a multistage genome-wide association study in subjects who were scheduled to undergo mandibular sagittal split ramus osteotomy and were not medicated until they received fentanyl for the induction of anesthesia. We preoperatively conducted the cold pressor-induced pain test before and after fentanyl administration. The rs13093031 and rs12633508 single-nucleotide polymorphisms (SNPs near the LOC728432 gene region and rs6961071 SNP in the tcag7.1213 gene region were significantly associated with the analgesic effect of fentanyl, based on differences in pain perception latency before and after fentanyl administration. The associations of these three SNPs that were identified in our exploratory study have not been previously reported. The two polymorphic loci (rs13093031 and rs12633508 were shown to be in strong linkage disequilibrium. Subjects with the G/G genotype of the rs13093031 and rs6961071 SNPs presented lower fentanyl-induced analgesia. Our findings provide a basis for investigating genetics-based analgesic sensitivity and personalized pain control. Keywords: Opioid sensitivity, Analgesia, Fentanyl, Polymorphism, GWAS

  12. a potential source of spurious associations in genome-wide ...

    Indian Academy of Sciences (India)

    2010-04-01

    Apr 1, 2010 ... Genome-wide association studies (GWAS) examine the entire human genome with the goal of identifying genetic variants. (usually single nucleotide polymorphisms (SNPs)) that are associated with phenotypic traits such as disease status and drug response. The discordance of significantly associated ...

  13. Identification and Evaluation of Single-Nucleotide Polymorphisms in Allotetraploid Peanut (Arachis hypogaea L.) Based on Amplicon Sequencing Combined with High Resolution Melting (HRM) Analysis.

    Science.gov (United States)

    Hong, Yanbin; Pandey, Manish K; Liu, Ying; Chen, Xiaoping; Liu, Hong; Varshney, Rajeev K; Liang, Xuanqiang; Huang, Shangzhi

    2015-01-01

    The cultivated peanut (Arachis hypogaea L.) is an allotetraploid (AABB) species derived from the A-genome (Arachis duranensis) and B-genome (Arachis ipaensis) progenitors. Presence of two versions of a DNA sequence based on the two progenitor genomes poses a serious technical and analytical problem during single nucleotide polymorphism (SNP) marker identification and analysis. In this context, we have analyzed 200 amplicons derived from expressed sequence tags (ESTs) and genome survey sequences (GSS) to identify SNPs in a panel of genotypes consisting of 12 cultivated peanut varieties and two diploid progenitors representing the ancestral genomes. A total of 18 EST-SNPs and 44 genomic-SNPs were identified in 12 peanut varieties by aligning the sequence of A. hypogaea with diploid progenitors. The average frequency of sequence polymorphism was higher for genomic-SNPs than the EST-SNPs with one genomic-SNP every 1011 bp as compared to one EST-SNP every 2557 bp. In order to estimate the potential and further applicability of these identified SNPs, 96 peanut varieties were genotyped using high resolution melting (HRM) method. Polymorphism information content (PIC) values for EST-SNPs ranged between 0.021 and 0.413 with a mean of 0.172 in the set of peanut varieties, while genomic-SNPs ranged between 0.080 and 0.478 with a mean of 0.249. Total 33 SNPs were used for polymorphism detection among the parents and 10 selected lines from mapping population Y13Zh (Zhenzhuhei × Yueyou13). Of the total 33 SNPs, nine SNPs showed polymorphism in the mapping population Y13Zh, and seven SNPs were successfully mapped into five linkage groups. Our results showed that SNPs can be identified in allotetraploid peanut with high accuracy through amplicon sequencing and HRM assay. The identified SNPs were very informative and can be used for different genetic and breeding applications in peanut.

  14. Validation of the high-throughput marker technology DArT using the model plant Arabidopsis thaliana.

    Science.gov (United States)

    Wittenberg, Alexander H J; van der Lee, Theo; Cayla, Cyril; Kilian, Andrzej; Visser, Richard G F; Schouten, Henk J

    2005-08-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds of restriction site based polymorphisms between genotypes and does not require DNA sequence information or site-specific oligonucleotides. This paper demonstrates the potential of DArT for genetic mapping by validating the quality and molecular basis of the markers, using the model plant Arabidopsis thaliana. Restriction fragments from a genomic representation of the ecotype Landsberg erecta (Ler) were amplified by PCR, individualized by cloning and spotted onto glass slides. The arrays were then hybridized with labeled genomic representations of the ecotypes Columbia (Col) and Ler and of individuals from an F(2) population obtained from a Col x Ler cross. The scoring of markers with specialized software was highly reproducible and 107 markers could unambiguously be ordered on a genetic linkage map. The marker order on the genetic linkage map coincided with the order on the DNA sequence map. Sequencing of the Ler markers and alignment with the available Col genome sequence confirmed that the polymorphism in DArT markers is largely a result of restriction site polymorphisms.

  15. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

    Science.gov (United States)

    Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

    2014-07-01

    Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  16. Clinical Utility of Array Comparative Genomic Hybridization for Detection of Chromosomal Abnormalities in Pediatric Acute Lymphoblastic Leukemia

    Science.gov (United States)

    Rabin, Karen R.; Man, Tsz-Kwong; Yu, Alexander; Folsom, Matthew R.; Zhao, Yi-Jue; Rao, Pulivarthi H.; Plon, Sharon E.; Naeem, Rizwan C.

    2014-01-01

    Background Accurate detection of recurrent chromosomal abnormalities is critical to assign patients to risk-based therapeutic regimens for pediatric acute lymphoblastic leukemia (ALL). Procedure We investigated the utility of array comparative genomic hybridization (aCGH) for detection of chromosomal abnormalities compared to standard clinical evaluation with karyotype and fluorescent in-situ hybridization (FISH). Fifty pediatric ALL diagnostic bone marrows were analyzed by bacterial artificial chromosome (BAC) array, and findings compared to standard clinical evaluation. Results Sensitivity of aCGH was 79% to detect karyotypic findings other than balanced translocations, which cannot be detected by aCGH because they involve no copy number change. aCGH also missed abnormalities occurring in subclones constituting less than 25% of cells. aCGH detected 44 additional abnormalities undetected or misidentified by karyotype, 21 subsequently validated by FISH, including abnormalities in 4 of 10 cases with uninformative cytogenetics. aCGH detected concurrent terminal deletions of both 9p and 20q in three cases, in two of which the 20q deletion was undetected by karyotype. A narrow region of loss at 7p21 was detected in two cases. Conclusions An array with increased BAC density over regions important in ALL, combined with PCR for fusion products of balanced translocations, could minimize labor- and time-intensive cytogenetic assays and provide key prognostic information in the approximately 35% of cases with uninformative cytogenetics. PMID:18253961

  17. Whole-genome analysis of herbicide-tolerant mutant rice generated by Agrobacterium-mediated gene targeting.

    Science.gov (United States)

    Endo, Masaki; Kumagai, Masahiko; Motoyama, Ritsuko; Sasaki-Yamagata, Harumi; Mori-Hosokawa, Satomi; Hamada, Masao; Kanamori, Hiroyuki; Nagamura, Yoshiaki; Katayose, Yuichi; Itoh, Takeshi; Toki, Seiichi

    2015-01-01

    Gene targeting (GT) is a technique used to modify endogenous genes in target genomes precisely via homologous recombination (HR). Although GT plants are produced using genetic transformation techniques, if the difference between the endogenous and the modified gene is limited to point mutations, GT crops can be considered equivalent to non-genetically modified mutant crops generated by conventional mutagenesis techniques. However, it is difficult to guarantee the non-incorporation of DNA fragments from Agrobacterium in GT plants created by Agrobacterium-mediated GT despite screening with conventional Southern blot and/or PCR techniques. Here, we report a comprehensive analysis of herbicide-tolerant rice plants generated by inducing point mutations in the rice ALS gene via Agrobacterium-mediated GT. We performed genome comparative genomic hybridization (CGH) array analysis and whole-genome sequencing to evaluate the molecular composition of GT rice plants. Thus far, no integration of Agrobacterium-derived DNA fragments has been detected in GT rice plants. However, >1,000 single nucleotide polymorphisms (SNPs) and insertion/deletion (InDels) were found in GT plants. Among these mutations, 20-100 variants might have some effect on expression levels and/or protein function. Information about additive mutations should be useful in clearing out unwanted mutations by backcrossing. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  18. Susceptibility to Chronic Mucus Hypersecretion, a Genome Wide Association Study

    DEFF Research Database (Denmark)

    Dijkstra, Akkelies E; Smolonska, Joanna; van den Berge, Maarten

    2014-01-01

    by replication and meta-analysis in 11 additional cohorts. In total 2,704 subjects with, and 7,624 subjects without CMH were included, all current or former heavy smokers (≥20 pack-years). Additional studies were performed to test the functional relevance of the most significant single nucleotide polymorphism...... (SNP). RESULTS: A strong association with CMH, consistent across all cohorts, was observed with rs6577641 (p = 4.25×10(-6), OR = 1.17), located in intron 9 of the special AT-rich sequence-binding protein 1 locus (SATB1) on chromosome 3. The risk allele (G) was associated with higher mRNA expression...... of smokers develops CMH. A plausible explanation for this phenomenon is a predisposing genetic constitution. Therefore, we performed a genome wide association (GWA) study of CMH in Caucasian populations. METHODS: GWA analysis was performed in the NELSON-study using the Illumina 610 array, followed...

  19. ArrayExpress update--trends in database growth and links to data analysis tools.

    Science.gov (United States)

    Rustici, Gabriella; Kolesnikov, Nikolay; Brandizi, Marco; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Ison, Jon; Keays, Maria; Kurbatova, Natalja; Malone, James; Mani, Roby; Mupo, Annalisa; Pedro Pereira, Rui; Pilicheva, Ekaterina; Rung, Johan; Sharma, Anjan; Tang, Y Amy; Ternent, Tobias; Tikhonov, Andrew; Welter, Danielle; Williams, Eleanor; Brazma, Alvis; Parkinson, Helen; Sarkans, Ugis

    2013-01-01

    The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.

  20. [Application of single nucleotide polymorphism-microarray and target gene sequencing in the study of genetic etiology of children with unexplained intellectual disability or developmental delay].

    Science.gov (United States)

    Gao, Z J; Jiang, Q; Cheng, D Z; Yan, X X; Chen, Q; Xu, K M

    2016-10-02

    Objective: To evaluate the application of single nucleotide polymorphism (SNP)-microarray and target gene sequencing technology in the clinical molecular genetic diagnosis of unexplained intellectual disability(ID) or developmental delay (DD). Method: Patients with ID or DD were recruited in the Department of Neurology, Affiliated Children's Hospital of Capital Institute of Pediatrics between September 2015 and February 2016. The intellectual assessment of the patients was performed using 0-6-year-old pediatric examination table of neuropsychological development or Wechsler intelligence scale (>6 years). Patients with a DQ less than 49 or IQ less than 51 were included in this study. The patients were scanned by SNP-array for detection of genomic copy number variations (CNV), and the revealed genomic imbalance was confirmed by quantitative real time-PCR. Candidate gene mutation screening was carried out by target gene sequencing technology.Causal mutations or likely pathogenic variants were verified by polymerase chain reaction and direct sequencing. Result: There were 15 children with ID or DD enrolled, 9 males and 6 females. The age of these patients was 7 months-16 years and 9 months. SNP-array revealed that two of the 15 patients had genomic CNV. Both CNV were de novo micro deletions, one involved 11q24.1q25 and the other micro deletion located on 21q22.2q22.3. Both micro deletions were proved to have a clinical significance due to their association with ID, brain DD, unusual faces etc. by querying Decipher database. Thirteen patients with negative findings in SNP-array were consequently examined with target gene sequencing technology, genotype-phenotype correlation analysis and genetic analysis. Five patients were diagnosed with monogenic disorder, two were diagnosed with suspected genetic disorder and six were still negative. Conclusion: Sequential use of SNP-array and target gene sequencing technology can significantly increase the molecular genetic etiologic

  1. Evidence for single nucleotide polymorphisms and their association with bipolar disorder

    Directory of Open Access Journals (Sweden)

    Szczepankiewicz A

    2013-10-01

    Full Text Available Aleksandra Szczepankiewicz1,21Laboratory of Molecular and Cell Biology, 2Department of Psychiatric Genetics, Poznan University of Medical Sciences, Poznan, PolandAbstract: Bipolar disorder (BD is a complex disorder with a number of susceptibility genes and environmental risk factors involved in its pathogenesis. In recent years, huge progress has been made in molecular techniques for genetic studies, which have enabled identification of numerous genomic regions and genetic variants implicated in BD across populations. Despite the abundance of genetic findings, the results have often been inconsistent and not replicated for many candidate genes/single nucleotide polymorphisms (SNPs. Therefore, the aim of the review presented here is to summarize the most important data reported so far in candidate gene and genome-wide association studies. Taking into account the abundance of association data, this review focuses on the most extensively studied genes and polymorphisms reported so far for BD to present the most promising genomic regions/SNPs involved in BD. The review of association data reveals evidence for several genes (SLC6A4/5-HTT [serotonin transporter gene], BDNF [brain-derived neurotrophic factor], DAOA [D-amino acid oxidase activator], DTNBP1 [dysbindin], NRG1 [neuregulin 1], DISC1 [disrupted in schizophrenia 1] to be crucial candidates in BD, whereas numerous genome-wide association studies conducted in BD indicate polymorphisms in two genes (CACNA1C [calcium channel, voltage-dependent, L type, alpha 1C subunit], ANK3 [ankyrin 3] replicated for association with BD in most of these studies. Nevertheless, further studies focusing on interactions between multiple candidate genes/SNPs, as well as systems biology and pathway analyses are necessary to integrate and improve the way we analyze the currently available association data.Keywords: candidate gene, genome-wide association study, SLC6A4, BDNF, DAOA, DTNBP1, NRG1, DISC1

  2. Characterization and compilation of polymorphic simple sequence repeat (SSR markers of peanut from public database

    Directory of Open Access Journals (Sweden)

    Zhao Yongli

    2012-07-01

    Full Text Available Abstract Background There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L. genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of lack of uniformity in the labeling of these markers across the publications, there is some confusion on the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings We compiled 1,343 SSR markers as detecting polymorphism (14.5% within a total of 9,274 markers. Amongst all polymorphic SSRs examined, we found that AG motif (36.5% was the most abundant followed by AAG (12.1%, AAT (10.9%, and AT (10.3%.The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation of the length of SSR and the frequency of polymorphism revealed that the frequency of polymorphism was decreased as motif repeat number increased. Conclusions The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, which could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement and thus would be of value to breeders.

  3. Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.

    Science.gov (United States)

    Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E

    2016-11-18

    Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms including HLA-DQA1 and C6orf10, were clustered within the Major Histo-compatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10 -8 ). Technical validation of top ranking non-Major Histo-compatablity complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10 -4 . Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism rs9644119 downstream of DPYSL2 showed some evidence of association supported by finding in the replication cohort that warrants further study. Pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histo-compatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.

  4. Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

    Directory of Open Access Journals (Sweden)

    2005-10-01

    Full Text Available With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb in length, 75% were flanked on one or both sides by (often unrelated segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85% semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13% regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22, 13 kb (at 7q11, and 1 kb (at 16q24 fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

  5. Determination of the frequency of polymorphisms in genes related to the genome stability maintenance of the population residing at Monte Alegre, PA (Brazil) municipality

    International Nuclear Information System (INIS)

    Hozumi, Cristiny Gomes

    2010-01-01

    The human exposure to ionizing radiation coming from natural sources is an inherent feature of human life on earth, for man and all living things have always been exposed to these sources. Ionizing radiation is a known genotoxic agent which can affect the genomic stability and genes related to DNA repair may play a role when they have committed certain polymorphism. This study aimed to analyze the frequency of polymorphisms (SNPs) in genes of DNA repair and cell cycle control: hOGG1 (Ser326Cys), XRCC3 (Thr241 Met) and p53 (Arg72Pro) in saliva samples from a population located Monte Alegre, state of Para were collected in August 2008 and 40 samples of men and 46 samples of women, adding a total of 86 samples. By RFLP was determined the frequency of homozygous genotypes and / or heterozygous for polymorphic genes. The I)OGG1 gene was 5% of the allele 326Cys, XRCC3 gene found about 21 % of the allele 241 Met and p53 gene showed 40.8% of the 72Pro allele. And the genotype frequencies of individuals for the three genes were 91.04%, 88.06% and 59.7% for homozygous wild genotype, 5.97%, 11.94% and 22.39% for heterozygote genotype and 2,99%, zero and 17:91% for homozygous polymorphic hOGG1 genes respectively, XRCC3, p53. These values are similar to those found in previous studies. The influence of these polymorphisms, which are involved in DNA repair and consequent genotoxicity induced by radiation depends on dose and exposure factors such as smoking, which is statistically a factor in public health surveillance in the region. This study gathered information and molecular epidemiology in Monte Alegre, that help to characterization of local population. (author)

  6. Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

    Science.gov (United States)

    Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

    2009-06-01

    Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.

  7. CRISPR Diversity and Microevolution in Clostridium difficile

    Science.gov (United States)

    Andersen, Joakim M.; Shoup, Madelyn; Robinson, Cathy; Britton, Robert; Olsen, Katharina E.P.; Barrangou, Rodolphe

    2016-01-01

    Abstract Virulent strains of Clostridium difficile have become a global health problem associated with morbidity and mortality. Traditional typing methods do not provide ideal resolution to track outbreak strains, ascertain genetic diversity between isolates, or monitor the phylogeny of this species on a global basis. Here, we investigate the occurrence and diversity of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) in C. difficile to assess the potential of CRISPR-based phylogeny and high-resolution genotyping. A single Type-IB CRISPR-Cas system was identified in 217 analyzed genomes with cas gene clusters present at conserved chromosomal locations, suggesting vertical evolution of the system, assessing a total of 1,865 CRISPR arrays. The CRISPR arrays, markedly enriched (8.5 arrays/genome) compared with other species, occur both at conserved and variable locations across strains, and thus provide a basis for typing based on locus occurrence and spacer polymorphism. Clustering of strains by array composition correlated with sequence type (ST) analysis. Spacer content and polymorphism within conserved CRISPR arrays revealed phylogenetic relationship across clades and within ST. Spacer polymorphisms of conserved arrays were instrumental for differentiating closely related strains, e.g., ST1/RT027/B1 strains and pathogenicity locus encoding ST3/RT001 strains. CRISPR spacers showed sequence similarity to phage sequences, which is consistent with the native role of CRISPR-Cas as adaptive immune systems in bacteria. Overall, CRISPR-Cas sequences constitute a valuable basis for genotyping of C. difficile isolates, provide insights into the micro-evolutionary events that occur between closely related strains, and reflect the evolutionary trajectory of these genomes. PMID:27576538

  8. Preimplantation genetic diagnosis and screening by array comparative genomic hybridisation: experience of more than 100 cases in a single centre.

    Science.gov (United States)

    Chow, J Fc; Yeung, W Sb; Lee, V Cy; Lau, E Yl; Ho, P C; Ng, E Hy

    2017-04-01

    Preimplantation genetic screening has been proposed to improve the in-vitro fertilisation outcome by screening for aneuploid embryos or blastocysts. This study aimed to report the outcome of 133 cycles of preimplantation genetic diagnosis and screening by array comparative genomic hybridisation. This study of case series was conducted in a tertiary assisted reproductive centre in Hong Kong. Patients who underwent preimplantation genetic diagnosis for chromosomal abnormalities or preimplantation genetic screening between 1 April 2012 and 30 June 2015 were included. They underwent in-vitro fertilisation and intracytoplasmic sperm injection. An embryo biopsy was performed on day-3 embryos and the blastomere was subject to array comparative genomic hybridisation. Embryos with normal copy numbers were replaced. The ongoing pregnancy rate, implantation rate, and miscarriage rate were studied. During the study period, 133 cycles of preimplantation genetic diagnosis for chromosomal abnormalities or preimplantation genetic screening were initiated in 94 patients. Overall, 112 cycles proceeded to embryo biopsy and 65 cycles had embryo transfer. The ongoing pregnancy rate per transfer cycle after preimplantation genetic screening was 50.0% and that after preimplantation genetic diagnosis was 34.9%. The implantation rates after preimplantation genetic screening and diagnosis were 45.7% and 41.1%, respectively and the miscarriage rates were 8.3% and 28.6%, respectively. There were 26 frozen-thawed embryo transfer cycles, in which vitrified and biopsied genetically transferrable embryos were replaced, resulting in an ongoing pregnancy rate of 36.4% in the screening group and 60.0% in the diagnosis group. The clinical outcomes of preimplantation genetic diagnosis and screening using comparative genomic hybridisation in our unit were comparable to those reported internationally. Genetically transferrable embryos replaced in a natural cycle may improve the ongoing pregnancy rate

  9. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

    Science.gov (United States)

    Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

    2012-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

  10. Genome analysis and DNA marker-based characterisation of pathogenic trypanosomes

    NARCIS (Netherlands)

    Agbo, Edwin Chukwura

    2003-01-01

    The advances in genomics technologies and genome analysis methods that offer new leads for accelerating discovery of putative targets for developing overall control tools are reviewed in Chapter 1. In Chapter 2, a PCR typing method based on restriction fragment length polymorphism analysis of the

  11. Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

    DEFF Research Database (Denmark)

    Zhan, Bujie; Fadista, João; Thomsen, Bo

    2011-01-01

    Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome...... of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation...

  12. DNA polymorphisms in the Sahiwal breed of Zebu cattle revealed by synthetic oligonucleotide probes

    International Nuclear Information System (INIS)

    Shashikanth; Yadav, B.R.

    2005-01-01

    Genomic DNA of 15 randomly selected unrelated animals and from two sire families (11 animals) of the Sahiwal breed of Zebu cattle were investigated. Four oligonucleotide probes - (GTG) 5 , (TCC) 5 , (GT) 8 and (GT) 12 - were used on genomic DNA digested with restriction enzymes AluI, HinfI, MboI, EcoRI and HaeIII in different combinations. All four probes produced multiloci fingerprints with differing levels of polymorphisms. Total bands and shared bands in the fingerprints of each individual were in the range of 2.5 to 23.0 KB. Band number ranged from 9 to 17, with 0.48 average band sharing. Probes (GT) 8 , (GT) 12 and (TCC) 5 produced fingerprinting patterns of medium to low polymorphism, whereas probe (GTG) 5 produced highly polymorphic patterns. Probe (GTG) 5 in combination with the HaeIII enzyme was highly polymorphic with a heterozygosity level of 0.85, followed by (GT) 8 , (TCC) 5 and (GT) 12 with heterozygosity levels of 0.70, 0.65 and 0.30, respectively. Probe GTG 5 or its complementary sequence CAC 5 produced highly polymorphic fingerprints, indicating that the probe can be used for analysing population structure, parentage verification and identifying loci controlling quantitative traits and fertility status. (author)

  13. Transcription Factor KLF5 Binds a Cyclin E1 Polymorphic Intronic Enhancer to Confer Increased Bladder Cancer Risk

    Science.gov (United States)

    Pattison, Jillian M.; Posternak, Valeriya; Cole, Michael D.

    2016-01-01

    It is well established that environmental toxins, such as exposure to arsenic, are risk factors in the development of urinary bladder cancer, yet recent genome-wide association studies (GWAS) provide compelling evidence that there is a strong genetic component associated with disease predisposition. A single nucleotide polymorphism (SNP), rs8102137, was identified on chromosome 19q12, residing 6 kb upstream of the important cell cycle regulator and proto-oncogene, Cyclin E1 (CCNE1). However, the functional role of this variant in bladder cancer predisposition has been unclear since it lies within a non-coding region of the genome. Here, it is demonstrated that bladder cancer cells heterozygous for this SNP exhibit biased allelic expression of CCNE1 with 1.5-fold more transcription occurring from the risk allele. Furthermore, using chromatin immunoprecipitation assays, a novel enhancer element was identified within the first intron of CCNE1 that binds Kruppel-like Factor 5 (KLF5), a known transcriptional activator in bladder cancer. Moreover, the data reveal that the presence of rs200996365, a SNP in high linkage disequilibrium with rs8102137 residing in the center of a KLF5 motif, alters KLF5 binding to this genomic region. Through luciferase assays and CRISPR-Cas9 genome editing, a novel polymorphic intronic regulatory element controlling CCNE1 transcription is characterized. These studies uncover how a cancer-associated polymorphism mechanistically contributes to an increased predisposition for bladder cancer development. Implications A polymorphic KLF5 binding site near the CCNE1 gene explains genetic risk identified through genome wide association studies. PMID:27514407

  14. Intragenomic polymorphisms among high-copy loci: a genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae).

    Science.gov (United States)

    Weitemier, Kevin; Straub, Shannon C K; Fishbein, Mark; Liston, Aaron

    2015-01-01

    Despite knowledge that concerted evolution of high-copy loci is often imperfect, studies that investigate the extent of intragenomic polymorphisms and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus and at each position reads differing from the consensus were tallied using a custom perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the "noncoding" ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).

  15. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    Science.gov (United States)

    Pajuelo, Mónica J; Eguiluz, María; Dahlstrom, Eric; Requena, David; Guzmán, Frank; Ramirez, Manuel; Sheen, Patricia; Frace, Michael; Sammons, Scott; Cama, Vitaliano; Anzick, Sarah; Bruno, Dan; Mahanty, Siddhartha; Wilkins, Patricia; Nash, Theodore; Gonzalez, Armando; García, Héctor H; Gilman, Robert H; Porcella, Steve; Zimic, Mirko

    2015-12-01

    Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. The availability of draft genomes for T. solium represents a significant step

  16. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    Directory of Open Access Journals (Sweden)

    Mónica J Pajuelo

    2015-12-01

    Full Text Available Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen.For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples.The predicted size of the hybrid (proglottid genome combined with cyst genome T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites.The availability of draft genomes for T. solium represents a

  17. The isolation and localization of arbitrary restriction fragment length polymorphisms in Southern African populations

    International Nuclear Information System (INIS)

    Conn, V.

    1987-01-01

    The main aim of this study was to contribute to the mapping of the human genome by searching for and characterizing a number of RFLPs (restriction fragment length polymorphisms) in the human genome. The more specific aims of this study were: 1. To isolate single-copy human DNA sequences from a human genomic library. 2. To use these single-copy sequences as DNA probes to search for polymorphic variation among Caucasoid individuals. 3. To show by means of family studies that the RFLPs were inherited in a co-dominant Mendelian fashion. 4. To determine the population frequencies of these RFLPs in Southern African Populations, namely the Bantu-speaking Negroids and the San. 5. To assign these RFLP-detecting DNA sequences to human chromosomes using somatic cell hybrid lines. In this study DNA was labelled with Phosphorus 32

  18. Arraying proteins by cell-free synthesis.

    Science.gov (United States)

    He, Mingyue; Wang, Ming-Wei

    2007-10-01

    Recent advances in life science have led to great motivation for the development of protein arrays to study functions of genome-encoded proteins. While traditional cell-based methods have been commonly used for generating protein arrays, they are usually a time-consuming process with a number of technical challenges. Cell-free protein synthesis offers an attractive system for making protein arrays, not only does it rapidly converts the genetic information into functional proteins without the need for DNA cloning, but also presents a flexible environment amenable to production of folded proteins or proteins with defined modifications. Recent advancements have made it possible to rapidly generate protein arrays from PCR DNA templates through parallel on-chip protein synthesis. This article reviews current cell-free protein array technologies and their proteomic applications.

  19. Approximation to the distribution of fitness effects across functional categories in human segregating polymorphisms.

    Directory of Open Access Journals (Sweden)

    Fernando Racimo

    2014-11-01

    Full Text Available Quantifying the proportion of polymorphic mutations that are deleterious or neutral is of fundamental importance to our understanding of evolution, disease genetics and the maintenance of variation genome-wide. Here, we develop an approximation to the distribution of fitness effects (DFE of segregating single-nucleotide mutations in humans. Unlike previous methods, we do not assume that synonymous mutations are neutral or not strongly selected, and we do not rely on fitting the DFE of all new nonsynonymous mutations to a single probability distribution, which is poorly motivated on a biological level. We rely on a previously developed method that utilizes a variety of published annotations (including conservation scores, protein deleteriousness estimates and regulatory data to score all mutations in the human genome based on how likely they are to be affected by negative selection, controlling for mutation rate. We map this and other conservation scores to a scale of fitness coefficients via maximum likelihood using diffusion theory and a Poisson random field model on SNP data. Our method serves to approximate the deleterious DFE of mutations that are segregating, regardless of their genomic consequence. We can then compare the proportion of mutations that are negatively selected or neutral across various categories, including different types of regulatory sites. We observe that the distribution of intergenic polymorphisms is highly peaked at neutrality, while the distribution of nonsynonymous polymorphisms has a second peak at [Formula: see text]. Other types of polymorphisms have shapes that fall roughly in between these two. We find that transcriptional start sites, strong CTCF-enriched elements and enhancers are the regulatory categories with the largest proportion of deleterious polymorphisms.

  20. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    Science.gov (United States)

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first

  1. In-silico single nucleotide polymorphisms (SNP) mining of Sorghum ...

    African Journals Online (AJOL)

    Single nucleotide polymorphisms (SNPs) may be considered the ultimate genetic markers as they represent the finest resolution of a DNA sequence (a single nucleotide), and are generally abundant in populations with a low mutation rate. SNPs are important tools in studying complex genetic traits and genome evolution.

  2. Combined amplification and hybridization techniques for genome scanning in vegetatively propagated crops

    International Nuclear Information System (INIS)

    Kahl, G.; Ramser, J.; Terauchi, R.; Lopez-Peralta, C.; Asemota, H.N.; Weising, K.

    1998-01-01

    A combination of PCR- and hybridization-based genome scanning techniques and sequence comparisons between non-coding chloroplast DNA flanking tRNA genes has been employed to screen Dioscorea species for intra- and interspecific genetic diversity. This methodology detected extensive polymorphisms within Dioscorea bulbifera L., and revealed taxonomic and phylogenetic relationships among cultivated Guinea yams varieties and their potential wild progenitors. Finally, screening of yam germplasm grown in Jamaica permitted reliable discrimination between all major cultivars. Genome scanning by micro satellite-primed PCR (MP-PCR) and random amplified polymorphic DNA (RAPD) analysis in combination with the novel random amplified micro satellite polymorphisms (RAMPO) hybridization technique has shown high potential for the genetic analysis of yams, and holds promise for other vegetatively propagated orphan crops. (author)

  3. Isolation of human simple repeat loci by hybridization selection.

    Science.gov (United States)

    Armour, J A; Neumann, R; Gobert, S; Jeffreys, A J

    1994-04-01

    We have isolated short tandem repeat arrays from the human genome, using a rapid method involving filter hybridization to enrich for tri- or tetranucleotide tandem repeats. About 30% of clones from the enriched library cross-hybridize with probes containing trimeric or tetrameric tandem arrays, facilitating the rapid isolation of large numbers of clones. In an initial analysis of 54 clones, 46 different tandem arrays were identified. Analysis of these tandem repeat loci by PCR showed that 24 were polymorphic in length; substantially higher levels of polymorphism were displayed by the tetrameric repeat loci isolated than by the trimeric repeats. Primary mapping of these loci by linkage analysis showed that they derive from 17 chromosomes, including the X chromosome. We anticipate the use of this strategy for the efficient isolation of tandem repeats from other sources of genomic DNA, including DNA from flow-sorted chromosomes, and from other species.

  4. Array-based assay detects genome-wide 5-mC and 5-hmC in the brains of humans, non-human primates, and mice.

    Science.gov (United States)

    Chopra, Pankaj; Papale, Ligia A; White, Andrew T J; Hatch, Andrea; Brown, Ryan M; Garthwaite, Mark A; Roseboom, Patrick H; Golos, Thaddeus G; Warren, Stephen T; Alisch, Reid S

    2014-02-13

    Methylation on the fifth position of cytosine (5-mC) is an essential epigenetic mark that is linked to both normal neurodevelopment and neurological diseases. The recent identification of another modified form of cytosine, 5-hydroxymethylcytosine (5-hmC), in both stem cells and post-mitotic neurons, raises new questions as to the role of this base in mediating epigenetic effects. Genomic studies of these marks using model systems are limited, particularly with array-based tools, because the standard method of detecting DNA methylation cannot distinguish between 5-mC and 5-hmC and most methods have been developed to only survey the human genome. We show that non-human data generated using the optimization of a widely used human DNA methylation array, designed only to detect 5-mC, reproducibly distinguishes tissue types within and between chimpanzee, rhesus, and mouse, with correlations near the human DNA level (R(2) > 0.99). Genome-wide methylation analysis, using this approach, reveals 6,102 differentially methylated loci between rhesus placental and fetal tissues with pathways analysis significantly overrepresented for developmental processes. Restricting the analysis to oncogenes and tumor suppressor genes finds 76 differentially methylated loci, suggesting that rhesus placental tissue carries a cancer epigenetic signature. Similarly, adapting the assay to detect 5-hmC finds highly reproducible 5-hmC levels within human, rhesus, and mouse brain tissue that is species-specific with a hierarchical abundance among the three species (human > rhesus > mouse). Annotation of 5-hmC with respect to gene structure reveals a significant prevalence in the 3'UTR and an association with chromatin-related ontological terms, suggesting an epigenetic feedback loop mechanism for 5-hmC. Together, these data show that this array-based methylation assay is generalizable to all mammals for the detection of both 5-mC and 5-hmC, greatly improving the utility of mammalian model systems

  5. Genome-Wide Screening of Cytogenetic Abnormalities in Multiple Myeloma Patients Using Array-CGH Technique: A Czech Multicenter Experience

    Directory of Open Access Journals (Sweden)

    Jan Smetana

    2014-01-01

    Full Text Available Characteristic recurrent copy number aberrations (CNAs play a key role in multiple myeloma (MM pathogenesis and have important prognostic significance for MM patients. Array-based comparative genomic hybridization (aCGH provides a powerful tool for genome-wide classification of CNAs and thus should be implemented into MM routine diagnostics. We demonstrate the possibility of effective utilization of oligonucleotide-based aCGH in 91 MM patients. Chromosomal aberrations associated with effect on the prognosis of MM were initially evaluated by I-FISH and were found in 93.4% (85/91. Incidence of hyperdiploidy was 49.5% (45/91; del(13(q14 was detected in 57.1% (52/91; gain(1(q21 occurred in 58.2% (53/91; del(17(p13 was observed in 15.4% (14/91; and t(4;14(p16;q32 was found in 18.6% (16/86. Genome-wide screening using Agilent 44K aCGH microarrays revealed copy number alterations in 100% (91/91. Most common deletions were found at 13q (58.9%, 1p (39.6%, and 8p (31.1%, whereas gain of whole 1q was the most often duplicated region (50.6%. Furthermore, frequent homozygous deletions of genes playing important role in myeloma biology such as TRAF3, BIRC1/BIRC2, RB1, or CDKN2C were observed. Taken together, we demonstrated the utilization of aCGH technique in clinical diagnostics as powerful tool for identification of unbalanced genomic abnormalities with prognostic significance for MM patients.

  6. Evaluation of the frequency of polymorphisms in XRCC1 (Arg399Gln) and XPD (Lys751Gln) genes related to the genome stability maintenance in individuals of the resident population from Monte Alegre, PA/Brazil municipality

    International Nuclear Information System (INIS)

    Duarte, Isabelle Magliano

    2010-01-01

    The human exposure to ionizing radiation coming from natural sources is an inherent feature of human life on Earth. Ionizing radiation is a known genotoxic agent, which can affect biological molecules, causing DNA damage and genomic instability. The cellular system of DNA repair plays an important role in maintaining genomic stability by repairing DNA damage caused by genotoxic agents. However, genes related to DNA repair may have their role committed when presenting a certain polymorphism. This study intended to analyze the frequency of single nucleotide polymorphisms (SNPs) in genes of DNA repair XRCC1 (Arg39-9Gln) and XPD (Lys751Gln) in a: population of the city of Monte Alegre, that resides in an area of high exposure to natural radioactivity. Samples of saliva were collected from individuals of the population of Monte Alegre, in which 40 samples were of male and 46 female. Through the use of RFLP (length polymorphism restriction fragment) the frequency of homozygous genotypes and / or heterozygous was determined for polymorphic genes. The XRCC1 gene had 65.4% of the presence of the allele 399Gln and XPD gene had 32.9% of the 751Gln allele. These values are similar to those found in previous studies for the XPD gene, whereas XRCC1 showed a frequency much higher than described in the literature. The. influence of these polymorphisms, which are involved in DNA repair and consequent genotoxicity induced by radiation depends on dose and exposure factors such as smoking, statistically a factor in public health surveillance in the region. This study gathered information and molecular epidemiology for risk assessment of cancer in the population of Monte Alegre. (author)

  7. Preimplantation genetic diagnosis guided by single-cell genomics

    Science.gov (United States)

    2013-01-01

    Preimplantation genetic diagnosis (PGD) aims to help couples with heritable genetic disorders to avoid the birth of diseased offspring or the recurrence of loss of conception. Following in vitro fertilization, one or a few cells are biopsied from each human preimplantation embryo for genetic testing, allowing diagnosis and selection of healthy embryos for uterine transfer. Although classical methods, including single-cell PCR and fluorescent in situ hybridization, enable PGD for many genetic disorders, they have limitations. They often require family-specific designs and can be labor intensive, resulting in long waiting lists. Furthermore, certain types of genetic anomalies are not easy to diagnose using these classical approaches, and healthy offspring carrying the parental mutant allele(s) can result. Recently, state-of-the-art methods for single-cell genomics have flourished, which may overcome the limitations associated with classical PGD, and these underpin the development of generic assays for PGD that enable selection of embryos not only for the familial genetic disorder in question, but also for various other genetic aberrations and traits at once. Here, we discuss the latest single-cell genomics methodologies based on DNA microarrays, single-nucleotide polymorphism arrays or next-generation sequence analysis. We focus on their strengths, their validation status, their weaknesses and the challenges for implementing them in PGD. PMID:23998893

  8. Ensembl Genomes 2013: scaling up access to genome-wide data.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  9. Genomic applications in forensic medicine

    DEFF Research Database (Denmark)

    Børsting, Claus; Morling, Niels

    2016-01-01

    Since the 1980s, advances in DNA technology have revolutionized the scope and practice of forensic medicine. From the days of restriction fragment length polymorphisms (RFLPs) to short tandem repeats (STRs), the current focus is on the next generation genome sequencing. It has been almost a decad...

  10. [Analysis of genomic DNA methylation level in radish under cadmium stress by methylation-sensitive amplified polymorphism technique].

    Science.gov (United States)

    Yang, Jin-Lan; Liu, Li-Wang; Gong, Yi-Qin; Huang, Dan-Qiong; Wang, Feng; He, Ling-Li

    2007-06-01

    The level of cytosine methylation induced by cadmium in radish (Raphanus sativus L.) genome was analysed using the technique of methylation-sensitive amplified polymorphism (MSAP). The MSAP ratios in radish seedling exposed to cadmium chloride at the concentration of 50, 250 and 500 mg/L were 37%, 43% and 51%, respectively, and the control was 34%; the full methylation levels (C(m)CGG in double strands) were at 23%, 25% and 27%, respectively, while the control was 22%. The level of increase in MSAP and full methylation indicated that de novo methylation occurred in some 5'-CCGG sites under Cd stress. There was significant positive correlation between increase of total DNA methylation level and CdCl(2) concentration. Four types of MSAP patterns: de novo methylation, de-methylation, atypical pattern and no changes of methylation pattern were identified among CdCl(2) treatments and the control. DNA methylation alteration in plants treated with CdCl(2) was mainly through de novo methylation.

  11. A Genomics-Based Model for Prediction of Severe Bioprosthetic Mitral Valve Calcification.

    Science.gov (United States)

    Ponasenko, Anastasia V; Khutornaya, Maria V; Kutikhin, Anton G; Rutkovskaya, Natalia V; Tsepokina, Anna V; Kondyukova, Natalia V; Yuzhalin, Arseniy E; Barbarash, Leonid S

    2016-08-31

    Severe bioprosthetic mitral valve calcification is a significant problem in cardiovascular surgery. Unfortunately, clinical markers did not demonstrate efficacy in prediction of severe bioprosthetic mitral valve calcification. Here, we examined whether a genomics-based approach is efficient in predicting the risk of severe bioprosthetic mitral valve calcification. A total of 124 consecutive Russian patients who underwent mitral valve replacement surgery were recruited. We investigated the associations of the inherited variation in innate immunity, lipid metabolism and calcium metabolism genes with severe bioprosthetic mitral valve calcification. Genotyping was conducted utilizing the TaqMan assay. Eight gene polymorphisms were significantly associated with severe bioprosthetic mitral valve calcification and were therefore included into stepwise logistic regression which identified male gender, the T/T genotype of the rs3775073 polymorphism within the TLR6 gene, the C/T genotype of the rs2229238 polymorphism within the IL6R gene, and the A/A genotype of the rs10455872 polymorphism within the LPA gene as independent predictors of severe bioprosthetic mitral valve calcification. The developed genomics-based model had fair predictive value with area under the receiver operating characteristic (ROC) curve of 0.73. In conclusion, our genomics-based approach is efficient for the prediction of severe bioprosthetic mitral valve calcification.

  12. Detection of DNA polymorphisms in Dendrobium Sonia White mutant lines

    International Nuclear Information System (INIS)

    Affrida Abu Hassan; Putri Noor Faizah Megat Mohd Tahir; Zaiton Ahmad; Mohd Nazir Basiran

    2006-01-01

    Dendrobium Sonia white mutant lines were obtained through gamma ray induced mutation of purple flower Dendrobium Sonia at dosage 35 Gy. Amplified Fragment Length Polymorphism (AFLP) technique was used to compare genomic variations in these mutant lines with the control. Our objectives were to detect polymorphic fragments from these mutants to provide useful information on genes involving in flower colour expression. AFLP is a PCR based DNA fingerprinting technique. It involves digestion of DNA with restriction enzymes, ligation of adapter and selective amplification using primer with one (pre-amplification) and three (selective amplification) arbitrary nucleotides. A total number of 20 primer combinations have been tested and 7 produced clear fingerprint patterns. Of these, 13 polymorphic bands have been successfully isolate and cloned. (Author)

  13. New polymorphisms within the variable number tandem repeat (VNTR) 7 locus of Mycobacterium avium subsp. paratuberculosis.

    Science.gov (United States)

    Fawzy, Ahmad; Zschöck, Michael; Ewers, Christa; Eisenberg, Tobias

    2016-06-01

    Variable number tandem repeat (VNTR) is a frequently employed typing method of Mycobacterium avium paratuberculosis (MAP) isolates. Based on whole genome sequencing in a previous study, allelic diversity at some VNTR loci seems to over- or under-estimate the actual phylogenetic variance among isolates. Interestingly, two closely related isolates on one farm showed polymorphism at the VNTR 7 locus, raising concerns about the misleading role that it might play in genotyping. We aimed to investigate the underlying basis of VNTR 7-polymorphism by analyzing sequence data for published genomes and field isolates of MAP and other M. avium complex (MAC) members. In contrast to MAP strains from cattle, strains from sheep displayed an "imperfect" repeat within VNTR 7, which was identical to respective allele types in other MAC genomes. Subspecies- and strain-specific single nucleotide polymorphisms (SNPs) and two novel (16 and 56 bp) repeats were detected. Given the combination of the three existing repeats, there are at least five different patterns for VNTR 7. The present findings highlight a higher polymorphism and probable instability of VNTR 7 locus that needs to be considered and challenged in future studies. Until then, sequencing of this locus in future studies is important to correctly assign the underlying allele types.(1). Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Human lymphocyte polymorphisms detected by quantitative two-dimensional electrophoresis

    International Nuclear Information System (INIS)

    Goldman, D.; Merril, C.R.

    1983-01-01

    A survey of 186 soluble lymphocyte proteins for genetic polymorphism was carried out utilizing two-dimensional electrophoresis of 14 C-labeled phytohemagglutinin (PHA)-stimulated human lymphocyte proteins. Nineteen of these proteins exhibited positional variation consistent with independent genetic polymorphism in a primary sample of 28 individuals. Each of these polymorphisms was characterized by quantitative gene-dosage dependence insofar as the heterozygous phenotype expressed approximately 50% of each allelic gene product as was seen in homozygotes. Patterns observed were also identical in monozygotic twins, replicate samples, and replicate gels. The three expected phenotypes (two homozygotes and a heterozygote) were observed in each of 10 of these polymorphisms while the remaining nine had one of the homozygous classes absent. The presence of the three phenotypes, the demonstration of gene-dosage dependence, and our own and previous pedigree analysis of certain of these polymorphisms supports the genetic basis of these variants. Based on this data, the frequency of polymorphic loci for man is: P . 19/186 . .102, and the average heterozygosity is .024. This estimate is approximately 1/3 to 1/2 the rate of polymorphism previously estimated for man in other studies using one-dimensional electrophoresis of isozyme loci. The newly described polymorphisms and others which should be detectable in larger protein surveys with two-dimensional electrophoresis hold promise as genetic markers of the human genome for use in gene mapping and pedigree analyses

  15. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a textbased browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  16. Intragenomic polymorphisms among high-copy loci: a genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae

    Directory of Open Access Journals (Sweden)

    Kevin Weitemier

    2015-01-01

    Full Text Available Despite knowledge that concerted evolution of high-copy loci is often imperfect, studies that investigate the extent of intragenomic polymorphisms and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual’s consensus and at each position reads differing from the consensus were tallied using a custom perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%. Most nrDNA positions (91% were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming.

  17. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  18. Megabase-Scale Inversion Polymorphism in the Wild Ancestor of Maize

    Science.gov (United States)

    Fang, Zhou; Pyhäjärvi, Tanja; Weber, Allison L.; Dawe, R. Kelly; Glaubitz, Jeffrey C.; González, José de Jesus Sánchez; Ross-Ibarra, Claudia; Doebley, John; Morrell, Peter L.; Ross-Ibarra, Jeffrey

    2012-01-01

    Chromosomal inversions are thought to play a special role in local adaptation, through dramatic suppression of recombination, which favors the maintenance of locally adapted alleles. However, relatively few inversions have been characterized in population genomic data. On the basis of single-nucleotide polymorphism (SNP) genotyping across a large panel of Zea mays, we have identified an ∼50-Mb region on the short arm of chromosome 1 where patterns of polymorphism are highly consistent with a polymorphic paracentric inversion that captures >700 genes. Comparison to other taxa in Zea and Tripsacum suggests that the derived, inverted state is present only in the wild Z. mays subspecies parviglumis and mexicana and is completely absent in domesticated maize. Patterns of polymorphism suggest that the inversion is ancient and geographically widespread in parviglumis. Cytological screens find little evidence for inversion loops, suggesting that inversion heterozygotes may suffer few crossover-induced fitness consequences. The inversion polymorphism shows evidence of adaptive evolution, including a strong altitudinal cline, a statistical association with environmental variables and phenotypic traits, and a skewed haplotype frequency spectrum for inverted alleles. PMID:22542971

  19. Gene number determination and genetic polymorphism of the gamma delta T cell co-receptor WC1 genes

    Directory of Open Access Journals (Sweden)

    Chen Chuang

    2012-10-01

    Full Text Available Abstract Background WC1 co-receptors belong to the scavenger receptor cysteine-rich (SRCR superfamily and are encoded by a multi-gene family. Expression of particular WC1 genes defines functional subpopulations of WC1+ γδ T cells. We have previously identified partial or complete genomic sequences for thirteen different WC1 genes through annotation of the bovine genome Btau_3.1 build. We also identified two WC1 cDNA sequences from other cattle that did not correspond to sequences in the Btau_3.1 build. Their absence in the Btau_3.1 build may have reflected gaps in the genome assembly or polymorphisms among animals. Since the response of γδ T cells to bacterial challenge is determined by WC1 gene expression, it was critical to understand whether individual cattle or breeds differ in the number of WC1 genes or display polymorphisms. Results Real-time quantitative PCR using DNA from the animal whose genome was sequenced (“Dominette” and sixteen other animals representing ten breeds of cattle, showed that the number of genes coding for WC1 co-receptors is thirteen. The complete coding sequences of those thirteen WC1 genes is presented, including the correction of an error in the WC1-2 gene due to mis-assembly in the Btau_3.1 build. All other cDNA sequences were found to agree with the previous annotation of complete or partial WC1 genes. PCR amplification and sequencing of the most variable N-terminal SRCR domain (domain 1 which has the SRCR “a” pattern of each of the thirteen WC1 genes showed that the sequences are highly conserved among individuals and breeds. Of 160 sequences of domain 1 from three breeds of cattle, no additional sequences beyond the thirteen described WC1 genes were found. Analysis of the complete WC1 cDNA sequences indicated that the thirteen WC1 genes code for three distinct WC1 molecular forms. Conclusion The bovine WC1 multi-gene family is composed of thirteen genes coding for three structural forms whose

  20. Association between Single Nucleotide Polymorphisms in Vitamin D Receptor Gene Polymorphisms and Permanent Tooth Caries Susceptibility to Permanent Tooth Caries in Chinese Adolescent

    Directory of Open Access Journals (Sweden)

    Miao Yu

    2017-01-01

    Full Text Available Purpose. Dental caries is a multifactorial infectious disease. In this study, we investigated whether single nucleotide polymorphisms (SNPs in vitamin D receptor (VDR gene were associated with susceptibility to permanent tooth caries in Chinese adolescents. Method. A total of 200 dental caries patients and 200 healthy controls aged 12 years were genotyped for VDR gene polymorphisms using the PCR-restriction fragment length polymorphism (PCR-RFLP assay. All of them were examined for their oral and dental status with the WHO criteria, and clinical information such as the Decayed Missing Filled Teeth Index (DMFT was evaluated. Genomic DNA was extracted from the buccal epithelial cells. The four polymorphic SNPs (Bsm I, Taq I, Apa I, and Fok I in VDR were assessed for both genotypic and phenotypic susceptibilities. Results. Among the four examined VDR gene polymorphisms, the increased frequency of the CT and CC genotype of the Fok I VDR gene polymorphism was associated with dental caries in 12-year-old adolescent, compared with the controls (X2 = 17.813, p≤0.001. Moreover, Fok I polymorphic allele C frequency was significantly increased in the dental caries cases, compared to the controls (X2 = 14.144, p≤0.001, OR = 1.730, 95% CI = 1.299–2.303. However, the other three VDR gene polymorphisms (Bsm I, Taq I, and Apa I showed no statistically significant differences in the caries groups compared with the controls. Conclusion. VDR-Fok I gene polymorphisms may be associated with susceptibility to permanent tooth caries in Chinese adolescent.

  1. Prediction of peripheral neuropathy in multiple myeloma patients receiving bortezomib and thalidomide: a genetic study based on a single nucleotide polymorphism array.

    Science.gov (United States)

    García-Sanz, Ramón; Corchete, Luis Antonio; Alcoceba, Miguel; Chillon, María Carmen; Jiménez, Cristina; Prieto, Isabel; García-Álvarez, María; Puig, Noemi; Rapado, Immaculada; Barrio, Santiago; Oriol, Albert; Blanchard, María Jesús; de la Rubia, Javier; Martínez, Rafael; Lahuerta, Juan José; González Díaz, Marcos; Mateos, María Victoria; San Miguel, Jesús Fernando; Martínez-López, Joaquín; Sarasquete, María Eugenia

    2017-12-01

    Bortezomib- and thalidomide-based therapies have significantly contributed to improved survival of multiple myeloma (MM) patients. However, treatment-induced peripheral neuropathy (TiPN) is a common adverse event associated with them. Risk factors for TiPN in MM patients include advanced age, prior neuropathy, and other drugs, but there are conflicting results about the role of genetics in predicting the risk of TiPN. Thus, we carried out a genome-wide association study based on more than 300 000 exome single nucleotide polymorphisms in 172 MM patients receiving therapy involving bortezomib and thalidomide. We compared patients developing and not developing TiPN under similar treatment conditions (GEM05MAS65, NCT00443235). The highest-ranking single nucleotide polymorphism was rs45443101, located in the PLCG2 gene, but no significant differences were found after multiple comparison correction (adjusted P = .1708). Prediction analyses, cytoband enrichment, and pathway analyses were also performed, but none yielded any significant findings. A copy number approach was also explored, but this gave no significant results either. In summary, our study did not find a consistent genetic component associated with TiPN under bortezomib and thalidomide therapies that could be used for prediction, which makes clinical judgment essential in the practical management of MM treatment. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Combined amplification and hybridization techniques for genome scanning in vegetatively propagated crops

    Energy Technology Data Exchange (ETDEWEB)

    Kahl, G; Ramser, J; Terauchi, R [Biocentre, University of Frankfurt, Frankfurt am Main (Germany); Lopez-Peralta, C [IRGP, Colegio de Postgraduados, Montecillo, Edo. de Mexico, Texcoco (Mexico); Asemota, H N [Biotechnology Centre, University of the West Indies, Mona, Kingston (Jamaica); Weising, K [School of Biological Sciences, University of Auckland, Auckland (New Zealand)

    1998-10-01

    A combination of PCR- and hybridization-based genome scanning techniques and sequence comparisons between non-coding chloroplast DNA flanking tRNA genes has been employed to screen Dioscorea species for intra- and interspecific genetic diversity. This methodology detected extensive polymorphisms within Dioscorea bulbifera L., and revealed taxonomic and phylogenetic relationships among cultivated Guinea yams varieties and their potential wild progenitors. Finally, screening of yam germplasm grown in Jamaica permitted reliable discrimination between all major cultivars. Genome scanning by micro satellite-primed PCR (MP-PCR) and random amplified polymorphic DNA (RAPD) analysis in combination with the novel random amplified micro satellite polymorphisms (RAMPO) hybridization technique has shown high potential for the genetic analysis of yams, and holds promise for other vegetatively propagated orphan crops. (author) 46 refs, 3 figs, 3 tabs

  3. Right-hand-side updating for fast computing of genomic breeding values

    NARCIS (Netherlands)

    Calus, M.P.L.

    2014-01-01

    Since both the number of SNPs (single nucleotide polymorphisms) used in genomic prediction and the number of individuals used in training datasets are rapidly increasing, there is an increasing need to improve the efficiency of genomic prediction models in terms of computing time and memory (RAM)

  4. Direct detection of single-nucleotide polymorphisms in bacterial DNA by SNPtrap

    DEFF Research Database (Denmark)

    Grønlund, Hugo Ahlm; Moen, Birgitte; Hoorfar, Jeffrey

    2011-01-01

    A major challenge with single-nucleotide polymorphism (SNP) fingerprinting of bacteria and higher organisms is the combination of genome-wide screenings with the potential of multiplexing and accurate SNP detection. Single-nucleotide extension by the minisequencing principle represents a technolo...

  5. CRISPR Diversity and Microevolution in Clostridium difficile.

    Science.gov (United States)

    Andersen, Joakim M; Shoup, Madelyn; Robinson, Cathy; Britton, Robert; Olsen, Katharina E P; Barrangou, Rodolphe

    2016-09-19

    Virulent strains of Clostridium difficile have become a global health problem associated with morbidity and mortality. Traditional typing methods do not provide ideal resolution to track outbreak strains, ascertain genetic diversity between isolates, or monitor the phylogeny of this species on a global basis. Here, we investigate the occurrence and diversity of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) in C. difficile to assess the potential of CRISPR-based phylogeny and high-resolution genotyping. A single Type-IB CRISPR-Cas system was identified in 217 analyzed genomes with cas gene clusters present at conserved chromosomal locations, suggesting vertical evolution of the system, assessing a total of 1,865 CRISPR arrays. The CRISPR arrays, markedly enriched (8.5 arrays/genome) compared with other species, occur both at conserved and variable locations across strains, and thus provide a basis for typing based on locus occurrence and spacer polymorphism. Clustering of strains by array composition correlated with sequence type (ST) analysis. Spacer content and polymorphism within conserved CRISPR arrays revealed phylogenetic relationship across clades and within ST. Spacer polymorphisms of conserved arrays were instrumental for differentiating closely related strains, e.g., ST1/RT027/B1 strains and pathogenicity locus encoding ST3/RT001 strains. CRISPR spacers showed sequence similarity to phage sequences, which is consistent with the native role of CRISPR-Cas as adaptive immune systems in bacteria. Overall, CRISPR-Cas sequences constitute a valuable basis for genotyping of C. difficile isolates, provide insights into the micro-evolutionary events that occur between closely related strains, and reflect the evolutionary trajectory of these genomes. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Multi-generational imputation of single nucleotide polymorphism marker genotypes and accuracy of genomic selection.

    Science.gov (United States)

    Toghiani, S; Aggrey, S E; Rekaya, R

    2016-07-01

    Availability of high-density single nucleotide polymorphism (SNP) genotyping platforms provided unprecedented opportunities to enhance breeding programmes in livestock, poultry and plant species, and to better understand the genetic basis of complex traits. Using this genomic information, genomic breeding values (GEBVs), which are more accurate than conventional breeding values. The superiority of genomic selection is possible only when high-density SNP panels are used to track genes and QTLs affecting the trait. Unfortunately, even with the continuous decrease in genotyping costs, only a small fraction of the population has been genotyped with these high-density panels. It is often the case that a larger portion of the population is genotyped with low-density and low-cost SNP panels and then imputed to a higher density. Accuracy of SNP genotype imputation tends to be high when minimum requirements are met. Nevertheless, a certain rate of genotype imputation errors is unavoidable. Thus, it is reasonable to assume that the accuracy of GEBVs will be affected by imputation errors; especially, their cumulative effects over time. To evaluate the impact of multi-generational selection on the accuracy of SNP genotypes imputation and the reliability of resulting GEBVs, a simulation was carried out under varying updating of the reference population, distance between the reference and testing sets, and the approach used for the estimation of GEBVs. Using fixed reference populations, imputation accuracy decayed by about 0.5% per generation. In fact, after 25 generations, the accuracy was only 7% lower than the first generation. When the reference population was updated by either 1% or 5% of the top animals in the previous generations, decay of imputation accuracy was substantially reduced. These results indicate that low-density panels are useful, especially when the generational interval between reference and testing population is small. As the generational interval

  7. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays

    OpenAIRE

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, A...

  8. Genome-wide analysis of the human Alu Yb-lineage

    Directory of Open Access Journals (Sweden)

    Carter Anthony B

    2004-03-01

    Full Text Available Abstract The Alu Yb-lineage is a 'young' primarily human-specific group of short interspersed element (SINE subfamilies that have integrated throughout the human genome. In this study, we have computationally screened the draft sequence of the human genome for Alu Yb-lineage subfamily members present on autosomal chromosomes. A total of 1,733 Yb Alu subfamily members have integrated into human autosomes. The average ages of Yb-lineage subfamilies, Yb7, Yb8 and Yb9, are estimated as 4.81, 2.39 and 2.32 million years, respectively. In order to determine the contribution of the Alu Yb-lineage to human genomic diversity, 1,202 loci were analysed using polymerase chain reaction (PCR-based assays, which amplify the genomic regions containing individual Yb-lineage subfamily members. Approximately 20 per cent of the Yb-lineage Alu elements are polymorphic for insertion presence/absence in the human genome. Fewer than 0.5 per cent of the Yb loci also demonstrate insertions at orthologous positions in non-human primate genomes. Genomic sequencing of these unusual loci demonstrates that each of the orthologous loci from non-human primate genomes contains older Y, Sg and Sx Alu family members that have been altered, through various mechanisms, into Yb8 sequences. These data suggest that Alu Yb-lineage subfamily members are largely restricted to the human genome. The high copy number, level of insertion polymorphism and estimated age indicate that members of the Alu Yb elements will be useful in a wide range of genetic analyses.

  9. EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

    Science.gov (United States)

    Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

    2017-08-01

    Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we proposed EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In the previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C ++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling system. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html . ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  10. Integrative Genomics Viewer (IGV) | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

  11. Polymorphism of the simple sequence repeat (AAC)5 in the ...

    Indian Academy of Sciences (India)

    2013-12-04

    Dec 4, 2013 ... SSRs could be present in coding and noncoding regions, contributing to genome dynamics and evolution. Previous studies by our research group detected molecular and cytogenetic riboso- mal DNA (rDNA) polymorphisms in Old Portuguese bread and durum wheat cultivars. Considering the rRNA genes.

  12. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North America outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulse-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  13. Predicting Hybrid Performances for Quality Traits through Genomic-Assisted Approaches in Central European Wheat

    KAUST Repository

    Liu, Guozheng

    2016-07-06

    Bread-making quality traits are central targets for wheat breeding. The objectives of our study were to (1) examine the presence of major effect QTLs for quality traits in a Central European elite wheat population, (2) explore the optimal strategy for predicting the hybrid performance for wheat quality traits, and (3) investigate the effects of marker density and the composition and size of the training population on the accuracy of prediction of hybrid performance. In total 135 inbred lines of Central European bread wheat (Triticum aestivum L.) and 1,604 hybrids derived from them were evaluated for seven quality traits in up to six environments. The 135 parental lines were genotyped using a 90k single-nucleotide polymorphism array. Genome-wide association mapping initially suggested presence of several quantitative trait loci (QTLs), but cross-validation rather indicated the absence of major effect QTLs for all quality traits except of 1000-kernel weight. Genomic selection substantially outperformed marker-assisted selection in predicting hybrid performance. A resampling study revealed that increasing the effective population size in the estimation set of hybrids is relevant to boost the accuracy of prediction for an unrelated test population.

  14. Predicting Hybrid Performances for Quality Traits through Genomic-Assisted Approaches in Central European Wheat.

    Directory of Open Access Journals (Sweden)

    Guozheng Liu

    Full Text Available Bread-making quality traits are central targets for wheat breeding. The objectives of our study were to (1 examine the presence of major effect QTLs for quality traits in a Central European elite wheat population, (2 explore the optimal strategy for predicting the hybrid performance for wheat quality traits, and (3 investigate the effects of marker density and the composition and size of the training population on the accuracy of prediction of hybrid performance. In total 135 inbred lines of Central European bread wheat (Triticum aestivum L. and 1,604 hybrids derived from them were evaluated for seven quality traits in up to six environments. The 135 parental lines were genotyped using a 90k single-nucleotide polymorphism array. Genome-wide association mapping initially suggested presence of several quantitative trait loci (QTLs, but cross-validation rather indicated the absence of major effect QTLs for all quality traits except of 1000-kernel weight. Genomic selection substantially outperformed marker-assisted selection in predicting hybrid performance. A resampling study revealed that increasing the effective population size in the estimation set of hybrids is relevant to boost the accuracy of prediction for an unrelated test population.

  15. Predicting Hybrid Performances for Quality Traits through Genomic-Assisted Approaches in Central European Wheat

    KAUST Repository

    Liu, Guozheng; Zhao, Yusheng; Gowda, Manje; Longin, C. Friedrich H.; Reif, Jochen C.; Mette, Michael F.

    2016-01-01

    Bread-making quality traits are central targets for wheat breeding. The objectives of our study were to (1) examine the presence of major effect QTLs for quality traits in a Central European elite wheat population, (2) explore the optimal strategy for predicting the hybrid performance for wheat quality traits, and (3) investigate the effects of marker density and the composition and size of the training population on the accuracy of prediction of hybrid performance. In total 135 inbred lines of Central European bread wheat (Triticum aestivum L.) and 1,604 hybrids derived from them were evaluated for seven quality traits in up to six environments. The 135 parental lines were genotyped using a 90k single-nucleotide polymorphism array. Genome-wide association mapping initially suggested presence of several quantitative trait loci (QTLs), but cross-validation rather indicated the absence of major effect QTLs for all quality traits except of 1000-kernel weight. Genomic selection substantially outperformed marker-assisted selection in predicting hybrid performance. A resampling study revealed that increasing the effective population size in the estimation set of hybrids is relevant to boost the accuracy of prediction for an unrelated test population.

  16. Predicting Hybrid Performances for Quality Traits through Genomic-Assisted Approaches in Central European Wheat

    Science.gov (United States)

    Liu, Guozheng; Zhao, Yusheng; Gowda, Manje; Longin, C. Friedrich H.; Reif, Jochen C.; Mette, Michael F.

    2016-01-01

    Bread-making quality traits are central targets for wheat breeding. The objectives of our study were to (1) examine the presence of major effect QTLs for quality traits in a Central European elite wheat population, (2) explore the optimal strategy for predicting the hybrid performance for wheat quality traits, and (3) investigate the effects of marker density and the composition and size of the training population on the accuracy of prediction of hybrid performance. In total 135 inbred lines of Central European bread wheat (Triticum aestivum L.) and 1,604 hybrids derived from them were evaluated for seven quality traits in up to six environments. The 135 parental lines were genotyped using a 90k single-nucleotide polymorphism array. Genome-wide association mapping initially suggested presence of several quantitative trait loci (QTLs), but cross-validation rather indicated the absence of major effect QTLs for all quality traits except of 1000-kernel weight. Genomic selection substantially outperformed marker-assisted selection in predicting hybrid performance. A resampling study revealed that increasing the effective population size in the estimation set of hybrids is relevant to boost the accuracy of prediction for an unrelated test population. PMID:27383841

  17. Polymorphic Microsatellite Markers for the Tetrapolar Anther-Smut Fungus Microbotryum saponariae Based on Genome Sequencing

    Science.gov (United States)

    Fortuna, Taiadjana M.; Snirc, Alodie; Badouin, Hélène; Gouzy, Jérome; Siguenza, Sophie; Esquerre, Diane; Le Prieur, Stéphanie; Shykoff, Jacqui A.; Giraud, Tatiana

    2016-01-01

    Background Anther-smut fungi belonging to the genus Microbotryum sterilize their host plants by aborting ovaries and replacing pollen by fungal spores. Sibling Microbotryum species are highly specialized on their host plants and they have been widely used as models for studies of ecology and evolution of plant pathogenic fungi. However, most studies have focused, so far, on M. lychnidis-dioicae that parasitizes the white campion Silene latifolia. Microbotryum saponariae, parasitizing mainly Saponaria officinalis, is an interesting anther-smut fungus, since it belongs to a tetrapolar lineage (i.e., with two independently segregating mating-type loci), while most of the anther-smut Microbotryum fungi are bipolar (i.e., with a single mating-type locus). Saponaria officinalis is a widespread long-lived perennial plant species with multiple flowering stems, which makes its anther-smut pathogen a good model for studying phylogeography and within-host multiple infections. Principal Findings Here, based on a generated genome sequence of M. saponariae we developed 6 multiplexes with a total of 22 polymorphic microsatellite markers using an inexpensive and efficient method. We scored these markers in fungal individuals collected from 97 populations across Europe, and found that the number of their alleles ranged from 2 to 11, and their expected heterozygosity from 0.01 to 0.58. Cross-species amplification was examined using nine other Microbotryum species parasitizing hosts belonging to Silene, Dianthus and Knautia genera. All loci were successfully amplified in at least two other Microbotryum species. Significance These newly developed markers will provide insights into the population genetic structure and the occurrence of within-host multiple infections of M. saponariae. In addition, the draft genome of M. saponariae, as well as one of the described markers will be useful resources for studying the evolution of the breeding systems in the genus Microbotryum and the

  18. Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages

    DEFF Research Database (Denmark)

    Deng, Xiangyu; Desai, Prerak T.; den Bakker, Henk C.

    2014-01-01

    serotype Nitra strains. Single-nucleotide polymorphisms were filtered to identify 4,887 reliable loci that distinguished all isolates from each other. Our whole-genome single-nucleotide polymorphism typing approach was robust for S. enterica Enteritidis subtyping with combined data for different strains...

  19. Molecular analysis of point mutations in a barley genome exposed to MNU and gamma rays

    Energy Technology Data Exchange (ETDEWEB)

    Kurowska, Marzena, E-mail: mkurowsk@us.edu.pl [Department of Genetics, Faculty of Biology and Environmental Protection, University of Silesia, Jagiellonska 28, 40-032 Katowice (Poland); Labocha-Pawlowska, Anna; Gnizda, Dominika; Maluszynski, Miroslaw; Szarejko, Iwona [Department of Genetics, Faculty of Biology and Environmental Protection, University of Silesia, Jagiellonska 28, 40-032 Katowice (Poland)

    2012-10-15

    We present studies aimed at determining the types and frequencies of mutations induced in the barley genome after treatment with chemical (N-methyl-N-nitrosourea, MNU) and physical (gamma rays) mutagens. We created M{sub 2} populations of a doubled haploid line and used them for the analysis of mutations in targeted DNA sequences and over an entire barley genome using TILLING (Targeting Induced Local Lesions in Genomes) and AFLP (Amplified Fragment Length Polymorphism) technique, respectively. Based on the TILLING analysis of the total DNA sequence of 4,537,117 bp in the MNU population, the average mutation density was estimated as 1/504 kb. Only one nucleotide change was found after an analysis of 3,207,444 bp derived from the highest dose of gamma rays applied. MNU was clearly a more efficient mutagen than gamma rays in inducing point mutations in barley. The majority (63.6%) of the MNU-induced nucleotide changes were transitions, with a similar number of G > A and C > T substitutions. The similar share of G > A and C > T transitions indicates a lack of bias in the repair of O{sup 6}-methylguanine lesions between DNA strands. There was, however, a strong specificity of the nucleotide surrounding the O{sup 6}-meG at the -1 position. Purines formed 81% of nucleotides observed at the -1 site. Scanning the barley genome with AFLP markers revealed ca. a three times higher level of AFLP polymorphism in MNU-treated as compared to the gamma-irradiated population. In order to check whether AFLP markers can really scan the whole barley genome for mutagen-induced polymorphism, 114 different AFLP products, were cloned and sequenced. 94% of bands were heterogenic, with some bands containing up to 8 different amplicons. The polymorphic AFLP products were characterised in terms of their similarity to the records deposited in a GenBank database. The types of sequences present in the polymorphic bands reflected the organisation of the barley genome.

  20. Building a model: developing genomic resources for common milkweed (Asclepias syriaca with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Full Text Available Abstract Background Milkweeds (Asclepias L. have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L. could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp and 5S rDNA (120 bp sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp, with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae unigenes (median coverage of 0.29× and 66% of single copy orthologs (COSII in asterids (median coverage of 0.14×. From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites and phylogenetics (low-copy nuclear genes studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species

  1. Best Linear Unbiased Prediction of Genomic Breeding Values Using a Trait-Specific Marker-Derived Relationship Matrix

    NARCIS (Netherlands)

    Zhe Zhang, Z.; Liu, J.F.; Ding, Z.; Bijma, P.; Koning, de D.J.

    2010-01-01

    With the availability of high density whole-genome single nucleotide polymorphism chips, genomic selection has become a promising method to estimate genetic merit with potentially high accuracy for animal, plant and aquaculture species of economic importance. With markers covering the entire genome,

  2. Advanced statistical tools for SNP arrays : signal calibration, copy number estimation and single array genotyping

    NARCIS (Netherlands)

    Rippe, Ralph Christian Alexander

    2012-01-01

    Fluorescence bias in in signals from individual SNP arrays can be calibrated using linear models. Given the data, the system of equations is very large, so a specialized symbolic algorithm was developed. These models are also used to illustrate that genomic waves do not exist, but are merely an

  3. Transformation of natural genetic variation into Haemophilus influenzae genomes.

    Directory of Open Access Journals (Sweden)

    Joshua Chang Mell

    2011-07-01

    Full Text Available Many bacteria are able to efficiently bind and take up double-stranded DNA fragments, and the resulting natural transformation shapes bacterial genomes, transmits antibiotic resistance, and allows escape from immune surveillance. The genomes of many competent pathogens show evidence of extensive historical recombination between lineages, but the actual recombination events have not been well characterized. We used DNA from a clinical isolate of Haemophilus influenzae to transform competent cells of a laboratory strain. To identify which of the ~40,000 polymorphic differences had recombined into the genomes of four transformed clones, their genomes and their donor and recipient parents were deep sequenced to high coverage. Each clone was found to contain ~1000 donor polymorphisms in 3-6 contiguous runs (8.1±4.5 kb in length that collectively comprised ~1-3% of each transformed chromosome. Seven donor-specific insertions and deletions were also acquired as parts of larger donor segments, but the presence of other structural variation flanking 12 of 32 recombination breakpoints suggested that these often disrupt the progress of recombination events. This is the first genome-wide analysis of chromosomes directly transformed with DNA from a divergent genotype, connecting experimental studies of transformation with the high levels of natural genetic variation found in isolates of the same species.

  4. Development and characterization of highly polymorphic long TC repeat microsatellite markers for genetic analysis of peanut

    Directory of Open Access Journals (Sweden)

    Macedo Selma E

    2012-02-01

    Full Text Available Abstract Background Peanut (Arachis hypogaea L. is a crop of economic and social importance, mainly in tropical areas, and developing countries. Its molecular breeding has been hindered by a shortage of polymorphic genetic markers due to a very narrow genetic base. Microsatellites (SSRs are markers of choice in peanut because they are co-dominant, highly transferrable between species and easily applicable in the allotetraploid genome. In spite of substantial effort over the last few years by a number of research groups, the number of SSRs that are polymorphic for A. hypogaea is still limiting for routine application, creating the demand for the discovery of more markers polymorphic within cultivated germplasm. Findings A plasmid genomic library enriched for TC/AG repeats was constructed and 1401 clones sequenced. From the sequences obtained 146 primer pairs flanking mostly TC microsatellites were developed. The average number of repeat motifs amplified was 23. These 146 markers were characterized on 22 genotypes of cultivated peanut. In total 78 of the markers were polymorphic within cultivated germplasm. Most of those 78 markers were highly informative with an average of 5.4 alleles per locus being amplified. Average gene diversity index (GD was 0.6, and 66 markers showed a GD of more than 0.5. Genetic relationship analysis was performed and corroborated the current taxonomical classification of A. hypogaea subspecies and varieties. Conclusions The microsatellite markers described here are a useful resource for genetics and genomics in Arachis. In particular, the 66 markers that are highly polymorphic in cultivated peanut are a significant step towards routine genetic mapping and marker-assisted selection for the crop.

  5. Genetic Diversity of Turf-Type Tall Fescue Using Diversity Arrays Technology

    Czech Academy of Sciences Publication Activity Database

    Baird, J. H.; Kopecký, David; Lukaszewski, A.J.; Green, R. J.; Bartoš, Jan; Doležel, Jaroslav

    2012-01-01

    Roč. 52, č. 1 (2012), s. 408-412 ISSN 0011-183X Institutional research plan: CEZ:AV0Z50380511 Keywords : Festuca arundinacea * Diversity Arrays Technology (DArT) * Low genetic polymorphism Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.513, year: 2012

  6. GAPIT: genome association and prediction integrated tool.

    Science.gov (United States)

    Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu

    2012-09-15

    Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. http://www.maizegenetics.net/GAPIT. zhiwu.zhang@cornell.edu Supplementary data are available at Bioinformatics online.

  7. Microbial genome sequencing using optical mapping and Illumina sequencing

    Science.gov (United States)

    Introduction Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...

  8. A functional gene array for detection of bacterial virulence elements

    Energy Technology Data Exchange (ETDEWEB)

    Jaing, C

    2007-11-01

    We report our development of the first of a series of microarrays designed to detect pathogens with known mechanisms of virulence and antibiotic resistance. By targeting virulence gene families as well as genes unique to specific biothreat agents, these arrays will provide important data about the pathogenic potential and drug resistance profiles of unknown organisms in environmental samples. To validate our approach, we developed a first generation array targeting genes from Escherichia coli strains K12 and CFT073, Enterococcus faecalis and Staphylococcus aureus. We determined optimal probe design parameters for microorganism detection and discrimination, measured the required target concentration, and assessed tolerance for mismatches between probe and target sequences. Mismatch tolerance is a priority for this application, due to DNA sequence variability among members of gene families. Arrays were created using the NimbleGen Maskless Array Synthesizer at Lawrence Livermore National Laboratory. Purified genomic DNA from combinations of one or more of the four target organisms, pure cultures of four related organisms, and environmental aerosol samples with spiked-in genomic DNA were hybridized to the arrays. Based on the success of this prototype, we plan to design further arrays in this series, with the goal of detecting all known virulence and antibiotic resistance gene families in a greatly expanded set of organisms.

  9. Development of maizeSNP3072, a high-throughput compatible SNP array, for DNA fingerprinting identification of Chinese maize varieties.

    Science.gov (United States)

    Tian, Hong-Li; Wang, Feng-Ge; Zhao, Jiu-Ran; Yi, Hong-Mei; Wang, Lu; Wang, Rui; Yang, Yang; Song, Wei

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are abundant and evenly distributed throughout the maize ( Zea mays L.) genome. SNPs have several advantages over simple sequence repeats, such as ease of data comparison and integration, high-throughput processing of loci, and identification of associated phenotypes. SNPs are thus ideal for DNA fingerprinting, genetic diversity analysis, and marker-assisted breeding. Here, we developed a high-throughput and compatible SNP array, maizeSNP3072, containing 3072 SNPs developed from the maizeSNP50 array. To improve genotyping efficiency, a high-quality cluster file, maizeSNP3072_GT.egt, was constructed. All 3072 SNP loci were localized within different genes, where they were distributed in exons (43 %), promoters (21 %), 3' untranslated regions (UTRs; 22 %), 5' UTRs (9 %), and introns (5 %). The average genotyping failure rate using these SNPs was only 6 %, or 3 % using the cluster file to call genotypes. The genotype consistency of repeat sample analysis on Illumina GoldenGate versus Infinium platforms exceeded 96.4 %. The minor allele frequency (MAF) of the SNPs averaged 0.37 based on data from 309 inbred lines. The 3072 SNPs were highly effective for distinguishing among 276 examined hybrids. Comparative analysis using Chinese varieties revealed that the 3072SNP array showed a better marker success rate and higher average MAF values, evaluation scores, and variety-distinguishing efficiency than the maizeSNP50K array. The maizeSNP3072 array thus can be successfully used in DNA fingerprinting identification of Chinese maize varieties and shows potential as a useful tool for germplasm resource evaluation and molecular marker-assisted breeding.

  10. ALIS-FLP: Amplified ligation selected fragment-length polymorphism method for microbial genotyping

    DEFF Research Database (Denmark)

    Brillowska-Dabrowska, A.; Wianecka, M.; Dabrowski, Slawomir

    2008-01-01

    A DNA fingerprinting method known as ALIS-FLP (amplified ligation selected fragment-length polymorphism) has been developed for selective and specific amplification of restriction fragments from TspRI restriction endonuclease digested genomic DNA. The method is similar to AFLP, but differs...

  11. Single feature polymorphism (SFP-based selective sweep identification and association mapping of growth-related metabolic traits in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Stitt Mark

    2010-03-01

    Full Text Available Abstract Background Natural accessions of Arabidopsis thaliana are characterized by a high level of phenotypic variation that can be used to investigate the extent and mode of selection on the primary metabolic traits. A collection of 54 A. thaliana natural accession-derived lines were subjected to deep genotyping through Single Feature Polymorphism (SFP detection via genomic DNA hybridization to Arabidopsis Tiling 1.0 Arrays for the detection of selective sweeps, and identification of associations between sweep regions and growth-related metabolic traits. Results A total of 1,072,557 high-quality SFPs were detected and indications for 3,943 deletions and 1,007 duplications were obtained. A significantly lower than expected SFP frequency was observed in protein-, rRNA-, and tRNA-coding regions and in non-repetitive intergenic regions, while pseudogenes, transposons, and non-coding RNA genes are enriched with SFPs. Gene families involved in plant defence or in signalling were identified as highly polymorphic, while several other families including transcription factors are depleted of SFPs. 198 significant associations between metabolic genes and 9 metabolic and growth-related phenotypic traits were detected with annotation hinting at the nature of the relationship. Five significant selective sweep regions were also detected of which one associated significantly with a metabolic trait. Conclusions We generated a high density polymorphism map for 54 A. thaliana accessions that highlights the variability of resistance genes across geographic ranges and used it to identify selective sweeps and associations between metabolic genes and metabolic phenotypes. Several associations show a clear biological relationship, while many remain requiring further investigation.

  12. No association between a common single nucleotide polymorphism, rs4141463, in the MACROD2 gene and autism spectrum disorder.

    NARCIS (Netherlands)

    Curran, S.; Bolton, P.; Rozsnyai, K.; Chiocchetti, A.; Klauck, S.M.; Duketis, E.; Poustka, F.; Schlitt, S.; Freitag, C.M.; Lee, I. van der; Muglia, P.; Poot, M.; Staal, W.G.; Jonge, M.V. de; Ophoff, R.A.; Lewis, C.; Skuse, D.; Mandy, W.; Vassos, E.; Fossdal, R.; Magnusson, P.; Hreidarsson, S.; Saemundsen, E.; Stefansson, H.; Stefansson, K.; Collier, D.

    2011-01-01

    The Autism Genome Project (AGP) Consortium recently reported genome-wide significant association between autism and an intronic single nucleotide polymorphism marker, rs4141463, within the MACROD2 gene. In the present study we attempted to replicate this finding using an independent case-control

  13. Detection of DNA methylation changes in micropropagated banana plants using methylation-sensitive amplification polymorphism (MSAP).

    Science.gov (United States)

    Peraza-Echeverria, S; Herrera-Valencia, V A.; Kay, A -J.

    2001-07-01

    The extent of DNA methylation polymorphisms was evaluated in micropropagated banana (Musa AAA cv. 'Grand Naine') derived from either the vegetative apex of the sucker or the floral apex of the male inflorescence using the methylation-sensitive amplification polymorphism (MSAP) technique. In all, 465 fragments, each representing a recognition site cleaved by either or both of the isoschizomers were amplified using eight combinations of primers. A total of 107 sites (23%) were found to be methylated at cytosine in the genome of micropropagated banana plants. In plants micropropagated from the male inflorescence explant 14 (3%) DNA methylation events were polymorphic, while plants micropropagated from the sucker explant produced 8 (1.7%) polymorphisms. No DNA methylation polymorphisms were detected in conventionally propagated banana plants. These results demonstrated the usefulness of MSAP to detect DNA methylation events in micropropagated banana plants and indicate that DNA methylation polymorphisms are associated with micropropagation.

  14. Identification of polymorphic inversions from genotypes

    Directory of Open Access Journals (Sweden)

    Cáceres Alejandro

    2012-02-01

    Full Text Available Abstract Background Polymorphic inversions are a source of genetic variability with a direct impact on recombination frequencies. Given the difficulty of their experimental study, computational methods have been developed to infer their existence in a large number of individuals using genome-wide data of nucleotide variation. Methods based on haplotype tagging of known inversions attempt to classify individuals as having a normal or inverted allele. Other methods that measure differences between linkage disequilibrium attempt to identify regions with inversions but unable to classify subjects accurately, an essential requirement for association studies. Results We present a novel method to both identify polymorphic inversions from genome-wide genotype data and classify individuals as containing a normal or inverted allele. Our method, a generalization of a published method for haplotype data 1, utilizes linkage between groups of SNPs to partition a set of individuals into normal and inverted subpopulations. We employ a sliding window scan to identify regions likely to have an inversion, and accumulation of evidence from neighboring SNPs is used to accurately determine the inversion status of each subject. Further, our approach detects inversions directly from genotype data, thus increasing its usability to current genome-wide association studies (GWAS. Conclusions We demonstrate the accuracy of our method to detect inversions and classify individuals on principled-simulated genotypes, produced by the evolution of an inversion event within a coalescent model 2. We applied our method to real genotype data from HapMap Phase III to characterize the inversion status of two known inversions within the regions 17q21 and 8p23 across 1184 individuals. Finally, we scan the full genomes of the European Origin (CEU and Yoruba (YRI HapMap samples. We find population-based evidence for 9 out of 15 well-established autosomic inversions, and for 52 regions

  15. A high-density transcript linkage map with 1,845 expressed genes positioned by microarray-based Single Feature Polymorphisms (SFP) in Eucalyptus

    Science.gov (United States)

    2011-01-01

    Background Technological advances are progressively increasing the application of genomics to a wider array of economically and ecologically important species. High-density maps enriched for transcribed genes facilitate the discovery of connections between genes and phenotypes. We report the construction of a high-density linkage map of expressed genes for the heterozygous genome of Eucalyptus using Single Feature Polymorphism (SFP) markers. Results SFP discovery and mapping was achieved using pseudo-testcross screening and selective mapping to simultaneously optimize linkage mapping and microarray costs. SFP genotyping was carried out by hybridizing complementary RNA prepared from 4.5 year-old trees xylem to an SFP array containing 103,000 25-mer oligonucleotide probes representing 20,726 unigenes derived from a modest size expressed sequence tags collection. An SFP-mapping microarray with 43,777 selected candidate SFP probes representing 15,698 genes was subsequently designed and used to genotype SFPs in a larger subset of the segregating population drawn by selective mapping. A total of 1,845 genes were mapped, with 884 of them ordered with high likelihood support on a framework map anchored to 180 microsatellites with average density of 1.2 cM. Using more probes per unigene increased by two-fold the likelihood of detecting segregating SFPs eventually resulting in more genes mapped. In silico validation showed that 87% of the SFPs map to the expected location on the 4.5X draft sequence of the Eucalyptus grandis genome. Conclusions The Eucalyptus 1,845 gene map is the most highly enriched map for transcriptional information for any forest tree species to date. It represents a major improvement on the number of genes previously positioned on Eucalyptus maps and provides an initial glimpse at the gene space for this global tree genome. A general protocol is proposed to build high-density transcript linkage maps in less characterized plant species by SFP genotyping

  16. Insertion polymorphisms of SINE200 retrotransposons within speciation islands of Anopheles gambiae molecular forms

    Directory of Open Access Journals (Sweden)

    Tu Zhijian

    2008-08-01

    Full Text Available Abstract Background SINEs (Short INterspersed Elements are homoplasy-free and co-dominant genetic markers which are considered to represent useful tools for population genetic studies, and could help clarifying the speciation processes ongoing within the major malaria vector in Africa, Anopheles gambiae s.s. Here, we report the results of the analysis of the insertion polymorphism of a nearly 200 bp-long SINE (SINE200 within genome areas of high differentiation (i.e. "speciation islands" of M and S A. gambiae molecular forms. Methods A SINE-PCR approach was carried out on thirteen SINE200 insertions in M and S females collected along the whole range of distribution of A. gambiae s.s. in sub-Saharan Africa. Ten specimens each for Anopheles arabiensis, Anopheles melas, Anopheles quadriannulatus A and 15 M/S hybrids from laboratory crosses were also analysed. Results Eight loci were successfully amplified and were found to be specific for A. gambiae s.s.: 5 on 2L chromosome and one on X chromosome resulted monomorphic, while two loci positioned respectively on 2R (i.e. S200 2R12D and X (i.e. S200 X6.1 chromosomes were found to be polymorphic. S200 2R12D was homozygote for the insertion in most S-form samples, while intermediate levels of polymorphism were shown in M-form, resulting in an overall high degree of genetic differentiation between molecular forms (Fst = 0.46 p S200 X6.1 was found to be fixed in all M- and absent in all S-specimens. This led to develop a novel easy-to-use PCR approach to straightforwardly identify A. gambiae molecular forms. This novel approach allows to overcome the constraints associated with markers on the rDNA region commonly used for M and S identification. In fact, it is based on a single copy and irreversible SINE200 insertion and, thus, is not subjected to peculiar evolutionary patterns affecting rDNA markers, e.g. incomplete homogenization of the arrays through concerted evolution and/or mixtures of M and S IGS

  17. Low-coverage MiSeq next generation sequencing reveals the mitochondrial genome of the Eastern Rock Lobster, Sagmariasus verreauxi.

    Science.gov (United States)

    Doyle, Stephen R; Griffith, Ian S; Murphy, Nick P; Strugnell, Jan M

    2015-01-01

    The complete mitochondrial genome of the Eastern Rock lobster, Sagmariasus verreauxi, is reported for the first time. Using low-coverage, long read MiSeq next generation sequencing, we constructed and determined the mtDNA genome organization of the 15,470 bp sequence from two isolates from Eastern Tasmania, Australia and Northern New Zealand, and identified 46 polymorphic nucleotides between the two sequences. This genome sequence and its genetic polymorphisms will likely be useful in understanding the distribution and population connectivity of the Eastern Rock Lobster, and in the fisheries management of this commercially important species.

  18. The Banana Genome Hub

    Science.gov (United States)

    Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

    2013-01-01

    Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allows harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved banana existing systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we have made interoperable with the banana hub, other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several uses cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967

  19. Polymorphisms within the FANCA gene associate with premature ovarian failure in Korean women.

    Science.gov (United States)

    Pyun, Jung-A; Kim, Sunshin; Cha, Dong Hyun; Kwack, KyuBum

    2014-05-01

    This study investigated whether polymorphisms within the Fanconi anemia complementation group A (FANCA) gene contribute to the increased risk of premature ovarian failure (POF) in Korean women. Ninety-eight women with POF and 218 controls participated in this study. Genomic DNA from peripheral blood was isolated, and GoldenGate genotyping assay was used to identify single nucleotide polymorphisms (SNPs) within the FANCA gene. Two significant SNPs (rs1006547 and rs2239359; P FANCA gene may increase the risk for POF in Korean women.

  20. [Recent advances of amplified fragment length polymorphism and its applications in forensic botany].

    Science.gov (United States)

    Li, Cheng-Tao; Li, Li

    2008-10-01

    Amplified fragment length polymorphism (AFLP) is a new molecular marker to detect genomic polymorphism. This new technology has advantages of high resolution, good stability, and reproducibility. Great achievements have been derived in recent years in AFLP related technologies with several AFLP expanded methodologies available. AFLP technology has been widely used in the fields of plant, animal, and microbes. It has become one of the hotspots in Forensic Botany. This review focuses on the recent advances of AFLP and its applications in forensic biology.

  1. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  2. The human genome project and novel aspects of cytochrome P450 research

    International Nuclear Information System (INIS)

    Ingelman-Sundberg, Magnus

    2005-01-01

    Currently, 57 active cytochrome P450 (CYP) genes and 58 pseudogenes are known to be present in the human genome. Among the genes discovered by initiatives in the human genome project are CYP2R1, CYP2W1, CYP2S1, CYP2U1 and CYP3A43, the latter apparently encoding a pseudoenzyme. The function, polymorphism and regulation of these genes are still to be discovered to a great extent. The polymorphism of drug metabolizing CYPs is extensive and influences the outcome of drug therapy causing lack of response or adverse drug reactions. The basis for the differences in the global distribution of the polymorphic variants is inactivating gene mutations and subsequent genetic drift. However, polymorphic alleles carrying multiple active gene copies also exist and are suggested in case of CYP2D6 to be caused by positive selection due to development of alkaloid resistance in North East Africa about 10,000-5000 BC. The knowledge about the CYP genes and their polymorphisms is of fundamental importance for effective drug therapy and for drug development as well as for understanding metabolic activation of carcinogens and other xenobiotics. Here, a short review of the current knowledge is given

  3. Genetical genomic determinants of alcohol consumption in rats and humans

    Directory of Open Access Journals (Sweden)

    Mangion Jonathan

    2009-10-01

    Full Text Available Abstract Background We have used a genetical genomic approach, in conjunction with phenotypic analysis of alcohol consumption, to identify candidate genes that predispose to varying levels of alcohol intake by HXB/BXH recombinant inbred rat strains. In addition, in two populations of humans, we assessed genetic polymorphisms associated with alcohol consumption using a custom genotyping array for 1,350 single nucleotide polymorphisms (SNPs. Our goal was to ascertain whether our approach, which relies on statistical and informatics techniques, and non-human animal models of alcohol drinking behavior, could inform interpretation of genetic association studies with human populations. Results In the HXB/BXH recombinant inbred (RI rats, correlation analysis of brain gene expression levels with alcohol consumption in a two-bottle choice paradigm, and filtering based on behavioral and gene expression quantitative trait locus (QTL analyses, generated a list of candidate genes. A literature-based, functional analysis of the interactions of the products of these candidate genes defined pathways linked to presynaptic GABA release, activation of dopamine neurons, and postsynaptic GABA receptor trafficking, in brain regions including the hypothalamus, ventral tegmentum and amygdala. The analysis also implicated energy metabolism and caloric intake control as potential influences on alcohol consumption by the recombinant inbred rats. In the human populations, polymorphisms in genes associated with GABA synthesis and GABA receptors, as well as genes related to dopaminergic transmission, were associated with alcohol consumption. Conclusion Our results emphasize the importance of the signaling pathways identified using the non-human animal models, rather than single gene products, in identifying factors responsible for complex traits such as alcohol consumption. The results suggest cross-species similarities in pathways that influence predisposition to consume

  4. [Analysis of genomic copy number variations in two sisters with primary amenorrhea and hyperandrogenism].

    Science.gov (United States)

    Zhang, Yanliang; Xu, Qiuyue; Cai, Xuemei; Li, Yixun; Song, Guibo; Wang, Juan; Zhang, Rongchen; Dai, Yong; Duan, Yong

    2015-12-01

    To analyze genomic copy number variations (CNVs) in two sisters with primary amenorrhea and hyperandrogenism. G-banding was performed for karyotype analysis. The whole genome of the two sisters were scanned and analyzed by array-based comparative genomic hybridization (array-CGH). The results were confirmed with real-time quantitative PCR (RT-qPCR). No abnormality was found by conventional G-banded chromosome analysis. Array-CGH has identified 11 identical CNVs from the sisters which, however, overlapped with CNVs reported by the Database of Genomic Variants (http://projects.tcag.ca/variation/). Therefore, they are likely to be benign. In addition, a -8.44 Mb 9p11.1-p13.1 duplication (38,561,587-47,002,387 bp, hg18) and a -80.9 kb 4q13.2 deletion (70,183,990-70,264,889 bp, hg18) were also detected in the elder and younger sister, respectively. The relationship between such CNVs and primary amenorrhea and hyperandrogenism was however uncertain. RT-qPCR results were in accordance with array-CGH. Two CNVs were detected in two sisters by array-CGH, for which further studies are needed to clarify their correlation with primary amenorrhea and hyperandrogenism.

  5. Effects of bovine prolactin gene polymorphism within exon 4 on milk ...

    African Journals Online (AJOL)

    In this study, polymorphism of prolactin gene was analyzed as a candidate gene responsible for variation and genetic trends in milk yield and composition traits. Genomic DNAs were extracted from 268 semen samples belonged to Iranian Holstein bulls. Genotyping for the prolactin gene using PCRRFLP technique and RsaI ...

  6. Single nucleotide polymorphisms as susceptibility, prognostic, and therapeutic markers of nonsmall cell lung cancer

    Directory of Open Access Journals (Sweden)

    Zienolddiny S

    2011-12-01

    Full Text Available Shanbeh Zienolddiny, Vidar SkaugSection for Toxicology and Biological Work Environment, National Institute of Occupational Health, Oslo, NorwayAbstract: Lung cancer is a major public health problem throughout the world. Among the most frequent cancer types (prostate, breast, colorectal, stomach, lung, lung cancer is the leading cause of cancer-related deaths worldwide. Among the two major subtypes of small cell lung cancer and nonsmall cell lung cancer (NSCLC, 85% of tumors belong to the NSCLC histological types. Small cell lung cancer is associated with the shortest survival time. Although tobacco smoking has been recognized as the major risk factor for lung cancer, there is a great interindividual and interethnic difference in risk of developing lung cancer given exposure to similar environmental and lifestyle factors. This may indicate that in addition to chemical and environmental factors, genetic variations in the genome may contribute to risk modification. A common type of genetic variation in the genome, known as single nucleotide polymorphism, has been found to be associated with susceptibility to lung cancer. Interestingly, many of these polymorphisms are found in the genes that regulate major pathways of carcinogen metabolism (cytochrome P450 genes, detoxification (glutathione S-transferases, adduct removal (DNA repair genes, cell growth/apoptosis (TP53/MDM2, the immune system (cytokines/chemokines, and membrane receptors (nicotinic acetylcholine and dopaminergic receptors. Some of these polymorphisms have been shown to alter the level of mRNA, and protein structure and function. In addition to being susceptibility markers, several of these polymorphisms are emerging to be important for response to chemotherapy/radiotherapy and survival of patients. Therefore, it is hypothesized that single nucleotide polymorphisms will be valuable genetic markers in individual-based prognosis and therapy in future. Here we will review some of the most

  7. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Full Text Available Amaranth ( L. is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2 = 32, and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (, , and and their putative progenitor ( were resequenced. A single nucleotide polymorphism (SNP phylogeny supported the classification of as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet ( L. and estimated, using analysis, the age of the most recent polyploidization event in amaranth.

  8. The distribution and impact of common copy-number variation in the genome of the domesticated apple, Malus x domestica Borkh.

    Science.gov (United States)

    Boocock, James; Chagné, David; Merriman, Tony R; Black, Michael A

    2015-10-23

    Copy number variation (CNV) is a common feature of eukaryotic genomes, and a growing body of evidence suggests that genes affected by CNV are enriched in processes that are associated with environmental responses. Here we use next generation sequence (NGS) data to detect copy-number variable regions (CNVRs) within the Malus x domestica genome, as well as to examine their distribution and impact. CNVRs were detected using NGS data derived from 30 accessions of M. x domestica analyzed using the read-depth method, as implemented in the CNVrd2 software. To improve the reliability of our results, we developed a quality control and analysis procedure that involved checking for organelle DNA, not repeat masking, and the determination of CNVR identity using a permutation testing procedure. Overall, we identified 876 CNVRs, which spanned 3.5 % of the apple genome. To verify that detected CNVRs were not artifacts, we analyzed the B- allele-frequencies (BAF) within a single nucleotide polymorphism (SNP) array dataset derived from a screening of 185 individual apple accessions and found the CNVRs were enriched for SNPs having aberrant BAFs (P apple scab. We present the first analysis and catalogue of CNVRs in the M. x domestica genome. The enrichment of the CNVRs with R gene models and their overlap with gene loci of agricultural significance draw attention to a form of unexplored genetic variation in apple. This research will underpin further investigation of the role that CNV plays within the apple genome.

  9. Impact of gamma rays on the Phaffia rhodozyma genome revealed by RAPD-PCR.

    Science.gov (United States)

    Najafi, N; Hosseini, Ramin; Ahmadi, Ar

    2011-12-01

    Phaffia rhodozyma is a red yeast which produces astaxanthin as the major carotenoid pigment. Astaxanthin is thought to reduce the incidence of cancer and degenerative diseases in man. It also enhances the immune response and acts as a free-radical quencher, a precursor of vitamin A, or a pigment involved in the visual attraction of animals as mating partners. The impact of gamma irradiation was studied on the Phaffia rhodozyma genome. Ten mutant strains, designated Gam1-Gam10, were obtained using gamma irradiation. Ten decamer random amplified polymorphic DNA (RAPD) primers were employed to assess genetic changes. Nine primers revealed scorable polymorphisms and a total of 95 band positions were scored; amongst which 38 bands (37.5%) were polymorphic. Primer F with 3 bands and primer J20 with 13 bands produced the lowest and the highest number of bands, respectively. Primer A16 produced the highest number of polymorphic bands (70% polymorphism) and primer F showed the lowest number of polymorphic bands (0% polymorphism). Genetic distances were calculated using Jaccard's coefficient and the UPGMA method. A dendrogram was created using SPSS (version 11.5) and the strains were clustered into four groups. RAPD markers could distinguish between the parental and the mutant strains of P. rhodozyma. RAPD technique showed that some changes had occurred in the genome of the mutated strains. This technique demonstrated the capability to differentiate between the parental and the mutant strains.

  10. Update on the use of random 10-mers in mapping and fingerprinting genomes

    International Nuclear Information System (INIS)

    Sinibaldi, R.M.

    2001-01-01

    The use of Randomly Amplified Polymorphic DNA (RAPDs) has continued to grow for the last several years. A quick assessment of their use can be estimated by searching PubMed at the National Library of Medicine with the acronym RAPD. Since their first report in 1990, the number of citations with RAPD in them has increased from 12 in 1990, to 45 in 1991, to, 112 in 1993, to, 130 in 1994, to 223 in 1995, to 258 in 1996, to 236 in 1997, to 316 in 1998, to 196 to date (August 31) 1999. The utilization of 10-mers for mapping or fingerprinting has many advantages. These include a relatively low cost, no use of radioactivity, easily adapted to automation, requirement for very small amounts of input DNA, rapid results, existing data bases for many organisms, and low cost equipment requirements. In conjunction with a derived technology such as SCARs (sequence characterized amplified regions), it can provide cost effective and thorough methods for mapping and fingerprinting any genome. Newer methods based on microarray technology may offer powerful but expensive alternative approaches in determining genetic diversity. The costs of arrays should come down with time and improved production methods. In the meantime, RAPDs remain a competent and cost effective method for genome characterizations. (author)

  11. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

    Science.gov (United States)

    Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2016-07-01

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  12. A 34K SNP genotyping array for Populus trichocarpa: design, application to the study of natural populations and transferability to other Populus species.

    Science.gov (United States)

    Geraldes, A; Difazio, S P; Slavov, G T; Ranjan, P; Muchero, W; Hannemann, J; Gunter, L E; Wymore, A M; Grassa, C J; Farzaneh, N; Porth, I; McKown, A D; Skyba, O; Li, E; Fujita, M; Klápště, J; Martin, J; Schackwitz, W; Pennacchio, C; Rokhsar, D; Friedmann, M C; Wasteneys, G O; Guy, R D; El-Kassaby, Y A; Mansfield, S D; Cronk, Q C B; Ehlting, J; Douglas, C J; Tuskan, G A

    2013-03-01

    Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost-effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids. © 2013 Blackwell Publishing Ltd.

  13. Analysis of single nucleotide polymorphisms in case-control studies.

    Science.gov (United States)

    Li, Yonghong; Shiffman, Dov; Oberbauer, Rainer

    2011-01-01

    Single nucleotide polymorphisms (SNPs) are the most common type of genetic variants in the human genome. SNPs are known to modify susceptibility to complex diseases. We describe and discuss methods used to identify SNPs associated with disease in case-control studies. An outline on study population selection, sample collection and genotyping platforms is presented, complemented by SNP selection, data preprocessing and analysis.

  14. Application of Microarray-Based Comparative Genomic Hybridization in Prenatal and Postnatal Settings: Three Case Reports

    Directory of Open Access Journals (Sweden)

    Jing Liu

    2011-01-01

    Full Text Available Microarray-based comparative genomic hybridization (array CGH is a newly emerged molecular cytogenetic technique for rapid evaluation of the entire genome with sub-megabase resolution. It allows for the comprehensive investigation of thousands and millions of genomic loci at once and therefore enables the efficient detection of DNA copy number variations (a.k.a, cryptic genomic imbalances. The development and the clinical application of array CGH have revolutionized the diagnostic process in patients and has provided a clue to many unidentified or unexplained diseases which are suspected to have a genetic cause. In this paper, we present three clinical cases in both prenatal and postnatal settings. Among all, array CGH played a major discovery role to reveal the cryptic and/or complex nature of chromosome arrangements. By identifying the genetic causes responsible for the clinical observation in patients, array CGH has provided accurate diagnosis and appropriate clinical management in a timely and efficient manner.

  15. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  16. Targeting 160 candidate genes for blood pressure regulation with a genome-wide genotyping array.

    Directory of Open Access Journals (Sweden)

    Siim Sõber

    2009-06-01

    Full Text Available The outcome of Genome-Wide Association Studies (GWAS has challenged the field of blood pressure (BP genetics as previous candidate genes have not been among the top loci in these scans. We used Affymetrix 500K genotyping data of KORA S3 cohort (n = 1,644; Southern-Germany to address (i SNP coverage in 160 BP candidate genes; (ii the evidence for associations with BP traits in genome-wide and replication data, and haplotype analysis. In total, 160 gene regions (genic region+/-10 kb covered 2,411 SNPs across 11.4 Mb. Marker densities in genes varied from 0 (n = 11 to 0.6 SNPs/kb. On average 52.5% of the HAPMAP SNPs per gene were captured. No evidence for association with BP was obtained for 1,449 tested SNPs. Considerable associations (P50% of HAPMAP SNPs were tagged. In general, genes with higher marker density (>0.2 SNPs/kb revealed a better chance to reach close to significance associations. Although, none of the detected P-values remained significant after Bonferroni correction (P<0.05/2319, P<2.15 x 10(-5, the strength of some detected associations was close to this level: rs10889553 (LEPR and systolic BP (SBP (P = 4.5 x 10(-5 as well as rs10954174 (LEP and diastolic BP (DBP (P = 5.20 x 10(-5. In total, 12 markers in 7 genes (ADRA2A, LEP, LEPR, PTGER3, SLC2A1, SLC4A2, SLC8A1 revealed considerable association (P<10(-3 either with SBP, DBP, and/or hypertension (HYP. None of these were confirmed in replication samples (KORA S4, HYPEST, BRIGHT. However, supportive evidence for the association of rs10889553 (LEPR and rs11195419 (ADRA2A with BP was obtained in meta-analysis across samples stratified either by body mass index, smoking or alcohol consumption. Haplotype analysis highlighted LEPR and PTGER3. In conclusion, the lack of associations in BP candidate genes may be attributed to inadequate marker coverage on the genome-wide arrays, small phenotypic effects of the loci and/or complex interaction with life-style and metabolic parameters.

  17. Strand bias in complementary single-nucleotide polymorphisms of transcribed human sequences: evidence for functional effects of synonymous polymorphisms

    Directory of Open Access Journals (Sweden)

    Majewski Jacek

    2006-08-01

    Full Text Available Abstract Background Complementary single-nucleotide polymorphisms (SNPs may not be distributed equally between two DNA strands if the strands are functionally distinct, such as in transcribed genes. In introns, an excess of A↔G over the complementary C↔T substitutions had previously been found and attributed to transcription-coupled repair (TCR, demonstrating the valuable functional clues that can be obtained by studying such asymmetry. Here we studied asymmetry of human synonymous SNPs (sSNPs in the fourfold degenerate (FFD sites as compared to intronic SNPs (iSNPs. Results The identities of the ancestral bases and the direction of mutations were inferred from human-chimpanzee genomic alignment. After correction for background nucleotide composition, excess of A→G over the complementary T→C polymorphisms, which was observed previously and can be explained by TCR, was confirmed in FFD SNPs and iSNPs. However, when SNPs were separately examined according to whether they mapped to a CpG dinucleotide or not, an excess of C→T over G→A polymorphisms was found in non-CpG site FFD SNPs but was absent from iSNPs and CpG site FFD SNPs. Conclusion The genome-wide discrepancy of human FFD SNPs provides novel evidence for widespread selective pressure due to functional effects of sSNPs. The similar asymmetry pattern of FFD SNPs and iSNPs that map to a CpG can be explained by transcription-coupled mechanisms, including TCR and transcription-coupled mutation. Because of the hypermutability of CpG sites, more CpG site FFD SNPs are relatively younger and have confronted less selection effect than non-CpG FFD SNPs, which can explain the asymmetric discrepancy of CpG site FFD SNPs vs. non-CpG site FFD SNPs.

  18. Integrated high-resolution array CGH and SKY analysis of homozygous deletions and other genomic alterations present in malignant mesothelioma cell lines.

    Science.gov (United States)

    Klorin, Geula; Rozenblum, Ester; Glebov, Oleg; Walker, Robert L; Park, Yoonsoo; Meltzer, Paul S; Kirsch, Ilan R; Kaye, Frederic J; Roschke, Anna V

    2013-05-01

    High-resolution oligonucleotide array comparative genomic hybridization (aCGH) and spectral karyotyping (SKY) were applied to a panel of malignant mesothelioma (MMt) cell lines. SKY has not been applied to MMt before, and complete karyotypes are reported based on the integration of SKY and aCGH results. A whole genome search for homozygous deletions (HDs) produced the largest set of recurrent and non-recurrent HDs for MMt (52 recurrent HDs in 10 genomic regions; 36 non-recurrent HDs). For the first time, LINGO2, RBFOX1/A2BP1, RPL29, DUSP7, and CCSER1/FAM190A were found to be homozygously deleted in MMt, and some of these genes could be new tumor suppressor genes for MMt. Integration of SKY and aCGH data allowed reconstruction of chromosomal rearrangements that led to the formation of HDs. Our data imply that only with acquisition of structural and/or numerical karyotypic instability can MMt cells attain a complete loss of tumor suppressor genes located in 9p21.3, which is the most frequently homozygously deleted region. Tetraploidization is a late event in the karyotypic progression of MMt cells, after HDs in the 9p21.3 region have already been acquired. Published by Elsevier Inc.

  19. Identification of new polymorphic regions and differentiation of cultivated olives (Olea europaea L.) through plastome sequence comparison

    Science.gov (United States)

    2010-01-01

    Background The cultivated olive (Olea europaea L.) is the most agriculturally important species of the Oleaceae family. Although many studies have been performed on plastid polymorphisms to evaluate taxonomy, phylogeny and phylogeography of Olea subspecies, only few polymorphic regions discriminating among the agronomically and economically important olive cultivars have been identified. The objective of this study was to sequence the entire plastome of olive and analyze many potential polymorphic regions to develop new inter-cultivar genetic markers. Results The complete plastid genome of the olive cultivar Frantoio was determined by direct sequence analysis using universal and novel PCR primers designed to amplify all overlapping regions. The chloroplast genome of the olive has an organisation and gene order that is conserved among numerous Angiosperm species and do not contain any of the inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses that have been found in the chloroplast genomes of the genera Jasminum and Menodora, from the same family as Olea. The annotated sequence was used to evaluate the content of coding genes, the extent, and distribution of repeated and long dispersed sequences and the nucleotide composition pattern. These analyses provided essential information for structural, functional and comparative genomic studies in olive plastids. Furthermore, the alignment of the olive plastome sequence to those of other varieties and species identified 30 new organellar polymorphisms within the cultivated olive. Conclusions In addition to identifying mutations that may play a functional role in modifying the metabolism and adaptation of olive cultivars, the new chloroplast markers represent a valuable tool to assess the level of olive intercultivar plastome variation for use in population genetic analysis, phylogenesis, cultivar characterisation and DNA food tracking. PMID:20868482

  20. Simple sequence repeats in Neurospora crassa: distribution, polymorphism and evolutionary inference

    Directory of Open Access Journals (Sweden)

    Park Jongsun

    2008-01-01

    Full Text Available Abstract Background Simple sequence repeats (SSRs have been successfully used for various genetic and evolutionary studies in eukaryotic systems. The eukaryotic model organism Neurospora crassa is an excellent system to study evolution and biological function of SSRs. Results We identified and characterized 2749 SSRs of 963 SSR types in the genome of N. crassa. The distribution of tri-nucleotide (nt SSRs, the most common SSRs in N. crassa, was significantly biased in exons. We further characterized the distribution of 19 abundant SSR types (AST, which account for 71% of total SSRs in the N. crassa genome, using a Poisson log-linear model. We also characterized the size variation of SSRs among natural accessions using Polymorphic Index Content (PIC and ANOVA analyses and found that there are genome-wide, chromosome-dependent and local-specific variations. Using polymorphic SSRs, we have built linkage maps from three line-cross populations. Conclusion Taking our computational, statistical and experimental data together, we conclude that 1 the distributions of the SSRs in the sequenced N. crassa genome differ systematically between chromosomes as well as between SSR types, 2 the size variation of tri-nt SSRs in exons might be an important mechanism in generating functional variation of proteins in N. crassa, 3 there are different levels of evolutionary forces in variation of amino acid repeats, and 4 SSRs are stable molecular markers for genetic studies in N. crassa.

  1. Application of Genomic Tools in Plant Breeding

    OpenAIRE

    Pérez-de-Castro, A.M.; Vilanova, S.; Cañizares, J.; Pascual, L.; Blanca, J.M.; Díez, M.J.; Prohens, J.; Picó, B.

    2012-01-01

    Plant breeding has been very successful in developing improved varieties using conventional tools and methodologies. Nowadays, the availability of genomic tools and resources is leading to a new revolution of plant breeding, as they facilitate the study of the genotype and its relationship with the phenotype, in particular for complex traits. Next Generation Sequencing (NGS) technologies are allowing the mass sequencing of genomes and transcriptomes, which is producing a vast array of genomic...

  2. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...... used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP...... identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  3. Genome health nutrigenomics and nutrigenetics--diagnosis and nutritional treatment of genome damage on an individual basis.

    Science.gov (United States)

    Fenech, Michael

    2008-04-01

    The term nutrigenomics refers to the effect of diet on gene expression. The term nutrigenetics refers to the impact of inherited traits on the response to a specific dietary pattern, functional food or supplement on a specific health outcome. The specific fields of genome health nutrigenomics and genome health nutrigenetics are emerging as important new research areas because it is becoming increasingly evident that (a) risk for developmental and degenerative disease increases with DNA damage which in turn is dependent on nutritional status and (b) optimal concentration of micronutrients for prevention of genome damage is also dependent on genetic polymorphisms that alter function of genes involved directly or indirectly in uptake and metabolism of micronutrients required for DNA repair and DNA replication. Development of dietary patterns, functional foods and supplements that are designed to improve genome health maintenance in humans with specific genetic backgrounds may provide an important contribution to a new optimum health strategy based on the diagnosis and individualised nutritional treatment of genome instability i.e. Genome Health Clinics.

  4. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur , amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  5. Light whole genome sequence for SNP discovery across domestic cat breeds

    Directory of Open Access Journals (Sweden)

    Driscoll Carlos

    2010-06-01

    Full Text Available Abstract Background The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus -- FeLV, feline coronavirus -- FECV, feline immunodeficiency virus - FIV that are homologues to human scourges (cancer, SARS, and AIDS respectively. However, to realize this bio-medical potential, a high density single nucleotide polymorphism (SNP map is required in order to accomplish disease and phenotype association discovery. Description To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper-limit was placed on sequence coverage and a high lower-limit on the quality of the discrepant bases at a potential variant site. In all domestic cats of different breeds: female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese, a male Ragdoll and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cats breeds. An empirical sampling of 94 discovered SNPs were tested in the sequenced cats resulting in a SNP validation rate of 99%. Conclusions These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.

  6. Single nucleotide polymorphism analysis of ubiquitin extension protein genes (ubq) of gossypium arboreum and gossypium herbaceum in comparison with arabidopsis thaliana

    International Nuclear Information System (INIS)

    Shaheen, T.; Zafar, Y.; Rahman, M.

    2014-01-01

    Single nucleotide polymorphism analysis is an expedient way to study polymorphisms at genomic level. In the present study we have explored Ubiquitin extension protein gene of G. arboreum (A2) and G. herbaceum (A1) of cotton which is a multiple copy gene. We have found SNPs at 16 positions in 200 bp region within A genome of cotton indicating frequency of SNPs 1/13 bp. Both sequences from cotton have shown maximum similarity with UBQ5 and UBQ6 of Arabidopsis thaliana. Sequence obtained from G. arboreum has shown SNPs at 28 positions in comparison with each UBQ5 and UBQ6 of Arabidopsis thaliana while sequence obtained from G. herbaceum has shown SNPs at 31 positions in comparison with each UBQ5 and UBQ6 of Arabidopsis thaliana. In conclusion although during pace of evolution ubiquitin extension protein genes of both A genome species have got some mutations from nature but still most of their sequence is similar. Single nucleotide polymorphism study can prove a vital tool to identify gene type in case of Multicopy genes. (author)

  7. Hybridization Capture Reveals Evolution and Conservation across the Entire Koala Retrovirus Genome

    Science.gov (United States)

    Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M.; Roca, Alfred L.; Greenwood, Alex D.

    2014-01-01

    The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422

  8. Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.

    Directory of Open Access Journals (Sweden)

    Kyriakos Tsangaras

    Full Text Available The koala retrovirus (KoRV is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin.

  9. Polymorphism in ABC transporter genes of Dirofilaria immitis

    Directory of Open Access Journals (Sweden)

    Thangadurai Mani

    2017-08-01

    Full Text Available Dirofilaria immitis, a filarial nematode, causes dirofilariasis in dogs, cats and occasionally in humans. Prevention of the disease has been mainly by monthly use of the macrocyclic lactone (ML endectocides during the mosquito transmission season. Recently, ML resistance has been confirmed in D. immitis and therefore, there is a need to find new classes of anthelmintics. One of the mechanisms associated with ML resistance in nematodes has been the possible role of ATP binding cassette (ABC transporters in reducing drug concentrations at receptor sites. ABC transporters, mainly from sub-families B, C and G, may contribute to multidrug resistance (MDR by active efflux of drugs out of the cell. Gene products of ABC transporters may thus serve as the targets for agents that may modulate susceptibility to drugs, by inhibiting drug transport. ABC transporters are believed to be involved in a variety of physiological functions critical to the parasite, such as sterol transport, and therefore may also serve as the target for drugs that can act as anthelmintics on their own. Knowledge of polymorphism in these ABC transporter genes in nematode parasites could provide useful information for the process of drug design. We have identified 15 ABC transporter genes from sub-families A, B, C and G, in D. immitis, by comparative genomic approaches and analyzed them for polymorphism. Whole genome sequencing data from four ML susceptible (SUS and four loss of efficacy (LOE pooled populations were used for single nucleotide polymorphism (SNP genotyping. Out of 231 SNPs identified in those 15 ABC transporter genes, 89 and 75 of them were specific to the SUS or LOE populations, respectively. A few of the SNPs identified may affect gene expression, protein function, substrate specificity or resistance development and may be useful for transporter inhibitor/anthelmintic drug design, or in order to anticipate resistance development. Keywords: Dirofilaria immitis

  10. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    OpenAIRE

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Deng, Zixin; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’....

  11. Lupus-related single nucleotide polymorphisms and risk of diffuse large B-cell lymphoma

    NARCIS (Netherlands)

    Bernatsky, Sasha; Velásquez García, Héctor A; Spinelli, John; Gaffney, Patrick; Smedby, Karin E; Ramsey-Goldman, Rosalind; Wang, Sophia S.; Adami, Hans-Olov; Albanes, Demetrius; Angelucci, Emanuele; Ansell, Stephen M.; Asmann, Yan W.; Becker, Nikolaus; Benavente, Yolanda; Berndt, Sonja I.; Bertrand, Kimberly A.; Birmann, Brenda M.; Boeing, Heiner; Boffetta, Paolo; Bracci, Paige M.; Brennan, Paul; Brooks-Wilson, Angela R.; Cerhan, James R.; Chanock, Stephen J.; Clavel, Jacqueline; Conde, Lucia; Cotenbader, Karen H; Cox, David G; Cozen, Wendy; Crouch, Simon; De Roos, Anneclaire J.; De Sanjose, Silvia; Di Lollo, Simonetta; Diver, W. Ryan; Dogan, Ahmet; Foretova, Lenka; Ghesquières, Hervé; Giles, Graham G.; Glimelius, Bengt; Habermann, Thomas M.; Haioun, Corinne; Hartge, Patricia; Hjalgrim, Henrik; Holford, Theodore R.; Holly, Elizabeth A.; Jackson, Rebecca D.; Kaaks, Rudolph; Kane, Eleanor; Kelly, Rachel S.; Klein, Robert J.; Kraft, Peter; Kricker, Anne; Lan, Qing; Lawrence, Charles; Liebow, Mark; Lightfoot, Tracy; Link, Brian K.; Maynadie, Marc; McKay, James; Melbye, Mads; Molina, Thierry Jo; Monnereau, Alain; Morton, Lindsay M.; Nieters, Alexandra; North, Kari E.; Novak, Anne J.; Offit, Kenneth; Purdue, Mark P.; Rais, Marco; Riby, Jacques; Roman, Eve; Rothman, Nathaniel; Salles, Gilles; Severi, Gianluca; Severson, Richard K.; Skibola, Christine F.; Slager, Susan L.; Smith, Alex; Smith, Martyn T.; Southey, Melissa C.; Staines, Anthony; Teras, Lauren R.; Thompson, Carrie A.; Tilly, Hervé; Tinker, Lesley F.; Tjonneland, Anne; Turner, Jenny; Vajdic, Claire M.; Vermeulen, Roel C H; Vijai, Joseph; Vineis, Paolo; Virtamo, Jarmo; Wang, Zhaoming; Weinstein, Stephanie; Witzig, Thomas E.; Zelenetz, Andrew; Zeleniuch-Jacquotte, Anne; Zhang, Yawei; Zheng, Tongzhang; Zucca, Mariagrazia; Clarke, Ann E

    2017-01-01

    Objective: Determinants of the increased risk of diffuse large B-cell lymphoma (DLBCL) in SLE are unclear. Using data from a recent lymphoma genome-wide association study (GWAS), we assessed whether certain lupus-related single nucleotide polymorphisms (SNPs) were also associated with DLBCL.

  12. Estimated allele substitution effects underlying genomic evaluation models depend on the scaling of allele counts

    NARCIS (Netherlands)

    Bouwman, Aniek C.; Hayes, Ben J.; Calus, Mario P.L.

    2017-01-01

    Background: Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of

  13. WebaCGH: an interactive online tool for the analysis and display of array comparative genomic hybridisation data.

    Science.gov (United States)

    Frankenberger, Casey; Wu, Xiaolin; Harmon, Jerry; Church, Deanna; Gangi, Lisa M; Munroe, David J; Urzúa, Ulises

    2006-01-01

    Gene copy number variations occur both in normal cells and in numerous pathologies including cancer and developmental diseases. Array comparative genomic hybridisation (aCGH) is an emerging technology that allows detection of chromosomal gains and losses in a high-resolution format. When aCGH is performed on cDNA and oligonucleotide microarrays, the impact of DNA copy number on gene transcription profiles may be directly compared. We have created an online software tool, WebaCGH, that functions to (i) upload aCGH and gene transcription results from multiple experiments; (ii) identify significant aberrant regions using a local Z-score threshold in user-selected chromosomal segments subjected to smoothing with moving averages; and (iii) display results in a graphical format with full genome and individual chromosome views. In the individual chromosome display, data can be zoomed in/out in both dimensions (i.e. ratio and physical location) and plotted features can have 'mouse over' linking to outside databases to identify loci of interest. Uploaded data can be stored indefinitely for subsequent retrieval and analysis. WebaCGH was created as a Java-based web application using the open-source database MySQL. WebaCGH is freely accessible at http://129.43.22.27/WebaCGH/welcome.htm Xiaolin Wu (forestwu@mail.nih.gov) or Ulises Urzúa (uurzua@med.uchile.cl).

  14. Development and application of a 20K SNP array in potato

    NARCIS (Netherlands)

    Vos, Peter

    2016-01-01

    In this thesis the results are described of investigations of various application of genome wide SNP (single nucleotide polymorphism) markers. The set of SNP markers was identified by GBS (genotyping by sequencing) strategy. The resulting dataset of 129,156 SNPs across 83 tetraploid varieties was

  15. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  16. Patterns of genomic variation in the poplar rust fungus Melampsora larici-populina identify pathogenesis-related factors

    Directory of Open Access Journals (Sweden)

    Antoine ePersoons

    2014-09-01

    Full Text Available Melampsora larici-populina is a fungal pathogen responsible for foliar rust disease on poplar trees, which causes damage to forest plantations worldwide, particularly in Northern Europe. The reference genome of the isolate 98AG31 was previously sequenced using a whole genome shotgun strategy, revealing a large genome of 101 megabases containing 16,399 predicted genes, which included secreted protein genes representing poplar rust candidate effectors. In the present study, the genomes of 15 isolates collected over the past 20 years throughout the French territory, representing distinct virulence profiles, were characterized by massively parallel sequencing to assess genetic variation in the poplar rust fungus. Comparison to the reference genome revealed striking structural variations. Analysis of coverage and sequencing depth identified large missing regions between isolates related to the mating type loci. More than 611,824 single-nucleotide polymorphism (SNP positions were uncovered overall, indicating a remarkable level of polymorphism. Based on the accumulation of non-synonymous substitutions in coding sequences and the relative frequencies of synonymous and non-synonymous polymorphisms (i.e. PN/PS, we identify candidate genes that may be involved in fungal pathogenesis. Correlation between non-synonymous SNPs in genes encoding secreted proteins and pathotypes of the studied isolates revealed candidate genes potentially related to virulences 1, 6 and 8 of the poplar rust fungus.

  17. Nearly Neutral Evolution Across the Drosophila melanogaster Genome

    DEFF Research Database (Denmark)

    Esteve, David Castellano; James, Jennifer; Eyre-Walker, Adam

    2017-01-01

    Under the nearly neutral theory of molecular evolution the proportion of effectively neutral mutations is expected to depend upon the effective population size (Ne). Here we investigate whether this is the case across the genome of Drosophila melanogaster using polymorphism data from 128 North...

  18. New Tools for Embryo Selection: Comprehensive Chromosome Screening by Array Comparative Genomic Hybridization

    Directory of Open Access Journals (Sweden)

    Lorena Rodrigo

    2014-01-01

    Full Text Available The objective of this study was to evaluate the usefulness of comprehensive chromosome screening (CCS using array comparative genomic hybridization (aCGH. The study included 1420 CCS cycles for recurrent miscarriage (n=203; repetitive implantation failure (n=188; severe male factor (n=116; previous trisomic pregnancy (n=33; and advanced maternal age (n=880. CCS was performed in cycles with fresh oocytes and embryos (n=774; mixed cycles with fresh and vitrified oocytes (n=320; mixed cycles with fresh and vitrified day-2 embryos (n=235; and mixed cycles with fresh and vitrified day-3 embryos (n=91. Day-3 embryo biopsy was performed and analyzed by aCGH followed by day-5 embryo transfer. Consistent implantation (range: 40.5–54.2% and pregnancy rates per transfer (range: 46.0–62.9% were obtained for all the indications and independently of the origin of the oocytes or embryos. However, a lower delivery rate per cycle was achieved in women aged over 40 years (18.1% due to the higher percentage of aneuploid embryos (85.3% and lower number of cycles with at least one euploid embryo available per transfer (40.3%. We concluded that aneuploidy is one of the major factors which affect embryo implantation.

  19. Clinical utility of the X-chromosome array.

    Science.gov (United States)

    Zarate, Yuri A; Dwivedi, Alka; Bartel, Frank O; Bellomo, M Allison; Cathey, Sara S; Champaigne, Neena L; Clarkson, L Kate; Dupont, Barbara R; Everman, David B; Geer, Joseph S; Gordon, Barbara C; Lichty, Angie W; Lyons, Michael J; Rogers, R Curtis; Saul, Robert A; Schroer, Richard J; Skinner, Steven A; Stevenson, Roger E

    2013-01-01

    Previous studies have limited the use of specific X-chromosome array designed platforms to the evaluation of patients with intellectual disability. In this retrospective analysis, we reviewed the clinical utility of an X-chromosome array in a variety of scenarios. We divided patients according to the indication for the test into four defined categories: (1) autism spectrum disorders and/or developmental delay and/or intellectual disability (ASDs/DD/ID) with known family history of neurocognitive disorders; (2) ASDs/DD/ID without known family history of neurocognitive disorders; (3) breakpoint definition of an abnormality detected by a different cytogenetic test; and (4) evaluation of suspected or known X-linked conditions. A total of 59 studies were ordered with 27 copy number variants detected in 25 patients (25/59 = 42%). The findings were deemed pathogenic/likely pathogenic (16/59 = 27%), benign (4/59 = 7%) or uncertain (7/59 = 12%). We place particular emphasis on the utility of this test for the diagnostic evaluation of families affected with X-linked conditions and how it compares to whole genome arrays in this setting. In conclusion, the X-chromosome array frequently detects genomic alterations of the X chromosome and it has advantages when evaluating some specific X-linked conditions. However, careful interpretation and correlation with clinical findings is needed to determine the significance of such changes. When the X-chromosome array was used to confirm a suspected X-linked condition, it had a yield of 63% (12/19) and was useful in the evaluation and risk assessment of patients and families. Copyright © 2012 Wiley Periodicals, Inc.

  20. Associations of activated coagulation factor VII and factor VIIa-antithrombin levels with genome-wide polymorphisms and cardiovascular disease risk.

    Science.gov (United States)

    Olson, N C; Raffield, L M; Lange, L A; Lange, E M; Longstreth, W T; Chauhan, G; Debette, S; Seshadri, S; Reiner, A P; Tracy, R P

    2018-01-01

    Essentials A fraction of coagulation factor VII circulates in blood as an activated protease (FVIIa). We evaluated FVIIa and FVIIa-antithrombin (FVIIa-AT) levels in the Cardiovascular Health Study. Polymorphisms in the F7 and PROCR loci were associated with FVIIa and FVIIa-AT levels. FVIIa may be an ischemic stroke risk factor in older adults and FVIIa-AT may assess mortality risk. Background A fraction of coagulation factor (F) VII circulates as an active protease (FVIIa). FVIIa also circulates as an inactivated complex with antithrombin (FVIIa-AT). Objective Evaluate associations of FVIIa and FVIIa-AT with genome-wide single nucleotide polymorphisms (SNPs) and incident coronary heart disease, ischemic stroke and mortality. Patients/Methods We measured FVIIa and FVIIa-AT in 3486 Cardiovascular Health Study (CHS) participants. We performed a genome-wide association scan for FVIIa and FVIIa-AT in European-Americans (n = 2410) and examined associations of FVII phenotypes with incident cardiovascular disease. Results In European-Americans, the most significant SNP for FVIIa and FVIIa-AT was rs1755685 in the F7 promoter region on chromosome 13 (FVIIa, β = -25.9 mU mL -1 per minor allele; FVIIa-AT, β = -26.6 pm per minor allele). Phenotypes were also associated with rs867186 located in PROCR on chromosome 20 (FVIIa, β = 7.8 mU mL -1 per minor allele; FVIIa-AT, β = 9.9 per minor allele). Adjusted for risk factors, a one standard deviation higher FVIIa was associated with increased risk of ischemic stroke (hazard ratio [HR], 1.12; 95% confidence interval [CI], 1.01, 1.23). Higher FVIIa-AT was associated with mortality from all causes (HR, 1.08; 95% CI, 1.03, 1.12). Among European-American CHS participants the rs1755685 minor allele was associated with lower ischemic stroke (HR, 0.69; 95% CI, 0.54, 0.88), but this association was not replicated in a larger multi-cohort analysis. Conclusions The results support the importance of the F7 and PROCR loci in

  1. Genome-wide association study of pre-harvest sprouting resistance in Chinese wheat founder parents

    Directory of Open Access Journals (Sweden)

    Yu Lin

    2017-07-01

    Full Text Available Abstract Pre-harvest sprouting (PHS is a major abiotic factor affecting grain weight and quality, and is caused by an early break in seed dormancy. Association mapping (AM is used to detect correlations between phenotypes and genotypes based on linkage disequilibrium (LD in wheat breeding programs. We evaluated seed dormancy in 80 Chinese wheat founder parents in five environments and performed a genome-wide association study using 6,057 markers, including 93 simple sequence repeat (SSR, 1,472 diversity array technology (DArT, and 4,492 single nucleotide polymorphism (SNP markers. The general linear model (GLM and the mixed linear model (MLM were used in this study, and two significant markers (tPt-7980 and wPt-6457 were identified. Both markers were located on Chromosome 1B, with wPt-6457 having been identified in a previously reported chromosomal position. The significantly associated loci contain essential information for cloning genes related to resistance to PHS and can be used in wheat breeding programs.

  2. Genome shotgun sequencing and development of microsatellite ...

    African Journals Online (AJOL)

    Analysis of the gerbera genome DNA ('Raon') general library showed that sequences of (AT), (AG), (AAG) and (AAT) repeats appeared most often, whereas (AC), (AAC) and (ACC) were the least frequent. Primer pairs were designed for 80 loci. Only eight primer pairs produced reproducible polymorphic bands in the 28 ...

  3. The Chlamydia psittaci genome: a comparative analysis of intracellular pathogens.

    Directory of Open Access Journals (Sweden)

    Anja Voigt

    Full Text Available Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis.A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host-range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins.This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights in the genome architecture of C. psittaci and proposes a number of novel candidate genes mostly of yet unknown function that may be important for pathogen-host interactions.

  4. The Chlamydia psittaci genome: a comparative analysis of intracellular pathogens.

    Science.gov (United States)

    Voigt, Anja; Schöfl, Gerhard; Saluz, Hans Peter

    2012-01-01

    Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis. A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host-range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins. This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights in the genome architecture of C. psittaci and proposes a number of novel candidate genes mostly of yet unknown function that may be important for pathogen-host interactions.

  5. Using RNA-Seq to assemble a rose transcriptome with more than 13,000 full-length expressed genes and to develop the WagRhSNP 68k Axiom SNP array for rose (Rosa L.).

    Science.gov (United States)

    Koning-Boucoiran, Carole F S; Esselink, G Danny; Vukosavljev, Mirjana; van 't Westende, Wendy P C; Gitonga, Virginia W; Krens, Frans A; Voorrips, Roeland E; van de Weg, W Eric; Schulz, Dietmar; Debener, Thomas; Maliepaard, Chris; Arens, Paul; Smulders, Marinus J M

    2015-01-01

    In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.

  6. Retroelement insertional polymorphisms, diversity and phylogeography within diploid, D-genome Aegilops tauschii (Triticeae, Poaceae) sub-taxa in Iran.

    Science.gov (United States)

    Saeidi, Hojjatollah; Rahiminejad, Mohammad Reza; Heslop-Harrison, J S

    2008-04-01

    The diploid goat grass Aegilops tauschii (2n = 2x = 14) is native to the Middle East and is the D-genome donor to hexaploid bread wheat. The aim of this study was to measure the diversity of different subspecies and varieties of wild Ae. tauschii collected across the major areas where it grows in Iran and to examine patterns of diversity related to the taxa and geography. Inter-retroelement amplified polymorphism (IRAP) markers were used to analyse the biodiversity of DNA from 57 accessions of Ae. tauschii from northern and central Iran, and two hexaploid wheats. Key Results Eight IRAP primer combinations amplified a total of 171 distinct DNA fragments between 180 and 3200 bp long from the accessions, of which 169 were polymorphic. On average, about eight fragments were amplified with each primer combination, with more bands being amplified from accessions from the north-west of the country than from other accessions. The IRAP markers showed high levels of genetic diversity. Analysis of all accessions together did not allow the allocation of individuals to taxa based on morphology, but showed a tendency to put accessions from the north-west apart from others regions. It is speculated that this could be due to different activity of retroelements in the different regions. Within the two taxa with most accessions, there was a range of IRAP genotypes that could be correlated closely with geographical origin. This supports suggestions that the centre of origin of the species is towards the south-east of the Caspian Sea. IRAP is an appropriate marker system to evaluate genetic diversity and evolutionary relationships within the taxa, but it is too variable to define the taxa themselves, where more slowly evolving morphological, DNA sequence or chromosomal makers may be more appropriate.

  7. PTH Gene Polymorphism and Breast Cancer Risk in Kazakhstan

    Directory of Open Access Journals (Sweden)

    Nurgul Sikhayeva

    2014-12-01

    Full Text Available Introduction. Breast cancer is the most common type of cancer among women. In Kazakhstan, breast cancer holds first place among causes of women death caused by cancer in the 45-55 year age group . Many studies have shown that the risk of acquiring breast cancer may be related to the level of calcium in the blood serum. One of the important regulators of calcium metabolism in the body is the parathyroid hormone. Single nucleotide polymorphisms in the gene encoding the parathyroid hormone (PTH are associated with breast cancer development risk, and may modify the associative interaction between the levels of calcium intake and breast cancer. Experimental studies have shown that PTH gene has a carcinogenic effect. At least three studies showed a weak positive correlation between the risk of acquiring breast cancer and primary hyperparathyroidism, a state with high levels of PTH and often high levels of calcium. The aim of this investigation was to evaluate potential association between PTH gene polymorphism and breast cancer risk among Kazakhstani women.Methods. Female breast cancer patients (n = 429 and matched control women (n = 373 were recruited into a case – control study,. Genomic DNA was extracted from peripheral venous blood of study participants using Wizard® Genomic DNA Purification Kit (Promega, USA. Detection of PTH gene polymorphism (rs1459015 was done by means of the TaqMan® SNP Genotyping Assay of real-time PCR. Statistical analysis was conducted using SPSS 19.0.Results. PTH gene alleles were in Hardy–Weinberg equilibrium (p > 0.05. Distribution was 59% CC, 35% CT, 6% TT in the group with breast cancer and 50% CC, 43% CT, 6% TT in the control group. Total difference (between the group with breast cancer and the control group in allele frequencies for PTH polymorphism was not significant (p > 0.05. No association was found between rs1459015 TT and breast cancer risk (OR = 1.039; 95%, CI 0.740 - 1.297; p = 0.893.Conclusion. We

  8. Personalized Medicine in a New Genomic Era: Ethical and Legal Aspects.

    Science.gov (United States)

    Shoaib, Maria; Rameez, Mansoor Ali Merchant; Hussain, Syed Ather; Madadin, Mohammed; Menezes, Ritesh G

    2017-08-01

    The genome of two completely unrelated individuals is quite similar apart from minor variations called single nucleotide polymorphisms which contribute to the uniqueness of each and every person. These single nucleotide polymorphisms are of great interest clinically as they are useful in figuring out the susceptibility of certain individuals to particular diseases and for recognizing varied responses to pharmacological interventions. This gives rise to the idea of 'personalized medicine' as an exciting new therapeutic science in this genomic era. Personalized medicine suggests a unique treatment strategy based on an individual's genetic make-up. Its key principles revolve around applied pharmaco-genomics, pharmaco-kinetics and pharmaco-proteomics. Herein, the ethical and legal aspects of personalized medicine in a new genomic era are briefly addressed. The ultimate goal is to comprehensively recognize all relevant forms of genetic variation in each individual and be able to interpret this information in a clinically meaningful manner within the ambit of ethical and legal considerations. The authors of this article firmly believe that personalized medicine has the potential to revolutionize the current landscape of medicine as it makes its way into clinical practice.

  9. Validation of the high-throughput marker technology DArT using the model plant Arabidopsis thaliana

    NARCIS (Netherlands)

    Wittenberg, A.H.J.; Lee, van der T.A.J.; Cayla, C.; Kilian, A.; Visser, R.G.F.; Schouten, H.J.

    2005-01-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds of restriction site based polymorphisms between genotypes and does not require DNA sequence information or

  10. Fourteen polymorphic microsatellite markers for the fungal banana pathogen Mycosphaerella fijiensis.

    Science.gov (United States)

    Yang, Bao Jun; Zhong, Shao Bin

    2008-07-01

    Fourteen polymorphic microsatellite markers were developed for Mycosphaerella fijiensis, a fungus causing the black sigatoka disease in banana. The sequenced genome of M. fijiensis was screened for sequences with single sequence repeats (SSRs) using a Perl script. Fourteen SSR loci, evaluated on 48 M. fijiensis isolates from Hawaii, were identified to be highly polymorphic. These markers revealed two to 19 alleles, with an average of 6.43 alleles per locus. The estimated gene diversity ranged from 0.091 to 0.930 across the 14 microsatellite loci. The SSR markers developed would be useful for population genetics studies of M. fijiensis. © 2008 The Authors. Journal compilation © 2008 Blackwell Publishing Ltd.

  11. Polymorphism in hybrid male sterility in wild-derived Mus musculus musculus strains on proximal chromosome 17.

    Science.gov (United States)

    Vyskocilová, Martina; Prazanová, Gabriela; Piálek, Jaroslav

    2009-02-01

    The hybrid sterility-1 (Hst1) locus at Chr 17 causes male sterility in crosses between the house mouse subspecies Mus musculus domesticus (Mmd) and M. m. musculus (Mmm). This locus has been defined by its polymorphic variants in two laboratory strains (Mmd genome) when mated to PWD/Ph mice (Mmm genome): C57BL/10 (carrying the sterile allele) and C3H (fertile allele). The occurrence of sterile and/or fertile (wild Mmm x C57BL)F1 males is evidence that polymorphism for this trait also exists in natural populations of Mmm; however, the nature of this polymorphism remains unclear. Therefore, we derived two wild-origin Mmm strains, STUS and STUF, that produce sterile and fertile males, respectively, in crosses with C57BL mice. To determine the genetic basis underlying male fertility, the (STUS x STUF)F1 females were mated to C57BL/10 J males. About one-third of resulting hybrid males (33.8%) had a significantly smaller epididymis and testes than parental animals and lacked spermatozoa due to meiotic arrest. A further one-fifth of males (20.3%) also had anomalous reproductive traits but produced some spermatozoa. The remaining fertile males (45.9%) displayed no deviation from values found in parental individuals. QTL analysis of the progeny revealed strong associations of male fitness components with the proximal end of Chr 17, and a significant effect of the central section of Chr X on testes mass. The data suggest that genetic incompatibilities associated with male sterility have evolved independently at the proximal end of Chr 17 and are polymorphic within both Mmd and Mmm genomes.

  12. A new normalizing algorithm for BAC CGH arrays with quality control metrics.

    Science.gov (United States)

    Miecznikowski, Jeffrey C; Gaile, Daniel P; Liu, Song; Shepherd, Lori; Nowak, Norma

    2011-01-01

    The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model this data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose "SmoothArray", a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays and we show the effects on a cancer dataset. As part of our R software package "aCGHplus," this freely available algorithm removes the variation due to the intensity effects, pin/print-tip, the spatial location on the microarray chip, and the relative location from the well plate. removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, the user can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.

  13. A genome-wide association study in American Indians implicates DNER as a susceptibility locus for type 2 diabetes.

    Science.gov (United States)

    Hanson, Robert L; Muller, Yunhua L; Kobes, Sayuko; Guo, Tingwei; Bian, Li; Ossowski, Victoria; Wiedrich, Kim; Sutherland, Jeffrey; Wiedrich, Christopher; Mahkee, Darin; Huang, Ke; Abdussamad, Maryam; Traurig, Michael; Weil, E Jennifer; Nelson, Robert G; Bennett, Peter H; Knowler, William C; Bogardus, Clifton; Baier, Leslie J

    2014-01-01

    Most genetic variants associated with type 2 diabetes mellitus (T2DM) have been identified through genome-wide association studies (GWASs) in Europeans. The current study reports a GWAS for young-onset T2DM in American Indians. Participants were selected from a longitudinal study conducted in Pima Indians and included 278 cases with diabetes with onset before 25 years of age, 295 nondiabetic controls ≥45 years of age, and 267 siblings of cases or controls. Individuals were genotyped on a ∼1M single nucleotide polymorphism (SNP) array, resulting in 453,654 SNPs with minor allele frequency >0.05. SNPs were analyzed for association in cases and controls, and a family-based association test was conducted. Tag SNPs (n = 311) were selected for 499 SNPs associated with diabetes (P associated with T2DM (odds ratio = 1.29 per copy of the T allele; P = 6.6 × 10(-8), which represents genome-wide significance accounting for the number of effectively independent SNPs analyzed). Transfection studies in murine pancreatic β-cells suggested that DNER regulates expression of notch signaling pathway genes. These studies implicate DNER as a susceptibility gene for T2DM in American Indians.

  14. Flexible and efficient genome tiling design with penalized uniqueness score

    Directory of Open Access Journals (Sweden)

    Du Yang

    2012-12-01

    Full Text Available Abstract Background As a powerful tool in whole genome analysis, tiling array has been widely used in the answering of many genomic questions. Now it could also serve as a capture device for the library preparation in the popular high throughput sequencing experiments. Thus, a flexible and efficient tiling array design approach is still needed and could assist in various types and scales of transcriptomic experiment. Results In this paper, we address issues and challenges in designing probes suitable for tiling array applications and targeted sequencing. In particular, we define the penalized uniqueness score, which serves as a controlling criterion to eliminate potential cross-hybridization, and a flexible tiling array design pipeline. Unlike BLAST or simple suffix array based methods, computing and using our uniqueness measurement can be more efficient for large scale design and require less memory. The parameters provided could assist in various types of genomic tiling task. In addition, using both commercial array data and experiment data we show, unlike previously claimed, that palindromic sequence exhibiting relatively lower uniqueness. Conclusions Our proposed penalized uniqueness score could serve as a better indicator for cross hybridization with higher sensitivity and specificity, giving more control of expected array quality. The flexible tiling design algorithm incorporating the penalized uniqueness score was shown to give higher coverage and resolution. The package to calculate the penalized uniqueness score and the described probe selection algorithm are implemented as a Perl program, which is freely available at http://www1.fbn-dummerstorf.de/en/forschung/fbs/fb3/paper/2012-yang-1/OTAD.v1.1.tar.gz.

  15. Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome.

    Science.gov (United States)

    Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O'Connor, Timothy D; Abecasis, Gonçalo R; Wojcik, Genevieve L; Gignoux, Christopher R; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E; Bustamante, Carlos; Beaty, Terri H; Mathias, Rasika A; Barnes, Kathleen C; Qin, Zhaohui S

    2017-04-21

    A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an 'African Diaspora Power Chip' (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry.

  16. Novel fluorescent sequence-related amplified polymorphism(FSRAP markers for the construction of a genetic linkage map of wheat(Triticum aestivum L.

    Directory of Open Access Journals (Sweden)

    Zhao Lingbo

    2017-01-01

    Full Text Available Novel fluorescent sequence-related amplified polymorphism (FSRAP markers were developed based on the SRAP molecular marker. Then, the FSRAP markers were used to construct the genetic map of a wheat (Triticum aestivumL. recombinant inbred line population derived from a Chuanmai 42×Chuannong 16 cross. Reproducibility and polymorphism tests indicated that the FSRAP markers have repeatability and better reflect the polymorphism of wheat varieties compared with SRAP markers. A total of 430 polymorphic loci between Chuanmai 42 and Chuannong 16 were detected with 189 FSRAP primer combinations. A total of 281 FSARP markers and 39 SSR markers re classified into 20 linkage groups. The maps spanned a total length of 2499.3cM with an average distance of 7.81cM between markers. A total of 201 markers were mapped on the B genome and covered a distance of 1013cM. On the A genome, 84 markers were mapped and covered a distance of 849.6cM. On the D genome, however, only 35 markers were mapped and covered a distance of 636.7cM. No FSRAP markers were distributed on the 7D chromosome. The results of the present study revealed that the novel FSRAP markers can be used to generate dense, uniform genetic maps of wheat.

  17. Comparison of relative efficiency of genomic SSR and EST-SSR markers in estimating genetic diversity in sugarcane.

    Science.gov (United States)

    Parthiban, S; Govindaraj, P; Senthilkumar, S

    2018-03-01

    Twenty-five primer pairs developed from genomic simple sequence repeats (SSR) were compared with 25 expressed sequence tags (EST) SSRs to evaluate the efficiency of these two sets of primers using 59 sugarcane genetic stocks. The mean polymorphism information content (PIC) of genomic SSR was higher (0.72) compared to the PIC value recorded by EST-SSR marker (0.62). The relatively low level of polymorphism in EST-SSR markers may be due to the location of these markers in more conserved and expressed sequences compared to genomic sequences which are spread throughout the genome. Dendrogram based on the genomic SSR and EST-SSR marker data showed differences in grouping of genotypes. A total of 59 sugarcane accessions were grouped into 6 and 4 clusters using genomic SSR and EST-SSR, respectively. The highly efficient genomic SSR could subcluster the genotypes of some of the clusters formed by EST-SSR markers. The difference in dendrogram observed was probably due to the variation in number of markers produced by genomic SSR and EST-SSR and different portion of genome amplified by both the markers. The combined dendrogram (genomic SSR and EST-SSR) more clearly showed the genetic relationship among the sugarcane genotypes by forming four clusters. The mean genetic similarity (GS) value obtained using EST-SSR among 59 sugarcane accessions was 0.70, whereas the mean GS obtained using genomic SSR was 0.63. Although relatively lower level of polymorphism was displayed by the EST-SSR markers, genetic diversity shown by the EST-SSR was found to be promising as they were functional marker. High level of PIC and low genetic similarity values of genomic SSR may be more useful in DNA fingerprinting, selection of true hybrids, identification of variety specific markers and genetic diversity analysis. Identification of diverse parents based on cluster analysis can be effectively done with EST-SSR as the genetic similarity estimates are based on functional attributes related to

  18. In Vitro vs In Silico Detected SNPs for the Development of a Genotyping Array: What Can We Learn from a Non-Model Species?

    Science.gov (United States)

    Lepoittevin, Camille; Frigerio, Jean-Marc; Garnier-Géré, Pauline; Salin, Franck; Cervera, María-Teresa; Vornam, Barbara; Harvengt, Luc; Plomion, Christophe

    2010-01-01

    Background There is considerable interest in the high-throughput discovery and genotyping of single nucleotide polymorphisms (SNPs) to accelerate genetic mapping and enable association studies. This study provides an assessment of EST-derived and resequencing-derived SNP quality in maritime pine (Pinus pinaster Ait.), a conifer characterized by a huge genome size (∼23.8 Gb/C). Methodology/Principal Findings A 384-SNPs GoldenGate genotyping array was built from i/ 184 SNPs originally detected in a set of 40 re-sequenced candidate genes (in vitro SNPs), chosen on the basis of functionality scores, presence of neighboring polymorphisms, minor allele frequencies and linkage disequilibrium and ii/ 200 SNPs screened from ESTs (in silico SNPs) selected based on the number of ESTs used for SNP detection, the SNP minor allele frequency and the quality of SNP flanking sequences. The global success rate of the assay was 66.9%, and a conversion rate (considering only polymorphic SNPs) of 51% was achieved. In vitro SNPs showed significantly higher genotyping-success and conversion rates than in silico SNPs (+11.5% and +18.5%, respectively). The reproducibility was 100%, and the genotyping error rate very low (0.54%, dropping down to 0.06% when removing four SNPs showing elevated error rates). Conclusions/Significance This study demonstrates that ESTs provide a resource for SNP identification in non-model species, which do not require any additional bench work and little bio-informatics analysis. However, the time and cost benefits of in silico SNPs are counterbalanced by a lower conversion rate than in vitro SNPs. This drawback is acceptable for population-based experiments, but could be dramatic in experiments involving samples from narrow genetic backgrounds. In addition, we showed that both the visual inspection of genotyping clusters and the estimation of a per SNP error rate should help identify markers that are not suitable to the GoldenGate technology in species

  19. In vitro vs in silico detected SNPs for the development of a genotyping array: what can we learn from a non-model species?

    Directory of Open Access Journals (Sweden)

    Camille Lepoittevin

    2010-06-01

    Full Text Available There is considerable interest in the high-throughput discovery and genotyping of single nucleotide polymorphisms (SNPs to accelerate genetic mapping and enable association studies. This study provides an assessment of EST-derived and resequencing-derived SNP quality in maritime pine (Pinus pinaster Ait., a conifer characterized by a huge genome size ( approximately 23.8 Gb/C.A 384-SNPs GoldenGate genotyping array was built from i/ 184 SNPs originally detected in a set of 40 re-sequenced candidate genes (in vitro SNPs, chosen on the basis of functionality scores, presence of neighboring polymorphisms, minor allele frequencies and linkage disequilibrium and ii/ 200 SNPs screened from ESTs (in silico SNPs selected based on the number of ESTs used for SNP detection, the SNP minor allele frequency and the quality of SNP flanking sequences. The global success rate of the assay was 66.9%, and a conversion rate (considering only polymorphic SNPs of 51% was achieved. In vitro SNPs showed significantly higher genotyping-success and conversion rates than in silico SNPs (+11.5% and +18.5%, respectively. The reproducibility was 100%, and the genotyping error rate very low (0.54%, dropping down to 0.06% when removing four SNPs showing elevated error rates.This study demonstrates that ESTs provide a resource for SNP identification in non-model species, which do not require any additional bench work and little bio-informatics analysis. However, the time and cost benefits of in silico SNPs are counterbalanced by a lower conversion rate than in vitro SNPs. This drawback is acceptable for population-based experiments, but could be dramatic in experiments involving samples from narrow genetic backgrounds. In addition, we showed that both the visual inspection of genotyping clusters and the estimation of a per SNP error rate should help identify markers that are not suitable to the GoldenGate technology in species characterized by a large and complex genome.

  20. First High-Density Linkage Map and Single Nucleotide Polymorphisms Significantly Associated With Traits of Economic Importance in Yellowtail Kingfish Seriola lalandi

    Directory of Open Access Journals (Sweden)

    Nguyen H. Nguyen

    2018-04-01

    Full Text Available The genetic resources available for the commercially important fish species Yellowtail kingfish (YTK (Seriola lalandi are relative sparse. To overcome this, we aimed (1 to develop a linkage map for this species, and (2 to identify markers/variants associated with economically important traits in kingfish (with an emphasis on body weight. Genetic and genomic analyses were conducted using 13,898 single nucleotide polymorphisms (SNPs generated from a new high-throughput genotyping by sequencing platform, Diversity Arrays Technology (DArTseqTM in a pedigreed population comprising 752 animals. The linkage analysis enabled to map about 4,000 markers to 24 linkage groups (LGs, with an average density of 3.4 SNPs per cM. The linkage map was integrated into a genome-wide association study (GWAS and identified six variants/SNPs associated with body weight (P < 5e-8 when a multi-locus mixed model was used. Two out of the six significant markers were mapped to LGs 17 and 23, and collectively they explained 5.8% of the total genetic variance. It is concluded that the newly developed linkage map and the significantly associated markers with body weight provide fundamental information to characterize genetic architecture of growth-related traits in this population of YTK S. lalandi.

  1. First High-Density Linkage Map and Single Nucleotide Polymorphisms Significantly Associated With Traits of Economic Importance in Yellowtail Kingfish Seriola lalandi.

    Science.gov (United States)

    Nguyen, Nguyen H; Rastas, Pasi M A; Premachandra, H K A; Knibb, Wayne

    2018-01-01

    The genetic resources available for the commercially important fish species Yellowtail kingfish (YTK) ( Seriola lalandi) are relative sparse. To overcome this, we aimed (1) to develop a linkage map for this species, and (2) to identify markers/variants associated with economically important traits in kingfish (with an emphasis on body weight). Genetic and genomic analyses were conducted using 13,898 single nucleotide polymorphisms (SNPs) generated from a new high-throughput genotyping by sequencing platform, Diversity Arrays Technology (DArTseq TM ) in a pedigreed population comprising 752 animals. The linkage analysis enabled to map about 4,000 markers to 24 linkage groups (LGs), with an average density of 3.4 SNPs per cM. The linkage map was integrated into a genome-wide association study (GWAS) and identified six variants/SNPs associated with body weight ( P 5e -8 ) when a multi-locus mixed model was used. Two out of the six significant markers were mapped to LGs 17 and 23, and collectively they explained 5.8% of the total genetic variance. It is concluded that the newly developed linkage map and the significantly associated markers with body weight provide fundamental information to characterize genetic architecture of growth-related traits in this population of YTK S. lalandi .

  2. The amphioxus genome and the evolution of the chordate karyotype

    Energy Technology Data Exchange (ETDEWEB)

    Putnam, Nicholas H.; Butts, Thomas; Ferrier, David E.K.; Furlong, Rebecca F.; Hellsten, Uffe; Kawashima, Takeshi; Robinson-Rechavi, Marc; Shoguchi, Eiichi; Terry, Astrid; Yu, Jr-Kai; Benito-Gutierrez, Elia; Dubchak, Inna; Garcia-Fernandez, Jordi; Gibson-Brown, Jeremy J.; Grigoriev, Igor V.; Horton, Amy C.; de Jong, Pieter J.; Jurka, Jerzy; Kapitonov, Vladimir; Kohara, Yuji; Kuroki, Yoko; Lindquist, Erika; Lucas, Susan; Osoegawa, Kazutoyo; Pennacchio, Len A.; Salamov, Asaf A.; Satou, Yutaka; Sauka-Spengler, Tatjana; Schmutz[, Jeremy; Shin-I, Tadasu; Toyoda, Atsushi; Bronner-Fraser, Marianne; Fujiyama, Asao; Holland, Linda Z.; Holland, Peter W. H.; Satoh, Nori; Rokhsar, Daniel S.

    2008-04-01

    Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage with a fossil record dating back to the Cambrian. We describe the structure and gene content of the highly polymorphic {approx}520 million base pair genome of the Florida lancelet Branchiostoma floridae, and analyze it in the context of chordate evolution. Whole genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets, and vertebrates), and allow reconstruction of not only the gene complement of the last common chordate ancestor, but also a partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.

  3. Data analysis in the post-genome-wide association study era

    Directory of Open Access Journals (Sweden)

    Qiao-Ling Wang

    2016-12-01

    Full Text Available Since the first report of a genome-wide association study (GWAS on human age-related macular degeneration, GWAS has successfully been used to discover genetic variants for a variety of complex human diseases and/or traits, and thousands of associated loci have been identified. However, the underlying mechanisms for these loci remain largely unknown. To make these GWAS findings more useful, it is necessary to perform in-depth data mining. The data analysis in the post-GWAS era will include the following aspects: fine-mapping of susceptibility regions to identify susceptibility genes for elucidating the biological mechanism of action; joint analysis of susceptibility genes in different diseases; integration of GWAS, transcriptome, and epigenetic data to analyze expression and methylation quantitative trait loci at the whole-genome level, and find single-nucleotide polymorphisms that influence gene expression and DNA methylation; genome-wide association analysis of disease-related DNA copy number variations. Applying these strategies and methods will serve to strengthen GWAS data to enhance the utility and significance of GWAS in improving understanding of the genetics of complex diseases or traits and translate these findings for clinical applications. Keywords: Genome-wide association study, Data mining, Integrative data analysis, Polymorphism, Copy number variation

  4. Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

    Science.gov (United States)

    McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

    2017-03-01

    Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognonomic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  5. Identification and analysis of genome-wide SNPs provide insight into signatures of selection and domestication in channel catfish (Ictalurus punctatus.

    Directory of Open Access Journals (Sweden)

    Luyang Sun

    Full Text Available Domestication and selection for important performance traits can impact the genome, which is most often reflected by reduced heterozygosity in and surrounding genes related to traits affected by selection. In this study, analysis of the genomic impact caused by domestication and artificial selection was conducted by investigating the signatures of selection using single nucleotide polymorphisms (SNPs in channel catfish (Ictalurus punctatus. A total of 8.4 million candidate SNPs were identified by using next generation sequencing. On average, the channel catfish genome harbors one SNP per 116 bp. Approximately 6.6 million, 5.3 million, 4.9 million, 7.1 million and 6.7 million SNPs were detected in the Marion, Thompson, USDA103, Hatchery strain, and wild population, respectively. The allele frequencies of 407,861 SNPs differed significantly between the domestic and wild populations. With these SNPs, 23 genomic regions with putative selective sweeps were identified that included 11 genes. Although the function for the majority of the genes remain unknown in catfish, several genes with known function related to aquaculture performance traits were included in the regions with selective sweeps. These included hypoxia-inducible factor 1β. HIFιβ.. and the transporter gene ATP-binding cassette sub-family B member 5 (ABCB5. HIF1β. is important for response to hypoxia and tolerance to low oxygen levels is a critical aquaculture trait. The large numbers of SNPs identified from this study are valuable for the development of high-density SNP arrays for genetic and genomic studies of performance traits in catfish.

  6. Analysis of Vitamin D Receptor (VDR Gene Polymorphisms in Alopecia Areata

    Directory of Open Access Journals (Sweden)

    Omer Ates

    2016-09-01

    Full Text Available Aim: Alopecia areata (AA is a disease characterized with hair loss on the hair skin any region of the body. This disease affects approximately 1%u20132% of the general population. The etiopathogenesis of this disease is unclear but infections, genetic, psychological and autoimmune factors is known play to role. Vitamin D is thought to be a regulator of the immune system and the action of it is dependent on the vitamin D receptor (VDR. Given the autoimmune component shared by this autoimmune diseases. In this study investigated the role of VDR gene polymorphisms in the development of AA. Material and Method: The study group included 198 patients with AA and 167 control. Genomic DNA was extracted from blood samples using DNA isolation kit. The frequency of VDR gene polymorphisms genotypes and allelic variants were analyzed by using Polymerase Chain Reaction (PCR and Restriction Fragment Length Polymorphisms (RFLP method. Results: Statistical evaluation of data results showed a not significant association for genotypic frequency distribution between the VDR gene BsmI (rs1544410 and ApaI (rs7975232, TaqI (rs731236 polymorphisms and AA (p=0.8891, 0.7309, 0.6761, respectively. Discussion: Our study reflects that VDR gene polymorphisms could not play a role in determining genetic susceptibility to AA.

  7. Incidental finding of alpha-methylacyl-CoA racemase deficiency in a patient with oculocutaneous albinism type 4

    NARCIS (Netherlands)

    Verhagen, Judith M. A.; Huijmans, Jan G.; Williams, Monique; van Ruyven, Rutger L. J.; Bergen, Arthur A. B.; Wouters, Cokkie H.; Brooks, Alice S.

    2012-01-01

    Genome-wide studies may lead to the discovery of genetic variants of potential clinical importance beyond the aims of the study. We performed single nucleotide polymorphism array analysis in a boy with oculocutaneous albinism to identify copy-neutral regions of homozygosity harboring genes involved

  8. CGH and SNP array using DNA extracted from fixed cytogenetic preparations and long-term refrigerated bone marrow specimens

    Directory of Open Access Journals (Sweden)

    MacKinnon Ruth N

    2012-02-01

    Full Text Available Abstract Background The analysis of nucleic acids is limited by the availability of archival specimens and the quality and amount of the extracted material. Archived cytogenetic preparations are stored in many laboratories and are a potential source of total genomic DNA for array karyotyping and other applications. Array CGH using DNA from fixed cytogenetic preparations has been described, but it is not known whether it can be used for SNP arrays. Diagnostic bone marrow specimens taken during the assessment of hematological malignancies are also a potential source of DNA, but it is generally assumed that DNA must be extracted, or the specimen frozen, within a day or two of collection, to obtain DNA suitable for further analysis. We have assessed DNA extracted from these materials for both SNP array and array CGH. Results We show that both SNP array and array CGH can be performed on genomic DNA extracted from cytogenetic specimens stored in Carnoy's fixative, and from bone marrow which has been stored unfrozen, at 4°C, for at least 36 days. We describe a procedure for extracting a usable concentration of total genomic DNA from cytogenetic suspensions of low cellularity. Conclusions The ability to use these archival specimens for DNA-based analysis increases the potential for retrospective genetic analysis of clinical specimens. Fixed cytogenetic preparations and long-term refrigerated bone marrow both provide DNA suitable for array karyotyping, and may be suitable for a wider range of analytical procedures.

  9. Improving accuracy of rare variant imputation with a two-step imputation approach

    DEFF Research Database (Denmark)

    Kreiner-Møller, Eskil; Medina-Gomez, Carolina; Uitterlinden, André G

    2015-01-01

    not being comprehensively scrutinized. Next-generation arrays ensuring sufficient coverage together with new reference panels, as the 1000 Genomes panel, are emerging to facilitate imputation of low frequent single-nucleotide polymorphisms (minor allele frequency (MAF) ... reference sample genotyped on a dense array and hereafter to the 1000 Genomes reference panel. We show that mean imputation quality, measured by the r(2) using this approach, increases by 28% for variants with a MAF between 1 and 5% as compared with direct imputation to 1000 Genomes reference. Similarly......Genotype imputation has been the pillar of the success of genome-wide association studies (GWAS) for identifying common variants associated with common diseases. However, most GWAS have been run using only 60 HapMap samples as reference for imputation, meaning less frequent and rare variants...

  10. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Abstract Background Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM analysis. Of these, 52 (54% were polymorphic between parents of the Ogle1040 × TAM O-301 (OT mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. These techniques provide

  11. Genome position and gene amplification

    Czech Academy of Sciences Publication Activity Database

    Jirsová, Pavla; Snijders, A.M.; Kwek, S.; Roydasgupta, R.; Fridlyand, J.; Tokuyasu, T.; Pinkel, D.; Albertson, D. G.

    2007-01-01

    Roč. 8, č. 6 (2007), r120 ISSN 1474-760X Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702 Keywords : gene amplification * array comparative genomic hybridization * oncogene Subject RIV: BO - Biophysics Impact factor: 6.589, year: 2007

  12. Epidemiology of transmissible diseases: Array hybridization and next generation sequencing as universal nucleic acid-mediated typing tools.

    Science.gov (United States)

    Michael Dunne, W; Pouseele, Hannes; Monecke, Stefan; Ehricht, Ralf; van Belkum, Alex

    2017-09-21

    The magnitude of interest in the epidemiology of transmissible human diseases is reflected in the vast number of tools and methods developed recently with the expressed purpose to characterize and track evolutionary changes that occur in agents of these diseases over time. Within the past decade a new suite of such tools has become available with the emergence of the so-called "omics" technologies. Among these, two are exponents of the ongoing genomic revolution. Firstly, high-density nucleic acid probe arrays have been proposed and developed using various chemical and physical approaches. Via hybridization-mediated detection of entire genes or genetic polymorphisms in such genes and intergenic regions these so called "DNA chips" have been successfully applied for distinguishing very closely related microbial species and strains. Second and even more phenomenal, next generation sequencing (NGS) has facilitated the assessment of the complete nucleotide sequence of entire microbial genomes. This technology currently provides the most detailed level of bacterial genotyping and hence allows for the resolution of microbial spread and short-term evolution in minute detail. We will here review the very recent history of these two technologies, sketch their usefulness in the elucidation of the spread and epidemiology of mostly hospital-acquired infections and discuss future developments. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Copy Number Variations in Tilapia Genomes.

    Science.gov (United States)

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R 2  > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.

  14. Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

    Science.gov (United States)

    Lee, Hayan; Schatz, Michael C

    2012-08-15

    Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including of known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net

  15. Convergent functional genomics of psychiatric disorders.

    Science.gov (United States)

    Niculescu, Alexander B

    2013-10-01

    Genetic and gene expression studies, in humans and animal models of psychiatric and other medical disorders, are becoming increasingly integrated. Particularly for genomics, the convergence and integration of data across species, experimental modalities and technical platforms is providing a fit-to-disease way of extracting reproducible and biologically important signal, in contrast to the fit-to-cohort effect and limited reproducibility of human genetic analyses alone. With the advent of whole-genome sequencing and the realization that a major portion of the non-coding genome may contain regulatory variants, Convergent Functional Genomics (CFG) approaches are going to be essential to identify disease-relevant signal from the tremendous polymorphic variation present in the general population. Such work in psychiatry can provide an example of how to address other genetically complex disorders, and in turn will benefit by incorporating concepts from other areas, such as cancer, cardiovascular diseases, and diabetes. © 2013 Wiley Periodicals, Inc.

  16. Reference genome sequence of the model plant Setaria.

    Science.gov (United States)

    Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

    2012-05-13

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  17. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology; Wang, Hao [University of Georgia, Athens, GA; Percifield, Ryan [University of Georgia, Athens, GA; Hawkins, Jennifer [University of Georgia, Athens, GA; Pontaroli, Ana C. [University of Georgia, Athens, GA; Estep, Matt [University of Georgia, Athens, GA; Feng, Liang [University of Georgia, Athens, GA; Vaughn, Justin N [ORNL; Grimwood, Jane [Hudson Alpha Institute of Biotechnology; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Wang, Xuewen [University of Georgia, Athens, GA; Wu, Xiaomei [University of Georgia, Athens, GA; Mitros, Therese [University of California, Berkeley; Triplett, Jimmy [University of Missouri, St. Louis; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Mauro-Herrera, Margarita [Oklahoma State University; Wang, Lin [Cornell University; Li, Pinghua [Cornell University; Sharma, Manoj [University of California, Davis; Sharma, Rita [University of California, Davis; Ronald, Pamela [University of California, Davis; Panaud, Olivier [Universite de Perpignan, Perpignan, France; Kellogg, Elizabeth A. [University of Missouri, St. Louis; Brutnell, Thomas P. [Cornell University; Doust, Andrew N. [Oklahoma State University; Tuskan, Gerald A [ORNL; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Devos, Katrien M [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  18. Reference genome sequence of the model plant Setaria

    Energy Technology Data Exchange (ETDEWEB)

    Bennetzen, Jeffrey L [ORNL; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Tuskan, Gerald A [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The {approx}400-Mb assembly covers {approx}80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  19. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-01-01

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  20. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse\\'s genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  1. Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

    Science.gov (United States)

    Lane, William J; Westhoff, Connie M; Gleadall, Nicholas S; Aguad, Maria; Smeland-Wagman, Robin; Vege, Sunitha; Simmons, Daimon P; Mah, Helen H; Lebo, Matthew S; Walter, Klaudia; Soranzo, Nicole; Di Angelantonio, Emanuele; Danesh, John; Roberts, David J; Watkins, Nick A; Ouwehand, Willem H; Butterworth, Adam S; Kaufman, Richard M; Rehm, Heidi L; Silberstein, Leslie E; Green, Robert C

    2018-06-01

    There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh-the most important blood groups-cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens. This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30 × depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15 × depth) with serological comparisons. We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 Med

  2. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  3. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Full Text Available Abstract Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952 of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612 were intronic and 9% (n = 464 were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS. Significant (P ® MassARRAY. No significant differences (P > 0.1 were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total. Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  4. VERSE: a novel approach to detect virus integration in host genomes through reference genome customization.

    Science.gov (United States)

    Wang, Qingguo; Jia, Peilin; Zhao, Zhongming

    2015-01-01

    Fueled by widespread applications of high-throughput next generation sequencing (NGS) technologies and urgent need to counter threats of pathogenic viruses, large-scale studies were conducted recently to investigate virus integration in host genomes (for example, human tumor genomes) that may cause carcinogenesis or other diseases. A limiting factor in these studies, however, is rapid virus evolution and resulting polymorphisms, which prevent reads from aligning readily to commonly used virus reference genomes, and, accordingly, make virus integration sites difficult to detect. Another confounding factor is host genomic instability as a result of virus insertions. To tackle these challenges and improve our capability to identify cryptic virus-host fusions, we present a new approach that detects Virus intEgration sites through iterative Reference SEquence customization (VERSE). To the best of our knowledge, VERSE is the first approach to improve detection through customizing reference genomes. Using 19 human tumors and cancer cell lines as test data, we demonstrated that VERSE substantially enhanced the sensitivity of virus integration site detection. VERSE is implemented in the open source package VirusFinder 2 that is available at http://bioinfo.mc.vanderbilt.edu/VirusFinder/.

  5. Multiplexed precision genome editing with trackable genomic barcodes in yeast.

    Science.gov (United States)

    Roy, Kevin R; Smith, Justin D; Vonesch, Sibylle C; Lin, Gen; Tu, Chelsea Szu; Lederer, Alex R; Chu, Angela; Suresh, Sundari; Nguyen, Michelle; Horecka, Joe; Tripathi, Ashutosh; Burnett, Wallace T; Morgan, Maddison A; Schulz, Julia; Orsley, Kevin M; Wei, Wu; Aiyar, Raeka S; Davis, Ronald W; Bankaitis, Vytas A; Haber, James E; Salit, Marc L; St Onge, Robert P; Steinmetz, Lars M

    2018-07-01

    Our understanding of how genotype controls phenotype is limited by the scale at which we can precisely alter the genome and assess the phenotypic consequences of each perturbation. Here we describe a CRISPR-Cas9-based method for multiplexed accurate genome editing with short, trackable, integrated cellular barcodes (MAGESTIC) in Saccharomyces cerevisiae. MAGESTIC uses array-synthesized guide-donor oligos for plasmid-based high-throughput editing and features genomic barcode integration to prevent plasmid barcode loss and to enable robust phenotyping. We demonstrate that editing efficiency can be increased more than fivefold by recruiting donor DNA to the site of breaks using the LexA-Fkh1p fusion protein. We performed saturation editing of the essential gene SEC14 and identified amino acids critical for chemical inhibition of lipid signaling. We also constructed thousands of natural genetic variants, characterized guide mismatch tolerance at the genome scale, and ascertained that cryptic Pol III termination elements substantially reduce guide efficacy. MAGESTIC will be broadly useful to uncover the genetic basis of phenotypes in yeast.

  6. Serotonin transporter (SERT gene polymorphism in Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Mahmut Özkaya

    2004-06-01

    Full Text Available Background: Parkinson disease (PD is the second most common neurodegenerative disorder with a prevalence of about 2% in persons older than 65 years of age. Neurodegenerative process in PD is not restricted to the dopaminergic neurons of the substantia nigra but also affects serotoninergic neurons. It has been shown that PD brains with Lewy bodies in the substantia nigra also had Lewy bodies in the raphe nuclei. The re-uptake of 5HT released into the synaptic cleft is mediated by the 5HT transporter (SERT. The SERT gene has been mapped to the chromosome of 17q11.1-q12 and has two main polymorphisms: intron two VNTR polymorphism and promoter region 44 bp insertion/deletion polymorphism. Objective: In this study we investigated whether two polymorphic regions in the serotonin transporter gene are associated with PD. Material and Method: After obtaining informed consent, blood samples were collected from 76 patients and 54 healthy volunteers. Genomic DNA was extracted from peripheral leucocytes using standard methods. The SERT gene genotypes were determined using polymerase chain reaction (PCR method. Results: Based on the intron 2 VNTR polymorphism of SERT gene, the distribution of 12/12, 12/10 and 10/10 genotypes were found as, 56.6 %, 35.5 %, 7.9 % in patients whereas this genotype distribution in control group was 40.7 %, 46.3 % and 13 %, respectively. According to 5-HTTLPR polymorphism, the distribution of L/L, L/S and S/S genotypes were found as 27.6 % 51.3 % and 21.1 % in patients whereas this genotype distribution in control group was 33.4 %, 50.0 % and 16.6 %, respectively. Despite the fact that the genotype distribution of SERT gene polymorphism in patients and control group seemed to be different from each other, this difference was not found to be statistically significant. Conclusion: This finding suggests that polymorphisms within the SERT gene do not play a major role in PD susceptibility in the Turkish population.

  7. High-Resolution Amplified Fragment Length Polymorphism Typing of Lactococcus lactis Strains Enables Identification of Genetic Markers for Subspecies-Related Phenotypes▿

    Science.gov (United States)

    Kütahya, Oylum Erkus; Starrenburg, Marjo J. C.; Rademaker, Jan L. W.; Klaassen, Corné H. W.; van Hylckama Vlieg, Johan E. T.; Smid, Eddy J.; Kleerebezem, Michiel

    2011-01-01

    A high-resolution amplified fragment length polymorphism (AFLP) methodology was developed to achieve the delineation of closely related Lactococcus lactis strains. The differentiation depth of 24 enzyme-primer-nucleotide combinations was experimentally evaluated to maximize the number of polymorphisms. The resolution depth was confirmed by performing diversity analysis on 82 L. lactis strains, including both closely and distantly related strains with dairy and nondairy origins. Strains clustered into two main genomic lineages of L. lactis subsp. lactis and L. lactis subsp. cremoris type-strain-like genotypes and a third novel genomic lineage rooted from the L. lactis subsp. lactis genomic lineage. Cluster differentiation was highly correlated with small-subunit rRNA homology and multilocus sequence analysis (MLSA) studies. Additionally, the selected enzyme-primer combination generated L. lactis subsp. cremoris phenotype-specific fragments irrespective of the genotype. These phenotype-specific markers allowed the differentiation of L. lactis subsp. lactis phenotype from L. lactis subsp. cremoris phenotype strains within the same L. lactis subsp. cremoris type-strain-like genomic lineage, illustrating the potential of AFLP for the generation of phenotype-linked genetic markers. PMID:21666014

  8. Accurate confidence aware clustering of array CGH tumor profiles.

    NARCIS (Netherlands)

    van Houte, B.P.P.; Heringa, J.

    2010-01-01

    Motivation: Chromosomal aberrations tend to be characteristic for given (sub)types of cancer. Such aberrations can be detected with array comparative genomic hybridization (aCGH). Clustering aCGH tumor profiles aids in identifying chromosomal regions of interest and provides useful diagnostic

  9. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  10. Genome-Wide Associations of Gene Expression Variation in Humans.

    Directory of Open Access Journals (Sweden)

    2005-12-01

    Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

  11. A comprehensive crop genome research project: the Superhybrid Rice Genome Project in China.

    Science.gov (United States)

    Yu, Jun; Wong, Gane Ka-Shu; Liu, Siqi; Wang, Jian; Yang, Huanming

    2007-06-29

    In May 2000, the Beijing Institute of Genomics formally announced the launch of a comprehensive crop genome research project on rice genomics, the Chinese Superhybrid Rice Genome Project. SRGP is not simply a sequencing project targeted to a single rice (Oryza sativa L.) genome, but a full-swing research effort with an ultimate goal of providing inclusive basic genomic information and molecular tools not only to understand biology of the rice, both as an important crop species and a model organism of cereals, but also to focus on a popular superhybrid rice landrace, LYP9. We have completed the first phase of SRGP and provide the rice research community with a finished genome sequence of an indica variety, 93-11 (the paternal cultivar of LYP9), together with ample data on subspecific (between subspecies) polymorphisms, transcriptomes and proteomes, useful for within-species comparative studies. In the second phase, we have acquired the genome sequence of the maternal cultivar, PA64S, together with the detailed catalogues of genes uniquely expressed in the parental cultivars and the hybrid as well as allele-specific markers that distinguish parental alleles. Although SRGP in China is not an open-ended research programme, it has been designed to pave a way for future plant genomics research and application, such as to interrogate fundamentals of plant biology, including genome duplication, polyploidy and hybrid vigour, as well as to provide genetic tools for crop breeding and to carry along a social burden-leading a fight against the world's hunger. It began with genomics, the newly developed and industry-scale research field, and from the world's most populous country. In this review, we summarize our scientific goals and noteworthy discoveries that exploit new territories of systematic investigations on basic and applied biology of rice and other major cereal crops.

  12. Identification of genome-specific transcripts in wheat–rye translocation lines

    Directory of Open Access Journals (Sweden)

    Tong Geon Lee

    2015-09-01

    Full Text Available Studying gene expression in wheat–rye translocation lines is complicated due to the presence of homeologs in hexaploid wheat and high levels of synteny between wheat and rye genomes (Naranjo and Fernandez-Rueda, 1991 [1]; Devos et al., 1995 [2]; Lee et al., 2010 [3]; Lee et al., 2013 [4]. To overcome limitations of current gene expression studies on wheat–rye translocation lines and identify genome-specific transcripts, we developed a custom Roche NimbleGen Gene Expression microarray that contains probes derived from the sequence of hexaploid wheat, diploid rye and diploid progenitors of hexaploid wheat genome (Lee et al., 2014. Using the array developed, we identified genome-specific transcripts in a wheat–rye translocation line (Lee et al., 2014. Expression data are deposited in the NCBI Gene Expression Omnibus (GEO under accession number GSE58678. Here we report the details of the methods used in the array workflow and data analysis.

  13. Amplified-fragment length polymorphism fingerprinting of Mycoplasma species

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, N.F.; Jensen, J.S.

    1999-01-01

    Amplified-fragment length polymorphism (AFLP) is a whole-genome fingerprinting method based on selective amplification of restriction fragments. The potential of the method for the characterization of mycoplasmas was investigated in a total of 50 strains of human and animal origin, including...... Mycoplasma genitalium (n = 11), Mycoplasma pneumoniae (n = 5), Mycoplasma hominis (n = 5), Mycoplasma hyopneunmoniae (n = 9), Myco plasma flocculare (n = 5), Mycoplasma hyosynoviae (n = 10), and Mycoplasma dispar (n = 5), AFLP templates were prepared by the digestion of mycoplasmal DNA with BglII and Mfe...... to discriminate the analyzed strains at species and intraspecies levels as well, Each of the tested Mycoplasma species developed a banding pattern entirely different from those obtained from other species under analysis, Subtle intraspecies genomic differences were detected among strains of all of the Mycoplasma...

  14. Genome-wide association study in discordant sibships identifies multiple inherited susceptibility alleles linked to lung cancer.

    Science.gov (United States)

    Galvan, Antonella; Falvella, Felicia S; Frullanti, Elisa; Spinola, Monica; Incarbone, Matteo; Nosotti, Mario; Santambrogio, Luigi; Conti, Barbara; Pastorino, Ugo; Gonzalez-Neira, Anna; Dragani, Tommaso A

    2010-03-01

    We analyzed a series of young (median age = 52 years) non-smoker lung cancer patients and their unaffected siblings as controls, using a genome-wide 620 901 single-nucleotide polymorphism (SNP) array analysis and a case-control DNA pooling approach. We identified 82 putatively associated SNPs that were retested by individual genotyping followed by use of the sib transmission disequilibrium test, pointing to 36 SNPs associated with lung cancer risk in the discordant sibs series. Analysis of these 36 SNPs in a polygenic model characterized by additive and interchangeable effects of rare alleles revealed a highly statistically significant dosage-dependent association between risk allele carrier status and proportion of cancer cases. Replication of the same 36 SNPs in a population-based series confirmed the association with lung cancer for three SNPs, suggesting that phenocopies and genetic heterogeneity can play a major role in the complex genetics of lung cancer risk in the general population.

  15. The Switchgrass Genome: Tools and Strategies

    Directory of Open Access Journals (Sweden)

    Michael D. Casler

    2011-11-01

    Full Text Available Switchgrass ( L. is a perennial grass species receiving significant focus as a potential bioenergy crop. In the last 5 yr the switchgrass research community has produced a genetic linkage map, an expressed sequence tag (EST database, a set of single nucleotide polymorphism (SNP markers that are distributed across the 18 linkage groups, 4x sampling of the AP13 genome in 400-bp reads, and bacterial artificial chromosome (BAC libraries containing over 200,000 clones. These studies have revealed close collinearity of the switchgrass genome with those of sorghum [ (L. Moench], rice ( L., and (L. P. Beauv. Switchgrass researchers have also developed several microarray technologies for gene expression studies. Switchgrass genomic resources will accelerate the ability of plant breeders to enhance productivity, pest resistance, and nutritional quality. Because switchgrass is a relative newcomer to the genomics world, many secrets of the switchgrass genome have yet to be revealed. To continue to efficiently explore basic and applied topics in switchgrass, it will be critical to capture and exploit the knowledge of plant geneticists and breeders on the next logical steps in the development and utilization of genomic resources for this species. To this end, the community has established a switchgrass genomics executive committee and work group ( [verified 28 Oct. 2011].

  16. Defining the Core Genome of Salmonella enterica Serovar Typhimurium for Genomic Surveillance and Epidemiological Typing

    Science.gov (United States)

    Fu, Songzhe; Octavia, Sophie; Tanaka, Mark M.; Sintchenko, Vitali

    2015-01-01

    Salmonella enterica serovar Typhimurium is the most common Salmonella serovar causing foodborne infections in Australia and many other countries. Twenty-one S. Typhimurium strains from Salmonella reference collection A (SARA) were analyzed using Illumina high-throughput genome sequencing. Single nucleotide polymorphisms (SNPs) in 21 SARA strains ranged from 46 to 11,916 SNPs, with an average of 1,577 SNPs per strain. Together with 47 strains selected from publicly available S. Typhimurium genomes, the S. Typhimurium core genes (STCG) were determined. The STCG consist of 3,846 genes, a set that is much larger than that of the 2,882 Salmonella core genes (SCG) found previously. The STCG together with 1,576 core intergenic regions (IGRs) were defined as the S. Typhimurium core genome. Using 93 S. Typhimurium genomes from 13 epidemiologically confirmed community outbreaks, we demonstrated that typing based on the S. Typhimurium core genome (STCG plus core IGRs) provides superior resolution and higher discriminatory power than that based on SCG for outbreak investigation and molecular epidemiology of S. Typhimurium. STCG and STCG plus core IGR typing achieved 100% separation of all outbreaks compared to that of SCG typing, which failed to separate isolates from two outbreaks from background isolates. Defining the S. Typhimurium core genome allows standardization of genes/regions to be used for high-resolution epidemiological typing and genomic surveillance of S. Typhimurium. PMID:26019201

  17. Comprehensive high-resolution genomic profiling and cytogenetics of human chondrocyte cultures by GTG-banding, locus-specific FISH, SKY and SNP array.

    Science.gov (United States)

    Wallenborn, M; Petters, O; Rudolf, D; Hantmann, H; Richter, M; Ahnert, P; Rohani, L; Smink, J J; Bulwin, G C; Krupp, W; Schulz, R M; Holland, H

    2018-04-23

    In the development of cell-based medicinal products, it is crucial to guarantee that the application of such an advanced therapy medicinal product (ATMP) is safe for the patients. The consensus of the European regulatory authorities is: "In conclusion, on the basis of the state of art, conventional karyotyping can be considered a valuable and useful technique to analyse chromosomal stability during preclinical studies". 408 chondrocyte samples (84 monolayers and 324 spheroids) from six patients were analysed using trypsin-Giemsa staining, spectral karyotyping and fluorescence in situ hybridisation, to evaluate the genetic stability of an ATMP named Spherox®. Single nucleotide polymorphism (SNP) array analysis was performed on chondrocyte spheroids from five of the six donors. Applying this combination of techniques, the genetic analyses performed revealed no significant genetic instability until passage 3 in monolayer cells and interphase cells from spheroid cultures at different time points. Clonal occurrence of polyploid metaphases and endoreduplications were identified associated with prolonged cultivation time. Also, gonosomal losses were observed in chondrocyte spheroids, with increasing passage and duration of the differentiation phase. Interestingly, in one of the donors, chromosomal aberrations that are also described in extraskeletal myxoid chondrosarcoma were identified. The SNP array analysis exhibited chromosomal aberrations in two donors and copy neutral losses of heterozygosity regions in four donors. This study showed the necessity of combined genetic analyses at defined cultivation time points in quality studies within the field of cell therapy.

  18. Transcriptionally active LTR retrotransposons in Eucalyptus genus are differentially expressed and insertionally polymorphic.

    Science.gov (United States)

    Marcon, Helena Sanches; Domingues, Douglas Silva; Silva, Juliana Costa; Borges, Rafael Junqueira; Matioli, Fábio Filippi; Fontes, Marcos Roberto de Mattos; Marino, Celso Luis

    2015-08-14

    In Eucalyptus genus, studies on genome composition and transposable elements (TEs) are particularly scarce. Nearly half of the recently released Eucalyptus grandis genome is composed by retrotransposons and this data provides an important opportunity to understand TE dynamics in Eucalyptus genome and transcriptome. We characterized nine families of transcriptionally active LTR retrotransposons from Copia and Gypsy superfamilies in Eucalyptus grandis genome and we depicted genomic distribution and copy number in two Eucalyptus species. We also evaluated genomic polymorphism and transcriptional profile in three organs of five Eucalyptus species. We observed contrasting genomic and transcriptional behavior in the same family among different species. RLC_egMax_1 was the most prevalent family and RLC_egAngela_1 was the family with the lowest copy number. Most families of both superfamilies have their insertions occurring Eucalyptus species. Using EST analysis and qRT-PCRs, we observed transcriptional activity in several tissues and in all evaluated species. In some families, osmotic stress increases transcript values. Our strategy was successful in isolating transcriptionally active retrotransposons in Eucalyptus, and each family has a particular genomic and transcriptional pattern. Overall, our results show that retrotransposon activity have differentially affected genome and transcriptome among Eucalyptus species.

  19. Undermethylated DNA as a source of microsatellites from a conifer genome.

    Science.gov (United States)

    Zhou, Y; Bui, T; Auckland, L D; Williams, C G

    2002-02-01

    Developing microsatellites from the large, highly duplicated conifer genome requires special tools. To improve the efficiency of developing Pinus taeda L. microsatellites, undermethylated (UM) DNA fragments were used to construct a microsatellite-enriched copy library. A methylation-sensitive restriction enzyme, McrBC, was used to enrich for UM DNA before library construction. Digested DNA fragments larger than 9 kb were then excised and digested with RsaI and used to construct nine dinucleotide and trinucleotide libraries. A total of 1016 microsatellite-positive clones were detected among 11 904 clones and 620 of these were unique. Of 245 primer sets that produced a PCR product, 113 could be developed as UM microsatellite markers and 70 were polymorphic. Inheritance and marker informativeness were tested for a random sample of 36 polymorphic markers using a three-generation outbred pedigree. Thirty-one microsatellites (86%) had single-locus inheritance despite the highly duplicated nature of the P. taeda genome. Nineteen UM microsatellites had highly informative intercross mating type configurations. Allele number and frequency were estimated for eleven UM microsatellites using a population survey. Allele numbers for these UM microsatellites ranged from 3 to 12 with an average of 5.7 alleles/locus. Frequencies for the 63 alleles were mostly in the low-common range; only 14 of the 63 were in the rare allele (q < 0.05) class. Enriching for UM DNA was an efficient method for developing polymorphic microsatellites from a large plant genome.

  20. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    Science.gov (United States)

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813