WorldWideScience

Sample records for genomic indel polymorphisms

  1. The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection

    Science.gov (United States)

    Jiang, Yue; Turinsky, Andrei L.; Brudno, Michael

    2015-01-01

    With the development of High-Throughput Sequencing (HTS) thousands of human genomes have now been sequenced. Whenever different studies analyze the same genome they usually agree on the amount of single-nucleotide polymorphisms, but differ dramatically on the number of insertion and deletion variants (indels). Furthermore, there is evidence that indels are often severely under-reported. In this manuscript we derive the total number of indel variants in a human genome by combining data from different sequencing technologies, while assessing the indel detection accuracy. Our estimate of approximately 1 million indels in a Yoruban genome is much higher than the results reported in several recent HTS studies. We identify two key sources of difficulties in indel detection: the insufficient coverage, read length or alignment quality; and the presence of repeats, including short interspersed elements and homopolymers/dimers. We quantify the effect of these factors on indel detection. The quality of sequencing data plays a major role in improving indel detection by HTS methods. However, many indels exist in long homopolymers and repeats, where their detection is severely impeded. The true number of indel events is likely even higher than our current estimates, and new techniques and technologies will be required to detect them. PMID:26130710

  2. Indel Group in Genomes (IGG) Molecular Genetic Markers

    Science.gov (United States)

    Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

    2016-01-01

    Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831

  3. Restricted DCJ-indel model: sorting linear genomes with DCJ and indels

    Science.gov (United States)

    2012-01-01

    Background: The double-cut-and-join (DCJ) is a model that is able to efficiently sort a genome into another, generalizing the typical mutations (inversions, fusions, fissions, translocations) to which genomes are subject, but allowing the existence of circular chromosomes at the intermediate steps. In the general model many circular chromosomes can coexist in some intermediate step. However, when the compared genomes are linear, it is more plausible to use the so-called restricted DCJ model, in which a circular chromosome is reincorporated immediately after its creation. These two consecutive DCJ operations, which create and reincorporate a circular chromosome, mimic a transposition or a block-interchange. When the compared genomes have the same content, it is known that the genomic distance for the restricted DCJ model is the same as the distance for the general model. If the genomes have unequal contents, in addition to DCJ it is necessary to consider indels, which are insertions and deletions of DNA segments. Linear time algorithms were proposed to compute the distance and to find a sorting scenario in a general, unrestricted DCJ-indel model that considers DCJ and indels. Results: In the present work we consider the restricted DCJ-indel model for sorting linear genomes with unequal contents. We allow DCJ operations and indels with the following constraint: if a circular chromosome is created by a DCJ, it has to be reincorporated in the next step (no other DCJ or indel can be applied between the creation and the reincorporation of a circular chromosome). We then develop a sorting algorithm and give a tight upper bound for the restricted DCJ-indel distance. Conclusions: We have given a tight upper bound for the restricted DCJ-indel distance. The question whether this bound can be reduced so that both the general and the restricted DCJ-indel distances are equal remains open. PMID:23281630

  4. Genome-wide indel markers shared by diverse Asian rice cultivars compared to Japanese rice cultivar 'Koshihikari'

    OpenAIRE

    Yonemaru, Jun-ichi; Choi, Sun Hee; Sakai, Hiroaki; Ando, Tsuyu; Shomura, Ayahiko; Yano, Masahiro; Wu, Jianzhong; Fukuoka, Shuichi

    2015-01-01

    Insertion-deletion (indel) polymorphisms, such as simple sequence repeats, have been widely used as DNA markers to identify QTLs and genes and to facilitate rice breeding. Recently, next-generation sequencing has produced deep sequences that allow genome-wide detection of indels. These polymorphisms can potentially be used to develop high-accuracy polymerase chain reaction (PCR)-based markers. Here, re-sequencing of 5 indica, 2 aus, and 3 tropical japonica cultivars and Japanese elite cultiva...

  5. On Sorting Genomes with DCJ and Indels

    Science.gov (United States)

    Braga, Marília D. V.

    A previous work of Braga, Willing and Stoye compared two genomes with unequal content, but without duplications, and presented a new linear time algorithm to compute the genomic distance, considering double cut and join (DCJ) operations, insertions and deletions. Here we derive from this approach an algorithm to sort one genome into another one also using DCJ, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions.

  6. InDel polymorphisms in quantitative posttransplant chimerism evaluation

    Directory of Open Access Journals (Sweden)

    I. M. Barkhatov

    2016-01-01

    Reduction of minimal residual disease to undetectable levels is the key criterion for efficiency of allogeneic hematopoietic stem cell transplantation (alloHSCT), along with engraftment of transplanted cells with complete replacement of recipient hematopoiesis, i.e., full posttransplant chimerism. Among different approaches, molecular genetic techniques are preferable, being based on the analysis of highly polymorphic DNA sequences (short tandem repeats, STRs). However, this approach, despite its high specificity, has a limited sensitivity. In this regard, it seems appropriate to introduce more sensitive diagnostic solutions, in particular, analysis of insertion/deletion (InDel) polymorphisms, followed by real-time detection of PCR products. The data obtained upon analysis of several genetic markers have shown higher sensitivity of this method. However, deviations in the range of 10 to 90% in evaluation of the cell ratios indicate that this approach is feasible only for evaluating residual populations of recipient cells.

  7. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    Science.gov (United States)

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  8. Development of novel InDel markers and genetic diversity in Chenopodium quinoa through whole-genome re-sequencing.

    Science.gov (United States)

    Zhang, Tifu; Gu, Minfeng; Liu, Yuhe; Lv, Yuanda; Zhou, Ling; Lu, Haiyan; Liang, Shuaiqiang; Bao, Huabin; Zhao, Han

    2017-09-05

    Quinoa (Chenopodium quinoa Willd.) is a balanced nutritional crop, but its breeding improvement has been limited by the lack of information on its genetics and genomics. Therefore, it is necessary to obtain knowledge on genomic variation, population structure, and genetic diversity and to develop novel Insertion/Deletion (InDel) markers for quinoa by whole-genome re-sequencing. We re-sequenced 11 quinoa accessions and obtained a coverage depth of approximately 7× to 23× the quinoa genome. Based on the 1453-megabase (Mb) assembly from the reference accession Riobamba, 8,441,022 filtered bi-allelic single nucleotide polymorphisms (SNPs) and 842,783 filtered InDels were identified, with an estimated SNP and InDel density of 5.81 and 0.58 per kilobase (kb), respectively. From the genomic InDel variations, 85 dimorphic InDel markers were newly developed and validated. Together with the 62 simple sequence repeat (SSR) markers reported, a total of 147 markers were used for genotyping the 129 quinoa accessions. Molecular grouping analysis showed classification into two major groups, the Andean highland (composed of the northern and southern highland subgroups) and Chilean coastal, based on combined STRUCTURE, phylogenetic tree and PCA (Principal Component Analysis) analyses. Further analysis of the genetic diversity exhibited a decreasing tendency from the Chilean coast group to the Andean highland group, and the gene flow between subgroups was more frequent than that between the two subgroups and the Chilean coastal group. An analysis of molecular variance (AMOVA) attributed the majority of the variation (approximately 70%) to the diversity between the groups. This was congruent with the observation of a highly significant FST value (0.705) between the groups, demonstrating significant genetic differentiation between the Andean highland type of quinoa and the Chilean coastal type. Moreover, a core set of 16 quinoa germplasms that capture all 362 alleles was

  9. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole
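As a rough illustration of the calibration idea in the abstract above, the sketch below computes sensitivity and positive predictive value from simulated calls and uses them to adjust an observed call count. The correction formula (observed × PPV / sensitivity) and all numbers are assumptions made for illustration; the abstract does not say how SRiC combines these quantities.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of simulated (true) events that were called."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of calls that correspond to a simulated (true) event."""
    return true_pos / (true_pos + false_pos)


def corrected_count(observed_calls: float, ppv: float, sens: float) -> float:
    """Assumed correction: drop expected false calls, then scale up for missed events."""
    return observed_calls * ppv / sens


# Hypothetical simulation outcome for one event class (not from the paper).
sens = sensitivity(true_pos=80, false_neg=20)
ppv = positive_predictive_value(true_pos=80, false_pos=10)

# Hypothetical number of calls observed in real data for the same event class.
estimate = corrected_count(observed_calls=900, ppv=ppv, sens=sens)
print(f"sensitivity={sens:.3f} ppv={ppv:.3f} corrected estimate={estimate:.0f}")
```

In practice such a correction would be applied separately per event class and length bin, since the abstract notes that calibrations differ between, for example, long deletions and short insertions.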

  10. Indel-II region deletion sizes in the white spot syndrome virus genome correlate with shrimp disease outbreaks in southern Vietnam

    NARCIS (Netherlands)

    Tran Thi Tuyet, H.; Zwart, M.P.; Phuong, N.T.; Oanh, D.T.H.; Jong, de M.C.M.; Vlak, J.M.

    2012-01-01

    Sequence comparisons of the genomes of white spot syndrome virus (WSSV) strains have identified regions containing variable-length insertions/deletions (i.e. indels). Indel-I and Indel-II, positioned between open reading frames (ORFs) 14/15 and 23/24, respectively, are the largest and the most

  11. Genetic Diversity of Myanmar and Indonesia Native Chickens Together with Two Jungle Fowl Species by Using 102 Indels Polymorphisms

    Directory of Open Access Journals (Sweden)

    Aye Aye Maw

    2012-07-01

    The efficiency of insertion and/or deletion (indel) polymorphisms as genetic markers was evaluated by genotyping 102 indel loci in native chicken populations from Myanmar and Indonesia as well as Red jungle fowls and Green jungle fowls from Java Island. Out of the 102 indel markers, 97 were polymorphic. The average observed and expected heterozygosities were 0.206 to 0.268 and 0.229 to 0.284 in native chicken populations and 0.003 to 0.101 and 0.012 to 0.078 in jungle fowl populations. The coefficients of genetic differentiation (Gst) of the native chicken populations from Myanmar and Indonesia were 0.041 and 0.098, respectively. The genetic variability is higher among native chicken populations than jungle fowl populations. A high Gst value was found between native chicken populations and jungle fowl populations. A neighbor-joining tree based on genetic distance revealed that the native chickens from the two countries were genetically close to each other and remote from Red and Green jungle fowls of Java Island.
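For readers unfamiliar with the heterozygosity figures quoted above, here is a minimal sketch of how observed and expected heterozygosity are commonly computed for a single locus. It uses the textbook estimator He = 1 - Σ p_i² without a small-sample correction; the abstract does not state which estimator the authors used, and the genotypes are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum of squared allele frequencies (no small-sample correction)."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


# Hypothetical biallelic indel locus: "I" = insertion allele, "D" = deletion allele.
sample = [("I", "I"), ("I", "D"), ("D", "D"), ("I", "D")]
he = expected_heterozygosity(sample)
ho = observed_heterozygosity(sample)
print(f"He={he:.3f} Ho={ho:.3f}")
```

Population-level values such as those in the abstract would be averages of these per-locus quantities across all genotyped loci.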

  12. Single strand conformation polymorphism based SNP and Indel markers for genetic mapping and synteny analysis of common bean (Phaseolus vulgaris L.)

    Directory of Open Access Journals (Sweden)

    Gómez Marcela

    2009-12-01

    Background: Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel based methods have been reported for the detection of sequence variants, however they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes. Results: A set of 418 EST based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon based markers were then used for genetic mapping with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10^-12. The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. Conclusion: The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. In addition, the genetic markers based on ESTs allowed the construction

  13. Single strand conformation polymorphism based SNP and Indel markers for genetic mapping and synteny analysis of common bean (Phaseolus vulgaris L.).

    Science.gov (United States)

    Galeano, Carlos H; Fernández, Andrea C; Gómez, Marcela; Blair, Matthew W

    2009-12-23

    Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel based methods have been reported for the detection of sequence variants, however they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes. A set of 418 EST based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon based markers were then used for genetic mapping with segregation analysis performed in the DOR364 x G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 x 10^-12. The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. In addition, the genetic markers based on ESTs allowed the construction of a transcript map and given their high conservation

  14. KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses.

    Science.gov (United States)

    Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong; Jang, Jinho; Jun, JeHoon; Cho, Yun Sung; Kim, Hak-Min; Kim, Hyunho; Kim, Yumi; Chung, OkSung; Kim, Chang Geun; Lee, HyeJin; Kim, Byung Chul; Han, Kyudong; Koh, InSong; Chae, Kyun Shik; Lee, Semin; Edwards, Jeremy S; Bhak, Jong

    2018-04-04

    High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and provides a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variations.

  15. Genome-wide DNA polymorphism in the indica rice varieties RGD-7S and Taifeng B as revealed by whole genome re-sequencing.

    Science.gov (United States)

    Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng

    2016-03-01

    Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp paired-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites including 2,408,845 SNPs and 349,895 InDels were detected in RGD-7S and Taifeng B, respectively. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs were overlapped in 218 genes, and 1231 of 2010 CNVs were annotated in 175 genes in Taifeng B. In addition, we verified a subset of InDels in the interval of hybrid weakness genes, Hw3 and Hw4, and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.

  16. Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

    Science.gov (United States)

    Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

    2012-01-01

    A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
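The density figures in the abstract above can be checked with simple arithmetic. The short sketch below assumes the genome-wide density was computed over the total BAC contig length (the abstract does not name the denominator explicitly); the variable names are illustrative only.

```python
# Figures quoted in the abstract.
contig_length_bp = 495_833_423
euchromatin_polymorphisms = 30_930
heterochromatin_polymorphisms = 140_862

# The two compartments should add up to the reported total of 171,792.
total_polymorphisms = euchromatin_polymorphisms + heterochromatin_polymorphisms

# Average spacing, assuming the BAC contig length is the denominator.
bp_per_polymorphism = contig_length_bp / total_polymorphisms

print(total_polymorphisms)
print(round(bp_per_polymorphism))
```

Under that assumption the result agrees with the reported average of one polymorphism per 2,886 bp.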

  17. Hapsembler: An Assembler for Highly Polymorphic Genomes

    Science.gov (United States)

    Donmez, Nilgun; Brudno, Michael

    As whole genome sequencing has become a routine biological experiment, algorithms for assembly of whole genome shotgun data have become a topic of extensive research, with a plethora of off-the-shelf methods that can reconstruct the genomes of many organisms. Simultaneously, several recently sequenced genomes exhibit very high polymorphism rates. For these organisms genome assembly remains a challenge as most assemblers are unable to handle highly divergent haplotypes in a single individual. In this paper we describe Hapsembler, an assembler for highly polymorphic genomes, which makes use of paired reads. Our experiments show that Hapsembler produces accurate and contiguous assemblies of highly polymorphic genomes, while performing on par with the leading tools on haploid genomes. Hapsembler is available for download at http://compbio.cs.toronto.edu/hapsembler.

  18. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    Directory of Open Access Journals (Sweden)

    Nedenia Bonvino Stafuzza

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated for each animal 10.7 to 16.4-fold genome coverage. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were over-represented in all four cattle breeds, while the ECM-receptor interaction pathway was over-represented in the Girolando and Guzerat breeds, the ABC transporters pathway was over-represented only in the Holstein breed, and metabolic pathways were over-represented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  19. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    Science.gov (United States)

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  20. Autosomal InDel polymorphisms for population genetic structure and differentiation analysis of Chinese Kazak ethnic group

    Science.gov (United States)

    Kong, Tingting; Chen, Yahao; Guo, Yuxin; Wei, Yuanyuan; Jin, Xiaoye; Xie, Tong; Mu, Yuling; Dong, Qian; Wen, Shaoqing; Zhou, Boyan; Zhang, Li; Shen, Chunmei; Zhu, Bofeng

    2017-01-01

    In the present study, we assessed the genetic diversities of the Chinese Kazak ethnic group on the basis of 30 well-chosen autosomal insertion and deletion loci and explored the genetic relationships between Kazak and 23 reference groups. We detected the level of the expected heterozygosity ranging from 0.3605 at HLD39 locus to 0.5000 at HLD136 locus and the observed heterozygosity ranging from 0.3548 at HLD39 locus to 0.5283 at HLD136 locus. The combined power of discrimination and the combined power of exclusion for all 30 loci in the studied Kazak group were 0.999999999999128 and 0.9945, respectively. The dataset generated in this study indicated the panel of 30 InDels was highly efficient in forensic individual identification but may not have enough power in paternity cases. The results of the interpopulation differentiations, PCA plots, phylogenetic trees and STRUCTURE analyses showed a close genetic affiliation between the Kazak and Uigur group. PMID:28915619
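To show why a panel of 30 biallelic loci can reach a combined power of discrimination so close to 1, the sketch below applies the standard forensic formulas PD = 1 - Σ g_i² (g_i being genotype frequencies at one locus) and CPD = 1 - Π(1 - PD_i), which assumes independent loci. The genotype frequencies are hypothetical and are not taken from the study.

```python
import math


def power_of_discrimination(genotype_freqs):
    """PD for one locus: probability that two random individuals differ in genotype."""
    return 1.0 - sum(freq * freq for freq in genotype_freqs)


def combined_power(per_locus_values):
    """Combine per-locus powers, assuming the loci are independent."""
    return 1.0 - math.prod(1.0 - value for value in per_locus_values)


# Hypothetical biallelic locus with both alleles at frequency 0.5
# (Hardy-Weinberg genotype frequencies 0.25, 0.5, 0.25).
pd_one = power_of_discrimination([0.25, 0.5, 0.25])

# Thirty such loci, as a stand-in for a 30-InDel panel.
cpd_30 = combined_power([pd_one] * 30)
print(f"PD per locus={pd_one:.3f} combined PD={cpd_30:.15f}")
```

The same combination rule applies to the power of exclusion, whose per-locus values are much lower for biallelic markers, which is consistent with the abstract's caution about paternity cases.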

  1. Simple Detection of Large InDeLS by DHPLC: The ACE Gene as a Model

    Directory of Open Access Journals (Sweden)

    Renata Guedes Koyama

    2008-01-01

    Insertion-deletion polymorphism (InDeL) is the second most frequent type of genetic variation in the human genome. For the detection of large InDeLs, researchers usually resort to either PCR gel analysis or RFLP, but these are time consuming and dependent on human interpretation. Therefore, a more efficient method for genotyping this kind of genetic variation is needed. In this report, we describe a method that can detect large InDeLs by DHPLC (denaturing high-performance liquid chromatography) using the angiotensin-converting enzyme (ACE) gene I/D polymorphism as a model. The InDeL targeted in this study is characterized by a 288 bp Alu element insertion (I). We used DHPLC at nondenaturing conditions to analyze the PCR product with a flow through the chromatographic column under two different gradients based on the differences between D and I sequences. The analysis described is quick and easy, making this technique a suitable and efficient means for DHPLC users to screen InDeLs in genetic epidemiological studies.

  2. ScanIndel: a hybrid framework for indel detection via gapped alignment, split reads and de novo assembly.

    Science.gov (United States)

    Yang, Rendong; Nelson, Andrew C; Henzler, Christine; Thyagarajan, Bharat; Silverstein, Kevin A T

    2015-12-07

    Comprehensive identification of insertions/deletions (indels) across the full size spectrum from second generation sequencing is challenging due to the relatively short read length inherent in the technology. Different indel calling methods exist but are limited in detection to specific sizes with varying accuracy and resolution. We present ScanIndel, an integrated framework for detecting indels with multiple heuristics including gapped alignment, split reads and de novo assembly. Using simulation data, we demonstrate ScanIndel's superior sensitivity and specificity relative to several state-of-the-art indel callers across various coverage levels and indel sizes. ScanIndel yields higher predictive accuracy with lower computational cost compared with existing tools for both targeted resequencing data from tumor specimens and high coverage whole-genome sequencing data from the human NIST standard NA12878. Thus, we anticipate ScanIndel will improve indel analysis in both clinical and research settings. ScanIndel is implemented in Python, and is freely available for academic use at https://github.com/cauyrd/ScanIndel.

  3. Bioinformatics analysis of SARS coronavirus genome polymorphism

    Directory of Open Access Journals (Sweden)

    Pavlović-Lažetić Gordana M

    2004-05-01

    Background: We have compared 38 isolates of the SARS-CoV complete genome. The goal was twofold: first, to analyze and compare nucleotide sequences and to identify positions of single nucleotide polymorphisms (SNPs), insertions and deletions; and second, to group the isolates according to sequence similarity, eventually pointing to the phylogeny of SARS-CoV isolates. The comparison is based on genome polymorphism such as insertions or deletions and the number and positions of SNPs. Results: The nucleotide structure of all 38 isolates is presented. Based on insertions and deletions and dissimilarity due to SNPs, the dataset of all the isolates has been qualitatively classified into three groups, each having its own subgroups. These are the A-group with "regular" isolates (no insertions / deletions except for 5' and 3' ends), the B-group of isolates with "long insertions", and the C-group of isolates with "many individual" insertions and deletions. The isolate with the smallest average number of SNPs, compared to other isolates, has been identified (TWH). The density distribution of SNPs, insertions and deletions for each group or subgroup, as well as cumulatively for all the isolates, is also presented, along with the gene map for TWH. Since individual SNPs may have occurred at random, positions corresponding to multiple SNPs (occurring in two or more isolates) are identified and presented. This result revises some previous results of a similar type. Amino acid changes caused by multiple SNPs are also identified (for the annotated sequences), as well as presupposed amino acid changes for non-annotated ones. Exact SNP positions for the isolates in each group or subgroup are presented. Finally, a phylogenetic tree for the SARS-CoV isolates has been produced using the CLUSTALW program, showing high compatibility with the former qualitative classification. Conclusions: The comparative study of SARS-CoV isolates provides essential information for genome

  4. Genome-wide detection of chromosomal rearrangements, indels, and mutations in circular chromosomes by short read sequencing

    DEFF Research Database (Denmark)

    Skovgaard, Ole; Bak, Mads; Løbner-Olesen, Anders

    2011-01-01

    a combination of WGS and genome copy number analysis, for the identification of mutations that suppress the growth deficiency imposed by excessive initiations from the Escherichia coli origin of replication, oriC. The E. coli chromosome, like the majority of bacterial chromosomes, is circular, and DNA...... replication is initiated by assembling two replication complexes at the origin, oriC. These complexes then replicate the chromosome bidirectionally toward the terminus, ter. In a population of growing cells, this results in a copy number gradient, so that origin-proximal sequences are more frequent than...... origin-distal sequences. Major rearrangements in the chromosome are, therefore, readily identified by changes in copy number, i.e., certain sequences become over- or under-represented. Of the eight mutations analyzed in detail here, six were found to affect a single gene only, one was a large chromosomal...

  5. Genome-wide DNA polymorphism analyses using VariScan

    Directory of Open Access Journals (Sweden)

    Vilella Albert J

    2006-09-01

    Background: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. Results: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: (i) exhaustive population-genetic analyses including those based on the coalescent theory; (ii) analysis adapted to the shallow data generated by the high-throughput genome projects; (iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; (iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; (v) visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.

  6. Fitness consequences of polymorphic inversions in the zebra finch genome.

    Science.gov (United States)

    Knief, Ulrich; Hemmrich-Stanisak, Georg; Wittig, Michael; Franke, Andre; Griffith, Simon C; Kempenaers, Bart; Forstmeier, Wolfgang

    2016-09-29

    Inversion polymorphisms constitute an evolutionary puzzle: they should increase embryo mortality in heterokaryotypic individuals but still they are widespread in some taxa. Some insect species have evolved mechanisms to reduce the cost of embryo mortality but humans have not. In birds, a detailed analysis is missing although intraspecific inversion polymorphisms are regarded as common. In Australian zebra finches (Taeniopygia guttata), two polymorphic inversions are known cytogenetically and we set out to detect these two and potentially additional inversions using genomic tools and study their effects on embryo mortality and other fitness-related and morphological traits. Using whole-genome SNP data, we screened 948 wild zebra finches for polymorphic inversions and describe four large (12-63 Mb) intraspecific inversion polymorphisms with allele frequencies close to 50 %. Using additional data from 5229 birds and 9764 eggs from wild and three captive zebra finch populations, we show that only the largest inversions increase embryo mortality in heterokaryotypic males, with surprisingly small effect sizes. We test for a heterozygote advantage on other fitness components but find no evidence for heterosis for any of the inversions. Yet, we find strong additive effects on several morphological traits. The mechanism that has carried the derived inversion haplotypes to such high allele frequencies remains elusive. It appears that selection has effectively minimized the costs associated with inversions in zebra finches. The highly skewed distribution of recombination events towards the chromosome ends in zebra finches and other estrildid species may function to minimize crossovers in the inverted regions.

  7. Uninformative polymorphisms bias genome scans for signatures of selection

    Directory of Open Access Journals (Sweden)

    Roesti Marius

    2012-06-01

    Background: With the establishment of high-throughput sequencing technologies and new methods for rapid and extensive single nucleotide polymorphism (SNP) discovery, marker-based genome scans in search of signatures of divergent selection between populations occupying ecologically distinct environments are becoming increasingly popular. Methods and Results: On the basis of genome-wide SNP marker data generated by RAD sequencing of lake and stream stickleback populations, we show that the outcome of such studies can be systematically biased if markers with a low minor allele frequency are included in the analysis. The reason is that these ‘uninformative’ polymorphisms lack the adequate potential to capture signatures of drift and hitchhiking, the focal processes in ecological genome scans. Bias associated with uninformative polymorphisms is not eliminated by just avoiding technical artifacts in the data (PCR and sequencing errors), as a high proportion of SNPs with a low minor allele frequency is a general biological feature of natural populations. Conclusions: We suggest that uninformative markers should be excluded from genome scans based on empirical criteria derived from careful inspection of the data, and that these criteria should be reported explicitly. Together, this should increase the quality and comparability of genome scans, and hence promote our understanding of the processes driving genomic differentiation.
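The exclusion of low-frequency markers described in this abstract can be sketched in a few lines of Python. This is only an illustration: the genotype encoding (alternate-allele counts of 0, 1 or 2 per diploid individual, `None` for missing calls), the function names, and the cutoff used in the example are all assumptions. The abstract itself recommends deriving the criterion empirically from the data rather than fixing a value in advance.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic marker.

    genotypes: alternate-allele counts (0, 1 or 2) per diploid individual,
    with None marking a missing call. Returns None if nothing was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(markers, min_maf):
    """Keep only the markers whose minor allele frequency is at least min_maf."""
    kept = {}
    for name, genotypes in markers.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf >= min_maf:
            kept[name] = genotypes
    return kept


if __name__ == "__main__":
    # Hypothetical marker names and genotypes, for illustration only.
    markers = {"snp_a": [0, 0, 1, 2], "snp_b": [2, 2, 2, 1], "snp_c": [None, None]}
    print(sorted(filter_by_maf(markers, min_maf=0.2)))
```

Reporting the value passed as `min_maf` alongside the results is the code-level counterpart of the abstract's request that exclusion criteria be stated explicitly.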

  8. Templated sequence insertion polymorphisms in the human genome

    Science.gov (United States)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  9. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  10. Genome-wide patterns of nucleotide polymorphism in domesticated rice

    DEFF Research Database (Denmark)

    Caicedo, Ana L; Williamson, Scott H; Hernandez, Ryan D

    2007-01-01

    Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments......, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models...... to explain contemporary patterns of polymorphisms in rice, including a (i) selectively neutral population bottleneck model, (ii) bottleneck plus migration model, (iii) multiple selective sweeps model, and (iv) bottleneck plus selective sweeps model. We find that a simple bottleneck model, which has been...

  11. The polydeoxyadenylate tract of Alu repetitive elements is polymorphic in the human genome

    International Nuclear Information System (INIS)

    Economou, E.P.; Bergen, A.W.; Warren, A.C.; Antonarakis, S.E.

    1990-01-01

    To identify DNA polymorphisms that are abundant in the human genome and are detectable by polymerase chain reaction amplification of genomic DNA, the authors hypothesize that the polydeoxyadenylate tract of the Alu family of repetitive elements is polymorphic among human chromosomes. Analysis of the 3' ends of three specific Alu sequences showed that two occurrences, one in the adenosine deaminase gene and the other in the β-globin pseudogene, were polymorphic. This novel class of polymorphism, termed AluVpA [Alu variable poly(A)], may represent one of the most useful and informative groups of DNA markers in the human genome.

  12. Genome-wide development and deployment of informative intron-spanning and intron-length polymorphism markers for genomics-assisted breeding applications in chickpea.

    Science.gov (United States)

    Srivastava, Rishi; Bajaj, Deepak; Sayal, Yogesh K; Meher, Prabina K; Upadhyaya, Hari D; Kumar, Rajendra; Tripathi, Shailesh; Bharadwaj, Chellapilla; Rao, Atmakuri R; Parida, Swarup K

    2016-11-01

    The discovery and large-scale genotyping of informative gene-based markers is essential for rapid delineation of genes/QTLs governing stress tolerance and yield component traits in order to drive genetic enhancement in chickpea. A genome-wide 119169 and 110491 ISM (intron-spanning markers) from 23129 desi and 20386 kabuli protein-coding genes and 7454 in silico InDel (insertion-deletion) (1-45-bp)-based ILP (intron-length polymorphism) markers from 3283 genes were developed that were structurally and functionally annotated on eight chromosomes and unanchored scaffolds of chickpea. A much higher amplification efficiency (83%) and intra-specific polymorphic potential (86%) detected by these markers than that of other sequence-based genetic markers among desi and kabuli chickpea accessions was apparent even by a cost-effective agarose gel-based assay. The genome-wide physically mapped 1718 ILP markers assayed a wider level of functional genetic diversity (19-81%) and well-defined phylogenetics among domesticated chickpea accessions. The gene-derived 1424 ILP markers were anchored on a high-density (inter-marker distance: 0.65cM) desi intra-specific genetic linkage map/functional transcript map (ICC 4958×ICC 2263) of chickpea. This reference genetic map identified six major genomic regions harbouring six robust QTLs mapped on five chromosomes, which explained 11-23% seed weight trait variation (7.6-10.5 LOD) in chickpea. The integration of high-resolution QTL mapping with differential expression profiling detected six including one potential serine carboxypeptidase gene with ILP markers (linked tightly to the major seed weight QTLs) exhibiting seed-specific expression as well as pronounced up-regulation especially in seeds of high (ICC 4958) as compared to low (ICC 2263) seed weight mapping parental accessions. The marker information generated in the present study was made publicly accessible through a user-friendly web-resource, "Chickpea ISM-ILP Marker Database

  13. Ascertainment bias in studies of human genome-wide polymorphism

    DEFF Research Database (Denmark)

    Clark, Andrew G.; Hubisz, Melissa J.; Bustamente, Carlos D.

    2005-01-01

    of the SNPs that are found are influenced by the discovery sampling effort. The International HapMap project relied on nearly any piece of information available to identify SNPs-including BAC end sequences, shotgun reads, and differences between public and private sequences-and even made use of chimpanzee...... was a resequencing-by-hybridization effort using the 24 people of diverse origin in the Polymorphism Discovery Resource. Here we take these two data sets and contrast two basic summary statistics, heterozygosity and FST, as well as the site frequency spectra, for 500-kb windows spanning the genome. The magnitude...... of disparity between these samples in these measures of variability indicates that population genetic analysis on the raw genotype data is ill advised. Given the knowledge of the discovery samples, we perform an ascertainment correction and show how the post-correction data are more consistent across...

  14. Population Genomics of Inversion Polymorphisms in Drosophila melanogaster

    Science.gov (United States)

    Corbett-Detig, Russell B.; Hartl, Daniel L.

    2012-01-01

    Chromosomal inversions have been an enduring interest of population geneticists since their discovery in Drosophila melanogaster. Numerous lines of evidence suggest powerful selective pressures govern the distributions of polymorphic inversions, and these observations have spurred the development of many explanatory models. However, due to a paucity of nucleotide data, little progress has been made towards investigating selective hypotheses or towards inferring the genealogical histories of inversions, which can inform models of inversion evolution and suggest selective mechanisms. Here, we utilize population genomic data to address persisting gaps in our knowledge of D. melanogaster's inversions. We develop a method, termed Reference-Assisted Reassembly, to assemble unbiased, highly accurate sequences near inversion breakpoints, which we use to estimate the age and the geographic origins of polymorphic inversions. We find that inversions are young, and most are African in origin, which is consistent with the demography of the species. The data suggest that inversions interact with polymorphism not only in breakpoint regions but also chromosome-wide. Inversions remain differentiated at low levels from standard haplotypes even in regions that are distant from breakpoints. Although genetic exchange appears fairly extensive, we identify numerous regions that are qualitatively consistent with selective hypotheses. Finally, we show that In(1)Be, which we estimate to be ∼60 years old (95% CI 5.9 to 372.8 years), has likely achieved high frequency via sex-ratio segregation distortion in males. With deeper sampling, it will be possible to build on our inferences of inversion histories to rigorously test selective models—particularly those that postulate that inversions achieve a selective advantage through the maintenance of co-adapted allele complexes. PMID:23284285

  15. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    Science.gov (United States)

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  16. Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx)

    DEFF Research Database (Denmark)

    Xia, Qingyou; Guo, Yiran; Zhang, Ze

    2009-01-01

    A single-base pair resolution silkworm genetic variation map was constructed from 40 domesticated and wild silkworms, each sequenced to approximately threefold coverage, representing 99.88% of the genome. We identified ~16 million single-nucleotide polymorphisms, many indels, and structural varia...

  17. Development of cleaved amplified polymorphic sequence markers and a CAPS-based genetic linkage map in watermelon (Citrullus lanatus [Thunb.] Matsum. and Nakai) constructed using whole-genome re-sequencing data.

    Science.gov (United States)

    Liu, Shi; Gao, Peng; Zhu, Qianglong; Luan, Feishi; Davis, Angela R; Wang, Xiaolu

    2016-03-01

    Cleaved amplified polymorphic sequence (CAPS) markers are useful tools for detecting single nucleotide polymorphisms (SNPs). This study detected and converted SNP sites into CAPS markers based on high-throughput re-sequencing data in watermelon, for linkage map construction and quantitative trait locus (QTL) analysis. Two inbred lines, Cream of Saskatchewan (COS) and LSW-177, had been re-sequenced and analyzed with a custom Perl script for CAPS marker development. 88.7% and 78.5% of the assembled sequences of the two parental materials could map to the reference watermelon genome, respectively. Comparative assembled genome data analysis provided 225,693 and 19,268 SNPs and indels between the two materials. 532 pairs of CAPS markers were designed with 16 restriction enzymes, among which 271 pairs of primers gave distinct bands of the expected length and polymorphic bands, via PCR and enzyme digestion, with a polymorphic rate of 50.94%. Using the new CAPS markers, an initial CAPS-based genetic linkage map was constructed with the F2 population, spanning 1836.51 cM with 11 linkage groups and 301 markers. 12 QTLs were detected related to fruit flesh color, length, width, shape index, and brix content. These new CAPS markers will be a valuable resource for breeding programs and genetic studies of watermelon.

  18. Analysis of the indel at the ARMS2 3′UTR in age-related macular degeneration

    Science.gov (United States)

    Wang, Gaofeng; Spencer, Kylee L.; Scott, William K.; Whitehead, Patrice; Court, Brenda L.; Ayala-Haedo, Juan; Mayo, Ping; Schwartz, Stephen G.; Kovach, Jaclyn L.; Gallins, Paul; Polk, Monica; Agarwal, Anita; Postel, Eric A.; Haines, Jonathan L.; Pericak-Vance, Margaret A.

    2010-01-01

    Controversy remains as to which gene at the chromosome 10q26 locus confers risk for age-related macular degeneration (AMD) and statistical genetic analysis is confounded by the strong linkage disequilibrium (LD) across the region. Functional analysis of related genetic variations could solve this puzzle. Recently Fritsche et al. reported that AMD is associated with unstable ARMS2 transcripts possibly caused by a complex insertion/deletion (indel; consisting of a 443 bp deletion and an adjacent 54 bp insertion) in its 3′UTR (untranslated region). To validate this indel, we sequenced our samples. We found that this indel is even more complex and is composed of two side-by-side indels separated by 17 bp: (1) 9 bp deletion with 10 bp insertion; (2) 417 bp deletion with 27 bp insertion. The indel is significantly associated with the risk of AMD, but is also in strong LD with the non-synonymous single nucleotide polymorphism (SNP) rs10490924 (A69S). We also found that ARMS2 is expressed not only in placenta and retina but also in multiple human tissues. Using quantitative PCR, we found no correlation between the indel and ARMS2 mRNA level in human retina and blood samples. The lack of functional effects of the 3′UTR indel, the amino acid substitution of rs10490924 (A69S) and strong LD between them suggest that A69S, not the indel, is the variant that confers risk of AMD. To our knowledge, this is the first demonstration that ARMS2 is widely expressed in human tissues. In conclusion, the indel at the 3′UTR of ARMS2 actually contains two side-by-side indels. The indels are associated with risk of AMD, but not correlated with ARMS2 mRNA level. PMID:20182747

  19. Characterization and potential functional significance of human-chimpanzee large INDEL variation

    Directory of Open Access Journals (Sweden)

    Polavarapu Nalini

    2011-10-01

    Background: Although humans and chimpanzees have accumulated significant differences in a number of phenotypic traits since diverging from a common ancestor about six million years ago, their genomes are more than 98.5% identical at protein-coding loci. This modest degree of nucleotide divergence is not sufficient to explain the extensive phenotypic differences between the two species. It has been hypothesized that the genetic basis of the phenotypic differences lies at the level of gene regulation and is associated with the extensive insertion and deletion (INDEL) variation between the two species. To test the hypothesis that large INDELs (80 to 12,000 bp) may have contributed significantly to differences in gene regulation between the two species, we categorized human-chimpanzee INDEL variation mapping in or around genes and determined whether this variation is significantly correlated with previously determined differences in gene expression. Results: Extensive, large INDEL variation exists between the human and chimpanzee genomes. This variation is primarily attributable to retrotransposon insertions within the human lineage. There is a significant correlation between differences in gene expression and large human-chimpanzee INDEL variation mapping in genes or in proximity to them. Conclusions: The results presented herein are consistent with the hypothesis that large INDELs, particularly those associated with retrotransposons, have played a significant role in human-chimpanzee regulatory evolution.

  20. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  1. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Science.gov (United States)

    Luo, Huaiyong; Wang, Xiaojie; Zhan, Gangming; Wei, Guorong; Zhou, Xinli; Zhao, Jing; Huang, Lili; Kang, Zhensheng

    2015-01-01

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
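As a rough illustration of how di- and tri-nucleotide SSR loci might be located in a sequence and how a density such as "one SSR per 22.95 kb" is computed, the following Python sketch uses a regular-expression scan. It is not the pipeline used in the study: the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) and the function names are hypothetical, since the abstract does not state its search criteria.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, unit, repeat_count) for each perfect SSR found in seq.

    min_repeats maps repeat-unit length to the minimum number of tandem
    copies; the defaults are illustrative assumptions only.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One unit captured in group 2, then at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(2)
            if len(set(unit)) == 1:
                continue  # a run of one base is a homopolymer, not this motif class
            hits.append((match.start(), unit, len(match.group(1)) // unit_len))
    return sorted(hits)


def kb_per_ssr(total_bp, n_ssrs):
    """Marker density expressed as kilobases of sequence per SSR locus."""
    return total_bp / n_ssrs / 1000.0


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCG" + "GAT" * 5 + "TT"
    print(find_ssrs(demo))
    print(kb_per_ssr(1_000_000, 40))
```

Candidate loci found this way would still need flanking primers and a comparison across isolates before they could be called polymorphic markers, which is the step the abstract describes as in silico confirmation against INDELs.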

  2. Insertion and deletion polymorphisms of the ancient AluS family in the human genome.

    Science.gov (United States)

    Kryatova, Maria S; Steranka, Jared P; Burns, Kathleen H; Payer, Lindsay M

    2017-01-01

    Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3' intact with 3' poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion

  3. Intra-strain polymorphisms are detected but no genomic alteration is found in cloned mice

    International Nuclear Information System (INIS)

    Gotoh, Koshichi; Inoue, Kimiko; Ogura, Atsuo; Oishi, Michio

    2006-01-01

    In-gel competitive reassociation (IGCR) is a method for differential subtraction of polymorphic (RFLP) DNA fragments between two DNA samples of interest without probes or specific sequence information. Here, we applied the IGCR procedure to two cloned mice derived from an F1 hybrid of the C57BL/6Cr and DBA/2 strains, in order to investigate the possibility of genomic alteration in the cloned mouse genomes. Each of the five genomic alterations we detected between the two cloned mice corresponded to an 'intra-strain' polymorphism in the C57BL/6Cr and DBA/2 mouse strains. Our result suggests that no severe aberration of genome sequences occurs due to somatic cell nuclear transfer.

  4. Genomic Relatedness of Chlamydia Isolates Determined by Amplified Fragment Length Polymorphism Analysis

    OpenAIRE

    Meijer, Adam; Morré, Servaas A.; Van Den Brule, Adriaan J. C.; Savelkoul, Paul H. M.; Ossewaarde, Jacobus M.

    1999-01-01

    The genomic relatedness of 19 Chlamydia pneumoniae isolates (17 from respiratory origin and 2 from atherosclerotic origin), 21 Chlamydia trachomatis isolates (all serovars from the human biovar, an isolate from the mouse biovar, and a porcine isolate), 6 Chlamydia psittaci isolates (5 avian isolates and 1 feline isolate), and 1 Chlamydia pecorum isolate was studied by analyzing genomic amplified fragment length polymorphism (AFLP) fingerprints. The AFLP procedure was adapted from a previously...

  5. Characterization of polymorphic SSRs among Prunus chloroplast genomes

    Science.gov (United States)

    An in silico mining process yielded 80, 75, and 78 microsatellites in the chloroplast genome of Prunus persica, P. kansuensis, and P. mume. A and T repeats were predominant in the three genomes, accounting for 67.8% on average and most of them were successful in primer design. For the 80 P. persica ...

  6. Fast and sensitive detection of indels induced by precise gene targeting

    DEFF Research Database (Denmark)

    Yang, Zhang; Steentoft, Catharina; Hauge, Camilla

    2015-01-01

    The nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high throughput studies. A major limitation in application of precise gene editing lies in lack of sensitive and fast methods to detect and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining leading to introduction of insertions and deletions (indels) at the target site. These indels are often small and difficult and laborious to detect...

  7. Extensive variation in the density and distribution of DNA polymorphism in sorghum genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Evans

    Sorghum genotypes currently used for grain production in the United States were developed from African landraces that were imported starting in the mid-to-late 19th century. Farmers and plant breeders selected genotypes for grain production with reduced plant height, early flowering, increased grain yield, adaptation to drought, and improved resistance to lodging, diseases and pests. DNA polymorphisms that distinguish three historically important grain sorghum genotypes, BTx623, BTx642 and Tx7000, were characterized by genome sequencing, genotyping by sequencing, genetic mapping, and pedigree-based haplotype analysis. The distribution and density of DNA polymorphisms in the sequenced genomes varied widely, in part because the lines were derived through breeding and selection from diverse Kafir, Durra, and Caudatum race accessions. Genomic DNA spanning dw1 (SBI-09) and dw3 (SBI-07) had identical haplotypes due to selection for reduced height. Lower SNP density in genes located in pericentromeric regions compared with genes located in euchromatic regions is consistent with background selection in these regions of low recombination. SNP density was higher in euchromatic DNA and varied >100-fold in contiguous intervals that spanned up to 300 Kbp. The localized variation in DNA polymorphism density occurred throughout euchromatic regions where recombination is elevated; however, polymorphism density was not correlated with gene density or DNA methylation. Overall, sorghum chromosomes contain distal euchromatic regions characterized by extensive, localized variation in DNA polymorphism density, and large pericentromeric regions of low gene density, diversity, and recombination.

  8. Human Xq28 Inversion Polymorphism: From Sex Linkage to Genomics--A Genetic Mother Lode

    Science.gov (United States)

    Kirby, Cait S.; Kolber, Natalie; Salih Almohaidi, Asmaa M.; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to…

  9. On peculiar Šindel sequences

    Czech Academy of Sciences Publication Activity Database

    Křížek, Michal; Somer, L.

    2010-01-01

    Roč. 17, č. 2 (2010), s. 129-140 ISSN 0972-5555 R&D Projects: GA AV ČR(CZ) IAA100190803 Institutional research plan: CEZ:AV0Z10190503 Keywords : quadratic residue * Chinese remainder theorem * primitive Šindel sequences * Prague clock sequence Subject RIV: BA - General Mathematics http://www.pphmj.com/abstract/5095.htm

  10. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Directory of Open Access Journals (Sweden)

    Gopala Krishnan S

    BACKGROUND: Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. RESULTS: We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. CONCLUSIONS: Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  11. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Science.gov (United States)

    Krishnan S, Gopala; Waters, Daniel L E; Henry, Robert J

    2014-01-01

    Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  12. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    Science.gov (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
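    The definition of an SSR given above (tandem repeats of 1-6 bp motifs) can be illustrated with a small regular-expression scan. This is only a sketch and not the PSSRdb pipeline: the function name `find_ssrs` and the `min_repeats` threshold are assumptions made for illustration, and the abstract does not state which thresholds the authors used.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for tandem repeats of short motifs.

    min_repeats is a hypothetical threshold chosen for illustration only.
    """
    # Lazy motif group so the shortest repeating unit is tried first;
    # the backreference \1 then requires at least min_repeats - 1 more copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Two made-up strain sequences that differ only in the length of an AT repeat
strain_a = "GGCATATATATCC"
strain_b = "GGCATATATCC"
print(find_ssrs(strain_a))
print(find_ssrs(strain_b))
```

    Comparing the repeat counts reported for the same locus in two strains is one simple way to flag a length-polymorphic SSR of the kind the database catalogs.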

  13. Association between the polymorphisms of angiotensin converting enzyme (Peptidyl-Dipeptidase A) INDEL mutation (I/D) and Angiotensin II type I receptor (A1166C) and breast cancer among post menopausal Egyptian females

    Directory of Open Access Journals (Sweden)

    Rania Mohamed El Sharkawy

    2014-09-01

    Results: A statistically significant difference in AT1R A1166C SNP genotype frequencies was found among the studied groups. The patients group showed higher frequency of “CC” (2.9% vs 0%) and “AC” (44.3% vs 24%) and lower frequency of “AA” genotype (52.9% vs 76%) than controls. The patients also showed significantly higher frequency of allele “C” (25% vs 12%), which was associated with increased breast cancer risk with an odds ratio of 2.4444 (95% CI: 1.1967–4.9931). Testing the dominant model of inheritance revealed a statistically higher frequency of exposed genotypes “AC and CC” among the patients group (47.1% vs 24%, respectively; p = 0.013), with substantial increase in breast cancer risk among the exposed genotypes with an odds ratio of 2.8243 (95% CI: 1.2679–6.2913). The present study demonstrated that AC and CC genotypes of AT1R A1166C SNP and increased BMI can be considered as predictors for breast cancer risk among post menopausal Egyptian females. Results also revealed that A1166C SNP of AT1R gene and ACE/ID polymorphism could not be considered as predictors for breast cancer prognosis.
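    The odds ratios and confidence intervals quoted above follow the usual 2x2-table calculation, which can be sketched as below. The abstract reports only percentages, so the allele counts used here (35 C and 105 A alleles in patients, 12 C and 88 A alleles in controls) are assumptions picked to match the reported 25% vs 12%; the function name `odds_ratio_ci` is likewise illustrative. With these assumed counts the result comes out close to the reported odds ratio and interval.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a, b: exposed / unexposed counts among cases
    c, d: exposed / unexposed counts among controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Assumed allele counts (not given in the abstract): C vs A in patients and controls
print(odds_ratio_ci(35, 105, 12, 88))
```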

  14. Evaluation of multiple approaches to identify genome-wide polymorphisms in closely related genotypes of sweet cherry (Prunus avium L.)

    Directory of Open Access Journals (Sweden)

    Seanna Hewitt

    Identification of genetic polymorphisms and subsequent development of molecular markers is important for marker assisted breeding of superior cultivars of economically important species. Sweet cherry (Prunus avium L.) is an economically important non-climacteric tree fruit crop in the Rosaceae family and has undergone a genetic bottleneck due to breeding, resulting in limited genetic diversity in the germplasm that is utilized for breeding new cultivars. Therefore, it is critical to recognize the best platforms for identifying genome-wide polymorphisms that can help identify, and consequently preserve, the diversity in a genetically constrained species. For the identification of polymorphisms in five closely related genotypes of sweet cherry, a gel-based approach (TRAP), reduced representation sequencing (TRAPseq), a 6k cherry SNParray, and whole genome sequencing (WGS) approaches were evaluated. All platforms facilitated detection of polymorphisms among the genotypes with variable efficiency. In assessing multiple SNP detection platforms, this study has demonstrated that a combination of appropriate approaches is necessary for efficient polymorphism identification, especially between closely related cultivars of a species. The information generated in this study provides a valuable resource for future genetic and genomic studies in sweet cherry, and the insights gained from the evaluation of multiple approaches can be utilized for other closely related species with limited genetic diversity in the breeding germplasm. Keywords: Polymorphisms, Prunus avium, Next-generation sequencing, Target region amplification polymorphism (TRAP), Genetic diversity, SNParray, Reduced representation sequencing, Whole genome sequencing (WGS)

  15. Polymorphism and mutation analysis of genomic DNA on cancer

    International Nuclear Information System (INIS)

    Ohta, Tsutomu

    2003-01-01

    DNA repair is a universal process in living cells that maintains the structural integrity of chromosomal DNA molecules in the face of damage. A deficiency in DNA damage repair is associated with an increased cancer risk because it increases the mutation frequency of cancer-related genes. Variation in DNA repair capacity may be genetically determined. Therefore, we searched for single-nucleotide polymorphisms (SNPs) in major DNA repair genes. This led to the finding of 600 SNPs and mutations, including many novel SNPs, in the Japanese population. Case-control studies to explore the contribution of the SNPs in DNA repair genes to the risk of lung cancer revealed that five SNPs are associated with lung carcinogenesis. One of these SNPs is found in the RAD54L gene, which is involved in double-strand DNA repair. We analyzed and reported activities of Rad54L protein with the SNP and mutations. (authors)

  16. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.

  17. Genomic relations among 31 species of Mammillaria haworth (Cactaceae) using random amplified polymorphic DNA.

    Science.gov (United States)

    Mattagajasingh, Ilwola; Mukherjee, Arup Kumar; Das, Premananda

    2006-01-01

    Thirty-one species of Mammillaria were selected to study the molecular phylogeny using random amplified polymorphic DNA (RAPD) markers. High amount of mucilage (gelling polysaccharides) present in Mammillaria was a major obstacle in isolating good quality genomic DNA. The CTAB (cetyl trimethyl ammonium bromide) method was modified to obtain good quality genomic DNA. Twenty-two random decamer primers resulted in 621 bands, all of which were polymorphic. The similarity matrix value varied from 0.109 to 0.622 indicating wide variability among the studied species. The dendrogram obtained from the unweighted pair group method using arithmetic averages (UPGMA) analysis revealed that some of the species did not follow the conventional classification. The present work shows the usefulness of RAPD markers for genetic characterization to establish phylogenetic relations among Mammillaria species.
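    A similarity matrix of the kind mentioned above is typically built from band presence/absence profiles before clustering. The sketch below uses the Jaccard coefficient as one common choice; the abstract does not say which coefficient the authors used, and the species labels and band identifiers are invented for illustration.

```python
def jaccard_similarity(bands_a, bands_b):
    """Shared bands divided by all bands observed in either profile."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def similarity_matrix(profiles):
    """Pairwise similarity for a dict mapping sample name -> set of band IDs."""
    names = sorted(profiles)
    return {(x, y): jaccard_similarity(profiles[x], profiles[y])
            for x in names for y in names}

# Hypothetical RAPD band profiles (band IDs are arbitrary labels)
profiles = {
    "species_1": {1, 2, 3},
    "species_2": {2, 3, 4},
    "species_3": {5, 6},
}
matrix = similarity_matrix(profiles)
print(matrix[("species_1", "species_2")])
```

    A matrix like this (or the corresponding distances, 1 minus similarity) is what a UPGMA routine would take as input to produce the dendrogram.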

  18. Genomic diversity among Danish field strains of Mycoplasma hyosynoviae assessed by amplified fragment length polymorphism analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, Niels F.; Nielsen, Elisabeth O.

    2002-01-01

    Genomic diversity among strains of Mycoplasma hyosynoviae isolated in Denmark was assessed by using amplified fragment length polymorphism (AFLP) analysis. Ninety-six strains, obtained from different specimens and geographical locations during 30 years and the type strain of M. hyosynoviae S16(T) were concurrently examined for variance in BglII-MfeI and EcoRI-Csp6I-A AFLP markers. A total of 56 different genomic fingerprints having an overall similarity between 77 and 96% were detected. No correlation between AFLP variability and period of isolation or anatomical site of isolation could...

  19. Genome-wide DNA polymorphisms in Kavuni, a traditional rice cultivar with nutritional and therapeutic properties.

    Science.gov (United States)

    Rathinasabapathi, Pasupathi; Purushothaman, Natarajan; Parani, Madasamy

    2016-05-01

    Although rice genome was sequenced in the year 2002, efforts in resequencing the large number of available accessions, landraces, traditional cultivars, and improved varieties of this important food crop are limited. We have initiated resequencing of the traditional cultivars from India. Kavuni is an important traditional rice cultivar from South India that attracts premium price for its nutritional and therapeutic properties. Whole-genome sequencing of Kavuni using Illumina platform and SNPs analysis using Nipponbare reference genome identified 1 150 711 SNPs of which 377 381 SNPs were located in the genic regions. Non-synonymous SNPs (62 708) were distributed in 19 251 genes, and their number varied between 1 and 115 per gene. Large-effect DNA polymorphisms (7769) were present in 3475 genes. Pathway mapping of these polymorphisms revealed the involvement of genes related to carbohydrate metabolism, translation, protein-folding, and cell death. Analysis of the starch biosynthesis related genes revealed that the granule-bound starch synthase I gene had T/G SNPs at the first intron/exon junction and a two-nucleotide combination, which were reported to favour high amylose content and low glycemic index. The present study provided a valuable genomics resource to study the rice varieties with nutritional and medicinal properties.

  20. An 8bp indel in exon 1 of Ghrelin gene associated with chicken growth.

    Science.gov (United States)

    Fang, Meixia; Nie, Qinghua; Luo, Chenglong; Zhang, Dexiang; Zhang, Xiquan

    2007-04-01

    Ghrelin, which acts as the endogenous ligand for growth hormone secretagogues receptor (GHS-R), is a novel growth hormone (GH) releasing peptide with reported effects on food intake in chickens. In this study, an 8 bp indel polymorphism in exon 1 of the chicken Ghrelin (cGHRL) gene was genotyped in a F(2) designed full-sib population to analyze its associations with chicken growth and carcass traits. Later, the mRNA level in the proventriculus was determined by real-time PCR to reveal the expression pattern of the cGHRL gene. Results showed that this 8 bp indel was significantly associated with body weight at the age of 28 days (BW28) and 56 days (BW56), eviscerated weight (EW) and leg muscle weight (LMW) (PGhrelin on chicken growth were indicated by this study.

  1. Effects of As2O3 on DNA methylation, genomic instability, and LTR retrotransposon polymorphism in Zea mays.

    Science.gov (United States)

    Erturk, Filiz Aygun; Aydin, Murat; Sigmaz, Burcu; Taspinar, M Sinan; Arslan, Esra; Agar, Guleray; Yagci, Semra

    2015-12-01

    Arsenic is well known to be toxic to living organisms. However, limited efforts have been made to study its effects on DNA methylation, genomic instability, and long terminal repeat (LTR) retrotransposon polymorphism in different crops. In the present study, effects of As2O3 (arsenic trioxide) on LTR retrotransposon polymorphism and DNA methylation as well as DNA damage in Zea mays seedlings were investigated. The results showed that all arsenic doses decreased genomic template stability (GTS) and increased changes in Random Amplified Polymorphic DNA (RAPD) profiles (DNA damage). In addition, increased DNA methylation and LTR retrotransposon polymorphism, which offer a model to explain epigenetic changes in gene expression, were also found. The results of this experiment clearly show that arsenic has an epigenetic effect as well as a genotoxic effect. In particular, the increase in polymorphism of some LTR retrotransposons under arsenic stress may be part of the defense system against the stress.

  2. A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing

    Directory of Open Access Journals (Sweden)

    Guangtu Gao

    2018-04-01

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we have performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery. Of the 49 new samples, 11 were double-haploid lines from Washington State University (WSU) and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1), which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the double haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs) and multi-sequence variants (MSVs). Overall, 30,302,087 SNPs were identified on the 29 chromosomes of the rainbow trout genome and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25). The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. Intra-Population polymorphism assessment revealed high level of polymorphism and

  3. Development and Molecular Characterization of Novel Polymorphic Genomic DNA SSR Markers in Lentinula edodes.

    Science.gov (United States)

    Moon, Suyun; Lee, Hwa-Yong; Shim, Donghwan; Kim, Myungkil; Ka, Kang-Hyeon; Ryoo, Rhim; Ko, Han-Gyu; Koo, Chang-Duck; Chung, Jong-Wook; Ryu, Hojin

    2017-06-01

    Sixteen genomic DNA simple sequence repeat (SSR) markers of Lentinula edodes were developed from 205 SSR motifs present in 46.1-Mb long L. edodes genome sequences. The number of alleles ranged from 3 to 14 and the major allele frequency ranged from 0.17 to 0.96. The values of observed and expected heterozygosity ranged from 0.00 to 0.76 and from 0.07 to 0.90, respectively. The polymorphic information content value ranged from 0.07 to 0.89. A dendrogram, based on 16 SSR markers clustered by the paired hierarchical clustering method, showed that 33 shiitake cultivars could be divided into three major groups and successfully identified. These SSR markers will contribute to the efficient breeding of this species by providing diversity in shiitake varieties. Furthermore, the genomic information covered by the markers can provide a valuable resource for genetic linkage map construction, molecular mapping, and marker-assisted selection in the shiitake mushroom.
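    Expected heterozygosity and polymorphic information content (PIC) are conventionally computed from the allele frequencies at a marker. The sketch below uses the standard textbook formulas (PIC in the form given by Botstein et al., 1980); the abstract does not state which software or exact formulas the authors applied, and the function names and example frequencies are illustrative assumptions.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return 1.0 - homozygosity - cross

# Hypothetical marker with two equally frequent alleles
freqs = [0.5, 0.5]
print(expected_heterozygosity(freqs))
print(polymorphic_information_content(freqs))
```

    PIC never exceeds expected heterozygosity, which is consistent with the slightly lower upper bound reported for PIC (0.89) than for expected heterozygosity (0.90).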

  4. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic region. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNP-containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for

  5. Polymorphic integrations of an endogenous gammaretrovirus in the mule deer genome.

    Science.gov (United States)

    Elleder, Daniel; Kim, Oekyung; Padhi, Abinash; Bankert, Jason G; Simeonov, Ivan; Schuster, Stephan C; Wittekindt, Nicola E; Motameny, Susanne; Poss, Mary

    2012-03-01

    Endogenous retroviruses constitute a significant genomic fraction in all mammalian species. Typically they are evolutionarily old and fixed in the host species population. Here we report on a novel endogenous gammaretrovirus (CrERVγ; for cervid endogenous gammaretrovirus) in the mule deer (Odocoileus hemionus) that is insertionally polymorphic among individuals from the same geographical location, suggesting that it has a more recent evolutionary origin. Using PCR-based methods, we identified seven CrERVγ proviruses and demonstrated that they show various levels of insertional polymorphism in mule deer individuals. One CrERVγ provirus was detected in all mule deer sampled but was absent from white-tailed deer, indicating that this virus originally integrated after the split of the two species, which occurred approximately one million years ago. There are, on average, 100 CrERVγ copies in the mule deer genome based on quantitative PCR analysis. A CrERVγ provirus was sequenced and contained intact open reading frames (ORFs) for three virus genes. Transcripts were identified covering the entire provirus. CrERVγ forms a distinct branch of the gammaretrovirus phylogeny, with the closest relatives of CrERVγ being endogenous gammaretroviruses from sheep and pig. We demonstrated that white-tailed deer (Odocoileus virginianus) and elk (Cervus canadensis) DNA contain proviruses that are closely related to mule deer CrERVγ in a conserved region of pol; more distantly related sequences can be identified in the genome of another member of the Cervidae, the muntjac (Muntiacus muntjak). The discovery of a novel transcriptionally active and insertionally polymorphic retrovirus in mammals could provide a useful model system to study the dynamic interaction between the host genome and an invading retrovirus.

  6. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  7. Draft genome sequence of Coxiella burnetii Dog Utad, a strain isolated from a dog-related outbreak of Q fever

    Directory of Open Access Journals (Sweden)

    F. D’amato

    2014-07-01

    Coxiella burnetii Dog Utad, with a 2 008 938 bp genome, is a strain isolated from a parturient dog responsible for a human familial outbreak of acute Q fever in Nova Scotia, Canada. Its genotype, determined by multispacer typing, is 21, the only genotype found in Canada that includes Q212, which causes endocarditis. Only 107 single nucleotide polymorphisms and 16 INDELs differed from Q212, suggesting a recent clonal radiation.

  8. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    Science.gov (United States)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and efficient if it were based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, can model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model using whole-genome single nucleotide polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, the DBN outperformed the other methods, namely reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. The DBN achieved a correlation of 0.579 on a scale from -1 to 1.

  9. Investigation of inversion polymorphisms in the human genome using principal components analysis.

    Science.gov (United States)

    Ma, Jianzhong; Amos, Christopher I

    2012-01-01

    Despite the significant advances made over the last few years in mapping inversions with the advent of paired-end sequencing approaches, our understanding of the prevalence and spectrum of inversions in the human genome has lagged behind other types of structural variants, mainly due to the lack of a cost-efficient method applicable to large-scale samples. We propose a novel method based on principal components analysis (PCA) to characterize inversion polymorphisms using high-density SNP genotype data. Our method applies to non-recurrent inversions for which recombination between the inverted and non-inverted segments in inversion heterozygotes is suppressed due to the loss of unbalanced gametes. Inside such an inversion region, an effect similar to population substructure is thus created: two distinct "populations" of inversion homozygotes of different orientations and their 1:1 admixture, namely the inversion heterozygotes. This kind of substructure can be readily detected by performing PCA locally in the inversion regions. Using simulations, we demonstrated that the proposed method can be used to detect and genotype inversion polymorphisms using unphased genotype data. We applied our method to the phase III HapMap data and inferred the inversion genotypes of known inversion polymorphisms at 8p23.1 and 17q21.31. These inversion genotypes were validated by comparing with literature results and by checking Mendelian consistency using the family data whenever available. Based on the PCA-approach, we also performed a preliminary genome-wide scan for inversions using the HapMap data, which resulted in 2040 candidate inversions, 169 of which overlapped with previously reported inversions. Our method can be readily applied to the abundant SNP data, and is expected to play an important role in developing human genome maps of inversions and exploring associations between inversions and susceptibility of diseases.
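    The local-PCA idea in the abstract above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the allele frequencies, sample sizes, and the function names `simulate_region` and `first_pc_scores` are all hypothetical, and the first principal component is obtained by plain power iteration. Two haplotype pools stand in for the two orientations, and heterozygotes carry one haplotype from each.

```python
import random


def simulate_region(n_per_class=30, n_snps=200, seed=0):
    """Simulate 0/1/2 genotypes for SNPs inside a hypothetical inversion region."""
    rng = random.Random(seed)
    # Assumption: each orientation has its own independently drawn allele frequencies.
    freq_std = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
    freq_inv = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

    def haplotype(freq):
        return [1 if rng.random() < f else 0 for f in freq]

    genotypes, classes = [], []
    # Class 0: standard homozygote, 1: heterozygote, 2: inverted homozygote.
    pairs = [(freq_std, freq_std), (freq_std, freq_inv), (freq_inv, freq_inv)]
    for label, (freq_a, freq_b) in enumerate(pairs):
        for _ in range(n_per_class):
            hap_a, hap_b = haplotype(freq_a), haplotype(freq_b)
            genotypes.append([a + b for a, b in zip(hap_a, hap_b)])
            classes.append(label)
    return genotypes, classes


def first_pc_scores(genotypes, n_iter=30):
    """Project individuals onto the first principal component (power iteration)."""
    n, m = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    direction = [1.0] * m
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, direction)) for row in centered]
        direction = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(w * w for w in direction) ** 0.5
        direction = [w / norm for w in direction]
    return [sum(x * w for x, w in zip(row, direction)) for row in centered]


genotypes, classes = simulate_region()
pc1 = first_pc_scores(genotypes)
class_means = [
    sum(s for s, c in zip(pc1, classes) if c == k) / classes.count(k)
    for k in range(3)
]
print(class_means)
```

    In this toy setting the heterozygotes fall between the two homozygote groups on the first component, which is the substructure-like pattern the abstract describes. Real data would also need quality control and a choice of region boundaries, neither of which is modelled here.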

  10. Draft genome of the sea cucumber Apostichopus japonicus and genetic polymorphism among color variants.

    Science.gov (United States)

    Jo, Jihoon; Oh, Jooseong; Lee, Hyun-Gwan; Hong, Hyun-Hee; Lee, Sung-Gwon; Cheon, Seongmin; Kern, Elizabeth M A; Jin, Soyeong; Cho, Sung-Jin; Park, Joong-Ki; Park, Chungoo

    2017-01-01

    The Japanese sea cucumber (Apostichopus japonicus Selenka 1867) is an economically important species as a source of seafood and ingredient in traditional medicine. It is mainly found off the coasts of northeast Asia. Recently, substantial exploitation and widespread biotic diseases in A. japonicus have generated increasing conservation concern. However, the genomic knowledge base and resources available for researchers to use in managing this natural resource and to establish genetically based breeding systems for sea cucumber aquaculture are still in a nascent stage. A total of 312 Gb of raw sequences were generated using the Illumina HiSeq 2000 platform and assembled to a final size of 0.66 Gb, which is about 80.5% of the estimated genome size (0.82 Gb). We observed nucleotide-level heterozygosity within the assembled genome to be 0.986%. The resulting draft genome assembly comprising 132 607 scaffolds with an N50 value of 10.5 kb contains a total of 21 771 predicted protein-coding genes. We identified 6.6-14.5 million heterozygous single nucleotide polymorphisms in the assembled genome of the three natural color variants (green, red, and black), resulting in an estimated nucleotide diversity of 0.00146. We report the first draft genome of A. japonicus and provide a general overview of the genetic variation in the three major color variants of A. japonicus. These data will help provide a comprehensive view of the genetic, physiological, and evolutionary relationships among color variants in A. japonicus, and will be invaluable resources for sea cucumber genomic research. © The Author 2017. Published by Oxford University Press.

  11. Genome-Wide Association of Copy Number Polymorphisms and Kidney Function.

    Directory of Open Access Journals (Sweden)

    Man Li

    Genome-wide association studies (GWAS) using single nucleotide polymorphisms (SNPs) have identified more than 50 loci associated with estimated glomerular filtration rate (eGFR), a measure of kidney function. However, significant SNPs account for a small proportion of eGFR variability. Other forms of genetic variation have not been comprehensively evaluated for association with eGFR. In this study, we assess whether changes in germline DNA copy number are associated with GFR estimated from serum creatinine, eGFRcrea. We used hidden Markov models (HMMs) to identify copy number polymorphic regions (CNPs) from high-throughput SNP arrays for 2,514 African (AA) and 8,645 European ancestry (EA) participants in the Atherosclerosis Risk in Communities (ARIC) study. Separately for the EA and AA cohorts, we used Bayesian Gaussian mixture models to estimate copy number at regions identified by the HMM or previously reported in the HapMap Project. We identified 312 and 464 autosomal CNPs among individuals of EA and AA, respectively. Multivariate models adjusted for SNP-derived covariates of population structure identified one CNP in the EA cohort near genome-wide statistical significance (Bonferroni-adjusted p = 0.067) located on chromosome 5 (876-880kb). Overall, our findings suggest a limited role of CNPs in explaining eGFR variability.

  12. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  13. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  14. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude. We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion. Using an invasive (W83) and the only available non-invasive P. gingivalis strain (AJW4) and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH) analysis. We identified 68 annotated and 51 hypothetical open reading frames (ORFs) that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982 was tested. The PG0185 (ragA) and PG0186 (ragB) mutants had 5.1×10³-fold and 3.6×10³-fold decreased in vitro invasion ability, respectively. The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for P. gingivalis non-invasive phenotype.

  15. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-01-01

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  16. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  17. Genome-based polymorphic microsatellite development and validation in the mosquito Aedes aegypti and application to population genetics in Haiti

    Directory of Open Access Journals (Sweden)

    Streit Thomas G

    2009-12-01

    Background: Microsatellite markers have proven useful in genetic studies in many organisms, yet microsatellite-based studies of the dengue and yellow fever vector mosquito Aedes aegypti have been limited by the number of assayable and polymorphic loci available, despite multiple independent efforts to identify them. Here we present strategies for efficient identification and development of useful microsatellites with broad coverage across the Aedes aegypti genome, development of multiplex-ready PCR groups of microsatellite loci, and validation of their utility for population analysis with field collections from Haiti. Results: From 79 putative microsatellite loci representing 31 motifs identified in 42 whole genome sequence supercontig assemblies in the Aedes aegypti genome, 33 microsatellites providing genome-wide coverage amplified as single copy sequences in four lab strains, with a range of 2-6 alleles per locus. The tri-nucleotide motifs represented the majority (51%) of the polymorphic single copy loci, and none of these was located within a putative open reading frame. Seven groups of 4-5 microsatellite loci each were developed for multiplex-ready PCR. Four multiplex-ready groups were used to investigate population genetics of Aedes aegypti populations sampled in Haiti. Of the 23 loci represented in these groups, 20 were polymorphic with a range of 3-24 alleles per locus (mean = 8.75). Allelic polymorphic information content varied from 0.171 to 0.867 (mean = 0.545). Most loci met Hardy-Weinberg expectations across populations and pairwise FST comparisons identified significant genetic differentiation between some populations. No evidence for genetic isolation by distance was observed. Conclusion: Despite limited success in previous reports, we demonstrate that the Aedes aegypti genome is well-populated with single copy, polymorphic microsatellite loci that can be uncovered using the strategy developed here for rapid and efficient

  18. Methylation-Sensitive Amplification Length Polymorphism (MS-AFLP) Microarrays for Epigenetic Analysis of Human Genomes.

    Science.gov (United States)

    Alonso, Sergio; Suzuki, Koichi; Yamamoto, Fumiichiro; Perucho, Manuel

    2018-01-01

    Somatic and, to a lesser extent, germ line epigenetic aberrations are fundamental to carcinogenesis, cancer progression, and tumor phenotype. DNA methylation is the most extensively studied and arguably the best understood of the epigenetic mechanisms that become altered in cancer. Both somatic loss of methylation (hypomethylation) and gain of methylation (hypermethylation) are found in the genome of malignant cells. In general, the cancer cell epigenome is globally hypomethylated, while some regions, typically gene-associated CpG islands, become hypermethylated. Given the profound impact that DNA methylation exerts on the transcriptional profile and genomic stability of cancer cells, its characterization is essential to fully understand the complexity of cancer biology, improve tumor classification, and ultimately advance cancer patient management and treatment. A plethora of methods have been devised to analyze and quantify DNA methylation alterations. Several of the early-developed methods relied on the use of methylation-sensitive restriction enzymes, whose activity depends on the methylation status of their recognition sequences. Among these techniques, methylation-sensitive amplification length polymorphism (MS-AFLP) was developed in the early 2000s and successfully adapted from its original gel electrophoresis fingerprinting format to a microarray format that notably increased its throughput and allowed quantification of the methylation changes. This array-based platform interrogates over 9500 independent loci putatively amplified by the MS-AFLP technique, corresponding to the NotI sites mapped throughout the human genome.

  19. Genome sequence of herpes simplex virus 1 strain KOS.

    Science.gov (United States)

    Macdonald, Stuart J; Mostafa, Heba H; Morrison, Lynda A; Davido, David J

    2012-06-01

    Herpes simplex virus type 1 (HSV-1) strain KOS has been extensively used in many studies to examine HSV-1 replication, gene expression, and pathogenesis. Notably, strain KOS is known to be less pathogenic than the first sequenced genome of HSV-1, strain 17. To understand the genotypic differences between KOS and other phenotypically distinct strains of HSV-1, we sequenced the viral genome of strain KOS. When comparing strain KOS to strain 17, there are at least 1,024 single nucleotide polymorphisms (SNPs) and 172 insertions/deletions (indels). The polymorphisms observed in the KOS genome will likely provide insights into the genes, their protein products, and the cis elements that regulate the biology of this HSV-1 strain.

  20. Sequence length variation, indel costs, and congruence in sensitivity analysis

    DEFF Research Database (Denmark)

    Aagesen, Lone; Petersen, Gitte; Seberg, Ole

    2005-01-01

    The behavior of two topological and four character-based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...
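    The two indel treatments named in the abstract above can be contrasted on a single aligned row. This is a simplified sketch, not the direct-optimization procedure used in the study: it only scores the gaps in one fixed, made-up row, and the cost values and function names are hypothetical. It also assumes the affine convention in which a run of length L costs `gap_open + gap_extend * L`; other conventions exist.

```python
from itertools import groupby


def gap_runs(aligned_row, gap="-"):
    """Lengths of maximal runs of contiguous gap characters."""
    return [len(list(group)) for char, group in groupby(aligned_row) if char == gap]


def fifth_state_cost(aligned_row, indel_cost):
    """Every gap position is charged independently, like a fifth character state."""
    return indel_cost * sum(gap_runs(aligned_row))


def affine_cost(aligned_row, gap_open, gap_extend):
    """Each contiguous gap run is one event: an opening cost plus a per-position cost."""
    return sum(gap_open + gap_extend * length for length in gap_runs(aligned_row))


row = "AC---GT-A"  # illustrative row with one 3-position gap and one 1-position gap

# A toy sensitivity sweep: vary the indel cost and compare the two treatments.
for cost in (1, 2, 4):
    print(cost, fifth_state_cost(row, cost), affine_cost(row, gap_open=cost, gap_extend=1))
```

    Under the fifth-state treatment the charge grows with every gapped position, whereas the affine treatment charges the opening cost once per run, so long gaps are penalized less heavily relative to many short ones.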

  1. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.; Campino, Susana; Assefa, Samuel A.; Echeverry, Diego F.; Ocholla, Harold; Amambua-Ngwa, Alfred; Stewart, Lindsay B.; Conway, David J.; Borrmann, Steffen; Michon, Pascal; Zongo, Issaka; Ouédraogo, Jean-Bosco; Djimde, Abdoulaye A.; Doumbo, Ogobara K.; Nosten, Francois; Pain, Arnab; Bousema, Teun; Drakeley, Chris J.; Fairhurst, Rick M.; Sutherland, Colin J.; Roper, Cally; Clark, Taane G.

    2014-01-01

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (~92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. © 2014 Macmillan Publishers Limited. All rights reserved.

  2. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.

    2014-06-13

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (~92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. © 2014 Macmillan Publishers Limited. All rights reserved.

  3. Polymorphic microsatellites in the human bloodfluke, Schistosoma japonicum, identified using a genomic resource

    Directory of Open Access Journals (Sweden)

    Spear Robert

    2011-02-01

    Re-emergence of schistosomiasis in regions of China where control programs have ceased requires development of molecular-genetic tools to track gene flow and assess genetic diversity of Schistosoma populations. We identified many microsatellite loci in the draft genome of Schistosoma japonicum using defined search criteria and selected a subset for further analysis. From an initial panel of 50 loci, 20 new microsatellites were selected for eventual optimization and application to a panel of worms from endemic areas. All but one of the selected microsatellites contain simple tri-nucleotide repeats. Moderate to high levels of polymorphism were detected. Numbers of alleles ranged from 6 to 14 and observed heterozygosity was always >0.6. The loci reported here will facilitate high resolution population-genetic studies on schistosomes in re-emergent foci.

  4. Identification and insertion polymorphisms of short interspersed nuclear elements (SINEs) in Brassica genomes

    International Nuclear Information System (INIS)

    Nouroz, F.; Naveed, M.

    2018-01-01

    The non-LTR retrotransposons (retroposons) are abundant in plant genomes, including those of members of the Brassicaceae. Of the retroposons, long interspersed nuclear elements (LINEs) are more copious than short interspersed nuclear elements (SINEs) in sequenced eukaryotic genomes. SINEs are short elements of 100-500 bp, flanked by variably sized target site duplications, with a 5' tRNA region carrying a polymerase III promoter, an internal tRNA-unrelated region, a 3' LINE-derived region and a poly(A) tail. Different computational approaches were used to identify and characterize SINEs, while PCR was used to detect SINE insertion polymorphisms in various Brassica genotypes. Ten previously unidentified SINE families were identified and characterized from Brassica genomes. The structural features of these SINEs were studied in detail and showed typical SINE characteristics: small size, target site duplications, head regions, internal (body) regions of variable size and a poly(A) tail at the 3' terminus. Elements from the various families ranged from 206 to 558 bp; the BoSINE2 family contained the smallest element (206 bp), while the largest members belonged to the BoSINE9 family (524-558 bp). The distribution and abundance of SINEs at particular loci in various Brassica species and genotypes (40) were investigated with SINE-based PCR markers. Various SINE insertion polymorphisms were detected among the genotypes: the larger PCR products contained the SINE insertion, while the smaller products corresponded to the pre-insertion sites (flanking regions). Analysis of SINE copy numbers for the 10 identified families estimated around 860 and 1712 copies in the B. rapa and B. oleracea whole-genome shotgun (WGS) contigs, respectively. Analysis of the insertion sites revealed that members of all 10 SINE families showed an insertion preference for AT-rich regions. The present

  5. Incorporating indel information into phylogeny estimation for rapidly emerging pathogens

    Directory of Open Access Journals (Sweden)

    Suchard Marc A

    2007-03-01

    Background: Phylogenies of rapidly evolving pathogens can be difficult to resolve because of the small number of substitutions that accumulate in the short times since divergence. To improve resolution of such phylogenies we propose using insertion and deletion (indel) information in addition to substitution information. We accomplish this through joint estimation of alignment and phylogeny in a Bayesian framework, drawing inference using Markov chain Monte Carlo. Joint estimation of alignment and phylogeny sidesteps biases that stem from conditioning on a single alignment by taking into account the ensemble of near-optimal alignments. Results: We introduce a novel Markov chain transition kernel that improves computational efficiency by proposing non-local topology rearrangements and by block sampling alignment and topology parameters. In addition, we extend our previous indel model to increase biological realism by placing indels preferentially on longer branches. We demonstrate the ability of indel information to increase phylogenetic resolution in examples drawn from within-host viral sequence samples. We also demonstrate the importance of taking alignment uncertainty into account when using such information. Finally, we show that codon-based substitution models can significantly affect alignment quality and phylogenetic inference by unrealistically forcing indels to begin and end between codons. Conclusion: These results indicate that indel information can improve phylogenetic resolution of recently diverged pathogens and that alignment uncertainty should be considered in such analyses.

  6. Application of real-time PCR of sex-independent insertion-deletion polymorphisms to determine fetal sex using cell-free fetal DNA from maternal plasma.

    Science.gov (United States)

    Ho, Sherry Sze Yee; Barrett, Angela; Thadani, Henna; Asibal, Cecille Laureano; Koay, Evelyn Siew-Chuan; Choolani, Mahesh

    2015-07-01

    Prenatal diagnosis of sex-linked disorders requires invasive procedures, carrying a risk of miscarriage of up to 1%. Cell-free fetal DNA (cffDNA) present in cell-free DNA (cfDNA) from maternal plasma offers a non-invasive source of fetal genetic material for analysis. Detection of Y-chromosome sequences in cfDNA indicates presence of a male fetus; in the absence of a Y-chromosome signal a female fetus is inferred. We aimed to validate the clinical utility of insertion-deletion polymorphisms (INDELs) to confirm presence of a female fetus using cffDNA. Quantitative real-time PCR (qPCR) for the Y-chromosome-specific sequence, SRY, was performed on cfDNA from 82 samples at 6-39 gestational weeks. In samples without detectable SRY, qPCRs for eight INDELs were performed on maternal genomic DNA and cfDNA. Detection of paternally inherited fetal alleles in cfDNA negative for SRY confirmed a female fetus. Fetal sex was correctly determined in 77/82 (93.9%) cfDNA samples. SRY was detected in all 39 samples from male-bearing pregnancies, and none of the 43 female-bearing pregnancies (sensitivity and specificity of SRY qPCR is therefore 100%; 95% CI 91%-100%). Paternally inherited fetal alleles were detected in 38/43 samples with no SRY signal, confirming the presence of a female fetus (INDEL assay sensitivity is therefore 88.4%; 95% CI 74.1%-95.6%). Since paternally inherited fetal INDELs were not used in women bearing male fetuses, the specificity of INDELs cannot be calculated. Five cfDNA samples were negative for both SRY and INDELS. We have validated a non-invasive prenatal test to confirm fetal sex as early as 6 gestational weeks using cffDNA from maternal plasma.
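    The proportions quoted in the abstract above follow directly from the reported counts, as the short sketch below shows. The abstract does not say which method produced its confidence intervals, so the Wilson score interval used here is an assumption and its bounds need not reproduce the published ones exactly. The function names are illustrative.

```python
from math import sqrt


def proportion(successes, total):
    """Simple proportion, e.g. sensitivity = detected / truly positive."""
    return successes / total


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method, z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract.
overall_accuracy = proportion(77, 82)   # fetal sex correctly determined
indel_sensitivity = proportion(38, 43)  # female fetuses confirmed by INDELs
sry_sensitivity = proportion(39, 39)    # male fetuses detected by SRY

for label, (hits, n) in {
    "overall": (77, 82),
    "INDEL": (38, 43),
    "SRY": (39, 39),
}.items():
    low, high = wilson_interval(hits, n)
    print(f"{label}: {hits / n:.1%} (Wilson 95% CI {low:.1%} to {high:.1%})")
```

    Because specificity needs true negatives, and the INDEL assay was not run on male-bearing pregnancies, no corresponding calculation is possible for INDEL specificity, as the abstract itself notes.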

  7. Overlapping genomic sequences: a treasure trove of single-nucleotide polymorphisms.

    Science.gov (United States)

    Taillon-Miller, P; Gu, Z; Li, Q; Hillier, L; Kwok, P Y

    1998-07-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21-7q22, and 13q12-13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations.
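
    The density and informativeness figures in this abstract are simple ratios of the reported counts. A short sketch of that arithmetic follows; the variable names are illustrative, and the only inputs are numbers stated in the abstract.

```python
# Counts reported in the abstract.
sequence_bp = 200_600          # 200.6 kb of overlapping sequence
snps_found = 153
sts_count = 44
snps_in_sts = 68
informative = {                # SNPs informative in N or more populations
    "at least one population": 42,
    "two or more populations": 32,
    "all three populations": 23,
}

bp_per_snp = sequence_bp / snps_found      # average spacing between SNPs
snps_per_sts = snps_in_sts / sts_count     # average SNPs carried by each STS
informative_pct = {
    label: round(100 * count / snps_in_sts)
    for label, count in informative.items()
}

if __name__ == "__main__":
    print(f"About one SNP per {bp_per_snp:.0f} bp")
    print(f"About {snps_per_sts:.2f} SNPs per STS")
    for label, pct in informative_pct.items():
        print(f"{pct}% informative in {label}")
```

    The rounded percentages agree with the 62%, 47% and 34% quoted in the abstract.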

  8. DELISHUS: an efficient and exact algorithm for genome-wide detection of deletion polymorphism in autism

    Science.gov (United States)

    Aguiar, Derek; Halldórsson, Bjarni V.; Morrow, Eric M.; Istrail, Sorin

    2012-01-01

    Motivation: The understanding of the genetic determinants of complex disease is undergoing a paradigm shift. Genetic heterogeneity of rare mutations with deleterious effects is more commonly being viewed as a major component of disease. Autism is an excellent example where research is active in identifying matches between the phenotypic and genomic heterogeneities. A considerable portion of autism appears to be correlated with copy number variation, which is not directly probed by single nucleotide polymorphism (SNP) array or sequencing technologies. Identifying the genetic heterogeneity of small deletions remains a major unresolved computational problem partly due to the inability of algorithms to detect them. Results: In this article, we present an algorithmic framework, which we term DELISHUS, that implements three exact algorithms for inferring regions of hemizygosity containing genomic deletions of all sizes and frequencies in SNP genotype data. We implement an efficient backtracking algorithm—that processes a 1 billion entry genome-wide association study SNP matrix in a few minutes—to compute all inherited deletions in a dataset. We further extend our model to give an efficient algorithm for detecting de novo deletions. Finally, given a set of called deletions, we also give a polynomial time algorithm for computing the critical regions of recurrent deletions. DELISHUS achieves significantly lower false-positive rates and higher power than previously published algorithms partly because it considers all individuals in the sample simultaneously. DELISHUS may be applied to SNP array or sequencing data to identify the deletion spectrum for family-based association studies. Availability: DELISHUS is available at http://www.brown.edu/Research/Istrail_Lab/. Contact: Eric_Morrow@brown.edu and Sorin_Istrail@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22689755
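
    The abstract does not spell out how regions of hemizygosity are inferred, so the following is not the DELISHUS algorithm. It is only a minimal sketch of the general idea that a hemizygous deletion looks homozygous in SNP genotype data, so a parent-child Mendelian inconsistency can hint at a transmitted deletion. The genotype encoding ("AA", "AB", "BB") and the function names are assumptions made for illustration.

```python
def is_homozygous(genotype):
    """True for two-character calls such as "AA" or "BB"."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def deletion_evidence(father, mother, child):
    """Return the parent who may have transmitted a deletion at this SNP, or None.

    Reasoning (illustrative assumption, not taken from the abstract): if the
    child is called homozygous for an allele that one parent does not carry,
    the child may in fact be hemizygous, having received that allele from the
    other parent and a deleted copy from the parent lacking it.
    """
    if not is_homozygous(child):
        return None  # a heterozygous call shows both copies are present
    allele = child[0]
    father_can_give = allele in father
    mother_can_give = allele in mother
    if mother_can_give and not father_can_give:
        return "father"
    if father_can_give and not mother_can_give:
        return "mother"
    return None  # Mendelian-consistent, or not explained by a single deletion


def candidate_sites(trio_genotypes):
    """Indices of SNPs showing evidence, given (father, mother, child) tuples."""
    return [
        i for i, (f, m, c) in enumerate(trio_genotypes)
        if deletion_evidence(f, m, c) is not None
    ]


if __name__ == "__main__":
    trio = [("AA", "AB", "AB"), ("AA", "AB", "BB"), ("AB", "BB", "AA")]
    print(candidate_sites(trio))
```

    A real method would additionally require runs of compatible sites and model genotyping error; this sketch only classifies single sites.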

  9. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    Science.gov (United States)

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  10. High-resolution genomic fingerprinting of Campylobacter jejuni and Campylobacter coli by analysis of amplified fragment length polymorphisms

    DEFF Research Database (Denmark)

    Kokotovic, Branko; On, Stephen L.W.

    1999-01-01

    A method for high-resolution genomic fingerprinting of the enteric pathogens Campylobacter jejuni and Campylobacter coli, based on the determination of amplified fragment length polymorphism, is described. The potential of this method for molecular epidemiological studies of these species...... is evaluated with 50 type, reference, and well-characterised field strains. Amplified fragment length polymorphism fingerprints comprised over 60 bands detected in the size range 35-500 bp. Groups of outbreak strains, replicate subcultures, and 'genetically identical' strains from humans, poultry and cattle......, proved indistinguishable by amplified fragment length polymorphism fingerprinting, but were differentiated from unrelated isolates. Previously unknown relationships between three hippurate-negative C. jejuni strains, and two C. coli var. hyoilei strains, were identified. These relationships corresponded...

  11. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  12. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  13. Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome evolution between two wheat cultivars

    KAUST Repository

    Thind, Anupriya Kaur

    2018-02-08

    Background: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the evolutionary dynamics of wheat genomes on a megabase-scale. Results: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes, the old landrace Chinese Spring and the elite Swiss spring wheat line CH Campala Lr22a. There was a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations revealed four large insertions/deletions (InDels) of >100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the evolutionary mechanisms that caused these InDels. Three of the large InDels affected copy number of NLRs, a gene family involved in plant immunity. Analysis of single nucleotide polymorphism (SNP) density revealed three haploblocks of 8 Mb, 9 Mb and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Conclusions: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.

  14. Polymorphic Microsatellite Markers for the Tetrapolar Anther-Smut Fungus Microbotryum saponariae Based on Genome Sequencing

    Science.gov (United States)

    Fortuna, Taiadjana M.; Snirc, Alodie; Badouin, Hélène; Gouzy, Jérome; Siguenza, Sophie; Esquerre, Diane; Le Prieur, Stéphanie; Shykoff, Jacqui A.; Giraud, Tatiana

    2016-01-01

    Background: Anther-smut fungi belonging to the genus Microbotryum sterilize their host plants by aborting ovaries and replacing pollen by fungal spores. Sibling Microbotryum species are highly specialized on their host plants and they have been widely used as models for studies of ecology and evolution of plant pathogenic fungi. However, most studies have focused, so far, on M. lychnidis-dioicae that parasitizes the white campion Silene latifolia. Microbotryum saponariae, parasitizing mainly Saponaria officinalis, is an interesting anther-smut fungus, since it belongs to a tetrapolar lineage (i.e., with two independently segregating mating-type loci), while most of the anther-smut Microbotryum fungi are bipolar (i.e., with a single mating-type locus). Saponaria officinalis is a widespread long-lived perennial plant species with multiple flowering stems, which makes its anther-smut pathogen a good model for studying phylogeography and within-host multiple infections. Principal Findings: Here, based on a generated genome sequence of M. saponariae we developed 6 multiplexes with a total of 22 polymorphic microsatellite markers using an inexpensive and efficient method. We scored these markers in fungal individuals collected from 97 populations across Europe, and found that the number of their alleles ranged from 2 to 11, and their expected heterozygosity from 0.01 to 0.58. Cross-species amplification was examined using nine other Microbotryum species parasitizing hosts belonging to Silene, Dianthus and Knautia genera. All loci were successfully amplified in at least two other Microbotryum species. Significance: These newly developed markers will provide insights into the population genetic structure and the occurrence of within-host multiple infections of M. saponariae. In addition, the draft genome of M. saponariae, as well as one of the described markers will be useful resources for studying the evolution of the breeding systems in the genus Microbotryum and the

  15. Genetic analysis of glucosinolate variability in broccoli florets using genome-anchored single nucleotide polymorphisms.

    Science.gov (United States)

    Brown, Allan F; Yousef, Gad G; Reid, Robert W; Chebrolu, Kranthi K; Thomas, Aswathy; Krueger, Christopher; Jeffery, Elizabeth; Jackson, Eric; Juvik, John A

    2015-07-01

    The identification of genetic factors influencing the accumulation of individual glucosinolates in broccoli florets provides novel insight into the regulation of glucosinolate levels in Brassica vegetables and will accelerate the development of vegetables with glucosinolate profiles tailored to promote human health. Quantitative trait loci analysis of glucosinolate (GSL) variability was conducted with a B. oleracea (broccoli) mapping population, saturated with single nucleotide polymorphism markers from a high-density array designed for rapeseed (Brassica napus). In 4 years of analysis, 14 QTLs were associated with the accumulation of aliphatic, indolic, or aromatic GSLs in floret tissue. The accumulation of 3-carbon aliphatic GSLs (2-propenyl and 3-methylsulfinylpropyl) was primarily associated with a single QTL on C05, but common regulation of 4-carbon aliphatic GSLs was not observed. A single locus on C09, associated with up to 40 % of the phenotypic variability of 2-hydroxy-3-butenyl GSL over multiple years, was not associated with the variability of precursor compounds. Similarly, QTLs on C02, C04, and C09 were associated with 4-methylsulfinylbutyl GSL concentration over multiple years but were not significantly associated with downstream compounds. Genome-specific SNP markers were used to identify candidate genes that co-localized to marker intervals and previously sequenced Brassica oleracea BAC clones containing known GSL genes (GSL-ALK, GSL-PRO, and GSL-ELONG) were aligned to the genomic sequence, providing support that at least three of our 14 QTLs likely correspond to previously identified GSL loci. The results demonstrate that previously identified loci do not fully explain GSL variation in broccoli. The identification of additional genetic factors influencing the accumulation of GSL in broccoli florets provides novel insight into the regulation of GSL levels in Brassicaceae and will accelerate development of vegetables with modified or enhanced GSL

  16. Overlapping Genomic Sequences: A Treasure Trove of Single-Nucleotide Polymorphisms

    Science.gov (United States)

    Taillon-Miller, Patricia; Gu, Zhijie; Li, Qun; Hillier, LaDeana; Kwok, Pui-Yan

    1998-01-01

    An efficient strategy to develop a dense set of single-nucleotide polymorphism (SNP) markers is to take advantage of the human genome sequencing effort currently under way. Our approach is based on the fact that bacterial artificial chromosomes (BACs) and P1-based artificial chromosomes (PACs) used in long-range sequencing projects come from diploid libraries. If the overlapping clones sequenced are from different lineages, one is comparing the sequences from 2 homologous chromosomes in the overlapping region. We have analyzed in detail every SNP identified while sequencing three sets of overlapping clones found on chromosome 5p15.2, 7q21–7q22, and 13q12–13q13. In the 200.6 kb of DNA sequence analyzed in these overlaps, 153 SNPs were identified. Computer analysis for repetitive elements and suitability for STS development yielded 44 STSs containing 68 SNPs for further study. All 68 SNPs were confirmed to be present in at least one of the three (Caucasian, African-American, Hispanic) populations studied. Furthermore, 42 of the SNPs tested (62%) were informative in at least one population, 32 (47%) were informative in two or more populations, and 23 (34%) were informative in all three populations. These results clearly indicate that developing SNP markers from overlapping genomic sequence is highly efficient and cost effective, requiring only the two simple steps of developing STSs around the known SNPs and characterizing them in the appropriate populations. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AC003015 (for GS113423), AC002380 (GS330J10), AC000066 (RG293F11), AC003086 (RG104F04), AC002525 (257C22A), and U73331 (96A18A).] PMID:9685323

  17. Genomic polymorphism, recombination, and linkage disequilibrium in human major histocompatibility complex-encoded antigen-processing genes.

    Science.gov (United States)

    van Endert, P M; Lopez, M T; Patel, S D; Monaco, J J; McDevitt, H O

    1992-01-01

    Recently, two subunits of a large cytosolic protease and two putative peptide transporter proteins were found to be encoded by genes within the class II region of the major histocompatibility complex (MHC). These genes have been suggested to be involved in the processing of antigenic proteins for presentation by MHC class I molecules. Because of the high degree of polymorphism in MHC genes, and previous evidence for both functional and polypeptide sequence polymorphism in the proteins encoded by the antigen-processing genes, we tested DNA from 27 consanguineous human cell lines for genomic polymorphism by restriction fragment length polymorphism (RFLP) analysis. These studies demonstrate a strong linkage disequilibrium between TAP1 and LMP2 RFLPs. Moreover, RFLPs, as well as a polymorphic stop codon in the telomeric TAP2 gene, appear to be in linkage disequilibrium with HLA-DR alleles and RFLPs in the HLA-DO gene. A high rate of recombination, however, seems to occur in the center of the complex, between the TAP1 and TAP2 genes. PMID:1360671

  18. Multi-generational imputation of single nucleotide polymorphism marker genotypes and accuracy of genomic selection.

    Science.gov (United States)

    Toghiani, S; Aggrey, S E; Rekaya, R

    2016-07-01

    Availability of high-density single nucleotide polymorphism (SNP) genotyping platforms provided unprecedented opportunities to enhance breeding programmes in livestock, poultry and plant species, and to better understand the genetic basis of complex traits. Using this genomic information, genomic breeding values (GEBVs) can be estimated, which are more accurate than conventional breeding values. The superiority of genomic selection is possible only when high-density SNP panels are used to track genes and QTLs affecting the trait. Unfortunately, even with the continuous decrease in genotyping costs, only a small fraction of the population has been genotyped with these high-density panels. It is often the case that a larger portion of the population is genotyped with low-density and low-cost SNP panels and then imputed to a higher density. Accuracy of SNP genotype imputation tends to be high when minimum requirements are met. Nevertheless, a certain rate of genotype imputation errors is unavoidable. Thus, it is reasonable to assume that the accuracy of GEBVs will be affected by imputation errors, especially their cumulative effects over time. To evaluate the impact of multi-generational selection on the accuracy of SNP genotype imputation and the reliability of resulting GEBVs, a simulation was carried out under varying updating of the reference population, distance between the reference and testing sets, and the approach used for the estimation of GEBVs. Using fixed reference populations, imputation accuracy decayed by about 0.5% per generation. In fact, after 25 generations, the accuracy was only 7% lower than the first generation. When the reference population was updated by either 1% or 5% of the top animals in the previous generations, decay of imputation accuracy was substantially reduced. These results indicate that low-density panels are useful, especially when the generational interval between reference and testing population is small. As the generational interval

  19. Rapid Genome-wide Single Nucleotide Polymorphism Discovery in Soybean and Rice via Deep Resequencing of Reduced Representation Libraries with the Illumina Genome Analyzer

    Directory of Open Access Journals (Sweden)

    Stéphane Deschamps

    2010-07-01

    Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs) among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs) using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with a 4-bp restriction endonuclease. This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the secondary digestion serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0% in rice and soybean, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.

  20. Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing

    Science.gov (United States)

    Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012

  1. High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

    Science.gov (United States)

    2011-01-01

    Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. The development of a much larger array of informative SNPs across

  2. Single Nucleotide Polymorphisms in Common Bean: Their Discovery and Genotyping Using a Multiplex Detection System

    Directory of Open Access Journals (Sweden)

    E. Gaitán-Solís

    2008-11-01

    Single nucleotide polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean (Phaseolus vulgaris L.) by comparing sequences from coding and noncoding regions obtained from the GenBank and genomic DNA and to compare sequencing results with those obtained using single base extension (SBE) assays on the Luminex-100 system for use in high-throughput germplasm evaluation. We assessed the frequency of SNPs in 47 fragments of common bean DNA, using SBE as the evaluation methodology. We conducted a sequence analysis of 10 genotypes of cultivated and wild beans belonging to the Mesoamerican and Andean genetic pools of P. vulgaris. For the 10 genotypes evaluated, a total of 20,964 bp of sequence were analyzed in each genotype and compared, resulting in the discovery of 239 SNPs and 133 InDels, giving an average SNP frequency of one per 88 bp and an InDel frequency of one per 157 bp. This is the equivalent of a nucleotide diversity (θ) of 6.27 × 10⁻³. Comparisons with the SNP genotypes previously obtained by direct sequencing showed that the SBE assays on the Luminex-100 were accurate, with 2.5% being miscalled and 1% showing no signal. These results indicate that the Luminex-100 provides a high-throughput system that can be used to analyze SNPs in large samples of genotypes both for purposes of assessing diversity and also for mapping studies.
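
    The per-base-pair frequencies in this abstract follow from its counts, and the θ value can be approximated with Watterson's estimator. The sketch below is an illustration under stated assumptions, not the authors' procedure: the abstract does not name its estimator, and treating the 10 genotypes as 10 sequences while counting SNPs and InDels together is an assumption that happens to reproduce the mantissa 6.27 (at the order of 10⁻³). Function and variable names are illustrative.

```python
def harmonic_number(k):
    """Sum of 1/i for i = 1..k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(segregating_sites, n_sequences, length_bp):
    """Watterson's estimator per site: S / (a_n * L), with a_n = H(n - 1)."""
    a_n = harmonic_number(n_sequences - 1)
    return segregating_sites / (a_n * length_bp)


# Counts reported in the abstract.
length_bp = 20_964
n_snps, n_indels = 239, 133
n_genotypes = 10  # assumed to act as 10 sampled sequences

bp_per_snp = length_bp / n_snps        # about 87.7 bp, quoted as one per 88 bp
bp_per_indel = length_bp / n_indels    # about 157.6 bp, quoted as one per 157 bp

# Assumption: SNPs and InDels counted together as polymorphic sites.
theta_all = watterson_theta(n_snps + n_indels, n_genotypes, length_bp)
# For comparison: SNPs only.
theta_snps = watterson_theta(n_snps, n_genotypes, length_bp)

if __name__ == "__main__":
    print(f"One SNP per {bp_per_snp:.1f} bp, one InDel per {bp_per_indel:.1f} bp")
    print(f"theta (SNPs + InDels) = {theta_all:.2e}")
    print(f"theta (SNPs only)     = {theta_snps:.2e}")
```

    Counting SNPs alone gives a noticeably smaller value, which is why the combined count is the reading shown first.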

  3. Partial digestion with restriction enzymes of ultraviolet-irradiated human genomic DNA: a method for identifying restriction site polymorphisms

    International Nuclear Information System (INIS)

    Nobile, C.; Romeo, G.

    1988-01-01

    A method for partial digestion of total human DNA with restriction enzymes has been developed on the basis of a principle already utilized by P.A. Whittaker and E. Southern for the analysis of phage lambda recombinants. Total human DNA irradiated with UV light of 254 nm is partially digested by restriction enzymes that recognize sequences containing adjacent thymidines because of TT dimer formation. The products resulting from partial digestion of specific genomic regions are detected in Southern blots by genomic-unique DNA probes with high reproducibility. This procedure is rapid and simple to perform because the same conditions of UV irradiation are used for different enzymes and probes. It is shown that restriction site polymorphisms occurring in the genomic regions analyzed are recognized by the allelic partial digest patterns they determine.

  4. Whole Genome Association Study to Detect Single Nucleotide Polymorphisms for Behavior in Sapsaree Dog

    Directory of Open Access Journals (Sweden)

    J. H. Ha

    2015-07-01

    The purpose of this study was to characterize genetic architecture of behavior patterns in Sapsaree dogs. The breed population (n = 8,256) has been constructed since 1990 over 12 generations and managed at the Sapsaree Breeding Research Institute, Gyeongsan, Korea. Seven behavioral traits were investigated for 882 individuals. The traits were classified as a quantitative or a categorical group, and heritabilities (h²) and variance components were estimated under the Animal model using ASREML 2.0 software program. In general, the h² estimates of the traits ranged between 0.00 and 0.16. Strong genetic (rG) and phenotypic (rP) correlations were observed between nerve stability, affability and adaptability, i.e. 0.9 to 0.94 and 0.46 to 0.68, respectively. To detect significant single nucleotide polymorphism (SNP) for the behavioral traits, a total of 134 and 60 samples were genotyped using the Illumina 22K CanineSNP20 and 170K CanineHD bead chips, respectively. Two datasets comprising 60 (Sap60) and 183 (Sap183) samples were analyzed, respectively, of which the latter was based on the SNPs that were embedded on both the 22K and 170K chips. To perform genome-wide association analysis, each SNP was considered with the residuals of each phenotype that were adjusted for sex and year of birth as fixed effects. A least squares based single marker regression analysis was followed by a stepwise regression procedure for the significant SNPs (p<0.01), to determine a best set of SNPs for each trait. A total of 41 SNPs were detected with the Sap183 samples for the behavior traits. The significant SNPs need to be verified using other samples, so as to be utilized to improve behavior traits via marker-assisted selection in the Sapsaree population.

  5. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.

    2013-01-01

    Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the Eu...... 425 loci and 376 918 associated SNPs provides a valuable tool for future population genetics and genomics studies and allows for targeting specific genes and particularly interesting regions of the eel genome...

  6. Methylation-sensitive amplified polymorphism-based genome-wide analysis of cytosine methylation profiles in Nicotiana tabacum cultivars.

    Science.gov (United States)

    Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y

    2015-11-26

    This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.

  7. Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array

    Directory of Open Access Journals (Sweden)

    Wanamaker Steve

    2008-02-01

    Background: Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Results: Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded an SFP probe set validation rate of 68%. Conclusion: We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some thousands of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.

  8. Genome-wide macrosynteny among Fusarium species in the Gibberella fujikuroi complex revealed by amplified fragment length polymorphisms.

    Directory of Open Access Journals (Sweden)

    Lieschen De Vos

    The Gibberella fujikuroi complex includes many Fusarium species that cause significant losses in yield and quality of agricultural and forestry crops. Due to their economic importance, whole-genome sequence information has rapidly become available for species including Fusarium circinatum, Fusarium fujikuroi and Fusarium verticillioides, each of which represents one of the three main clades known in this complex. However, no previous studies have explored the genomic commonalities and differences among these fungi. In this study, a previously completed genetic linkage map for an interspecific cross between Fusarium temperatum and F. circinatum, together with genomic sequence data, was utilized to consider the level of synteny between the three Fusarium genomes. Regions that are homologous amongst the Fusarium genomes examined were identified using in silico and pyrosequenced amplified fragment length polymorphism (AFLP) fragment analyses. Homology was determined using BLAST analysis of the sequences, with 777 homologous regions aligned to F. fujikuroi and F. verticillioides. This also made it possible to assign the linkage groups from the interspecific cross to their corresponding chromosomes in F. verticillioides and F. fujikuroi, as well as to assign two previously unmapped supercontigs of F. verticillioides to probable chromosomal locations. We further found evidence of a reciprocal translocation between the distal ends of chromosomes 8 and 11, which apparently originated before the divergence of F. circinatum and F. temperatum. Overall, a remarkable level of macrosynteny was observed among the three Fusarium genomes, when comparing AFLP fragments. This study not only demonstrates how in silico AFLPs can aid in the integration of a genetic linkage map to the physical genome, but it also highlights the benefits of using this tool to study genomic synteny and architecture.

  9. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity) and biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using a whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on 35 genotypes of chickpea, with an average sequencing depth of ~10X for each line. On average, 92.18% of reads from each genotype were aligned to the chickpea reference genome with 82.17% coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) were identified, with the highest percentage in coding regions found in ICC 1496 (21%) followed by ICCV 97105 (12%). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  10. Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs) and an online database.

    Directory of Open Access Journals (Sweden)

    Christopher A Raistrick

    2010-10-01

    Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS) discovered a polymorphism in a 5' non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5' untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants--splice translational efficiency polymorphisms (STEPs)--may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL) using publicly available data. Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs). 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5' non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison. Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.

  11. dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms.

    Science.gov (United States)

    Puritz, Jonathan B; Hollenbeck, Christopher M; Gold, John R

    2014-01-01

    Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for non-model organisms with large effective population sizes and high levels of genetic polymorphism. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is due to the fact that dDocent quality trims instead of filtering, incorporates both forward and reverse reads (including reads with INDEL polymorphisms) in assembly, mapping, and SNP calling. The pipeline and a comprehensive user guide can be found at http://dDocent.wordpress.com.

  12. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of them synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  13. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
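    The 22% share quoted in this abstract can be checked from the event counts it lists. The short Python sketch below does only that arithmetic; it assumes the five counted categories are the full set of "events" (segmental duplications and copy number variation regions are left out because the abstract gives no counts for them), and the dictionary keys are labels chosen here for illustration.

```python
# Event counts copied from the abstract; the category labels are illustrative.
counts = {
    "SNP": 3_213_401,
    "block substitution": 53_823,
    "heterozygous indel": 292_102,
    "homozygous indel": 559_473,
    "inversion": 90,
}

total = sum(counts.values())          # all counted variant events
non_snp = total - counts["SNP"]       # everything that is not a SNP
share = non_snp / total               # fraction of events that are non-SNP

print(f"{total:,} events, of which {non_snp:,} are non-SNP ({share:.0%})")
```

    Under these assumptions the counted events sum to about 4.1 million, which matches the total stated in the abstract.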

  14. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    Science.gov (United States)

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619
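    The transition/transversion split reported above rests on a simple rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal Python sketch of that rule follows; the function name and the example allele pairs are hypothetical and are not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Hypothetical allele pairs, for illustration only.
examples = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
for ref, alt in examples:
    print(f"{ref}>{alt}: {classify_snp(ref, alt)}")
```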

  15. Development of Chloroplast Genomic Resources in Chinese Yam (Dioscorea polystachya)

    Directory of Open Access Journals (Sweden)

    Junling Cao

    2018-01-01

    Chinese yam has been used both as a food and in traditional herbal medicine. Developing more effective genetic markers in this species is necessary to assess its genetic diversity and perform cultivar identification. In this study, new chloroplast genomic resources were developed using whole chloroplast genomes from six genotypes originating from different geographical locations. The Dioscorea polystachya chloroplast genome is a circular molecule consisting of two single-copy regions separated by a pair of inverted repeats. Comparative analyses of six D. polystachya chloroplast genomes revealed 141 single nucleotide polymorphisms (SNPs). Seventy simple sequence repeats (SSRs) were found in the six genotypes, including 24 polymorphic SSRs. Forty-three common indels and five small inversions were detected. Phylogenetic analysis based on the complete chloroplast genome provided the best resolution among the genotypes. Our evaluation of chloroplast genome resources among these genotypes led us to consider the complete chloroplast genome sequence of D. polystachya as a source of reliable and valuable molecular markers for revealing biogeographical structure and the extent of genetic variation in wild populations and for identifying different cultivars.

  16. Pathogenesis comparison between the United States porcine epidemic diarrhoea virus prototype and S-INDEL-variant strains in conventional neonatal piglets.

    Science.gov (United States)

    Chen, Qi; Gauger, Phillip C; Stafne, Molly R; Thomas, Joseph T; Madson, Darin M; Huang, Haiyan; Zheng, Ying; Li, Ganwu; Zhang, Jianqiang

    2016-05-01

    At least two genetically different porcine epidemic diarrhoea virus (PEDV) strains have been identified in the USA: US PEDV prototype and S-INDEL-variant strains. The objective of this study was to compare the pathogenicity differences of the US PEDV prototype and S-INDEL-variant strains in conventional neonatal piglets under experimental infections. Fifty PEDV-negative 5-day-old pigs were divided into five groups of ten pigs each and were inoculated orogastrically with three US PEDV prototype isolates (IN19338/2013, NC35140/2013 and NC49469/2013), an S-INDEL-variant isolate (IL20697/2014), and virus-negative culture medium, respectively, with virus titres of 10^4 TCID50 ml^-1, 10 ml per pig. All three PEDV prototype isolates tested in this study, regardless of their phylogenetic clades, had similar pathogenicity and caused severe enteric disease in 5-day-old pigs as evidenced by clinical signs, faecal virus shedding, and gross and histopathological lesions. Compared with pigs inoculated with the three US PEDV prototype isolates, pigs inoculated with the S-INDEL-variant isolate had significantly diminished clinical signs, virus shedding in faeces, gross lesions in small intestines, caeca and colons, histopathological lesions in small intestines, and immunohistochemistry staining in ileum. However, the US PEDV prototype and the S-INDEL-variant strains induced similar viraemia levels in inoculated pigs. Whole genome sequences of the PEDV prototype and S-INDEL-variant strains were determined, but the molecular basis of virulence differences between these PEDV strains remains to be elucidated using a reverse genetics approach.

  17. Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections.

    Science.gov (United States)

    Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J

    2016-05-12

    In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (F ws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide F ws fixation index (r = -0.88, P 10 % had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (F ws).
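    The abstract describes F ws only in words. A commonly used definition, assumed here because the abstract does not state it, is F_ws = 1 - H_w / H_s, where H_w is the heterozygosity within one isolate and H_s is the heterozygosity of the local population. The Python sketch below is a deliberately simplified illustration: it averages the per-SNP heterozygosity 2p(1 - p) over all SNPs rather than reproducing whatever binning or regression the authors used, and the function names and allele frequencies are hypothetical.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fws(within_freqs, population_freqs):
    """Simplified F_ws = 1 - H_w / H_s from per-SNP allele frequencies.

    within_freqs: allele frequencies among reads of one isolate.
    population_freqs: allele frequencies of the same SNPs in the population.
    """
    if len(within_freqs) != len(population_freqs) or not within_freqs:
        raise ValueError("need one frequency per SNP in both inputs")
    h_w = sum(heterozygosity(p) for p in within_freqs) / len(within_freqs)
    h_s = sum(heterozygosity(p) for p in population_freqs) / len(population_freqs)
    if h_s == 0:
        raise ValueError("population shows no variation at these SNPs")
    return 1.0 - h_w / h_s


# Hypothetical frequencies for three SNPs.
population = [0.2, 0.5, 0.3]
single_clone = [0.0, 1.0, 1.0]   # every SNP fixed within the isolate
as_mixed_as_population = list(population)

print(fws(single_clone, population))            # no within-isolate diversity
print(fws(as_mixed_as_population, population))  # as diverse as the population
```

    In this simplified form a value near 1 indicates a single dominant genotype and lower values indicate mixed infections, consistent with the abstract's description of F ws as an inverse measure of within-isolate diversity.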

  18. GapCoder automates the use of indel characters in phylogenetic analysis.

    Science.gov (United States)

    Young, Nelson D; Healy, John

    2003-02-19

    Several ways of incorporating indels into phylogenetic analysis have been suggested. Simple indel coding has two strengths: (1) biological realism and (2) efficiency of analysis. In the method, each indel with different start and/or end positions is considered to be a separate character. The presence/absence of these indel characters is then added to the data set. We have written a program, GapCoder, to automate this procedure. The program can input PIR format aligned datasets, find the indels and add the indel-based characters. The output is a NEXUS format file, which includes a table showing what region each indel character is based on. If regions are excluded from analysis, this table makes it easy to identify the corresponding indel characters for exclusion. Manual implementation of the simple indel coding method can be very time-consuming, especially in data sets where indels are numerous and/or overlapping. GapCoder automates this method and is therefore particularly useful during procedures where phylogenetic analyses need to be repeated many times, such as when different alignments are being explored or when various taxon or character sets are being explored. GapCoder is currently available for Windows from http://www.home.duq.edu/~youngnd/GapCoder.
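    The coding rule in this abstract (one character per distinct gap start/end, scored as present or absent) can be sketched in a few lines of Python. This is not GapCoder itself: it takes a plain dictionary instead of PIR input, writes no NEXUS file, and scores every taxon as 0 or 1, whereas the published simple indel coding method may treat some overlapping gaps as missing data. The function names and the toy alignment are hypothetical.

```python
import re


def find_gaps(seq: str):
    """Return (start, end) for each maximal run of '-' (0-based, end exclusive)."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", seq)]


def code_indels(alignment: dict):
    """Code each distinct gap as a presence/absence character.

    alignment maps taxon name -> aligned sequence (all the same length).
    Returns the sorted list of indel characters and a 0/1 string per taxon.
    """
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    gaps = {name: set(find_gaps(seq)) for name, seq in alignment.items()}
    # Gaps that differ in start and/or end position are separate characters.
    characters = sorted(set().union(*gaps.values()))
    matrix = {
        name: "".join("1" if c in gaps[name] else "0" for c in characters)
        for name in alignment
    }
    return characters, matrix


# Toy alignment, for illustration only.
alignment = {
    "taxonA": "ACGT--ACGTAC",
    "taxonB": "ACGTTTACG--C",
    "taxonC": "ACG---ACGTAC",
}
characters, matrix = code_indels(alignment)
for index, (start, end) in enumerate(characters, 1):
    print(f"indel {index}: columns {start + 1}-{end}")
for name, row in matrix.items():
    print(name, row)
```

    The printed column table plays the role of the region table described in the abstract: it shows which alignment columns each indel character is based on, so characters tied to excluded regions can be found easily.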

  19. The Organelle Genomes of Hassawi Rice (Oryza sativa L.) and Its Hybrid in Saudi Arabia: Genome Variation, Rearrangement, and Origins

    Science.gov (United States)

    Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun

    2012-01-01

    Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNP (single nucleotide polymorphism) is present between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV) in the rice cp genomes. There are two and four RCVs identified in Hassawi-1 when compared to 93–11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, with the longest length of 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicate ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta since genetic diversity between the two Hassawi cultivars is very low albeit an unknown historic origin of the wild-type Hassawi rice. PMID:22870184

  20. Genome Plasticity and Polymorphisms in Critical Genes Correlate with Increased Virulence of Dutch Outbreak-Related Coxiella burnetii Strains

    Directory of Open Access Journals (Sweden)

    Runa Kuley

    2017-08-01

    Coxiella burnetii is an obligate intracellular bacterium and the etiological agent of Q fever. During 2007–2010 the largest Q fever outbreak ever reported occurred in The Netherlands. It is anticipated that strains from this outbreak demonstrated an increased zoonotic potential as more than 40,000 individuals were assumed to be infected. The acquisition of novel genetic factors by these C. burnetii outbreak strains, such as virulence-related genes, has frequently been proposed and discussed, but has not yet been proven. In the present study, the whole genome sequences of several Dutch strains (CbNL01 and CbNL12 genotypes), a few additionally selected strains from different geographical locations and publicly available genome sequences were used for a comparative bioinformatics approach. The study focuses on the identification of specific genetic differences in the outbreak related CbNL01 strains compared to other C. burnetii strains. In this approach we investigated the phylogenetic relationship and genomic aspects of virulence and host-specificity. Phylogenetic clustering of whole genome sequences showed a genotype-specific clustering that correlated with the clustering observed using Multiple Locus Variable-number Tandem Repeat Analysis (MLVA). Ortholog analysis on predicted genes and single nucleotide polymorphism (SNP) analysis of complete genome sequences demonstrated the presence of genotype-specific gene contents and SNP variations in C. burnetii strains. It also demonstrated that the currently used MLVA genotyping methods are highly discriminatory for the investigated outbreak strains. In the fully reconstructed genome sequence of the Dutch outbreak NL3262 strain of the CbNL01 genotype, a relatively large number of transposon-linked genes were identified as compared to the other published complete genome sequences of C. burnetii. Additionally, large numbers of SNPs in its membrane proteins and predicted virulence-associated genes were identified.

  1. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  2. Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

    Science.gov (United States)

    Vicente-Salvador, David; Puig, Marta; Gayà-Vidal, Magdalena; Pacheco, Sarai; Giner-Delgado, Carla; Noguera, Isaac; Izquierdo, David; Martínez-Fundichely, Alexander; Ruiz-Herrera, Aurora; Estivill, Xavier; Aguado, Cristina; Lucas-Lledó, José Ignacio; Cáceres, Mario

    2017-02-01

    The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF = 0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. Development and characterization of polymorphic genomic-SSR markers in Asian long-horned beetle (Anoplophora glabripennis).

    Science.gov (United States)

    Liu, Zhaoyang; Tao, Jing; Luo, Youqing

    2017-12-01

    The Asian long-horned beetle (ALB), Anoplophora glabripennis (Motschulsky) (Coleoptera: Cerambycidae: Lamiinae), is a wood-borer and polyphagous xylophage that is native to Asia. It infests and seriously harms healthy trees, and therefore is a cause for considerable environmental concern. The analysis of population genetic structure of ALB and sibling species Anoplophora nobilis (Ganglbauer) will not only help to clarify the relationship between environmental variables and mechanisms of speciation, but also will enhance our understanding of evolutionary processes. However, the known genetic markers, particularly microsatellites, are limited for this species. SSRLocator software was used to analyze the distribution and frequencies of genomic simple sequence repeats (SSRs), to infer the basic characteristics of repeat motifs, and to design primers. We developed SSR loci of 2-6 repeated units, including 10,650 perfect SSRs, and found 140 types of repeat motifs. A total of 2621 SSR markers were discovered in ALB whole-genome shotgun sequences. Forty-eight pairs of SSR primers were randomly chosen from the 2621 SSR markers, and half of these 48 pairs were polymorphic, comprising 4 di-, 7 tri-, 2 tetra-, and 11 hexamer SSRs. Four populations were used to test the effectiveness of the primers. These results suggest that our method for whole-genome SSR screening is feasible and efficient, and the SSR markers developed in this study are suitable for further population genetics studies of ALB. Moreover, they may also be useful for the development of SSRs for other Coleoptera.
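
    As an illustration of what screening a sequence for perfect SSRs with 2-6 base motifs can look like, here is a minimal Python sketch. It is not SSRLocator's algorithm, which the abstract does not describe; the function name `find_perfect_ssrs` and the `min_repeats=4` threshold are assumptions chosen only for the example.

```python
import re

def find_perfect_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: thresholds are illustrative, not SSRLocator defaults.
    """
    seq = seq.upper()
    # Lazy motif match so the shortest repeating unit is tried first,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as "AAAA...", which are not 2-6 base motifs.
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

print(find_perfect_ssrs("GGATATATATCC"))
print(find_perfect_ssrs("TTACGACGACGACGTT"))
```

    Each hit reports the 0-based start position, the repeat unit, and the number of tandem copies; primer design around the hits is a separate step not shown here.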

  4. Genomic variations of Mycoplasma capricolum subsp capripneumoniae detected by amplified fragment length polymorphism (AFLP) analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Bolske, G.; Ahrens, Peter

    2000-01-01

    The genetic diversity of Mycoplasma capricolum subsp. capripneumoniae strains based on determination of amplified fragment length polymorphisms (AFLP) is described. AFLP fingerprints of 38 strains derived from different countries in Africa and the Middle East consisted of over 100 bands in the size...

  5. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

    Science.gov (United States)

    Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

    2014-11-01

    Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  6. Identification of a novel FGFRL1 MicroRNA target site polymorphism for bone mineral density in meta-analyses of genome-wide association studies

    NARCIS (Netherlands)

    T. Niu (Tianhua); N. Liu (Ning); M. Zhao (Ming); G. Xie (Guie); L. Zhang (Lei); J. Li (Jian); Y.-F. Pei (Yu-Fang); H. Shen (Hui); X. Fu (Xiaoying); H. He (Hao); S. Lu (Shan); X. Chen (Xiangding); L. Tan (Lijun); T.-L. Yang (Tie-Lin); Y. Guo (Yan); P.J. Leo (Paul); E.L. Duncan (Emma); J. Shen (Jie); Y.-F. Guo (Yan-fang); G.C. Nicholson (Geoffrey); R.L. Prince (Richard L.); J.A. Eisman (John); G. Jones (Graeme); P.N. Sambrook (Philip); X. Hu (Xiang); P.M. Das (Partha M.); Q. Tian (Qing); X.-Z. Zhu (Xue-Zhen); C.J. Papasian (Christopher J.); M.A. Brown (Matthew); A.G. Uitterlinden (André); Y.-P. Wang (Yu-Ping); S. Xiang (Shuanglin); H.-W. Deng

    2015-01-01

    MicroRNAs (miRNAs) are critical post-transcriptional regulators. Based on a previous genome-wide association (GWA) scan, we conducted a polymorphism in microRNAs' Target Sites (poly-miRTS)-centric multistage meta-analysis for lumbar spine (LS)-, total hip (HIP)-, and femoral neck

  7. Development of highly polymorphic simple sequence repeat markers using genome-wide microsatellite variant analysis in Foxtail millet [Setaria italica (L.) P. Beauv].

    Science.gov (United States)

    Zhang, Shuo; Tang, Chanjuan; Zhao, Qiang; Li, Jing; Yang, Lifang; Qie, Lufeng; Fan, Xingke; Li, Lin; Zhang, Ning; Zhao, Meicheng; Liu, Xiaotong; Chai, Yang; Zhang, Xue; Wang, Hailong; Li, Yingtao; Li, Wen; Zhi, Hui; Jia, Guanqing; Diao, Xianmin

    2014-01-28

    Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei's genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species.
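
    The polymorphism information content (PIC) quoted above is commonly computed per marker from allele frequencies with the Botstein et al. formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not say which estimator was used, so the Python sketch below is an assumption for illustration, and the function name `pic` is hypothetical.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content from allele frequencies at one locus.

    Assumes freqs sum to 1; uses the Botstein-style formula (an assumption,
    since the source abstract does not state its estimator).
    """
    sum_sq = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross

# Two equally frequent alleles give the biallelic maximum of 0.375.
print(pic([0.5, 0.5]))
```

    Markers with many alleles at balanced frequencies score higher, which is why multi-allelic SSRs can reach averages well above the biallelic ceiling.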

  8. Sequencing and annotation of the chloroplast DNAs and identification of polymorphisms distinguishing normal male-fertile and male-sterile cytoplasms of onion.

    Science.gov (United States)

    von Kohn, Christopher; Kiełkowska, Agnieszka; Havey, Michael J

    2013-12-01

    Male-sterile (S) cytoplasm of onion is an alien cytoplasm introgressed into onion in antiquity and is widely used for hybrid seed production. Owing to the biennial generation time of onion, classical crossing takes at least 4 years to classify cytoplasms as S or normal (N) male-fertile. Molecular markers in the organellar DNAs that distinguish N and S cytoplasms are useful to reduce the time required to classify onion cytoplasms. In this research, we completed next-generation sequencing of the chloroplast DNAs of N- and S-cytoplasmic onions; we assembled and annotated the genomes in addition to identifying polymorphisms that distinguish these cytoplasms. The sizes (153 538 and 153 355 base pairs) and GC contents (36.8%) were very similar for the chloroplast DNAs of N and S cytoplasms, respectively, as expected given their close phylogenetic relationship. The size difference was primarily due to small indels in intergenic regions and a deletion in the accD gene of N-cytoplasmic onion. The structures of the onion chloroplast DNAs were similar to those of most land plants with large and small single copy regions separated by inverted repeats. Twenty-eight single nucleotide polymorphisms, two polymorphic restriction-enzyme sites, and one indel distributed across 20 chloroplast genes in the large and small single copy regions were selected and validated using diverse onion populations previously classified as N or S cytoplasmic using restriction fragment length polymorphisms. Although cytoplasmic male sterility is likely associated with the mitochondrial DNA, maternal transmission of the mitochondrial and chloroplast DNAs allows for polymorphisms in either genome to be useful for classifying onion cytoplasms to aid the development of hybrid onion cultivars.

  9. Comparative genomics of Bacillus anthracis from the wool industry highlights polymorphisms of lineage A.Br.Vollum.

    Science.gov (United States)

    Derzelle, Sylviane; Aguilar-Bultet, Lisandra; Frey, Joachim

    2016-12-01

    With the advent of affordable next-generation sequencing (NGS) technologies, major progress has been made in the understanding of the population structure and evolution of the B. anthracis species. Here we report the use of whole genome sequencing and computer-based comparative analyses to characterize six strains belonging to the A.Br.Vollum lineage. These strains were isolated in Switzerland, in 1981, during iterative cases of anthrax involving workers in a textile plant processing cashmere wool from the Indian subcontinent. We took advantage of the hundreds of currently available B. anthracis genomes in public databases, to investigate the genetic diversity existing within the A.Br.Vollum lineage and to position the six Swiss isolates into the worldwide B. anthracis phylogeny. Thirty additional genomes related to the A.Br.Vollum group were identified by whole-genome single nucleotide polymorphism (SNP) analysis, including two strains forming a new evolutionary branch at the basis of the A.Br.Vollum lineage. This new phylogenetic lineage (termed A.Br.H9401) splits off the branch leading to the A.Br.Vollum group soon after its divergence to the other lineages of the major A clade (i.e. 6 SNPs). The available dataset of A.Br.Vollum genomes were resolved into 2 distinct groups. Isolates from the Swiss wool processing facility clustered together with two strains from Pakistan and one strain of unknown origin isolated from yarn. They were clearly differentiated (69 SNPs) from the twenty-five other A.Br.Vollum strains located on the branch leading to the terminal reference strain A0488 of the lineage. Novel analytic assays specific to these new subgroups were developed for the purpose of rapid molecular epidemiology. Whole genome SNP surveys greatly expand upon our knowledge on the sub-structure of the A.Br.Vollum lineage. Possible origin and route of spread of this lineage worldwide are discussed. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights

  10. Systematic analysis of short internal indels and their impact on protein folding

    Directory of Open Access Journals (Sweden)

    Guo Jun-tao

    2010-08-01

    Background: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered as the major means of expanding structural and functional diversity in eukaryotes. Knowledge of the structural changes due to indels is critical to our understanding of the evolution of protein structure and function. In addition, it can help us probe the evolution of alternative splicing and the diversity of functional isoforms. However, little is known about the effects of indels, in particular the ones involving core secondary structures, on the folding of protein structures. The long term goal of our study is to accurately predict the protein AS isoform structures. As a first step towards this goal, we performed a systematic analysis on the structural changes caused by short internal indels through mining highly homologous proteins in Protein Data Bank (PDB). Results: We compiled a non-redundant dataset of short internal indels (2-40 amino acids) from highly homologous protein pairs and analyzed the sequence and structural features of the indels. We found that about one third of indel residues are in a disordered state and the majority of the residues are exposed to solvent, suggesting that these indels are generally located on the surface of proteins. Though naturally occurring indels are fewer than engineered ones in the dataset, there are no statistically significant differences in terms of amino acid frequencies and secondary structure types between the "Natural" indels and "All" indels in the dataset. Structural comparisons show that all the protein pairs with short internal indels in the dataset preserve the structural folds and about 85% of protein pairs have global RMSDs (root mean square deviations) of 2 Å or less, suggesting that protein structures tend to be conserved and can tolerate short insertions and deletions. A few pairs

  11. Genetic Diversity and Population Structure in Native Chicken Populations from Myanmar, Thailand and Laos by Using 102 Indels Markers

    Directory of Open Access Journals (Sweden)

    A. A. Maw

    2015-01-01

    The genetic diversity of native chicken populations from Myanmar, Thailand, and Laos was examined by using 102 insertion and/or deletion (indel) markers. Most of the indel loci were polymorphic (71% to 96%), and the genetic variability was similar in all populations. The average observed heterozygosities (HO) and expected heterozygosities (HE) ranged from 0.205 to 0.263 and 0.239 to 0.381, respectively. The coefficient of genetic differentiation (Gst) for all cumulated populations was 0.125, and the Thai native chickens showed higher Gst (0.088) than Myanmar (0.041) and Laotian (0.024) populations. The pairwise Fst distances ranged from 0.144 to 0.308 among populations. A neighbor-joining (NJ) tree, using Nei’s genetic distance, revealed that Thai and Laotian native chicken populations were genetically close, while Myanmar native chickens were distant from the others. The native chickens from these three countries were thought to be descended from three different origins (K = 3) from STRUCTURE analysis. Genetic admixture was observed in Thai and Laotian native chickens, while admixture was absent in Myanmar native chickens.
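
    For readers unfamiliar with the two heterozygosity statistics, the following Python sketch shows how they are typically obtained for a single locus from diploid genotype calls. It uses the uncorrected gene diversity, HE = 1 - sum(p_i^2); the abstract does not state which estimator or software was used, and the function name and the "I"/"D" allele labels are assumptions made for the example.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (HO, HE) for one locus.

    genotypes: list of (allele_a, allele_b) diploid calls.
    HE is the uncorrected expected heterozygosity (an assumed estimator).
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from all 2n allele copies.
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return ho, he

# Hypothetical calls at one indel locus: "I" = insertion allele, "D" = deletion allele.
calls = [("I", "I"), ("I", "D"), ("D", "D"), ("I", "D")]
print(heterozygosity(calls))
```

    Averaging these per-locus values over all markers gives population-level figures of the kind reported above; HO below HE, as in the abstract, indicates fewer heterozygotes than expected under random mating.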

  12. Genome-wide single nucleotide polymorphisms (SNPs) for a model invasive ascidian Botryllus schlosseri.

    Science.gov (United States)

    Gao, Yangchun; Li, Shiguo; Zhan, Aibin

    2018-04-01

    Invasive species cause huge damages to ecology, environment and economy globally. The comprehensive understanding of invasion mechanisms, particularly genetic bases of micro-evolutionary processes responsible for invasion success, is essential for reducing potential damages caused by invasive species. The golden star tunicate, Botryllus schlosseri, has become a model species in invasion biology, mainly owing to its high invasiveness nature and small well-sequenced genome. However, the genome-wide genetic markers have not been well developed in this highly invasive species, thus limiting the comprehensive understanding of genetic mechanisms of invasion success. Using restriction site-associated DNA (RAD) tag sequencing, here we developed a high-quality resource of 14,119 out of 158,821 SNPs for B. schlosseri. These SNPs were relatively evenly distributed at each chromosome. SNP annotations showed that the majority of SNPs (63.20%) were located at intergenic regions, and 21.51% and 14.58% were located at introns and exons, respectively. In addition, the potential use of the developed SNPs for population genomics studies was primarily assessed, such as the estimate of observed heterozygosity (HO), expected heterozygosity (HE), nucleotide diversity (π), Wright's inbreeding coefficient (FIS) and effective population size (Ne). Our developed SNP resource would provide future studies the genome-wide genetic markers for genetic and genomic investigations, such as genetic bases of micro-evolutionary processes responsible for invasion success.

  13. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics

    Directory of Open Access Journals (Sweden)

    Zhang Dapeng

    2012-06-01

    Background: Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis. Results: Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX), and the poorly characterized “Photorhabdus virulence cassettes (PVC)”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltransferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere between a single to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. Gene-neighborhood-analysis of

  14. Comparative Genomics of Rhodococcus equi Virulence Plasmids Indicates Host-Driven Evolution of the vap Pathogenicity Island.

    Science.gov (United States)

    MacArthur, Iain; Anastasi, Elisa; Alvarez, Sonsiray; Scortti, Mariela; Vázquez-Boland, José A

    2017-05-01

    The conjugative virulence plasmid is a key component of the Rhodococcus equi accessory genome essential for pathogenesis. Three host-associated virulence plasmid types have been identified: the equine pVAPA and porcine pVAPB circular variants, and the linear pVAPN found in bovine (ruminant) isolates. We recently characterized the R. equi pangenome (Anastasi E, et al. 2016. Pangenome and phylogenomic analysis of the pathogenic actinobacterium Rhodococcus equi. Genome Biol Evol. 8:3140-3148.) and we report here the comparative analysis of the virulence plasmid genomes. Plasmids within each host-associated type were highly similar despite their diverse origins. Variation was accounted for by scattered single nucleotide polymorphisms and short nucleotide indels, while larger indels, mostly in the plasticity region near the vap pathogenicity island (PAI), defined plasmid genomic subtypes. Only one of the plasmids analyzed, of pVAPN type, was exceptionally divergent due to accumulation of indels in the housekeeping backbone. Each host-associated plasmid type carried a unique PAI differing in vap gene complement, suggesting animal host-specific evolution of the vap multigene family. Complete conservation of the vap PAI was observed within each host-associated plasmid type. Both diversity of host-associated plasmid types and clonality of specific chromosomal-plasmid genomic type combinations were observed within the same R. equi phylogenomic subclade. Our data indicate that the overall strong conservation of the R. equi host-associated virulence plasmids is the combined result of host-driven selection, lateral transfer between strains, and geographical spread due to international livestock exchanges. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. Use of microsatellite markers derived from whole genome sequence data for identifying polymorphism in Phytophthora ramorum

    Science.gov (United States)

    Kelly Ivors; Matteo Garbelotto; Ineke De Vries; Peter Bonants

    2006-01-01

    Investigating the population genetics of Phytophthora ramorum, the causal agent of sudden oak death (SOD), is critical to understanding the biology and epidemiology of this important phytopathogen. Raw sequence data (445,000 reads) of P. ramorum was provided by the Joint Genome Institute. Our objective was to develop and utilize...

  16. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region) is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  17. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in silico fragment length polymorphism (FLP) between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our previously documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. It was for the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped

  18. Comparative mapping of Brassica juncea and Arabidopsis thaliana using Intron Polymorphism (IP) markers: homoeologous relationships, diversification and evolution of the A, B and C Brassica genomes

    Directory of Open Access Journals (Sweden)

    Gupta Vibha

    2008-03-01

    Background: Extensive mapping efforts are currently underway for the establishment of comparative genomics between the model plant, Arabidopsis thaliana and various Brassica species. Most of these studies have deployed RFLP markers, the use of which is a laborious and time-consuming process. We therefore tested the efficacy of PCR-based Intron Polymorphism (IP) markers to analyze genome-wide synteny between the oilseed crop, Brassica juncea (AABB genome) and A. thaliana and analyzed the arrangement of 24 (previously described) genomic block segments in the A, B and C Brassica genomes to study the evolutionary events contributing to karyotype variations in the three diploid Brassica genomes. Results: IP markers were highly efficient and generated easily discernable polymorphisms on agarose gels. Comparative analysis of the segmental organization of the A and B genomes of B. juncea (present study) with the A and B genomes of B. napus and B. nigra respectively (described earlier), revealed a high degree of colinearity suggesting minimal macro-level changes after polyploidization. The ancestral block arrangements that remained unaltered during evolution and the karyotype rearrangements that originated in the Oleracea lineage after its divergence from Rapa lineage were identified. Genomic rearrangements leading to the gain or loss of one chromosome each between the A-B and A-C lineages were deciphered. Complete homoeology in terms of block organization was found between three linkage groups (LG) each for the A-B and A-C genomes. Based on the homoeology shared between the A, B and C genomes, a new nomenclature for the B genome LGs was assigned to establish uniformity in the international Brassica LG nomenclature code. Conclusion: IP markers were highly effective in generating comparative relationships between Arabidopsis and various Brassica species. Comparative genomics between the three Brassica lineages established the major rearrangements

  19. Microarray-based genomic surveying of gene polymorphisms in Chlamydia trachomatis

    OpenAIRE

    Brunelle, Brian W; Nicholson, Tracy L; Stephens, Richard S

    2004-01-01

    By comparing two fully sequenced genomes of Chlamydia trachomatis using competitive hybridization on DNA microarrays, a logarithmic correlation was demonstrated between the signal ratio of the arrays and the 75-99% range of nucleotide identities of the genes. Variable genes within 14 uncharacterized strains of C. trachomatis were identified by array analysis and verified by DNA sequencing. These genes may be crucial for understanding chlamydial virulence and pathogenesis.

  20. Retroelement insertional polymorphisms, diversity and phylogeography within diploid, D-genome Aegilops tauschii (Triticeae, Poaceae) sub-taxa in Iran.

    Science.gov (United States)

    Saeidi, Hojjatollah; Rahiminejad, Mohammad Reza; Heslop-Harrison, J S

    2008-04-01

    The diploid goat grass Aegilops tauschii (2n = 2x = 14) is native to the Middle East and is the D-genome donor to hexaploid bread wheat. The aim of this study was to measure the diversity of different subspecies and varieties of wild Ae. tauschii collected across the major areas where it grows in Iran and to examine patterns of diversity related to the taxa and geography. Inter-retroelement amplified polymorphism (IRAP) markers were used to analyse the biodiversity of DNA from 57 accessions of Ae. tauschii from northern and central Iran, and two hexaploid wheats. Eight IRAP primer combinations amplified a total of 171 distinct DNA fragments between 180 and 3200 bp long from the accessions, of which 169 were polymorphic. On average, about eight fragments were amplified with each primer combination, with more bands being amplified from accessions from the north-west of the country than from other accessions. The IRAP markers showed high levels of genetic diversity. Analysis of all accessions together did not allow the allocation of individuals to taxa based on morphology, but showed a tendency to put accessions from the north-west apart from those of other regions. It is speculated that this could be due to different activity of retroelements in the different regions. Within the two taxa with most accessions, there was a range of IRAP genotypes that could be correlated closely with geographical origin. This supports suggestions that the centre of origin of the species is towards the south-east of the Caspian Sea. IRAP is an appropriate marker system to evaluate genetic diversity and evolutionary relationships within the taxa, but it is too variable to define the taxa themselves, where more slowly evolving morphological, DNA sequence or chromosomal markers may be more appropriate.

  1. Neutral genomic microevolution of a recently emerged pathogen, Salmonella enterica serovar Agona.

    Directory of Open Access Journals (Sweden)

    Zhemin Zhou

    2013-04-01

    Salmonella enterica serovar Agona has caused multiple food-borne outbreaks of gastroenteritis since it was first isolated in 1952. We analyzed the genomes of 73 isolates from global sources, comparing five distinct outbreaks with sporadic infections as well as food contamination and the environment. Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s. Homologous recombination with other serovars of S. enterica imported 42 recombinational tracts (360 kb) in 5/143 nodes within the genealogy, which resulted in 3,164 additional SNPs. In contrast to this paucity of genetic diversity, Agona is highly diverse according to pulsed-field gel electrophoresis (PFGE), which is used to assign isolates to outbreaks. PFGE diversity reflects a highly dynamic accessory genome associated with the gain or loss (indels) of 51 bacteriophages, 10 plasmids, and 6 integrative conjugational elements (ICE/IMEs), but did not correlate uniquely with outbreaks. Unlike the core genome, indels occurred repeatedly in independent nodes (homoplasies), resulting in inaccurate PFGE genealogies. The accessory genome contained only few cargo genes relevant to infection, other than antibiotic resistance. Thus, most of the genetic diversity within this recently emerged pathogen reflects changes in the accessory genome, or is due to recombination, but these changes seemed to reflect neutral processes rather than Darwinian selection. Each outbreak was caused by an independent clade, without universal, outbreak-associated genomic features, and none of the variable genes in the pan-genome seemed to be associated with an ability to cause outbreaks.

  2. Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes.

    Science.gov (United States)

    Liu, Wen; Ghouri, Fozia; Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim; Liu, Xiangdong

    2017-01-01

    Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (≥ 1-fold) of reference genomes. The two wild rice lines showed a high SNP (single-nucleotide polymorphism) variation rate across the 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93-11 (indica cultivar). The count-length distribution of InDels (insertion/deletion polymorphisms) followed a normal distribution in the two lines, and most of the InDels ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% of genes exhibited sequence variations in the two wild rice lines compared to the Nipponbare and 93-11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and that the NBS domain was more conserved than the LRR domain in both wild rice lines. NBS genes showed higher levels of genetic diversity in Huaye 1 than in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re

  3. Developing market class specific InDel markers from next generation sequence data in Phaseolus vulgaris L.

    Directory of Open Access Journals (Sweden)

    Samira Mafi Moghaddam

    2014-05-01

    Next generation sequence data provides valuable information and tools for genetic and genomic research and offers new insights useful for marker development. This data is useful for the design of accurate and user-friendly molecular tools. Common bean (Phaseolus vulgaris L.) is a diverse crop in which separate domestication events happened in each gene pool, followed by race and market class diversification that has resulted in different morphological characteristics in each commercial market class. This has led to essentially independent breeding programs within each market class, which in turn has resulted in limited within-market-class sequence variation. Sequence data from selected genotypes of five bean market classes (pinto, black, navy, and light and dark red kidney) were used to develop InDel-based markers specific to each market class. Design of the InDel markers was conducted through a combination of assembly, alignment and primer design software using 1.6x to 5.1x coverage of Illumina GAII sequence data for each of the selected genotypes. The procedure we developed for primer design is faster, more accurate, less error-prone, and higher-throughput than manual design. All InDel markers are easy to run and score with no need for PCR optimization. A total of 2,687 InDel markers distributed across the genome were developed. To highlight their usefulness, they were employed to construct a phylogenetic tree and a genetic map, showing that InDel markers are reliable, simple, and accurate.

  4. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    OpenAIRE

    Karolina Chwialkowska; Urszula Korotko; Joanna Kosinska; Iwona Szarejko; Miroslaw Kwasniewski

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing ...

  5. Bayesian phylogeny analysis of vertebrate serpins illustrates evolutionary conservation of the intron and indels based six groups classification system from lampreys for ∼500 MY

    Directory of Open Access Journals (Sweden)

    Abhishek Kumar

    2015-06-01

    The serpin superfamily is characterized by proteins that fold into a conserved tertiary structure and exploit a sophisticated and irreversible suicide-mechanism of inhibition. Vertebrate serpins are classified into six groups (V1–V6), based on three independent biological features: genomic organization, diagnostic amino acid sites and rare indels. However, this classification system was based on the limited number of mammalian genomes available. In this study, several non-mammalian genomes are used to validate this classification system using the powerful Bayesian phylogenetic method. This method supports the intron and indel based vertebrate classification and proves that serpins have been maintained from lampreys to humans for about 500 MY. Lampreys have fewer than 10 serpins, which expand into 36 serpins in humans. The two expanding groups V1 and V2 have SERPINB1/SERPINB6 and SERPINA8/SERPIND1 as the ancestral serpins, respectively. Large clusters of serpins are formed by local duplications of these serpins in tetrapod genomes. Interestingly, the ancestral HCII/SERPIND1 locus (nested within PIK4CA) possesses group V4 serpin (A2APL1, homolog of α2-AP/SERPINF2) of lampreys; hence, pointing to the fact that group V4 might have originated from group V2. Additionally in this study, details of the phylogenetic history and genomic characteristics of vertebrate serpins are revisited.

  6. Genome analysis of yellow fever virus of the ongoing outbreak in Brazil reveals polymorphisms

    Directory of Open Access Journals (Sweden)

    Myrna C Bonaldo

    The current yellow fever outbreak in Brazil is the most severe one in the country in recent times. It has rapidly spread to areas where YF virus (YFV) activity has not been observed for more than 70 years and vaccine coverage is almost null. Here, we sequenced the whole YFV genome of two naturally infected howler-monkeys (Alouatta clamitans) obtained from the Municipality of Domingos Martins, state of Espírito Santo, Brazil. These two ongoing-outbreak genome sequences are identical. They clustered in the 1E sub-clade (South America genotype I) along with the Brazilian and Venezuelan strains recently characterised from infections in humans and non-human primates that have been described in the last 20 years. However, we detected eight unique amino acid changes in the viral proteins, including the structural capsid protein (one change), and the components of the viral replicase complex, the NS3 (two changes) and NS5 (five changes) proteins, that could impact the capacity of viral infection in vertebrate and/or invertebrate hosts and spreading of the ongoing outbreak.

  7. [Analysis of methylation-sensitive amplified polymorphism in wheat genome under the wheat leaf rust stress].

    Science.gov (United States)

    Fu, Sheng-Jie; Wang, Hui; Feng, Li-Na; Sun, Yi; Yang, Wen-Xiang; Liu, Da-Qun

    2009-03-01

    The intrinsic DNA methylation pattern is an integral component of the epigenetic network in many eukaryotes. DNA methylation plays an important role in regulating gene expression in eukaryotes. Biological stress in plants provides an inherent epigenetic driving force of evolution. Study of DNA methylation patterns arising from biological stress will help us fully understand the epigenetic regulation of gene expression and the biological functions of DNA methylation. The wheat near-isogenic lines TcLr19 and TcLr41 were resistant to races THTT and TKTJ, respectively, whereas Thatcher is compatible in the interaction with both Puccinia triticina races, THTT and TKTJ. By means of methylation-sensitive amplified polymorphism (MSAP) analysis, the patterns of cytosine methylation in TcLr19, TcLr41, and Thatcher inoculated with P. triticina THTT and TKTJ were compared with those of the untreated samples. All the DNA fragments, each representing a recognition site cleaved by either or both of the isoschizomers, were amplified using 60 pairs of selective primers. The results indicated that there was no significant difference between the challenged and unchallenged plants at the DNA methylation level. However, an epigenetic difference between the near-isogenic line for wheat leaf rust resistance gene Lr41 and Thatcher was present.

  8. Polymorphisms in the Prion Protein Gene of cattle breeds from Brazil

    Directory of Open Access Journals (Sweden)

    Cristiane C. Sanches

    One of the alterations that occur in the PRNP gene in bovines is the insertion/deletion (indel) of base sequences in specific regions, such as indels of 12 base pairs (bp) in intron 1 and of 23 bp in the promoter region. The 23-bp deletion allele is associated with susceptibility to bovine spongiform encephalopathy (BSE), as is the 12-bp deletion allele. In the present study, the variability of nucleotides in the promoter region and intron 1 of the PRNP gene was genotyped for the Angus, Canchim, Nellore and Simmental bovine breeds to identify the genotype profiles of resistance and/or susceptibility to BSE in each animal. Genomic DNA was extracted for amplification of the target regions of the PRNP gene using polymerase chain reaction (PCR) and specific primers. The PCR products were submitted to electrophoresis in 3% agarose gel and sequencing for genotyping. With the exception of the Angus breed, most breeds exhibited a higher frequency of deletion alleles for 12 bp and 23 bp in comparison to their respective insertion alleles for both regions. These results represent an important contribution to understanding the formation process of the Brazilian herd in relation to bovine PRNP gene polymorphisms.
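The frequency comparison in the record above rests on simple allele counting for a codominant indel marker: each animal carries two alleles, so a deletion homozygote contributes two deletion alleles and a heterozygote one. A minimal sketch of that arithmetic follows. The function name and the genotype counts are hypothetical and purely illustrative; the abstract reports no per-breed counts.

```python
def deletion_allele_frequency(n_del_del, n_ins_del, n_ins_ins):
    """Frequency of the deletion allele from genotype counts at one indel locus.

    n_del_del: animals homozygous for the deletion allele
    n_ins_del: heterozygous animals
    n_ins_ins: animals homozygous for the insertion allele
    """
    total_alleles = 2 * (n_del_del + n_ins_del + n_ins_ins)
    if total_alleles == 0:
        raise ValueError("at least one genotyped animal is required")
    return (2 * n_del_del + n_ins_del) / total_alleles


# Hypothetical counts for one breed at the 23-bp promoter indel (not from the study).
freq_del = deletion_allele_frequency(n_del_del=12, n_ins_del=6, n_ins_ins=2)
freq_ins = 1 - freq_del
print(f"deletion allele: {freq_del:.2f}, insertion allele: {freq_ins:.2f}")
```

With these made-up counts the deletion allele accounts for 30 of 40 alleles, which is the kind of "higher frequency of deletion alleles" pattern the abstract describes for most breeds.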

  9. Comparative genomic analysis of multidrug-resistant Streptococcus pneumoniae isolates

    Directory of Open Access Journals (Sweden)

    Pan F

    2018-05-01

    Fen Pan,1 Hong Zhang,1 Xiaoyan Dong,2 Weixing Ye,3 Ping He,4 Shulin Zhang,4 Jeff Xianchao Zhu,5 Nanbert Zhong1,2,6 1Department of Clinical Laboratory, Shanghai Children’s Hospital, Shanghai Jiaotong University, Shanghai, China; 2Department of Respiratory, Shanghai Children’s Hospital, Shanghai Jiaotong University, Shanghai, China; 3Shanghai Personal Biotechnology Co., Ltd, Shanghai, China; 4Department of Medical Microbiology and Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China; 5Zhejiang Bioruida Biotechnology Co. Ltd, Zhejiang, China; 6New York State Institute for Basic Research in Developmental Disabilities, Staten Island, NY, USA Introduction: Multidrug resistance in Streptococcus pneumoniae has emerged as a serious problem to public health. A further understanding of the genetic diversity in antibiotic-resistant S. pneumoniae isolates is needed. Methods: We conducted whole-genome resequencing for 25 pneumococcal strains isolated from children with different antimicrobial resistance profiles. Comparative analysis focused on the detection of single-nucleotide polymorphisms (SNPs) and insertions and deletions (indels) was conducted. Moreover, phylogenetic analysis was applied to investigate the genetic relationship among these strains. Results: The genome size of the isolates was ~2.1 Mbp, covering >90% of the total estimated size of the reference genome. The overall G+C% content was ~39.5%, and there were 2,200–2,400 open reading frames. All isolates with different drug resistance profiles harbored many indels (range 131–171) and SNPs (range 16,103–28,128). Genetic diversity analysis showed that variation in different genes was associated with specific antibiotic resistance. Known antibiotic resistance genes (pbps, murMN, ciaH, rplD, sulA, and dpr) were identified, and new genes (regR, argH, trkH, and PTS-EII) closely related with antibiotic resistance were found, although these genes were primarily annotated

  10. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    Science.gov (United States)

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
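The reported average marker interval can be checked directly from the map length and marker count quoted in the record above. The sketch below is only a sanity check under stated assumptions: the abstract does not say whether the interval was computed per marker or per gap between adjacent markers, so both are shown, and the linkage-group count of 9 (the haploid chromosome number of Brassica oleracea) is an assumption rather than a figure from the abstract.

```python
# Figures quoted in the abstract.
MAP_LENGTH_CM = 890.01
N_MARKERS = 1776

# Assumption, not stated in the abstract: one linkage group per chromosome, n = 9.
N_LINKAGE_GROUPS = 9


def mean_interval(length_cm, n_markers, n_groups=0):
    """Mean marker spacing in cM.

    With n_groups = 0 this is simply length / markers. With n_groups > 0 it
    divides by the number of gaps between adjacent markers, since each
    linkage group with k markers has k - 1 gaps.
    """
    return length_cm / (n_markers - n_groups)


per_marker = mean_interval(MAP_LENGTH_CM, N_MARKERS)
per_gap = mean_interval(MAP_LENGTH_CM, N_MARKERS, N_LINKAGE_GROUPS)
print(f"{per_marker:.2f} cM per marker, {per_gap:.2f} cM per gap")
```

Either convention rounds to the 0.50 cM stated in the abstract, so the choice does not affect the reported density at this precision.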

  11. Polymorphisms in AHI1 are not associated with type 2 diabetes or related phenotypes in Danes: non-replication of a genome-wide association result

    DEFF Research Database (Denmark)

    Holmkvist, J; Anthonsen, S; Wegner, L

    2008-01-01

    AIMS/HYPOTHESIS: A genome-wide association study recently identified an association between common variants, rs1535435 and rs9494266, in the AHI1 gene and type 2 diabetes. The aim of the present study was to investigate the putative association between these polymorphisms and type 2 diabetes or type 2 diabetes-related metabolic traits in Danish individuals. METHODS: The previously associated polymorphisms were genotyped in the population-based Inter99 cohort (n=6162), the Danish ADDITION study (n=8428), a population-based sample of young healthy participants (n=377) and in additional type 2... the importance of independent and well-powered replication studies of the recent genome-wide association scans before a locus is robustly validated as being associated with type 2 diabetes.

  12. [Analysis of genomic DNA methylation level in radish under cadmium stress by methylation-sensitive amplified polymorphism technique].

    Science.gov (United States)

    Yang, Jin-Lan; Liu, Li-Wang; Gong, Yi-Qin; Huang, Dan-Qiong; Wang, Feng; He, Ling-Li

    2007-06-01

    The level of cytosine methylation induced by cadmium in radish (Raphanus sativus L.) genome was analysed using the technique of methylation-sensitive amplified polymorphism (MSAP). The MSAP ratios in radish seedlings exposed to cadmium chloride at concentrations of 50, 250 and 500 mg/L were 37%, 43% and 51%, respectively, and the control was 34%; the full methylation levels (C(m)CGG in double strands) were 23%, 25% and 27%, respectively, while the control was 22%. The increase in MSAP and full methylation levels indicated that de novo methylation occurred in some 5'-CCGG sites under Cd stress. There was a significant positive correlation between the increase in total DNA methylation level and CdCl₂ concentration. Four types of MSAP patterns were identified among the CdCl₂ treatments and the control: de novo methylation, de-methylation, atypical pattern and no change of methylation pattern. DNA methylation alteration in plants treated with CdCl₂ was mainly through de novo methylation.
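The positive dose-response trend stated in the record above can be illustrated with the four data points the abstract itself gives. The sketch below computes a plain Pearson correlation coefficient; it assumes the control corresponds to 0 mg/L cadmium chloride, which the abstract does not state explicitly, and it says nothing about statistical significance, which the authors presumably assessed on replicate-level data not shown here.

```python
from math import sqrt

# Concentrations in mg/L; the control is assumed to be 0 mg/L (not stated in the abstract).
cdcl2_mg_per_l = [0, 50, 250, 500]
msap_ratio_pct = [34, 37, 43, 51]          # MSAP ratios from the abstract
full_methylation_pct = [22, 23, 25, 27]    # full methylation levels from the abstract


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r_msap = pearson_r(cdcl2_mg_per_l, msap_ratio_pct)
r_full = pearson_r(cdcl2_mg_per_l, full_methylation_pct)
print(f"r (MSAP ratio) = {r_msap:.3f}, r (full methylation) = {r_full:.3f}")
```

Both coefficients come out close to 1, consistent with the near-linear rise in methylation across the tested concentrations.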

  13. A Genome Wide Association Study on Age at First Calving Using High Density Single Nucleotide Polymorphism Chips in Hanwoo (

    Directory of Open Access Journals (Sweden)

    K.-E. Hyeong

    2014-10-01

    Age at first calving is an important trait for achieving earlier reproductive performance. To detect quantitative trait loci (QTL) for reproductive traits, a genome wide association study was conducted on the 96 Hanwoo cows that were born between 2008 and 2010 from 13 sires in a local farm (Juk-Am Hanwoo farm, Suncheon, Korea) and genotyped with the Illumina 50K bovine single nucleotide polymorphism (SNP) chips. Phenotypes were regressed on additive and dominance effects for each SNP using a simple linear regression model after the effects of birth-year-month and polygenes were considered. A forward regression procedure was applied to determine the best set of SNPs for age at first calving. A total of 15 QTL were detected at the comparison-wise 0.001 level. Two QTL with strong statistical evidence were found at 128.9 Mb and 111.1 Mb on bovine chromosomes (BTA) 2 and 7, respectively, each of which accounted for 22% of the phenotypic variance. Also, five significant SNPs were detected on BTAs 10, 16, 20, 26, and 29. Multiple QTL were found on BTAs 1, 2, 7, and 14. The significant QTLs may be applied via marker assisted selection to increase rate of genetic gain for the trait, after validation tests in other Hanwoo cow populations.

  14. High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.

    Science.gov (United States)

    Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias

    2015-01-01

    Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. The univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate that a three-way interaction analysis on GWAS data with 1.1 million SNPs would require over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory, assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
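The scale of the problem described in the record above follows from simple combinatorics: an exhaustive three-way analysis must visit every unordered triple of SNPs. The sketch below counts those triples and reproduces the linear-scaling extrapolation from Avoca to Sequoia. The SNP count and the 5.8-year figure come from the abstract; the core counts are assumptions taken from the commonly cited specifications of the two machines, not from the abstract, and the calculation ignores any per-core or interconnect differences.

```python
from math import comb

# From the abstract.
N_SNPS = 1_100_000
AVOCA_RUNTIME_YEARS = 5.8

# Assumptions (not in the abstract): commonly cited Blue Gene/Q core counts.
AVOCA_CORES = 65_536
SEQUOIA_CORES = 1_572_864

# Every unordered SNP triple must be tested once in an exhaustive three-way scan.
n_triples = comb(N_SNPS, 3)

# Linear scaling: runtime shrinks in proportion to the number of cores.
speedup = SEQUOIA_CORES / AVOCA_CORES
sequoia_runtime_months = AVOCA_RUNTIME_YEARS * 12 / speedup

print(f"SNP triples to test: {n_triples:.3e}")
print(f"Core ratio Sequoia/Avoca: {speedup:.0f}x")
print(f"Extrapolated Sequoia runtime: {sequoia_runtime_months:.1f} months")
```

Under these assumptions the extrapolated runtime lands just below three months, in line with the "under 3 months" estimate quoted in the abstract.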

  15. Should we use the single nucleotide polymorphism linked to in genomic evaluation of French trotter?

    Science.gov (United States)

    Brard, S; Ricard, A

    2015-10-01

    An A/C mutation responsible for the ability to pace in horses was recently discovered in the gene. It has also been proven that allele C has a negative effect on trotters' performances. However, in French trotters (FT), the frequency of allele A is only 77% due to an unexpected positive effect of allele C in late-career FT performances. Here we set out to ascertain whether the genotype at SNP (linked to ) should be used to compute EBV for FT. We used the genotypes of 630 horses, with 41,711 SNP retained. The pedigree comprised 5,699 horses. Qualification status (trotters need to complete a 2,000-m race within a limited time to begin their career) and earnings at different ages were precorrected for fixed effects and evaluated with a multitrait model. Estimated breeding values were computed with and without the genotype at SNP as a fixed effect in the model. The analyses were performed using pedigree only via BLUP and using the genotypes via genomic BLUP (GBLUP). The genotype at SNP was removed from the file of genotypes when already taken into account as a fixed effect. Alternatively, 3 groups of 100 candidates were used for validation. Validations were also performed on 50 random-clustered groups of 126 candidates and compared against the results of the 3 disjoint sets. For performances on which has a minor effect, the coefficients of correlation were not improved when the genotype at SNP was a fixed effect in the model (earnings at 3 and 4 yr). However, for traits proven strongly related to , the accuracy of evaluation was improved, increasing +0.17 for earnings at 2 yr, +0.04 for earnings at 5 yr and older, and +0.09 for qualification status (with the GBLUP method). For all traits, the bias was reduced when the SNP linked to was a fixed effect in the model. This work finds a clear rationale for using the genotype at for this multitrait evaluation. Genomic selection seemed to achieve better results than classic selection.

  16. Genomic expression and single-nucleotide polymorphism profiling discriminates chromophobe renal cell carcinoma and oncocytoma

    International Nuclear Information System (INIS)

    Tan, Min-Han; Furge, Kyle A; Kort, Eric; Giraud, Sophie; Ferlicot, Sophie; Vielh, Philippe; Amsellem-Ouazana, Delphine; Debré, Bernard; Flam, Thierry; Thiounn, Nicolas; Zerbib, Marc; Wong, Chin Fong; Benoît, Gérard; Droupy, Stéphane; Molinié, Vincent; Vieillefond, Annick; Tan, Puay Hoon; Richard, Stéphane; Teh, Bin Tean; Tan, Hwei Ling; Yang, Ximing J; Ditlev, Jonathon; Matsuda, Daisuke; Khoo, Sok Kean; Sugimura, Jun; Fujioka, Tomoaki

    2010-01-01

    Chromophobe renal cell carcinoma (chRCC) and renal oncocytoma are two distinct but closely related entities with strong morphologic and genetic similarities. While chRCC is a malignant tumor, oncocytoma is usually regarded as a benign entity. The overlapping characteristics are best explained by a common cellular origin, and the biologic differences between chRCC and oncocytoma are therefore of considerable interest in terms of carcinogenesis, diagnosis and clinical management. Previous studies have been relatively limited in terms of examining the differences between oncocytoma and chromophobe RCC. Gene expression profiling using the Affymetrix HGU133Plus2 platform was applied on chRCC (n = 15) and oncocytoma specimens (n = 15). Supervised analysis was applied to identify a discriminatory gene signature, as well as differentially expressed genes. High throughput single-nucleotide polymorphism (SNP) genotyping was performed on independent samples (n = 14) using Affymetrix GeneChip Mapping 100 K arrays to assess correlation between expression and gene copy number. Immunohistochemical validation was performed in an independent set of tumors. A novel 14 probe-set signature was developed to classify the tumors internally with 93% accuracy, and this was successfully validated on an external data-set with 94% accuracy. Pathway analysis highlighted clinically relevant dysregulated pathways of c-erbB2 and mammalian target of rapamycin (mTOR) signaling in chRCC, but no significant differences in p-AKT or extracellular HER2 expression were identified on immunohistochemistry. Loss of chromosome 1p, reflected in both cytogenetic and expression analysis, is common to both entities, implying this may be an early event in histogenesis. Multiple regional areas of cytogenetic alterations and corresponding expression biases differentiating the two entities were identified. Parafibromin, aquaporin 6, and synaptogyrin 3 were novel immunohistochemical markers effectively discriminating

  17. Genomic expression and single-nucleotide polymorphism profiling discriminates chromophobe renal cell carcinoma and oncocytoma

    Directory of Open Access Journals (Sweden)

    Thiounn Nicolas

    2010-05-01

    Background: Chromophobe renal cell carcinoma (chRCC) and renal oncocytoma are two distinct but closely related entities with strong morphologic and genetic similarities. While chRCC is a malignant tumor, oncocytoma is usually regarded as a benign entity. The overlapping characteristics are best explained by a common cellular origin, and the biologic differences between chRCC and oncocytoma are therefore of considerable interest in terms of carcinogenesis, diagnosis and clinical management. Previous studies have been relatively limited in terms of examining the differences between oncocytoma and chromophobe RCC. Methods: Gene expression profiling using the Affymetrix HGU133Plus2 platform was applied on chRCC (n = 15) and oncocytoma specimens (n = 15). Supervised analysis was applied to identify a discriminatory gene signature, as well as differentially expressed genes. High throughput single-nucleotide polymorphism (SNP) genotyping was performed on independent samples (n = 14) using Affymetrix GeneChip Mapping 100 K arrays to assess correlation between expression and gene copy number. Immunohistochemical validation was performed in an independent set of tumors. Results: A novel 14 probe-set signature was developed to classify the tumors internally with 93% accuracy, and this was successfully validated on an external data-set with 94% accuracy. Pathway analysis highlighted clinically relevant dysregulated pathways of c-erbB2 and mammalian target of rapamycin (mTOR) signaling in chRCC, but no significant differences in p-AKT or extracellular HER2 expression were identified on immunohistochemistry. Loss of chromosome 1p, reflected in both cytogenetic and expression analysis, is common to both entities, implying this may be an early event in histogenesis. Multiple regional areas of cytogenetic alterations and corresponding expression biases differentiating the two entities were identified. Parafibromin, aquaporin 6, and synaptogyrin 3 were novel

  18. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the 'true citrus fruit trees' group (Citrinae, Rutaceae) and the origin of cultivated species.

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Despite differences in morphology, the genera representing 'true citrus fruit trees' are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial 'species' of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between 'true citrus fruit trees' were clarified. Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will help further works to analyse the molecular basis of the

  19. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the ‘true citrus fruit trees’ group (Citrinae, Rutaceae) and the origin of cultivated species

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Background and Aims: Despite differences in morphology, the genera representing ‘true citrus fruit trees’ are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial ‘species’ of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between ‘true citrus fruit trees’ were clarified. Methods: Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. Key Results: A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Conclusions: Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will

  20. Development and validation of cross-transferable and polymorphic DNA markers for detecting alien genome introgression in Oryza sativa from Oryza brachyantha.

    Science.gov (United States)

    Ray, Soham; Bose, Lotan K; Ray, Joshitha; Ngangkham, Umakanta; Katara, Jawahar L; Samantaray, Sanghamitra; Behera, Lambodar; Anumalla, Mahender; Singh, Onkar N; Chen, Meingsheng; Wing, Rod A; Mohapatra, Trilochan

    2016-08-01

    African wild rice Oryza brachyantha (FF), a distant relative of cultivated rice Oryza sativa (AA), carries genes for pest and disease resistance. Molecular marker-assisted alien gene introgression from this wild species to its domesticated counterpart is largely impeded by the scarcity of cross-transferable and polymorphic molecular markers that can clearly distinguish these two species. Availability of the whole genome sequence (WGS) of both species provides a unique opportunity to develop markers that are cross-transferable. We observed poor cross-transferability (~0.75 %) of O. sativa specific sequence tagged microsatellite (STMS) markers to O. brachyantha. By utilizing the genome sequence information, we developed a set of 45 low cost PCR based co-dominant polymorphic markers (STS and CAPS). These markers were found to be cross-transferable (84.78 %) between the two species and could distinguish them from each other, and thus allowed tracing of alien genome introgression. Finally, we validated a Monosomic Alien Addition Line (MAAL) carrying chromosome 1 of O. brachyantha in O. sativa background using these markers, as a proof of concept. Hence, in this study, we have identified a set of molecular markers (comprising STMS, STS and CAPS) that are capable of detecting alien genome introgression from O. brachyantha to O. sativa.

  1. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  2. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop
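
    The CCGG sites named in the title can be located in silico before any wet-lab work. The following is a minimal sketch, not the authors' pipeline: it assumes the recognition site CCGG (the site shared by the isoschizomers HpaII and MspI, which are commonly used in MSAP), and the function names, the toy sequence and the window size are hypothetical.

```python
import re

# Assumption: CCGG is the recognition site (shared by HpaII and MspI).
RECOGNITION_SITE = "CCGG"


def ccgg_sites(sequence):
    """Return the 0-based start position of every CCGG site in the sequence."""
    return [m.start() for m in re.finditer(RECOGNITION_SITE, sequence.upper())]


def sites_per_window(sequence, window):
    """Count CCGG sites in consecutive, non-overlapping windows."""
    n_windows = (len(sequence) + window - 1) // window
    counts = [0] * n_windows
    for pos in ccgg_sites(sequence):
        counts[pos // window] += 1
    return counts


# Hypothetical toy sequence, for illustration only.
toy = "ATCCGGTTACCGGCCGGATATATATCCGGA"
positions = ccgg_sites(toy)
density = sites_per_window(toy, 10)
```

    Counting sites per window gives a rough picture of where candidate sites cluster along a sequence; it says nothing about methylation status, which only the sequencing data can provide.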

  3. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Directory of Open Access Journals (Sweden)

    Karolina Chwialkowska

    2017-11-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation

  4. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    Science.gov (United States)

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop

  5. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    Science.gov (United States)

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  6. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset.

    Science.gov (United States)

    Ignatieva, Elena V; Levitsky, Victor G; Yudin, Nikolay S; Moshkin, Mikhail P; Kolchanov, Nikolay A

    2014-01-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  7. Genetic basis of olfactory cognition: extremely high level of DNA sequence polymorphism in promoter regions of the human olfactory receptor genes revealed using the 1000 Genomes Project dataset

    Directory of Open Access Journals (Sweden)

    Elena V. Ignatieva

    2014-03-01

    The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter (a region of DNA about 100–1000 base pairs long located upstream of the transcription start site). We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

  8. Single nucleotide polymorphisms and indel markers from the transcriptome of garlic

    Science.gov (United States)

    Garlic (Allium sativum L.) is cultivated world-wide and widely appreciated for its culinary uses. In spite of primarily being asexually propagated, garlic shows great diversity for adaptation to diverse production environments and bulb phenotypes. Anonymous molecular markers have been used to assess...

  9. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    Science.gov (United States)

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
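
    The two distance measures described above, SNP differences in a core genome alignment and allelic differences across core genes, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than any published typing scheme: the isolate names, sequences, locus names and allele numbers are hypothetical toy data, and skipping gap or N positions and ignoring loci missing from either profile are simplifying choices made here.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count aligned positions where two core-genome sequences differ.

    Assumption: positions with a gap ('-') or ambiguous base ('N') in
    either sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    skip = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


def allelic_distance(profile_a, profile_b):
    """Count shared core genes whose allele numbers differ."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])


def assign_types(profiles):
    """Give each unique allelic profile its own arbitrary type number."""
    seen = {}
    types = {}
    for name in sorted(profiles):
        key = tuple(sorted(profiles[name].items()))
        types[name] = seen.setdefault(key, len(seen) + 1)
    return types


def pairwise(items, metric):
    """Apply a distance metric to every pair of named items."""
    return {
        (x, y): metric(items[x], items[y])
        for x, y in combinations(sorted(items), 2)
    }


# Hypothetical toy data, for illustration only.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACCTAC-TAT",
}
profiles = {
    "isolate_1": {"geneA": 1, "geneB": 4, "geneC": 2},
    "isolate_2": {"geneA": 1, "geneB": 4, "geneC": 3},
    "isolate_3": {"geneA": 2, "geneB": 4, "geneC": 3},
}

snp_matrix = pairwise(alignment, snp_distance)
allele_matrix = pairwise(profiles, allelic_distance)
types = assign_types(profiles)
```

    Consistent with the abstract's caution, the sketch deliberately applies no relatedness cutoff: what counts as "closely related" depends on the organism and cannot be read off the distances alone.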

  10. Genome features of "Dark-fly", a Drosophila line reared long-term in a dark environment.

    Directory of Open Access Journals (Sweden)

    Minako Izutsu

    Organisms are remarkably adapted to diverse environments by specialized metabolisms, morphology, or behaviors. To address the molecular mechanisms underlying environmental adaptation, we have utilized a Drosophila melanogaster line, termed "Dark-fly", which has been maintained in constant dark conditions for 57 years (1400 generations). We found that Dark-fly exhibited higher fecundity in dark than in light conditions, indicating that Dark-fly possesses some traits advantageous in darkness. Using next-generation sequencing technology, we determined the whole genome sequence of Dark-fly and identified approximately 220,000 single nucleotide polymorphisms (SNPs) and 4,700 insertions or deletions (InDels) in the Dark-fly genome compared to the genome of the Oregon-R-S strain, a control strain. 1.8% of SNPs were classified as non-synonymous SNPs (nsSNPs: i.e., they alter the amino acid sequence of gene products). Among them, we detected 28 nonsense mutations (i.e., they produce a stop codon in the protein sequence) in the Dark-fly genome. These included genes encoding an olfactory receptor and a light receptor. We also searched runs of homozygosity (ROH) regions as putative regions selected during the population history, and found 21 ROH regions in the Dark-fly genome. We identified 241 genes carrying nsSNPs or InDels in the ROH regions. These include a cluster of alpha-esterase genes that are involved in detoxification processes. Furthermore, analysis of structural variants in the Dark-fly genome showed the deletion of a gene related to fatty acid metabolism. Our results revealed unique features of the Dark-fly genome and provided a list of potential candidate genes involved in environmental adaptation.

  11. Genome and gene alterations by insertions and deletions in the evolution of human and chimpanzee chromosome 22

    Directory of Open Access Journals (Sweden)

    Volfovsky Natalia

    2009-01-01

    Background: Understanding structure and function of human genome requires knowledge of genomes of our closest living relatives, the primates. Nucleotide insertions and deletions (indels) play a significant role in differentiation that underlies phenotypic differences between humans and chimpanzees. In this study, we evaluated distribution, evolutionary history, and function of indels found by comparing syntenic regions of the human and chimpanzee genomes. Results: Specifically, we identified 6,279 indels of 10 bp or greater in a ~33 Mb alignment between human and chimpanzee chromosome 22. After the exclusion of those in repetitive DNA, 1,429 or 23% of indels still remained. This group was characterized according to the local or genome-wide repetitive nature, size, location relative to genes, and other genomic features. We defined three major classes of these indels, using local structure analysis: (i) those indels found uniquely without additional copies of indel sequence in the surrounding (10 Kb) region, (ii) those with at least one exact copy found nearby, and (iii) those with similar but not identical copies found locally. Among these classes, we encountered a high number of exactly repeated indel sequences, most likely due to recent duplications. Many of these indels (683 of 1,429) were in proximity of known human genes. Coding sequences and splice sites contained significantly fewer of these indels than expected from random expectations, suggesting that selection is a factor in limiting their persistence. A subset of indels from coding regions was experimentally validated and their impacts were predicted based on direct sequencing in several human populations as well as chimpanzees, bonobos, gorillas, and two subspecies of orangutans. Conclusion: Our analysis demonstrates that while indels are distributed essentially randomly in intergenic and intronic genomic regions, they are significantly under-represented in coding sequences. There are
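
    The three-way classification of indels by local structure can be illustrated with a short sketch. The abstract does not say how similarity was measured, so everything below is an assumption made for illustration: the 90% identity threshold, the ungapped same-strand comparison, the class labels and the function names are hypothetical and are not the authors' method.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def classify_indel(indel_seq, surrounding, min_identity=0.9):
    """Assign an indel to one of three classes from its local context.

    `surrounding` is the flanking sequence (for example 10 kb) with the
    indel itself excluded. Assumptions: ungapped, same-strand comparison
    and an arbitrary identity threshold.
    """
    indel_seq = indel_seq.upper()
    surrounding = surrounding.upper()
    if not indel_seq:
        raise ValueError("indel sequence must not be empty")
    # Class (ii): at least one exact copy nearby.
    if indel_seq in surrounding:
        return "exact_copy_nearby"
    # Class (iii): a similar but not identical copy nearby.
    k = len(indel_seq)
    for start in range(len(surrounding) - k + 1):
        if identity(indel_seq, surrounding[start:start + k]) >= min_identity:
            return "similar_copy_nearby"
    # Class (i): no additional copy in the surrounding region.
    return "unique"


# Hypothetical toy sequences, for illustration only.
indel = "ACGTTGCAATCCGGATTACA"
near_copy = "ACGTTGCAATCCGGATTACT"  # differs at the last base

class_unique = classify_indel(indel, "G" * 80)
class_exact = classify_indel(indel, "G" * 30 + indel + "G" * 30)
class_similar = classify_indel(indel, "G" * 30 + near_copy + "G" * 30)
```

    A sliding ungapped comparison is quadratic in the worst case and misses copies that themselves contain small gaps or lie on the reverse strand; a real analysis would more likely rely on an alignment tool.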

  12. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

    Science.gov (United States)

    Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

    2010-04-27

    To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7 x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions decreasing genetic diversity which might be
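
    The relation between sequence yield and fold coverage quoted above is simple arithmetic, sketched below. The function names are hypothetical, and the genome size is back-calculated here from the two figures in the abstract (5.89 Gb at 15.7 x); the abstract itself does not state a genome size.

```python
def fold_coverage(total_bases, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a given yield and fold coverage."""
    return total_bases / coverage


total_bases = 5.89e9   # 5.89 Gb of Koshihikari sequence (from the abstract)
coverage = 15.7        # stated fold coverage (from the abstract)

# Back-calculated, not stated in the abstract: roughly 375 Mb.
genome_size = implied_genome_size(total_bases, coverage)
```

    Running the calculation in the other direction, `fold_coverage(total_bases, genome_size)` recovers the stated 15.7 x.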

  13. A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection

    Directory of Open Access Journals (Sweden)

    Tetreau Guillaume

    2008-10-01

    Background: For most organisms, developing hundreds of genetic markers spanning the whole genome still requires excessive if not unrealistic efforts. In this context, there is an obvious need for methodologies allowing the low-cost, fast and high-throughput genotyping of virtually any species, such as the Diversity Arrays Technology (DArT). One of the crucial steps of the DArT technique is the genome complexity reduction, which allows obtaining a genomic representation characteristic of the studied DNA sample and necessary for subsequent genotyping. In this article, using the mosquito Aedes aegypti as a study model, we describe a new genome complexity reduction method taking advantage of the abundance of miniature inverted repeat transposable elements (MITEs) in the genome of this species. Results: Ae. aegypti genomic representations were produced following a two-step procedure: (1) restriction digestion of the genomic DNA and simultaneous ligation of a specific adaptor to compatible ends, and (2) amplification of restriction fragments containing a particular MITE element called Pony using two primers, one annealing to the adaptor sequence and one annealing to a conserved sequence motif of the Pony element. Using this protocol, we constructed a library comprising more than 6,000 DArT clones, of which at least 5.70% were highly reliable polymorphic markers for two closely related mosquito strains separated by only a few generations of artificial selection. Within this dataset, linkage disequilibrium was low, and marker redundancy was evaluated at 2.86% only. Most of the detected genetic variability was observed between the two studied mosquito strains, but individuals of the same strain could still be clearly distinguished. Conclusion: The new complexity reduction method was particularly efficient to reveal genetic polymorphisms in Ae. aegypti. Overall, our results testify to the flexibility of the DArT genotyping technique and open new

  14. Determination of the frequency of polymorphisms in genes related to the genome stability maintenance of the population residing at Monte Alegre, PA (Brazil) municipality

    International Nuclear Information System (INIS)

    Hozumi, Cristiny Gomes

    2010-01-01

    Human exposure to ionizing radiation from natural sources is an inherent feature of life on Earth; humans and all other living things have always been exposed to these sources. Ionizing radiation is a known genotoxic agent that can affect genomic stability, and genes related to DNA repair may play a role when they carry certain polymorphisms. This study aimed to analyze the frequency of polymorphisms (SNPs) in genes of DNA repair and cell cycle control, hOGG1 (Ser326Cys), XRCC3 (Thr241Met) and p53 (Arg72Pro), in saliva samples from a population living in Monte Alegre, state of Pará. The samples were collected in August 2008: 40 from men and 46 from women, a total of 86 samples. The frequencies of homozygous and heterozygous genotypes for the polymorphic genes were determined by RFLP. The 326Cys allele of hOGG1 was found at 5%, the 241Met allele of XRCC3 at about 21% and the 72Pro allele of p53 at 40.8%. The genotype frequencies for hOGG1, XRCC3 and p53, respectively, were 91.04%, 88.06% and 59.7% for the homozygous wild-type genotype; 5.97%, 11.94% and 22.39% for the heterozygous genotype; and 2.99%, zero and 17.91% for the homozygous polymorphic genotype. These values are similar to those found in previous studies. The influence of these polymorphisms, which are involved in DNA repair and the consequent genotoxicity induced by radiation, depends on dose and on exposure factors such as smoking, which is statistically a factor in public health surveillance in the region. This study gathered molecular epidemiology information in Monte Alegre that helps to characterize the local population. (author)

  15. Porcine SOX9 Gene Expression Is Influenced by an 18 bp Indel in the 5'-Untranslated Region.

    Directory of Open Access Journals (Sweden)

    Bertram Brenig

    Full Text Available Sex determining region Y-box 9 (SOX9) is an important regulator of sex and skeletal development and is expressed in a variety of embryonal and adult tissues. Loss or gain of function resulting from mutations within the coding region or chromosomal aberrations of the SOX9 locus lead to a plethora of detrimental phenotypes in humans and animals. One of these phenotypes is the so-called male-to-female or female-to-male sex-reversal which has been observed in several mammals including pig, dog, cat, goat, horse, and deer. In 38,XX sex-reversal French Large White pigs, a genome-wide association study suggested SOX9 as the causal gene, although no functional mutations were identified in affected animals. However, among other variants, an 18 bp indel had been detected in the 5'-untranslated region of the SOX9 gene by comparing affected animals and controls. We have identified the same indel (Δ18) between position +247 bp and +266 bp downstream of the transcription start site of the porcine SOX9 gene in four other pig breeds; i.e., German Large White, Laiwu Black, Bamei, and Erhualian. These animals have been genotyped in an attempt to identify candidate genes for porcine inguinal and/or scrotal hernia. Because the 18 bp segment in the wild type 5'-UTR harbours a highly conserved cAMP-response element (CRE) half-site, we analysed its role in SOX9 expression in vitro. Competition and immunodepletion electromobility shift assays demonstrate that the CRE half-site is specifically recognized by CREB. Both binding of CREB to the wild type as well as the absence of the CRE half-site in Δ18 reduced expression efficiency in HEK293T, PK-15, and ATDC5 cells significantly. Transfection experiments of wild type and Δ18 SOX9 promoter luciferase constructs show a significant reduction of RNA and protein levels depending on the presence or absence of the 18 bp segment. Hence, the data presented here demonstrate that the 18 bp indel in the porcine SOX9 5'-UTR is of functional

  16. Single-nucleotide polymorphism discovery in Leptographium longiclavatum, a mountain pine beetle-associated symbiotic fungus, using whole-genome resequencing.

    Science.gov (United States)

    Ojeda, Dario I; Dhillon, Braham; Tsui, Clement K M; Hamelin, Richard C

    2014-03-01

    Single-nucleotide polymorphisms (SNPs) are rapidly becoming the standard markers in population genomics studies; however, their use in nonmodel organisms is limited due to the lack of cost-effective approaches to uncover genome-wide variation, and the large number of individuals needed in the screening process to reduce ascertainment bias. To discover SNPs for population genomics studies in the fungal symbionts of the mountain pine beetle (MPB), we developed a road map to discover SNPs and to produce a genotyping platform. We undertook a whole-genome sequencing approach of Leptographium longiclavatum in combination with available genomics resources of another MPB symbiont, Grosmannia clavigera. We sequenced 71 individuals pooled into four groups using the Illumina sequencing technology. We generated between 27 and 30 million reads of 75 bp that resulted in a total of 1,181 contigs longer than 2 kb and an assembled genome size of 28.9 Mb (N50 = 48 kb, average depth = 125x). A total of 9052 proteins were annotated, and between 9531 and 17,266 SNPs were identified in the four pools. A subset of 206 genes (containing 574 SNPs, 11% false positives) was used to develop a genotyping platform for this species. Using this roadmap, we developed a genotyping assay with a total of 147 SNPs located in 121 genes using the Illumina® Sequenom iPLEX Gold. Our preliminary genotyping (success rate = 85%) of 304 individuals from 36 populations supports the utility of this approach for population genomics studies in other MPB fungal symbionts and other fungal nonmodel species. © 2013 John Wiley & Sons Ltd.
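
    The abstract above reports an assembly N50 of 48 kb. As a reminder of what that statistic measures, here is a minimal sketch of the usual N50 definition. The function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, arbitrary units); total = 300, half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

    With the toy lengths, the two longest contigs (80 + 70) already reach half of the total, so the sketch returns 70.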

  17. Whole-genome single-nucleotide polymorphism (SNP) marker discovery and association analysis with the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content in Larimichthys crocea

    Directory of Open Access Journals (Sweden)

    Shijun Xiao

    2016-12-01

    Full Text Available Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. The association studies with the muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide valuable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish.

  18. Estimating additive and non-additive genetic variances and predicting genetic merits using genome-wide dense single nucleotide polymorphism markers.

    Directory of Open Access Journals (Sweden)

    Guosheng Su

    Full Text Available Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in human, wild life, model organisms or farm animals. However, non-additive genetic effects may have an important contribution to total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relation matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP) markers. In addition, this study for the first time proposed a method to construct dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was implemented to investigate the amounts of additive genetic, dominance and epistatic variations, and assessed the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: (1) a simple additive genetic model (MA), (2) a model including both additive and additive by additive epistatic genetic effects (MAE), (3) a model including both additive and dominance genetic effects (MAD), and (4) a full model including all three genetic components (MAED). Estimates of narrow-sense heritability were 0.397, 0.373, 0.379 and 0.357 for models MA, MAE, MAD and MAED, respectively. Estimated dominance variance and additive by additive epistatic variance accounted for 5.6% and 9.5% of the total phenotypic variance, respectively. Based on model MAED, the estimate of broad-sense heritability was 0.506. Reliabilities of genomic predicted breeding values for the animals without performance records were 28.5%, 28.8%, 29.2% and 29.5% for models MA, MAE, MAD and MAED, respectively. In addition, models including non-additive genetic effects improved unbiasedness of genomic predictions.
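
    The abstract above refers to additive and dominance relationship matrices built from SNP genotypes but does not give the formulas. The sketch below shows one commonly used construction: additive genotype codes centred by 2p, heterozygosity indicators centred by 2pq, each cross-product scaled by its expected variance. That this matches the paper's exact definition is an assumption; the function names and the toy genotypes are hypothetical.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele per marker.

    genotypes: list of individuals, each a list of 0/1/2 allele counts.
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(n_markers)]


def _scaled_cross_product(rows, scale):
    """Return rows * rows^T / scale as a nested list."""
    return [[sum(a * b for a, b in zip(r1, r2)) / scale for r2 in rows]
            for r1 in rows]


def relationship_matrices(genotypes):
    """Return (G, D): additive and dominance genomic relationship matrices.

    Assumed formulation (not confirmed by the abstract):
      G = M M' / sum(2pq),           M[i][j] = genotype - 2p
      D = H H' / sum(2pq (1 - 2pq)), H[i][j] = [heterozygous] - 2pq
    All markers must be polymorphic so that the scaling terms are non-zero.
    """
    p = allele_frequencies(genotypes)
    two_pq = [2 * pj * (1 - pj) for pj in p]
    m_rows = [[g - 2 * pj for g, pj in zip(row, p)] for row in genotypes]
    h_rows = [[(1 if g == 1 else 0) - h for g, h in zip(row, two_pq)]
              for row in genotypes]
    additive_scale = sum(two_pq)
    dominance_scale = sum(h * (1 - h) for h in two_pq)
    return (_scaled_cross_product(m_rows, additive_scale),
            _scaled_cross_product(h_rows, dominance_scale))


# Hypothetical toy data: 4 individuals x 3 markers, coded as allele counts.
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G, D = relationship_matrices(toy_genotypes)
for row in G:
    print([round(x, 3) for x in row])
```

    In a genomic BLUP setting, matrices like these would serve as the covariance structures of the additive and dominance random effects; fitting the mixed model itself is outside the scope of this sketch.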

  19. Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Science.gov (United States)

    Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

    2009-01-01

    Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494
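
    The abstract above describes testing modules of functionally related genes rather than single markers, without stating the statistic GeSBAP uses. As a generic illustration of the gene-set idea only, the sketch below computes a one-sided hypergeometric (over-representation) p-value. It is not GeSBAP's actual method, and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, set_size, n_selected, n_total):
    """P(X >= k) for a hypergeometric draw.

    n_total     genes in the background
    set_size    genes belonging to the pathway or gene set
    n_selected  genes flagged as trait-associated
    k           flagged genes that fall inside the gene set
    """
    denominator = comb(n_total, n_selected)
    upper = min(set_size, n_selected)
    numerator = sum(
        comb(set_size, i) * comb(n_total - set_size, n_selected - i)
        for i in range(k, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 10 background genes, 3 in the set, 2 flagged, both in the set.
print(hypergeom_upper_tail(k=2, set_size=3, n_selected=2, n_total=10))
```

    A small p-value here would indicate that the gene set holds more associated genes than expected by chance; in practice, p-values across many gene sets would also need multiple-testing correction.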

  20. Genetic diversity and population structure in Physalis peruviana and related taxa based on InDels and SNPs derived from COSII and IRG markers

    Science.gov (United States)

    Garzón-Martínez, Gina A.; Osorio-Guarín, Jaime A.; Delgadillo-Durán, Paola; Mayorga, Franklin; Enciso-Rodríguez, Felix E.; Landsman, David

    2015-01-01

    The genus Physalis is common in the Americas and includes several economically important species, among them Physalis peruviana that produces appetizing edible fruits. We studied the genetic diversity and population structure of P. peruviana and characterized 47 accessions of this species along with 13 accessions of related taxa consisting of 222 individuals from the Colombian Corporation of Agricultural Research (CORPOICA) germplasm collection, using Conserved Orthologous Sequences (COSII) and Immunity Related Genes (IRGs). In addition, 642 Single Nucleotide Polymorphism (SNPs) markers were identified and used for the genetic diversity analysis. A total of 121 alleles were detected in 24 InDels loci ranging from 2 to 9 alleles per locus, with an average of 5.04 alleles per locus. The average number of alleles in the SNP markers was two. The observed heterozygosity for P. peruviana with InDel and SNP markers was higher (0.48 and 0.59) than the expected heterozygosity (0.30 and 0.41). Interestingly, the observed heterozygosity in related taxa (0.4 and 0.12) was lower than the expected heterozygosity (0.59 and 0.25). The coefficient of population differentiation FST was 0.143 (InDels) and 0.038 (SNPs), showing a relatively low level of genetic differentiation among P. peruviana and related taxa. Higher levels of genetic variation were instead observed within populations based on the AMOVA analysis. Population structure analysis supported the presence of two main groups and PCA analysis based on SNP markers revealed two distinct clusters in the P. peruviana accessions corresponding to their state of cultivation. In this study, we identified molecular markers useful to detect genetic variation in Physalis germplasm for assisting conservation and crossbreeding strategies. PMID:26550601
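
    The diversity statistics quoted above (alleles per locus, observed and expected heterozygosity) can be illustrated with a short sketch. It uses the simple, uncorrected estimator He = 1 - sum(p_i^2); whether the study applied a sample-size correction is not stated, so that choice is an assumption, and the toy counts and genotypes are hypothetical.

```python
def mean_alleles_per_locus(total_alleles, n_loci):
    """Average number of alleles per locus."""
    return total_alleles / n_loci


def expected_heterozygosity(allele_counts):
    """He = 1 - sum(p_i^2), from allele counts at one locus."""
    total = sum(allele_counts)
    return 1.0 - sum((c / total) ** 2 for c in allele_counts)


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


# Figures from the abstract: 121 alleles over 24 InDel loci.
print(round(mean_alleles_per_locus(121, 24), 2))  # 5.04

# Hypothetical single-locus toy data.
print(expected_heterozygosity([6, 3, 1]))
print(observed_heterozygosity([("A", "A"), ("A", "B"), ("B", "C"), ("A", "A")]))
```

    Comparing Ho with He per group is what underlies statements such as observed heterozygosity being higher or lower than expected.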

  1. Genetic diversity and population structure in Physalis peruviana and related taxa based on InDels and SNPs derived from COSII and IRG markers.

    Science.gov (United States)

    Garzón-Martínez, Gina A; Osorio-Guarín, Jaime A; Delgadillo-Durán, Paola; Mayorga, Franklin; Enciso-Rodríguez, Felix E; Landsman, David; Mariño-Ramírez, Leonardo; Barrero, Luz Stella

    2015-12-01

    The genus Physalis is common in the Americas and includes several economically important species, among them Physalis peruviana that produces appetizing edible fruits. We studied the genetic diversity and population structure of P. peruviana and characterized 47 accessions of this species along with 13 accessions of related taxa consisting of 222 individuals from the Colombian Corporation of Agricultural Research (CORPOICA) germplasm collection, using Conserved Orthologous Sequences (COSII) and Immunity Related Genes (IRGs). In addition, 642 Single Nucleotide Polymorphism (SNPs) markers were identified and used for the genetic diversity analysis. A total of 121 alleles were detected in 24 InDels loci ranging from 2 to 9 alleles per locus, with an average of 5.04 alleles per locus. The average number of alleles in the SNP markers was two. The observed heterozygosity for P. peruviana with InDel and SNP markers was higher (0.48 and 0.59) than the expected heterozygosity (0.30 and 0.41). Interestingly, the observed heterozygosity in related taxa (0.4 and 0.12) was lower than the expected heterozygosity (0.59 and 0.25). The coefficient of population differentiation FST was 0.143 (InDels) and 0.038 (SNPs), showing a relatively low level of genetic differentiation among P. peruviana and related taxa. Higher levels of genetic variation were instead observed within populations based on the AMOVA analysis. Population structure analysis supported the presence of two main groups and PCA analysis based on SNP markers revealed two distinct clusters in the P. peruviana accessions corresponding to their state of cultivation. In this study, we identified molecular markers useful to detect genetic variation in Physalis germplasm for assisting conservation and crossbreeding strategies.

  2. Signatures of selection in the Iberian honey bee: a genome wide approach using single nucleotide polymorphisms (SNPs)

    OpenAIRE

    Chavez-Galarza, Julio; Johnston, J. Spencer; Azevedo, João; Muñoz, Irene; De la Rúa, Pilar; Patton, John C.; Pinto, M. Alice

    2011-01-01

    Dissecting genome-wide (expansions, contractions, admixture) from genome-specific effects (selection) is a goal of central importance in evolutionary biology because it leads to more robust inferences of demographic history and to identification of adaptive divergence. The publication of the honey bee genome and the development of high-density SNPs genotyping, provide us with powerful tools, allowing us to identify signatures of selection in the honey bee genome. These signatur...

  3. Development of cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers from the chloroplast genome of Glycyrrhiza species.

    Science.gov (United States)

    Jo, Ick-Hyun; Sung, Jwakyung; Hong, Chi-Eun; Raveendar, Sebastin; Bang, Kyong-Hwan; Chung, Jong-Wook

    2018-05-01

    Licorice (Glycyrrhiza glabra) is an important medicinal crop often used in health foods and medicine worldwide. The molecular genetics of licorice remains poorly studied owing to a lack of molecular markers. Here, we have developed cleaved amplified polymorphic sequence (CAPS) and high-resolution melting (HRM) markers based on single nucleotide polymorphisms (SNPs) by comparing the chloroplast genomes of two Glycyrrhiza species (G. glabra and G. lepidota). The CAPS and HRM markers were tested for diversity analysis with 24 Glycyrrhiza accessions. The restriction profiles generated with CAPS markers classified the accessions (2-4 genotypes) and melting curves (2-3) were obtained from the HRM markers. The number of alleles and major allele frequency were 2-6 and 0.31-0.92, respectively. The genetic distance and polymorphism information content values were 0.16-0.76 and 0.15-0.72, respectively. The phylogenetic relationships among the 24 accessions were estimated using a dendrogram, which classified them into four clades. Except for clade III, the remaining three clades included the same species, confirming interspecies genetic correlation. These 18 CAPS and HRM markers might be helpful for genetic diversity assessment and rapid identification of licorice species.
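
    The abstract above reports major allele frequencies and polymorphism information content (PIC) without giving a formula. The sketch below uses the classical definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; that the authors used exactly this formula is an assumption, and the frequencies are hypothetical.

```python
from itertools import combinations


def major_allele_frequency(freqs):
    """Frequency of the most common allele at a marker."""
    return max(freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


# Hypothetical marker with two equally frequent alleles.
toy_freqs = [0.5, 0.5]
print(major_allele_frequency(toy_freqs))
print(polymorphism_information_content(toy_freqs))
```

    For a biallelic marker the PIC peaks at 0.375 when both alleles are equally frequent, which is why markers with more alleles are needed to reach higher values.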

  4. Genome-wide association study identifies polymorphisms associated with the analgesic effect of fentanyl in the preoperative cold pressor-induced pain test

    Directory of Open Access Journals (Sweden)

    Kaori Takahashi

    2018-03-01

    Full Text Available Opioid analgesics are widely used for the treatment of moderate to severe pain. The analgesic effects of opioids are well known to vary among individuals. The present study focused on the genetic factors that are associated with interindividual differences in pain and opioid sensitivity. We conducted a multistage genome-wide association study in subjects who were scheduled to undergo mandibular sagittal split ramus osteotomy and were not medicated until they received fentanyl for the induction of anesthesia. We preoperatively conducted the cold pressor-induced pain test before and after fentanyl administration. The rs13093031 and rs12633508 single-nucleotide polymorphisms (SNPs) near the LOC728432 gene region and rs6961071 SNP in the tcag7.1213 gene region were significantly associated with the analgesic effect of fentanyl, based on differences in pain perception latency before and after fentanyl administration. The associations of these three SNPs that were identified in our exploratory study have not been previously reported. The two polymorphic loci (rs13093031 and rs12633508) were shown to be in strong linkage disequilibrium. Subjects with the G/G genotype of the rs13093031 and rs6961071 SNPs presented lower fentanyl-induced analgesia. Our findings provide a basis for investigating genetics-based analgesic sensitivity and personalized pain control. Keywords: Opioid sensitivity, Analgesia, Fentanyl, Polymorphism, GWAS

  5. Bos taurus strain:dairy beef (cattle): 1000 Bull Genomes Run 2, Bovine Whole Genome Sequence

    NARCIS (Netherlands)

    Bouwman, A.C.; Daetwyler, H.D.; Chamberlain, Amanda J.; Ponce, Carla Hurtado; Sargolzaei, Mehdi; Schenkel, Flavio S.; Sahana, Goutam; Govignon-Gion, Armelle; Boitard, Simon; Dolezal, Marlies; Pausch, Hubert; Brøndum, Rasmus F.; Bowman, Phil J.; Thomsen, Bo; Guldbrandtsen, Bernt; Lund, Mogens S.; Servin, Bertrand; Garrick, Dorian J.; Reecy, James M.; Vilkki, Johanna; Bagnato, Alessandro; Wang, Min; Hoff, Jesse L.; Schnabel, Robert D.; Taylor, Jeremy F.; Vinkhuyzen, Anna A.E.; Panitz, Frank; Bendixen, Christian; Holm, Lars-Erik; Gredler, Birgit; Hozé, Chris; Boussaha, Mekki; Sanchez, Marie Pierre; Rocha, Dominique; Capitan, Aurelien; Tribout, Thierry; Barbat, Anne; Croiseau, Pascal; Drögemüller, Cord; Jagannathan, Vidhya; Vander Jagt, Christy; Crowley, John J.; Bieber, Anna; Purfield, Deirdre C.; Berry, Donagh P.; Emmerling, Reiner; Götz, Kay Uwe; Frischknecht, Mirjam; Russ, Ingolf; Sölkner, Johann; Tassell, van Curtis P.; Fries, Ruedi; Stothard, Paul; Veerkamp, R.F.; Boichard, Didier; Goddard, Mike E.; Hayes, Ben J.

    2014-01-01

    Whole genome sequence data (BAM format) of 234 bovine individuals aligned to UMD3.1. The aim of the study was to identify genetic variants (SNPs and indels) for downstream analysis such as imputation, GWAS, and detection of lethal recessives. Additional sequences for later 1000 bull genomes runs can

  6. Dynamics of Indel Profiles Induced by Various CRISPR/Cas9 Delivery Methods

    DEFF Research Database (Denmark)

    Kosicki, Michael; Rajan, Sandeep S; Lorenzetti, Flaminia C

    2017-01-01

    The introduction of CRISPR/Cas9 gene editing in mammalian cells is a scientific breakthrough, which has greatly affected basic research and gene therapy. The simplicity and general access to CRISPR/Cas9 reagents has in an unprecedented manner "democratized" gene targeting in biomedical research...... approach. In this study we review the most commonly used indel detection methods and using a robust, sensitive, and cost efficient Indel Detection by Amplicon Analysis method, we have investigated the impact of the most commonly used CRISPR/Cas9 delivery formats, including lentivirus transduction, plasmid...

  7. Signatures of selection in the Iberian honey bee (Apis mellifera iberiensis) revealed by a genome scan analysis of single nucleotide polymorphisms.

    Science.gov (United States)

    Chávez-Galarza, Julio; Henriques, Dora; Johnston, J Spencer; Azevedo, João C; Patton, John C; Muñoz, Irene; De la Rúa, Pilar; Pinto, M Alice

    2013-12-01

    Understanding the genetic mechanisms of adaptive population divergence is one of the most fundamental endeavours in evolutionary biology and is becoming increasingly important as it will allow predictions about how organisms will respond to global environmental crisis. This is particularly important for the honey bee, a species of unquestionable ecological and economical importance that has been exposed to increasing human-mediated selection pressures. Here, we conducted a single nucleotide polymorphism (SNP)-based genome scan in honey bees collected across an environmental gradient in Iberia and used four FST -based outlier tests to identify genomic regions exhibiting signatures of selection. Additionally, we analysed associations between genetic and environmental data for the identification of factors that might be correlated or act as selective pressures. With these approaches, 4.4% (17 of 383) of outlier loci were cross-validated by four FST -based methods, and 8.9% (34 of 383) were cross-validated by at least three methods. Of the 34 outliers, 15 were found to be strongly associated with one or more environmental variables. Further support for selection, provided by functional genomic information, was particularly compelling for SNP outliers mapped to different genes putatively involved in the same function such as vision, xenobiotic detoxification and innate immune response. This study enabled a more rigorous consideration of selection as the underlying cause of diversity patterns in Iberian honey bees, representing an important first step towards the identification of polymorphisms implicated in local adaptation and possibly in response to recent human-mediated environmental changes. © 2013 John Wiley & Sons Ltd.

  8. Comprehensive search for intra- and inter-specific sequence polymorphisms among coding envelope genes of retroviral origin found in the human genome: genes and pseudogenes

    Directory of Open Access Journals (Sweden)

    Vasilescu Alexandre

    2005-09-01

    Full Text Available Abstract Background The human genome carries a high load of proviral-like sequences, called Human Endogenous Retroviruses (HERVs), which are the genomic traces of ancient infections by active retroviruses. These elements are in most cases defective, but open reading frames can still be found for the retroviral envelope gene, with sixteen such genes identified so far. Several of them are conserved during primate evolution, having possibly been co-opted by their host for a physiological role. Results To characterize further their status, we presently sequenced 12 of these genes from a panel of 91 Caucasian individuals. Genomic analyses reveal strong sequence conservation (only two non synonymous Single Nucleotide Polymorphisms [SNPs]) for the two HERV-W and HERV-FRD envelope genes, i.e. for the two genes specifically expressed in the placenta and possibly involved in syncytiotrophoblast formation. We further show – using an ex vivo fusion assay for each allelic form – that none of these SNPs impairs the fusogenic function. The other envelope proteins disclose variable polymorphisms, with the occurrence of a stop codon and/or frameshift for most – but not all – of them. Moreover, the sequence conservation analysis of the orthologous genes that can be found in primates shows that three env genes have been maintained in a fully coding state throughout evolution including envW and envFRD. Conclusion Altogether, the present study strongly suggests that some but not all envelope encoding sequences are bona fide genes. It also provides new tools to elucidate the possible role of endogenous envelope proteins as susceptibility factors in a number of pathologies where HERVs have been suspected to be involved.

  9. Forensic performance of Investigator DIPplex indels genotyping kit in native, immigrant, and admixed populations in South Africa.

    Science.gov (United States)

    Hefke, Gwynneth; Davison, Sean; D'Amato, Maria Eugenia

    2015-12-01

    The utilization of binary markers in human individual identification is gaining ground in forensic genetics. We analyzed the polymorphisms from the first commercial indel kit Investigator DIPplex (Qiagen) in 512 individuals from Afrikaner, Indian, admixed Cape Colored, and the native Bantu Xhosa and Zulu origin in South Africa and evaluated forensic and population genetics parameters for their forensic application in South Africa. The levels of genetic diversity in population and forensic parameters in South Africa are similar to other published data, with lower diversity values for the native Bantu. Departures from Hardy-Weinberg expectations were observed in HLD97 in Indians, Admixed and Bantus, along with 6.83% null homozygotes in the Bantu populations. Sequencing of the flanking regions showed a previously reported transition G>A in rs17245568. Strong population structure was detected with Fst, AMOVA, and the Bayesian unsupervised clustering method in STRUCTURE. Therefore we evaluated the efficiency of individual assignments to population groups using the ancestral membership proportions from STRUCTURE and the Bayesian classification algorithm in Snipper App Suite. Both methods showed low cross-assignment error (0-4%) between Bantus and either Afrikaners or Indians. The differentiation between populations seems to be driven by four loci under positive selection pressure. Based on these results, we draw recommendations for the application of this kit in SA. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
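
    The abstract above reports departures from Hardy-Weinberg expectations at a biallelic indel locus. A minimal sketch of the textbook chi-square check (1 degree of freedom) follows. Forensic studies often use exact tests instead, and this sketch does not model null alleles such as those the abstract mentions, so it is an illustration under those assumptions; the genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_ins_ins, n_ins_del, n_del_del):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus.

    Arguments are observed genotype counts (insertion homozygotes,
    heterozygotes, deletion homozygotes). Returns (chi2, p_value).
    """
    n = n_ins_ins + n_ins_del + n_del_del
    p = (2 * n_ins_ins + n_ins_del) / (2 * n)  # insertion allele frequency
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ins_ins, n_ins_del, n_del_del)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: perfectly proportional, then a strong heterozygote deficit.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

    A heterozygote deficit like the second toy case is the pattern that undetected null alleles tend to produce, which is why such loci are usually inspected by sequencing the flanking regions.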

  10. Typing of 30 insertion/deletions in Danes using the first commercial indel kit-Mentype(®) DIPplex

    DEFF Research Database (Denmark)

    Friis, Susanne Lunøe; Børsting, Claus; Rockenbauer, Eszter

    2012-01-01

    and all amplicon lengths were shorter than 160bp. Full indel profiles were generated from as little as 100pg of DNA. A total of 117 individuals from Danish paternity cases were successfully typed. No deviation from Hardy-Weinberg equilibrium was observed for any of the indels. The combined mean match...

  11. Genome-wide characterization of microsatellites in Cucumis hystrix and in silico identification of polymorphic SSR markers

    Science.gov (United States)

    Cucumis hystrix (2n = 2x = 24, genome HH) is a wild relative of cucumber (C. sativus L., 2n = 2x = 14) that possesses multiple disease resistances and has a great potential for cucumber improvement. Despite its importance, there is no genomic resource currently available for C. hystrix. To expedite ...

  12. Detection of Ribosomal DNA Sequence Polymorphisms in the Protist Plasmodiophora brassicae for the Identification of Geographical Isolates

    Directory of Open Access Journals (Sweden)

    Rawnak Laila

    2017-01-01

    Full Text Available Clubroot is a soil-borne disease caused by the protist Plasmodiophora brassicae (P. brassicae). It is one of the most economically important diseases of Brassica rapa and other cruciferous crops as it can cause remarkable yield reductions. Understanding P. brassicae genetics, and developing efficient molecular markers, is essential for effective detection of harmful races of this pathogen. Samples from 11 Korean field populations of P. brassicae (geographic isolates), collected from nine different locations in South Korea, were used in this study. Genomic DNA was extracted from the clubroot-infected samples to sequence the ribosomal DNA. Primers and probes for P. brassicae were designed using a ribosomal DNA gene sequence from a Japanese strain available in GenBank (accession number AB526843; isolate NGY). The nuclear ribosomal DNA (rDNA) sequence of P. brassicae, comprising 6932 base pairs (bp), was cloned and sequenced and found to include the small subunits (SSUs) and a large subunit (LSU), internal transcribed spacers (ITS1 and ITS2), and a 5.8S. Sequence variation was observed in both the SSU and LSU. Four markers showed useful differences in high-resolution melting analysis to identify nucleotide polymorphisms including single-nucleotide polymorphisms (SNPs), oligonucleotide polymorphisms, and insertions/deletions (InDels). A combination of three markers was able to distinguish the geographical isolates into two groups.

  13. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper.

    Science.gov (United States)

    Manivannan, Abinaya; Kim, Jin-Hee; Yang, Eun-Young; Ahn, Yul-Kyun; Lee, Eun-Su; Choi, Sena; Kim, Do-Sun

    2018-01-01

    Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in cuisines worldwide. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be used to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has transformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids a deeper understanding of several molecular mechanisms behind vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, single nucleotide polymorphisms (SNPs) offer various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided here can be utilized for the improvement of pepper breeding in the future.

  14. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper

    Directory of Open Access Journals (Sweden)

    Abinaya Manivannan

    2018-01-01

    Full Text Available Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in cuisines worldwide. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be used to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has transformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids a deeper understanding of several molecular mechanisms behind vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, single nucleotide polymorphisms (SNPs) offer various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches in the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided here can be utilized for the improvement of pepper breeding in the future.

  15. The humankind genome: from genetic diversity to the origin of human diseases.

    Science.gov (United States)

    Belizário, Jose E

    2013-12-01

    Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of coding and noncoding RNAs (transcriptome) being transcribed. These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on humans and model organism phenotypes, we will be able to find candidate genes and new clues for disease's etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.

  16. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. 2014 Elsevier Ltd. All rights reserved.

  17. Detection of Hereditary 1,25-Hydroxyvitamin D-Resistant Rickets Caused by Uniparental Disomy of Chromosome 12 Using Genome-Wide Single Nucleotide Polymorphism Array.

    Directory of Open Access Journals (Sweden)

    Mayuko Tamura

    Full Text Available Hereditary 1,25-dihydroxyvitamin D-resistant rickets (HVDRR) is an autosomal recessive disease caused by biallelic mutations in the vitamin D receptor (VDR) gene. No patients have been reported with uniparental disomy (UPD). We used a genome-wide single nucleotide polymorphism (SNP) array to confirm whether HVDRR was caused by UPD of chromosome 12. A 2-year-old girl with alopecia and short stature and without any family history of consanguinity was diagnosed with HVDRR by typical laboratory data findings and clinical features of rickets. Sequence analysis of VDR was performed, and the origin of the homozygous mutation was investigated by target SNP sequencing, short tandem repeat analysis, and genome-wide SNP array. The patient had a homozygous p.Arg73Ter nonsense mutation. Her mother was heterozygous for the mutation, but her father was negative. We excluded gross deletion of the father's allele or paternal discordance. Genome-wide SNP array of the family (the patient and her parents) showed complete maternal isodisomy of chromosome 12. She was successfully treated with high-dose oral calcium. This is the first report of HVDRR caused by UPD, and the third case of complete UPD of chromosome 12, in the published literature. Genome-wide SNP array was useful for detecting isodisomy and the parental origin of the allele. Comprehensive examination of the homozygous state is essential for accurate genetic counseling of recurrence risk and appropriate monitoring for other chromosome 12 related disorders. Furthermore, oral calcium therapy was effective as an initial treatment for rickets in this instance.

  18. Nucleotide polymorphisms and haplotype diversity of RTCS gene in China elite maize inbred lines.

    Directory of Open Access Journals (Sweden)

    Enying Zhang

    Full Text Available The maize RTCS gene, encoding a LOB domain transcription factor, plays important roles in the initiation of embryonic seminal and postembryonic shoot-borne roots. In this study, the genomic sequences of this gene in 73 China elite inbred lines, including 63 lines from 5 temperate heterotic groups and 10 tropical germplasms, were obtained, and the nucleotide polymorphisms and haplotype diversity were detected. A total of 63 sequence variants, including 44 SNPs and 19 indels, were identified at this locus, and most of them were found to be located in the UTR and intron regions. The coding region of this gene in all tested inbred lines carried 14 haplotypes, which encode 7 differing RTCS proteins. Analysis of the polymorphism sites revealed that at least 6 recombination events have occurred. Among all 6 groups tested, only the P heterotic group had a much lower nucleotide diversity than the whole set, and selection analysis also revealed that only this group was under strong negative selection. However, the set of Huangzaosi and its derived lines possessed a higher nucleotide diversity than the whole set, and no selection signals were identified.
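
    The diversity statistics named in the abstract above can be illustrated with a minimal sketch. It assumes equal-length, pre-aligned sequences without gaps (so indels are ignored), and the function names `nucleotide_diversity` and `count_haplotypes` are hypothetical rather than taken from the study. Nucleotide diversity is computed here as the mean number of pairwise differences per site, which may differ in detail from the estimator the authors used.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site over all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(sequences, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * len(sequences[0]))


def count_haplotypes(sequences):
    """Number of distinct sequences (haplotypes) in the sample."""
    return len(set(sequences))


# Toy alignment (made-up data, not from the study).
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
print(nucleotide_diversity(toy), count_haplotypes(toy))
```

    Comparing the value for one heterotic group against the value for the whole set is the kind of contrast the abstract reports.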

  19. An InDel in the Promoter of Al-ACTIVATED MALATE TRANSPORTER9 Selected during Tomato Domestication Determines Fruit Malate Contents and Aluminum Tolerance[OPEN

    Science.gov (United States)

    Wang, Xin; Hu, Tixu; Zhang, Fengxia; Wang, Bing; Li, Changxin; Yang, Tianxia; Li, Hanxia; Lu, Yongen; Ye, Zhibiao

    2017-01-01

    Deciphering the mechanism of malate accumulation in plants would contribute to a greater understanding of plant chemistry, which has implications for improving flavor quality in crop species and enhancing human health benefits. However, the regulation of malate metabolism is poorly understood in crops such as tomato (Solanum lycopersicum). Here, we integrated a metabolite-based genome-wide association study with linkage mapping and gene functional studies to characterize the genetics of malate accumulation in a global collection of tomato accessions with broad genetic diversity. We report that TFM6 (tomato fruit malate 6), which corresponds to Al-ACTIVATED MALATE TRANSPORTER9 (Sl-ALMT9 in tomato), is the major quantitative trait locus responsible for variation in fruit malate accumulation among tomato genotypes. A 3-bp indel in the promoter region of Sl-ALMT9 was linked to high fruit malate content. Further analysis indicated that this indel disrupts a W-box binding site in the Sl-ALMT9 promoter, which prevents binding of the WRKY transcription repressor Sl-WRKY42, thereby alleviating the repression of Sl-ALMT9 expression and promoting high fruit malate accumulation. Evolutionary analysis revealed that this highly expressed Sl-ALMT9 allele was selected for during tomato domestication. Furthermore, vacuole membrane-localized Sl-ALMT9 increases in abundance following Al treatment, thereby elevating malate transport and enhancing Al resistance. PMID:28814642

  20. Identification and analysis of Single Nucleotide Polymorphisms (SNPs) in the mosquito Anopheles funestus, malaria vector

    Directory of Open Access Journals (Sweden)

    Hemingway Janet

    2007-01-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in eukaryotic species and have become an important marker for genetic studies. The mosquito Anopheles funestus is one of the major malaria vectors in Africa and yet, prior to this study, no SNPs have been described for this species. Here we report a genome-wide set of SNP markers for use in genetic studies on this important human disease vector. Results DNA fragments from 50 genes were amplified and sequenced from 21 specimens of An. funestus. A third of specimens were field collected in Malawi, a third from a colony of Mozambican origin and a third from a colony of Angolan origin. A total of 494 SNPs including 303 within the coding regions of genes and 5 indels were identified. The physical positions of these SNPs in the genome are known. There were on average 7 SNPs per kilobase similar to that observed in An. gambiae and Drosophila melanogaster. Transitions outnumbered transversions, at a ratio of 2:1. The increased frequency of transition substitutions in coding regions is likely due to the structure of the genetic code and selective constraints. Synonymous sites within coding regions showed a higher polymorphism rate than non-coding introns or 3' and 5' flanking DNA with most of the substitutions in coding regions being observed at the 3rd codon position. A positive correlation in the level of polymorphism was observed between coding and non-coding regions within a gene. By genotyping a subset of 30 SNPs, we confirmed the validity of the SNPs identified during this study. Conclusion This set of SNP markers represents a useful tool for genetic studies in An. funestus, and will be useful in identifying candidate genes that affect diverse ranges of phenotypes that impact on vector control, such as insecticide resistance, mosquito behavior and vector competence.
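
    The transition/transversion ratio and SNP density reported above follow from simple counting, sketched below. A transition swaps a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T); any other single-base change is a transversion. The function names and the toy SNP list are assumptions made for illustration and do not come from the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    counts = Counter(classify_substitution(r, a) for r, a in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return counts["transition"] / counts["transversion"]


def snps_per_kb(n_snps, sequenced_bp):
    """SNP density per 1000 sequenced base pairs."""
    return 1000.0 * n_snps / sequenced_bp


# Toy data: two transitions and one transversion give a 2:1 ratio.
toy_snps = [("A", "G"), ("C", "T"), ("A", "T")]
print(ts_tv_ratio(toy_snps))
```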

  1. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  2. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  3. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    Directory of Open Access Journals (Sweden)

    Luca Bianco

    Full Text Available High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  4. Development and Validation of a 20K Single Nucleotide Polymorphism (SNP) Whole Genome Genotyping Array for Apple (Malus × domestica Borkh)

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs. PMID:25303088

  5. Evaluation and optimisation of indel detection workflows for ion torrent sequencing of the BRCA1 and BRCA2 genes.

    Science.gov (United States)

    Yeo, Zhen Xuan; Wong, Joshua Chee Leong; Rozen, Steven G; Lee, Ann Siew Gek

    2014-06-24

    The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM's reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting insertions and deletions (indels), which may hinder its utility for clinical genetic testing. Recently, the proprietary analytical workflow for the Ion Torrent sequencer, Torrent Suite (TS), underwent a series of upgrades. We evaluated three major upgrades of TS by calling indels in the BRCA1 and BRCA2 genes. Our analysis revealed that false negative indels could be generated by TS under both default calling parameters and parameters adjusted for maximum sensitivity. However, indel calling with the same data using the open source variant callers, GATK and SAMtools showed that false negatives could be minimised with the use of appropriate bioinformatics analysis. Furthermore, we identified two variant calling measures, Quality-by-Depth (QD) and VARiation of the Width of gaps and inserts (VARW), which substantially reduced false positive indels, including non-homopolymer associated errors without compromising sensitivity. In our best case scenario that involved the TMAP aligner and SAMtools, we achieved 100% sensitivity, 99.99% specificity and 29% False Discovery Rate (FDR) in indel calling from all 23 samples, which is a good performance for mutation screening using PGM. New versions of TS, BWA and GATK have shown improvements in indel calling sensitivity and specificity over their older counterpart. However, the variant caller of TS exhibits a lower sensitivity than GATK and SAMtools. Our findings demonstrate that although indel calling from PGM sequences may appear to be noisy at first glance, proper computational indel calling analysis is able to maximize both the sensitivity and specificity at the single base level, paving the way for the usage of this technology
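
    The abstract above reports high sensitivity and specificity alongside a sizeable False Discovery Rate. The sketch below shows how the three measures are defined and why they can coexist: when specificity is computed per base, true negatives vastly outnumber calls, so a handful of false positives barely moves specificity while still making up a large share of all positive calls. The counts in the example are hypothetical, chosen only to show the effect, and are not the study's data.

```python
def indel_calling_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and false discovery rate from a confusion matrix.

    tp/fp: true/false positive calls; tn/fn: true/false negatives.
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a metric denominator is zero")
    return {
        "sensitivity": tp / (tp + fn),   # fraction of real indels that were called
        "specificity": tn / (tn + fp),   # fraction of non-indel sites left uncalled
        "fdr": fp / (tp + fp),           # fraction of calls that are wrong
    }


# Hypothetical counts: 5 real indels all found, 2 spurious calls,
# and 19,998 correctly uncalled reference positions.
metrics = indel_calling_metrics(tp=5, fp=2, tn=19998, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

    With these made-up counts, sensitivity is 1.0 and specificity is 0.9999, yet two of the seven calls are false, so the FDR is about 0.29.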

  6. Genome-wide association study identifies single-nucleotide polymorphism in KCNB1 associated with left ventricular mass in humans: The HyperGEN Study

    Directory of Open Access Journals (Sweden)

    Kraemer Rachel

    2009-05-01

    Full Text Available Abstract Background We conducted a genome-wide association study (GWAS) and validation study for left ventricular (LV) mass in the Family Blood Pressure Program – HyperGEN population. LV mass is a sensitive predictor of cardiovascular mortality and morbidity in all genders, races, and ages. Polymorphisms of candidate genes in diverse pathways have been associated with LV mass. However, subsequent studies have often failed to replicate these associations. Genome-wide association studies have unprecedented power to identify potential genes with modest effects on LV mass. We describe here a GWAS for LV mass in Caucasians using the Affymetrix GeneChip Human Mapping 100 k Set. Cases (N = 101) and controls (N = 101) were selected from extreme tails of the LV mass index distribution from 906 individuals in the HyperGEN study. Eleven of 12 promising (Q Results Despite the relatively small sample, we identified 12 promising SNPs in the GWAS. Eleven SNPs were successfully genotyped in the validation study of 704 Caucasians and 1467 African Americans; 5 SNPs on chromosomes 5, 12, and 20 were significantly (P ≤ 0.05) associated with LV mass after correction for multiple testing. One SNP (rs756529) is intragenic within KCNB1, which is dephosphorylated by calcineurin, a previously reported candidate gene for LV hypertrophy within this population. Conclusion These findings suggest KCNB1 may be involved in the development of LV hypertrophy in humans.

  7. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata

    Directory of Open Access Journals (Sweden)

    Meghann K. Devlin-Durante

    2017-11-01

    Full Text Available The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  8. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata.

    Science.gov (United States)

    Devlin-Durante, Meghann K; Baums, Iliana B

    2017-01-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species.

  9. Genome editing using FACS enrichment of nuclease-expressing cells and indel detection by amplicon analysis

    DEFF Research Database (Denmark)

    Lonowski, Lindsey A; Narimatsu, Yoshiki; Riaz, Anjum

    2017-01-01

    , FACS enrichment of cells expressing nucleases linked to fluorescent proteins can be used to maximize knockout or knock-in editing efficiencies or to balance editing efficiency and toxic/off-target effects. The two methods can be combined to form a pipeline for cell-line editing that facilitates...

  10. Comprehensive identification of single nucleotide polymorphisms associated with beta-lactam resistance within pneumococcal mosaic genes.

    Directory of Open Access Journals (Sweden)

    Claire Chewapreecha

    2014-08-01

    Full Text Available Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of "mosaic genes" as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.

  11. The South Asian genome.

    Directory of Open Access Journals (Sweden)

    John C Chambers

    Full Text Available The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians.

  12. Characterization of the Gray Whale Eschrichtius robustus Genome and a Genotyping Array Based on Single-Nucleotide Polymorphisms in Candidate Genes.

    Science.gov (United States)

    DeWoody, J Andrew; Fernandez, Nadia B; Brüniche-Olsen, Anna; Antonides, Jennifer D; Doyle, Jacqueline M; San Miguel, Phillip; Westerman, Rick; Vertyankin, Vladimir V; Godard-Codding, Céline A J; Bickham, John W

    2017-06-01

    Genetic and genomic approaches have much to offer in terms of ecology, evolution, and conservation. To better understand the biology of the gray whale Eschrichtius robustus (Lilljeborg, 1861), we sequenced the genome and produced an assembly that contains ∼95% of the genes known to be highly conserved among eukaryotes. From this assembly, we annotated 22,711 genes and identified 2,057,254 single-nucleotide polymorphisms (SNPs). Using this assembly, we generated a curated list of candidate genes potentially subject to strong natural selection, including genes associated with osmoregulation, oxygen binding and delivery, and other aspects of marine life. From these candidate genes, we queried 92 autosomal protein-coding markers with a panel of 96 SNPs that also included 2 sexing and 2 mitochondrial markers. Genotyping error rates, calculated across loci and across 69 intentional replicate samples, were low (0.021%), and observed heterozygosity was 0.33 averaged over all autosomal markers. This level of variability provides substantial discriminatory power across loci (mean probability of identity of 1.6 × 10^-25 and mean probability of exclusion >0.999 with neither parent known), indicating that these markers provide a powerful means to assess parentage and relatedness in gray whales. We found 29 unique multilocus genotypes represented among our 36 biopsies (indicating that we inadvertently sampled 7 whales twice). In total, we compiled an individual data set of 28 western gray whales (WGWs) and 1 presumptive eastern gray whale (EGW). The lone EGW we sampled was no more or less related to the WGWs than expected by chance alone. The gray whale genomes reported here will enable comparative studies of natural selection in cetaceans, and the SNP markers should be highly informative for future studies of gray whale evolution, population structure, demography, and relatedness.
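
    The probability of identity mentioned above is the chance that two individuals share the same multilocus genotype. A minimal sketch follows, assuming unrelated individuals, Hardy-Weinberg proportions, and independent loci; under those assumptions the per-locus value is the sum of squared genotype frequencies and the multilocus value is the product across loci. Whether the study used exactly this estimator is not stated in the abstract, and the function names and allele frequencies below are hypothetical.

```python
import math


def locus_probability_of_identity(allele_freqs):
    """Sum of squared expected genotype frequencies at one locus."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(allele_freqs)
    homozygotes = sum(p ** 4 for p in allele_freqs)
    heterozygotes = sum(
        (2 * allele_freqs[i] * allele_freqs[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return homozygotes + heterozygotes


def multilocus_probability_of_identity(loci):
    """Product of per-locus values, assuming the loci are independent."""
    return math.prod(locus_probability_of_identity(f) for f in loci)


# Hypothetical panel: 92 biallelic SNPs, each with allele frequencies 0.8/0.2.
panel = [[0.8, 0.2]] * 92
print(f"{multilocus_probability_of_identity(panel):.3e}")
```

    Because the per-locus value is always below 1 for a polymorphic marker, the product shrinks rapidly as markers are added, which is why a panel of roughly a hundred SNPs can discriminate individuals so strongly.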

  13. A whole genome association study to detect additive and dominant single nucleotide polymorphisms for growth and carcass traits in Korean native cattle, Hanwoo

    Directory of Open Access Journals (Sweden)

    Yi Li

    2017-01-01

    Objective: A whole genome association study was conducted to identify single nucleotide polymorphisms (SNPs) with additive and dominant effects for growth and carcass traits in Korean native cattle, Hanwoo. Methods: The data set comprised 61 sires and their 486 Hanwoo steers that were born between spring of 2005 and fall of 2007. The steers were genotyped with the 35,968 SNPs that were embedded in the Illumina bovine SNP 50K beadchip, and six growth and carcass quality traits were measured for the steers. A series of lack-of-fit tests between the models was applied to classify gene expression pattern as additive or dominant. Results: A total of 18 (0), 15 (3), 12 (8), 15 (18), 11 (7), and 21 (1) SNPs were detected at the 5% chromosome (genome)-wise level for weaning weight (WWT), yearling weight (YWT), carcass weight (CWT), backfat thickness (BFT), longissimus dorsi muscle area (LMA) and marbling score, respectively. Among the 129 significant SNPs, 56 SNPs had additive effects, 20 SNPs dominance effects, and 53 SNPs both additive and dominance effects, suggesting that dominance inheritance mode be considered in genetic improvement for growth and carcass quality in Hanwoo. The significant SNPs were located at 33 quantitative trait locus (QTL) regions on 18 Bos taurus chromosomes (i.e., BTA 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 16, 17, 18, 20, 23, 26, 28, and 29). There is strong evidence that BTA14 is the key chromosome affecting CWT. Also, BTA20 is the key chromosome for almost all traits measured (WWT, YWT, LMA). Conclusion: The application of various additive and dominance SNP models enabled better characterization of SNP inheritance mode for growth and carcass quality traits in Hanwoo, and many of the detected SNPs or QTL had dominance effects, suggesting that dominance be considered for the whole-genome SNPs data and implementation of successive molecular breeding schemes in Hanwoo.
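
    The distinction between additive and dominance effects in the record above can be illustrated with the textbook genotype-mean decomposition. This sketch is not the lack-of-fit testing procedure used in the study. It only computes the additive effect a and the dominance deviation d from the three genotype class means. The genotype coding (0, 1, 2 copies of one allele) and the function name are assumptions made for illustration.

```python
from statistics import mean


def additive_dominance(genotypes: list[int], phenotypes: list[float]) -> tuple[float, float]:
    """Return (a, d) for one SNP.

    genotypes: 0, 1 or 2 copies of the counted allele per animal.
    a = half the difference between the two homozygote means.
    d = heterozygote mean minus the midpoint of the homozygote means.
    """
    groups: dict[int, list[float]] = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    if any(not values for values in groups.values()):
        raise ValueError("all three genotype classes must be observed")
    m0, m1, m2 = (mean(groups[k]) for k in (0, 1, 2))
    a = (m2 - m0) / 2
    d = m1 - (m0 + m2) / 2
    return a, d
```

    A value of d near zero is consistent with purely additive inheritance, while d close to a in magnitude points to complete dominance. Deciding between the two in practice needs a formal test such as the one the record describes.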

  14. Prevalence of IFNL3 gene polymorphism among blood donors and its relation to genomic profile of ancestry in Brazil.

    Science.gov (United States)

    Rizzo, Silvia Renata Cornelio Parolin; Gazito, Diana; Pott-Junior, Henrique; Latini, Flavia Roche Moreira; Castelo, Adauto

    The recent development of interferon-free regimens based on direct-acting antivirals for the treatment of chronic hepatitis C virus infection has benefited many but not all patients. Some patients still experience treatment failure, possibly attributed to unknown host and viral factors, such as IFNL3 gene polymorphism. The present study assessed the prevalence of rs12979860-CC, rs12979860-CT, and rs12979860-TT genotypes of the IFNL3 gene, and its relationship with ancestry informative markers in 949 adult Brazilian healthy blood donors. Race was analyzed using ancestry informative markers as a surrogate for ancestry. IFNL3 gene was genotyped using the ABI TaqMan single nucleotide polymorphisms genotyping assays. The overall frequency of rs12979860-CC genotype was 36.9%. The contribution of African ancestry was significantly higher among donors from the northeast region in relation to southeast donors, whereas the influence of European ancestry was significantly higher in southeast donors. Donors with rs12979860-CC and rs12979860-CT genotypes had similar ancestry background. The contribution of African ancestry was higher among rs12979860-TT genotype donors in comparison to both rs12979860-CC and rs12979860-CT genotypes. The prevalence of rs12979860-CC genotype is similar to that found in the US, despite the Brazilian ancestry informative markers admixture. However, in terms of ancestry, rs12979860-CT genotype was much closer to rs12979860-CC individuals than to rs12979860-TT. Copyright © 2016 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. All rights reserved.

  15. Landscape genomics and biased FST approaches reveal single nucleotide polymorphisms under selection in goat breeds of North-East Mediterranean

    Directory of Open Access Journals (Sweden)

    Joost Stephane

    2009-02-01

    Background: In this study we compare outlier loci detected using an FST-based method with those identified by a recently described method based on spatial analysis (SAM). We tested a panel of single nucleotide polymorphisms (SNPs) previously genotyped in individuals of goat breeds of southern areas of the Mediterranean basin (Italy, Greece and Albania). We evaluate how the SAM method performs with SNPs, which are increasingly employed due to their high number, low cost and ease of scoring. Results: The combined use of the two outlier detection approaches, never tested before using SNPs, resulted in the identification of the same three loci, involved in milk and meat quality, by both methods, while the FST-based method identified 3 more loci as under a selective sweep in the breeds examined. Conclusion: Data appear congruent by using the two methods for FST values exceeding the 99% confidence limits. The methods of FST and SAM can independently detect signatures of selection and therefore can reduce the probability of finding false positives if employed together. The outlier loci identified in this study could indicate adaptive variation in the analysed species, characterized by a large range of climatic conditions in the rearing areas and by a history of intense trade, that implies plasticity in adapting to new environments.
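
    The FST statistic that the outlier method above relies on can be sketched for a single biallelic SNP as the proportional loss of heterozygosity due to population subdivision, FST = (HT - HS) / HT. The sketch assumes equally weighted populations and known allele frequencies. It ignores the sample-size corrections that real estimators apply, and the function name is hypothetical.

```python
def fst_biallelic(pop_freqs: list[float]) -> float:
    """FST = (HT - HS) / HT for one biallelic locus.

    pop_freqs: frequency of the same allele in each population,
    with all populations weighted equally (a simplifying assumption).
    """
    n = len(pop_freqs)
    p_bar = sum(pop_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                     # total expected heterozygosity
    h_s = sum(2 * p * (1 - p) for p in pop_freqs) / n  # mean within-population value
    if h_t == 0:
        return 0.0  # locus is monomorphic overall, so FST is taken as 0
    return (h_t - h_s) / h_t
```

    An outlier scan then compares each locus against the distribution expected under neutrality. That simulation step is outside the scope of this sketch.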

  16. Isolation and Characterization of 13 New Polymorphic Microsatellite Markers in the Phaseolus vulgaris L. (Common Bean) Genome

    Directory of Open Access Journals (Sweden)

    Aihua Wang

    2012-09-01

    In this study, 13 polymorphic microsatellite markers were isolated from Phaseolus vulgaris L. (common bean) by using the Fast Isolation by AFLP of Sequence COntaining Repeats (FIASCO) protocol. These markers revealed two to seven alleles, with an average of 3.64 alleles per locus. The polymorphic information content (PIC) values ranged from 0.055 to 0.721 over 13 loci, with a mean value of 0.492, and 7 loci had PIC greater than 0.5. The expected heterozygosity (HE) and observed heterozygosity (HO) levels ranged from 0.057 to 0.814 and from 0.026 to 0.531, respectively. Cross-species amplification of the 13 primer pairs was performed in the related species Vigna unguiculata L. Seven of these markers showed cross-species transferability. These markers will be useful for future genetic diversity and population genetics studies of this agricultural species and its related species.
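
    The PIC and expected heterozygosity values reported above are both functions of the allele frequencies at a locus. A minimal sketch of the standard formulas follows: HE = 1 - Σ p_i² and the Botstein et al. form PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The allele frequencies are assumed to be given and to sum to 1, and the function names are illustrative only.

```python
from itertools import combinations


def expected_heterozygosity(freqs: list[float]) -> float:
    """HE = 1 - sum of squared allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def pic(freqs: list[float]) -> float:
    """Polymorphic information content for one locus.

    Subtracts from HE the share of matings that are uninformative
    because both parents carry the same heterozygous genotype.
    """
    homozygosity = sum(p * p for p in freqs)
    uninformative = sum(2 * (pi**2) * (pj**2) for pi, pj in combinations(freqs, 2))
    return 1 - homozygosity - uninformative
```

    PIC is never larger than HE for the same locus, which matches the ordering of the ranges quoted in the record (PIC up to 0.721, HE up to 0.814).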

  17. The Qatar genome: a population-specific tool for precision medicine in the Middle East

    Science.gov (United States)

    Fakhro, Khalid A; Staudt, Michelle R; Ramstetter, Monica Denise; Robay, Amal; Malek, Joel A; Badii, Ramin; Al-Marri, Ajayeb Al-Nabet; Khalil, Charbel Abi; Al-Shakaki, Alya; Chidiac, Omar; Stadler, Dora; Zirie, Mahmoud; Jayyousi, Amin; Salit, Jacqueline; Mezey, Jason G; Crystal, Ronald G; Rodriguez-Flores, Juan L

    2016-01-01

    Reaching the full potential of precision medicine depends on the quality of personalized genome interpretation. In order to facilitate precision medicine in regions of the Middle East and North Africa (MENA), a population-specific genome for the indigenous Arab population of Qatar (QTRG) was constructed by incorporating allele frequency data from sequencing of 1,161 Qataris, representing 0.4% of the population. A total of 20.9 million single nucleotide polymorphisms (SNPs) and 3.1 million indels were observed in Qatar, including an average of 1.79% novel variants per individual genome. Replacement of the GRCh37 standard reference with QTRG in a best practices genome analysis workflow resulted in an average of 7× deeper coverage depth (an improvement of 23%) and 756,671 fewer variants on average, a reduction of 16% that is attributed to common Qatari alleles being present in QTRG. The benefit for using QTRG varies across ancestries, a factor that should be taken into consideration when selecting an appropriate reference for analysis. PMID:27408750

  18. Genome-wide analysis reveals signatures of selection for important traits in domestic sheep from different ecoregions.

    Science.gov (United States)

    Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin

    2016-11-03

    Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.

  19. A method for the analysis of 32 X chromosome insertion deletion polymorphisms in a single PCR

    DEFF Research Database (Denmark)

    Pereira, Rui; Pereira, Vania; Gomes, Iva

    2012-01-01

    ...-Indel multiplex system amplifying 32 biallelic markers in one single PCR. The multiplex includes X-Indels shown to be polymorphic in the major human population groups and follows a short amplicon strategy. The set was applied in the genetic characterization of sub-Saharan African, European and East Asian population samples and revealed high forensic efficiency, as measured by the accumulated power of discrimination (0.9999990 was the lowest value in males and 0.999999999998 was the highest in females); the mean exclusion chance varied between 0.998 and 0.9996 in duos and between 0.99997 and 0.999998 in trios...

  20. Genomic diversity of Mycobacterium tuberculosis Beijing strains isolated in Tuscany, Italy, based on large sequence deletions, SNPs in putative DNA repair genes and MIRU-VNTR polymorphisms.

    Science.gov (United States)

    Garzelli, Carlo; Lari, Nicoletta; Rindi, Laura

    2016-03-01

    The Beijing genotype of Mycobacterium tuberculosis is a cause of global concern as it is rapidly spreading worldwide, is considered hypervirulent, and is most often associated with massive spread of MDR/XDR TB, although these epidemiological or pathological properties have not been confirmed for all strains and in all geographic settings. In this paper, to gain new insights into the biogeographical heterogeneity of the Beijing family, we investigated a global sample of Beijing strains (22% from Italian-born, 78% from foreign-born patients) by determining large sequence polymorphism of regions RD105, RD181, RD150 and RD142, single nucleotide polymorphism of putative DNA repair genes mutT4 and mutT2 and MIRU-VNTR profiles based on 11 discriminative loci. We found that, although our sample of Beijing strains showed a considerable genomic heterogeneity, yielding both ancient and recent phylogenetic strains, the prevalent successful Beijing subsets were characterized by deletions of RD105 and RD181 and by one nucleotide substitution in one or both mutT genes. MIRU-VNTR analysis revealed 47 unique patterns and 9 clusters including a total of 33 isolates (41% of total isolates); the relatively high proportion of Italian-born Beijing TB patients, often occurring in mixed clusters, supports the possibility of an ongoing cross-transmission of the Beijing genotype to the autochthonous population. High rates of extra-pulmonary localization and drug-resistance, particularly MDR, frequently reported for Beijing strains in other settings, were not observed in our survey. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Makeup of the genetic correlation between milk production traits using genome-wide single nucleotide polymorphism information.

    Science.gov (United States)

    van Binsbergen, R; Veerkamp, R F; Calus, M P L

    2012-04-01

    The correlated responses between traits may differ depending on the makeup of genetic covariances, and may differ from the predictions of polygenic covariances. Therefore, the objective of the present study was to investigate the makeup of the genetic covariances between the well-studied traits: milk yield, fat yield, protein yield, and their percentages in more detail. Phenotypic records of 1,737 heifers of research farms in 4 different countries were used after homogenizing and adjusting for management effects. All cows had a genotype for 37,590 single nucleotide polymorphisms (SNP). A Bayesian stochastic search variable selection model was used to estimate the SNP effects for each trait. About 0.5 to 1.0% of the SNP had a significant effect on 1 or more traits; however, the SNP without a significant effect explained most of the genetic variances and covariances of the traits. Single nucleotide polymorphism correlations differed from the polygenic correlations, but only 10 regions were found with an effect on multiple traits; in 1 of these regions the DGAT1 gene was previously reported with an effect on multiple traits. This region explained up to 41% of the variances of 4 traits, explained a major part of the correlation between fat yield and fat percentage, and contributed to asymmetry in correlated response between fat yield and fat percentage. Overall, for the traits in this study, the infinitesimal model is expected to be sufficient for the estimation of the variances and covariances. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  2. Whole-Genome Sequencing for National Surveillance of Shigella flexneri

    Directory of Open Access Journals (Sweden)

    Marie A. Chattaway

    2017-09-01

    National surveillance of Shigella flexneri ensures the rapid detection of outbreaks to facilitate public health investigation and intervention strategies. In this study, we used whole-genome sequencing (WGS) to type S. flexneri in order to detect linked cases and support epidemiological investigations. We prospectively analyzed 330 isolates of S. flexneri received at the Gastrointestinal Bacteria Reference Unit at Public Health England between August 2015 and January 2016. Traditional phenotypic and WGS sub-typing methods were compared. PCR was carried out on isolates exhibiting phenotypic/genotypic discrepancies with respect to serotype. Phylogenetic relationships between isolates were analyzed by WGS using single nucleotide polymorphism (SNP) typing to facilitate cluster detection. For 306/330 (93%) isolates there was concordance between serotype derived from the genome and phenotypic serology. Discrepant results between the phenotypic and genotypic tests were attributed to novel O-antigen synthesis/modification gene combinations or indels identified in O-antigen synthesis/modification genes rendering them dysfunctional. SNP typing identified 36 clusters of two isolates or more. WGS provided microbiological evidence of epidemiologically linked clusters and detected novel O-antigen synthesis/modification gene combinations associated with two outbreaks. WGS provided reliable and robust data for monitoring trends in the incidence of different serotypes over time. SNP typing can be used to facilitate outbreak investigations in real-time thereby informing surveillance strategies and providing the opportunities for implementing timely public health interventions.
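
    Cluster detection by SNP typing, as described in the record above, is commonly implemented as single-linkage grouping on pairwise SNP distances. The sketch below assumes each isolate is represented by a string of base calls at the same variant positions, treats "N" as missing, and takes the distance threshold as a parameter. The record does not state the threshold or the software used, so every name and value here is hypothetical.

```python
def snp_distance(a: str, b: str) -> int:
    """Count positions where two equal-length SNP profiles differ,
    skipping positions where either call is missing ('N')."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same positions")
    return sum(1 for x, y in zip(a, b) if x != y and "N" not in (x, y))


def single_linkage_clusters(profiles: dict[str, str], threshold: int) -> list[list[str]]:
    """Group isolates linked by a chain of pairwise distances <= threshold.
    Returns only clusters of two or more isolates, sorted by name."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if snp_distance(profiles[a], profiles[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values() if len(c) >= 2)
```

    Single linkage means two isolates can share a cluster even when their direct distance exceeds the threshold, provided intermediate isolates connect them. This suits transmission chains but can merge clusters over time.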

  3. Associations of activated coagulation factor VII and factor VIIa-antithrombin levels with genome-wide polymorphisms and cardiovascular disease risk.

    Science.gov (United States)

    Olson, N C; Raffield, L M; Lange, L A; Lange, E M; Longstreth, W T; Chauhan, G; Debette, S; Seshadri, S; Reiner, A P; Tracy, R P

    2018-01-01

    Essentials: A fraction of coagulation factor VII circulates in blood as an activated protease (FVIIa). We evaluated FVIIa and FVIIa-antithrombin (FVIIa-AT) levels in the Cardiovascular Health Study. Polymorphisms in the F7 and PROCR loci were associated with FVIIa and FVIIa-AT levels. FVIIa may be an ischemic stroke risk factor in older adults and FVIIa-AT may assess mortality risk. Background: A fraction of coagulation factor (F) VII circulates as an active protease (FVIIa). FVIIa also circulates as an inactivated complex with antithrombin (FVIIa-AT). Objective: Evaluate associations of FVIIa and FVIIa-AT with genome-wide single nucleotide polymorphisms (SNPs) and incident coronary heart disease, ischemic stroke and mortality. Patients/Methods: We measured FVIIa and FVIIa-AT in 3486 Cardiovascular Health Study (CHS) participants. We performed a genome-wide association scan for FVIIa and FVIIa-AT in European-Americans (n = 2410) and examined associations of FVII phenotypes with incident cardiovascular disease. Results: In European-Americans, the most significant SNP for FVIIa and FVIIa-AT was rs1755685 in the F7 promoter region on chromosome 13 (FVIIa, β = -25.9 mU mL⁻¹ per minor allele; FVIIa-AT, β = -26.6 pM per minor allele). Phenotypes were also associated with rs867186 located in PROCR on chromosome 20 (FVIIa, β = 7.8 mU mL⁻¹ per minor allele; FVIIa-AT, β = 9.9 per minor allele). Adjusted for risk factors, a one standard deviation higher FVIIa was associated with increased risk of ischemic stroke (hazard ratio [HR], 1.12; 95% confidence interval [CI], 1.01, 1.23). Higher FVIIa-AT was associated with mortality from all causes (HR, 1.08; 95% CI, 1.03, 1.12). Among European-American CHS participants the rs1755685 minor allele was associated with lower ischemic stroke (HR, 0.69; 95% CI, 0.54, 0.88), but this association was not replicated in a larger multi-cohort analysis. Conclusions: The results support the importance of the F7 and PROCR loci in

  4. Role of ACE and AGT gene polymorphisms in genetic susceptibility to diabetes mellitus type 2 in a Brazilian sample.

    Science.gov (United States)

    Wollinger, L M; Dal Bosco, S M; Rempe, C; Almeida, S E M; Berlese, D B; Castoldi, R P; Arndt, M E; Contini, V; Genro, J P

    2015-12-29

    The aim of the current study was to investigate the association between the InDel polymorphism in the angiotensin I-converting enzyme gene (ACE) and the rs699 polymorphism in the angiotensinogen gene (AGT) and diabetes mellitus type 2 (DM2) in a sample population from Southern Brazil. A case-control study was conducted with 228 patients with DM2 and 183 controls without DM2. The ACE InDel polymorphism was genotyped by polymerase chain reaction (PCR) with specific primers, followed by electrophoresis on 1.5% agarose gel. The AGT rs699 polymorphism was genotyped using a real-time PCR assay. No significant association between the ACE InDel polymorphism and DM2 was detected (P = 0.97). However, regarding the AGT rs699 polymorphism, DM2 patients had a significantly higher frequency of the AG genotype and lower frequency of the GG genotype when compared to the controls (P = 0.03). Our results suggest that there is an association between the AGT rs699 polymorphism and DM2 in a Brazilian sample.

  5. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology.

    Science.gov (United States)

    Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Pierzchała, Mariusz; Feng, Yaping; Kadarmideen, Haja N; Kumar, Dibyendu

    2017-01-01

    RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by Kompetitive Allele-Specific PCR (KASP™) SNP genotyping assay. The comprehensive
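
    The stringent filter mentioned in the record above (at least 10 reads supporting the polymorphic base and a 100% SNP ratio) can be sketched as a simple predicate over per-site read counts. This is one plausible reading of those two criteria, not the authors' code. The SnpCall record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    chrom: str
    pos: int
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the polymorphic (alternate) base


def passes_stringent_filter(call: SnpCall, min_support: int = 10,
                            min_alt_ratio: float = 1.0) -> bool:
    """Keep a site only if enough reads support the alternate base and
    the alternate base makes up the required share of all reads."""
    if call.depth <= 0:
        return False
    enough_support = call.alt_reads >= min_support
    ratio_ok = call.alt_reads / call.depth >= min_alt_ratio
    return enough_support and ratio_ok


def filter_calls(calls: list[SnpCall]) -> list[SnpCall]:
    """Apply the default thresholds to a list of calls."""
    return [c for c in calls if passes_stringent_filter(c)]
```

    With the default thresholds a site covered by 12 reads passes only if all 12 carry the alternate base, which explains why such a filter reduces tens of thousands of records to a few thousand.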

  6. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology.

    Directory of Open Access Journals (Sweden)

    Chandra Shekhar Pareek

    RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by Kompetitive Allele-Specific PCR (KASP™) SNP genotyping assay. The

  7. Polymorphism at codon 36 of the p53 gene.

    Science.gov (United States)

    Felix, C A; Brown, D L; Mitsudomi, T; Ikagaki, N; Wong, A; Wasserman, R; Womer, R B; Biegel, J A

    1994-01-01

    A polymorphism at codon 36 in exon 4 of the p53 gene was identified by single strand conformation polymorphism (SSCP) analysis and direct sequencing of genomic DNA PCR products. The polymorphic allele, present in the heterozygous state in genomic DNAs of four of 100 individuals (4%), changes the codon 36 CCG to CCA, eliminates a FinI restriction site and creates a BccI site. Including this polymorphism there are four known polymorphisms in the p53 coding sequence.

  8. Association Study of Three Gene Polymorphisms Recently Identified by a Genome-Wide Association Study with Obesity-Related Phenotypes in Chinese Children.

    Science.gov (United States)

    Song, Qi-Ying; Song, Jie-Yun; Wang, Yang; Wang, Shuo; Yang, Yi-De; Meng, Xiang-Rui; Ma, Jun; Wang, Hai-Jun; Wang, Yan

    2017-01-01

    This study aimed to examine associations of three single-nucleotide polymorphisms (SNPs) with obesity-related phenotypes in Chinese children. These SNPs were identified by a recent genome-wide association (GWA) study among European children. Given that genetic backgrounds that vary across ethnicities may result in different associations, it is necessary to study these associations in a different ethnic population. A total of 3,922 children, including 2,191 normal-weight, 873 overweight and 858 obese children, from three independent studies were included in the study. Logistic and linear regressions were performed, and meta-analyses were conducted to assess the associations between the SNPs and obesity-related phenotypes. The pooled odds ratios of the A-allele of rs564343 in PACS1 for obesity and severe obesity were 1.180 (p = 0.03) and 1.312 (p = 0.004), respectively. We also found that rs564343 was nominally associated with BMI, BMI standard deviation score (BMI-SDS), waist circumference, and waist-to-height ratio (p [...] obesity in a non-European population. This SNP was also found to be associated with common obesity and various obesity-related phenotypes in Chinese children, which had not been reported in the original study. The results demonstrated the value of conducting genetic research in populations of different ethnicities. © 2017 The Author(s) Published by S. Karger GmbH, Freiburg.

  9. A Whole Genome Association Study to Detect Single Nucleotide Polymorphisms for Blood Components (Immunity) in a Cross between Korean Native Pig and Yorkshire

    Directory of Open Access Journals (Sweden)

    Y.-M. Lee

    2012-12-01

    The purpose of this study was to detect significant SNPs for blood components that were related to immunity using high single nucleotide polymorphism (SNP) density panels in a Korean native pig (KNP) × Yorkshire (YK) cross population. A reciprocal design of KNP×YK produced 249 F2 individuals that were genotyped for a total of 46,865 available SNPs in the Illumina porcine 60K beadchip. To perform whole genome association analysis (WGA), phenotypes were regressed on each SNP under a simple linear regression model after adjustment for sex and slaughter age. To set up a significance threshold, a 0.1% point-wise p-value from the F distribution was used for each SNP test. Among the significant SNPs for a trait, the best set of SNP markers was determined using a stepwise regression procedure with the rates of inclusion and exclusion of each SNP out of the model at the 0.001 level. A total of 54 SNPs were detected: 10, 6, 4, 4, 5, 4, 5, 10, and 6 SNPs for neutrophil, lymphocyte, monocyte, eosinophil, basophil, atypical lymph, immunoglobulin, insulin, and insulin-like growth factor-I, respectively. Each set of significant SNPs per trait explained 24 to 42% of phenotypic variance. Several pleiotropic SNPs were detected on SSCs 4, 13, 14 and 15.

  10. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. PGen workflow has been optimized for the most

  11. Predictive diagnosis of radiation hazard and therapeutic sensitivity by polymorphic marker. Individualized medical care based on genome diagnosis

    International Nuclear Information System (INIS)

    Imai, Takashi

    2009-01-01

    In cancer treatment, genome analysis can contribute to individualized medical care. With a view to clinical application, the author and coworkers studied the relationship between SNPs in 118 candidate genetic regions related to radiation sensitivity and dysuria, a late effect of carbon ion radiotherapy (CIR), in patients with prostate cancer; the process and results to date are presented here. The subjects were 197 patients, most of whom were enrolled in the phase II clinical trial, and 227 healthy volunteers. Patients received CIR at a total dose of 66.0 GyE in 20 fractions over 5 weeks and were divided into a training set of 132 cases (grade 0 and grade 1 dysuria 3 months after CIR in 109 and 23 cases, respectively) and a subsequent test set of 65 cases (grade 0 in 56 and grade 1 or higher in 9) for prediction. In the training set, AUC-ROC (area under the receiver operating characteristic curve) analysis revealed that 5 SNP markers, in SART1, ID3, EPDR1, PAH and XRCC6, among the analyzed genes were correlated with dysuria. The prediction held in the test set. Of the 32 patients with dysuria in total, 29 (90.6%) carried more than 3 of the above risk genotypes. Analysis of all patients showed about 30% false-positive cases, but 11.5% of them were found to have the late effect 6 months after CIR. Genomic diagnosis should thus be a useful tool for individualized medical care, not only for predicting the late-effect risk described here but also for selecting a therapeutic modality, including heavy ion radiotherapy. (K.T.)
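
    The AUC-ROC analysis mentioned in the record above measures how well a score separates affected from unaffected patients. A minimal sketch follows, using the rank-based definition: the AUC is the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. Using the number of risk genotypes carried as the score is an assumption made for illustration, and the record does not describe the exact scoring used.

```python
def auc_roc(scores: list[float], labels: list[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    scores: predictor value per patient (for example, a risk-genotype count).
    labels: 1 for patients with the late effect, 0 for those without.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(cases) * len(controls))
```

    An AUC of 0.5 corresponds to a predictor no better than chance and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred.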

  12. Genomic single-nucleotide polymorphisms confirm that Gunnison and Greater sage-grouse are genetically well differentiated and that the Bi-State population is distinct

    Science.gov (United States)

    Oyler-McCance, Sara J.; Cornman, Robert S.; Jones, Kenneth L.; Fike, Jennifer

    2015-01-01

    Sage-grouse are iconic, declining inhabitants of sagebrush habitats in western North America, and their management depends on an understanding of genetic variation across the landscape. Two distinct species of sage-grouse have been recognized, Greater (Centrocercus urophasianus) and Gunnison sage-grouse (C. minimus), based on morphology, behavior, and variation at neutral genetic markers. A parapatric group of Greater Sage-Grouse along the border of California and Nevada ("Bi-State") is also genetically distinct at the same neutral genetic markers, yet not different in behavior or morphology. Because delineating taxonomic boundaries and defining conservation units is often difficult in recently diverged taxa and can be further complicated by highly skewed mating systems, we took advantage of new genomic methods that improve our ability to characterize genetic variation at a much finer resolution. We identified thousands of single-nucleotide polymorphisms (SNPs) among Gunnison, Greater, and Bi-State sage-grouse and used them to comprehensively examine levels of genetic diversity and differentiation among these groups. The pairwise multilocus fixation index (FST) was high (0.49) between Gunnison and Greater sage-grouse, and both principal coordinates analysis and model-based clustering grouped samples unequivocally by species. Standing genetic variation was lower within the Gunnison Sage-Grouse. The Bi-State population was also significantly differentiated from Greater Sage-Grouse, albeit more weakly (FST = 0.09), and genetic clustering results were consistent with reduced gene flow with Greater Sage-Grouse. No comparable genetic divisions were found within the Greater Sage-Grouse sample, which spanned the southern half of the range. Thus, we provide much stronger genetic evidence supporting the recognition of Gunnison Sage-Grouse as a distinct species with low genetic diversity. Further, our work confirms that the Bi-State population is differentiated from other

  13. Genome-wide association study identifies single nucleotide polymorphism in DYRK1A associated with replication of HIV-1 in monocyte-derived macrophages.

    Directory of Open Access Journals (Sweden)

    Sebastiaan M Bol

    2011-02-01

    HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10^-5). While the association was not genome-wide significant (p < 1 × 10^-7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10^-6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). These findings suggest that the kinase DYRK1A is involved in the replication of HIV-1, in vitro in macrophages

  14. Genome-Wide Association Study Identifies Single Nucleotide Polymorphism in DYRK1A Associated with Replication of HIV-1 in Monocyte-Derived Macrophages

    Science.gov (United States)

    Bol, Sebastiaan M.; Moerland, Perry D.; Limou, Sophie; van Remmerden, Yvonne; Coulonges, Cédric; van Manen, Daniëlle; Herbeck, Joshua T.; Fellay, Jacques; Sieberer, Margit; Sietzema, Jantine G.; van 't Slot, Ruben; Martinson, Jeremy; Zagury, Jean-François; Schuitemaker, Hanneke; van 't Wout, Angélique B.

    2011-01-01

    Background: HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. Methodology/Principal Findings: Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10^-5). While the association was not genome-wide significant (p < 1 × 10^-7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10^-6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). Conclusions/Significance: These findings suggest that
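
The abstract above reports 494,656 tested SNPs and a genome-wide significance level of 1 × 10^-7. As a rough illustration only, the sketch below shows how a Bonferroni correction yields a threshold of that order. The family-wise error rate of 0.05 and the helper name `bonferroni_threshold` are assumptions made for this example; the abstract does not state how the authors derived their threshold.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Number of SNPs that passed quality control, as reported in the abstract.
N_SNPS = 494_656
# Assumed family-wise error rate; not stated in the abstract.
ALPHA = 0.05

threshold = bonferroni_threshold(ALPHA, N_SNPS)

# P-values reported in the abstract for rs12483205 in DYRK1A.
reported_p = {"initial cohort": 2.16e-5, "combined analysis": 4.84e-6}

# True where a reported p-value falls below the Bonferroni threshold.
genome_wide_significant = {name: p < threshold for name, p in reported_p.items()}

print(f"Bonferroni threshold: {threshold:.3e}")
for name, is_significant in genome_wide_significant.items():
    print(f"{name}: genome-wide significant = {is_significant}")
```

Under these assumptions the threshold comes out close to 1 × 10^-7, and neither reported p-value passes it, which matches the abstract's statement that the association was not genome-wide significant.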

  15. Genome-Wide Association Study to Identify Single Nucleotide Polymorphisms (SNPs) Associated With the Development of Erectile Dysfunction in African-American Men After Radiotherapy for Prostate Cancer

    International Nuclear Information System (INIS)

    Kerns, Sarah L.; Ostrer, Harry; Stock, Richard; Li, William; Moore, Julian; Pearlman, Alexander; Campbell, Christopher; Shao Yongzhao; Stone, Nelson; Kusnetz, Lynda; Rosenstein, Barry S.

    2010-01-01

    Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 × 10^-8, Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value < 10^-6. Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates

  16. Replication of endometriosis-associated single-nucleotide polymorphisms from genome-wide association studies in a Caucasian population.

    Science.gov (United States)

    Sundqvist, J; Xu, H; Vodolazkaia, A; Fassbender, A; Kyama, C; Bokor, A; Gemzell-Danielsson, K; D'Hooghe, T M; Falconer, H

    2013-03-01

    Is it possible to replicate the previously identified genetic association of four single-nucleotide polymorphisms (SNPs), rs12700667, rs7798431, rs1250248 and rs7521902, with endometriosis in a Caucasian population? A borderline association was observed for rs1250248 and endometriosis (P = 0.049). However, we could not replicate the other previously identified endometriosis-associated SNPs (rs12700667, rs7798431 and rs7521902) in the same population. Endometriosis is considered a complex disease, influenced by several genetic and environmental factors, as well as interactions between them. Previous studies have found genetic associations with endometriosis for SNPs at the 7p15 and 2q35 loci in a Caucasian population. Allele frequencies of SNPs were investigated in patients with endometriosis and controls. Blood samples and peritoneal biopsies were taken from a Caucasian female population consisting of 1129 patients with endometriosis and 831 controls. DNA was extracted for genotyping. The study was performed at a University hospital and research laboratories. A weak association with endometriosis (all stages) was observed for rs1250248 (P = 0.049). No significant associations were observed for the SNPs rs12700667, rs7798431 and rs7521902. A non-significant trend towards the association of rs1250248 with moderate/severe endometriosis was observed (odds ratio 1.18, 95% confidence interval 0.97-1.44). The inability to confirm all previous findings may result from differences between populations and type II errors. Our result demonstrates the difficulty of identifying common genetic variants in complex diseases. 
This study was supported by grants from the Karolinska Institutet and Stockholm City County/Karolinska Institutet (ALF), Stockholm, Sweden, Swedish Medical Research Council (K2007-54X-14212-06-3, K2010-54X-14212-09-3), Stockholm, Sweden, Leuven University Research Council (Onderzoeksraad KU Leuven), the Leuven University Hospitals Clinical Research Foundation

  17. Whole-genome analysis of herbicide-tolerant mutant rice generated by Agrobacterium-mediated gene targeting.

    Science.gov (United States)

    Endo, Masaki; Kumagai, Masahiko; Motoyama, Ritsuko; Sasaki-Yamagata, Harumi; Mori-Hosokawa, Satomi; Hamada, Masao; Kanamori, Hiroyuki; Nagamura, Yoshiaki; Katayose, Yuichi; Itoh, Takeshi; Toki, Seiichi

    2015-01-01

    Gene targeting (GT) is a technique used to modify endogenous genes in target genomes precisely via homologous recombination (HR). Although GT plants are produced using genetic transformation techniques, if the difference between the endogenous and the modified gene is limited to point mutations, GT crops can be considered equivalent to non-genetically modified mutant crops generated by conventional mutagenesis techniques. However, it is difficult to guarantee the non-incorporation of DNA fragments from Agrobacterium in GT plants created by Agrobacterium-mediated GT despite screening with conventional Southern blot and/or PCR techniques. Here, we report a comprehensive analysis of herbicide-tolerant rice plants generated by inducing point mutations in the rice ALS gene via Agrobacterium-mediated GT. We performed genome comparative genomic hybridization (CGH) array analysis and whole-genome sequencing to evaluate the molecular composition of GT rice plants. Thus far, no integration of Agrobacterium-derived DNA fragments has been detected in GT rice plants. However, >1,000 single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) were found in GT plants. Among these mutations, 20-100 variants might have some effect on expression levels and/or protein function. Information about additive mutations should be useful in clearing out unwanted mutations by backcrossing. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  18. Analysis of genetic diversity in Brown Swiss, Jersey and Holstein populations using genome-wide single nucleotide polymorphism markers

    Directory of Open Access Journals (Sweden)

    Melka Melkaye G

    2012-03-01

    Background: Studies of genetic diversity are essential in understanding the extent of differentiation between breeds, and in designing successful diversity conservation strategies. The objective of this study was to evaluate the level of genetic diversity within and between North American Brown Swiss (BS, n = 900), Jersey (JE, n = 2,922) and Holstein (HO, n = 3,535) cattle, using genotyped bulls. GENEPOP and FSTAT software were used to evaluate the level of genetic diversity within each breed and between each pair of the three breeds based on genome-wide SNP markers (n = 50,972). Results: Hardy-Weinberg equilibrium (HWE) exact test within breeds showed a significant deviation from equilibrium within each population (P st indicated that the combination of BS and HO in an ideally amalgamated population had higher genetic diversity than the other pairs of breeds. Conclusion: Results suggest that the three bull populations have substantially different gene pools. BS and HO show the largest gene differentiation and jointly the highest total expected gene diversity compared to when JE is considered. If the loss of genetic diversity within breeds worsens in the future, the use of crossbreeding might be an option to recover genetic diversity, especially for the breeds with small population size.

  19. Evaluation of the frequency of polymorphisms in XRCC1 (Arg399Gln) and XPD (Lys751Gln) genes related to the genome stability maintenance in individuals of the resident population from Monte Alegre, PA/Brazil municipality

    International Nuclear Information System (INIS)

    Duarte, Isabelle Magliano

    2010-01-01

    Human exposure to ionizing radiation from natural sources is an inherent feature of human life on Earth. Ionizing radiation is a known genotoxic agent, which can affect biological molecules, causing DNA damage and genomic instability. The cellular DNA repair system plays an important role in maintaining genomic stability by repairing DNA damage caused by genotoxic agents. However, genes related to DNA repair may have their role compromised when they carry certain polymorphisms. This study intended to analyze the frequency of single nucleotide polymorphisms (SNPs) in the DNA repair genes XRCC1 (Arg399Gln) and XPD (Lys751Gln) in a population of the city of Monte Alegre that resides in an area of high exposure to natural radioactivity. Saliva samples were collected from individuals of the population of Monte Alegre, 40 from males and 46 from females. The frequency of homozygous and/or heterozygous genotypes was determined for the polymorphic genes by RFLP (restriction fragment length polymorphism). The XRCC1 gene showed a frequency of 65.4% for the 399Gln allele and the XPD gene a frequency of 32.9% for the 751Gln allele. These values are similar to those found in previous studies for the XPD gene, whereas XRCC1 showed a frequency much higher than described in the literature. The influence of these polymorphisms, which are involved in DNA repair and consequent genotoxicity induced by radiation, depends on dose and on exposure factors such as smoking, statistically a factor in public health surveillance in the region. This study gathered molecular epidemiology information for risk assessment of cancer in the population of Monte Alegre. (author)

  20. Minding the gap: Frequency of indels in mtDNA control region sequence data and influence on population genetic analyses

    Science.gov (United States)

    Pearce, J.M.

    2006-01-01

    Insertions and deletions (indels) result in sequences of various lengths when homologous gene regions are compared among individuals or species. Although indels are typically phylogenetically informative, occurrence and incorporation of these characters as gaps in intraspecific population genetic data sets are rarely discussed. Moreover, the impact of gaps on estimates of fixation indices, such as FST, has not been reviewed. Here, I summarize the occurrence and population genetic signal of indels among 60 published studies that involved alignments of multiple sequences from the mitochondrial DNA (mtDNA) control region of vertebrate taxa. Among 30 studies observing indels, an average of 12% of both variable and parsimony-informative sites were composed of these sites. There was no consistent trend between levels of population differentiation and the number of gap characters in a data block. Across all studies, the average influence on estimates of ΦST was small, explaining only an additional 1.8% of among population variance (range 0.0-8.0%). Studies most likely to observe an increase in ΦST with the inclusion of gap characters were those with control region DNA appears small, dependent upon total number of variable sites in the data block, and related to species-specific characteristics and the spatial distribution of mtDNA lineages that contain indels. © 2006 Blackwell Publishing Ltd.

  1. DNA sequence polymorphisms within the bovine guanine nucleotide-binding protein Gs subunit alpha (Gsα)-encoding (GNAS) genomic imprinting domain are associated with performance traits

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2011-01-01

    Background: Genes which are epigenetically regulated via genomic imprinting can be potential targets for artificial selection during animal breeding. Indeed, imprinted loci have been shown to underlie some important quantitative traits in domestic mammals, most notably muscle mass and fat deposition. In this candidate gene study, we have identified novel associations between six validated single nucleotide polymorphisms (SNPs) spanning a 97.6 kb region within the bovine guanine nucleotide-binding protein Gs subunit alpha gene (GNAS) domain on bovine chromosome 13 and genetic merit for a range of performance traits in 848 progeny-tested Holstein-Friesian sires. The mammalian GNAS domain consists of a number of reciprocally-imprinted, alternatively-spliced genes which can play a major role in growth, development and disease in mice and humans. Based on the current annotation of the bovine GNAS domain, four of the SNPs analysed (rs43101491, rs43101493, rs43101485 and rs43101486) were located upstream of the GNAS gene, while one SNP (rs41694646) was located in the second intron of the GNAS gene. The final SNP (rs41694656) was located in the first exon of transcripts encoding the putative bovine neuroendocrine-specific protein NESP55, resulting in an aspartic acid-to-asparagine amino acid substitution at amino acid position 192. Results: SNP genotype-phenotype association analyses indicate that the single intronic GNAS SNP (rs41694646) is associated (P ≤ 0.05) with a range of performance traits including milk yield, milk protein yield, the content of fat and protein in milk, culled cow carcass weight and progeny carcass conformation, measures of animal body size, direct calving difficulty (i.e. difficulty in calving due to the size of the calf) and gestation length. Association (P ≤ 0.01) with direct calving difficulty (i.e. due to calf size) and maternal calving difficulty (i.e. due to the maternal pelvic width size) was also observed at the rs

  2. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Background: The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore, using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results: In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P ® MassARRAY. No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions: The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  3. TRPV1 Gene Polymorphisms Are Associated with Type 2 Diabetes by Their Interaction with Fat Consumption in the Korean Genome Epidemiology Study.

    Science.gov (United States)

    Park, Sunmin; Zhang, Xin; Lee, Na Ra; Jin, Hyun-Seok

    2016-01-01

    Different transient receptor potential vanilloid 1 (TRPV1) variants may be differently activated by noxious stimuli. We investigated how TRPV1 variants modulated the prevalence of type 2 diabetes and specific gene-nutrient interactions. Among 8,842 adults aged 40-69 years in the Korean Genome Epidemiology Study, the associations between TRPV1 genotypes and the prevalence of type 2 diabetes as well as their gene-nutrient interactions were investigated after adjusting for the covariates of age, gender, residence area, body mass index, daily energy intake, and total activity. The TRPV1 rs161364 and rs8065080 minor alleles lowered HOMA-IR and the risk of type 2 diabetes after adjusting for covariates. There were gene-nutrient interactions between TRPV1 variants rs161364 and rs8065080 and preference for oily taste, intake of oily foods, and fat intake after adjusting for covariates. Among subjects with the minor alleles of TRPV1 rs161364 and rs8065080, the group with a high preference for oily foods had a lower odds ratio for type 2 diabetes. Consistent with the preference for taste, among subjects with the minor alleles, the group with high fat intake from oily foods also exhibited a lower risk of type 2 diabetes than subjects with the major alleles. People with the minor alleles of the TRPV1 single nucleotide polymorphisms rs161364 and rs8065080 have a lower risk of diabetes with a high-fat diet, but people with the major alleles are at a higher risk of type 2 diabetes when consuming high-fat diets. The majority of people should be careful about a high fat intake. © 2016 S. Karger AG, Basel.

  4. Genomic Variability of Mycobacterium tuberculosis Strains of the Euro-American Lineage Based on Large Sequence Deletions and 15-Locus MIRU-VNTR Polymorphism

    Science.gov (United States)

    Rindi, Laura; Medici, Chiara; Bimbi, Nicola; Buzzigoli, Andrea; Lari, Nicoletta; Garzelli, Carlo

    2014-01-01

    A sample of 260 Mycobacterium tuberculosis strains assigned to the Euro-American family was studied to identify phylogenetically informative genomic regions of difference (RD). Mutually exclusive deletions of regions RD115, RD122, RD174, RD182, RD183, RD193, RD219, RD726 and RD761 were found in 202 strains; the RDRio deletion was detected exclusively among the RD174-deleted strains. Although certain deletions were found more frequently in certain spoligotype families (i.e., deletion RD115 in T and LAM, RD174 in LAM, RD182 in Haarlem, RD219 in T and RD726 in the “Cameroon” family), the RD-defined sublineages did not specifically match with spoligotype-defined families, thus arguing against the use of spoligotyping for establishing exact phylogenetic relationships between strains. Notably, when tested for katG463/gyrA95 polymorphism, all the RD-defined sublineages belonged to Principal Genotypic Group (PGG) 2, except sublineage RD219 exclusively belonging to PGG3; the 58 Euro-American strains with no deletion were of either PGG2 or 3. A representative sample of 197 isolates was then analyzed by standard 15-locus MIRU-VNTR typing, a suitable approach to independently assess genetic relationships among the strains. Analysis of the MIRU-VNTR typing results by using a minimum spanning tree (MST) and a classical dendrogram showed groupings that were largely concordant with those obtained by RD-based analysis. Isolates of a given RD profile show, in addition to closely related MIRU-VNTR profiles, related spoligotype profiles that can serve as a basis for better spoligotype-based classification. PMID:25197794

  5. Phylogeny and molecular signatures (conserved proteins and indels) that are specific for the Bacteroidetes and Chlorobi species

    Directory of Open Access Journals (Sweden)

    Lorenzini Emily

    2007-05-01

    Background: The Bacteroidetes and Chlorobi species constitute two main groups of the Bacteria that are closely related in phylogenetic trees. The Bacteroidetes species are widely distributed and include many important periodontal pathogens. In contrast, all Chlorobi are anoxygenic obligate photoautotrophs. Very few (or no) biochemical or molecular characteristics are known that are distinctive characteristics of these bacteria, or are commonly shared by them. Results: Systematic blast searches were performed on each open reading frame in the genomes of Porphyromonas gingivalis W83, Bacteroides fragilis YCH46, B. thetaiotaomicron VPI-5482, Gramella forsetii KT0803, Chlorobium luteolum (formerly Pelodictyon luteolum) DSM 273 and Chlorobaculum tepidum (formerly Chlorobium tepidum) TLS to search for proteins that are uniquely present in either all or certain subgroups of Bacteroidetes and Chlorobi. These studies have identified > 600 proteins for which homologues are not found in other organisms. This includes 27 and 51 proteins that are specific for most of the sequenced Bacteroidetes and Chlorobi genomes, respectively; 52 and 38 proteins that are limited to species from the Bacteroidales and Flavobacteriales orders, respectively, and 5 proteins that are common to species from these two orders; 185 proteins that are specific for the Bacteroides genus. Additionally, 6 proteins that are uniquely shared by species from the Bacteroidetes and Chlorobi phyla (one of them also present in the Fibrobacteres) have also been identified. This work also describes two large conserved inserts in DNA polymerase III (DnaE) and alanyl-tRNA synthetase that are distinctive characteristics of the Chlorobi species and a 3 aa deletion in ClpB chaperone that is mainly found in various Bacteroidales, Flavobacteriales and Flexebacteraceae, but generally not found in the homologs from other organisms.
Phylogenetic analyses of the Bacteroidetes and Chlorobi species is also

  6. Evolution and genome specialization of Brucella suis biovar 2 Iberian lineages.

    Science.gov (United States)

    Ferreira, Ana Cristina; Tenreiro, Rogério; de Sá, Maria Inácia Corrêa; Dias, Ricardo

    2017-09-12

    Swine brucellosis caused by B. suis biovar 2 is an emergent disease in domestic pigs in Europe. The emergence of this pathogen has been linked to the increase of extensive pig farms and the high density of infected wild boars (Sus scrofa). In Portugal and Spain, the majority of strains share specific molecular characteristics, which allowed establishing an Iberian clonal lineage. However, several strains isolated from wild boars in the North-East region of Spain are similar to strains isolated in different Central European countries. Comparative analysis of five newly fully sequenced B. suis biovar 2 strains belonging to the main circulating clones in the Iberian Peninsula, with publicly available Brucella spp. genomes, revealed that strains from the Iberian clonal lineage share 74% similarity with those reference genomes. Besides the 210 kb translocation event present in all biovar 2 strains, an inversion of 944 kb was present in chromosome I of strains from the Iberian clone. At the left and right crossover points, the inversion disrupted a TRAP dicarboxylate transporter, DctM subunit, and an integral membrane protein TerC. The gene dctM is well conserved in Brucella spp. except in strains from the Iberian clonal lineage. Intraspecies comparative analysis also exposed a number of biovar-, haplotype- and strain-specific insertion-deletion (INDEL) events and single nucleotide polymorphisms (SNPs) that could explain differences in virulence and host specificities. Most discriminative mutations were associated with membrane-related molecules (29%) and enzymes involved in catabolism processes (20%). Molecular identification of both B. suis biovar 2 clonal lineages could be easily achieved using the target-PCR procedures established in this work for the evaluated INDELs. Whole-genome analyses support that the B. suis biovar 2 Iberian clonal lineage evolved from the Central-European lineage and suggests that the genomic specialization of this pathogen in the Iberian Peninsula

  7. A genome-wide association study for milk production traits in Danish Jersey cattle using a 50K single nucleotide polymorphism chip

    DEFF Research Database (Denmark)

    Mai, Duy Minh; Sahana, Goutam; Christiansen, Freddy

    2010-01-01

    on BTA4, BTA5, BTA13, BTA20, and BTA29 were new QTL for fat index. We found 7 pleiotropic or very closely linked QTL. Most of the QTL were associated with polymorphisms within narrow regions and several may represent the effects of polymorphisms of genes: DGAT1, casein, ARFGAP3, CYP11B1, and CDC...

  8. Genome-wide resequencing of KRICE_CORE reveals their potential for future breeding, as well as functional and evolutionary studies in the post-genomic era.

    Science.gov (United States)

    Kim, Tae-Sung; He, Qiang; Kim, Kyu-Won; Yoon, Min-Young; Ra, Won-Hee; Li, Feng Peng; Tong, Wei; Yu, Jie; Oo, Win Htet; Choi, Buung; Heo, Eun-Beom; Yun, Byoung-Kook; Kwon, Soon-Jae; Kwon, Soon-Wook; Cho, Yoo-Hyun; Lee, Chang-Yong; Park, Beom-Seok; Park, Yong-Jin

    2016-05-26

    Rice germplasm collections continue to grow in number and size around the world. Since maintaining and screening such massive resources remains challenging, it is important to establish practical methods to manage them. A core collection, by definition, refers to a subset of the entire population that preserves the majority of genetic diversity, enhancing the efficiency of germplasm utilization. Here, we report whole-genome resequencing of the 137-accession rice mini core collection, or Korean rice core set (KRICE_CORE), which represents 25,604 rice germplasms deposited in the Korean genebank of the Rural Development Administration (RDA). We used the Illumina HiSeq 2000 and 2500 platforms to produce short reads and then assembled them at 9.8× depth using Nipponbare as a reference. Comparisons of the sequences with the reference genome yielded more than 15 million (M) single nucleotide polymorphisms (SNPs) and 1.3 M INDELs. Phylogenetic and population analyses using 2,046,529 high-quality SNPs successfully assigned rice accessions to the relevant rice subgroups, suggesting that these SNPs capture evolutionary signatures that have accumulated in rice subpopulations. Furthermore, genome-wide association studies (GWAS) for four exemplary agronomic traits in KRICE_CORE demonstrate its utility for identifying previously defined genes or novel genetic factors that potentially regulate important phenotypes. This study provides strong evidence that KRICE_CORE, although small, contains high genetic and functional diversity across the genome. Thus, our resequencing results will be useful for future breeding, as well as functional and evolutionary studies, in the post-genomic era.

  9. Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly

    Directory of Open Access Journals (Sweden)

    Shultz Jeffry

    2008-07-01

    Full Text Available Abstract. Background: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequence (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. Results: When sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. Conclusion: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de

  10. Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean.

    Science.gov (United States)

    Galeano, Carlos H; Cortés, Andrés J; Fernández, Andrea C; Soler, Álvaro; Franco-Herrera, Natalia; Makunde, Godwill; Vanderleyden, Jos; Blair, Matthew W

    2012-06-26

    In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. In short, this study illustrates the power of intron-based markers for linkage and association mapping in

  11. Genomic Medicine: Polymorphisms and microarray applications

    Directory of Open Access Journals (Sweden)

    Monica P. Spalvieri

    2004-12-01

    Full Text Available This update aims to present a new approach to DNA variation among individuals and to discuss the new technologies for detecting it. Complete sequencing of the human genome is the starting point for understanding genetic diversity. The recognized unit of measurement of this variability is the single nucleotide polymorphism (SNP). The study of SNPs is currently restricted to research, but the numerous publications on the subject suggest that it will enter clinical practice. Examples are presented of the use of SNPs as molecular markers in ethnic genotyping, in the gene expression of diseases, and as potential pharmacological targets. The array technique, which facilitates the study of multiple gene sequences by means of specifically designed chips, is discussed. Conventional methods analyze up to a maximum of 20 genes, whereas a single microarray provides information on tens of thousands of genes simultaneously, with rapid and accurate genotyping. Advances in biotechnology will make it possible to know, in addition to the sequence of each gene, the frequency and exact location of SNPs and their influence on cellular behavior. Although the validity of the results and the efficiency of microarrays are still controversial, knowledge and characterization of a patient's genetic profile will surely drive a radical change in the prevention, diagnosis, prognosis and treatment of human diseases.

  12. Synteny conservation between two distantly-related Rosaceae genomes: Prunus (the stone fruits) and Fragaria (the strawberry)

    Directory of Open Access Journals (Sweden)

    Sargent Daniel J

    2008-06-01

    Full Text Available Abstract. Background: The Rosaceae encompass a large number of economically-important diploid and polyploid fruit and ornamental species in many different genera. The basic chromosome numbers of these genera are x = 7, 8 and 9 and all have compact and relatively similar genome sizes. Comparative mapping between distantly-related genera has been performed to a limited extent in the Rosaceae, including a comparison between Malus (subfamily Maloideae) and Prunus (subfamily Prunoideae); however, no data has been published to date comparing Malus or Prunus to a member of the subfamily Rosoideae. In this paper we compare the genome of Fragaria, a member of the Rosoideae, to Prunus, a member of the Prunoideae. Results: The diploid genomes of Prunus (2n = 2x = 16) and Fragaria (2n = 2x = 14) were compared through the mapping of 71 anchor markers – 40 restriction fragment length polymorphisms (RFLPs), 29 indels or single nucleotide polymorphisms (SNPs) derived from expressed sequence tags (ESTs) and two simple-sequence repeats (SSRs) – on the reference maps of both genera. These markers provided good coverage of the Prunus (78%) and Fragaria (78%) genomes, with maximum gaps and average densities of 22 cM and 7.3 cM/marker in Prunus and 32 cM and 8.0 cM/marker in Fragaria. Conclusion: Our results indicate a clear pattern of synteny, with most markers of each chromosome of one of these species mapping to one or two chromosomes of the other. A large number of rearrangements (36), most of which were produced by inversions (27) and the rest (9) by translocations or fission/fusion events, could also be inferred. We have provided the first framework for the comparison of the position of genes or DNA sequences of these two economically valuable and yet distantly-related genera of the Rosaceae.

  13. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    Science.gov (United States)

    2011-09-01

    SNP Array v2. A ‘proof-of-concept’ advanced data mining algorithm for unsupervised analysis of genome-wide association study (GWAS) dataset was... Opal F AUS Yes U141 Peggs F AUS Yes U142 Taxi F AUS Yes U143 Riso MI MAL Yes U144 Szarik MI GSD Yes U145 Astor MI MAL Yes U146 Roy MC MAL Yes... mining of genetic studies in general, and especially GWAS. As a proof-of-concept, a classification analysis of the WG SNP typing dataset of a

  14. High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak

    Directory of Open Access Journals (Sweden)

    Trout-Yakel Keri M

    2010-02-01

    Full Text Available Abstract. Background: A large, multi-province outbreak of listeriosis associated with ready-to-eat meat products contaminated with Listeria monocytogenes serotype 1/2a occurred in Canada in 2008. Subtyping of outbreak-associated isolates using pulsed-field gel electrophoresis (PFGE) revealed two similar but distinct AscI PFGE patterns. High-throughput pyrosequencing of two L. monocytogenes isolates was used to rapidly provide the genome sequence of the primary outbreak strain and to investigate the extent of genetic diversity associated with a change of a single restriction enzyme fragment during PFGE. Results: The chromosomes were collinear, but differences included 28 single nucleotide polymorphisms (SNPs) and three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns. The distribution of these traits was assessed within further clinical, environmental and food isolates associated with the outbreak, and this comparison indicated that three distinct, but highly related strains may have been involved in this nationwide outbreak. Notably, these two isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes. Conclusions: High-throughput genome sequencing provided a more detailed real-time assessment of genetic traits characteristic of the outbreak strains than could be achieved with routine subtyping methods. This study confirms that the latest generation of DNA sequencing technologies can be applied during high priority public health events, and laboratories need to prepare for this inevitability and assess how to properly analyze and interpret whole genome sequences in the context of molecular epidemiology.

  15. Tracing melioidosis back to the source: using whole-genome sequencing to investigate an outbreak originating from a contaminated domestic water supply.

    Science.gov (United States)

    McRobb, Evan; Sarovich, Derek S; Price, Erin P; Kaestli, Mirjam; Mayo, Mark; Keim, Paul; Currie, Bart J

    2015-04-01

    Melioidosis, a disease of public health importance in Southeast Asia and northern Australia, is caused by the Gram-negative soil bacillus Burkholderia pseudomallei. Melioidosis is typically acquired through environmental exposure, and case clusters are rare, even in regions where the disease is endemic. B. pseudomallei is classed as a tier 1 select agent by the Centers for Disease Control and Prevention; from a biodefense perspective, source attribution is vital in an outbreak scenario to rule out a deliberate release. Two cases of melioidosis within a 3-month period at a residence in rural northern Australia prompted an investigation to determine the source of exposure. B. pseudomallei isolates from the property's groundwater supply matched the multilocus sequence type of the clinical isolates. Whole-genome sequencing confirmed the water supply as the probable source of infection in both cases, with the clinical isolates differing from the likely infecting environmental strain by just one single nucleotide polymorphism (SNP) each. For the first time, we report a phylogenetic analysis of genomewide insertion/deletion (indel) data, an approach conventionally viewed as problematic due to high mutation rates and homoplasy. Our whole-genome indel analysis was concordant with the SNP phylogeny, and these two combined data sets provided greater resolution and a better fit with our epidemiological chronology of events. Collectively, this investigation represents a highly accurate account of source attribution in a melioidosis outbreak and gives further insight into a frequently overlooked reservoir of B. pseudomallei. Our methods and findings have important implications for outbreak source tracing of this bacterium and other highly recombinogenic pathogens. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  16. Identification of Single Nucleotide Polymorphisms and analysis of Linkage Disequilibrium in sunflower elite inbred lines using the candidate gene approach

    Directory of Open Access Journals (Sweden)

    Heinz Ruth A

    2008-01-01

    Full Text Available Abstract. Background: Association analysis is a powerful tool to identify gene loci that may contribute to phenotypic variation. This includes the estimation of nucleotide diversity, the assessment of linkage disequilibrium (LD) structure and the evaluation of selection processes. Trait mapping by allele association requires a high-density map, which could be obtained by the addition of Single Nucleotide Polymorphisms (SNPs) and short insertions and/or deletions (indels) to SSR and AFLP genetic maps. Nucleotide diversity analysis of randomly selected candidate regions is a promising approach for the success of association analysis and fine mapping in the sunflower genome. Moreover, knowledge of the distance over which LD persists, in agronomically meaningful sunflower accessions, is important to establish the density of markers and the experimental design for association analysis. Results: A set of 28 candidate genes related to biotic and abiotic stresses were studied in 19 sunflower inbred lines. A total of 14,348 bp of sequence alignment was analyzed per individual. On average, 1 SNP was found per 69 nucleotides and 38 indels were identified in the complete data set. The mean nucleotide polymorphism was moderate (θ = 0.0056), as expected for inbred materials. The number of haplotypes per region ranged from 1 to 9 (mean = 3.54 ± 1.88). Model-based population structure analysis allowed detection of admixed individuals within the set of accessions examined. Two putative gene pools were identified (G1 and G2), with a large proportion of the inbred lines being assigned to one of them (G1). Consistent with the absence of population sub-structuring, LD for G1 decayed more rapidly (r² = 0.48 at 643 bp; trend line, pooled data) than the LD trend line for the entire set of 19 individuals (r² = 0.64 for the same distance). Conclusion: Knowledge about the patterns of diversity and the genetic relationships between breeding materials could be an invaluable aid in crop

  17. Baseline frequency of chromosomal aberrations and sister chromatid exchanges in peripheral blood lymphocytes of healthy individuals living in Turin (North-Western Italy): assessment of the effects of age, sex and GSTs gene polymorphisms on the levels of genomic damage.

    Science.gov (United States)

    Santovito, Alfredo; Cervella, Piero; Delpero, Massimiliano

    2016-05-01

    The increased exposure to environmental pollutants has led to the awareness of the necessity for constant monitoring of human populations, especially those living in urban areas. This study evaluated the background levels of genomic damage in a sample of healthy subjects living in the urban area of Turin (Italy). The association of DNA damage with age, sex and GSTs polymorphisms was assessed. One hundred and one individuals were randomly sampled. Sister Chromatid Exchanges (SCEs) and Chromosomal Aberrations (CAs) assays, as well as genotyping of GSTT1 and GSTM1 genes, were performed. Mean values of SCEs and CAs were 5.137 ± 0.166 and 0.018 ± 0.002, respectively. Results showed that age and gender were associated with higher frequencies of these two cytogenetic markers. The eldest subjects (51-65 years) showed significantly higher levels of genomic damage than younger individuals. GSTs polymorphisms did not appear to significantly influence the frequencies of either marker. The CAs background frequency observed in this study is one of the highest reported among European populations. Turin is one of the most polluted cities in Europe in terms of fine airborne PM10 and ozone, and the clastogenic potential of these pollutants may explain the high frequencies of chromosomal rearrangements reported here.

  18. Single Nucleotide Polymorphism

    DEFF Research Database (Denmark)

    Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification...... of SNPs. This will allow acquisition of more information from the sample materials and open up for new possibilities as well as new challenges....

  19. QTL Mapping by Whole Genome Re-sequencing and Analysis of Candidate Genes for Nitrogen Use Efficiency in Rice

    Directory of Open Access Journals (Sweden)

    Xinghai Yang

    2017-09-01

    Full Text Available Nitrogen is a major nutritional element in rice production. However, excessive application of nitrogen fertilizer has caused severe environmental pollution. Therefore, development of rice varieties with improved nitrogen use efficiency (NUE) is urgent for sustainable agriculture. In this study, bulked segregant analysis (BSA) combined with whole genome re-sequencing (WGS) technology was applied to finely map quantitative trait loci (QTL) for NUE. A key QTL, designated as qNUE6, was identified on chromosome 6 and further validated by Insertion/Deletion (InDel) marker-based substitutional mapping in recombinants from an F2 population (NIL-13B4 × GH998). Forty-four genes were identified in this 266.5-kb region. According to detection and annotation analysis of variation sites, 39 genes with large-effect single-nucleotide polymorphisms (SNPs) and large-effect InDels were selected as candidates and their expression levels were analyzed by qRT-PCR. Significant differences in the expression levels of LOC_Os06g15370 (peptide transporter PTR2) and LOC_Os06g15420 (asparagine synthetase) were observed between two parents (Y11 and GH998). Phylogenetic analysis in Arabidopsis thaliana identified two closely related homologs, AT1G68570 (AtNPF3.1) and AT5G65010 (ASN2), which share 72.3 and 87.5% amino acid similarity with LOC_Os06g15370 and LOC_Os06g15420, respectively. Taken together, our results suggested that qNUE6 is a possible candidate gene for NUE in rice. The fine mapping and candidate gene analysis of qNUE6 provide the basis of molecular breeding for genetic improvement of rice varieties with high NUE, and lay the foundation for further cloning and functional analysis.

  20. Genomic expression catalogue of a global collection of BCG vaccine strains show evidence for highly diverged metabolic and cell-wall adaptations

    KAUST Repository

    Abdallah, Abdallah

    2015-10-21

    Although Bacillus Calmette-Guérin (BCG) vaccines against tuberculosis have been available for more than 90 years, their effectiveness has been hindered by variable protective efficacy and a lack of lasting memory responses. One factor contributing to this variability may be the diversity of the BCG strains that are used around the world, in part from genomic changes accumulated during vaccine production and their resulting differences in gene expression. We have compared the genomes and transcriptomes of a global collection of fourteen of the most widely used BCG strains at single base-pair resolution. We have also used quantitative proteomics to identify key differences in expression of proteins across five representative BCG strains of the four tandem duplication (DU) groups. We provide a comprehensive map of single nucleotide polymorphisms (SNPs), copy number variation and insertions and deletions (indels) across fourteen BCG strains. Genome-wide SNP characterization allowed the construction of a new and robust phylogenic genealogy of BCG strains. Transcriptional and proteomic profiling revealed a metabolic remodeling in BCG strains that may be reflected by altered immunogenicity and possibly vaccine efficacy. Together, these integrated-omic data represent the most comprehensive catalogue of genetic variation across a global collection of BCG strains.

  1. Genomic expression catalogue of a global collection of BCG vaccine strains show evidence for highly diverged metabolic and cell-wall adaptations

    KAUST Repository

    Abdallah, Abdallah; Hill-Cawthorne, Grant A.; Otto, Thomas D.; Coll, Francesc; Guerra-Assunção, José Afonso; Gao, Ge; Naeem, Raeece; Ansari, Hifzur Rahman; Malas, Tareq Majed Yasin; Adroub, Sabir; Verboom, Theo; Ummels, Roy; Zhang, Huoming; Panigrahi, Aswini Kumar; McNerney, Ruth; Brosch, Roland; Clark, Taane G.; Behr, Marcel A.; Bitter, Wilbert; Pain, Arnab

    2015-01-01

    Although Bacillus Calmette-Guérin (BCG) vaccines against tuberculosis have been available for more than 90 years, their effectiveness has been hindered by variable protective efficacy and a lack of lasting memory responses. One factor contributing to this variability may be the diversity of the BCG strains that are used around the world, in part from genomic changes accumulated during vaccine production and their resulting differences in gene expression. We have compared the genomes and transcriptomes of a global collection of fourteen of the most widely used BCG strains at single base-pair resolution. We have also used quantitative proteomics to identify key differences in expression of proteins across five representative BCG strains of the four tandem duplication (DU) groups. We provide a comprehensive map of single nucleotide polymorphisms (SNPs), copy number variation and insertions and deletions (indels) across fourteen BCG strains. Genome-wide SNP characterization allowed the construction of a new and robust phylogenic genealogy of BCG strains. Transcriptional and proteomic profiling revealed a metabolic remodeling in BCG strains that may be reflected by altered immunogenicity and possibly vaccine efficacy. Together, these integrated-omic data represent the most comprehensive catalogue of genetic variation across a global collection of BCG strains.

  2. Family Polymorphism

    DEFF Research Database (Denmark)

    Ernst, Erik

    2001-01-01

    safety and flexibility at the level of multi-object systems. We are granted the flexibility of using different families of kinds of objects, and we are guaranteed the safety of the combination. This paper highlights the inability of traditional polymorphism to handle multiple objects, and presents family...... polymorphism as a way to overcome this problem. Family polymorphism has been implemented in the programming language gbeta, a generalized version of Beta, and the source code of this implementation is available under GPL....

  3. Eighteen polymorphic microsatellites for domestic pigeon Columba ...

    Indian Academy of Sciences (India)

    certain parasites which cause health problems in humans and domestic animals ... The genomic DNA was isolated using standard protocol as described by ..... panel of polymorphic microsatellite markers in Himalayan monal. Lophophorus ...

  4. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
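
    For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how an N50 is conventionally computed from a list of block (or contig) lengths: it is the length of the block at which the running total, taken from longest to shortest, first reaches half of the summed length. The function name and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of block or contig lengths.

    N50 is the length L such that blocks of length >= L together
    cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy haplotype block lengths in kb (hypothetical values, not study data)
blocks_kb = [80, 70, 50, 40, 30, 20]
print(n50(blocks_kb))  # total is 290 kb; 80 + 70 = 150 >= 145, so N50 is 70
```

    A larger N50 indicates that more of the assembly sits in long, contiguous blocks; it says nothing by itself about the correctness of those blocks.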

  5. Electroclinical presentation and genotype-phenotype relationships in patients with Unverricht-Lundborg disease carrying compound heterozygous CSTB point and indel mutations.

    Science.gov (United States)

    Canafoglia, Laura; Gennaro, Elena; Capovilla, Giuseppe; Gobbi, Giuseppe; Boni, Antonella; Beccaria, Francesca; Viri, Maurizio; Michelucci, Roberto; Agazzi, Pamela; Assereto, Stefania; Coviello, Domenico A; Di Stefano, Maria; Rossi Sebastiano, Davide; Franceschetti, Silvana; Zara, Federico

    2012-12-01

    Unverricht-Lundborg disease (EPM1A) is frequently due to an unstable expansion of a dodecamer repeat in the CSTB gene, whereas other types of mutations are rare. EPM1A due to homozygous expansion has a rather stereotyped presentation with prominent action myoclonus. We describe eight patients with five different compound heterozygous CSTB point or indel mutations in order to highlight their particular phenotypical presentations and evaluate their genotype-phenotype relationships. We screened CSTB mutations by means of Southern blotting and the sequencing of the genomic DNA of each proband. CSTB messenger RNA (mRNA) aberrations were characterized by sequencing the complementary DNA (cDNA) of lymphoblastoid cells, and assessing the protein concentrations in the lymphoblasts. The patient evaluations included the use of a simplified myoclonus severity rating scale, multiple neurophysiologic tests, and electroencephalography (EEG)-polygraphic recordings. To highlight the particular clinical features and disease time-course in compound heterozygous patients, we compared some of their characteristics with those observed in a series of 40 patients carrying the common homozygous expansion mutation observed at the C. Besta Foundation, Milan, Italy. The eight compound heterozygous patients belong to six EPM1A families (out of 52; 11.5%) diagnosed at the Laboratory of Genetics of the Galliera Hospitals in Genoa, Italy. They segregated five different heterozygous point or indel mutations in association with the common dodecamer expansion. Four patients from three families had previously reported CSTB mutations (c.67-1G>C and c.168+1_18del); one had a novel nonsense mutation at the first exon (c.133C>T) leading to a premature stop codon predicting a short peptide; the other three patients from two families had a complex novel indel mutation involving the donor splice site of intron 2 (c.168+2_169+21delinsAA) and leading to an aberrant transcript with a partially retained intron

  6. Involvement of the Ventrolateral Prefrontal Cortex in Learning Others' Bad Reputations and Indelible Distrust.

    Science.gov (United States)

    Suzuki, Atsunobu; Ito, Yuichi; Kiyama, Sachiko; Kunimi, Mitsunobu; Ohira, Hideki; Kawaguchi, Jun; Tanabe, Hiroki C; Nakai, Toshiharu

    2016-01-01

    A bad reputation can persistently affect judgments of an individual even when it turns out to be invalid and ought to be disregarded. Such indelible distrust may reflect that the negative evaluation elicited by a bad reputation transfers to a person. Consequently, the person him/herself may come to activate this negative evaluation irrespective of the accuracy of the reputation. If this theoretical model is correct, an evaluation-related brain region will be activated when witnessing a person whose bad reputation one has learned about, regardless of whether the reputation is deemed valid or not. Here, we tested this neural hypothesis with functional magnetic resonance imaging (fMRI). Participants memorized faces paired with either a good or a bad reputation. Next, they viewed the faces alone and inferred whether each person was likely to cooperate, first while retrieving the reputations, and then while trying to disregard them as false. A region of the left ventrolateral prefrontal cortex (vlPFC), which may be involved in negative evaluation, was activated by faces previously paired with bad reputations, irrespective of whether participants attempted to retrieve or disregard these reputations. Furthermore, participants showing greater activity of the left ventrolateral prefrontal region in response to the faces with bad reputations were more likely to infer that these individuals would not cooperate. Thus, once associated with a bad reputation, a person may elicit evaluation-related brain responses on their own, thereby evoking distrust independently of their reputation.

  7. Structural analysis of polarizing indels: an emerging consensus on the root of the tree of life

    Directory of Open Access Journals (Sweden)

    Bourne Philip E

    2009-08-01

    Abstract Background The root of the tree of life has been a holy grail ever since Darwin first used the tree as a metaphor for evolution. New methods seek to narrow down the location of the root by excluding it from branches of the tree of life. This is done by finding traits that must be derived, and excluding the root from the taxa those traits cover. However, the two most comprehensive attempts at this strategy, performed by Cavalier-Smith and Lake et al., have excluded each other's rootings. Results The indel polarizations of Lake et al. rely on high quality alignments between paralogs that diverged before the last universal common ancestor (LUCA). Therefore, sequence alignment artifacts may skew their conclusions. We have reviewed their data using protein structure information where available. Several of the conclusions are quite different when viewed in the light of structure, which is conserved over longer evolutionary time scales than sequence. We argue there is no polarization that excludes the root from all Gram-negatives, and that polarizations robustly exclude the root from the Archaea. Conclusion We conclude that there is no contradiction between the polarization datasets. The combination of these datasets excludes the root from every possible position except near the Chloroflexi. Reviewers This article was reviewed by Greg Fournier (nominated by J. Peter Gogarten), Purificación López-García, and Eugene Koonin.

  8. Length and nucleotide sequence polymorphism at the trnL and trnF non-coding regions of chloroplast genomes among Saccharum and Erianthus species

    Science.gov (United States)

    The aneupolyploidy genome of sugarcane (Saccharum hybrids spp.) and lack of a classical genetic linkage map make genetics research most difficult for sugarcane. Whole genome sequencing and genetic characterization of sugarcane and related taxa are far behind other crops. In this study, universal PCR...

  9. Discovering and verifying DNA polymorphisms in a mung bean [V. radiata (L.) R. Wilczek] collection by EcoTILLING and sequencing

    Directory of Open Access Journals (Sweden)

    Dean Rob E

    2008-06-01

    Abstract Background Vigna radiata, which is classified in the family Fabaceae, is an important economic crop and a dietary staple in many developing countries. The species radiata can be further subdivided into varieties, of which the variety sublobata is currently acknowledged as the putative progenitor of radiata. EcoTILLING was employed to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELS) in a collection of Vigna radiata accessions. Findings A total of 157 DNA polymorphisms in the collection were produced from ten primer sets when using V. radiata var. sublobata as the reference. The majority of polymorphisms detected were found in putative introns. The banding patterns varied from simple to complex as the number of DNA polymorphisms between two pooled samples increased. Numerous SNPs and INDELS ranging from 4–24 and 1–6, respectively, were detected in all fragments when pooling V. radiata var. sublobata with V. radiata var. radiata. On the other hand, when accessions of V. radiata var. radiata were mixed together and digested with CEL I, relatively few SNPs and no INDELS were detected. Conclusion EcoTILLING was utilized to identify polymorphisms in a collection of mung bean, which previously showed limited molecular genetic diversity and limited morphological diversity in the flowers and pod descriptors. Overall, EcoTILLING proved to be a powerful genetic analysis tool providing the rapid identification of naturally occurring variation.

  10. New traits in crops produced by genome editing techniques based on deletions

    NARCIS (Netherlands)

    Wiel, van de C.C.M.; Schaart, J.G.; Lotz, L.A.P.; Smulders, M.J.M.

    2017-01-01

    One of the most promising New Plant Breeding Techniques is genome editing (also called gene editing) with the help of a programmable site-directed nuclease (SDN). In this review, we focus on SDN-1, which is the generation of small deletions or insertions (indels) at a precisely defined location in

  11. Analysis of indel variations in the human disease-associated genes ...

    Indian Academy of Sciences (India)

    Keywords. insertion–deletion variations; haematological disease; tumours; human genetics. Journal of Genetics ... domly selected healthy Korean individuals using a blood genomic DNA ... Bioinformatics annotation and 3-D protein structure analysis. In this study ..... 2009 A genome-wide meta-analysis identifies. Journal of ...

  12. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

    Science.gov (United States)

    2012-01-01

    Background Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 EST clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles of these genes in fiber quality. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. Conclusion This study will serve as a valuable genomic resource
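
    The map statistics quoted in this abstract can be checked with a little arithmetic. The Python sketch below recomputes the total locus count and the average inter-locus distance. It assumes (the abstract does not say so) that the average was taken over the intervals between adjacent loci within each of the 26 linkage groups, so the number of intervals is the locus count minus the number of groups. All variable names are illustrative.

```python
# Figures quoted in the abstract above.
previous_loci = 2247      # loci on the previously published map
added_loci = 1167         # loci added in this update
map_length_cm = 3667.62   # total map length in centimorgans
linkage_groups = 26

total_loci = previous_loci + added_loci

# Assumption: a linkage group with n loci has n - 1 intervals between
# adjacent loci, so the whole map has total_loci - linkage_groups intervals.
intervals = total_loci - linkage_groups
avg_per_interval = map_length_cm / intervals

# Alternative convention for comparison: map length divided by locus count.
avg_per_locus = map_length_cm / total_loci

print(f"total loci: {total_loci}")
print(f"average distance per interval: {avg_per_interval:.2f} cM")
print(f"map length per locus: {avg_per_locus:.2f} cM")
```

    Under the per-interval assumption the result rounds to the 1.08 cM reported in the abstract, whereas dividing by the raw locus count gives about 1.07 cM.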

  13. Comparative genome analysis identifies two large deletions in the genome of highly-passaged attenuated Streptococcus agalactiae strain YM001 compared to the parental pathogenic strain HN016.

    Science.gov (United States)

    Wang, Rui; Li, Liping; Huang, Yan; Luo, Fuguang; Liang, Wanwen; Gan, Xi; Huang, Ting; Lei, Aiying; Chen, Ming; Chen, Lianfu

    2015-11-04

    Streptococcus agalactiae (S. agalactiae), also known as group B Streptococcus (GBS), is an important pathogen for neonatal pneumonia, meningitis, bovine mastitis, and fish meningoencephalitis. The global outbreaks of Streptococcus disease in tilapia cause huge economic losses and threaten human food hygiene safety as well. To investigate the mechanism of S. agalactiae pathogenesis in tilapia and develop an attenuated S. agalactiae vaccine, this study sequenced and comparatively analyzed the whole genomes of virulent wild-type S. agalactiae strain HN016 and its highly-passaged attenuated strain YM001 derived from tilapia. We performed Illumina sequencing of DNA prepared from strains HN016 and YM001. Sequenced reads were assembled, and nucleotide comparisons, single nucleotide polymorphisms (SNPs), and indels were analyzed between the draft genomes of HN016 and YM001. Clustered regularly interspaced short palindromic repeats (CRISPRs) and prophage were detected and analyzed in different S. agalactiae strains. The genome of S. agalactiae YM001 was 2,047,957 bp with a GC content of 35.61%; it contained 2044 genes and 88 RNAs. Meanwhile, the genome of S. agalactiae HN016 was 2,064,722 bp with a GC content of 35.66%; it had 2063 genes and 101 RNAs. Comparative genome analysis indicated that compared with HN016, the YM001 genome had two significant large deletions, at the sizes of 5832 and 11,116 bp respectively, resulting in the deletion of three rRNA and ten tRNA genes, as well as the deletion and functional damage of ten genes related to metabolism, transport, growth, anti-stress, etc. Besides these two large deletions, ten other deletions and 28 single nucleotide variations (SNVs) were also identified, mainly affecting the metabolism- and growth-related genes. The genome of attenuated S. agalactiae YM001 showed significant variations, resulting in the deletion of 10 functional genes, compared to the parental pathogenic strain HN016. The deleted and mutated functional genes all

  14. High Resolution Melt (HRM) analysis is an efficient tool to genotype EMS mutants in complex crop genomes.

    Science.gov (United States)

    Lochlainn, Seosamh Ó; Amoah, Stephen; Graham, Neil S; Alamer, Khalid; Rios, Juan J; Kurup, Smita; Stoute, Andrew; Hammond, John P; Østergaard, Lars; King, Graham J; White, Phillip J; Broadley, Martin R

    2011-12-08

    Targeted Induced Loci Lesions IN Genomes (TILLING) is increasingly being used to generate and identify mutations in target genes of crop genomes. TILLING populations of several thousand lines have been generated in a number of crop species including Brassica rapa. Genetic analysis of mutants identified by TILLING requires an efficient, high-throughput and cost effective genotyping method to track the mutations through numerous generations. High resolution melt (HRM) analysis has been used in a number of systems to identify single nucleotide polymorphisms (SNPs) and insertion/deletions (IN/DELs) enabling the genotyping of different types of samples. HRM is ideally suited to high-throughput genotyping of multiple TILLING mutants in complex crop genomes. To date it has been used to identify mutants and genotype single mutations. The aim of this study was to determine if HRM can facilitate downstream analysis of multiple mutant lines identified by TILLING in order to characterise allelic series of EMS induced mutations in target genes across a number of generations in complex crop genomes. We demonstrate that HRM can be used to genotype allelic series of mutations in two genes, BraA.CAX1.a and BraA.MET1.a in Brassica rapa. We analysed 12 mutations in BraA.CAX1.a and five in BraA.MET1.a over two generations including a back-cross to the wild-type. Using a commercially available HRM kit and the Lightscanner™ system we were able to detect mutations in heterozygous and homozygous states for both genes. Using HRM genotyping on TILLING derived mutants, it is possible to generate an allelic series of mutations within multiple target genes rapidly. Lines suitable for phenotypic analysis can be isolated approximately 8-9 months (3 generations) from receiving M3 seed of Brassica rapa from the RevGenUK TILLING service.

  15. High Resolution Melt (HRM) analysis is an efficient tool to genotype EMS mutants in complex crop genomes

    Directory of Open Access Journals (Sweden)

    Lochlainn Seosamh Ó

    2011-12-01

    Abstract Background Targeted Induced Loci Lesions IN Genomes (TILLING) is increasingly being used to generate and identify mutations in target genes of crop genomes. TILLING populations of several thousand lines have been generated in a number of crop species including Brassica rapa. Genetic analysis of mutants identified by TILLING requires an efficient, high-throughput and cost effective genotyping method to track the mutations through numerous generations. High resolution melt (HRM) analysis has been used in a number of systems to identify single nucleotide polymorphisms (SNPs) and insertion/deletions (IN/DELs) enabling the genotyping of different types of samples. HRM is ideally suited to high-throughput genotyping of multiple TILLING mutants in complex crop genomes. To date it has been used to identify mutants and genotype single mutations. The aim of this study was to determine if HRM can facilitate downstream analysis of multiple mutant lines identified by TILLING in order to characterise allelic series of EMS induced mutations in target genes across a number of generations in complex crop genomes. Results We demonstrate that HRM can be used to genotype allelic series of mutations in two genes, BraA.CAX1.a and BraA.MET1.a in Brassica rapa. We analysed 12 mutations in BraA.CAX1.a and five in BraA.MET1.a over two generations including a back-cross to the wild-type. Using a commercially available HRM kit and the Lightscanner™ system we were able to detect mutations in heterozygous and homozygous states for both genes. Conclusions Using HRM genotyping on TILLING derived mutants, it is possible to generate an allelic series of mutations within multiple target genes rapidly. Lines suitable for phenotypic analysis can be isolated approximately 8-9 months (3 generations) from receiving M3 seed of Brassica rapa from the RevGenUK TILLING service.

  16. Functional Analysis of In-frame Indel ARID1A Mutations Reveals New Regulatory Mechanisms of Its Tumor Suppressor Functions

    Directory of Open Access Journals (Sweden)

    Bin Guan

    2012-10-01

    AT-rich interactive domain 1A (ARID1A) has emerged as a new tumor suppressor in which frequent somatic mutations have been identified in several types of human cancers. Although most ARID1A somatic mutations are frame-shift or nonsense mutations that contribute to mRNA decay and loss of protein expression, 5% of ARID1A mutations are in-frame insertions or deletions (indels) that involve only a small stretch of peptides. Naturally occurring in-frame indel mutations provide unique and useful models to explore the biology and regulatory role of ARID1A. In this study, we analyzed indel mutations identified in gynecological cancers to determine how these mutations affect the tumor suppressor function of ARID1A. Our results demonstrate that all in-frame mutants analyzed lost their ability to inhibit cellular proliferation or activate transcription of CDKN1A, which encodes p21, a downstream effector of ARID1A. We also showed that ARID1A is a nucleocytoplasmic protein whose stability depends on its subcellular localization. Nuclear ARID1A is less stable than cytoplasmic ARID1A because ARID1A is rapidly degraded by the ubiquitin-proteasome system in the nucleus. In-frame deletions affecting the consensus nuclear export signal reduce steady-state protein levels of ARID1A. This defect in nuclear exportation leads to nuclear retention and subsequent degradation. Our findings delineate a mechanism underlying the regulation of ARID1A subcellular distribution and protein stability and suggest that targeting the nuclear ubiquitin-proteasome system can increase the amount of the ARID1A protein in the nucleus and restore its tumor suppressor functions.

  17. An empirical test of the treatment of indels during optimization alignment based on the phylogeny of the genus Secale (Poaceae)

    DEFF Research Database (Denmark)

    Petersen, Gitte; Seberg, Ole; Aagesen, Lone

    2004-01-01

    The ability of the program POY, implementing optimization alignment, to deal with major indels is explored and discussed in connection with a phylogenetic analysis of the genus Secale based on partial Adh1 sequences. The Adh1 sequences used span exons 2-4. Nearly all variation is found in intron 2...... recovers both genera as monophyletic when knowledge of the duplication is incorporated in the analysis. The phylogenetic relationships within Secale are not clearly resolved. Subspecific taxa of Secale strictum have identical sequences and they are confined to a monophyletic group. However, the two...

  18. Genomic DNA fingerprinting of clinical Haemophilus influenzae isolates by polymerase chain reaction amplification: comparison with major outer-membrane protein and restriction fragment length polymorphism analysis

    NARCIS (Netherlands)

    van Belkum, A.; Duim, B.; Regelink, A.; Möller, L.; Quint, W.; van Alphen, L.

    1994-01-01

    Non-capsulate strains of Haemophilus influenzae were genotyped by analysis of variable DNA segments obtained by amplification of genomic DNA with the polymerase chain reaction (PCR fingerprinting). Discrete fragments of 100-2000 bp were obtained. The reproducibility of the procedure was assessed by

  19. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex changes and multiple forms of chromosomal instability in colorectal cancers

    DEFF Research Database (Denmark)

    Gaasenbeek, Michelle; Howarth, Kimberley; Rowan, Andrew J

    2006-01-01

    Cancers with chromosomal instability (CIN) are held to be aneuploid/polyploid with multiple large-scale gains/deletions, but the processes underlying CIN are unclear and different types of CIN might exist. We investigated colorectal cancer cell lines using array-comparative genomic hybridization...

  20. Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers

    DEFF Research Database (Denmark)

    Su, Guosheng; Christensen, Ole Fredslund; Ostersen, Tage

    2012-01-01

    of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1) a simple additive genetic model (MA), 2) a model including both additive and additive by additive epistatic genetic effects (MAE), 3) a model including both additive and dominance genetic effects...

  1. Evolution of the P-type II ATPase gene family in the fungi and presence of structural genomic changes among isolates of Glomus intraradices

    Directory of Open Access Journals (Sweden)

    Sanders Ian R

    2006-03-01

    Abstract Background The P-type II ATPase gene family encodes proteins with an important role in adaptation of the cell to variation in external K+, Ca2+ and Na+ concentrations. The presence of P-type II gene subfamilies that are specific for certain kingdoms has been reported but was sometimes contradicted by discovery of previously unknown homologous sequences in newly sequenced genomes. Members of this gene family have been sampled in all of the fungal phyla except the arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota), which are known to play a key role in terrestrial ecosystems and to be genetically highly variable within populations. Here we used highly degenerate primers on AMF genomic DNA to increase the sampling of fungal P-Type II ATPases and to test previous predictions about their evolution. In parallel, homologous sequences of the P-type II ATPases have been used to determine the nature and amount of polymorphism that is present at these loci among isolates of Glomus intraradices harvested from the same field. Results In this study, four P-type II ATPase sub-families have been isolated from three AMF species. We show that, contrary to previous predictions, P-type IIC ATPases are present in all basal fungal taxa. Additionally, P-Type IIE ATPases should no longer be considered as exclusive to the Ascomycota and the Basidiomycota, since we also demonstrate their presence in the Zygomycota. Finally, a comparison of homologous sequences encoding P-type IID ATPases showed unexpectedly that indel mutations among coding regions, as well as specific gene duplications, occur among AMF individuals within the same field. Conclusion On the basis of these results we suggest that the diversification of P-Type IIC and E ATPases followed the diversification of the extant fungal phyla with independent events of gene gains and losses. Consistent with recent findings on the human genome, but at a much smaller geographic scale, we provided evidence

  2. Drosophila Model for the Analysis of Genesis of LIM-kinase 1-Dependent Williams-Beuren Syndrome Cognitive Phenotypes: INDELs, Transposable Elements of the Tc1/Mariner Superfamily and MicroRNAs

    Directory of Open Access Journals (Sweden)

    Elena V. Savvateeva-Popova

    2017-09-01

    Genomic disorders, the syndromes with multiple manifestations, may occur sporadically due to unequal recombination in chromosomal regions with specific architecture. Therefore, each patient may carry an individual structural variant of DNA sequence (SV) with small insertions and deletions (INDELs), sometimes less than 10 bp. The transposable elements of the Tc1/mariner superfamily are often associated with hotspots for homologous recombination involved in human genetic disorders, such as Williams-Beuren syndrome (WBS) with LIM-kinase 1-dependent cognitive defects. The Drosophila melanogaster mutant agnts3 has unusual architecture of the agnostic locus harboring LIMK1: it is a hotspot of chromosome breaks, ectopic contacts, underreplication, and recombination. Here, we present the analysis of LIMK1-containing locus sequencing data in agnts3 and three D. melanogaster wild-type strains—Canton-S, Berlin, and Oregon-R. We found multiple strain-specific SVs, namely, single base changes and small INDELs. The specific feature of agnts3 is a 28 bp A/T-rich insertion in intron 1 of LIMK1 and the insertion of a mobile S-element from the Tc1/mariner superfamily residing ~460 bp downstream of the LIMK1 3′UTR. Neither of the SVs leads to amino acid substitutions in agnts3 LIMK1. However, they apparently affect the nucleosome distribution, non-canonical DNA structure formation and transcription factor binding. Interestingly, the overall expression of miRNAs, including the biomarkers for human neurological diseases, is drastically reduced in agnts3 relative to the wild-type strains. Thus, LIMK1 DNA structure per se, as well as the pronounced changes in total miRNAs profile, probably lead to LIMK1 dysregulation and complex behavioral dysfunctions observed in agnts3, making this mutant a simple plausible Drosophila model for WBS.

  3. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori

    2015-01-01

    . Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all
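
    Since this record mentions VCF downloads, a minimal Python sketch of how one data line of a VCF file can be sorted into SNPs and indels may be useful. It relies only on the standard tab-separated VCF column order (CHROM, POS, ID, REF, ALT, ...); the function names and the sample record are hypothetical and are not taken from OryzaGenome.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT allele pair by comparing allele lengths."""
    # Symbolic or missing ALT alleles (e.g. <DEL>, *, .) are not sized here.
    if alt.startswith("<") or alt in {"*", "."}:
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # equal length > 1, e.g. a multi-nucleotide substitution


def classify_vcf_line(line: str):
    """Return (chrom, pos, ref, alt, kind) for every ALT allele on a data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _variant_id, ref, alt_field = fields[:5]
    return [
        (chrom, int(pos), ref, alt, classify_variant(ref, alt))
        for alt in alt_field.split(",")  # multi-allelic sites list ALTs with commas
    ]


# Hypothetical data line, made up for illustration only.
example = "chr01\t100\t.\tA\tG,AT\t.\t.\t."
for record in classify_vcf_line(example):
    print(record)
```

    Header lines beginning with "#" should be skipped before calling classify_vcf_line.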

  4. Genomic resources for water yam (Dioscorea alata L.): analyses of EST-Sequences, De Novo sequencing and GBS libraries

    Science.gov (United States)

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources such as SSRs, SNPs and InDels in several model and non-model plant species. Yam (Dioscorea spp.) i...

  5. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes--rpoC2, ycf3, accD, and clpP--have high synonymous (Ks) values. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  6. Genomic profiling of thousands of candidate polymorphisms predicts risk of relapse in 778 Danish and German childhood acute lymphoblastic leukemia patients

    DEFF Research Database (Denmark)

    Wesolowska, Agata; Borst, L.; Dalgaard, Marlene Danner

    2015-01-01

    Childhood acute lymphoblastic leukemia survival approaches 90%. New strategies are needed to identify the 10–15% who evade cure. We applied targeted, sequencing-based genotyping of 25 000 to 34 000 preselected potentially clinically relevant single-nucleotide polymorphisms (SNPs) to identify host...... associated with risk of relapse across protocols. SNP and biologic pathway level analyses associated relapse risk with leukemia aggressiveness, glucocorticosteroid pharmacology/response and drug transport/metabolism pathways. Classification and regression tree analysis identified three distinct risk groups...... defined by end of induction residual leukemia, white blood cell count and variants in myeloperoxidase (MPO), estrogen receptor 1 (ESR1), lamin B1 (LMNB1) and matrix metalloproteinase-7 (MMP7) genes, ATP-binding cassette transporters and glucocorticosteroid transcription regulation pathways. Relapse rates...

  7. Ancestry informative markers: inference of ancestry in aged bone samples using an autosomal AIM-Indel multiplex.

    Science.gov (United States)

    Romanini, Carola; Romero, Magdalena; Salado Puerto, Mercedes; Catelli, Laura; Phillips, Christopher; Pereira, Rui; Gusmão, Leonor; Vullo, Carlos

    2015-05-01

    Ancestry informative markers (AIMs) can be useful to infer ancestry proportions of the donors of forensic evidence. The probability of success typing degraded samples, such as human skeletal remains, is strongly influenced by the DNA fragment lengths that can be amplified and the presence of PCR inhibitors. Several AIM panels are available amongst the many forensic marker sets developed for genotyping degraded DNA. Using a 46 AIM Insertion Deletion (Indel) multiplex, we analyzed human skeletal remains of post mortem time ranging from 35 to 60 years from four different continents (Sub-Saharan Africa, South and Central America, East Asia and Europe) to ascertain the genetic ancestry components. Samples belonging to non-admixed individuals could be assigned to their corresponding continental group. For the remaining samples with admixed ancestry, it was possible to estimate the proportion of co-ancestry components from the four reference population groups. The 46 AIM Indel set was informative enough to efficiently estimate the proportion of ancestry even in samples yielding partial profiles, a frequent occurrence when analyzing inhibited and/or degraded DNA extracts. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  8. Association and Genetic Identification of Loci for Four Fruit Traits in Tomato Using InDel Markers

    Directory of Open Access Journals (Sweden)

    Xiaoxi Liu

    2017-07-01

    Tomato (Solanum lycopersicum) fruit weight (FW), soluble solid content (SSC), fruit shape and fruit color are crucial for yield, quality and consumer acceptability. In this study, a tomato association panel of 192 accessions comprising a mixture of wild species, cherry tomato, landraces, and modern varieties collected worldwide was genotyped with 547 InDel markers evenly distributed on 12 chromosomes and scored for FW, SSC, fruit shape index (FSI), and color parameters over 2 years with three replications each year. The association panel was sorted into two subpopulations. Linkage disequilibrium ranged from 3.0 to 47.2 Mb across 12 chromosomes. A set of 102 markers significantly (p < 1.19–1.30 × 10−4) associated with SSC, FW, fruit shape, and fruit color was identified on 11 of the 12 chromosomes using a mixed linear model. The associations were compared with the known gene/QTLs for the same traits. Genetic analysis using F2 populations detected 14 and 4 markers significantly (p < 0.05) associated with SSC and FW, respectively. Some loci were commonly detected by both association and linkage analysis. Particularly, one novel locus for FW on chromosome 4 detected by association analysis was also identified in F2 populations. The results demonstrated that association mapping using a limited number of InDel markers and a relatively small population could not only complement and enhance previous QTL information, but also identify novel loci for marker-assisted selection of fruit traits in tomato.

  9. PeachVar-DB: A Curated Collection of Genetic Variations for the Interactive Analysis of Peach Genome Data.

    Science.gov (United States)

    Cirilli, Marco; Flati, Tiziano; Gioiosa, Silvia; Tagliaferri, Ilario; Ciacciulli, Angelo; Gao, Zhongshan; Gattolin, Stefano; Geuna, Filippo; Maggi, Francesco; Bottoni, Paolo; Rossini, Laura; Bassi, Daniele; Castrignanò, Tiziana; Chillemi, Giovanni

    2018-01-01

    Applying next-generation sequencing (NGS) technologies to species of agricultural interest has the potential to accelerate the understanding and exploration of genetic resources. The storage, availability and maintenance of huge quantities of NGS-generated data remain a major challenge. The PeachVar-DB portal, available at http://hpc-bioinformatics.cineca.it/peach, is an open-source catalog of genetic variants present in peach (Prunus persica L. Batsch) and wild related species of the genus Prunus, annotated from 146 samples publicly released on the Sequence Read Archive (SRA). We designed a user-friendly web-based interface of the database, providing search tools to retrieve single nucleotide polymorphism (SNP) and InDel variants, along with useful statistics and information. PeachVar-DB results are linked to the Genome Database for Rosaceae (GDR) and the Phytozome database to allow easy access to other external useful plant-oriented resources. In order to extend the genetic diversity covered by the PeachVar-DB further, and to allow increasingly powerful comparative analysis, we will progressively integrate newly released data. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  10. Application of a Combination of a Knowledge-Based Algorithm and 2-Stage Screening to Hypothesis-Free Genomic Data on Irinotecan-Treated Patients for Identification of a Candidate Single Nucleotide Polymorphism Related to an Adverse Effect

    Science.gov (United States)

    Takahashi, Hiro; Sai, Kimie; Saito, Yoshiro; Kaniwa, Nahoko; Matsumura, Yasuhiro; Hamaguchi, Tetsuya; Shimada, Yasuhiro; Ohtsu, Atsushi; Yoshino, Takayuki; Doi, Toshihiko; Okuda, Haruhiro; Ichinohe, Risa; Takahashi, Anna; Doi, Ayano; Odaka, Yoko; Okuyama, Misuzu; Saijo, Nagahiro; Sawada, Jun-ichi; Sakamoto, Hiromi; Yoshida, Teruhiko

    2014-01-01

    Interindividual variation in a drug response among patients is known to cause serious problems in medicine. Genomic information has been proposed as the basis for “personalized” health care. The genome-wide association study (GWAS) is a powerful technique for examining single nucleotide polymorphisms (SNPs) and their relationship with drug response variation; however, when using only GWAS, it often happens that no useful SNPs are identified due to multiple testing problems. Therefore, in a previous study, we proposed a combined method consisting of a knowledge-based algorithm, 2 stages of screening, and a permutation test for identifying SNPs. In the present study, we applied this method to a pharmacogenomics study where 109,365 SNPs were genotyped using Illumina Human-1 BeadChip in 168 cancer patients treated with irinotecan chemotherapy. We identified the SNP rs9351963 in potassium voltage-gated channel subfamily KQT member 5 (KCNQ5) as a candidate factor related to incidence of irinotecan-induced diarrhea. The p value for rs9351963 was 3.31 × 10^-5 in Fisher's exact test and 0.0289 in the permutation test (when multiple testing problems were corrected). Additionally, rs9351963 was clearly superior to the clinical parameters and the model involving rs9351963 showed sensitivity of 77.8% and specificity of 57.6% in the evaluation by means of logistic regression. Recent studies showed that KCNQ4 and KCNQ5 genes encode members of the M channel expressed in gastrointestinal smooth muscle and suggested that these genes are associated with irritable bowel syndrome and similar peristalsis diseases. These results suggest that rs9351963 in KCNQ5 is a possible predictive factor of incidence of diarrhea in cancer patients treated with irinotecan chemotherapy and for selecting chemotherapy regimens, such as irinotecan alone or a combination of irinotecan with a KCNQ5 opener. Nonetheless, clinical importance of rs9351963 should be further elucidated. PMID:25127363
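
    The abstract above reports a Fisher's exact test p value for a single SNP. As a rough illustration of that kind of test only, and not of the authors' knowledge-based algorithm, 2-stage screening or permutation step, the following sketch computes a two-sided Fisher's exact p value for a 2x2 table of genotype group by adverse-event status. The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Rows could be carriers / non-carriers of an allele, columns
    patients with / without the adverse event (hypothetical layout).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float rounding).
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    print(fisher_exact_two_sided(12, 3, 5, 10))
```

    With roughly 10^5 SNPs tested, a per-SNP p value like this one still has to be corrected for multiple testing, which is the role the abstract assigns to the permutation test.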

  11. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    Directory of Open Access Journals (Sweden)

    Walker M Andrew

    2006-09-01

    Background: The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results: There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10^-2 per base pair of DNA and the average INDEL frequency was 2.06 × 10^-2 per base pair of DNA. On average, 60.33% of the SNPs were synonymous while 39.67% were non-synonymous. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A subset of the strain-specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage-related functions. Conclusion: INDELs and strain specific genes

  12. Single Nucleotide Polymorphisms in B-Genome Specific UDP-Glucosyl Transferases Associated with Fusarium Head Blight Resistance and Reduced Deoxynivalenol Accumulation in Wheat Grain.

    Science.gov (United States)

    Sharma, Pallavi; Gangola, Manu P; Huang, Chen; Kutcher, H Randy; Ganeshan, Seedhabadee; Chibbar, Ravindra N

    2018-01-01

    An in vitro spike culture method was optimized to evaluate Fusarium head blight (FHB) resistance in wheat (Triticum aestivum) and used to screen a population of ethyl methane sulfonate treated spike culture-derived variants (SCDV). Of the 134 SCDV evaluated, the disease severity score of 47 of the variants was ≤30%. Single nucleotide polymorphisms (SNP) in the UDP-glucosyltransferase (UGT) genes, TaUGT-2B, TaUGT-3B, and TaUGT-EST, differed between AC Nanda (an FHB-susceptible wheat variety) and Sumai-3 (an FHB-resistant wheat cultivar). SNP at 450 and 1,558 bp from the translation initiation site in TaUGT-2B and TaUGT-3B, respectively were negatively correlated with FHB severity in the SCDV population, whereas the SNP in TaUGT-EST was not associated with FHB severity. Fusarium graminearum strain M7-07-1 induced early expression of TaUGT-2B and TaUGT-3B in FHB-resistant SCDV lines, which were associated with deoxynivalenol accumulation and reduced FHB disease progression. At 8 days after inoculation, deoxynivalenol concentration varied from 767 ppm in FHB-resistant variants to 2,576 ppm in FHB-susceptible variants. The FHB-resistant SCDV identified can be used as new sources of FHB resistance in wheat improvement programs.

  13. Accelerating Genome Editing in CHO Cells Using CRISPR Cas9 and CRISPy, a Web-Based Target Finding Tool

    DEFF Research Database (Denmark)

    Ronda, Carlotta; Pedersen, Lasse Ebdrup; Hansen, Henning Gram

    2014-01-01

    of the CRISPR Cas9 technology in CHO cells by generating site-specific gene disruptions in COSMC and FUT8, both of which encode proteins involved in glycosylation. The tested single guide RNAs (sgRNAs) created an indel frequency up to 47.3% in COSMC, while an indel frequency up to 99.7% in FUT8 was achieved...... mutations at the target sites, with a strong preference for single base indels. Finally, we have developed a user-friendly bioinformatics tool, named “CRISPy” for rapid identification of sgRNA target sequences in the CHO-K1 genome. The CRISPy tool identified 1,970,449 CRISPR targets divided into 27...

  14. A response to Yu et al. "A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array", BMC Bioinformatics 2007, 8: 145.

    Science.gov (United States)

    Rueda, Oscar M; Diaz-Uriarte, Ramon

    2007-10-16

    Yu et al. (BMC Bioinformatics 2007,8: 145+) have recently compared the performance of several methods for the detection of genomic amplification and deletion breakpoints using data from high-density single nucleotide polymorphism arrays. One of the methods compared is our non-homogenous Hidden Markov Model approach. Our approach uses Markov Chain Monte Carlo for inference, but Yu et al. ran the sampler for a severely insufficient number of iterations for a Markov Chain Monte Carlo-based method. Moreover, they did not use the appropriate reference level for the non-altered state. We rerun the analysis in Yu et al. using appropriate settings for both the Markov Chain Monte Carlo iterations and the reference level. Additionally, to show how easy it is to obtain answers to additional specific questions, we have added a new analysis targeted specifically to the detection of breakpoints. The reanalysis shows that the performance of our method is comparable to that of the other methods analyzed. In addition, we can provide probabilities of a given spot being a breakpoint, something unique among the methods examined. Markov Chain Monte Carlo methods require using a sufficient number of iterations before they can be assumed to yield samples from the distribution of interest. Running our method with too small a number of iterations cannot be representative of its performance. Moreover, our analysis shows how our original approach can be easily adapted to answer specific additional questions (e.g., identify edges).

  15. Genotyping by sequencing reveals the interspecific C. maxima / C. reticulata admixture along the genomes of modern citrus varieties of mandarins, tangors, tangelos, orangelos and grapefruits.

    Science.gov (United States)

    Oueslati, Amel; Salhi-Hannachi, Amel; Luro, François; Vignes, Hélène; Mournet, Pierre; Ollitrault, Patrick

    2017-01-01

    The mandarin horticultural group is an important component of world citrus production for the fresh fruit market. This group formerly classified as C. reticulata is highly polymorphic and recent molecular studies have suggested that numerous cultivated mandarins were introgressed by C. maxima (the pummelos). C. maxima and C. reticulata are also the ancestors of sweet and sour oranges, grapefruit, and therefore of all the "small citrus" modern varieties (mandarins, tangors, tangelos) derived from sexual hybridization between these horticultural groups. Recently, NGS technologies have greatly modified how plant evolution and genomic structure are analyzed, moving from phylogenetics to phylogenomics. The objective of this work was to develop a workflow for phylogenomic inference from Genotyping By Sequencing (GBS) data and to analyze the interspecific admixture along the nine citrus chromosomes for horticultural groups and recent varieties resulting from the combination of the C. reticulata and C. maxima gene pools. A GBS library was established from 55 citrus varieties, using the ApeKI restriction enzyme and selective PCR to improve the read depth. Diagnostic polymorphisms (DPs) of C. reticulata/C. maxima differentiation were identified and used to decipher the phylogenomic structure of the 55 varieties. The GBS approach was powerful and revealed 30,289 SNPs and 8,794 Indels with 12.6% of missing data. 11,133 DPs were selected covering the nine chromosomes with a higher density in genic regions. GBS combined with the detection of DPs was powerful for deciphering the "phylogenomic karyotypes" of cultivars derived from admixture of the two ancestral species after a limited number of interspecific recombinations. All the mandarins, mandarin hybrids, tangelos and tangors analyzed displayed introgression of C. maxima in different parts of the genome. C. reticulata/C. maxima admixture should be a major component of the high phenotypic variability of this germplasm opening

  16. Polymorphic Contracts

    Science.gov (United States)

    Belo, João Filipe; Greenberg, Michael; Igarashi, Atsushi; Pierce, Benjamin C.

    Manifest contracts track precise properties by refining types with predicates - e.g., {x : Int |x > 0 } denotes the positive integers. Contracts and polymorphism make a natural combination: programmers can give strong contracts to abstract types, precisely stating pre- and post-conditions while hiding implementation details - for example, an abstract type of stacks might specify that the pop operation has input type {x :α Stack |not ( empty x )} . We formalize this combination by defining FH, a polymorphic calculus with manifest contracts, and establishing fundamental properties including type soundness and relational parametricity. Our development relies on a significant technical improvement over earlier presentations of contracts: instead of introducing a denotational model to break a problematic circularity between typing, subtyping, and evaluation, we develop the metatheory of contracts in a completely syntactic fashion, omitting subtyping from the core system and recovering it post facto as a derived property.
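
    The refinement types mentioned above, such as {x : Int | x > 0} and the non-empty-stack precondition on pop, can be loosely imitated with run-time checks. The sketch below is only an informal analogy in Python, not the FH calculus from the abstract; `refine`, `ContractViolation`, `positive_int`, `non_empty` and the `Stack` class are all hypothetical names introduced for illustration.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class ContractViolation(Exception):
    """Raised when a value fails the predicate of a refinement."""


def refine(predicate: Callable[[T], bool], label: str) -> Callable[[T], T]:
    """Build a run-time cast into the refinement described by `label`."""
    def cast(value: T) -> T:
        # The check happens when the cast runs, not at type-check time.
        if not predicate(value):
            raise ContractViolation(f"{value!r} does not satisfy {label}")
        return value
    return cast


# Analogue of {x : Int | x > 0}
positive_int = refine(lambda x: isinstance(x, int) and x > 0,
                      "{x : Int | x > 0}")

# Analogue of {x : a Stack | not (empty x)}
non_empty = refine(lambda s: not s.empty(),
                   "{x : a Stack | not (empty x)}")


class Stack(Generic[T]):
    """Abstract stack whose pop carries a non-emptiness precondition."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def empty(self) -> bool:
        return not self._items

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        non_empty(self)  # precondition stated by the contract
        return self._items.pop()
```

    Callers see only the contract on pop, not the list that backs the stack, which mirrors the abstract's point about stating strong contracts while hiding implementation details.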

  17. Genomic diversity and affinities in population groups of North West India: an analysis of Alu insertion and a single nucleotide polymorphism.

    Science.gov (United States)

    Saini, J S; Kumar, A; Matharoo, K; Sokhi, J; Badaruddoza; Bhanwer, A J S

    2012-12-15

    The North West region of India is extremely important to understand the peopling of India, as it acted as a corridor to the foreign invaders from Eurasia and Central Asia. A series of these invasions along with multiple migrations led to intermixture of variable populations, strongly contributing to genetic variations. The present investigation was designed to explore the genetic diversities and affinities among the five major ethnic groups from North West India; Brahmin, Jat Sikh, Bania, Rajput and Gujjar. A total of 327 individuals of the abovementioned ethnic groups were analyzed for 4 Alu insertion marker loci (ACE, PV92, APO and D1) and a Single Nucleotide Polymorphism (SNP) rs2234693 in the intronic region of the ESR1 gene. Statistical analysis was performed to interpret the genetic structure and diversity of the population groups. Genotypes for ACE, APO, ESR1 and PV92 loci were found to be in Hardy-Weinberg equilibrium in all the ethnic groups, while significant departures were observed at the D1 locus in every investigated population after Bonferroni's correction. The average heterozygosity for all the loci in these ethnic groups was fairly substantial ranging from 0.3927 ± 0.1877 to 0.4333 ± 0.1416. Inbreeding coefficient indicated an overall 10% decrease in heterozygosity in these North West Indian populations. The gene differentiation among the populations was observed to be of the order of 0.013. Genetic distance estimates revealed that Gujjars were close to Banias and Jat Sikhs were close to Rajputs. Overall the study favored the recent division of the populations of North West India into largely endogamous groups. It was observed that the populations of North West India represent a more or less homogenous genetic entity, owing to their common ancestral history as well as geographical proximity. Copyright © 2012 Elsevier B.V. All rights reserved.
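
    The Hardy-Weinberg, heterozygosity and inbreeding statistics named in this abstract can be illustrated for a single biallelic locus, for example the insertion (present/absent) alleles of an Alu marker. This is a minimal sketch with made-up genotype counts, not the authors' analysis: `hwe_summary` is a hypothetical helper, and the Bonferroni correction across loci and populations mentioned above is not shown.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Summarise one biallelic locus from genotype counts.

    Returns the frequency of allele A, the chi-square statistic for
    departure from Hardy-Weinberg proportions (1 degree of freedom),
    observed and expected heterozygosity, and the inbreeding
    coefficient F = 1 - H_obs / H_exp.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip(observed, expected) if e > 0)

    h_obs = n_ab / n
    h_exp = 2 * p * q
    f = 1 - h_obs / h_exp if h_exp > 0 else 0.0
    return {"p": p, "chi2": chi2, "h_obs": h_obs, "h_exp": h_exp, "F": f}


if __name__ == "__main__":
    # Hypothetical counts; chi2 above 3.841 would reject HWE at the
    # 0.05 level before any multiple-testing correction.
    print(hwe_summary(30, 40, 30))
```

    A positive F means fewer heterozygotes than Hardy-Weinberg proportions predict, which is the sense in which the abstract reports an overall decrease in heterozygosity.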

  18. Oligonucleotide array discovery of polymorphisms in cultivated tomato (Solanum lycopersicum L.) reveals patterns of SNP variation associated with breeding

    Directory of Open Access Journals (Sweden)

    Zhu Tong

    2009-10-01

    Background: Cultivated tomato (Solanum lycopersicum L.) has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP) discovery as a high-throughput approach for marker development in cultivated tomato. Results: Three varieties, FL7600 (fresh-market), OH9242 (processing), and PI114490 (cherry), were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α) used to define the confidence interval (CI), and ranged from 76% for polymorphisms identified at α ≤ 10^-6 to 60% for those identified at α ≤ 10^-2. Validation percentage reached a plateau between α ≤ 10^-4 and α ≤ 10^-7, but failure to identify known SFPs (Type II error) increased dramatically at α ≤ 10^-6. Through sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst) suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace), and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka) to synonymous substitutions (Ks) for 20 loci with multiple SNPs (≥ 4 per

  19. The pattern of polymorphism in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    2005-07-01

    We resequenced 876 short fragments in a sample of 96 individuals of Arabidopsis thaliana that included stock center accessions as well as a hierarchical sample from natural populations. Although A. thaliana is a selfing weed, the pattern of polymorphism in general agrees with what is expected for a widely distributed, sexually reproducing species. Linkage disequilibrium decays rapidly, within 50 kb. Variation is shared worldwide, although population structure and isolation by distance are evident. The data fail to fit standard neutral models in several ways. There is a genome-wide excess of rare alleles, at least partially due to selection. There is too much variation between genomic regions in the level of polymorphism. The local level of polymorphism is negatively correlated with gene density and positively correlated with segmental duplications. Because the data do not fit theoretical null distributions, attempts to infer natural selection from polymorphism data will require genome-wide surveys of polymorphism in order to identify anomalous regions. Despite this, our data support the utility of A. thaliana as a model for evolutionary functional genomics.

  20. Using genomic data to unravel the root of the placental mammal phylogeny.

    Science.gov (United States)

    Murphy, William J; Pringle, Thomas H; Crider, Tess A; Springer, Mark S; Miller, Webb

    2007-04-01

    The phylogeny of placental mammals is a critical framework for choosing future genome sequencing targets and for resolving the ancestral mammalian genome at the nucleotide level. Despite considerable recent progress defining superordinal relationships, several branches remain poorly resolved, including the root of the placental tree. Here we analyzed the genome sequence assemblies of human, armadillo, elephant, and opossum to identify informative coding indels that would serve as rare genomic changes to infer early events in placental mammal phylogeny. We also expanded our species sampling by including sequence data from >30 ongoing genome projects, followed by PCR and sequencing validation of each indel in additional taxa. Our data provide support for a sister-group relationship between Afrotheria and Xenarthra (the Atlantogenata hypothesis), which is in turn the sister-taxon to Boreoeutheria. We failed to recover any indels in support of a basal position for Xenarthra (Epitheria), which is suggested by morphology and a recent retroposon analysis, or a hypothesis with Afrotheria basal (Exafricoplacentalia), which is favored by phylogenetic analysis of large nuclear gene data sets. In addition, we identified two retroposon insertions that also support Atlantogenata and none for the alternative hypotheses. A revised molecular timescale based on these phylogenetic inferences suggests Afrotheria and Xenarthra diverged from other placental mammals approximately 103 (95-114) million years ago. We discuss the impacts of this topology on earlier phylogenetic reconstructions and repeat-based inferences of phylogeny.

  1. Targeted Porcine Genome Engineering with TALENs

    DEFF Research Database (Denmark)

    Luo, Yonglun; Lin, Lin; Golas, Mariola Monika

    2015-01-01

    confers precisely editing (e.g., mutations or indels) or insertion of a functional transgenic cassette to user-designed loci. Techniques for targeted genome engineering are growing dramatically and include, e.g., zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs......, including construction of sequence-specific TALENs, delivery of TALENs into primary porcine fibroblasts, and detection of TALEN-mediated cleavage, is described. This chapter is useful for scientists who are inexperienced with TALEN engineering of porcine cells as well as of other large animals....

  2. Genomewide variation in an introgression line of rice-Zizania revealed by whole-genome re-sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen-Hui Wang

    BACKGROUND: Hybridization between genetically diverged organisms is known as an important avenue that drives plant genome evolution. The possible outcomes of hybridization would be the occurrences of genetic instabilities in the resultant hybrids. It remained under-investigated, however, whether pollination by alien pollens of a closely related but sexually "incompatible" species could evoke genomic changes and to what extent it may result in phenotypic novelties in the derived progenies. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we have re-sequenced the genomes of Oryza sativa ssp. japonica cv. Matsumae and one of its derived introgressants, RZ35, which was obtained from an introgressive hybridization between Matsumae and Zizania latifolia Griseb. In general, 131 million 90 base pair (bp) paired-end reads were generated, which covered 13.2 and 21.9 folds of the Matsumae and RZ35 genomes, respectively. Relative to Matsumae, a total of 41,724 homozygous single nucleotide polymorphisms (SNPs) and 17,839 homozygous insertions/deletions (indels) were identified in RZ35, of which 3,797 SNPs were nonsynonymous mutations. Furthermore, rampant mobilization of transposable elements (TEs) was found in the RZ35 genome. The results of pathogen inoculation revealed that RZ35 exhibited enhanced resistance to blast relative to Matsumae. Notably, one nonsynonymous mutation was found in the known blast resistance gene Pid3/Pi25 and real-time quantitative (q)RT-PCR analysis revealed constitutive up-regulation of its expression, suggesting both altered function and expression of Pid3/Pi25 may be responsible for the enhanced resistance to rice blast by RZ35. CONCLUSIONS/SIGNIFICANCE: Our results demonstrate that introgressive hybridization by Zizania has provoked genomewide, extensive genomic changes in the rice genome, some of which have resulted in important phenotypic novelties. These findings suggest that introgressive hybridization by alien pollens of even a

  3. [Development of indel markers for molecular authentication of Panax ginseng and P. quinquefolius].

    Science.gov (United States)

    Wang, Rong-Bo; Tian, Hui-Li; Wang, Hong-Tao; Li, Gui-Sheng

    2018-04-01

    Panax ginseng and P. quinquefolius are two kinds of important medicinal herbs. They are morphologically similar but have different pharmacological effects. Therefore, botanical origin authentication of these two ginsengs is of great importance for ensuring pharmaceutical efficacy and food safety. Based on the fact that intron position in orthologous genes is highly conserved across plant species, intron length polymorphisms were exploited from unigenes of ginseng. Specific primers were respectively designed for these two species based on their insertion/deletion sequences of cytochrome P450 and glyceraldehyde 3-phosphate dehydrogenase, and multiplex PCR was conducted for molecular authentication of P. ginseng and P. quinquefolius. The results showed that the developed multiplex PCR assay was effective for molecular authentication of P. ginseng and P. quinquefolius without strict PCR condition and the optimization of reaction system. This study provides a preferred ideal marker system for molecular authentication of ginseng, and the presented method can be employed in origin authentication of other herbal preparations. Copyright © by the Chinese Pharmaceutical Association.

  4. Analysis of three polymorphisms in Bidayuh ethnic of Sarawak ...

    African Journals Online (AJOL)

    Insertion/deletion polymorphism of YAP (DYS287), M96 and M120 polymorphisms in Bidayuh ethnic populations of Sarawak, Malaysia were analyzed in this study. Genomic DNA was extracted from 180 buccal samples and amplified by Hot-Start PCR method. The amplified PCR products were separated by using 2% ...

  5. Targeted Porcine Genome Engineering with TALENs

    DEFF Research Database (Denmark)

    Luo, Yonglun; Lin, Lin; Golas, Mariola Monika

    2015-01-01

    Genetically modified pigs are becoming an invaluable animal model for agricultural, pharmaceutical, and biomedical applications. Unlike traditional transgenesis, which is accomplished by randomly inserting an exogenous transgene cassette into the natural chromosomal context, targeted genome editing...... confers precisely editing (e.g., mutations or indels) or insertion of a functional transgenic cassette to user-designed loci. Techniques for targeted genome engineering are growing dramatically and include, e.g., zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs......), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems. These systems provide enormous potential applications. In this chapter, we review the use of TALENs for targeted genome editing with focus on their application in pigs. In addition, a brief protocol...

  6. Prospects for Genomic Research in Forestry

    Directory of Open Access Journals (Sweden)

    K. V. Krutovsky

    2014-08-01

    Conifers are keystone species of boreal forests. Their whole genome sequencing, assembly and annotation will allow us to understand the evolution of the complex ancient giant conifer genomes that are 4 times larger in larch and 7–9 times larger in pines than the human genome. Genomic studies will also yield important whole genome sequence data and allow the development of highly polymorphic and informative genetic markers, such as microsatellites and single nucleotide polymorphisms (SNPs), that can be efficiently used in timber origin identification, genetic variation monitoring, studies of local and climate change adaptation, and tree improvement and conservation programs.

  7. Prevalence of single nucleotide polymorphism among 27 diverse alfalfa genotypes as assessed by transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Li Xuehui

    2012-10-01

    Background: Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species. Results: The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated. Conclusion: We used transcriptome sequencing to discover large numbers of SNPs

  8. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  9. Evaluation of a Panel of Single-Nucleotide Polymorphisms in miR-146a and miR-196a2 Genomic Regions in Patients with Chronic Periodontitis.

    Science.gov (United States)

    Venugopal, Priyanka; Lavu, Vamsi; RangaRao, Suresh; Venkatesan, Vettriselvi

    2017-04-01

    Periodontitis is an inflammatory disease caused by bacterial triggering of the host immune-inflammatory response, which in turn is regulated by microRNAs (miRNA). Polymorphisms in the miRNA pathways affect the expression of several target genes such as tumor necrosis factor-α and interleukins, which are associated with progression of disease. The objective of this study was to identify the association between the MiR-146a single nucleotide polymorphisms (SNPs) (rs2910164, rs57095329, and rs73318382), the MiR-196a2 (rs11614913) SNP and chronic periodontitis. Genotyping was performed for the MiR-146a (rs2910164, rs57095329, and rs73318382) and the MiR-196a2 (rs11614913) polymorphisms in 180 healthy controls and 190 cases of chronic periodontitis by the direct Sanger sequencing technique. The strength of the association between the polymorphisms and chronic periodontitis was evaluated using logistic regression analysis. Haplotype and linkage analyses among the polymorphisms was performed. Multifactorial dimensionality reduction was performed to determine epistatic interaction among the polymorphisms. The MiR-196a2 polymorphism revealed a significant inverse association with chronic periodontitis. Haplotype analysis of MiR-146a and MiR-196a2 polymorphisms revealed 13 different combinations, of which 5 were found to have an inverse association with chronic periodontitis. The present study has demonstrated a significant inverse association of MiR-196a2 polymorphism with chronic periodontitis.

  10. A validated pipeline for detection of SNVs and short InDels from RNA Sequencing

    Directory of Open Access Journals (Sweden)

    Nitin Mandloi

    2017-12-01

    In this study, we have developed a pipeline to detect germline variants from RNA-seq data. The pipeline steps include: pre-processing, alignment, GATK best practices for RNA-seq and variant filtering. The pre-processing step includes base and adapter trimming and removal of contamination reads from rRNA, tRNA, mitochondrial DNA and repeat regions. The read alignment of the pre-processed reads is performed using STAR/HiSAT. After this we used GATK best practices for the RNA-seq dataset to call germline variants. We benchmarked our pipeline on NA12878 RNA-seq data downloaded from SRA (SRR1258218). After variant calling, the quality passed variants were compared against the gold standard variants provided by GIAB consortium. Of the total ~3.6 million high quality variants reported as gold standard variants for this sample (considering whole genome), our pipeline identified ~58,104 variants to be expressed in RNA-seq. Our pipeline achieved more than 99% sensitivity in detection of germline variants.
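
    The benchmarking step described above (comparing quality-passed calls against a gold-standard set) can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the `Variant` tuple, the function name, and the toy variants are hypothetical, and it assumes both call sets have already been normalized to the same representation and restricted to the same regions.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, 1-based position, alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def sensitivity_and_precision(called: Set[Variant],
                              truth: Set[Variant]) -> Tuple[float, float]:
    """Compare a call set with a truth set by exact match on (chrom, pos, ref, alt)."""
    true_pos = len(called & truth)    # calls confirmed by the truth set
    false_neg = len(truth - called)   # truth variants that were missed
    false_pos = len(called - truth)   # calls absent from the truth set
    sensitivity = true_pos / (true_pos + false_neg) if truth else 0.0
    precision = true_pos / (true_pos + false_pos) if called else 0.0
    return sensitivity, precision


# Toy data (invented for illustration only).
truth_set = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "CT", "C"),   # short deletion
    Variant("chr2", 75, "G", "GA"),    # short insertion
    Variant("chr2", 900, "T", "C"),
}
call_set = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "CT", "C"),
    Variant("chr2", 75, "G", "GA"),
    Variant("chr3", 10, "C", "T"),     # not in the truth set
}

sens, prec = sensitivity_and_precision(call_set, truth_set)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

    Exact matching on the four fields is the simplest possible comparison; indels in particular often need left-alignment before such a comparison is meaningful.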

  11. Polymorphisms in folate metabolism genes are associated with susceptibility to presbycusis.

    Science.gov (United States)

    Manche, Santoshi Kumari; Jangala, Madhavi; Dudekula, Dinesh; Koralla, Meganadh; Akka, Jyothy

    2018-03-01

    Presbycusis or age related hearing loss is caused by several extrinsic and intrinsic factors that damage the auditory system. Gene polymorphisms in folate metabolism were found to play an important role in the etiology of presbycusis. The present study aimed to investigate the role of 5,10-methylenetetrahydrofolate reductase (MTHFR), methionine synthase (MTR) and thymidylate synthase (TYMS) gene polymorphisms in the onset of presbycusis in a South Indian population. A total of 220 subjects confirmed with presbycusis along with 270 age and sex matched healthy controls visiting MAA ENT Hospitals, Hyderabad, India were enrolled for the study. Genotyping of MTHFR C677T (rs1801133) and A1298C (rs1801131), MTR A2756G (rs1805087), TSER (rs1801136) and TS1494indel6 bp (rs16430) was carried out using PCR & PCR-RFLP methods. The 'TT' genotype of MTHFR C677T and '152 bp/152 bp' genotype of TS1494indel6 bp showed statistically significant risk for presbycusis while CC genotype of MTHFR A1298C, '2R/2R' genotype of TSER at 3'UTR and 6 bp ins/6 bp ins of TYMS at 5'UTR were found to be protective. The T-A-A haplotype combination of MTHFR C677T, MTHFR A1298C and MTR A2756G as well as 3R- 152 bp of TYMS at 5'UTR and 3'UTR were also found to contribute significant risk for the onset of presbycusis. Further, the combination of SNP loci TSER: TS1494indel6 bp exhibited moderate linkage in presbycusis. The present pilot study identified the significant association of gene variants of MTHFR and TYMS with presbycusis. These findings aid in early diagnosis of hearing loss in the elderly population. Copyright © 2018 Elsevier Inc. All rights reserved.

  12. Endothelial nitric oxide synthase gene polymorphisms associated ...

    African Journals Online (AJOL)

    Endothelial nitric oxide synthase (NOS3) is involved in key steps of immune response. Genetic factors predispose individuals to periodontal disease. This study's aim was to explore the association between NOS3 gene polymorphisms and clinical parameters in patients with periodontal disease. Genomic DNA was obtained ...

  13. [Association analysis of SNP-63 and indel-19 variant in the calpain-10 gene with polycystic ovary syndrome in women of reproductive age].

    Science.gov (United States)

    Flores-Martínez, Silvia Esperanza; Castro-Martínez, Anna Gabriela; López-Quintero, Andrés; García-Zapién, Alejandra Guadalupe; Torres-Rodríguez, Ruth Noemí; Sánchez-Corona, José

    2015-01-01

    Polycystic ovary syndrome is a complex and heterogeneous disease involving both reproductive and metabolic problems. A genetic predisposition has been suggested in the etiology of this syndrome. The identification of the calpain-10 gene (CAPN10) as the first candidate gene for type 2 diabetes mellitus has focused interest on investigating its possible relation with polycystic ovary syndrome, because this syndrome is associated with hyperinsulinemia and insulin resistance, two metabolic abnormalities associated with type 2 diabetes mellitus. The objective was to investigate whether there is an association between SNP-63 and the indel-19 variant of the CAPN10 gene and polycystic ovary syndrome in women of reproductive age. This study included 101 women (55 with polycystic ovary syndrome and 46 without polycystic ovary syndrome). The indel-19 variant was identified by electrophoresis of PCR-amplified fragments, and SNP-63 by PCR-RFLP. The allele and genotype frequencies of the two variants did not differ significantly between women with polycystic ovary syndrome and the control group. Haplotype 21 (defined by the insertion allele of the indel-19 variant and the C allele of SNP-63) was found at higher frequency in both study groups, being more frequent in the polycystic ovary syndrome group; however, this difference was not statistically significant (p = 0.8353). The results suggest that SNP-63 and the indel-19 variant of the CAPN10 gene do not represent a risk factor for polycystic ovary syndrome in our patient group. Copyright © 2015. Published by Masson Doyma México S.A.

  14. Determination of the frequency of polymorphisms in genes related to the genome stability maintenance of the population residing at Monte Alegre, PA (Brazil) municipality; Determinacao da frequencia de polimorfismos em genes relacionados a manutencao da estabilidade do genoma na populacao residente no municipio de Monte Alegre, PA

    Energy Technology Data Exchange (ETDEWEB)

    Hozumi, Cristiny Gomes

    2010-07-01

    Human exposure to ionizing radiation from natural sources is an inherent feature of life on Earth, for humans and all living things have always been exposed to these sources. Ionizing radiation is a known genotoxic agent that can affect genomic stability, and genes related to DNA repair may have their function compromised when they carry certain polymorphisms. This study aimed to analyze the frequency of single nucleotide polymorphisms (SNPs) in genes of DNA repair and cell cycle control, hOGG1 (Ser326Cys), XRCC3 (Thr241Met) and p53 (Arg72Pro), in saliva samples from a population located in Monte Alegre, state of Pará. Samples were collected in August 2008: 40 from men and 46 from women, for a total of 86 samples. The frequencies of homozygous and/or heterozygous genotypes for the polymorphic genes were determined by RFLP. The 326Cys allele of hOGG1 had a frequency of 5%, the 241Met allele of XRCC3 about 21%, and the 72Pro allele of p53 40.8%. The genotype frequencies for hOGG1, XRCC3 and p53, respectively, were 91.04%, 88.06% and 59.7% for the homozygous wild-type genotype; 5.97%, 11.94% and 22.39% for the heterozygous genotype; and 2.99%, zero and 17.91% for the homozygous polymorphic genotype. These values are similar to those found in previous studies. The influence of these polymorphisms, which are involved in DNA repair, and the consequent genotoxicity induced by radiation depend on dose and on exposure factors such as smoking, which is statistically a factor in public health surveillance in the region. This study gathered molecular epidemiology information in Monte Alegre that helps to characterize the local population. (author)
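
    For readers unfamiliar with how allele frequencies follow from genotype counts, a minimal sketch is given here. The counts are invented for illustration and are not taken from the study; the function names are hypothetical, and a biallelic locus with fully genotyped individuals is assumed.

```python
from typing import Tuple


def allele_frequencies(n_hom_ref: int, n_het: int, n_hom_alt: int) -> Tuple[float, float]:
    """Return (p, q): frequencies of the reference and variant alleles.

    Each individual carries two alleles, so the variant allele count is
    2 * (homozygous variant) + 1 * (heterozygous), out of 2 * N alleles.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotyped individuals")
    q = (2 * n_hom_alt + n_het) / (2 * n)
    return 1.0 - q, q


def hardy_weinberg_expected(n: int, q: float) -> Tuple[float, float, float]:
    """Expected genotype counts (hom ref, het, hom alt) under Hardy-Weinberg."""
    p = 1.0 - q
    return n * p * p, n * 2 * p * q, n * q * q


# Hypothetical counts: 60 wild-type homozygotes, 30 heterozygotes, 10 variant homozygotes.
p, q = allele_frequencies(60, 30, 10)
print(f"p={p:.3f} q={q:.3f}")
print(hardy_weinberg_expected(100, q))
```

    Comparing observed genotype counts with the Hardy-Weinberg expectation is a common sanity check on RFLP genotyping data, although the abstract does not say whether it was applied here.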

  15. [Polymorphism in the Serotonin Transporter Gene (SLC6A4) and Emotional Bipolar Disorder in Two Regional Mental Health Centers from the Eje Cafetero (Colombia)].

    Science.gov (United States)

    Ramos, Lucero Rengifo; Arias, Duverney Gaviria; Salazar, Liliana Salazar; Vélez, Juan Pablo; Pardo, Stella Lozano

    2012-03-01

    The indel polymorphisms in the promoter region and the 2nd intron polymorphisms in the serotonin transporter gene (SLC6A4) have been associated with bipolar disorder 1 (BD1) in several population studies. The objective was to analyze the genotypic and allelic frequencies in both gene regions in a case-control study with individuals from Risaralda and Quindío (Colombia), so as to establish possible associations with BD1 and compare the results with previous, similar studies. 133 patients and 120 controls were studied. L and S indel polymorphisms in the promoter region were analyzed by PCR, together with the STin2.10 and STin2.12 VNTR polymorphisms in the 2nd intron of the SLC6A4 gene. Genotypic and allelic frequencies for the S and L polymorphisms were similar in cases and controls. However, the LL genotype was significantly increased both in the BD1 population (OR=1.89; 95% CI=1.1-3.68) and when discriminated by gender: for this genotype, OR=2.22 (95% CI=1.04-5.66) for women and OR=1.62 (95% CI=0.71-4.39) for men. No significant genotypic and allelic differences were found for the STin2.10 and STin2.12 VNTR polymorphisms. No association was found between the 5-HTTLPR or 2nd intron polymorphisms of the serotonin transporter gene and BD1 in patients overall, nor when compared by gender. Our results are similar to those reported for Caucasian populations and differ from those of Asian and Brazilian populations. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
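
    The odds ratios and 95% confidence intervals quoted above come from 2x2 genotype tables. The sketch below uses the common Woolf (log-normal) approximation; this is an assumption, since the abstract does not state which interval method was used, and the counts are invented for illustration.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with an approximate confidence interval (Woolf method).

    Table layout (hypothetical example):
                    genotype present   genotype absent
        cases              a                  b
        controls           c                  d
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR={or_:.2f}, 95% CI={lo:.2f}-{hi:.2f}")
```

    An interval that includes 1, such as the one reported for men above, means the data are compatible with no association at the chosen confidence level.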

  16. QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species

    Directory of Open Access Journals (Sweden)

    Voorrips Roeland E

    2006-10-01

    Background: Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only. Results: We have developed a new algorithm to detect reliable SNPs and insertions/deletions (indels) in EST data, both with and without quality files. Implemented in a pipeline called QualitySNP, it uses three filters for the identification of reliable SNPs. Filter 1 screens for all potential SNPs and identifies variation between or within genotypes. Filter 2 is the core filter that uses a haplotype-based strategy to detect reliable SNPs. Clusters with potential paralogs as well as false SNPs caused by sequencing errors are identified. Filter 3 screens SNPs by calculating a confidence score, based upon sequence redundancy and quality. Non-synonymous SNPs are subsequently identified by detecting open reading frames of consensus sequences (contigs) with SNPs. The pipeline includes a data storage and retrieval system for haplotypes, SNPs and alignments. QualitySNP's versatility is demonstrated by the identification of SNPs in EST datasets from potato, chicken and humans. Conclusion: QualitySNP is an efficient tool for SNP detection, storage and retrieval in diploid as well as polyploid species. It is available for running on Linux or UNIX systems. The program, test data, and user manual are available at

  17. How to interpret Methylation Sensitive Amplified Polymorphism (MSAP) profiles?

    OpenAIRE

    Fulneček, Jaroslav; Kovařík, Aleš

    2014-01-01

    Background DNA methylation plays a key role in development, contributes to genome stability, and may also respond to external factors supporting adaptation and evolution. To connect different types of stimuli with particular biological processes, identifying genome regions with altered 5-methylcytosine distribution at a genome-wide scale is important. Many researchers are using the simple, reliable, and relatively inexpensive Methylation Sensitive Amplified Polymorphism (MSAP) method that is ...

  18. Exp2 polymorphisms associated with variation for fiber quality properties in cotton (Gossypium spp.)

    Directory of Open Access Journals (Sweden)

    Daohua He

    2014-10-01

    Plant expansins are a group of extracellular proteins thought to affect the quality of cotton fibers. Previous expression profile analysis revealed that six Expansin A genes are present in cotton, of which two (GhExp1 and GhExp2) produce transcripts that are specific to the developing cotton fiber. To identify the phenotypic function of Exp2, and to determine whether nucleotide variation among alleles of Exp2 affects fiber quality, candidate gene association mapping was conducted. Gene-specific primers were designed to amplify the Exp2 gene. By amplicon sequencing, the nucleotide diversity of Exp2 was investigated across 92 accessions (including 7 Gossypium arboreum, 74 Gossypium hirsutum, and 11 Gossypium barbadense accessions) with different fiber qualities. Twenty-six SNPs and seven InDels including 14 from the coding region of Exp2 were detected, forming twelve distinct haplotypes in the cotton collection. Among the 14 SNPs in the coding region, five were missense mutations and nine were synonymous nucleotide changes. The average SNP/InDel per nucleotide ratio was 2.61% (one SNP per 39 bp), with 1.81 and 3.87% occurring in coding and non-coding regions, respectively. Nucleotide and haplotype diversity across the entire Exp2 region was 0.00603 (π) and 0.844, respectively, and diversity in non-coding regions was higher than that in coding regions. For linkage disequilibrium (LD), the mean r2 value for all polymorphism loci pairs was 0.48, and LD did not decay over 748 bp. Based on 132 simple sequence repeat (SSR) loci evenly covering 26 chromosomes, the population structure was estimated, and the accessions were divided into seven groups that agreed well with their genomic origin and evolutionary history. A general linear model was used to calculate the Exp2-wide diversity–trait associations of 5 fiber quality traits, considering population structure (Q). Four SNPs in Exp2 were associated with at least one of the fiber quality traits, but not with
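
    The two summary statistics quoted above, nucleotide diversity (π) and the LD measure r2, can be illustrated with a short sketch. It makes several simplifying assumptions that the abstract does not confirm: sequences are phased haplotypes of equal length, gaps are ignored, and the sites compared for r2 are biallelic. The function names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


def r_squared(haplotypes: Sequence[str], i: int, j: int) -> float:
    """LD between biallelic sites i and j: r^2 = D^2 / (pA(1-pA) pB(1-pB))."""
    n = len(haplotypes)
    allele_a = haplotypes[0][i]   # arbitrary reference allele at site i
    allele_b = haplotypes[0][j]   # arbitrary reference allele at site j
    p_a = sum(h[i] == allele_a for h in haplotypes) / n
    p_b = sum(h[j] == allele_b for h in haplotypes) / n
    p_ab = sum(h[i] == allele_a and h[j] == allele_b for h in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


toy = ["ACGT", "ACGA", "TCGA", "TCGT"]        # invented haplotypes
print(nucleotide_diversity(toy))               # 8 differences / (6 pairs * 4 sites)
print(r_squared(toy, 0, 3))                    # sites 0 and 3 are independent here
print(r_squared(["AT", "AT", "CG", "CG"], 0, 1))  # perfectly linked sites
```

    Averaging `r_squared` over all pairs of polymorphic sites would give a mean r2 of the kind reported in the abstract, under the assumptions listed.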

  19. Comparative Genomics of Mycoplasma bovis Strains Reveals That Decreased Virulence with Increasing Passages Might Correlate with Potential Virulence-Related Factors

    Directory of Open Access Journals (Sweden)

    Muhammad A. Rasheed

    2017-05-01

    Mycoplasma bovis is an important cause of bovine respiratory disease worldwide. To understand its virulence mechanisms, we sequenced three attenuated M. bovis strains, P115, P150, and P180, which were passaged in vitro 115, 150, and 180 times, respectively, and exhibited progressively decreasing virulence. Comparative genomics was performed among the wild-type M. bovis HB0801 (P1) strain and the P115, P150, and P180 strains, and one 14.2-kb deleted region covering 14 genes was detected in the passaged strains. Additionally, 46 non-sense single-nucleotide polymorphisms and indels were detected, which confirmed that more passages result in more mutations. A subsequent collective bioinformatics analysis of paralogs, metabolic pathways, protein-protein interactions, secretory proteins, functionally conserved domains, and virulence-related factors identified 11 genes that likely contributed to the increased attenuation in the passaged strains. These genes encode ascorbate-specific phosphotransferase system enzyme IIB and IIA components, enolase, L-lactate dehydrogenase, pyruvate kinase, glycerol, and multiple sugar ATP-binding cassette transporters, ATP binding proteins, NADH dehydrogenase, phosphate acetyltransferase, transketolase, and a variable surface protein. Fifteen genes were shown to be enriched in 15 metabolic pathways, and they included the aforementioned genes encoding pyruvate kinase, transketolase, enolase, and L-lactate dehydrogenase. Hydrogen peroxide (H2O2) production in M. bovis strains representing seven passages from P1 to P180 decreased progressively with increasing numbers of passages and increased attenuation. However, eight mutants specific to eight individual genes within the 14.2-kb deleted region did not exhibit altered H2O2 production. These results enrich the M. bovis genomics database, and they increase our understanding of the mechanisms underlying M. bovis virulence.

  20. Characterisation of genetic markers in Mungbean using direct amplification of length polymorphisms (DALP)

    International Nuclear Information System (INIS)

    Kumar, S.V.; Tan, S.G.; Quah, S.C.

    2000-01-01

    Direct Amplification of Length Polymorphisms (DALP), a technique developed by Desmarais and co-workers in 1998, was successfully used to identify and characterise new genetic markers in mungbean (Vigna radiata). DALP uses an arbitrarily primed PCR (AP-PCR) to produce genomic fingerprints and is specifically designed to enable direct sequencing of polymorphic bands. In this study, the oligonucleotide pair DALP235 and DALPR was tested on four varieties of mungbean (V3476, P4281, V5973 and V5784) and produced, through PCR, specific multibanded fingerprints which showed polymorphisms. These polymorphic bands are the result of length polymorphisms as well as absence and presence of bands. Some of the polymorphic zones may be codominantly inherited and may be potential microsatellites. The success of DALP in characterising new polymorphic loci and its ability to discover microsatellites without a priori knowledge of the mungbean genome is revolutionary. This would greatly facilitate the breeding and improvement of the crop. (author)

  1. A protocol for isolating insect mitochondrial genomes: a case study of NUMT in Melipona flavolineata (Hymenoptera: Apidae).

    Science.gov (United States)

    Françoso, Elaine; Gomes, Fernando; Arias, Maria Cristina

    2016-07-01

    Nuclear mitochondrial DNA insertions (NUMTs) are mitochondrial DNA sequences that have been transferred into the nucleus and are recognized by the presence of indels and stop codons. Although NUMTs have been identified in a diverse range of species, their discovery was frequently accidental. Here, our initial goal was to develop and standardize a simple method for isolating NUMTs from the nuclear genome of a single bee. Subsequently, we tested our new protocol by determining whether the indels and stop codons of the cytochrome c oxidase subunit I (COI) sequence of Melipona flavolineata are of nuclear origin. The new protocol successfully demonstrated the presence of a COI NUMT. In addition to NUMT investigations, the protocol described here will also be very useful for studying mitochondrial mutations related to diseases and for sequencing complete mitochondrial genomes with high read coverage by Next-Generation technology.

  2. Deep comparative genomics among Chlamydia trachomatis lymphogranuloma venereum isolates highlights genes potentially involved in pathoadaptation.

    Science.gov (United States)

    Borges, Vítor; Gomes, João Paulo

    2015-06-01

    Lymphogranuloma venereum (LGV) is a human sexually transmitted disease caused by the obligate intracellular bacterium Chlamydia trachomatis (serovars L1-L3). LGV clinical manifestations range from severe ulcerative proctitis (anorectal syndrome), primarily caused by the epidemic L2b strains, to painful inguinal lymphadenopathy (the typical LGV bubonic form). Besides potential host-related factors, the differential disease severity and tissue tropism among LGV strains is likely a function of the genetic backbone of the strains. We aimed to characterize the genetic variability among LGV strains as strain- or serovar-specific mutations may underlie phenotypic signatures, and to investigate the mutational events that occurred throughout the pathoadaptation of the epidemic L2b lineage. By analyzing 20 previously published genomes from L1, L2, L2b and L3 strains and two new genomes from L2b strains, we detected 1497 variant sites and about 100 indels, affecting 453 genes and 144 intergenic regions, with 34 genes displaying a clear overrepresentation of nonsynonymous mutations. Effectors and/or type III secretion substrates (almost all of those described in the literature) and inclusion membrane proteins showed amino acid changes that were about fivefold more frequent than silent changes. More than 120 variant sites occurred in plasmid-regulated virulence genes, and 66% yielded amino acid changes. The identified serovar-specific variant sites revealed that the L2b-specific mutations are likely associated with higher fitness and pointed out potential targets for future highly discriminatory diagnostic/typing tests. By evaluating the evolutionary pathway beyond the L2b clonal radiation, we observed that 90.2% of the intra-L2b variant sites occurring in coding regions involve nonsynonymous mutations, where CT456/tarp has been the main target. Considering the progress on C. trachomatis genetic manipulation, this study may constitute an important contribution for prioritizing

  3. Polymorphic Embedding of DSLs

    DEFF Research Database (Denmark)

    Hofer, Christian; Ostermann, Klaus; Rendel, Tillmann

    2008-01-01

    propose polymorphic embedding of DSLs, where many different interpretations of a DSL can be provided as reusable components, and show how polymorphic embedding can be realized in the programming language Scala. With polymorphic embedding, the static type-safety, modularity, composability and rapid...

  4. The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Science.gov (United States)

    Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

    2015-04-01

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.
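
    The idea of folding first-round substitutions and short indels into an updated reference before a second round of mapping can be sketched as below. This is a simplified illustration and not the Drosophila Genome Nexus code: the function name and the variant tuple format (0-based position, reference allele, alternate allele) are assumptions, and variants are required not to overlap.

```python
from typing import Iterable, Tuple


def apply_variants(reference: str, variants: Iterable[Tuple[int, str, str]]) -> str:
    """Build an updated reference by substituting each ref allele with its alt.

    variants: (pos0, ref_allele, alt_allele) with 0-based positions.
    Substitutions, deletions and insertions are all handled as string
    replacement at pos0, VCF-style (indels keep one anchoring base).
    """
    pieces = []
    cursor = 0  # next untouched position in the original reference
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous variant")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged stretch
        pieces.append(alt)                    # variant allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])         # tail after the last variant
    return "".join(pieces)


ref_seq = "ACGTACGT"                 # toy reference, invented for illustration
first_round_calls = [
    (1, "C", "T"),                   # substitution
    (3, "TA", "T"),                  # 1-bp deletion
    (6, "G", "GAA"),                 # 2-bp insertion
]
print(apply_variants(ref_seq, first_round_calls))
```

    Because indels shift downstream coordinates, a real pipeline would also need to keep a coordinate map between the original and updated references; that bookkeeping is omitted here.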

  5. GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

    Science.gov (United States)

    Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

    2015-01-01

    Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference and can be used as a Variant Call Format (VCF) file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU public license (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  6. Influence of correlation between HLA-G polymorphism and Interleukin-6 (IL6) gene expression on the risk of schizophrenia.

    Science.gov (United States)

    Shivakumar, Venkataram; Debnath, Monojit; Venugopal, Deepthi; Rajasekaran, Ashwini; Kalmady, Sunil V; Subbanna, Manjula; Narayanaswamy, Janardhanan C; Amaresha, Anekal C; Venkatasubramanian, Ganesan

    2018-07-01

    Converging evidence suggests important implications of the immuno-inflammatory pathway in the risk and progression of schizophrenia. Prenatal infection resulting in maternal immune activation and developmental neuroinflammation reportedly increases the risk of schizophrenia in the offspring by generating pro-inflammatory cytokines including IL-6. However, it is not known how prenatal infection can induce immuno-inflammatory responses despite the presence of immuno-inhibitory Human Leukocyte Antigen-G (HLA-G) molecules. To address this, the present study was aimed at examining the correlation between the 14 bp Insertion/Deletion (INDEL) polymorphism of HLA-G and IL-6 gene expression in schizophrenia patients. The 14 bp INDEL polymorphism was studied by PCR amplification/direct sequencing and IL-6 gene expression was quantified by using real-time RT-PCR in 56 schizophrenia patients and 99 healthy controls. We observed significantly low IL6 gene expression in the peripheral blood mononuclear cells (PBMCs) of schizophrenia patients (t = 3.8, p = .004) compared to the controls. In addition, schizophrenia patients carrying the Del/Del genotype of HLA-G 14 bp INDEL exhibited significantly lower IL6 gene expression (t = 3.1; p = .004) than the Del/Ins as well as Ins/Ins carriers. Our findings suggest that the presence of the "high-expressor" HLA-G 14 bp Del/Del genotype in schizophrenia patients could attenuate IL-6 mediated inflammation in schizophrenia. Based on these findings it can be assumed that HLA-G and cytokine interactions might play an important role in the immunological underpinnings of schizophrenia. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Molecular effects of autoimmune-risk promoter polymorphisms on expression, exon choice, and translational efficiency of interferon regulatory factor 5.

    Science.gov (United States)

    Clark, Daniel N; Lambert, Jared P; Till, Rodney E; Argueta, Lissenya B; Greenhalgh, Kathryn E; Henrie, Brandon; Bills, Trieste; Hawkley, Tyson F; Roznik, Marinya G; Sloan, Jason M; Mayhew, Vera; Woodland, Loc; Nelson, Eric P; Tsai, Meng-Hsuan; Poole, Brian D

    2014-05-01

    The rs2004640 single nucleotide polymorphism and the CGGGG copy-number variant (rs77571059) are promoter polymorphisms within interferon regulatory factor 5 (IRF5). They have been implicated as susceptibility factors for several autoimmune diseases. IRF5 uses alternative promoter splicing, where any of 4 first exons begin the mRNA. The CGGGG indel is in exon 1A's promoter; the rs2004640 allele creates a splicing recognition site, enabling usage of exon 1B. This study aimed at characterizing alterations in IRF5 mRNA due to these polymorphisms. Cells with risk polymorphisms exhibited ~2-fold higher levels of IRF5 mRNA and protein, but demonstrated no change in mRNA stability. Quantitative PCR demonstrated decreased usage of exons 1C and 1D in cell lines with the risk polymorphisms. RNA folding analysis revealed a hairpin in exon 1B; mutational analysis showed that the hairpin shape decreased translation 5-fold. Although translation of mRNA that uses exon 1B is low due to a hairpin, increased IRF5 mRNA levels in individuals with the rs2004640 risk allele lead to higher overall protein expression. In addition, several new splice variants of IRF5 were sequenced. IRF5's promoter polymorphisms alter first exon usage and increase transcription levels. High levels of IRF5 may bias the immune system toward autoimmunity.

  8. Identification of polymorphisms in MITF and DCT genes and their associations with plumage colors in Asian duck breeds

    Directory of Open Access Journals (Sweden)

    Hasina Sultana

    2018-02-01

    Objective: The aim of this study was to investigate the effect of single nucleotide polymorphisms (SNPs) of the melanogenesis associated transcription factor (MITF) and dopachrome tautomerase (DCT) genes on plumage coloration in Asian native duck breeds. MITF encodes a protein for microphthalmia-associated transcription factor, which regulates the development and function of melanocytes for pigmentation of skin, hair, and eyes. Among the tyrosinase-related family genes, DCT is a pigment cell-specific gene that plays important roles in the melanin synthesis pathway and the expression of skin, feather, and retina color. Methods: Five Asian duck varieties (black Korean native, white Korean native, commercial Peking, Nageswari, and Bangladeshi Deshi white ducks) were investigated to examine the polymorphisms associated with plumage colors. Among previously identified SNPs, three synonymous SNPs and one indel of MITF and nine SNPs in exon regions of DCT were genotyped. The allele frequencies for SNPs of the black and white plumage color populations were estimated and Fisher’s exact test was conducted to assess the association between the allele frequencies of these two populations. Results: Two synonymous SNPs (c.114T>G and c.147T>C) and a 14-bp indel (GCTGCAAACAGATG) in intron 7 of MITF were significantly associated with the black- and white-colored breeds. One SNP in DCT (p.His313Arg) was highly significantly associated, and another SNP was significantly associated (p<0.05), with black and white color plumage in the studied duck populations. Conclusion: The results of this study provide a basis for further investigations of the associations between polymorphisms and plumage color phenotypes in Asian duck breeds.
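
    Fisher's exact test on allele counts, as named in the Methods above, can be sketched with the standard library alone. The implementation below is a generic two-sided test on a 2x2 table (summing the probabilities of all tables no more likely than the observed one); it is not the authors' code, and the allele counts are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Hypothetical layout: rows are populations (black, white plumage),
    columns are counts of the two alleles at one SNP.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Invented allele counts: black population 18 T / 2 G, white population 6 T / 14 G.
print(fisher_exact_two_sided(18, 2, 6, 14))
```

    With one such test per SNP, some correction for multiple comparisons would normally be considered; the abstract does not say whether one was applied.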

  9. The Banana Genome Hub

    Science.gov (United States)

    Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

    2013-01-01

    Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), made other existing systems containing Musa data interoperable with the banana hub (e.g. transcriptomics, rice reference genome, workflow manager) and, finally, generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967

  10. Genomic applications in forensic medicine

    DEFF Research Database (Denmark)

    Børsting, Claus; Morling, Niels

    2016-01-01

    Since the 1980s, advances in DNA technology have revolutionized the scope and practice of forensic medicine. From the days of restriction fragment length polymorphisms (RFLPs) to short tandem repeats (STRs), the current focus is on the next generation genome sequencing. It has been almost a decad...

  11. Evaluation of the frequency of polymorphisms in XRCC1 (Arg399Gln) and XPD (Lys751Gln) genes related to the genome stability maintenance in individuals of the resident population from Monte Alegre, PA/Brazil municipality; Avaliacao da frequencia de polimorfismos nos genes XRCC1 (Arg399Gln) e XPD (Lys751Gln) relacionados a manutencao da estabilidade do genoma em individuos da populacao residente no municipio de Monte Alegre, PA

    Energy Technology Data Exchange (ETDEWEB)

    Duarte, Isabelle Magliano

    2010-07-01

    Human exposure to ionizing radiation from natural sources is an inherent feature of human life on Earth. Ionizing radiation is a known genotoxic agent, which can affect biological molecules, causing DNA damage and genomic instability. The cellular DNA repair system plays an important role in maintaining genomic stability by repairing DNA damage caused by genotoxic agents. However, genes related to DNA repair may have their function compromised when they carry certain polymorphisms. This study analyzed the frequency of single nucleotide polymorphisms (SNPs) in the DNA repair genes XRCC1 (Arg399Gln) and XPD (Lys751Gln) in a population of the city of Monte Alegre that resides in an area of high exposure to natural radioactivity. Saliva samples were collected from individuals of the population of Monte Alegre: 40 samples from males and 46 from females. Using RFLP (restriction fragment length polymorphism) analysis, the frequency of homozygous and/or heterozygous genotypes was determined for the polymorphic genes. The XRCC1 gene showed 65.4% presence of the 399Gln allele and the XPD gene 32.9% of the 751Gln allele. These values are similar to those found in previous studies for the XPD gene, whereas XRCC1 showed a frequency much higher than described in the literature. The influence of these polymorphisms, which are involved in DNA repair and the consequent radiation-induced genotoxicity, depends on dose and on exposure factors such as smoking, statistically a factor in public health surveillance in the region. This study gathered molecular epidemiology information for cancer risk assessment in the population of Monte Alegre. (author)
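
    If the percentages above are read as allele frequencies, they would follow from RFLP genotype counts by simple allele counting. The sketch below shows that calculation; the function name and the genotype counts are hypothetical (chosen only so that they sum to the 86 sampled individuals) and are not reported in the study.

```python
def variant_allele_frequency(n_hom_ref, n_het, n_hom_var):
    """Frequency of the variant allele from diploid genotype counts.

    Each homozygous-variant individual carries two copies of the variant
    allele and each heterozygote carries one; every individual
    contributes two alleles to the total.
    """
    n_individuals = n_hom_ref + n_het + n_hom_var
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    return (2 * n_hom_var + n_het) / (2 * n_individuals)


# Hypothetical genotype counts for 86 individuals, NOT from the study:
# 12 Arg/Arg, 36 Arg/Gln, 38 Gln/Gln.
freq_gln = variant_allele_frequency(12, 36, 38)
print(f"399Gln allele frequency: {freq_gln:.1%}")
```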

  12. Identificación de polimorfismos en genes candidatos de resistencia en yuca (Manihot esculenta Crantz) Identification of polymorphisms in resistance gene candidates in cassava (Manihot esculenta Crantz)

    Directory of Open Access Journals (Sweden)

    Andrea Vásquez

    2012-04-01

    proteins have a few conserved domains. Taking advantage of the recent release of the complete cassava genome sequence, we identified cassava R-like proteins in this genome. With this information, primers were designed to amplify 13 genes showing similarity to known R genes. For 10 of them we obtained amplification in the varieties TMS30572 and CM2177-2, which represent the parents used in the construction of the cassava genetic map. After sequencing the amplicons obtained, we identified 37 SNPs (Single Nucleotide Polymorphisms) between these two cassava varieties, which represent 18 (48.6%) transitions and 19 (45.9%) transversions. The remaining are insertions/deletions (indels). This knowledge will help to develop appropriate strategies for the generation of CAPs (Cleaved Amplified Polymorphisms) markers to assess their segregation in the F1 population, allowing the localization of these markers on the cassava genetic map.
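
    The transition/transversion split mentioned above follows from base chemistry: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch of that classification, with a hypothetical helper name and made-up example substitutions:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Made-up substitutions for illustration, NOT the cassava SNPs.
snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
counts = {"transition": 0, "transversion": 0}
for ref, alt in snps:
    counts[classify_substitution(ref, alt)] += 1
print(counts)

# The share reported for transitions, 18 of 37, as a percentage.
print(round(100 * 18 / 37, 1))
```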

  13. Evaluation of PRNP Expression Based on Genotypes and Alleles of Two Indel Loci in the Medulla Oblongata of Japanese Black and Japanese Brown Cattle

    Science.gov (United States)

    Msalya, George; Shimogiri, Takeshi; Ohno, Shotaro; Okamoto, Shin; Kawabe, Kotaro; Minezawa, Mitsuru; Maeda, Yoshizane

    2011-01-01

    Background Prion protein (PrP) level plays the central role in bovine spongiform encephalopathy (BSE) susceptibility. Increasing the level of PrP decreases incubation period for this disease. Therefore, studying the expression of the cellular PrP or at least the messenger RNA might be used in selection for preventing the propagation of BSE and other prion diseases. Two insertion/deletion (indel) variations have been tentatively associated with susceptibility/resistance of cattle to classical BSE. Methodology/Principal Findings We studied the expression of each genotype at the two indel sites in Japanese Black (JB) and Japanese Brown (JBr) cattle breeds by a standard curve method of real-time PCR. Five diplotypes subdivided into two categories were selected from each breed. The two cattle breeds were considered differently. Expression of PRNP was significantly (p0.05). Conclusion Our results suggest that the del/del genotype or at least its del allele may modulate the expression of PRNP at the 23-bp locus in the medulla oblongata of these cattle breeds. PMID:21611160

  14. Brucella abortus Strain 2308 Wisconsin Genome: Importance of the Definition of Reference Strains

    Science.gov (United States)

    Suárez-Esquivel, Marcela; Ruiz-Villalobos, Nazareth; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Roop II, R. Martin; Comerci, Diego J.; Barquero-Calvo, Elías; Chacón-Díaz, Carlos; Caswell, Clayton C.; Baker, Kate S.; Chaves-Olarte, Esteban; Thomson, Nicholas R.; Moreno, Edgardo; Letesson, Jean J.; De Bolle, Xavier; Guzmán-Verri, Caterina

    2016-01-01

    Brucellosis is a bacterial infectious disease affecting a wide range of mammals and a neglected zoonosis caused by species of the genetically homogenous genus Brucella. As in most studies on bacterial diseases, research in brucellosis is carried out by using reference strains as canonical models to understand the mechanisms underlying host-pathogen interactions. We performed whole genome sequencing analysis of the reference strain B. abortus 2308 routinely used in our laboratory, including manually curated annotation accessible as an editable version through a link at https://en.wikipedia.org/wiki/Brucella#Genomics. Comparison of this genome with two publicly available 2308 genomes showed significant differences, particularly indels related to insertional elements, suggesting variability related to the transposition of these elements within the same strain. Considering the outcome of high resolution genomic techniques in the bacteriology field, the conventional concept of strain definition needs to be revised. PMID:27746773

  15. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Recently, genomic resources for grapevine have expanded greatly, supporting increasing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected genomic variations affecting roughly 8% of the grapevine genome. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  16. Polymorphous computing fabric

    Science.gov (United States)

    Wolinski, Christophe Czeslaw [Los Alamos, NM]; Gokhale, Maya B [Los Alamos, NM]; McCabe, Kevin Peter [Los Alamos, NM]

    2011-01-18

    Fabric-based computing systems and methods are disclosed. A fabric-based computing system can include a polymorphous computing fabric that can be customized on a per application basis and a host processor in communication with said polymorphous computing fabric. The polymorphous computing fabric includes a cellular architecture that can be highly parameterized to enable a customized synthesis of fabric instances for a variety of enhanced application performances thereof. A global memory concept can also be included that provides the host processor random access to all variables and instructions associated with the polymorphous computing fabric.

  17. Association of MTHFR polymorphisms with nsCL/P in Chinese ...

    African Journals Online (AJOL)

    Xianrong Xu

    2016-04-26

    Apr 26, 2016 ... Aim: In this study, we aim to investigate the association between the polymorphism in MTHFR .... DNA extraction, library preparation, and sequencing. Genomic ..... comparative study in Mexican, West African, and European.

  18. sY116, a human Y-linked polymorphic STS

    Indian Academy of Sciences (India)

    3Laboratoire d'Immunogénétique, Faculté de Sciences de Tunis, Tunis, Tunisia ... studying genomic instabilities in some types of cancer is discussed. Materials ..... polymorphisms by denaturing high-performance liquid chromatography.

  19. Development of Insertion and Deletion Markers based on Biparental Resequencing for Fine Mapping Seed Weight in Soybean

    Directory of Open Access Journals (Sweden)

    Ying-hui Li

    2014-11-01

    Full Text Available As a complement to single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs), biallelic insertions and deletions (InDels) represent powerful molecular markers with desirable features for filling the gap in current genetic linkage maps. In this study, 28,908 small InDel polymorphisms (1–5 base pair, bp) distributed genome-wide were identified and annotated by comparison of a whole-genome resequencing data set from two soybean [Glycine max (L.) Merr.] genotypes, cultivar Zhonghuang13 (ZH) and line Zhongpin03-5373 (ZP). The physical distribution of InDel polymorphisms in the soybean genome was uneven, and matched closely with the distribution of previously annotated genes. The average density of InDels in the arm region was significantly higher than that in the pericentromeric region. The genomic regions that were fixed between the two elites were elucidated. With this information, five InDel markers within a putative quantitative trait locus (QTL) for seed weight (SW) were developed and used to genotype 254 recombinant inbred lines (RILs) derived from the cross of ZP × ZH. Adding these five InDel markers to previously used SNP and SSR markers facilitated the discovery of further recombination events allowing fine-mapping the QTL to a 0.5 Mbp region. Our study clearly underlines the high value of InDel markers for map-based cloning and marker-assisted selection in soybean.

  20. Polymorphs and polymorphic cocrystals of temozolomide.

    Science.gov (United States)

    Babu, N Jagadeesh; Reddy, L Sreenivas; Aitipamula, Srinivasulu; Nangia, Ashwini

    2008-07-07

    Crystal polymorphism in the antitumor drug temozolomide (TMZ), cocrystals of TMZ with 4,4'-bipyridine-N,N'-dioxide (BPNO), and solid-state stability were studied. Apart from a known X-ray crystal structure of TMZ (form 1), two new crystalline modifications, forms 2 and 3, were obtained during attempted cocrystallization with carbamazepine and 3-hydroxypyridine-N-oxide. Conformers A and B of the drug molecule are stabilized by intramolecular amide N-H···N(imidazole) and N-H···N(tetrazine) interactions. The stable conformer A is present in forms 1 and 2, whereas both conformers crystallized in form 3. Preparation of polymorphic cocrystals I and II (TMZ·BPNO 1:0.5 and 2:1) was optimized by using solution crystallization and grinding methods. The metastable nature of polymorph 2 and cocrystal II is ascribed to unused hydrogen-bond donors/acceptors in the crystal structure. The intramolecularly bonded amide N-H donor in the less stable structure makes additional intermolecular bonds with the tetrazine C=O group and the imidazole N atom in stable polymorph 1 and cocrystal I, respectively. All available hydrogen-bond donors and acceptors are used to make intermolecular hydrogen bonds in the stable crystalline form. Synthon polymorphism and crystal stability are discussed in terms of hydrogen-bond reorganization.

  1. Identification of mitochondrial DNA sequence variation and development of single nucleotide polymorphic markers for CMS-D8 in cotton.

    Science.gov (United States)

    Suzuki, Hideaki; Yu, Jiwen; Wang, Fei; Zhang, Jinfa

    2013-06-01

    Cytoplasmic male sterility (CMS), which is a maternally inherited trait and controlled by novel chimeric genes in the mitochondrial genome, plays a pivotal role in the production of hybrid seed. In cotton, no PCR-based marker has been developed to discriminate CMS-D8 (from Gossypium trilobum) from its normal Upland cotton (AD1, Gossypium hirsutum) cytoplasm. The objective of the current study was to develop PCR-based single nucleotide polymorphic (SNP) markers from mitochondrial genes for the CMS-D8 cytoplasm. DNA sequence variation in mitochondrial genes involved in the oxidative phosphorylation chain including ATP synthase subunit 1, 4, 6, 8 and 9, and cytochrome c oxidase 1, 2 and 3 subunits were identified by comparing CMS-D8, its isogenic maintainer and restorer lines on the same nuclear genetic background. An allelic specific PCR (AS-PCR) was utilized for SNP typing by incorporating artificial mismatched nucleotides into the third or fourth base from the 3' terminus in both the specific and nonspecific primers. The result indicated that the method modifying allele-specific primers was successful in obtaining eight SNP markers out of eight SNPs using eight primer pairs to discriminate two alleles between AD1 and CMS-D8 cytoplasms. Two of the SNPs for atp1 and cox1 could also be used in combination to discriminate between CMS-D8 and CMS-D2 cytoplasms. Additionally, a PCR-based marker from a nine nucleotide insertion-deletion (InDel) sequence (AATTGTTTT) at the 59-67 bp positions from the start codon of atp6, which is present in the CMS and restorer lines with the D8 cytoplasm but absent in the maintainer line with the AD1 cytoplasm, was also developed. A SNP marker for two nucleotide substitutions (AA in AD1 cytoplasm to CT in CMS-D8 cytoplasm) in the intron (1,506 bp) of cox2 gene was also developed. These PCR-based SNP markers should be useful in discriminating CMS-D8 and AD1 cytoplasms, or those with CMS-D2 cytoplasm as a rapid, simple, inexpensive, and

  2. Human lymphocyte polymorphisms detected by quantitative two-dimensional electrophoresis

    International Nuclear Information System (INIS)

    Goldman, D.; Merril, C.R.

    1983-01-01

    A survey of 186 soluble lymphocyte proteins for genetic polymorphism was carried out utilizing two-dimensional electrophoresis of 14C-labeled phytohemagglutinin (PHA)-stimulated human lymphocyte proteins. Nineteen of these proteins exhibited positional variation consistent with independent genetic polymorphism in a primary sample of 28 individuals. Each of these polymorphisms was characterized by quantitative gene-dosage dependence insofar as the heterozygous phenotype expressed approximately 50% of each allelic gene product as was seen in homozygotes. Patterns observed were also identical in monozygotic twins, replicate samples, and replicate gels. The three expected phenotypes (two homozygotes and a heterozygote) were observed in each of 10 of these polymorphisms while the remaining nine had one of the homozygous classes absent. The presence of the three phenotypes, the demonstration of gene-dosage dependence, and our own and previous pedigree analysis of certain of these polymorphisms supports the genetic basis of these variants. Based on these data, the frequency of polymorphic loci for man is P = 19/186 = 0.102, and the average heterozygosity is 0.024. This estimate is approximately 1/3 to 1/2 the rate of polymorphism previously estimated for man in other studies using one-dimensional electrophoresis of isozyme loci. The newly described polymorphisms and others which should be detectable in larger protein surveys with two-dimensional electrophoresis hold promise as genetic markers of the human genome for use in gene mapping and pedigree analyses.
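
    The two summary statistics in this abstract can be illustrated with a short sketch. The proportion of polymorphic loci uses the reported counts (19 of 186). The heterozygosity part is an assumption: it uses the standard expected heterozygosity, one minus the sum of squared allele frequencies, averaged over all surveyed loci, whereas the abstract does not state how its value of 0.024 was obtained. The allele frequencies below are hypothetical.

```python
def proportion_polymorphic(n_polymorphic, n_surveyed):
    """Fraction of surveyed loci that show polymorphism."""
    return n_polymorphic / n_surveyed


def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: H = 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def average_heterozygosity(polymorphic_loci_freqs, n_surveyed):
    """Mean heterozygosity over all surveyed loci.

    Monomorphic loci contribute zero, so only the polymorphic loci are
    listed, but the divisor is the total number of loci surveyed.
    """
    total = sum(expected_heterozygosity(f) for f in polymorphic_loci_freqs)
    return total / n_surveyed


# Counts reported in the abstract.
print(round(proportion_polymorphic(19, 186), 3))

# Hypothetical allele frequencies at two polymorphic loci out of four surveyed.
print(average_heterozygosity([[0.5, 0.5], [0.9, 0.1]], 4))
```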

  3. Genomic gigantism: DNA loss is slow in mountain grasshoppers.

    Science.gov (United States)

    Bensasson, D; Petrov, D A; Zhang, D X; Hartl, D L; Hewitt, G M

    2001-02-01

    Several studies have shown DNA loss to be inversely correlated with genome size in animals. These studies include a comparison between Drosophila and the cricket, Laupala, but there has been no assessment of DNA loss in insects with very large genomes. Podisma pedestris, the brown mountain grasshopper, has a genome over 100 times as large as that of Drosophila and 10 times as large as that of Laupala. We used 58 paralogous nuclear pseudogenes of mitochondrial origin to study the characteristics of insertion, deletion, and point substitution in P. pedestris and Italopodisma. In animals, these pseudogenes are "dead on arrival"; they are abundant in many different eukaryotes, and their mitochondrial origin simplifies the identification of point substitutions accumulated in nuclear pseudogene lineages. There appears to be a mononucleotide repeat within the 643-bp pseudogene sequence studied that acts as a strong hot spot for insertions or deletions (indels). Because the data for other insect species did not contain such an unusual region, hot spots were excluded from species comparisons. The rate of DNA loss relative to point substitution appears to be considerably and significantly lower in the grasshoppers studied than in Drosophila or Laupala. This suggests that the inverse correlation between genome size and the rate of DNA loss can be extended to comparisons between insects with large or gigantic genomes (i.e., Laupala and Podisma). The low rate of DNA loss implies that in grasshoppers, the accumulation of point mutations is a more potent force for obscuring ancient pseudogenes than their loss through indel accumulation, whereas the reverse is true for Drosophila. The main factor contributing to the difference in the rates of DNA loss estimated for grasshoppers, crickets, and Drosophila appears to be deletion size. Large deletions are relatively rare in Podisma and Italopodisma.

  4. Genome-to-genome analysis highlights the impact of the human innate and adaptive immune systems on the hepatitis C virus

    Science.gov (United States)

    Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C. A.

    2018-01-01

    Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. We use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals chronically infected with HCV, predominately genotype 3. We show that both HLA alleles and interferon lambda innate immune system genes drive viral genome polymorphism, and that IFNL4 genotypes determine HCV viral load through a mechanism that is dependent on a specific polymorphism in the HCV polyprotein. We highlight the interplay between innate immune responses and the viral genome in HCV control. PMID:28394351

  5. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    2014-06-01

    Full Text Available This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA’s speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.

  6. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU.

    Science.gov (United States)

    Luo, Ruibang; Wong, Yiu-Lun; Law, Wai-Chun; Lee, Lap-Kei; Cheung, Jeanno; Liu, Chi-Man; Lam, Tak-Wah

    2014-01-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
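
    Because the abstract states that variants are written in VCF format, a downstream tally of SNPs and indels can be sketched from the REF and ALT columns alone. This is a generic illustration that assumes standard tab-separated VCF data lines; it is not part of BALSA, and the function names and example records are hypothetical.

```python
def classify_allele(ref, alt):
    """Classify one REF/ALT allele pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "MNP"  # equal length, more than one base


def count_variant_types(vcf_lines):
    """Count variant types over the data lines of a VCF file."""
    counts = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header
        fields = line.rstrip("\n").split("\t")
        ref, alt_field = fields[3], fields[4]  # columns 4 and 5: REF, ALT
        for alt in alt_field.split(","):
            # Skip missing, spanning-deletion, symbolic and breakend alleles.
            if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
                continue
            kind = classify_allele(ref, alt)
            counts[kind] = counts.get(kind, 0) + 1
    return counts


# Hypothetical records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tAT\tA\t60\tPASS\t.",
    "chr1\t3000\t.\tC\tCGG,T\t40\tPASS\t.",
]
print(count_variant_types(example))
```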

  7. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

    Science.gov (United States)

    Cingolani, Pablo; Platts, Adrian; Wang, Le Lily; Coon, Melissa; Nguyen, Tung; Wang, Luan; Land, Susan J; Lu, Xiangyi; Ruden, Douglas M

    2012-01-01

    We describe a new computer program, SnpEff, for rapidly categorizing the effects of variants in genome sequences. Once a genome is sequenced, SnpEff annotates variants based on their genomic locations and predicts coding effects. Annotated genomic locations include intronic, untranslated region, upstream, downstream, splice site, or intergenic regions. Coding effects such as synonymous or non-synonymous amino acid replacement, start codon gains or losses, stop codon gains or losses, or frame shifts can be predicted. Here the use of SnpEff is illustrated by annotating ~356,660 candidate SNPs in ~117 Mb unique sequences, representing a substitution rate of ~1/305 nucleotides, between the Drosophila melanogaster w(1118); iso-2; iso-3 strain and the reference y(1); cn(1) bw(1) sp(1) strain. We show that ~15,842 SNPs are synonymous and ~4,467 SNPs are non-synonymous (N/S ~0.28). The remaining SNPs are in other categories, such as stop codon gains (38 SNPs), stop codon losses (8 SNPs), and start codon gains (297 SNPs) in the 5'UTR. We found, as expected, that the SNP frequency is proportional to the recombination frequency (i.e., highest in the middle of chromosome arms). We also found that start-gain or stop-lost SNPs in Drosophila melanogaster often result in additions of N-terminal or C-terminal amino acids that are conserved in other Drosophila species. It appears that the 5' and 3' UTRs are reservoirs for genetic variation that changes the termini of proteins during evolution of the Drosophila genus. As genome sequencing is becoming inexpensive and routine, SnpEff enables rapid analyses of whole-genome sequencing data to be performed by an individual laboratory.
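
    The coding-effect categories listed above can be illustrated by translating the reference and altered codons with the standard genetic code and comparing the results. The sketch below is a simplified stand-in, not SnpEff's implementation: it handles only single-codon substitutions, ignores start codons and frame shifts, and uses a hypothetical function name.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def coding_effect(ref_codon, alt_codon):
    """Classify a codon substitution by its effect on the encoded residue."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "non_synonymous"


print(coding_effect("GAA", "GAG"))  # Glu -> Glu
print(coding_effect("GAA", "GTA"))  # Glu -> Val
print(coding_effect("TGG", "TGA"))  # Trp -> stop

# Ratio of non-synonymous to synonymous SNP counts reported in the abstract.
print(round(4467 / 15842, 2))
```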

  8. Population genetic study for 24 STR loci and Y indel (GlobalFiler™ PCR Amplification kit and PowerPlex® Fusion system) in 1000 Korean individuals.

    Science.gov (United States)

    Park, Hyun-Chul; Kim, Kicheol; Nam, Younhyoung; Park, Jihye; Lee, Jinmyung; Lee, Hyehyeon; Kwon, Hansol; Jin, Hanjun; Kim, Wook; Kim, Won; Lim, Sikeun

    2016-07-01

    Allele frequencies for 23 autosomal short tandem repeat loci (D3S1358, vWA, D16S539, CSF1PO, TPOX, D8S1179, D21S11, D18S51, TH01, FGA, D5S818, D13S317, D7S820, D2S441, D19S433, D22S1045, D10S1248, D1S1656, D12S391, D2S1338, SE33, Penta D, Penta E), 1 Y-chromosome short tandem repeat locus (DYS391) and Y indel were obtained from 1000 unrelated individuals of the Korean population. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  9. Genome-wide characterization of genetic variants and putative regions under selection in meat and egg-type chicken lines.

    Science.gov (United States)

    Boschiero, Clarissa; Moreira, Gabriel Costa Monteiro; Gheyas, Almas Ara; Godoy, Thaís Fernanda; Gasparin, Gustavo; Mariani, Pilar Drummond Sampaio Corrêa; Paduan, Marcela; Cesar, Aline Silva Mello; Ledur, Mônica Corrêa; Coutinho, Luiz Lehmann

    2018-01-25

    Meat and egg-type chickens have been selected for several generations for different traits. Artificial and natural selection for different phenotypes can change frequency of genetic variants, leaving particular genomic footprints throughout the genome. Thus, the aims of this study were to sequence 28 chickens from two Brazilian lines (meat and white egg-type) and use this information to characterize genome-wide genetic variations, identify putative regions under selection using Fst method, and find putative pathways under selection. A total of 13.93 million SNPs and 1.36 million INDELs were identified, with more variants detected from the broiler (meat-type) line. Although most were located in non-coding regions, we identified 7255 intolerant non-synonymous SNPs, 512 stopgain/loss SNPs, 1381 frameshift and 1094 non-frameshift INDELs that may alter protein functions. Genes harboring intolerant non-synonymous SNPs affected metabolic pathways related mainly to reproduction and endocrine systems in the white-egg layer line, and lipid metabolism and metabolic diseases in the broiler line. Fst analysis in sliding windows, using SNPs and INDELs separately, identified over 300 putative regions of selection overlapping with more than 250 genes. For the first time in chicken, INDEL variants were considered for selection signature analysis, showing high level of correlation in results between SNP and INDEL data. The putative regions of selection signatures revealed interesting candidate genes and pathways related to important phenotypic traits in chicken, such as lipid metabolism, growth, reproduction, and cardiac development. In this study, Fst method was applied to identify high confidence putative regions under selection, providing novel insights into selection footprints that can help elucidate the functional mechanisms underlying different phenotypic traits relevant to meat and egg-type chicken lines. In addition, we generated a large catalog of line-specific and common
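
    The sliding-window Fst scan described in this abstract can be sketched as follows. The record does not state the estimator, window size or step used, so everything here is an assumption: Fst is computed as (H_T - H_S) / H_T for biallelic variants in two equally weighted populations, pooled over each window as a ratio of sums, and the positions and allele frequencies are hypothetical.

```python
def window_fst(freq_pairs):
    """Fst over a set of biallelic variants, as a ratio of sums.

    freq_pairs holds (p1, p2): the frequency of one allele in each of
    the two lines. Returns None if no variant in the set is polymorphic.
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in freq_pairs:
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0 else None


def sliding_window_fst(sites, window_size, step):
    """Fst in overlapping windows along one chromosome.

    sites is a list of (position, p1, p2). Each result is a tuple of
    (window_start, window_end, n_variants, fst); empty windows are skipped.
    """
    if not sites:
        return []
    last_pos = max(pos for pos, _, _ in sites)
    results = []
    start = 0
    while start <= last_pos:
        end = start + window_size
        in_window = [(p1, p2) for pos, p1, p2 in sites if start <= pos < end]
        if in_window:
            results.append((start, end, len(in_window), window_fst(in_window)))
        start += step
    return results


# Hypothetical variants: (position, frequency in line 1, frequency in line 2).
sites = [(100, 1.0, 0.0), (150, 0.5, 0.5), (1200, 0.3, 0.3)]
for row in sliding_window_fst(sites, window_size=1000, step=500):
    print(row)
```

    Running SNPs and INDELs through the same function separately would mirror the comparison the abstract describes, with high-Fst windows flagged as candidate regions.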

  10. Inducing indel mutation in the SOX6 gene by zinc finger nuclease for gamma reactivation: An approach towards gene therapy of beta thalassemia.

    Science.gov (United States)

    Modares Sadeghi, Mehran; Shariati, Laleh; Hejazi, Zahra; Shahbazi, Mansoureh; Tabatabaiefar, Mohammad Amin; Khanahmad, Hossein

    2018-03-01

    β-thalassemia is a common autosomal recessive disorder characterized by a deficiency in the synthesis of β-chains. Evidence shows that increased HbF levels improve the symptoms in patients with β-thalassemia or sickle cell anemia. In this study, ZFN technology was applied to induce a mutation in the binding domain region of SOX6 to reactivate γ-globin expression. The sequences coding for ZFP arrays were designed and subcloned in TDH plus as a transfer vector. The ZFN expression was confirmed using Western blot analysis. In the next step, using the site-directed mutagenesis strategy through the overlap PCR, a missense mutation (D64V) was induced in the catalytic domain of the integrase gene in the packaging plasmid and verified using DNA sequencing. Then, the integrase minus lentivirus containing the ZFN cassette was packaged. K562 cells were transduced with this virus, and a mutation detection assay was performed. The indel percentage of the cells transduced with lentivirus containing ZFN was 31%. After 5 days of erythroid differentiation with 15 μg/mL cisplatin, the levels of γ-globin mRNA were sixfold higher in the cells treated with ZFN compared to untreated cells. In the meantime, the measurement of HbF expression levels was carried out using hemoglobin electrophoresis and showed the same results. Integrase minus lentivirus can provide a useful tool for efficient transient gene expression and helps avoid disadvantages of gene targeting using the native virus. The ZFN strategy applied here to induce indel on SOX6 gene in adult erythroid progenitors may provide a method to activate fetal hemoglobin expression in individuals with β-thalassemia. © 2017 Wiley Periodicals, Inc.

  11. Analysis of single nucleotide polymorphisms in case-control studies.

    Science.gov (United States)

    Li, Yonghong; Shiffman, Dov; Oberbauer, Rainer

    2011-01-01

    Single nucleotide polymorphisms (SNPs) are the most common type of genetic variants in the human genome. SNPs are known to modify susceptibility to complex diseases. We describe and discuss methods used to identify SNPs associated with disease in case-control studies. An outline on study population selection, sample collection and genotyping platforms is presented, complemented by SNP selection, data preprocessing and analysis.

  12. In-silico single nucleotide polymorphisms (SNP) mining of Sorghum ...

    African Journals Online (AJOL)

    Single nucleotide polymorphisms (SNPs) may be considered the ultimate genetic markers as they represent the finest resolution of a DNA sequence (a single nucleotide), and are generally abundant in populations with a low mutation rate. SNPs are important tools in studying complex genetic traits and genome evolution.

  13. Random amplified polymorphic DNA (RAPD) markers reveal genetic ...

    African Journals Online (AJOL)

    The present study evaluated genetic variability of superior bael genotypes collected from different parts of Andaman Islands, India using fruit characters and random amplified polymorphic DNA (RAPD) markers. Genomic DNA extracted from leaf material using cetyl trimethyl ammonium bromide (CTAB) method was ...

  14. Polymorphism of the simple sequence repeat (AAC)5 in the ...

    Indian Academy of Sciences (India)

    2013-12-04

    SSRs could be present in coding and noncoding regions, contributing to genome dynamics and evolution. Previous studies by our research group detected molecular and cytogenetic ribosomal DNA (rDNA) polymorphisms in Old Portuguese bread and durum wheat cultivars. Considering the rRNA genes.

  15. Genome-wide association study of multiplex schizophrenia pedigrees

    DEFF Research Database (Denmark)

    Levinson, Douglas F; Shi, Jianxin; Wang, Kai

    2012-01-01

    The authors used a genome-wide association study (GWAS) of multiply affected families to investigate the association of schizophrenia to common single-nucleotide polymorphisms (SNPs) and rare copy number variants (CNVs)....

  16. GENOMIC DNA-FINGERPRINTING OF CLINICAL HAEMOPHILUS-INFLUENZAE ISOLATES BY POLYMERASE CHAIN-REACTION AMPLIFICATION - COMPARISON WITH MAJOR OUTER-MEMBRANE PROTEIN AND RESTRICTION-FRAGMENT-LENGTH-POLYMORPHISM ANALYSIS

    NARCIS (Netherlands)

    VANBELKUM, A; DUIM, B; REGELINK, A; MOLLER, L; QUINT, W; VANALPHEN, L

    Non-capsulate strains of Haemophilus influenzae were genotyped by analysis of variable DNA segments obtained by amplification of genomic DNA with the polymerase chain reaction (PCR fingerprinting). Discrete fragments of 100-2000 bp were obtained. The reproducibility of the procedure was assessed by

  17. Brucella abortus strain 2308 Wisconsin genome: importance of the definition of reference strains

    Directory of Open Access Journals (Sweden)

    Marcela Suárez-Esquivel

    2016-09-01

    Brucellosis is a bacterial infectious disease affecting a wide range of mammals and a neglected zoonosis caused by species of the genetically homogeneous genus Brucella. As in most studies on bacterial diseases, research in brucellosis is carried out by using reference strains as canonical models to understand the mechanisms underlying host-pathogen interactions. We performed whole genome sequencing (WGS) analysis of the reference strain Brucella abortus 2308 routinely used in our laboratory, including manually curated annotation accessible as an editable version at www.wikipedia. Comparison of this genome with two publicly available 2308 genomes showed significant differences, particularly indels related to insertional elements, suggesting variability related to the transposition of these elements within the same strain. Considering the outcome of high resolution genomic techniques in the bacteriology field, the conventional concept of strain definition needs to be revised.

  18. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Directory of Open Access Journals (Sweden)

    Sarwar Azam

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  19. Polymorphisms associated with ventricular tachyarrhythmias: rationale, design, and endpoints of the 'diagnostic data influence on disease management and relation of genomics to ventricular tachyarrhythmias in implantable cardioverter/defibrillator patients (DISCOVERY)' study

    DEFF Research Database (Denmark)

    Wieneke, Heinrich; Spencker, Sebastian; Svendsen, Jesper Hastrup

    2010-01-01

    Implantable cardioverter-defibrillator (ICD) therapy is effective in primary and secondary prevention for patients who are at high risk of sudden cardiac death. However, the current risk stratification of patients who may benefit from this therapy is unsatisfactory. Single nucleotide polymorphisms modulate the risk for arrhythmias and sudden cardiac death, and identification of common variants could help to better identify patients at risk. The DISCOVERY study is an interventional, longitudinal, prospective, multi-centre diagnostic study that will enrol 1287 patients in approximately 80 European... pathways will be investigated. As it is a diagnostic study, DISCOVERY will also investigate the impact of long-term device diagnostic data on the management of patients suffering from chronic cardiac disease as well as medical decisions made regarding their treatment.

  20. [Study of Chloroplast DNA Polymorphism in the Sunflower (Helianthus L.)].

    Science.gov (United States)

    Markina, N V; Usatov, A V; Logacheva, M D; Azarin, K V; Gorbachenko, C F; Kornienko, I V; Gavrilova, V A; Tihobaeva, V E

    2015-08-01

    The polymorphism of microsatellite loci of the chloroplast genome was studied in six Helianthus species and 46 lines of cultivated sunflower H. annuus (17 CMS lines and 29 Rf-lines). The differences between species are confined to four SSR loci. Within cultivated forms of the sunflower H. annuus, the polymorphism is absent. A comparative analysis was performed on the cpDNA sequences of inbred line 3629, line 398941 of the wild sunflower, and the American line HA383 of H. annuus. As a result, 52 polymorphic loci represented by 27 SSR and 25 SNP were found; they can be used for genotyping of H. annuus samples, including cultivated varieties: twelve polymorphic positions, of which eight are SSR and four are SNP.

  1. A comparison of rice chloroplast genomes

    DEFF Research Database (Denmark)

    Tang, Jiabin; Xia, Hong'ai; Cao, Mengliang

    2004-01-01

    Using high quality sequence reads extracted from our whole genome shotgun repository, we assembled two chloroplast genome sequences from two rice (Oryza sativa) varieties, one from 93-11 (a typical indica variety) and the other from PA64S (an indica-like variety with maternal origin of japonica), which are both parental varieties of the super-hybrid rice, LYP9. Based on the patterns of high sequence coverage, we partitioned chloroplast sequence variations into two classes, intravarietal and intersubspecific polymorphisms. Intravarietal polymorphisms refer to variations within 93-11 or PA64S...

  2. Molecular Identification of Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) Markers.

    Science.gov (United States)

    Al-Khalifah, Nasser S; Shanavaskhan, A E

    2017-01-01

    Ambiguity in the total number of date palm cultivars across the world is pointing toward the necessity for an enumerative study using standard morphological and molecular markers. Among molecular markers, DNA markers are more suitable and ubiquitous to most applications. They are highly polymorphic in nature, frequently occurring in genomes, easy to access, and highly reproducible. Various molecular markers such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), inter-simple sequence repeats (ISSR), and random amplified polymorphic DNA (RAPD) markers have been successfully used as efficient tools for analysis of genetic variation in date palm. This chapter explains a stepwise protocol for extracting total genomic DNA from date palm leaves. A user-friendly protocol for RAPD analysis and a table showing the primers used in different molecular techniques that produce polymorphisms in date palm are also provided.

  3. Complete mitochondrial genome sequence of black mustard (Brassica nigra; BB) and comparison with Brassica oleracea (CC) and Brassica carinata (BBCC).

    Science.gov (United States)

    Yamagishi, Hiroshi; Tanaka, Yoshiyuki; Terachi, Toru

    2014-11-01

    Crop species of Brassica (Brassicaceae) consist of three monogenomic species and three amphidiploid species resulting from interspecific hybridizations among them. Until now, mitochondrial genome sequences were available for only five of these species. We sequenced the mitochondrial genome of the sixth species, Brassica nigra (nuclear genome constitution BB), and compared it with those of Brassica oleracea (CC) and Brassica carinata (BBCC). The genome was assembled into a 232 145 bp circular sequence that is slightly larger than that of B. oleracea (219 952 bp). The genome of B. nigra contained 33 protein-coding genes, 3 rRNA genes, and 17 tRNA genes. The cox2-2 gene present in B. oleracea was absent in B. nigra. Although the nucleotide sequences of 52 genes were identical between B. nigra and B. carinata, the second exon of rps3 showed differences including an insertion/deletion (indel) and nucleotide substitutions. A PCR test to detect the indel revealed intraspecific variation in rps3, and in one line of B. nigra it amplified a DNA fragment of the size expected for B. carinata. In addition, the B. carinata lines tested here produced DNA fragments of the size expected for B. nigra. The results indicate that at least two mitotypes of B. nigra were present in the maternal parents of B. carinata.

  4. Complete Chloroplast Genome Sequences and Comparative Analysis of Chenopodium quinoa and C. album.

    Science.gov (United States)

    Hong, Su-Young; Cheon, Kyeong-Sik; Yoo, Ki-Oug; Lee, Hyun-Oh; Cho, Kwang-Soo; Suh, Jong-Taek; Kim, Su-Jeong; Nam, Jeong-Hwan; Sohn, Hwang-Bae; Kim, Yul-Ho

    2017-01-01

    The Chenopodium genus comprises ~150 species, including Chenopodium quinoa and Chenopodium album, two important crops with high nutritional value. To elucidate the phylogenetic relationship between the two species, the complete chloroplast (cp) genomes of these species were obtained by next generation sequencing. We performed comparative analysis of the sequences and, using InDel markers, inferred phylogeny and genetic diversity of the Chenopodium genus. The cp genome is 152,099 bp (C. quinoa) and 152,167 bp (C. album) long. In total, 119 genes (78 protein-coding, 37 tRNA, and 4 rRNA) were identified. We found 14 (C. quinoa) and 15 (C. album) tandem repeats (TRs); 14 TRs were present in both species and C. album and C. quinoa each had one species-specific TR. The trnI-GAU intron sequences contained one (C. quinoa) or two (C. album) copies of TRs (66 bp); the InDel marker was designed based on the copy number variation in TRs. Using the InDel markers, we detected this variation in the TR copy number in four species, Chenopodium hybridum, Chenopodium pumilio, Chenopodium ficifolium, and Chenopodium koraiense, but not in Chenopodium glaucum. A comparison of coding and non-coding regions between C. quinoa and C. album revealed divergent sites. Nucleotide diversity >0.025 was found in 17 regions: 14 were located in the large single copy region (LSC), one in the inverted repeats, and two in the small single copy region (SSC). A phylogenetic analysis based on 59 protein-coding genes from 25 taxa resolved Chenopodioideae as monophyletic and sister to Betoideae. The complete plastid genome sequences and molecular markers based on divergence hotspot regions in the two Chenopodium taxa will help to resolve the phylogenetic relationships of Chenopodium.
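
    To illustrate why a tandem-repeat copy number difference works as an InDel marker, the sketch below computes the expected PCR amplicon length for one or two copies of the 66 bp repeat. The flanking length and the function name are hypothetical; the abstract does not give primer positions.

```python
TR_UNIT_BP = 66  # repeat unit length reported in the abstract


def expected_amplicon_bp(flanking_bp: int, tr_copies: int,
                         unit_bp: int = TR_UNIT_BP) -> int:
    """Amplicon length = non-repeat sequence between primers + repeat copies."""
    return flanking_bp + tr_copies * unit_bp


# Hypothetical 200 bp of non-repeat sequence between the primers.
one_copy = expected_amplicon_bp(200, tr_copies=1)
two_copies = expected_amplicon_bp(200, tr_copies=2)
print(one_copy, two_copies, two_copies - one_copy)
```

    Whatever the flanking length, the two alleles differ by one repeat unit, so the bands should be separable on a standard gel.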

  5. Identification and characterization of transcript polymorphisms in soybean lines varying in oil composition and content.

    Science.gov (United States)

    Goettel, Wolfgang; Xia, Eric; Upchurch, Robert; Wang, Ming-Li; Chen, Pengyin; An, Yong-Qiang Charles

    2014-04-23

    Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality.

  6. Nested Inversion Polymorphisms Predispose Chromosome 22q11.2 to Meiotic Rearrangements

    NARCIS (Netherlands)

    Demaerel, Wolfram; Hestand, Matthew S.; Vergaelen, Elfi; Swillen, Ann; López-Sánchez, Marcos; Pérez-Jurado, Luis A.; McDonald-Mcginn, Donna M.; Zackai, Elaine; Emanuel, Beverly S.; Morrow, Bernice E.; Breckpot, Jeroen; Devriendt, Koenraad; Vermeesch, Joris R.; Antshel, Kevin M.; Arango, Celso; Armando, Marco; Bassett, Anne S.; Bearden, Carrie E.; Boot, Erik; Bravo-Sanchez, Marta; Breetvelt, Elemi; Busa, Tiffany; Butcher, Nancy J.; Campbell, Linda E.; Carmel, Miri; Chow, Eva W C; Crowley, T. Blaine; Cubells, Joseph; Cutler, David; Demaerel, Wolfram; Digilio, Maria Cristina; Duijff, Sasja; Eliez, Stephan; Emanuel, Beverly S.; Epstein, Michael P.; Evers, Rens; Fernandez Garcia-Moya, Luis; Fiksinski, Ania; Fraguas, David; Fremont, Wanda; Fritsch, Rosemarie; Garcia-Minaur, Sixto; Golden, Aaron; Gothelf, Doron; Guo, Tingwei; Gur, Ruben C.; Gur, Raquel E.; Heine-Suner, Damian; Hestand, Matthew; Hooper, Stephen R.; Kates, Wendy R.; Kushan, Leila; Laorden-Nieto, Alejandra; Maeder, Johanna; Marino, Bruno; Marshall, Christian R.; McCabe, Kathryn; McDonald-Mcginn, Donna M.; Michaelovosky, Elena; Morrow, Bernice E.; Moss, Edward; Mulle, Jennifer; Murphy, Declan; Murphy, Kieran C.; Murphy, Clodagh M.; Niarchou, Maria; Ornstein, Claudia; Owen, Michael J; Philip, Nicole; Repetto, Gabriela M.; Schneider, Maude; Shashi, Vandana; Simon, Tony J.; Swillen, Ann; Tassone, Flora; Unolt, Marta; Van Amelsvoort, Therese; van den Bree, Marianne B M; Van Duin, Esther; Vergaelen, Elfi; Vermeesch, Joris R.; Vicari, Stefano; Vingerhoets, Claudia; Vorstman, Jacob; Warren, Steve; Weinberger, Ronnie; Weisman, Omri; Weizman, Abraham; Zackai, Elaine; Zhang, Zhengdong; Zwick, Michael

    2017-01-01

    Inversion polymorphisms between low-copy repeats (LCRs) might predispose chromosomes to meiotic non-allelic homologous recombination (NAHR) events and thus lead to genomic disorders. However, for the 22q11.2 deletion syndrome (22q11.2DS), the most common genomic disorder, no such inversions have

  7. Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs

    Science.gov (United States)

    Green, Richard E; Braun, Edward L; Armstrong, Joel; Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Vandewege, Michael W; St John, John A; Capella-Gutiérrez, Salvador; Castoe, Todd A; Kern, Colin; Fujita, Matthew K; Opazo, Juan C; Jurka, Jerzy; Kojima, Kenji K; Caballero, Juan; Hubley, Robert M; Smit, Arian F; Platt, Roy N; Lavoie, Christine A; Ramakodi, Meganathan P; Finger, John W; Suh, Alexander; Isberg, Sally R; Miles, Lee; Chong, Amanda Y; Jaratlerdsiri, Weerachai; Gongora, Jaime; Moran, Christopher; Iriarte, Andrés; McCormack, John; Burgess, Shane C; Edwards, Scott V; Lyons, Eric; Williams, Christina; Breen, Matthew; Howard, Jason T; Gresham, Cathy R; Peterson, Daniel G; Schmitz, Jürgen; Pollock, David D; Haussler, David; Triplett, Eric W; Zhang, Guojie; Irie, Naoki; Jarvis, Erich D; Brochu, Christopher A; Schmidt, Carl J; McCarthy, Fiona M; Faircloth, Brant C; Hoffmann, Federico G; Glenn, Travis C; Gabaldón, Toni; Paten, Benedict; Ray, David A

    2015-01-01

    To provide context for the diversifications of archosaurs, the group that includes crocodilians, dinosaurs and birds, we generated draft genomes of three crocodilians, Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the relatively rapid evolution of bird genomes represents an autapomorphy within that clade. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these new data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs. PMID:25504731

  8. Whole genome sequencing-based characterization of extensively drug resistant (XDR) strains of Mycobacterium tuberculosis from Pakistan

    KAUST Repository

    Hasan, Zahra; Ali, Asho; McNerney, Ruth; Mallard, Kim; Hill-Cawthorne, Grant A.; Coll, Francesc; Nair, Mridul; Pain, Arnab; Clark, Taane G.; Hasan, Rumina

    2015-01-01

    Objectives: The global increase in drug resistance in Mycobacterium tuberculosis (MTB) strains increases the focus on improved molecular diagnostics for MTB. Extensively drug-resistant (XDR) TB is caused by MTB strains resistant to rifampicin, isoniazid, fluoroquinolone and aminoglycoside antibiotics. Resistance to anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular MTB genes. However, there is regional variation between MTB lineages and the SNPs associated with resistance. Therefore, there is a need to identify common resistance-conferring SNPs so that effective molecular-based diagnostic tests for MTB can be developed. This study used whole genome sequencing (WGS) to characterize 37 XDR MTB isolates from Pakistan and investigated SNPs related to drug resistance. Methods: XDR-TB strains were selected. DNA was extracted from MTB strains, and samples underwent WGS with 76-base paired-end fragment sizes using Illumina paired-end HiSeq2000 technology. Raw sequence data were mapped uniquely to the H37Rv reference genome. The mappings allowed SNPs and small indels to be called using SAMtools/BCFtools. Results: This study found that in all XDR strains, rifampicin resistance was attributable to SNPs in the rpoB RDR region. Isoniazid resistance-associated mutations were primarily related to katG codon 315 followed by inhA S94A. Fluoroquinolone resistance was attributable to gyrA codons 91-94 in most strains, while one did not have SNPs in either gyrA or gyrB. Aminoglycoside resistance was mostly associated with SNPs in rrs, except in 6 strains. Ethambutol resistant strains had embB codon 306 mutations, but many strains did not have this present. The SNPs were compared with those present in commercial assays such as LiPA Hain MDRTBsl, and the sensitivity of the assays for these strains was evaluated. Conclusions: If common drug resistance associated with SNPs evaluated the concordance between phenotypic and

  9. Whole genome sequencing-based characterization of extensively drug resistant (XDR) strains of Mycobacterium tuberculosis from Pakistan

    KAUST Repository

    Hasan, Zahra

    2015-03-01

    Objectives: The global increase in drug resistance in Mycobacterium tuberculosis (MTB) strains increases the focus on improved molecular diagnostics for MTB. Extensively drug-resistant (XDR) TB is caused by MTB strains resistant to rifampicin, isoniazid, fluoroquinolone and aminoglycoside antibiotics. Resistance to anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular MTB genes. However, there is regional variation between MTB lineages and the SNPs associated with resistance. Therefore, there is a need to identify common resistance-conferring SNPs so that effective molecular-based diagnostic tests for MTB can be developed. This study used whole genome sequencing (WGS) to characterize 37 XDR MTB isolates from Pakistan and investigated SNPs related to drug resistance. Methods: XDR-TB strains were selected. DNA was extracted from MTB strains, and samples underwent WGS with 76-base paired-end fragment sizes using Illumina paired-end HiSeq2000 technology. Raw sequence data were mapped uniquely to the H37Rv reference genome. The mappings allowed SNPs and small indels to be called using SAMtools/BCFtools. Results: This study found that in all XDR strains, rifampicin resistance was attributable to SNPs in the rpoB RDR region. Isoniazid resistance-associated mutations were primarily related to katG codon 315 followed by inhA S94A. Fluoroquinolone resistance was attributable to gyrA codons 91-94 in most strains, while one did not have SNPs in either gyrA or gyrB. Aminoglycoside resistance was mostly associated with SNPs in rrs, except in 6 strains. Ethambutol resistant strains had embB codon 306 mutations, but many strains did not have this present. The SNPs were compared with those present in commercial assays such as LiPA Hain MDRTBsl, and the sensitivity of the assays for these strains was evaluated. Conclusions: If common drug resistance associated with SNPs evaluated the concordance between phenotypic and

  10. Detection and correction of false segmental duplications caused by genome mis-assembly

    Science.gov (United States)

    2010-01-01

    Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes. PMID:20219098

  11. Detection of DNA methylation changes in micropropagated banana plants using methylation-sensitive amplification polymorphism (MSAP).

    Science.gov (United States)

    Peraza-Echeverria, S; Herrera-Valencia, V A.; Kay, A -J.

    2001-07-01

    The extent of DNA methylation polymorphisms was evaluated in micropropagated banana (Musa AAA cv. 'Grand Naine') derived from either the vegetative apex of the sucker or the floral apex of the male inflorescence using the methylation-sensitive amplification polymorphism (MSAP) technique. In all, 465 fragments, each representing a recognition site cleaved by either or both of the isoschizomers were amplified using eight combinations of primers. A total of 107 sites (23%) were found to be methylated at cytosine in the genome of micropropagated banana plants. In plants micropropagated from the male inflorescence explant 14 (3%) DNA methylation events were polymorphic, while plants micropropagated from the sucker explant produced 8 (1.7%) polymorphisms. No DNA methylation polymorphisms were detected in conventionally propagated banana plants. These results demonstrated the usefulness of MSAP to detect DNA methylation events in micropropagated banana plants and indicate that DNA methylation polymorphisms are associated with micropropagation.
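
    The percentages quoted above follow directly from the fragment counts. As a small check, the sketch below recomputes them from the 465 amplified fragments; the variable and function names are illustrative only.

```python
TOTAL_FRAGMENTS = 465  # amplified fragments reported in the abstract

counts = {
    "methylated sites": 107,
    "polymorphic events, male inflorescence explant": 14,
    "polymorphic events, sucker explant": 8,
}


def percent(count: int, total: int = TOTAL_FRAGMENTS) -> float:
    """Share of all amplified fragments, in percent."""
    return 100.0 * count / total


for label, count in counts.items():
    print(f"{label}: {count}/{TOTAL_FRAGMENTS} = {percent(count):.1f}%")
```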

  12. Search for methylation-sensitive amplification polymorphisms in mutant figs.

    Science.gov (United States)

    Rodrigues, M G F; Martins, A B G; Bertoni, B W; Figueira, A; Giuliatti, S

    2013-07-08

    Fig (Ficus carica) breeding programs that use conventional approaches to develop new cultivars are rare, owing to limited genetic variability and the difficulty in obtaining plants via gamete fusion. Cytosine methylation in plants leads to gene repression, thereby affecting transcription without changing the DNA sequence. Previous studies using random amplification of polymorphic DNA and amplified fragment length polymorphism markers revealed no polymorphisms among select fig mutants that originated from gamma-irradiated buds. Therefore, we conducted methylation-sensitive amplified polymorphism analysis to verify the existence of variability due to epigenetic DNA methylation among these mutant selections compared to the main cultivar 'Roxo-de-Valinhos'. Samples of genomic DNA were double-digested with either HpaII (methylation sensitive) or MspI (methylation insensitive) and with EcoRI. Fourteen primer combinations were tested, and on an average, non-methylated CCGG, symmetrically methylated CmCGG, and hemimethylated hmCCGG sites accounted for 87.9, 10.1, and 2.0%, respectively. MSAP analysis was effective in detecting differentially methylated sites in the genomic DNA of fig mutants, and methylation may be responsible for the phenotypic variation between treatments. Further analyses such as polymorphic DNA sequencing are necessary to validate these differences, standardize the regions of methylation, and analyze reads using bioinformatic tools.
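
    The abstract does not state the scoring rules the authors applied. As a hedged sketch of one commonly used MSAP convention, each fragment can be classified by whether a band appears in the EcoRI/HpaII and EcoRI/MspI digests. The function name and the example band patterns are hypothetical.

```python
from collections import Counter


def classify_msap(hpaii_band: bool, mspi_band: bool) -> str:
    """Classify a CCGG site from band presence in the two digests.

    Assumed convention: HpaII is blocked by internal-cytosine methylation
    but cuts hemimethylated external cytosine, while MspI is blocked by
    external-cytosine methylation but cuts when the internal one is methylated.
    """
    if hpaii_band and mspi_band:
        return "non-methylated CCGG"
    if mspi_band:
        return "internal cytosine methylated (CmCGG)"
    if hpaii_band:
        return "hemimethylated external cytosine (hmCCGG)"
    return "uninformative (both digests blocked or site absent)"


# Hypothetical (HpaII, MspI) band patterns for five fragments.
patterns = [(True, True), (True, True), (False, True), (True, False), (True, True)]
tally = Counter(classify_msap(h, m) for h, m in patterns)
for state, n in tally.items():
    print(f"{state}: {100.0 * n / len(patterns):.1f}%")
```

    Tallying the classes across all primer combinations gives class percentages of the kind reported in the abstract.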

  13. a potential source of spurious associations in genome-wide ...

    Indian Academy of Sciences (India)

    2010-04-01

    Genome-wide association studies (GWAS) examine the entire human genome with the goal of identifying genetic variants (usually single nucleotide polymorphisms (SNPs)) that are associated with phenotypic traits such as disease status and drug response. The discordance of significantly associated ...

  14. Utilization of complete chloroplast genomes for phylogenetic studies

    NARCIS (Netherlands)

    Ramlee, Shairul Izan Binti

    2016-01-01

    Chloroplast DNA sequence polymorphisms are a primary source of data in many plant phylogenetic studies. The chloroplast genome is relatively conserved in its evolution making it an ideal molecule to retain phylogenetic signals. The chloroplast genome is also largely, but not completely, free from

  15. Polymorphs of Pridopidine Hydrochloride

    DEFF Research Database (Denmark)

    Zimmermann, A.; Frostrup, B.; Bond, A. D.

    2012-01-01

    of both polymorphs contain N⁺-H···Cl⁻···N⁺-H··· interactions, and the polymorphism can be viewed as alternative orientations (parallel or antiparallel) of comparable molecular columns while retaining the ···N⁺-H···Cl⁻···N⁺-H··· motif between columns. Forms I and II have melting points of 199 and 210 °C, respectively. Following melting of form I, a kinetically controlled crystallization...

  16. Advances in genome editing for improved animal breeding: A review

    Directory of Open Access Journals (Sweden)

    Shakil Ahmad Bhat

    2017-11-01

    For centuries, traits for production and disease resistance have been targeted to improve the genetic merit of domestic animals, using conventional breeding programs such as inbreeding, outbreeding, or marker-assisted selection. The arrival of new scientific concepts, such as cloning and genome engineering, has added a new and promising research dimension to existing animal breeding programs. The development of genome editing technologies such as transcription activator-like effector nucleases, zinc finger nucleases, and clustered regularly interspaced short palindromic repeats systems began a new era of genome editing, in which any change in the genome, including specific DNA sequence changes or indels, can be made with unprecedented precision and specificity. Furthermore, gene-edited individuals offer an opportunity to increase the frequency of desirable alleles in an animal population more rapidly than conventional breeding. This research is evolving swiftly, with a focus on improving economically important animal species or their traits, all of which form an important subject of this review. It also discusses the hurdles to commercialization of these techniques despite several patent applications, owing to the ambiguous legal status of genome-editing methods on account of their disputed classification. Nonetheless, barring ethical concerns, gene editing of economically important genes offers tremendous potential for breeding animals with desirable traits.

  17. Localizing recent adaptive evolution in the human genome

    DEFF Research Database (Denmark)

    Williamson, Scott H; Hubisz, Melissa J; Clark, Andrew G

    2007-01-01

    ...-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American, and Chinese samples. We identify 101 regions of the human genome... , clusters of olfactory receptors, genes involved in nervous system development and function, immune system genes, and heat shock genes. We also observe consistent evidence of selective sweeps in centromeric regions. In general, we find that recent adaptation is strikingly pervasive in the human genome...

  18. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.
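
    Since the variants are offered as VCF downloads, a reader may want to separate SNPs from indels in such a file. The following is a minimal sketch that relies only on the standard tab-delimited VCF column order (CHROM, POS, ID, REF, ALT, ...); the function name and the example records are hypothetical and do not come from OryzaGenome.

```python
from typing import List


def classify_vcf_line(line: str) -> List[str]:
    """Classify each ALT allele of one VCF data line by comparing lengths."""
    fields = line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    kinds = []
    for alt in alts:
        if alt.startswith("<") or alt in {"*", "."}:
            kinds.append("other")  # symbolic, spanning, or missing allele
        elif len(ref) == 1 and len(alt) == 1:
            kinds.append("SNP")
        elif len(alt) > len(ref):
            kinds.append("insertion")
        elif len(alt) < len(ref):
            kinds.append("deletion")
        else:
            kinds.append("other")  # equal-length multi-base substitution
    return kinds


# Hypothetical records; header lines starting with '#' should be skipped.
records = [
    "chr01\t1203\t.\tA\tG\t.\tPASS\t.",
    "chr01\t5310\t.\tAT\tA,ATT\t.\tPASS\t.",
]
for record in records:
    print(classify_vcf_line(record))
```

    For production use a dedicated VCF parser is the safer choice, because it handles headers, genotype columns, and allele normalization.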

  19. Correction for Measurement Error from Genotyping-by-Sequencing in Genomic Variance and Genomic Prediction Models

    DEFF Research Database (Denmark)

    Ashraf, Bilal; Janss, Luc; Jensen, Just

    sample). The GBSeq data can be used directly in genomic models in the form of individual SNP allele-frequency estimates (e.g., reference reads/total reads per polymorphic site per individual), but is subject to measurement error due to the low sequencing depth per individual. Due to technical reasons....... In the current work we show how the correction for measurement error in GBSeq can also be applied in whole genome genomic variance and genomic prediction models. Bayesian whole-genome random regression models are proposed to allow implementation of large-scale SNP-based models with a per-SNP correction...... for measurement error. We show correct retrieval of genomic explained variance, and improved genomic prediction when accounting for the measurement error in GBSeq data...
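
    The per-individual allele-frequency estimate mentioned above (reference reads / total reads per site) can be illustrated with a minimal sketch. The function name and the simple binomial error model are assumptions made for illustration only; they are not the Bayesian random regression model the authors propose.

```python
def allele_frequency_estimate(ref_reads, total_reads):
    """Return (estimate, sampling_variance) for one site in one individual.

    Assumption (illustrative): reads are independent draws, so the estimate
    p = ref_reads / total_reads has binomial variance p * (1 - p) / total_reads.
    """
    if total_reads <= 0:
        return None, None  # no reads: the site is uninformative
    p = ref_reads / total_reads
    return p, p * (1.0 - p) / total_reads


# Same estimate, very different precision at low versus high depth.
low_depth = allele_frequency_estimate(1, 2)
high_depth = allele_frequency_estimate(20, 40)
```

    Under this assumed model the two calls give the same estimate (0.5), but the variance at depth 2 is twenty times larger than at depth 40, which is the kind of depth-dependent measurement error the abstract refers to.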

  20. Simultaneous Indel Events in Intron 7 of the Branched-Chain α-Ketoacid Dehydrogenase E1A (BCKDHA) Gene in Madura Cattle

    Directory of Open Access Journals (Sweden)

    Asri Febriana

    2015-08-01

    Madura cattle are one of the Indonesian local cattle breeds, derived from crossing Zebu cattle (Bos indicus) and banteng (Bos javanicus). Branched-chain α-ketoacid dehydrogenase (BCKDH) is one of the main enzyme complexes in the inner mitochondrial membrane; it metabolizes the branched-chain amino acids (BCAAs), i.e., valine, leucine, and isoleucine. The diversity of the nucleotide sequences of the genes largely determines the efficiency of the encoded enzyme. This paper aimed to determine the nucleotide variation in intron 7, exon 8, and intron 8 of the BCKDHA gene in Madura cattle. The study was conducted on three Madura cattle used, respectively, for bull racing (karapan), beauty contests (sonok), and beef production. The analysis showed that variation was higher in the introns than in the exon. Simultaneous indels were found at base positions 34 and 68 in the sonok animal. In addition, the C266T variant was found in the beef animal. These variants do not cause significant changes in amino acids. No mutation specific to any use category of Madura cattle was found in intron 7, exon 8, or intron 8. This indicates that selection pressure on the BCKDHA gene has not differentiated Madura cattle by use category.

  1. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

    Science.gov (United States)

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-09-01

    Usher syndrome is an autosomal recessive disorder characterized by both deafness and blindness. For the three clinical subtypes of Usher syndrome, causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, its molecular analysis is well suited to a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing of exons of Usher genes and compare the costs and workload of this approach with those of Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.

  2. Genome shotgun sequencing and development of microsatellite ...

    African Journals Online (AJOL)

    Analysis of the gerbera genome DNA ('Raon') general library showed that sequences of (AT), (AG), (AAG) and (AAT) repeats appeared most often, whereas (AC), (AAC) and (ACC) were the least frequent. Primer pairs were designed for 80 loci. Only eight primer pairs produced reproducible polymorphic bands in the 28 ...

  3. Extreme genomes

    OpenAIRE

    DeLong, Edward F

    2000-01-01

    The complete genome sequence of Thermoplasma acidophilum, an acid- and heat-loving archaeon, has recently been reported. Comparative genomic analysis of this 'extremophile' is providing new insights into the metabolic machinery, ecology and evolution of thermophilic archaea.

  4. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  5. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  6. Teaching polymorphism early

    DEFF Research Database (Denmark)

    2005-01-01

    Is it possible to teach dynamic polymorphism early? What techniques could facilitate teaching it in Java? This panel will bring together people who have considered this question and attempted to implement it in various ways, some more completely than others. It will also give participants...

  7. Transposable element activity, genome regulation and human health.

    Science.gov (United States)

    Wang, Lu; Jordan, I King

    2018-03-02

    A convergence of novel genome analysis technologies is enabling population genomic studies of human transposable elements (TEs). Population surveys of human genome sequences have uncovered thousands of individual TE insertions that segregate as common genetic variants, i.e. TE polymorphisms. These recent TE insertions provide an important source of naturally occurring human genetic variation. Investigators are beginning to leverage population genomic data sets to execute genome-scale association studies for assessing the phenotypic impact of human TE polymorphisms. For example, the expression quantitative trait loci (eQTL) analytical paradigm has recently been used to uncover hundreds of associations between human TE insertion variants and gene expression levels. These include population-specific gene regulatory effects as well as coordinated changes to gene regulatory networks. In addition, analyses of linkage disequilibrium patterns with previously characterized genome-wide association study (GWAS) trait variants have uncovered TE insertion polymorphisms that are likely causal variants for a variety of common complex diseases. Gene regulatory mechanisms that underlie specific disease phenotypes have been proposed for a number of these trait associated TE polymorphisms. These new population genomic approaches hold great promise for understanding how ongoing TE activity contributes to functionally relevant genetic variation within and between human populations. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Common variants at the CHEK2 gene locus and risk of epithelial ovarian cancer

    NARCIS (Netherlands)

    Lawrenson, K.; Iversen, E.S.; Tyrer, J.; Weber, R.P.; Concannon, P.; Hazelett, D.J.; Li, Q.; Marks, J.R.; Berchuck, A.; Lee, J.M.; Aben, K.K.H.; Anton-Culver, H.; Antonenkova, N.; Bandera, E.V.; Bean, Y.; Beckmann, M.W.; Bisogna, M.; Bjorge, L.; Bogdanova, N.; Brinton, L.A.; Brooks-Wilson, A.; Bruinsma, F.; Butzow, R.; Campbell, I.G.; Carty, K.; Chang-Claude, J.; Chenevix-Trench, G.; Chen, A; Chen, Z.; Cook, L.S.; Cramer, D.W; Cunningham, J.M.; Cybulski, C.; Plisiecka-Halasa, J.; Dennis, J.; Dicks, E.; Doherty, J.A.; Dork, T.; Bois, A. du; Eccles, D.; Easton, D.T.; Edwards, R.P.; Eilber, U.; Ekici, A.B.; Fasching, P.A.; Fridley, B.L.; Gao, Y.T.; Gentry-Maharaj, A.; Giles, G.G.; Glasspool, R.; Goode, E.L.; Goodman, M.T.; Gronwald, J.; Harter, P.; Hasmad, H.N.; Hein, A.; Heitz, F.; Hildebrandt, M.A.T.; Hillemanns, P.; Hogdall, E.; Hogdall, C.; Hosono, S.; Jakubowska, A.; Paul, J.; Jensen, A.; Karlan, B.Y.; Kjaer, S.K.; Kelemen, L.E.; Kellar, M.; Kelley, J.L.; Kiemeney, L.A.; Krakstad, C.; Lambrechts, D.; Lambrechts, S.; Le, N.D.; Lee, A.W.; Cannioto, R.; Leminen, A.; Lester, J.; Levine, D.A.; Liang, D.; Lissowska, J.; Lu, K.; Lubinski, J.; Lundvall, L.; Massuger, L.F.; Matsuo, K.; McGuire, V.; McLaughlin, J.R.; Nevanlinna, H.; McNeish, I.; Menon, U.; Modugno, F.; Moysich, K.B.; Narod, S.A.; Nedergaard, L.; Ness, R.B.; Azmi, M.A. Noor; Odunsi, K.; Olson, S.H.

    2015-01-01

    Genome-wide association studies have identified 20 genomic regions associated with risk of epithelial ovarian cancer (EOC), but many additional risk variants may exist. Here, we evaluated associations between common genetic variants [single nucleotide polymorphisms (SNPs) and indels] in DNA repair

  9. Clinical Implications of Human Population Differences in Genome-wide Rates of Functional Genotypes

    Directory of Open Access Journals (Sweden)

    Ali Torkamani

    2012-11-01

    There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates of likely functional derived (i.e., non-ancestral, based on the chimp genome) single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ~5.5-6.1 million total derived variants, of which ~12,000 are likely to have a functional effect (~5000 coding and ~7000 non-coding). We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.

  10. Detection of DNA polymorphisms in Dendrobium Sonia White mutant lines

    International Nuclear Information System (INIS)

    Affrida Abu Hassan; Putri Noor Faizah Megat Mohd Tahir; Zaiton Ahmad; Mohd Nazir Basiran

    2006-01-01

    Dendrobium Sonia white mutant lines were obtained through gamma-ray-induced mutation of purple-flowered Dendrobium Sonia at a dose of 35 Gy. The amplified fragment length polymorphism (AFLP) technique was used to compare genomic variation in these mutant lines with the control. Our objective was to detect polymorphic fragments from these mutants to provide useful information on genes involved in flower colour expression. AFLP is a PCR-based DNA fingerprinting technique. It involves digestion of DNA with restriction enzymes, ligation of adapters, and selective amplification using primers with one (pre-amplification) and three (selective amplification) arbitrary nucleotides. A total of 20 primer combinations were tested and 7 produced clear fingerprint patterns. From these, 13 polymorphic bands were successfully isolated and cloned. (Author)

  11. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Amaranth ( L. is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2n = 32), and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N50 of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (, , and and their putative progenitor ( were resequenced. A single nucleotide polymorphism (SNP) phylogeny supported the classification of as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N50 of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet ( L. and estimated, using analysis, the age of the most recent polyploidization event in amaranth.
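
    Assuming the scaffold contiguity statistic quoted above (371 kb, rising to 697 kb in the hybrid assembly) is the standard N50, a minimal sketch of how that statistic is computed follows. The function name and the toy scaffold lengths are hypothetical and are not taken from the amaranth assembly.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


# Toy example: total 300, half 150; 100 + 80 = 180 crosses the halfway mark.
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)
```

    Merging scaffolds, as a hybrid assembly with physical maps does, lengthens the longest sequences and so raises this value without changing the total assembled size.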

  12. BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment

    Science.gov (United States)

    Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy

    2016-01-01

    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome. PMID:27461955
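
    The idea of a per-sample mutagenesis efficiency can be sketched as the fraction of reads at the target site that carry an indel. The sketch below is not BATCH-GE's implementation or interface; the function name, the sample names, and the use of SAM CIGAR strings as input are assumptions made for illustration.

```python
def indel_efficiency(cigars):
    """Fraction of reads whose CIGAR string contains an insertion (I) or deletion (D).

    Assumption (illustrative): every CIGAR string passed in belongs to a read
    that overlaps the targeted cut site.
    """
    if not cigars:
        return 0.0
    edited = sum(1 for cigar in cigars if "I" in cigar or "D" in cigar)
    return edited / len(cigars)


# Hypothetical samples processed in one batch.
samples = {
    "sample_A": ["50M", "20M2D28M", "25M1I24M", "50M"],
    "sample_B": ["50M", "50M"],
}
efficiencies = {name: indel_efficiency(reads) for name, reads in samples.items()}
```

    In this toy batch, half of the reads in sample_A carry an indel and none in sample_B, so the computed efficiencies are 0.5 and 0.0.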

  13. Efficient CRISPR/Cas9-Based Genome Engineering in Human Pluripotent Stem Cells.

    Science.gov (United States)

    Kime, Cody; Mandegar, Mohammad A; Srivastava, Deepak; Yamanaka, Shinya; Conklin, Bruce R; Rand, Tim A

    2016-01-01

    Human pluripotent stem cells (hPS cells) are rapidly emerging as a powerful tool for biomedical discovery. The advent of human induced pluripotent stem cells (hiPS cells) with human embryonic stem (hES)-cell-like properties has led to hPS cells with disease-specific genetic backgrounds for in vitro disease modeling and drug discovery as well as mechanistic and developmental studies. To fully realize this potential, it will be necessary to modify the genome of hPS cells with precision and flexibility. Pioneering experiments utilizing site-specific double-strand break (DSB)-mediated genome engineering tools, including zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), have paved the way to genome engineering in previously recalcitrant systems such as hPS cells. However, these methods are technically cumbersome and require significant expertise, which has limited adoption. A major recent advance involving the clustered regularly interspaced short palindromic repeats (CRISPR) endonuclease has dramatically simplified the effort required for genome engineering and will likely be adopted widely as the most rapid and flexible system for genome editing in hPS cells. In this unit, we describe commonly practiced methods for CRISPR endonuclease genomic editing of hPS cells into cell lines containing genomes altered by insertion/deletion (indel) mutagenesis or insertion of recombinant genomic DNA. Copyright © 2016 John Wiley & Sons, Inc.

  14. [Recent advances of amplified fragment length polymorphism and its applications in forensic botany].

    Science.gov (United States)

    Li, Cheng-Tao; Li, Li

    2008-10-01

    Amplified fragment length polymorphism (AFLP) is a new molecular marker technique for detecting genomic polymorphism. The technology offers high resolution, good stability, and reproducibility. AFLP-related technologies have advanced considerably in recent years, and several extended AFLP methodologies are now available. AFLP has been widely used in plants, animals, and microbes, and it has become one of the hotspots in forensic botany. This review focuses on recent advances in AFLP and its applications in forensic biology.

  15. Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

    Directory of Open Access Journals (Sweden)

    Joon-Ho Lee

    2014-09-01

    Genetics is important for breeding and selection of horses but there is a lack of well-established horse-related browsers or databases. In order to better understand horses, more variants and other integrated information are needed. Thus, we construct a horse genomic variants database including expression and other information. Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) provides the number of unexplored genomic variants still remaining to be identified in the horse genome including rare variants by using population genome sequences of eighteen horses and RNA-seq of four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and variants of RNA-seq, which showed a concordance level of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies of Thoroughbreds.

  16. Genomic Variants Revealed by Invariably Missing Genotypes in Nelore Cattle.

    Directory of Open Access Journals (Sweden)

    Joaquim Manoel da Silva

    High density genotyping panels have been used in a wide range of applications. From population genetics to genome-wide association studies, this technology still offers the lowest cost and the most consistent solution for generating SNP data. However, in spite of the application, part of the generated data is always discarded from final datasets based on quality control criteria used to remove unreliable markers. Some discarded data consists of markers that failed to generate genotypes, labeled as missing genotypes. A subset of missing genotypes that occur in the whole population under study may be caused by technical issues but can also be explained by the presence of genomic variations that are in the vicinity of the assayed SNP and that prevent genotyping probes from annealing. The latter case may contain relevant information because these missing genotypes might be used to identify population-specific genomic variants. In order to assess which case is more prevalent, we used Illumina HD Bovine chip genotypes from 1,709 Nelore (Bos indicus) samples. We found 3,200 missing genotypes among the whole population. NGS re-sequencing data from 8 sires were used to verify the presence of genomic variations within their flanking regions in 81.56% of these missing genotypes. Furthermore, we discovered 3,300 novel SNPs/Indels, 31% of which are located in genes that may affect traits of importance for the genetic improvement of cattle production.
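
    The selection step described above, finding markers whose genotype is missing in every sample of the population, can be sketched as follows. The data layout (a dictionary of marker name to per-sample calls, with None for a missing call) and the marker names are hypothetical and are not the format of the Illumina data used in the study.

```python
def invariably_missing(genotypes):
    """Return the sorted names of markers with a missing call (None) in every sample."""
    return sorted(
        marker
        for marker, calls in genotypes.items()
        if calls and all(call is None for call in calls)
    )


# Hypothetical calls for three markers across four samples.
calls_by_marker = {
    "snp_1": ["AA", "AB", None, "BB"],   # sporadic missingness
    "snp_2": [None, None, None, None],   # missing in the whole population
    "snp_3": ["AB", "AB", "AA", "AA"],   # fully genotyped
}
candidates = invariably_missing(calls_by_marker)
```

    Only snp_2 is returned here; markers with sporadic missing calls are excluded, since the abstract's argument applies to genotypes that fail across the whole population.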

  17. Rapid evolutionary change of common bean (Phaseolus vulgaris L.) plastome, and the genomic diversification of legume chloroplasts

    Directory of Open Access Journals (Sweden)

    Dávila Guillermo

    2007-07-01

    Background: Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results: We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean 1. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion: Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome.

  18. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.

  19. No association between a common single nucleotide polymorphism, rs4141463, in the MACROD2 gene and autism spectrum disorder.

    NARCIS (Netherlands)

    Curran, S.; Bolton, P.; Rozsnyai, K.; Chiocchetti, A.; Klauck, S.M.; Duketis, E.; Poustka, F.; Schlitt, S.; Freitag, C.M.; Lee, I. van der; Muglia, P.; Poot, M.; Staal, W.G.; Jonge, M.V. de; Ophoff, R.A.; Lewis, C.; Skuse, D.; Mandy, W.; Vassos, E.; Fossdal, R.; Magnusson, P.; Hreidarsson, S.; Saemundsen, E.; Stefansson, H.; Stefansson, K.; Collier, D.

    2011-01-01

    The Autism Genome Project (AGP) Consortium recently reported genome-wide significant association between autism and an intronic single nucleotide polymorphism marker, rs4141463, within the MACROD2 gene. In the present study we attempted to replicate this finding using an independent case-control

  20. Genome sequencing and comparative genomics analysis revealed pathogenic potential in Penicillium capsulatum as a novel fungal pathogen belonging to Eurotiales

    Directory of Open Access Journals (Sweden)

    Ying Yang

    2016-10-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptome of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and closely related strains belonging to Eurotiales were performed. The assembled genomes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, indels or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen.

  1. A simple optimization can improve the performance of single feature polymorphism detection by Affymetrix expression arrays

    Directory of Open Access Journals (Sweden)

    Fujisawa Hironori

    2010-05-01

    Background: High-density oligonucleotide arrays are effective tools for genotyping numerous loci simultaneously. In small genome species (genome size: ... Results: We compared the single feature polymorphism (SFP) detection performance of whole-genome and transcript hybridizations using the Affymetrix GeneChip® Rice Genome Array, using the rice cultivars with full genome sequence, japonica cultivar Nipponbare and indica cultivar 93-11. Both genomes were surveyed for all probe target sequences. Only completely matched 25-mer single copy probes of the Nipponbare genome were extracted, and SFPs between them and 93-11 sequences were predicted. We investigated optimum conditions for SFP detection in both whole genome and transcript hybridization using differences between perfect match and mismatch probe intensities of non-polymorphic targets, assuming that these differences are representative of those between mismatch and perfect targets. Several statistical methods of SFP detection by whole-genome hybridization were compared under the optimized conditions. Causes of false positives and negatives in SFP detection in both types of hybridization were investigated. Conclusions: The optimizations allowed a more than 20% increase in true SFP detection in whole-genome hybridization and a large improvement of SFP detection performance in transcript hybridization. Significance analysis of the microarray for log-transformed raw intensities of PM probes gave the best performance in whole genome hybridization, and 22,936 true SFPs were detected with 23.58% false positives by whole genome hybridization. For transcript hybridization, stable SFP detection was achieved for highly expressed genes, and about 3,500 SFPs were detected at a high sensitivity (> 50%) in both shoot and young panicle transcripts. High SFP detection performances of both genome and transcript hybridizations indicated that microarrays of a complex genome (e.g., of Oryza sativa) can be

  2. Genotypic Characterization of Bradyrhizobium Strains Nodulating Endemic Woody Legumes of the Canary Islands by PCR-Restriction Fragment Length Polymorphism Analysis of Genes Encoding 16S rRNA (16S rDNA) and 16S-23S rDNA Intergenic Spacers, Repetitive Extragenic Palindromic PCR Genomic Fingerprinting, and Partial 16S rDNA Sequencing

    Science.gov (United States)

    Vinuesa, Pablo; Rademaker, Jan L. W.; de Bruijn, Frans J.; Werner, Dietrich

    1998-01-01

    We present a phylogenetic analysis of nine strains of symbiotic nitrogen-fixing bacteria isolated from nodules of tagasaste (Chamaecytisus proliferus) and other endemic woody legumes of the Canary Islands, Spain. These and several reference strains were characterized genotypically at different levels of taxonomic resolution by computer-assisted analysis of 16S ribosomal DNA (rDNA) PCR-restriction fragment length polymorphisms (PCR-RFLPs), 16S-23S rDNA intergenic spacer (IGS) RFLPs, and repetitive extragenic palindromic PCR (rep-PCR) genomic fingerprints with BOX, ERIC, and REP primers. Cluster analysis of 16S rDNA restriction patterns with four tetrameric endonucleases grouped the Canarian isolates with the two reference strains, Bradyrhizobium japonicum USDA 110spc4 and Bradyrhizobium sp. strain (Centrosema) CIAT 3101, resolving three genotypes within these bradyrhizobia. In the analysis of IGS RFLPs with three enzymes, six groups were found, whereas rep-PCR fingerprinting revealed an even greater genotypic diversity, with only two of the Canarian strains having similar fingerprints. Furthermore, we show that IGS RFLPs and even very dissimilar rep-PCR fingerprints can be clustered into phylogenetically sound groupings by combining them with 16S rDNA RFLPs in computer-assisted cluster analysis of electrophoretic patterns. The DNA sequence analysis of a highly variable 264-bp segment of the 16S rRNA genes of these strains was found to be consistent with the fingerprint-based classification. Three different DNA sequences were obtained, one of which was not previously described, and all belonged to the B. japonicum/Rhodopseudomonas rDNA cluster. Nodulation assays revealed that none of the Canarian isolates nodulated Glycine max or Leucaena leucocephala, but all nodulated Acacia pendula, C. proliferus, Macroptilium atropurpureum, and Vigna unguiculata. PMID:9603820

  3. Intragenomic polymorphisms among high-copy loci: a genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae).

    Science.gov (United States)

    Weitemier, Kevin; Straub, Shannon C K; Fishbein, Mark; Liston, Aaron

    2015-01-01

    Despite knowledge that concerted evolution of high-copy loci is often imperfect, studies that investigate the extent of intragenomic polymorphisms and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus, and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the "noncoding" ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).
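    The tallying step described in this abstract (count, at each position, the reads that differ from the individual's consensus) can be illustrated with a minimal Python sketch. This is not the authors' Perl script. The input format (ungapped reads given as a 0-based start plus a sequence), the function names, and the 2% reporting threshold are all assumptions made for illustration.

```python
def tally_nonconsensus(consensus, aligned_reads):
    """Return, per consensus position, the fraction of read bases that differ.

    aligned_reads is an iterable of (start, sequence) pairs: a hypothetical,
    simplified stand-in for real alignments (ungapped, 0-based start).
    """
    depth = [0] * len(consensus)
    mismatches = [0] * len(consensus)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Skip bases that fall outside the consensus or are uncalled.
            if pos < 0 or pos >= len(consensus) or base == "N":
                continue
            depth[pos] += 1
            if base != consensus[pos]:
                mismatches[pos] += 1
    # Positions with no coverage are reported as 0.0 rather than dividing by zero.
    return [m / d if d else 0.0 for m, d in zip(mismatches, depth)]


def polymorphic_sites(freqs, min_freq=0.02):
    """Positions whose non-consensus fraction reaches min_freq (assumed cutoff)."""
    return [i for i, f in enumerate(freqs) if f >= min_freq]


reads = [(0, "ACGT"), (0, "ACTT"), (1, "CGT"), (2, "GT")]
freqs = tally_nonconsensus("ACGT", reads)
sites = polymorphic_sites(freqs)
```

    In this toy input only the third position carries a non-consensus base (one read in four), so it is the only site reported. A real pipeline would also need to handle gaps, base qualities, and sequencing error, none of which is modelled here.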

  4. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    2010-04-01

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  5. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  6. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Directory of Open Access Journals (Sweden)

    Yao Xu

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study we sequenced, for the first time, the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold average depth, covering 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the leptin receptor gene (LEPR), overlapping with CNV_1815, showed remarkably higher copy number in Qinchuan than in Nanyang (log2 ratio = -2.34988; P value = 1.53E-102). Further qPCR and association analysis showed that the copy number of the LEPR gene correlated positively with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. 
Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs

  7. Differentiation and diagnosis of Pseudocercosporella herpotrichoides (Fron) Deighton with genomic DNA probes

    DEFF Research Database (Denmark)

    Frei, U; Wenzel, G.

    1993-01-01

    Repetitive genomic clones were used to differentiate between varieties within the species Pseudocercosporella herpotrichoides. From 21 clones tested 13 revealed restriction fragment length polymorphisms among isolates. Cluster analysis was performed based on these data. Differentiation of isolate...

  8. Host genome variations and risk of infections during induction treatment for childhood acute lymphoblastic leukaemia

    DEFF Research Database (Denmark)

    Lund, Bendik; Wesolowska-Andersen, Agata; Lausen, Birgitte

    2014-01-01

    Objectives: To investigate association of host genomic variation and risk of infections during treatment for childhood acute lymphoblastic leukaemia (ALL). Methods: We explored association of 34 000 single-nucleotide polymorphisms (SNPs) related primarily to pharmacogenomics and immune function...

  9. An international collaborative family-based whole genome quantitative trait linkage scan for myopic refractive error

    DEFF Research Database (Denmark)

    Abbott, Diana; Li, Yi-Ju; Guggenheim, Jeremy A

    2012-01-01

    To investigate quantitative trait loci linked to refractive error, we performed a genome-wide quantitative trait linkage analysis using single nucleotide polymorphism markers and family data from five international sites....

  10. Single-tube tetradecaplex panel of highly polymorphic microsatellite markers hemophilia A.

    Science.gov (United States)

    Zhao, M; Chen, M; Tan, A S C; Cheah, F S H; Mathew, J; Wong, P C; Chong, S S

    2017-07-01

    Essentials: Preimplantation genetic diagnosis (PGD) of severe hemophilia A relies on linkage analysis. Simultaneous multi-marker screening can simplify selection of informative markers in a couple. We developed a single-tube tetradecaplex panel of polymorphic markers for hemophilia A PGD use. Informative markers can be used for linkage analysis alone or combined with mutation detection. Background: It is currently not possible to perform single-cell preimplantation genetic diagnosis (PGD) to directly detect the common inversion mutations of the factor VIII (F8) gene responsible for severe hemophilia A (HEMA). As such, PGD for such inversion carriers relies on indirect analysis of linked polymorphic markers. Objectives: To simplify linkage-based PGD of HEMA, we aimed to develop a panel of highly polymorphic microsatellite markers located near the F8 gene that could be simultaneously genotyped in a multiplex-PCR reaction. Methods: We assessed the polymorphism of various microsatellite markers located ≤ 1 Mb from F8 in 177 female subjects. Highly polymorphic markers were selected for co-amplification with the AMELX/Y indel dimorphism in a single-tube reaction. Results: Thirteen microsatellite markers located within 0.6 Mb of F8 were successfully co-amplified with AMELX/Y in a single-tube reaction. Observed heterozygosities of component markers ranged from 0.43 to 0.84, and ∼70-80% of individuals were heterozygous for ≥ 5 markers. The tetradecaplex panel successfully identified fully informative markers in a couple interested in PGD for HEMA because of an intragenic F8 point mutation, with haplotype phasing established through a carrier daughter. In-vitro fertilization (IVF)-PGD involved single-tube co-amplification of fully informative markers with AMELX/Y and the mutation-containing F8 amplicon, followed by microsatellite analysis and amplicon mutation-site minisequencing analysis. Conclusions: The single-tube multiplex-PCR format of this highly polymorphic
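    The observed heterozygosity quoted for each marker is the fraction of typed individuals who carry two different alleles at that marker. A minimal Python sketch of that calculation follows, together with a per-individual count of heterozygous markers. The genotype encoding (a pair of allele sizes, or None for a failed call), the function names, and the marker names and allele sizes are hypothetical and are not taken from the study.

```python
def observed_heterozygosity(genotypes):
    """Fraction of successfully typed individuals that are heterozygous.

    genotypes: list of (allele_a, allele_b) tuples, one per individual,
    with None marking a missing call (assumed encoding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this marker")
    heterozygous = sum(1 for a, b in called if a != b)
    return heterozygous / len(called)


def heterozygous_marker_count(individual):
    """Number of markers at which one individual carries two different alleles.

    individual: dict mapping marker name -> (allele_a, allele_b) or None.
    """
    return sum(1 for g in individual.values() if g is not None and g[0] != g[1])


# Invented allele sizes (in bp) for one marker across five subjects.
marker = [(152, 156), (152, 152), (148, 156), None, (156, 156)]
h_obs = observed_heterozygosity(marker)

# Invented genotypes for one subject across three hypothetical markers.
subject = {"M1": (152, 156), "M2": (201, 201), "M3": (98, 104)}
n_het = heterozygous_marker_count(subject)
```

    Here two of the four typed subjects are heterozygous, giving an observed heterozygosity of 0.5, and the example subject is heterozygous at two of three markers. Whether a heterozygous marker is fully informative for linkage also depends on the partner's genotype and on phasing, which this sketch does not address.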

  11. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium.

  12. Analysis of the genetic variation in Mycobacterium tuberculosis strains by multiple genome alignments

    Directory of Open Access Journals (Sweden)

    Morales Juan

    2008-11-01

    Background: The recent determination of the complete nucleotide sequence of several Mycobacterium tuberculosis (MTB) genomes allows the use of comparative genomics as a tool for dissecting the nature and consequence of genetic variability within this species. The multiple alignment of the genomes of clinical strains (CDC1551, F11, Haarlem and C), along with the genomes of laboratory strains (H37Rv and H37Ra), provides new insights on the mechanisms of adaptation of this bacterium to the human host. Findings: The genetic variation found in six M. tuberculosis strains does not involve significant genomic rearrangements. Most of the variation results from deletion and transposition events preferentially associated with insertion sequences and genes of the PE/PPE family but not with genes implicated in virulence. Using the Perl-based software islandsanalyser, which creates a representation of the genetic variation in the genome, we identified differences in the patterns of distribution and frequency of the polymorphisms across the genome. The identification of genes displaying strain-specific polymorphisms and the extrapolation of the number of strain-specific polymorphisms to an unlimited number of genomes indicate that the different strains contain a limited number of unique polymorphisms. Conclusion: The comparison of multiple genomes demonstrates that the M. tuberculosis genome is currently undergoing an active process of gene decay, analogous to the adaptation process of obligate bacterial symbionts. This observation opens new perspectives into the evolution and the understanding of the pathogenesis of this bacterium.

  13. Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages

    DEFF Research Database (Denmark)

    Deng, Xiangyu; Desai, Prerak T.; den Bakker, Henk C.

    2014-01-01

    serotype Nitra strains. Single-nucleotide polymorphisms were filtered to identify 4,887 reliable loci that distinguished all isolates from each other. Our whole-genome single-nucleotide polymorphism typing approach was robust for S. enterica Enteritidis subtyping with combined data for different strains...

  14. Low levels of LTR retrotransposon deletion by ectopic recombination in the gigantic genomes of salamanders.

    Science.gov (United States)

    Frahry, Matthew Blake; Sun, Cheng; Chong, Rebecca A; Mueller, Rachel Lockridge

    2015-02-01

    Across the tree of life, species vary dramatically in nuclear genome size. Mutations that add or remove sequences from genomes (insertions or deletions, or indels) are the ultimate source of this variation. Differences in the tempo and mode of insertion and deletion across taxa have been proposed to contribute to evolutionary diversity in genome size. Among vertebrates, most of the largest genomes are found within the salamanders, an amphibian clade with genome sizes ranging from ~14 to ~120 Gb. Salamander genomes have been shown to experience slower rates of DNA loss through small (i.e., genomes. However, no studies have addressed DNA loss from salamander genomes resulting from larger deletions. Here, we focus on one type of large deletion: ectopic-recombination-mediated removal of LTR retrotransposon sequences. In ectopic recombination, double-strand breaks are repaired using a "wrong" (i.e., ectopic, or non-allelic) template sequence, typically another locus of similar sequence. When breaks occur within the LTR portions of LTR retrotransposons, ectopic-recombination-mediated repair can produce deletions that remove the internal transposon sequence and the equivalent of one of the two LTR sequences. These deletions leave a signature in the genome: a solo LTR sequence. We compared levels of solo LTRs in the genomes of four salamander species with levels present in five vertebrates with smaller genomes. Our results demonstrate that salamanders have low levels of solo LTRs, suggesting that ectopic-recombination-mediated deletion of LTR retrotransposons occurs more slowly than in other vertebrates with smaller genomes.

  15. Genome Imprinting

    Indian Academy of Sciences (India)

    the cell nucleus (mitochondrial and chloroplast genomes), and. (3) traits governed ... tively good embryonic development but very poor development of membranes and ... Human homologies for the type of situation described above are naturally ..... imprint; (b) New modifications of the paternal genome in germ cells of each ...

  16. Baculovirus Genomics

    NARCIS (Netherlands)

    Oers, van M.M.; Vlak, J.M.

    2007-01-01

    Baculovirus genomes are covalently closed circles of double-stranded DNA varying in size between 80 and 180 kilobase pairs. The genomes of more than forty-one baculoviruses have been sequenced to date. The majority of these (37) are pathogenic to lepidopteran hosts; three infect sawflies

  17. Genomic Testing

    Science.gov (United States)

    ... this database. Top of Page Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) In 2004, the Centers for Disease Control and Prevention launched the EGAPP initiative to establish and test a ... and other applications of genomic technology that are in transition from ...

  18. Ancient genomes

    OpenAIRE

    Hoelzel, A Rus

    2005-01-01

    Ever since its invention, the polymerase chain reaction has been the method of choice for work with ancient DNA. In an application of modern genomic methods to material from the Pleistocene, a recent study has instead undertaken to clone and sequence a portion of the ancient genome of the cave bear.

  19. Detection of human DNA polymorphisms with a simplified denaturing gradient gel electrophoresis technique

    International Nuclear Information System (INIS)

    Noll, W.W.; Collins, M.

    1987-01-01

    Single base pair differences between otherwise identical DNA molecules can result in altered melting behavior detectable by denaturing gradient gel electrophoresis. The authors have developed a simplified procedure for using denaturing gradient gel electrophoresis to detect base pair changes in genomic DNA. Genomic DNA is digested with restriction enzymes and hybridized in solution to labeled single-stranded probe DNA. The excess probe is then hybridized to complementary phage M13 template DNA, and the reaction mixture is electrophoresed on a denaturing gradient gel. Only the genomic DNA-probe hybrids migrate into the gel. Differences in hybrid mobility on the gel indicate base pair changes in the genomic DNA. They have used this technique to identify two polymorphic sites within a 1.2-kilobase region of human chromosome 20. This approach should greatly facilitate the identification of DNA polymorphisms useful for gene linkage studies and the diagnosis of genetic diseases.

  20. Best Linear Unbiased Prediction of Genomic Breeding Values Using a Trait-Specific Marker-Derived Relationship Matrix

    NARCIS (Netherlands)

    Zhe Zhang, Z.; Liu, J.F.; Ding, Z.; Bijma, P.; Koning, de D.J.

    2010-01-01

    With the availability of high density whole-genome single nucleotide polymorphism chips, genomic selection has become a promising method to estimate genetic merit with potentially high accuracy for animal, plant and aquaculture species of economic importance. With markers covering the entire genome,

  1. Cas9-nickase-mediated genome editing corrects hereditary tyrosinemia in rats.

    Science.gov (United States)

    Shao, Yanjiao; Wang, Liren; Guo, Nana; Wang, Shengfei; Yang, Lei; Li, Yajing; Wang, Mingsong; Yin, Shuming; Han, Honghui; Zeng, Li; Zhang, Ludi; Hui, Lijian; Ding, Qiurong; Zhang, Jiqin; Geng, Hongquan; Liu, Mingyao; Li, Dali

    2018-05-04

    Hereditary tyrosinemia type I (HTI) is a metabolic genetic disorder caused by mutation of fumarylacetoacetate hydrolase (FAH). Because of the accumulation of toxic metabolites, HTI causes severe liver cirrhosis, liver failure, and even hepatocellular carcinoma. HTI is an ideal model for gene therapy, and several strategies have been shown to ameliorate HTI symptoms in animal models. Although CRISPR/Cas9-mediated genome editing is able to correct the Fah mutation in mouse models, WT Cas9 induces numerous undesired mutations that have raised safety concerns for clinical applications. To develop a new method for gene correction with high fidelity, we generated a Fah mutant rat model to investigate whether Cas9 nickase (Cas9n)-mediated genome editing can efficiently correct the Fah mutation. First, we confirmed that Cas9n rarely induces indels at either on-target or off-target sites in cell lines. Using WT Cas9 as a positive control, we delivered Cas9n and the repair donor template/single guide (sg)RNA through adenoviral vectors into HTI rats. Analyses of the initial genome editing efficiency indicated that only WT Cas9, but not Cas9n, causes indels at the on-target site in the liver tissue. After receiving either Cas9n or WT Cas9-mediated gene correction therapy, HTI rats gained weight steadily and survived. Fah-expressing hepatocytes occupied over 95% of the liver tissue 9 months after the treatment. Moreover, CRISPR/Cas9-mediated gene therapy prevented the progression of liver cirrhosis, a phenotype that could not be recapitulated in the HTI mouse model. These results strongly suggest that Cas9n-mediated genome editing is a valuable and safe gene therapy strategy for this genetic disease.

  2. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  3. Identification of polymorphic inversions from genotypes

    Directory of Open Access Journals (Sweden)

    Cáceres Alejandro

    2012-02-01

    Background: Polymorphic inversions are a source of genetic variability with a direct impact on recombination frequencies. Given the difficulty of their experimental study, computational methods have been developed to infer their existence in a large number of individuals using genome-wide data of nucleotide variation. Methods based on haplotype tagging of known inversions attempt to classify individuals as having a normal or inverted allele. Other methods that measure differences between linkage disequilibrium attempt to identify regions with inversions but are unable to classify subjects accurately, an essential requirement for association studies. Results: We present a novel method to both identify polymorphic inversions from genome-wide genotype data and classify individuals as containing a normal or inverted allele. Our method, a generalization of a published method for haplotype data [1], utilizes linkage between groups of SNPs to partition a set of individuals into normal and inverted subpopulations. We employ a sliding window scan to identify regions likely to have an inversion, and accumulation of evidence from neighboring SNPs is used to accurately determine the inversion status of each subject. Further, our approach detects inversions directly from genotype data, thus increasing its usability in current genome-wide association studies (GWAS). Conclusions: We demonstrate the accuracy of our method to detect inversions and classify individuals on principled-simulated genotypes, produced by the evolution of an inversion event within a coalescent model [2]. We applied our method to real genotype data from HapMap Phase III to characterize the inversion status of two known inversions within the regions 17q21 and 8p23 across 1184 individuals. Finally, we scan the full genomes of the European Origin (CEU) and Yoruba (YRI) HapMap samples. We find population-based evidence for 9 out of 15 well-established autosomic inversions, and for 52 regions

  4. Characterization and compilation of polymorphic simple sequence repeat (SSR) markers of peanut from public database

    Directory of Open Access Journals (Sweden)

    Zhao Yongli

    2012-07-01

    Background: There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of lack of uniformity in the labeling of these markers across the publications, there is some confusion on the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings: We compiled 1,343 SSR markers as detecting polymorphism (14.5%) within a total of 9,274 markers. Amongst all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation of the length of SSR and the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions: The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, and could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement, and thus would be of value to breeders.
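    To make the motif and repeat-number terminology concrete, the following Python sketch locates perfect dinucleotide and trinucleotide repeats in a sequence and reports the motif and repeat count of each. It is an illustration only, not the compilation procedure used for the database. The function name, the minimum of five repeats, and the test sequence are assumptions, and equivalent motifs (rotations or reverse complements such as AG, GA and CT) are not merged.

```python
import re


def find_ssrs(sequence, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect di-/trinucleotide SSRs.

    min_repeats is an assumed threshold; published SSR screens use various
    cutoffs. Homopolymer runs (e.g. AAAAAA) are skipped.
    """
    # A 2-3 base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a single repeated base is not a di-/trinucleotide SSR
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Invented sequence: an (AG)8 repeat and an (AAT)6 repeat with short flanks.
seq = "TTGC" + "AG" * 8 + "CCTA" + "AAT" * 6 + "GG"
ssrs = find_ssrs(seq)
```

    On this invented sequence the sketch reports an AG repeat with 8 copies starting at offset 4 and an AAT repeat with 6 copies starting at offset 24. Tallying motifs over many marker sequences would give motif frequencies of the kind reported in the abstract, subject to the simplifications noted above.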

  5. Small molecules enhance CRISPR genome editing in pluripotent stem cells.

    Science.gov (United States)

    Yu, Chen; Liu, Yanxia; Ma, Tianhua; Liu, Kai; Xu, Shaohua; Zhang, Yu; Liu, Honglei; La Russa, Marie; Xie, Min; Ding, Sheng; Qi, Lei S

    2015-02-05

    The bacterial CRISPR-Cas9 system has emerged as an effective tool for sequence-specific gene knockout through non-homologous end joining (NHEJ), but it remains inefficient for precise editing of genome sequences. Here we develop a reporter-based screening approach for high-throughput identification of chemical compounds that can modulate precise genome editing through homology-directed repair (HDR). Using our screening method, we have identified small molecules that can enhance CRISPR-mediated HDR efficiency, 3-fold for large fragment insertions and 9-fold for point mutations. Interestingly, we have also observed that a small molecule that inhibits HDR can enhance frame shift insertion and deletion (indel) mutations mediated by NHEJ. The identified small molecules function robustly in diverse cell types with minimal toxicity. The use of small molecules provides a simple and effective strategy to enhance precise genome engineering applications and facilitates the study of DNA repair mechanisms in mammalian cells.

  6. CpGislandEVO: A Database and Genome Browser for Comparative Evolutionary Genomics of CpG Islands

    Directory of Open Access Journals (Sweden)

    Guillermo Barturen

    2013-01-01

    Hypomethylated, CpG-rich DNA segments (CpG islands, CGIs) are epigenome markers involved in key biological processes. Aberrant methylation is implicated in the appearance of several disorders such as cancer, immunodeficiency, or centromere instability. Furthermore, methylation differences at promoter regions between human and chimpanzee strongly associate with genes involved in neurological/psychological disorders and cancers. Therefore, the evolutionary comparative analyses of CGIs can provide insights on the functional role of these epigenome markers in both health and disease. Given the lack of specific tools, we developed CpGislandEVO. Briefly, we first compile a database of statistically significant CGIs for the best assembled mammalian genome sequences available to date. Second, by means of a coupled browser front-end, we focus on the CGIs overlapping orthologous genes extracted from OrthoDB, thus ensuring the comparison between CGIs located on truly homologous genome segments. This allows comparing the main compositional features between homologous CGIs. Finally, to facilitate nucleotide comparisons, we lifted genome coordinates between assemblies from different species, which enables the analysis of sequence divergence by direct count of nucleotide substitutions and indels occurring between homologous CGIs. The resulting CpGislandEVO database, linking together CGIs and single-cytosine DNA methylation data from several mammalian species, is freely available at our website.

  7. Genome Organization Drives Chromosome Fragility.

    Science.gov (United States)

    Canela, Andres; Maman, Yaakov; Jung, Seolkyoung; Wong, Nancy; Callen, Elsa; Day, Amanda; Kieffer-Kwon, Kyong-Rim; Pekowska, Aleksandra; Zhang, Hongliang; Rao, Suhas S P; Huang, Su-Chen; Mckinnon, Peter J; Aplan, Peter D; Pommier, Yves; Aiden, Erez Lieberman; Casellas, Rafael; Nussenzweig, André

    2017-07-27

    In this study, we show that evolutionarily conserved chromosome loop anchors bound by CCCTC-binding factor (CTCF) and cohesin are vulnerable to DNA double strand breaks (DSBs) mediated by topoisomerase 2B (TOP2B). Polymorphisms in the genome that redistribute CTCF/cohesin occupancy rewire DNA cleavage sites to novel loop anchors. While transcription- and replication-coupled genomic rearrangements have been well documented, we demonstrate that DSBs formed at loop anchors are largely transcription-, replication-, and cell-type-independent. DSBs are continuously formed throughout interphase, are enriched on both sides of strong topological domain borders, and frequently occur at breakpoint clusters commonly translocated in cancer. Thus, loop anchors serve as fragile sites that generate DSBs and chromosomal rearrangements. Published by Elsevier Inc.

  8. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different...... correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome...

  9. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    International Nuclear Information System (INIS)

    Wang, Xuting; Tomso, Daniel J.; Liu Xuemei; Bell, Douglas A.

    2005-01-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; (3) prioritize the candidate polymorphic RE containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes
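
    Step (1) of the method, selecting SNPs whose flanking sequence resembles a regulatory-element consensus, can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the toy flanking sequences, and the choice of scoring (number of consensus positions matched) are hypothetical, and the p53 half-site consensus `RRRCWWGYYY` is used as an example pattern rather than as the authors' actual search model.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def match_score(site: str, consensus: str) -> int:
    """Number of positions at which `site` is compatible with `consensus`."""
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


def allele_scores(left_flank: str, alleles, right_flank: str, consensus: str) -> dict:
    """Best consensus match per allele, over all windows that cover the SNP."""
    snp_pos = len(left_flank)
    k = len(consensus)
    scores = {}
    for allele in alleles:
        seq = (left_flank + allele + right_flank).upper()
        best = 0
        first = max(0, snp_pos - k + 1)      # leftmost window still covering the SNP
        last = min(snp_pos, len(seq) - k)    # rightmost window that fits in the sequence
        for start in range(first, last + 1):
            best = max(best, match_score(seq[start:start + k], consensus))
        scores[allele] = best
    return scores


# Hypothetical SNP (C/T) inside a sequence resembling a p53 half-site.
scores = allele_scores("TTGGG", ["C", "T"], "ATGTCCAA", "RRRCWWGYYY")
print(scores)
```

    A SNP whose alleles differ in their best score, as in this toy case, would be a candidate polymorphic regulatory element; steps (2) and (3) of the abstract (genome mapping and prioritization against expression data) are outside the scope of this sketch.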

  10. Convergent functional genomics of psychiatric disorders.

    Science.gov (United States)

    Niculescu, Alexander B

    2013-10-01

    Genetic and gene expression studies, in humans and animal models of psychiatric and other medical disorders, are becoming increasingly integrated. Particularly for genomics, the convergence and integration of data across species, experimental modalities and technical platforms is providing a fit-to-disease way of extracting reproducible and biologically important signal, in contrast to the fit-to-cohort effect and limited reproducibility of human genetic analyses alone. With the advent of whole-genome sequencing and the realization that a major portion of the non-coding genome may contain regulatory variants, Convergent Functional Genomics (CFG) approaches are going to be essential to identify disease-relevant signal from the tremendous polymorphic variation present in the general population. Such work in psychiatry can provide an example of how to address other genetically complex disorders, and in turn will benefit by incorporating concepts from other areas, such as cancer, cardiovascular diseases, and diabetes. © 2013 Wiley Periodicals, Inc.

  11. GAPIT: genome association and prediction integrated tool.

    Science.gov (United States)

    Lipka, Alexander E; Tian, Feng; Wang, Qishan; Peiffer, Jason; Li, Meng; Bradbury, Peter J; Gore, Michael A; Buckler, Edward S; Zhang, Zhiwu

    2012-09-15

    Software programs that conduct genome-wide association studies and genomic prediction and selection need to use methodologies that maximize statistical power, provide high prediction accuracy and run in a computationally efficient manner. We developed an R package called Genome Association and Prediction Integrated Tool (GAPIT) that implements advanced statistical methods including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The GAPIT package can handle large datasets in excess of 10 000 individuals and 1 million single-nucleotide polymorphisms with minimal computational time, while providing user-friendly access and concise tables and graphs to interpret results. Availability: http://www.maizegenetics.net/GAPIT. Contact: zhiwu.zhang@cornell.edu. Supplementary data are available at Bioinformatics online.

  12. MTHFR Glu429Ala and ERCC5 His46His polymorphisms are associated with prognosis in colorectal cancer patients: analysis of two independent cohorts from Newfoundland.

    Directory of Open Access Journals (Sweden)

    Amit A Negandhi

    In this study, 27 genetic polymorphisms that were previously reported to be associated with clinical outcomes in colorectal cancer patients were investigated in relation to overall survival (OS) and disease free survival (DFS) in colorectal cancer patients from Newfoundland. The discovery and validation cohorts comprised 532 and 252 patients, respectively. Genotypes of 27 polymorphisms were first obtained in the discovery cohort and survival analyses were performed assuming the co-dominant genetic model. Polymorphisms associated with disease outcomes in the discovery cohort were then investigated in the validation cohort. When adjusted for sex, age, tumor stage and microsatellite instability (MSI) status, four polymorphisms were independent predictors of OS in the discovery cohort: MTHFR Glu429Ala (HR: 1.72, 95%CI: 1.04-2.84, p = 0.036), ERCC5 His46His (HR: 1.78, 95%CI: 1.15-2.76, p = 0.01), SERPINE1 -675indelG (HR: 0.52, 95%CI: 0.32-0.84, p = 0.008), and the homozygous deletion of the GSTM1 gene (HR: 1.4, 95%CI: 1.03-1.92, p = 0.033). In the validation cohort, the MTHFR Glu429Ala polymorphism was associated with shorter OS (HR: 1.71, 95%CI: 1.18-2.49, p = 0.005), although with a different genotype than in the discovery cohort (CC genotype in the discovery cohort and AC genotype in the validation cohort). When stratified based on treatment with 5-Fluorouracil (5-FU)-based regimens, this polymorphism was associated with reduced OS only in patients not treated with 5-FU. In the DFS analysis, when adjusted for other variables, the TT genotype of the ERCC5 His46His polymorphism was associated with shorter DFS in both cohorts (discovery cohort: HR: 1.54, 95%CI: 1.04-2.29, p = 0.032 and replication cohort: HR: 1.81, 95%CI: 1.11-2.94, p = 0.018). In this study, associations of the MTHFR Glu429Ala polymorphism with OS and the ERCC5 His46His polymorphism with DFS were identified in two colorectal cancer patient cohorts. Our results also suggest

  13. Polymorphic Evolutionary Games.

    Science.gov (United States)

    Fishman, Michael A

    2016-06-07

    In this paper, I present an analytical framework for polymorphic evolutionary games suitable for explicitly modeling evolutionary processes in diploid populations with sexual reproduction. The principal aspect of the proposed approach is adding diploid genetics cum sexual recombination to a traditional evolutionary game, and switching from phenotypes to haplotypes as the new game's pure strategies. Here, the relevant pure strategy's payoffs are derived by summing the payoffs of all the phenotypes capable of producing gametes containing that particular haplotype, weighted by the pertinent probabilities. The resulting game is structurally identical to the familiar Evolutionary Games with non-linear pure strategy payoffs (Hofbauer and Sigmund, 1998, Cambridge University Press), and can be analyzed in terms of an established analytical framework for such games. These results can then be translated into the terms of genotypic, and hence phenotypic, evolutionary stability pertinent to the original game. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. ALIS-FLP: Amplified ligation selected fragment-length polymorphism method for microbial genotyping

    DEFF Research Database (Denmark)

    Brillowska-Dabrowska, A.; Wianecka, M.; Dabrowski, Slawomir

    2008-01-01

    A DNA fingerprinting method known as ALIS-FLP (amplified ligation selected fragment-length polymorphism) has been developed for selective and specific amplification of restriction fragments from TspRI restriction endonuclease digested genomic DNA. The method is similar to AFLP, but differs...

  15. Direct detection of single-nucleotide polymorphisms in bacterial DNA by SNPtrap

    DEFF Research Database (Denmark)

    Grønlund, Hugo Ahlm; Moen, Birgitte; Hoorfar, Jeffrey

    2011-01-01

    A major challenge with single-nucleotide polymorphism (SNP) fingerprinting of bacteria and higher organisms is the combination of genome-wide screenings with the potential of multiplexing and accurate SNP detection. Single-nucleotide extension by the minisequencing principle represents a technolo...

  16. Effects of bovine prolactin gene polymorphism within exon 4 on milk ...

    African Journals Online (AJOL)

    In this study, polymorphism of prolactin gene was analyzed as a candidate gene responsible for variation and genetic trends in milk yield and composition traits. Genomic DNAs were extracted from 268 semen samples belonging to Iranian Holstein bulls. Genotyping for the prolactin gene using PCR-RFLP technique and RsaI ...

  17. NFE2L2 pathway polymorphisms and lung function decline in chronic obstructive pulmonary disease

    NARCIS (Netherlands)

    Sandford, Andrew J.; Malhotra, Deepti; Boezen, H. Marike; Siedlinski, Mateusz; Postma, Dirkje S.; Wong, Vivien; Akhabir, Loubna; He, Jian-Qing; Connett, John E.; Anthonisen, Nicholas R.; Pare, Peter D.; Biswal, Shyam

    2012-01-01

    Sandford AJ, Malhotra D, Boezen HM, Siedlinski M, Postma DS, Wong V, Akhabir L, He JQ, Connett JE, Anthonisen NR, Pare PD, Biswal S. NFE2L2 pathway polymorphisms and lung function decline in chronic obstructive pulmonary disease. Physiol Genomics 44: 754-763, 2012. First published June 12, 2012;

  18. Lupus-related single nucleotide polymorphisms and risk of diffuse large B-cell lymphoma

    NARCIS (Netherlands)

    Bernatsky, Sasha; Velásquez García, Héctor A; Spinelli, John; Gaffney, Patrick; Smedby, Karin E; Ramsey-Goldman, Rosalind; Wang, Sophia S.; Adami, Hans-Olov; Albanes, Demetrius; Angelucci, Emanuele; Ansell, Stephen M.; Asmann, Yan W.; Becker, Nikolaus; Benavente, Yolanda; Berndt, Sonja I.; Bertrand, Kimberly A.; Birmann, Brenda M.; Boeing, Heiner; Boffetta, Paolo; Bracci, Paige M.; Brennan, Paul; Brooks-Wilson, Angela R.; Cerhan, James R.; Chanock, Stephen J.; Clavel, Jacqueline; Conde, Lucia; Cotenbader, Karen H; Cox, David G; Cozen, Wendy; Crouch, Simon; De Roos, Anneclaire J.; De Sanjose, Silvia; Di Lollo, Simonetta; Diver, W. Ryan; Dogan, Ahmet; Foretova, Lenka; Ghesquières, Hervé; Giles, Graham G.; Glimelius, Bengt; Habermann, Thomas M.; Haioun, Corinne; Hartge, Patricia; Hjalgrim, Henrik; Holford, Theodore R.; Holly, Elizabeth A.; Jackson, Rebecca D.; Kaaks, Rudolph; Kane, Eleanor; Kelly, Rachel S.; Klein, Robert J.; Kraft, Peter; Kricker, Anne; Lan, Qing; Lawrence, Charles; Liebow, Mark; Lightfoot, Tracy; Link, Brian K.; Maynadie, Marc; McKay, James; Melbye, Mads; Molina, Thierry Jo; Monnereau, Alain; Morton, Lindsay M.; Nieters, Alexandra; North, Kari E.; Novak, Anne J.; Offit, Kenneth; Purdue, Mark P.; Rais, Marco; Riby, Jacques; Roman, Eve; Rothman, Nathaniel; Salles, Gilles; Severi, Gianluca; Severson, Richard K.; Skibola, Christine F.; Slager, Susan L.; Smith, Alex; Smith, Martyn T.; Southey, Melissa C.; Staines, Anthony; Teras, Lauren R.; Thompson, Carrie A.; Tilly, Hervé; Tinker, Lesley F.; Tjonneland, Anne; Turner, Jenny; Vajdic, Claire M.; Vermeulen, Roel C H; Vijai, Joseph; Vineis, Paolo; Virtamo, Jarmo; Wang, Zhaoming; Weinstein, Stephanie; Witzig, Thomas E.; Zelenetz, Andrew; Zeleniuch-Jacquotte, Anne; Zhang, Yawei; Zheng, Tongzhang; Zucca, Mariagrazia; Clarke, Ann E

    2017-01-01

    Objective: Determinants of the increased risk of diffuse large B-cell lymphoma (DLBCL) in SLE are unclear. Using data from a recent lymphoma genome-wide association study (GWAS), we assessed whether certain lupus-related single nucleotide polymorphisms (SNPs) were also associated with DLBCL.

  19. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    Science.gov (United States)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult rice leaves after seed space flight were detected by methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphisms of both DNA sequence and cytosine methylation were detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9% and 8%, respectively. In total, 27 and 22 polymorphic fragments were cloned and sequenced from MSAP and AFLP analysis, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequences. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  20. Insertional Polymorphisms of Endogenous Feline Leukemia Viruses

    Science.gov (United States)

    Roca, Alfred L.; Nash, William G.; Menninger, Joan C.; Murphy, William J.; O'Brien, Stephen J.

    2005-01-01

    The number, chromosomal distribution, and insertional polymorphisms of endogenous feline leukemia viruses (enFeLVs) were determined in four domestic cats (Burmese, Egyptian Mau, Persian, and nonbreed) using fluorescent in situ hybridization and radiation hybrid mapping. Twenty-nine distinct enFeLV loci were detected across 12 of the 18 autosomes. Each cat carried enFeLV at only 9 to 16 of the loci, and many loci were heterozygous for presence of the provirus. Thus, an average of 19 autosomal copies of enFeLV were present per cat diploid genome. Only five of the autosomal enFeLV sites were present in all four cats, and at only one autosomal locus, B4q15, was enFeLV present in both homologues of all four cats. A single enFeLV occurred in the X chromosome of the Burmese cat, while three to five enFeLV proviruses occurred in each Y chromosome. The X chromosome and nine autosomal enFeLV loci were telomeric, suggesting that ectopic recombination between nonhomologous subtelomeres may contribute to enFeLV distribution. Since endogenous FeLVs may affect the infectiousness or pathogenicity of exogenous FeLVs, genomic variation in enFeLVs represents a candidate for genetic influences on FeLV leukemogenesis in cats. PMID:15767400

  1. Association between Single Nucleotide Polymorphisms in Vitamin D Receptor Gene Polymorphisms and Permanent Tooth Caries Susceptibility to Permanent Tooth Caries in Chinese Adolescent

    Directory of Open Access Journals (Sweden)

    Miao Yu

    2017-01-01

    Purpose. Dental caries is a multifactorial infectious disease. In this study, we investigated whether single nucleotide polymorphisms (SNPs) in the vitamin D receptor (VDR) gene were associated with susceptibility to permanent tooth caries in Chinese adolescents. Method. A total of 200 dental caries patients and 200 healthy controls aged 12 years were genotyped for VDR gene polymorphisms using the PCR-restriction fragment length polymorphism (PCR-RFLP) assay. All of them were examined for their oral and dental status with the WHO criteria, and clinical information such as the Decayed Missing Filled Teeth Index (DMFT) was evaluated. Genomic DNA was extracted from the buccal epithelial cells. The four polymorphic SNPs (Bsm I, Taq I, Apa I, and Fok I) in VDR were assessed for both genotypic and phenotypic susceptibilities. Results. Among the four examined VDR gene polymorphisms, the increased frequency of the CT and CC genotypes of the Fok I VDR gene polymorphism was associated with dental caries in 12-year-old adolescents, compared with the controls (χ² = 17.813, p≤0.001). Moreover, Fok I polymorphic allele C frequency was significantly increased in the dental caries cases, compared to the controls (χ² = 14.144, p≤0.001, OR = 1.730, 95% CI = 1.299–2.303). However, the other three VDR gene polymorphisms (Bsm I, Taq I, and Apa I) showed no statistically significant differences in the caries groups compared with the controls. Conclusion. VDR-Fok I gene polymorphisms may be associated with susceptibility to permanent tooth caries in Chinese adolescents.
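
    An allele-level odds ratio with a 95% confidence interval, of the kind reported above for the Fok I C allele, is commonly computed from a 2x2 table with the log (Woolf) method. The sketch below shows that calculation only; the abstract does not give allele counts, so the counts in the example are hypothetical placeholders and are not expected to reproduce the published OR of 1.730.

```python
import math


def odds_ratio_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of ln(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (NOT from the study): risk allele vs other allele.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

    An interval that excludes 1, as the published 1.299–2.303 does, indicates that the allele frequency difference between cases and controls is significant at the 5% level.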

  2. Interleukin-1beta gene polymorphisms in Taiwanese patients with gout.

    Science.gov (United States)

    Chen, Man-Ling; Huang, Chung-Ming; Tsai, Chang-Hai; Tsai, Fuu-Jen

    2005-04-01

    The purpose of this study was to examine whether interleukin-1 beta (IL-1beta) promoter and exon 5 gene polymorphisms are markers of susceptibility or clinical manifestations in Taiwanese patients with gout. The study included 196 patients in addition to 103 unrelated healthy control subjects living in central Taiwan. From genomic DNA, polymorphisms of the gene for IL-1beta promoter and IL-1beta exon 5 were typed. Allelic frequencies were compared between the two groups, and the relationship between allelic frequencies and clinical manifestations of gout was evaluated. No significant differences were observed in the allelic frequencies of the IL-1beta promoter between patients with gout and healthy control subjects. Additionally, we did not detect any association of the IL-1beta promoter genotype with the clinical and laboratory profiles of gout patients. However, there was a significant difference between the two groups in terms of hypertriglyceridemia (P=0.0004, chi(2)=12.52, OR 7.14, 95%CI 0.012-0.22). There was also a significant difference in the genotype of IL-1beta exon 5 polymorphism between patients with and without hypertriglyceridemia. Results of the present study suggest that polymorphisms of the IL-1beta promoter and IL-1beta exon 5 are not related to gout patients in central Taiwan.

  3. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  4. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Human endogenous retrovirus (HERV) sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them, focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphism test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into the human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  5. Single nucleotide polymorphism in genome-wide association of ...

    African Journals Online (AJOL)

    Mohd Fareed

    2012-09-25

    Sep 25, 2012 ... Codeine, Tramadol, Acetaminophen. CYP2C9. Celecoxib .... Pharmacogenetics of acute azathioprine toxicity: relationship to thiopurine ... Martinez C, Cueto R, Garcia-Martin E. Pharmacogenomics in drug induced liver.

  6. Genome polymorphism markers and stress genes expression for ...

    African Journals Online (AJOL)

    SAM

    2014-06-11

    Jun 11, 2014 ... RNA extraction and purification for SOD and PAL gene expression. Fresh leaf tissues (100 mg), from ... Data analysis. Gelquant program for quantification of protein, DNA and RNA gel. (version 1.8.2) was used for .... by reprogramming the expression of endogenous genes. Higher level of these antioxidant ...

  7. Estimated allele substitution effects underlying genomic evaluation models depend on the scaling of allele counts

    NARCIS (Netherlands)

    Bouwman, Aniek C.; Hayes, Ben J.; Calus, Mario P.L.

    2017-01-01

    Background: Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of

  8. Right-hand-side updating for fast computing of genomic breeding values

    NARCIS (Netherlands)

    Calus, M.P.L.

    2014-01-01

    Since both the number of SNPs (single nucleotide polymorphisms) used in genomic prediction and the number of individuals used in training datasets are rapidly increasing, there is an increasing need to improve the efficiency of genomic prediction models in terms of computing time and memory (RAM)

  9. Genome analysis and DNA marker-based characterisation of pathogenic trypanosomes

    NARCIS (Netherlands)

    Agbo, Edwin Chukwura

    2003-01-01

    The advances in genomics technologies and genome analysis methods that offer new leads for accelerating discovery of putative targets for developing overall control tools are reviewed in Chapter 1. In Chapter 2, a PCR typing method based on restriction fragment length polymorphism analysis of the

  10. Genomic and bioinformatics analyses of HAdV-4vac and HAdV-7vac, two human adenovirus (HAdV) strains that constituted original prophylaxis against HAdV-related acute respiratory disease, a reemerging epidemic disease.

    Science.gov (United States)

    Purkayastha, Anjan; Su, Jing; McGraw, John; Ditty, Susan E; Hadfield, Ted L; Seto, Jason; Russell, Kevin L; Tibbetts, Clark; Seto, Donald

    2005-07-01

    Vaccine strains of human adenovirus serotypes 4 and 7 (HAdV-4vac and HAdV-7vac) have been used successfully to prevent adenovirus-related acute respiratory disease outbreaks. The genomes of these two vaccine strains have been sequenced, annotated, and compared with their prototype equivalents with the goals of understanding their genomes for molecular diagnostics applications, vaccine redevelopment, and HAdV pathoepidemiology. These reference genomes are archived in GenBank as HAdV-4vac (35,994 bp; AY594254) and HAdV-7vac (35,240 bp; AY594256). Bioinformatics and comparative whole-genome analyses with their recently reported and archived prototype genomes reveal six mismatches and four insertions-deletions (indels) between the HAdV-4 prototype and vaccine strains, in contrast to the 611 mismatches and 130 indels between the HAdV-7 prototype and vaccine strains. Annotation reveals that the HAdV-4vac and HAdV-7vac genomes contain 51 and 50 coding units, respectively. Neither vaccine strain appears to be attenuated for virulence based on bioinformatics analyses. There is evidence of genome recombination, as the inverted terminal repeat of HAdV-4vac is initially identical to that of species C whereas the prototype is identical to species B1. These vaccine reference sequences yield unique genome signatures for molecular diagnostics. As a molecular forensics application, these references identify the circulating and problematic 1950s era field strains as the original HAdV-4 prototype and the Greider prototype, from which the vaccines are derived. Thus, they are useful for genomic comparisons to current epidemic and reemerging field strains, as well as leading to an understanding of pathoepidemiology among the human adenoviruses.

  11. Serotonin transporter (SERT) gene polymorphism in Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Mahmut Özkaya

    2004-06-01

    Background: Parkinson disease (PD) is the second most common neurodegenerative disorder, with a prevalence of about 2% in persons older than 65 years of age. The neurodegenerative process in PD is not restricted to the dopaminergic neurons of the substantia nigra but also affects serotoninergic neurons. It has been shown that PD brains with Lewy bodies in the substantia nigra also had Lewy bodies in the raphe nuclei. The re-uptake of 5HT released into the synaptic cleft is mediated by the 5HT transporter (SERT). The SERT gene has been mapped to chromosome 17q11.1-q12 and has two main polymorphisms: the intron 2 VNTR polymorphism and the promoter region 44 bp insertion/deletion polymorphism. Objective: In this study we investigated whether two polymorphic regions in the serotonin transporter gene are associated with PD. Material and Method: After obtaining informed consent, blood samples were collected from 76 patients and 54 healthy volunteers. Genomic DNA was extracted from peripheral leucocytes using standard methods. The SERT gene genotypes were determined using the polymerase chain reaction (PCR) method. Results: Based on the intron 2 VNTR polymorphism of the SERT gene, the distribution of the 12/12, 12/10 and 10/10 genotypes was 56.6%, 35.5% and 7.9% in patients, whereas the distribution in the control group was 40.7%, 46.3% and 13%, respectively. According to the 5-HTTLPR polymorphism, the distribution of the L/L, L/S and S/S genotypes was 27.6%, 51.3% and 21.1% in patients, whereas the distribution in the control group was 33.4%, 50.0% and 16.6%, respectively. Although the genotype distributions of the SERT gene polymorphisms in the patient and control groups appeared to differ, this difference was not statistically significant. Conclusion: This finding suggests that polymorphisms within the SERT gene do not play a major role in PD susceptibility in the Turkish population.
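
    The comparison of genotype distributions between patients and controls can be illustrated with a Pearson chi-square test on a 2x3 contingency table. The abstract does not state which test the authors used, so this is a sketch under assumptions: the genotype counts below are back-calculated from the reported intron 2 VNTR percentages and group sizes (76 patients, 54 controls), and the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and genotype.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Intron 2 VNTR genotype counts (12/12, 12/10, 10/10), reconstructed from the
# reported percentages; these are an assumption, not published counts.
vntr_counts = [
    [43, 27, 6],   # patients (n = 76)
    [22, 25, 7],   # controls (n = 54)
]

statistic, dof = chi_square_statistic(vntr_counts)
CRITICAL_VALUE_DF2 = 5.991  # chi-square critical value for df = 2, alpha = 0.05
print(f"chi-square = {statistic:.2f}, df = {dof}")
print("significant at 0.05" if statistic > CRITICAL_VALUE_DF2 else "not significant at 0.05")
```

    With these reconstructed counts the statistic stays below the critical value, which agrees with the abstract's statement that the difference was not statistically significant.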

  12. Polymorphisms of interleukin-1β and MUC7 genes in burning mouth syndrome.

    Science.gov (United States)

    Kim, Moon-Jong; Kim, Jihoon; Chang, Ji-Youn; Kim, Yoon-Young; Kho, Hong-Seop

    2017-04-01

    The objectives of the present study are to compare polymorphisms of the IL-1β and MUC7 genes between patients with burning mouth syndrome (BMS) and controls and to investigate relationships between these polymorphisms and clinical characteristics in BMS patients. Forty female BMS patients and 40 gender- and age-matched controls were included. Genomic DNA was extracted from saliva samples. Single-nucleotide polymorphisms of IL-1β -511 and +3954 and variation in number of tandem repeat (VNTR) polymorphism of MUC7 were analyzed. Relationships between genotypic polymorphism data and clinical characteristics in BMS patients were also analyzed. There were no significant differences in the genotypes of IL-1β -511 and +3954 and of MUC7 between the groups. There were no significant differences in symptom duration and intensity of BMS patients according to their IL-1β and MUC7 genotypes. The T allele of IL-1β -511 showed associations with psychometry results in BMS patients: paranoid ideation (P = 0.014), Global Severity Index (P = 0.025), and Positive Symptom Total (P = 0.008). The genotypic polymorphisms of IL-1β -511 and +3954, and of MUC7 VNTR, had no direct associations with the development of BMS. However, the T allele of IL-1β -511 may increase the risk of BMS by increasing psychological asthenia. The genotypic polymorphisms of IL-1β -511 may increase the risk for the development of BMS by increasing psychological asthenia.

  13. Significant association of interleukin-4 gene intron 3 VNTR polymorphism with susceptibility to knee osteoarthritis.

    Science.gov (United States)

    Yigit, Serbulent; Inanir, Ahmet; Tekcan, Akın; Tural, Ercan; Ozturk, Gokhan Tuna; Kismali, Gorkem; Karakus, Nevin

    2014-03-01

    Interleukin-4 (IL-4) is a strong chondroprotective cytokine, and polymorphisms within this gene may be a risk factor for osteoarthritis (OA). We aimed to investigate genotype and allele frequencies of the IL-4 gene intron 3 variable number of tandem repeats (VNTR) polymorphism in patients with knee OA in a Turkish population. The study included 202 patients with knee OA and 180 healthy controls. Genomic DNA was isolated and the IL-4 gene 70 bp VNTR polymorphism was determined by polymerase chain reaction (PCR) with specific primers followed by restriction fragment length polymorphism (RFLP) analysis. Our results show a statistically significant difference between knee OA patients and the control group with respect to IL-4 genotype distribution and allele frequencies (p=0.000, OR: 0.20, 95% CI: 0.10-0.41, OR: 0.22, 95% CI: 0.12-0.42, respectively). Our findings suggest that the IL-4 gene intron 3 VNTR polymorphism is associated with susceptibility to knee OA. As a result, the IL-4 gene intron 3 VNTR polymorphism could be a genetic marker for OA in a Turkish study population. This is the first association study to evaluate the association between the IL-4 gene VNTR polymorphism and knee OA.

  14. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria, Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  15. Draft genome sequence of Dethiosulfovibrio salsuginis DSM 21565T an anaerobic, slightly halophilic bacterium isolated from a Colombian saline spring.

    Science.gov (United States)

    Díaz-Cárdenas, Carolina; López, Gina; Alzate-Ocampo, José David; González, Laura N; Shapiro, Nicole; Woyke, Tanja; Kyrpides, Nikos C; Restrepo, Silvia; Baena, Sandra

    2017-01-01

    A bacterium belonging to the phylum Synergistetes, genus Dethiosulfovibrio, was isolated in 2007 from a saline spring in Colombia. Dethiosulfovibrio salsuginis USBA 82T (DSM 21565T = KCTC 5659T) is a mesophilic, strictly anaerobic, slightly halophilic, Gram-negative bacterium with a diderm cell envelope. The strain ferments peptides, amino acids and a few organic acids. Here we present the description of the complete genome sequencing and annotation of the type species Dethiosulfovibrio salsuginis USBA 82T. The genome consisted of 2.68 Mbp with a G + C content of 53.7%. A total of 2609 genes were predicted; of those, 2543 were protein-coding genes and 66 were RNA genes. In the USBA 82T genome we detected six Synergistetes conserved signature indels (CSIs) specific for Jonquetella, Pyramidobacter and Dethiosulfovibrio. The genome of D. salsuginis contained, as expected, genes related to amino acid transport, amino acid metabolism and thiosulfate reduction. These genes represent the major gene groups of Synergistetes and are related to their phenotypic traits; interestingly, 11.8% of the genes in the genome belonged to the amino acid fermentation COG category. In addition, we identified some ammonification genes in the genome, such as nitrate reductase genes. The presence of proline operon genes could be related to de novo synthesis of proline to protect the cell in response to high osmolarity. Our bioinformatics workflow included antiSMASH and BAGEL3, which allowed us to identify bacteriocin genes in the genome.

  16. A monoclinic polymorph of theophylline

    Directory of Open Access Journals (Sweden)

    Shuo Zhang

    2011-12-01

    A monoclinic polymorph of theophylline, C7H8N4O2, has been obtained from a chloroform/methanol mixture by evaporation under ambient conditions. The new polymorph crystallizes with two mo