WorldWideScience

Sample records for genomic indel polymorphisms

  1. Indel Group in Genomes (IGG) Molecular Genetic Markers

    Science.gov (United States)

    Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

    2016-01-01

    Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831
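    The k-mer idea in the abstract above can be illustrated with a small sketch. This is not the IGG pipeline's algorithm, only a toy version under stated assumptions: genomes are plain strings, anchors are k-mers that occur exactly once in each genome, and a candidate region is reported when two adjacent anchors lie at different distances in the two genomes. The function names and the parameters `k`, `min_diff`, and `max_span` are hypothetical, and no primer design is attempted.

```python
from collections import Counter

def unique_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {seq[i:i + k]: i for i in range(len(seq) - k + 1)
            if counts[seq[i:i + k]] == 1}

def candidate_indel_regions(genome_a, genome_b, k=8, min_diff=3, max_span=500):
    """Return (left_anchor, right_anchor, span_a, span_b) tuples where two
    adjacent shared unique k-mers are separated by different distances."""
    pos_a = unique_kmers(genome_a, k)
    pos_b = unique_kmers(genome_b, k)
    # Anchors: k-mers unique in both genomes, ordered by position in genome A.
    shared = sorted((pos_a[m], pos_b[m], m) for m in pos_a if m in pos_b)
    hits = []
    for (a1, b1, m1), (a2, b2, m2) in zip(shared, shared[1:]):
        span_a, span_b = a2 - a1, b2 - b1
        # Require collinear anchors and a bounded, amplicon-like span.
        if span_b <= 0 or max(span_a, span_b) > max_span:
            continue
        if abs(span_a - span_b) >= min_diff:
            hits.append((m1, m2, span_a, span_b))
    return hits
```

    Here `max_span` and `min_diff` loosely play the role of the user-supplied amplicon size and size difference mentioned in the abstract; a real tool would also need to handle reverse complements, repeats, and primer suitability.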

  2. Development of INDEL Markers for Genetic Mapping Based on Whole Genome Resequencing in Soybean.

    Science.gov (United States)

    Song, Xiaofeng; Wei, Haichao; Cheng, Wen; Yang, Suxin; Zhao, Yanxiu; Li, Xuan; Luo, Da; Zhang, Hui; Feng, Xianzhong

    2015-10-19

    Soybean [Glycine max (L.) Merrill] is an important crop worldwide. In this study, a Chinese local soybean cultivar, Hedou 12, was resequenced by next generation sequencing technology to develop INsertion/DELetion (INDEL) markers for genetic mapping. 49,276 INDEL polymorphisms and 242,059 single nucleotide polymorphisms were detected between Hedou 12 and the Williams 82 reference sequence. Of these, 243 candidate INDEL markers ranging from 5-50 bp in length were chosen for validation, and 165 (68%) of them revealed polymorphisms between Hedou 12 and Williams 82. The validated INDEL markers were also tested in 12 other soybean cultivars. The number of polymorphisms in the pairwise comparisons of 14 soybean cultivars varied from 27 to 165. To test the utility of these INDEL markers, they were used to perform genetic mapping of a crinkly leaf mutant, and the CRINKLY LEAF locus was successfully mapped to a 360 kb region on chromosome 7. This research shows that high-throughput sequencing technologies can facilitate the development of genome-wide molecular markers for genetic mapping in soybean.

  3. SSRs and INDELs mined from the sunflower EST database: abundance, polymorphisms, and cross-taxa utility.

    Science.gov (United States)

    Heesacker, Adam; Kishore, Venkata K; Gao, Wenxiang; Tang, Shunxue; Kolkman, Judith M; Gingle, Alan; Matvienko, Marta; Kozik, Alexander; Michelmore, Richard M; Lai, Zhao; Rieseberg, Loren H; Knapp, Steven J

    2008-11-01

    Simple sequence repeats (SSRs) are abundant and frequently highly polymorphic in transcribed sequences and widely targeted for marker development in eukaryotes. Sunflower (Helianthus annuus) transcript assemblies were built and mined to identify SSRs and insertions-deletions (INDELs) for marker development, comparative mapping, and other genomics applications in sunflower. We describe the spectrum and frequency of SSRs identified in the sunflower EST database, a catalog of 16,643 EST-SSRs, a collection of 484 EST-SSR and 43 EST-INDEL markers developed from common sunflower ESTs, polymorphisms of the markers among the parents of several intraspecific and interspecific mapping populations, and the transferability of the markers to closely and distantly related species in the Compositae. Of 17,904 unigenes in the transcript assembly, 1,956 (10.9%) harbored one or more SSRs with repeat counts of n ≥ 5. EST-SSR markers were 1.6-fold more polymorphic among exotic than elite genotypes and 0.7-fold less polymorphic than non-genic SSR markers. Of 466 EST-SSR or INDEL markers screened for cross-species amplification and polymorphisms, 413 (88.6%) amplified alleles from one or more wild species (H. argophyllus, H. tuberosus, H. anomalus, H. paradoxus, and H. deserticola), whereas 69 (14.8%) amplified alleles from safflower (Carthamus tinctorius) and 67 (14.4%) amplified alleles from lettuce (Lactuca sativa); hence, only a fraction were transferable to distantly related genera in the Compositae, whereas most were transferable to wild relatives of H. annuus. Several thousand additional SSRs were identified in the EST database and supply a wealth of templates for EST-SSR marker development in sunflower.

  4. Indels, structural variation, and recombination drive genomic diversity in Plasmodium falciparum.

    Science.gov (United States)

    Miles, Alistair; Iqbal, Zamin; Vauterin, Paul; Pearson, Richard; Campino, Susana; Theron, Michel; Gould, Kelda; Mead, Daniel; Drury, Eleanor; O'Brien, John; Ruano Rubio, Valentin; MacInnis, Bronwyn; Mwangi, Jonathan; Samarakoon, Upeka; Ranford-Cartwright, Lisa; Ferdig, Michael; Hayton, Karen; Su, Xin-Zhuan; Wellems, Thomas; Rayner, Julian; McVean, Gil; Kwiatkowski, Dominic

    2016-09-01

    The malaria parasite Plasmodium falciparum has a great capacity for evolutionary adaptation to evade host immunity and develop drug resistance. Current understanding of parasite evolution is impeded by the fact that a large fraction of the genome is either highly repetitive or highly variable and thus difficult to analyze using short-read sequencing technologies. Here, we describe a resource of deep sequencing data on parents and progeny from genetic crosses, which has enabled us to perform the first genome-wide, integrated analysis of SNP, indel and complex polymorphisms, using Mendelian error rates as an indicator of genotypic accuracy. These data reveal that indels are exceptionally abundant, being more common than SNPs and thus the dominant mode of polymorphism within the core genome. We use the high density of SNP and indel markers to analyze patterns of meiotic recombination, confirming a high rate of crossover events and providing the first estimates for the rate of non-crossover events and the length of conversion tracts. We observe several instances of meiotic recombination within copy number variants associated with drug resistance, demonstrating a mechanism whereby fitness costs associated with resistance mutations could be compensated and greater phenotypic plasticity could be acquired.

  5. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Background: Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of next-generation sequencing technologies, high-throughput surveys of SVs at the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results: We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then, to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions: Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole

  6. Genome-wide insertion–deletion (InDel) marker discovery and genotyping for genomics-assisted breeding applications in chickpea

    Science.gov (United States)

    Das, Shouvik; Upadhyaya, Hari D.; Srivastava, Rishi; Bajaj, Deepak; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We developed 21,499 genome-wide insertion–deletion (InDel) markers (2- to 54-bp in silico fragment length polymorphism) by comparing the genomic sequences of four (desi, kabuli and wild C. reticulatum) chickpea [Cicer arietinum (L.)] accessions. InDel markers showing 2- to 6-bp fragment length polymorphism among accessions were abundant (76.8%) in the chickpea genome. The physically mapped 7,643 and 13,856 markers on eight chromosomes and unanchored scaffolds, respectively, were structurally and functionally annotated. The 4,506 coding (23% large-effect frameshift mutations) and regulatory InDel markers were identified from 3,228 genes (representing 11.7% of total 27,571 desi genes), suggesting their functional relevance for trait association/genetic mapping. High amplification (97%) and intra-specific polymorphic (60–83%) potential and wider genetic diversity (15–89%) were detected by genome-wide 6,254 InDel markers among desi, kabuli and wild accessions using even a simpler cost-effective agarose gel-based assay. This signifies added advantages of this user-friendly genetic marker system for manifold large-scale genotyping applications in laboratories with limited infrastructure and resources. Utilizing 6,254 InDel markers-based high-density (inter-marker distance: 0.212 cM) inter-specific genetic linkage map (ICC 4958 × ICC 17160) of chickpea as a reference, three major genomic regions harboring six flowering and maturity time robust QTLs (16.4–27.5% phenotypic variation explained, 8.1–11.5 logarithm of odds) were identified. Integration of genetic and physical maps at these target QTL intervals mapped on three chromosomes delineated five InDel markers-containing candidate genes tightly linked to the QTLs governing flowering and maturity time in chickpea. Taken together, our study demonstrated the practical utility of developing and high-throughput genotyping of such beneficial InDel markers at a genome-wide scale to expedite genomics

  7. Genetic polymorphism of 30 autosomal InDel loci in Chinese Uygur population residing in Xinjiang

    Directory of Open Access Journals (Sweden)

    Ru-feng BAI

    2014-10-01

    Objective: To investigate the genetic data of 30 insertion/deletion polymorphism (InDel) loci included in the Investigator® DIPplex kit in the Uygur population from Xinjiang, and to evaluate its application in forensic medicine. Methods: Allele frequencies and population genetics parameters of the 30 InDels were determined in 223 unrelated Uygur individuals with Investigator® DIPplex, and they were statistically analyzed and compared with available data from other populations of different races and regions. Results: After Bonferroni's correction, there was no significant departure from Hardy-Weinberg equilibrium and no significant linkage disequilibrium between the loci. The average heterozygosity (Ho) was 0.4686, the mean discrimination power (DP) was 0.6095, and the total probability of discrimination power (TDP) reached 0.999999999995. The cumulative probability of exclusion was 0.995478 in trio cases (CPEtrio) and 0.972007 in duo cases (CPEduo). The genetic distance between Uygur and Kazakh was smaller than those between Uygur and other populations, such as African American. Conclusion: Multiplex detection of the 30 InDel loci revealed a moderately high polymorphic genetic distribution in the Chinese Uygur population residing in Xinjiang, demonstrating that the Investigator® DIPplex kit can be used as a supplementary tool for human identity tests, especially in challenging DNA cases. DOI: 10.11855/j.issn.0577-7402.2014.10.10
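    As a rough illustration of how per-locus and combined discrimination power relate, the sketch below uses the standard textbook definitions for a bi-allelic locus. It is not the study's software: it assumes Hardy-Weinberg genotype proportions and independent loci, it computes expected rather than observed heterozygosity, and the function names are hypothetical.

```python
from math import prod

def locus_stats(p_insertion):
    """Expected heterozygosity and power of discrimination for a bi-allelic
    locus in Hardy-Weinberg equilibrium, given the insertion allele frequency."""
    p, q = p_insertion, 1.0 - p_insertion
    genotype_freqs = (p * p, 2 * p * q, q * q)
    expected_het = 2 * p * q
    # PD: probability that two random individuals have different genotypes.
    power_of_discrimination = 1.0 - sum(g * g for g in genotype_freqs)
    return expected_het, power_of_discrimination

def combined_power(pd_values):
    """Combined PD across independent loci: 1 - prod(1 - PD_i)."""
    return 1.0 - prod(1.0 - pd for pd in pd_values)
```

    Under these assumptions a single bi-allelic locus cannot exceed a PD of 0.625 (reached at an allele frequency of 0.5), which is why panels such as this one combine 30 or more loci to reach a high total discrimination power.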

  8. Differentiation of Indica-Japonica rice revealed by insertion/deletion (InDel) fragments obtained from the comparative genomic study of DNA sequences between 93-11 (Indica) and Nipponbare (Japonica)

    Institute of Scientific and Technical Information of China (English)

    CAI Xingxing; LIU Jing; QIU Yinqiu; ZHAO Wei; SONG Zhiping; LU Baorong

    2007-01-01

    DNA polymorphisms from nucleotide insertion/deletions (InDels) in genomic sequences are the basis for developing InDel molecular markers. To validate the InDel primer pairs on the basis of the comparative genomic study on DNA sequences between an Indica rice 93-11 and a Japonica rice Nipponbare for identifying Indica and Japonica rice varieties and studying wild Oryza species, we studied 49 Indica, 43 Japonica, and 24 wild rice accessions collected from ten Asian countries using 45 InDel primer pairs. Results indicated that of the 45 InDel primer pairs, 41 can accurately identify Indica and Japonica rice varieties with a reliability of over 80%. The scatter plotting data of the principal component analysis (PCA) indicated that: (i) the InDel primer pairs can easily distinguish Indica from Japonica rice varieties, in addition to revealing their genetic differentiation; (ii) the AA-genome wild rice species showed a relatively close genetic relationship with the Indica rice varieties; and (iii) the non-AA genome wild rice species did not show evident differentiation into the Indica and Japonica types. It is concluded from the study that most of the InDel primer pairs obtained from DNA sequences of 93-11 and Nipponbare can be used for identifying Indica and Japonica rice varieties, and for studying genetic relationships of wild rice species, particularly in terms of the Indica-Japonica differentiation.

  9. Genome assembly quality: Assessment and improvement using the neutral indel model

    Science.gov (United States)

    Meader, Stephen; Hillier, LaDeana W.; Locke, Devin; Ponting, Chris P.; Lunter, Gerton

    2010-01-01

    We describe a statistical and comparative-genomic approach for quantifying error rates of genome sequence assemblies. The method exploits not substitutions but the pattern of insertions and deletions (indels) in genome-scale alignments for closely related species. Using two- or three-way alignments, the approach estimates the amount of aligned sequence containing clusters of nucleotides that were wrongly inserted or deleted during sequencing or assembly. Thus, the method is well-suited to assessing fine-scale sequence quality within single assemblies, between different assemblies of a single set of reads, and between genome assemblies for different species. When applying this approach to four primate genome assemblies, we found that average gap error rates per base varied considerably, by up to sixfold. As expected, bacterial artificial chromosome (BAC) sequences contained lower, but still substantial, predicted numbers of errors, arguing for caution in regarding BACs as the epitome of genome fidelity. We then mapped short reads, at approximately 10-fold statistical coverage, from a Bornean orangutan onto the Sumatran orangutan genome assembly originally constructed from capillary reads. This resulted in a reduced gap error rate and a separation of error-prone from high-fidelity sequence. Over 5000 predicted indel errors in protein-coding sequence were corrected in a hybrid assembly. Our approach contributes a new fine-scale quality metric for assemblies that should facilitate development of improved genome sequencing and assembly strategies. PMID:20305016

  10. Single strand conformation polymorphism based SNP and Indel markers for genetic mapping and synteny analysis of common bean (Phaseolus vulgaris L.)

    Directory of Open Access Journals (Sweden)

    Gómez Marcela

    2009-12-01

    Background: Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel-based methods have been reported for the detection of sequence variants; however, they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST-based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean, and to compare synteny of the common bean map with sequenced chromosomes of other legumes. Results: A set of 418 EST-based amplicons was evaluated for parental polymorphisms using the SSCP technique, and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon-based markers were then used for genetic mapping, with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10⁻¹². The synteny analysis with soybean showed a mosaic pattern of syntenic blocks, with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions, consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. Conclusion: The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. In addition, the genetic markers based on ESTs allowed the construction

  11. Association of an indel polymorphism in the 3'UTR of the caprine SPRN gene with scrapie positivity in the central nervous system.

    Science.gov (United States)

    Peletto, Simone; Bertolini, Silvia; Maniaci, Maria Grazia; Colussi, Silvia; Modesto, Paola; Biolatti, Cristina; Bertuzzi, Simone; Caramelli, Maria; Maurella, Cristiana; Acutis, Pier Luigi

    2012-07-01

    The aim of this study was to analyse the SPRN genes of goats from several scrapie outbreaks in order to detect polymorphisms and to look for association with scrapie occurrence, by an unmatched case-control study. A region of the caprine SPRN gene encompassing the entire ORF and a fragment of the 3'UTR revealed a total of 11 mutations: 10 single-nucleotide polymorphisms and one indel polymorphism. Only two non-synonymous mutations occurring at very low incidence were identified. A significant association with scrapie positivity in the central nervous system was found for an indel polymorphism (602_606insCTCCC) in the 3'UTR. Bioinformatics analyses suggest that this indel may modulate scrapie susceptibility via a microRNA-mediated post-transcriptional mechanism. This is the first study to demonstrate an association between the SPRN gene and goat scrapie. The identified indel may serve as a genetic target other than PRNP to predict disease risk in future genetics-based scrapie-control approaches in goats.

  12. Development of a multiplex TaqMan real-time PCR assay for typing of Mycoplasma pneumoniae based on type-specific indels identified through whole genome sequencing.

    Science.gov (United States)

    Wolff, Bernard J; Benitez, Alvaro J; Desai, Heta P; Morrison, Shatavia S; Diaz, Maureen H; Winchell, Jonas M

    2017-03-01

    We developed a multiplex real-time PCR assay for simultaneously detecting M. pneumoniae and typing into historically-defined P1 types. Typing was achieved based on the presence of short type-specific indels identified through whole genome sequencing. This assay was 100% specific compared to existing methods and may be useful during epidemiologic investigations.

  13. Detection of genomic variations and DNA polymorphisms and impact on analysis of meiotic recombination and genetic mapping.

    Science.gov (United States)

    Qi, Ji; Chen, Yamao; Copenhaver, Gregory P; Ma, Hong

    2014-07-08

    DNA polymorphisms are important markers in genetic analyses and are increasingly detected by using genome resequencing. However, the presence of repetitive sequences and structural variants can lead to false positives in the identification of polymorphic alleles. Here, we describe an analysis strategy that minimizes false positives in allelic detection and present analyses of recently published resequencing data from Arabidopsis meiotic products and individual humans. Our analysis enables the accurate detection of sequencing errors, small insertions and deletions (indels), and structural variants, including large reciprocal indels and copy number variants, from comparisons between the resequenced and reference genomes. We offer an alternative interpretation of the sequencing data of meiotic products, including the number and type of recombination events, to illustrate the potential for mistakes in single-nucleotide polymorphism calling. Using these examples, we propose that the detection of DNA polymorphisms using resequencing data needs to account for nonallelic homologous sequences.

  14. Indel markers: genetic diversity of 38 polymorphisms in Brazilian populations and application in a paternity investigation with post mortem material.

    Science.gov (United States)

    Manta, Fernanda; Caiafa, Alexandre; Pereira, Rui; Silva, Dayse; Amorim, António; Carvalho, Elizeu F; Gusmão, Leonor

    2012-09-01

    Aiming to evaluate the usefulness of 38 non-coding bi-allelic autosomal indels in genetic identification and kinship testing, three Brazilian population samples were studied: two from Rio de Janeiro (including a sample of individuals with self-declared African ancestry) and one Native American population of Terena from Mato Grosso do Sul. Based on the observed allele frequencies, parameters of forensic relevance were calculated. The combined power of discrimination of the 38 indels was high in all studied groups (PD≥0.9999999999997), although slightly lower in Native Americans. Genetic distance analysis showed significant differences between the allele frequencies in the Rio de Janeiro population and those previously reported for Europeans, Africans and Asians explained by its intermediate position between Europeans and Africans. As expected, the Terena sample was significantly different from all the other populations: Brazilians from Rio de Janeiro general population and with self-declared African ancestry, Europeans, Africans and East Asians. Finally, the performance of the 38-indel multiplex assay was tested in post-mortem material with positive results, supporting the use of short amplicon bi-allelic markers as an additional tool to STR analysis when DNA molecules are degraded.

  15. Genome editing using FACS enrichment of nuclease-expressing cells and indel detection by amplicon analysis

    DEFF Research Database (Denmark)

    Lonowski, Lindsey A; Narimatsu, Yoshiki; Riaz, Anjum;

    2017-01-01

    This protocol describes methods for increasing and evaluating the efficiency of genome editing based on the CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats-CRISPR-associated 9) system, transcription activator-like effector nucleases (TALENs) or zinc-finger nucleases (ZFNs...

  16. Genome-wide DNA polymorphism in the indica rice varieties RGD-7S and Taifeng B as revealed by whole genome re-sequencing.

    Science.gov (United States)

    Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng

    2016-03-01

    Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp paired-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites, including 2,408,845 SNPs and 349,895 InDels, were detected in RGD-7S and Taifeng B. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs overlapped 218 genes, and in Taifeng B, 1231 of 2010 CNVs were annotated in 175 genes. In addition, we verified a subset of InDels in the interval of the hybrid weakness genes Hw3 and Hw4 and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.
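    The per-100-kb densities quoted above follow from simple arithmetic, sketched here as a consistency check. The abstract does not state the sequence length used as the denominator, so the sketch back-computes it from the reported counts and densities; the helper names are hypothetical.

```python
def density_per_100kb(variant_count, length_bp):
    """Number of variants per 100 kb of sequence."""
    return variant_count / length_bp * 100_000

def implied_length_mb(variant_count, density):
    """Sequence length in Mb implied by a count and a per-100-kb density."""
    return variant_count / density * 100_000 / 1_000_000

# Counts and densities as reported in the abstract (RGD-7S/Taifeng B).
snp_length_mb = implied_length_mb(961_791, 256.8)
indel_length_mb = implied_length_mb(46_640, 12.5)
```

    Both figures imply a denominator of roughly 373 to 375 Mb, so the two reported densities are mutually consistent to within rounding.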

  17. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds

    Science.gov (United States)

    Lobo, Francisco Pereira; Yamagishi, Michel Eduardo Beleza; Chud, Tatiane Cristina Seleguim; Caetano, Alexandre Rodrigues; Munari, Danísio Prado; Garrick, Dorian J.; Machado, Marco Antonio; Martins, Marta Fonseca; Carvalho, Maria Raquel; Cole, John Bruce; Barbosa da Silva, Marcos Vinicius Gualberto

    2017-01-01

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs. PMID:28323836

  18. ddRAD-seq phylogenetics based on nucleotide, indel, and presence-absence polymorphisms: Analyses of two avian genera with contrasting histories.

    Science.gov (United States)

    DaCosta, Jeffrey M; Sorenson, Michael D

    2016-01-01

    Genotype-by-sequencing (GBS) methods have revolutionized the field of molecular ecology, but their application in molecular phylogenetics remains somewhat limited. In addition, most phylogenetic studies based on large GBS data sets have relied on analyses of concatenated data rather than species tree methods that explicitly account for genealogical stochasticity among loci. We explored the utility of "double-digest" restriction site-associated DNA sequencing (ddRAD-seq) for phylogenetic analyses of the Lagonosticta firefinches (family Estrildidae) and the Vidua brood parasitic finches (family Viduidae). As expected, the number of homologous loci shared among samples was negatively correlated with genetic distance due to the accumulation of restriction site polymorphisms. Nonetheless, for each genus, we obtained data sets of ∼3000 loci shared in common among all samples, including a more distantly related outgroup taxon. For all samples combined, we obtained >1000 homologous loci despite ∼20 My divergence between estrildid and parasitic finches. In addition to nucleotide polymorphisms, the ddRAD-seq data yielded large sets of indel and locus presence-absence polymorphisms, all of which had higher consistency indices than mtDNA sequence data in the context of concatenated parsimony analyses. Species tree methods, using individual gene trees or single nucleotide polymorphisms as input, generated results broadly consistent with analyses of concatenated data, particularly for Lagonosticta, which appears to have a well resolved, bifurcating history. Results for Vidua were also generally consistent across methods and data sets, although nodal support and results from different species tree methods were more variable. Lower gene tree congruence in Vidua is likely the result of its unique evolutionary history, which includes rapid speciation by host shift and occasional hybridization and introgression due to incomplete reproductive isolation. We conclude that dd

  19. Genome-wide analysis of intraspecific DNA polymorphism in 'Micro-Tom', a model cultivar of tomato (Solanum lycopersicum).

    Science.gov (United States)

    Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh

    2014-02-01

    Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.

  20. Simple Detection of Large InDeLS by DHPLC: The ACE Gene as a Model

    Directory of Open Access Journals (Sweden)

    Renata Guedes Koyama

    2008-01-01

    Insertion-deletion polymorphism (InDeL) is the second most frequent type of genetic variation in the human genome. For the detection of large InDeLs, researchers usually resort to either PCR gel analysis or RFLP, but these are time consuming and dependent on human interpretation. Therefore, a more efficient method for genotyping this kind of genetic variation is needed. In this report, we describe a method that can detect large InDeLs by DHPLC (denaturing high-performance liquid chromatography) using the angiotensin-converting enzyme (ACE) gene I/D polymorphism as a model. The InDeL targeted in this study is characterized by a 288 bp Alu element insertion (I). We used DHPLC at nondenaturing conditions to analyze the PCR product with a flow through the chromatographic column under two different gradients based on the differences between D and I sequences. The analysis described is quick and easy, making this technique a suitable and efficient means for DHPLC users to screen InDeLs in genetic epidemiological studies.

  1. The complete chloroplast genome provides insight into the evolution and polymorphism of Panax ginseng

    Directory of Open Access Journals (Sweden)

    Yongbing Zhao

    2015-01-01

    Panax ginseng C.A. Meyer (P. ginseng) is an important medicinal plant and is often used in traditional Chinese medicine. With next generation sequencing (NGS) technology, we determined the complete chloroplast genome sequences for four Chinese P. ginseng strains, which are Damaya (DMY), Ermaya (EMY), Gaolishen (GLS) and Yeshanshen (YSS). The total chloroplast genome sequence length for DMY, EMY and GLS was 156,354 bp, while that for YSS was 156,355 bp. Comparative genomic analysis of the chloroplast genome sequences indicates that gene content, GC content, and gene order in DMY are quite similar to its relative species, and nucleotide sequence diversity of inverted repeat region (IR) is lower than that of its counterparts, large single copy region (LSC) and small single copy region (SSC). A comparison among these four P. ginseng strains revealed that the chloroplast genome sequences of DMY, EMY, and GLS were identical and YSS had a 1-bp insertion at base 5472. To further study the heterogeneity in chloroplast genome during domestication, high-resolution reads were mapped to the genome sequences to investigate the differences at the minor allele level; 208 minor allele sites with minor allele frequencies (MAF) of ≥ 0.05 were identified. The polymorphism site numbers per kb of chloroplast genome sequence for DMY, EMY, GLS, and YSS were 0.74, 0.59, 0.97, and 1.23, respectively. All the minor allele sites were located in LSC and IR regions, and the four strains showed the same variation types (substitution base or indel) at all identified polymorphism sites. Comparison results of heterogeneity in the chloroplast genome sequences showed that the minor allele sites on the chloroplast genome were undergoing purifying selection to adapt to changing environment during domestication process. A study of P. ginseng chloroplast genome with particular focus on minor allele sites would aid in investigating the dynamics on the chloroplast genomes and different P. ginseng
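
    The minor-allele screen mentioned in this record (sites with MAF ≥ 0.05) can be illustrated with a minimal Python sketch. It assumes that per-site allele counts have already been tallied from the mapped reads; the function names and the dictionary input format are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele at one site.

    allele_counts maps an allele label (e.g. "A", "G", "del") to the
    number of reads supporting it. Assumed input format.
    """
    total = sum(allele_counts.values())
    if total == 0:
        return 0.0
    ranked = sorted(allele_counts.values(), reverse=True)
    minor = ranked[1] if len(ranked) > 1 else 0
    return minor / total


def is_minor_allele_site(allele_counts, threshold=0.05):
    """True when the site passes the MAF cutoff (0.05 as in the record)."""
    return minor_allele_frequency(allele_counts) >= threshold


def sites_per_kb(n_sites, genome_length_bp):
    """Density of polymorphic sites per kilobase of sequence."""
    return n_sites / (genome_length_bp / 1000)


if __name__ == "__main__":
    site = {"A": 90, "G": 10}  # illustrative counts, not study data
    print(minor_allele_frequency(site), is_minor_allele_site(site))
```

    Ranking the counts rather than fixing a reference allele keeps the sketch independent of which allele is called "major" at a given site.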

  2. Insertions/deletions-associated nucleotide polymorphism in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Changjiang Guo

    2016-11-01

    Although high levels of within-species variation are commonly observed, a general mechanism for the origin of such variation is still lacking. Insertions and deletions (indels) are a widespread feature of genomes and we hypothesize that there might be an association between indels and patterns of nucleotide polymorphism. Here, we investigate flanking sequences around 18 indels (>100 bp) among a large number of accessions of the plant Arabidopsis thaliana. We found two distinct haplotypes, i.e. a nucleotide dimorphism, present around each of these indels, and dimorphic haplotypes always corresponded to the indel-present/-absent patterns. In addition, the peaks of nucleotide diversity between the two divergent alleles were closely associated with these indels. Thus, there exists a close association between indels and dimorphisms. Further analysis suggests that indel-associated substitutions could be an important component of genetic variation shaping nucleotide polymorphism in Arabidopsis. Finally, we suggest a mechanism by which indels might generate these highly divergent haplotypes. This study provides evidence that nucleotide dimorphisms, which are frequently regarded as evidence of frequency-dependent selection, could be explained simply by structural variation in the genome.

  3. Efficient indica and japonica rice identification based on the InDel molecular method: Its implication in rice breeding and evolutionary research

    Institute of Scientific and Technical Information of China (English)

    Bao-Rong Lu; Xingxing Cai; Xin Jin

    2009-01-01

    An efficient molecular method for the accurate identification of indica and japonica rice was created based on the polymorphisms of insertion/deletion (InDel) DNA fragments obtained from the basic local alignment search tool (BLAST) to the entire genomic sequences of indica (93-11) and japonica rice (Nipponbare). The 45 InDel loci were validated experimentally by the polymerase chain reaction (PCR) and polyacrylamide gel electrophoresis (PAGE) in 44 typical indica and japonica rice varieties, including 93-11 and Nipponbare. A neutrality test of the data matrix generated from electrophoretic banding patterns of various InDel loci indicated that 34 InDel loci were strongly associated with the differentiation of indica and japonica rice. More extensive analyses involving cultivated rice varieties from 11 Asian countries, and 12 wild Oryza species with various origins confirmed that indica and japonica characteristics could accurately be determined via calculating the average frequency of indica- or japonica-specific alleles on different InDel loci across the rice genome. This method was named the "InDel molecular index"; it combines molecular and statistical methods in determining the indica and japonica characteristics of rice varieties. Compared with the traditional methods based essentially on morphology, the InDel molecular index provides a very accurate, rapid, simple, and efficient method for identifying indica and japonica rice. In addition, the InDel index can be used to determine indica or japonica characteristics of wild Oryza species, which largely extends the utility of this method. The InDel molecular index provides a new tool for the effective selection of appropriate indica or japonica rice germplasm in rice breeding. It also offers a novel model for the study of the origin, evolution, and genetic differentiation of indica and japonica rice adapted to various environmental changes.
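
    The averaging step behind the "InDel molecular index" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each locus is scored as a pair of alleles labelled "I" (indica-specific) or "J" (japonica-specific), loci with any other label are skipped as unscorable, and no classification thresholds are applied because the record does not give them. The function name and genotype encoding are hypothetical.

```python
def indel_molecular_index(genotypes):
    """Average per-locus frequency of indica- and japonica-specific alleles.

    genotypes is a list of two-allele tuples, one per InDel locus, using
    the assumed labels "I" and "J". Returns (indica_freq, japonica_freq).
    """
    indica_per_locus = []
    for alleles in genotypes:
        # Skip loci that carry an unscorable allele (e.g. None or "?").
        if any(a not in ("I", "J") for a in alleles):
            continue
        indica_per_locus.append(alleles.count("I") / len(alleles))
    if not indica_per_locus:
        raise ValueError("no scorable loci")
    indica_freq = sum(indica_per_locus) / len(indica_per_locus)
    return indica_freq, 1.0 - indica_freq


if __name__ == "__main__":
    # Four illustrative loci: two homozygous indica, one heterozygous,
    # one homozygous japonica.
    sample = [("I", "I"), ("I", "J"), ("J", "J"), ("I", "I")]
    print(indel_molecular_index(sample))
```

    A variety whose indica frequency is close to 1 would carry mostly indica-specific alleles; where to draw the line between the two types is left open here.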

  4. SIFT Indel: predictions for the functional effects of amino acid insertions/deletions in proteins.

    Science.gov (United States)

    Hu, Jing; Ng, Pauline C

    2013-01-01

    Indels in the coding regions of a gene can either cause frameshifts or amino acid insertions/deletions. Frameshifting indels are indels that have a length that is not divisible by 3 and subsequently cause frameshifts. Indels that have a length divisible by 3 cause amino acid insertions/deletions or block substitutions; we call these 3n indels. The new amino acid changes resulting from 3n indels could potentially affect protein function. Therefore, we construct a SIFT Indel prediction algorithm for 3n indels which achieves 82% accuracy, 81% sensitivity, 82% specificity, 82% precision, 0.63 MCC, and 0.87 AUC by 10-fold cross-validation. We have previously published a prediction algorithm for frameshifting indels. The rules for the prediction of 3n indels are different from the rules for the prediction of frameshifting indels and reflect the biological differences of these two different types of variations. SIFT Indel was applied to human 3n indels from the 1000 Genomes Project and the Exome Sequencing Project. We found that common variants are less likely to be deleterious than rare variants. The SIFT indel prediction algorithm for 3n indels is available at http://sift-dna.org/
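
    The length rule that separates frameshifting indels from 3n indels, and the evaluation metrics quoted in this record, can be made concrete with a short Python sketch. It is not the SIFT Indel algorithm; it only shows the divisibility-by-3 rule and the standard confusion-matrix formulas. The function names and the REF/ALT string input are assumptions.

```python
import math


def classify_coding_indel(ref, alt):
    """Label a coding indel by its length, given REF and ALT allele strings."""
    length = abs(len(ref) - len(alt))
    if length == 0:
        raise ValueError("REF and ALT have equal length: not an indel")
    # A length divisible by 3 keeps the reading frame ("3n" indel).
    return "3n" if length % 3 == 0 else "frameshift"


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        # Matthews correlation coefficient; defined as 0 when denom is 0.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    print(classify_coding_indel("ATGC", "A"))  # 3 bp deletion
    print(classify_coding_indel("A", "AT"))    # 1 bp insertion
    print(binary_metrics(tp=8, fp=2, tn=8, fn=2))  # illustrative counts
```

    The counts in the usage example are made up for illustration and are unrelated to the cross-validation results reported in the record.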

  5. SNPs & indels Schizophyllum commune

    NARCIS (Netherlands)

    Nieuwenhuis, B.P.S.; Aanen, D.K.

    2013-01-01

    This description accompanies four files containing SNPs and indels found in two sets of isolates of Schizophyllum commune. This dataset was created for and used in Nieuwenhuis, Nieuwhof and Aanen (2013) On the asymmetry of mating in natural populations of the mushroom fungus Schizophyllum commune. F

  6. Genome-wide detection of chromosomal rearrangements, indels, and mutations in circular chromosomes by short read sequencing

    DEFF Research Database (Denmark)

    Skovgaard, Ole; Bak, Mads; Løbner-Olesen, Anders;

    2011-01-01

    a combination of WGS and genome copy number analysis, for the identification of mutations that suppress the growth deficiency imposed by excessive initiations from the Escherichia coli origin of replication, oriC. The E. coli chromosome, like the majority of bacterial chromosomes, is circular, and DNA...... replication is initiated by assembling two replication complexes at the origin, oriC. These complexes then replicate the chromosome bidirectionally toward the terminus, ter. In a population of growing cells, this results in a copy number gradient, so that origin-proximal sequences are more frequent than...... origin-distal sequences. Major rearrangements in the chromosome are, therefore, readily identified by changes in copy number, i.e., certain sequences become over- or under-represented. Of the eight mutations analyzed in detail here, six were found to affect a single gene only, one was a large chromosomal...

  7. On the inversion-indel distance.

    Science.gov (United States)

    Willing, Eyla; Zaccaria, Simone; Braga, Marília D V; Stoye, Jens

    2013-01-01

    The inversion distance, that is the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be computed thanks to a pioneering approach of Hannenhalli and Pevzner in 1995. In 2000, El-Mabrouk extended the inversion model to allow the comparison of unichromosomal genomes with unequal contents, thus insertions and deletions of DNA segments besides inversions. However, an exact algorithm was presented only for the case in which we have insertions alone and no deletion (or vice versa), while a heuristic was provided for the symmetric case, that allows both insertions and deletions and is called the inversion-indel distance. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, that can represent several genome rearrangements (including inversions). Among others, the DCJ model gave rise to two important results. First, it has been shown that the inversion distance can be computed in a simpler way with the help of the DCJ operation. Second, the DCJ operation originated the DCJ-indel distance, that allows the comparison of genomes with unequal contents, considering DCJ, insertions and deletions, and can be computed in linear time. In the present work we put these two results together to solve an open problem, showing that, when the graph that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. We also give a lower and an upper bound for the inversion-indel distance in the presence of bad components.

  8. Sequence context of indel mutations and their effect on protein evolution in a bacterial endosymbiont.

    Science.gov (United States)

    Williams, Laura E; Wernegreen, Jennifer J

    2013-01-01

    Indel mutations play key roles in genome and protein evolution, yet we lack a comprehensive understanding of how indels impact evolutionary processes. Genome-wide analyses enabled by next-generation sequencing can clarify the context and effect of indels, thereby integrating a more detailed consideration of indels with our knowledge of nucleotide substitutions. To this end, we sequenced Blochmannia chromaiodes, an obligate bacterial endosymbiont of carpenter ants, and compared it with the close relative, B. pennsylvanicus. The genetic distance between these species is small enough for accurate whole genome alignment but large enough to provide a meaningful spectrum of indel mutations. We found that indels are subjected to purifying selection in coding regions and even intergenic regions, which show a reduced rate of indel base pairs per kilobase compared with nonfunctional pseudogenes. Indels occur almost exclusively in repeat regions composed of homopolymers and multimeric simple sequence repeats, demonstrating the importance of sequence context for indel mutations. Despite purifying selection, some indels occur in protein-coding genes. Most are multiples of three, indicating selective pressure to maintain the reading frame. The deleterious effect of frameshift-inducing indels is minimized by either compensation from a nearby indel to restore reading frame or the indel's location near the 3'-end of the gene. We observed amino acid divergence exceeding nucleotide divergence in regions affected by frameshift-inducing indels, suggesting that these indels may either drive adaptive protein evolution or initiate gene degradation. Our results shed light on how indel mutations impact processes of molecular evolution underlying endosymbiont genome evolution.

  9. Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

    DEFF Research Database (Denmark)

    Huang, Shujia; Rao, Junhua; Ye, Weijian

    2015-01-01

    Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels...

  12. Barcode System for Genetic Identification of Soybean [Glycine max (L.) Merrill] Cultivars Using InDel Markers Specific to Dense Variation Blocks.

    Science.gov (United States)

    Sohn, Hwang-Bae; Kim, Su-Jeong; Hwang, Tae-Young; Park, Hyang-Mi; Lee, Yu-Young; Markkandan, Kesavan; Lee, Dongwoo; Lee, Sunghoon; Hong, Su-Young; Song, Yun-Ho; Koo, Bon-Cheol; Kim, Yul-Ho

    2017-01-01

    For genetic identification of soybean [Glycine max (L.) Merrill] cultivars, insertion/deletion (InDel) markers are currently preferred because they are easy to use, co-dominant and relatively abundant. Despite their biological importance, the investigation of InDels with proven quality and reproducibility has been limited. In this study, we describe a soybean barcode system based on InDel markers, each of which is specific to a dense variation block (dVB) with non-random recombination due to many variations. Firstly, 2,274 VBs were mined by analyzing whole genome data in six soybean cultivars (Backun, Sinpaldal 2, Shingi, Daepoong, Hwangkeum, and Williams 82) for transferability to dVB-specific InDel markers. Secondly, 73,327 putative InDels in the dVB regions were identified for the development of the soybean barcode system. Among them, 202 dVB-specific InDels from all soybean cultivars were selected by gel electrophoresis and converted to 2D barcode types by comparing amplicon polymorphisms in the five cultivars to the reference cultivar. Finally, the polymorphism of the markers was assessed in 147 soybean cultivars, and the soybean barcode system that allows a clear distinction among soybean cultivars is also detailed. In addition, changes in the dVBs at the chromosomal level can be quickly identified by investigating the reshuffling pattern of the soybean cultivars with 27 marker sets. In particular, a backcross-inbred offspring, "Singang", and a recurrent parent, "Sowon", were identified by using the 27 InDel markers. These results indicate that the soybean barcode system enables not only the minimal use of molecular markers but also the comparison of data from different sources, because allele binning is not needed for new varieties.

  13. Genetic polymorphism analyses of 30 InDels in Chinese Xibe ethnic group and its population genetic differentiations with other groups.

    Science.gov (United States)

    Meng, Hao-Tian; Zhang, Yu-Dang; Shen, Chun-Mei; Yuan, Guo-Lian; Yang, Chun-Hua; Jin, Rui; Yan, Jiang-Wei; Wang, Hong-Dan; Liu, Wen-Juan; Jing, Hang; Zhu, Bo-Feng

    2015-02-05

    In the present study, we obtained population genetic data and forensic parameters of 30 InDel loci in Chinese Xibe ethnic group from northwestern China and studied the genetic relationships between the studied Xibe group and other reference groups. The observed heterozygosities ranged from 0.1704 at HLD118 locus to 0.5247 at HLD92 locus while the expected heterozygosities ranged from 0.1559 at HLD118 locus to 0.4997 at HLD101 locus. The cumulative power of exclusion and total probability of discrimination power in the studied group were 0.9867 and 0.9999999999902 for the 30 loci, respectively. Analyses of structure, PCA, interpopulation differentiations and phylogenetic tree revealed that the Xibe group had close genetic relationships with South Korean, Beijing Han and Guangdong Han groups. The results indicated that these 30 loci should only be used as a complement for autosomal STRs in paternity cases but could provide an acceptable level of discrimination in forensic identification cases in the studied Xibe group. Further studies should be conducted for better understanding of the Xibe genetic background.
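
    The forensic parameters named in this record follow textbook formulas that can be sketched in Python. The sketch below assumes biallelic InDel loci and independence between loci when combining them; the function names are hypothetical and the inputs in the usage example are illustrative, not data from the Xibe study.

```python
from math import prod


def expected_heterozygosity(p_insertion):
    """Expected heterozygosity of a biallelic locus: 1 - p^2 - q^2."""
    q = 1.0 - p_insertion
    return 1.0 - p_insertion ** 2 - q ** 2


def observed_heterozygosity(n_heterozygotes, n_individuals):
    """Fraction of typed individuals that are heterozygous."""
    return n_heterozygotes / n_individuals


def power_of_discrimination(genotype_freqs):
    """PD for one locus: 1 minus the sum of squared genotype frequencies."""
    return 1.0 - sum(f * f for f in genotype_freqs)


def combined_power(per_locus_values):
    """Combine per-locus PD (or PE) values, assuming independent loci."""
    return 1.0 - prod(1.0 - v for v in per_locus_values)


if __name__ == "__main__":
    print(expected_heterozygosity(0.5))
    print(power_of_discrimination([0.25, 0.5, 0.25]))
    print(combined_power([0.5, 0.5]))
```

    Because 1 - p^2 - q^2 equals 2pq, the expected heterozygosity of a biallelic marker cannot exceed 0.5, which is why many such loci must be combined to reach a high cumulative discrimination power.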

  14. Genome Polymorphisms Between Indica and Japonica Revealed by RFLP

    Institute of Scientific and Technical Information of China (English)

    WANG Song-wen; LIU Xia; XU Cai-guo; SHI Li-li; ZHANG Xin; DING De-liang; WANG Yong

    2007-01-01

    To reveal the genome polymorphisms between indica and japonica subspecies, RFLP markers located across the 12 chromosomes of rice were used to analyze indica-japonica differentiation in different rice varieties. At the same time, genome sequence variations of the screened loci were analyzed by bioinformatics methods. Twenty-eight RFLP probes that can classify indica-japonica rice were confirmed. Subspecies genome polymorphisms of the screened loci were found by analyzing the published genome sequence data of rice. The study indicated that these screened markers can be used for classifying indica-japonica subspecies. With the publication of the genome sequences of rice, marker polymorphisms between indica and japonica subspecies can be revealed by genome differentiation.

  15. Identification of conserved and polymorphic STRs for personal genomes

    Science.gov (United States)

    2014-01-01

    Background: Short tandem repeats (STRs) are abundant in human genomes. Numerous STRs have been shown to be associated with genetic diseases and gene regulatory functions, and have been selected as genetic markers for evolutionary and forensic analyses. High-throughput next generation sequencers have fostered new cutting-edge computing techniques for genome-scale analyses, and cross-genome comparisons have facilitated the efficient identification of polymorphic STR markers for various applications. Results: An automated and efficient system for detecting human polymorphic STRs at the genome scale is proposed in this study. Assembled contigs from next generation sequencing data were aligned and calibrated according to selected reference sequences. To verify identified polymorphic STRs, human genomes from the 1000 Genomes Project were employed for comprehensive analyses, and STR markers from the Combined DNA Index System (CODIS) and disease-related STR motifs were also applied as cases for evaluation. In addition, we analyzed STR variations for highly conserved homologous genes and human-unique genes. In total 477 polymorphic STRs were identified from 492 human-unique genes, among which 26 STRs were retrieved and clustered into three different groups for efficient comparison. Conclusions: We have developed an online system that efficiently identifies polymorphic STRs and provides novel distinguishable STR biomarkers for different levels of specificity. Candidate polymorphic STRs within a personal genome could be easily retrieved and compared to the constructed STR profile through query keywords, gene names, or assembled contigs. PMID:25560225
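
    To make the idea of a polymorphic STR concrete, the following toy Python sketch finds tandem repeats with a regular expression and reports those whose copy number differs between two sequences of the same locus. It is not the system described in this record: the definition of an STR used here (a 2-6 bp motif repeated at least four times), the pairing of repeats by order, and all function names are assumptions made for illustration.

```python
import re

# Assumption: an STR is a 2-6 bp motif repeated at least 4 times in tandem.
# The lazy quantifier prefers the shortest motif (e.g. "CA" over "CACA").
STR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_strs(sequence):
    """Return (start, motif, copies) for each tandem repeat in the sequence."""
    hits = []
    for match in STR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


def polymorphic_strs(ref_locus, sample_locus):
    """Pair repeats by order; report (motif, ref_copies, sample_copies)
    for pairs that share a motif but differ in copy number."""
    differing = []
    for (_, r_motif, r_n), (_, s_motif, s_n) in zip(
        find_strs(ref_locus), find_strs(sample_locus)
    ):
        if r_motif == s_motif and r_n != s_n:
            differing.append((r_motif, r_n, s_n))
    return differing


if __name__ == "__main__":
    reference = "GGT" + "CA" * 5 + "TTG"  # illustrative sequences
    sample = "GGT" + "CA" * 7 + "TTG"
    print(find_strs(reference))
    print(polymorphic_strs(reference, sample))
```

    Pairing repeats by order only works when both sequences cover the same locus with the same set of repeats; a genome-scale comparison would need alignment to a reference first, as the record describes.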

  16. Whole genome sequencing of Gir cattle for identifying polymorphisms and loci under selection.

    Science.gov (United States)

    Liao, Xiaoping; Peng, Fred; Forni, Selma; McLaren, David; Plastow, Graham; Stothard, Paul

    2013-10-01

    Genetic variation in Gir cattle (Bos indicus) has so far not been well characterized. In this study, we used whole genome sequencing of three Gir bulls and a pooled sample from another 11 bulls to identify polymorphisms and loci under selection. A total of 9 990 733 single nucleotide polymorphisms (SNPs) and 604 308 insertion/deletions (indels) were discovered in Gir samples, of which 62.34% and 83.62%, respectively, are previously unknown. Moreover, we detected 79 putative selective sweeps using the sequence data of the pooled sample. One of the most striking sweeps harbours several genes belonging to the cathelicidin gene family, such as CAMP, CATHL1, CATHL2, and CATHL3, which are related to pathogen- and parasite-resistance. Another interesting region harbours genes encoding mitogen-activated protein kinases, which are involved in directing cellular responses to a variety of stimuli, such as osmotic stress and heat shock. These findings are particularly interesting because Gir is resistant to hot temperatures and tropical diseases. This initial selective sweep analysis of Gir cattle has revealed a number of loci that could be important for their adaptation to tropical climates.

  17. Genome-wide polymorphisms show unexpected targets of natural selection

    OpenAIRE

    Pespeni, Melissa H.; Garfield, David A.; Manier, Mollie K; Palumbi, Stephen R.

    2011-01-01

    Natural selection can act on all the expressed genes of an individual, leaving signatures of genetic differentiation or diversity at many loci across the genome. New power to assay these genome-wide effects of selection comes from associating multi-locus patterns of polymorphism with gene expression and function. Here, we performed one of the first genome-wide surveys in a marine species, comparing purple sea urchins, Strongylocentrotus purpuratus, from two distant locations along the species...

  18. Development of EST-based SNP and InDel markers and their utilization in tetraploid cotton genetic mapping

    Science.gov (United States)

    Expressed sequence tags (ESTs) were analyzed in silico in order to identify single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels) in cotton. A total of 1349 EST-based SNP and InDel markers were developed by comparing ESTs between Gossypium hirsutum and G. barbadense, m...

  19. A PCR based protocol for detecting indel mutations induced by TALENs and CRISPR/Cas9 in zebrafish.

    Directory of Open Access Journals (Sweden)

    Chuan Yu

    Genome editing techniques such as the zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system Cas9 can induce efficient DNA double strand breaks (DSBs) at the target genomic sequence and result in indel mutations by the error-prone non-homologous end joining (NHEJ) DNA repair system. Several methods, including the sequence-specific endonuclease assay, the T7E1 assay and the high resolution melting curve (HRM) assay, have been developed to detect the efficiency of the induced mutations. However, these assays have some limitations in that they either require specific sequences in the target sites, are unable to generate sequencing-ready mutant DNA fragments, or are unable to distinguish induced mutations from natural nucleotide polymorphism. Here, we developed a simple PCR-based protocol for detecting indel mutations induced by TALEN and Cas9 in zebrafish. We designed 2 pairs of primers for each target locus, with one putative amplicon extending beyond the putative indel site and the other overlapping it. With these primers, we performed a qPCR assay to efficiently detect the frequencies of newly induced mutations, which was accompanied by a T-vector-based colony analysis to generate single-copy mutant fragment clones for subsequent DNA sequencing. Thus, our work has provided a very simple, efficient and fast assay for detecting induced mutations, which we anticipate will be widely used in the area of genome editing.

  20. A high-resolution InDel (insertion-deletion) markers-anchored consensus genetic map identifies major QTLs governing pod number and seed yield in chickpea

    Directory of Open Access Journals (Sweden)

    Rishi Srivastava

    2016-09-01

    Development and large-scale genotyping of user-friendly informative genome/gene-derived InDel markers in natural and mapping populations is vital for accelerating genomics-assisted breeding applications of chickpea with minimal resource expenses. The present investigation employed a high-throughput whole genome NGS (next-generation sequencing) resequencing strategy in low and high pod number parental accessions and homozygous individuals constituting the bulks from each of two inter-specific mapping populations [(Pusa 1103 x ILWC 46) and (Pusa 256 x ILWC 46)] to develop non-erroneous InDel markers at a genome-wide scale. Comparing these high-quality genomic sequences, 82360 InDel markers with reference to kabuli genome and 13891 InDel markers exhibiting differentiation between low and high pod number parental accessions and bulks of aforementioned mapping populations were developed. These informative markers were structurally and functionally annotated in diverse coding and non-coding sequence components of genome/genes of kabuli chickpea. The functional significance of regulatory and coding (frameshift and large-effect mutations) InDel markers for establishing marker-trait linkages through association/genetic mapping was apparent. The markers detected a greater amplification (97%) and intra-specific polymorphic potential (58-87%) among a diverse panel of cultivated desi, kabuli and wild accessions even by using a simpler cost-efficient agarose gel-based assay, implicating their utility in large-scale genetic analysis especially in domesticated chickpea with narrow genetic base. Two high-density inter-specific genetic linkage maps generated using aforesaid mapping populations were integrated to construct a consensus 1479 InDel markers-anchored high-resolution (inter-marker distance: 0.66 cM) genetic map for efficient molecular mapping of major QTLs governing pod number and seed yield per plant in chickpea. Utilizing these high-density genetic maps as

  1. Templated sequence insertion polymorphisms in the human genome

    Science.gov (United States)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  2. Genome-wide patterns of nucleotide polymorphism in domesticated rice

    DEFF Research Database (Denmark)

    Caicedo, Ana L; Williamson, Scott H; Hernandez, Ryan D

    2007-01-01

    Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments......, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models...... explanations for patterns of variation in domesticated rice varieties. If selective sweeps are indeed the explanation for the observed nucleotide data of domesticated rice, it suggests that strong selection can leave its imprint on genome-wide polymorphism patterns, contrary to expectations that selection...

  3. Genome-Wide Association Study of Polymorphisms Predisposing to Bronchiolitis

    Science.gov (United States)

    Pasanen, Anu; Karjalainen, Minna K.; Bont, Louis; Piippo-Savolainen, Eija; Ruotsalainen, Marja; Goksör, Emma; Kumawat, Kuldeep; Hodemaekers, Hennie; Nuolivirta, Kirsi; Jartti, Tuomas; Wennergren, Göran; Hallman, Mikko; Rämet, Mika; Korppi, Matti

    2017-01-01

    Bronchiolitis is a major cause of hospitalization among infants. Severe bronchiolitis is associated with later asthma, suggesting a common genetic predisposition. Genetic background of bronchiolitis is not well characterized. To identify polymorphisms associated with bronchiolitis, we conducted a genome-wide association study (GWAS) in which 5,300,000 single nucleotide polymorphisms (SNPs) were tested for association in a Finnish–Swedish population of 217 children hospitalized for bronchiolitis and 778 controls. The most promising SNPs (n = 77) were genotyped in a Dutch replication population of 416 cases and 432 controls. Finally, we used a set of 202 Finnish bronchiolitis cases to further investigate candidate SNPs. We did not detect genome-wide significant associations, but several suggestive association signals (p bronchiolitis. These preliminary findings require further validation in a larger sample size. PMID:28139761

  4. Genome-wide patterns of nucleotide polymorphism in domesticated rice.

    Directory of Open Access Journals (Sweden)

    Ana L Caicedo

    2007-09-01

    Domesticated Asian rice (Oryza sativa) is one of the oldest domesticated crop species in the world, having fed more people than any other plant in human history. We report the patterns of DNA sequence variation in rice and its wild ancestor, O. rufipogon, across 111 randomly chosen gene fragments, and use these to infer the evolutionary dynamics that led to the origins of rice. There is a genome-wide excess of high-frequency derived single nucleotide polymorphisms (SNPs) in O. sativa varieties, a pattern that has not been reported for other crop species. We developed several alternative models to explain contemporary patterns of polymorphisms in rice, including a (i) selectively neutral population bottleneck model, (ii) bottleneck plus migration model, (iii) multiple selective sweeps model, and (iv) bottleneck plus selective sweeps model. We find that a simple bottleneck model, which has been the dominant demographic model for domesticated species, cannot explain the derived nucleotide polymorphism site frequency spectrum in rice. Instead, a bottleneck model that incorporates selective sweeps, or a more complex demographic model that includes subdivision and gene flow, are more plausible explanations for patterns of variation in domesticated rice varieties. If selective sweeps are indeed the explanation for the observed nucleotide data of domesticated rice, it suggests that strong selection can leave its imprint on genome-wide polymorphism patterns, contrary to expectations that selection results only in a local signature of variation.

  5. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  6. Identification and annotation of genetic variants (SNP/Indel) in Danish Jutland cattle

    DEFF Research Database (Denmark)

    Das, Ashutosh; Panitz, Frank; Holm, Lars-Erik

    We sequenced the whole-genome of a Danish Jutland bull to identify genetic variants (SNP/indel). Using UnifiedGenotyper from the Genome Analysis Toolkit (GATK), we identified 6,812,198 SNPs and 804,453 indels. There were 2,598,000 (38.1%) novel SNPs and 607,923 (75.6%) novel indels while the remaining was annotated in dbSNP build 133. In-depth annotation of the variants revealed that 45,776 SNPs affected the coding sequences of 11,538 genes, 221 SNPs predicted to cause a premature stop codon, 17 to cause a gain in coding sequence and 20,828 predicted to be non-synonymous. We identified 1...,122 indels in coding sequences, 832 predicted to cause frame shift, 89 predicted to be inframe insertion and 115 to be inframe deletion. We detected a higher level of genetic variation in the Jutland bull compared to similar data from Holstein cattle...

  7. SARS-CoV Genome Polymorphism: A Bioinformatics Study

    Institute of Scientific and Technical Information of China (English)

    Gordana M. Pavlović-Lazetić; Nenad S. Mitić; Andrija M. Tomović; Mirjana D. Pavlović; Miloš V. Beljanski

    2005-01-01

    A dataset of 103 SARS-CoV isolates (101 human patients and 2 palm civets) was investigated on different aspects of genome polymorphism and isolate classification. The number and the distribution of single nucleotide variations (SNVs) and insertions and deletions, with respect to a "profile", were determined and discussed ("profile" being a sequence containing the most represented letter per position). Distribution of substitution categories per codon positions, as well as synonymous and non-synonymous substitutions in coding regions of annotated isolates, was determined, along with amino acid (a.a.) property changes. Similar analysis was performed for the spike (S) protein in all the isolates (55 of them being predicted for the first time). The ratio Ka/Ks confirmed that the S gene was subjected to Darwinian selection during virus transmission from animals to humans. Isolates from the dataset were classified according to genome polymorphism and genotypes. Genome polymorphism yields two groups, one with a small number of SNVs and another with a large number of SNVs, with up to four subgroups with respect to insertions and deletions. We identified three basic nine-locus genotypes: TTTT/TTCGG, CGCC/TTCAT, and TGCC/TTCGT, with four subgenotypes. Both classifications proposed are in accordance with the new insights into possible epidemiological spread, both in space and time.
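
    The "profile" defined in this abstract (the most represented letter per position) can be illustrated with a minimal Python sketch. It assumes the isolates are already aligned to equal length with '-' marking gaps; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter


def profile_sequence(aligned):
    """Return the most represented letter at each column of an alignment."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    # Ties are resolved in favour of the letter seen first in the column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def variations(seq, profile):
    """List (position, profile letter, isolate letter) for every mismatch."""
    return [(i, p, s) for i, (p, s) in enumerate(zip(profile, seq)) if p != s]


# Toy alignment (hypothetical data): one substitution and one deletion.
isolates = ["ACGTAC", "ACGTAC", "ATGTAC", "ACG-AC"]
profile = profile_sequence(isolates)
substitution = variations(isolates[2], profile)
deletion = variations(isolates[3], profile)
```

    Each isolate is then described by its list of differences from the profile, which is the kind of per-isolate count on which a grouping by number of SNVs could be based.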

  8. Defining "mutation" and "polymorphism" in the era of personal genomics.

    Science.gov (United States)

    Karki, Roshan; Pandya, Deep; Elston, Robert C; Ferlini, Cristiano

    2015-07-15

    The growing advances in DNA sequencing tools have made analyzing the human genome cheaper and faster. While such analyses are intended to identify complex variants, related to disease susceptibility and efficacy of drug responses, they have blurred the definitions of mutation and polymorphism. In the era of personal genomics, it is critical to establish clear guidelines regarding the use of a reference genome. Nowadays DNA variants are called as differences in comparison to a reference. In a sequencing project Single Nucleotide Polymorphisms (SNPs) and DNA mutations are defined as DNA variants detectable in >1 % or genomic sequence. We propose to solve this nomenclature dilemma by defining mutations as DNA variants obtained in a paired sequencing project including the germline DNA of the same individual as a reference. Moreover, the term mutation should be accompanied by a qualifying prefix indicating whether the mutation occurs only in somatic cells (somatic mutation) or also in the germline (germline mutation). We believe this distinction in definition will help avoid confusion among researchers and support the practice of sequencing the germline and somatic tissues in parallel to classify the DNA variants thus defined as mutations.

  9. Genome size, karyotype polymorphism and chromosomal evolution in Trypanosoma cruzi.

    Directory of Open Access Journals (Sweden)

    Renata T Souza

    BACKGROUND: The Trypanosoma cruzi genome was sequenced from a hybrid strain (CL Brener). However, high allelic variation and the repetitive nature of the genome have prevented the complete linear sequence of chromosomes being determined. Determining the full complement of chromosomes and establishing syntenic groups will be important in defining the structure of T. cruzi chromosomes. A large amount of information is now available for T. cruzi and Trypanosoma brucei, providing the opportunity to compare and describe the overall patterns of chromosomal evolution in these parasites. METHODOLOGY/PRINCIPAL FINDINGS: The genome sizes, repetitive DNA contents, and the numbers and sizes of chromosomes of nine strains of T. cruzi from four lineages (TcI, TcII, TcV and TcVI) were determined. The genome of the TcI group was statistically smaller than other lineages, with the exception of the TcI isolate Tc1161 (José-IMT). Satellite DNA content was correlated with genome size for all isolates, but this was not accompanied by simultaneous amplification of retrotransposons. Regardless of chromosomal polymorphism, large syntenic groups are conserved among T. cruzi lineages. Duplicated chromosome-sized regions were identified and could be retained as paralogous loci, increasing the dosage of several genes. By comparing T. cruzi and T. brucei chromosomes, homologous chromosomal regions in T. brucei were identified. Chromosomes Tb9 and Tb11 of T. brucei share regions of syntenic homology with three and six T. cruzi chromosomal bands, respectively. CONCLUSIONS: Despite genome size variation and karyotype polymorphism, T. cruzi lineages exhibit conservation of chromosome structure. Several syntenic groups are conserved among all isolates analyzed in this study. The syntenic regions are larger than expected if rearrangements occur randomly, suggesting that they are conserved owing to positive selection. Mapping of the syntenic regions on T. cruzi chromosomal bands

  10. The association of insertions/deletions (INDELs) and variable number tandem repeats (VNTRs) with obesity and its related traits and complications.

    Science.gov (United States)

    Say, Yee-How

    2017-06-14

    Despite the fact that insertions/deletions (INDELs) are the second most common type of genetic variations and variable number tandem repeats (VNTRs) represent a large portion of the human genome, they have received far less attention than single nucleotide polymorphisms (SNPs) and larger forms of structural variation like copy number variations (CNVs), especially in genome-wide association studies (GWAS) of complex diseases like polygenic obesity. This is exemplified by the vast amount of review papers on the role of SNPs and CNVs in obesity, its related traits (like anthropometric measurements, biochemical variables, and eating behavior), and its related complications (like hypertension, hypertriglyceridemia, hypercholesterolemia, and insulin resistance-collectively known as metabolic syndrome). Hence, this paper reviews the types of INDELs and VNTRs that have been studied for association with obesity and its related traits and complications. These INDELs and VNTRs could be found in the obesity loci or genes from the earliest GWAS and candidate gene association studies, like FTO, genes in the leptin-proopiomelanocortin pathway, and UCP2/3. Given the important role of the brain serotonergic and dopaminergic reward system in obesity susceptibility, the association of INDELs and VNTRs in these neurotransmitters' metabolism and transport genes with obesity is also reviewed. Next, the role of INS VNTR in obesity and its related traits is questionable, since recent large-scale studies failed to replicate the earlier positive associations. As obesity results in chronic low-grade inflammation of the adipose tissue, the proinflammatory cytokine gene IL1RA and anti-inflammatory cytokine gene IL4 have VNTRs that are implicated in obesity. A systemic proinflammatory state in combination with activation of the renin-angiotensin system and decreased nitric oxide bioavailability as found in obesity leads to endothelial dysfunction. This explains why VNTR and INDEL in eNOS and

  11. Whole-Genome Characteristics and Polymorphic Analysis of Vietnamese Rice Landraces as a Comprehensive Information Resource for Marker-Assisted Selection

    Science.gov (United States)

    Trinh, Hien; Nguyen, Khoa Truong; Nguyen, Lam Van; Pham, Huy Quang; Huong, Can Thu; Xuan, Tran Dang; Anh, La Hoang; Caccamo, Mario; Ayling, Sarah; Diep, Nguyen Thuy; Trung, Khuat Huu

    2017-01-01

    Next generation sequencing technologies have provided numerous opportunities for application in the study of whole plant genomes. In this study, we present the sequencing and bioinformatic analyses of five typical rice landraces including three indica and two japonica with potential blast resistance. A total of 688.4 million 100 bp paired-end reads have yielded approximately 30-fold coverage to compare with the Nipponbare reference genome. Among them, a small number of reads were mapped to both chromosomes and organellar genomes. Over two million and eight hundred thousand single nucleotide polymorphisms (SNPs) and insertions and deletions (InDels) in indica and japonica lines have been determined, which potentially have significant impacts on multiple transcripts of genes. SNP deserts, contiguous SNP-low regions, were found on chromosomes 1, 4, and 5 of all genomes of rice examined. Based on the distribution of SNPs per 100 kilobase pairs, the phylogenetic relationships among the landraces have been constructed. This is the first step towards revealing several salient features of rice genomes in Vietnam and providing significant information resources to further marker-assisted selection (MAS) in rice breeding programs. PMID:28265566

  12. Improved set of short-tandem-repeat polymorphisms for screening the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Yuan, Bo; Vaske, D.; Weber, J.L. [Marshfield Medical Research Foundation, WI (United States)] [and others]

    1997-02-01

    Short-tandem-repeat (microsatellite) DNA polymorphisms are widely used for screening the human and other genomes in initial linkage mapping. Since the average spacing between polymorphisms in genome screens is usually ≥10 cM and since many thousands of human short-tandem-repeat polymorphisms (STRPs) are now available, optimal subsets of STRPs must be selected for screening. Two screening sets of STRPs for humans have been described in the literature, both of which are based primarily on dinucleotide-repeat polymorphisms. Here we describe our eighth and most recent human screening set, which is based almost entirely on trinucleotide- and tetranucleotide-repeat polymorphisms. 7 refs., 1 tab.

  13. A High-Resolution InDel (Insertion–Deletion) Markers-Anchored Consensus Genetic Map Identifies Major QTLs Governing Pod Number and Seed Yield in Chickpea

    Science.gov (United States)

    Srivastava, Rishi; Singh, Mohar; Bajaj, Deepak; Parida, Swarup K.

    2016-01-01

    Development and large-scale genotyping of user-friendly informative genome/gene-derived InDel markers in natural and mapping populations is vital for accelerating genomics-assisted breeding applications of chickpea with minimal resource expenses. The present investigation employed a high-throughput whole genome next-generation resequencing strategy in low and high pod number parental accessions and homozygous individuals constituting the bulks from each of two inter-specific mapping populations [(Pusa 1103 × ILWC 46) and (Pusa 256 × ILWC 46)] to develop non-erroneous InDel markers at a genome-wide scale. Comparing these high-quality genomic sequences, 82,360 InDel markers with reference to kabuli genome and 13,891 InDel markers exhibiting differentiation between low and high pod number parental accessions and bulks of aforementioned mapping populations were developed. These informative markers were structurally and functionally annotated in diverse coding and non-coding sequence components of genome/genes of kabuli chickpea. The functional significance of regulatory and coding (frameshift and large-effect mutations) InDel markers for establishing marker-trait linkages through association/genetic mapping was apparent. The markers detected a greater amplification (97%) and intra-specific polymorphic potential (58–87%) among a diverse panel of cultivated desi, kabuli, and wild accessions even by using a simpler cost-efficient agarose gel-based assay implicating their utility in large-scale genetic analysis especially in domesticated chickpea with narrow genetic base. Two high-density inter-specific genetic linkage maps generated using aforesaid mapping populations were integrated to construct a consensus 1479 InDel markers-anchored high-resolution (inter-marker distance: 0.66 cM) genetic map for efficient molecular mapping of major QTLs governing pod number and seed yield per plant in chickpea. Utilizing these high-density genetic maps as anchors, three major

  14. Development and Utilization of InDel Markers to Identify Peanut (Arachis hypogaea) Disease Resistance

    Directory of Open Access Journals (Sweden)

    Lifeng Liu

    2015-11-01

    Peanut diseases, such as leaf spot and spotted wilt caused by Tomato Spotted Wilt Virus, can significantly reduce yield and quality. Application of marker assisted plant breeding requires the development and validation of different types of DNA molecular markers. Nearly 10,000 SSR-based molecular markers have been identified by various research groups around the world, but less than 14.5% showed polymorphism in peanut and only 6.4% have been mapped. Low levels of polymorphism limit the application of marker assisted selection (MAS) in peanut breeding programs. Insertion/deletion (InDel) markers have been reported to be more polymorphic than SSRs in some crops. The goals of this study were to identify novel InDel markers and to evaluate their potential use in peanut breeding. Forty-eight InDel markers were developed from conserved sequences of functional genes and tested in a diverse panel of 118 accessions covering six botanical types of cultivated peanut, of which 104 were from the U.S. mini-core. Results showed that 16 InDel markers were polymorphic, with polymorphic information content (PIC) among InDels ranging from 0.017 to 0.660. With respect to botanical types, PICs varied from 0.176 for fastigiata var., 0.181 for hypogaea var., 0.306 for vulgaris var., 0.534 for aequatoriana var., 0.556 for peruviana var., to 0.660 for hirsuta var., implying that aequatoriana var., peruviana var., and hirsuta var. have higher genetic diversity than the other types and provide a basis for gene functional studies. Single marker analysis was conducted to associate specific markers with disease resistance traits. Five InDels from functional genes were identified to be significantly correlated to tomato spotted wilt virus (TSWV) infection and leaf spot, and these novel markers will be utilized to identify disease resistant genotypes in breeding populations.
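
    The abstract reports PIC values without stating the formula used. As a rough illustration, the sketch below assumes the widely used Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at one marker. The function name and example frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for one marker (Botstein-style formula).

    freqs: allele frequencies at the marker; they must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


biallelic_max = pic([0.5, 0.5])   # two equally frequent alleles
monomorphic = pic([1.0])          # a single fixed allele carries no information
```

    Under this assumed formula a marker with two alleles cannot exceed 0.375, so values such as 0.660 would require more than two alleles; whether the study used this definition or a simpler gene-diversity measure is not stated in the abstract.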

  15. Genomic and single nucleotide polymorphism analysis of infectious bronchitis coronavirus.

    Science.gov (United States)

    Abolnik, Celia

    2015-06-01

    Infectious bronchitis virus (IBV) is a Gammacoronavirus that causes a highly contagious respiratory disease in chickens. A QX-like strain was analysed by high-throughput Illumina sequencing and genetic variation across the entire viral genome was explored at the sub-consensus level by single nucleotide polymorphism (SNP) analysis. Thirteen open reading frames (ORFs) in the order 5'-UTR-1a-1ab-S-3a-3b-E-M-4b-4c-5a-5b-N-6b-3'UTR were predicted. The relative frequencies of missense: silent SNPs were calculated to obtain a comparative measure of variability in specific genes. The most variable ORFs in descending order were E, 3b, 5'UTR, N, 1a, S, 1ab, M, 4c, 5a, 6b. The E and 3b protein products play key roles in coronavirus virulence, and RNA folding demonstrated that the mutations in the 5'UTR did not alter the predicted secondary structure. The frequency of SNPs in the Spike (S) protein ORF of 0.67% was below the genomic average of 0.76%. Only three SNPs were identified in the S1 subunit, none of which were located in hypervariable region (HVR) 1 or HVR2. The S2 subunit was considerably more variable containing 87% of the polymorphisms detected across the entire S protein. The S2 subunit also contained a previously unreported multi-A insertion site and a stretch of four consecutive mutated amino acids, which mapped to the stalk region of the spike protein. Template-based protein structure modelling produced the first theoretical model of the IBV spike monomer. Given the lack of diversity observed at the sub-consensus level, the tenet that the HVRs in the S1 subunit are very tolerant of amino acid changes produced by genetic drift is questioned.

  16. Complete Genome Sequence of Geobacillus thermoglucosidasius NCIMB 11955, the Progenitor of a Bioethanol Production Strain

    Science.gov (United States)

    Sheng, Lili; Zhang, Ying

    2016-01-01

    The industrially important thermophile Geobacillus thermoglucosidasius has the potential to produce chemicals and fuels from biomass-derived sugar feedstocks. Here, we present the genome sequence of strain NCIMB 11955, the progenitor of an ethanologenic industrial strain, revealing 11 single-nucleotide polymorphisms and 2 indels compared to strain DSM 2542 and two novel plasmids. PMID:27688322

  17. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    Science.gov (United States)

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  18. Analysis of the indel at the ARMS2 3′UTR in age-related macular degeneration

    Science.gov (United States)

    Wang, Gaofeng; Spencer, Kylee L.; Scott, William K.; Whitehead, Patrice; Court, Brenda L.; Ayala-Haedo, Juan; Mayo, Ping; Schwartz, Stephen G.; Kovach, Jaclyn L.; Gallins, Paul; Polk, Monica; Agarwal, Anita; Postel, Eric A.; Haines, Jonathan L.; Pericak-Vance, Margaret A.

    2010-01-01

    Controversy remains as to which gene at the chromosome 10q26 locus confers risk for age-related macular degeneration (AMD) and statistical genetic analysis is confounded by the strong linkage disequilibrium (LD) across the region. Functional analysis of related genetic variations could solve this puzzle. Recently Fritsche et al. reported that AMD is associated with unstable ARMS2 transcripts possibly caused by a complex insertion/deletion (indel; consisting of a 443 bp deletion and an adjacent 54 bp insertion) in its 3′UTR (untranslated region). To validate this indel, we sequenced our samples. We found that this indel is even more complex and is composed of two side-by-side indels separated by 17 bp: (1) 9 bp deletion with 10 bp insertion; (2) 417 bp deletion with 27 bp insertion. The indel is significantly associated with the risk of AMD, but is also in strong LD with the non-synonymous single nucleotide polymorphism (SNP) rs10490924 (A69S). We also found that ARMS2 is expressed not only in placenta and retina but also in multiple human tissues. Using quantitative PCR, we found no correlation between the indel and ARMS2 mRNA level in human retina and blood samples. The lack of functional effects of the 3′UTR indel, the amino acid substitution of rs10490924 (A69S) and strong LD between them suggest that A69S, not the indel, is the variant that confers risk of AMD. To our knowledge, this is the first time it has been shown that ARMS2 is widely expressed in human tissues. In conclusion, the indel at the 3′UTR of ARMS2 actually contains two side-by-side indels. The indels are associated with risk of AMD, but not correlated with ARMS2 mRNA level. PMID:20182747

  19. Indel variant analysis of short-read sequencing data with Scalpel.

    Science.gov (United States)

    Fang, Han; Bergmann, Ewa A; Arora, Kanika; Vacic, Vladimir; Zody, Michael C; Iossifov, Ivan; O'Rawe, Jason A; Wu, Yiyang; Jimenez Barron, Laura T; Rosenbaum, Julie; Ronemus, Michael; Lee, Yoon-Ha; Wang, Zihua; Dikoglu, Esra; Jobanputra, Vaidehi; Lyon, Gholson J; Wigler, Michael; Schatz, Michael C; Narzisi, Giuseppe

    2016-12-01

    As the second most common type of variation in the human genome, insertions and deletions (indels) have been linked to many diseases, but the discovery of indels of more than a few bases in size from short-read sequencing data remains challenging. Scalpel (http://scalpel.sourceforge.net) is an open-source software for reliable indel detection based on the microassembly technique. It has been successfully used to discover mutations in novel candidate genes for autism, and it is extensively used in other large-scale studies of human diseases. This protocol gives an overview of the algorithm and describes how to use Scalpel to perform highly accurate indel calling from whole-genome and whole-exome sequencing data. We provide detailed instructions for an exemplary family-based de novo study, but we also characterize the other two supported modes of operation: single-sample and somatic analysis. Indel normalization, visualization and annotation of the mutations are also illustrated. Using a standard server, indel discovery and characterization in the exonic regions of the example sequencing data can be completed in ∼5 h after read mapping.

  20. Genomic polymorphism in symbiotic populations of Photobacterium leiognathi.

    Science.gov (United States)

    Dunlap, Paul V; Jiemjit, Anchalee; Ast, Jennifer C; Pearce, Meghan M; Marques, Ryan R; Lavilla-Pitogo, Celia R

    2004-02-01

    Photobacterium leiognathi forms a bioluminescent symbiosis with leiognathid fishes, colonizing the internal light organ of the fish and providing its host with light used in bioluminescence displays. Strains symbiotic with different species of the fish exhibit substantial phenotypic differences in symbiosis and in culture, including differences in 2-D PAGE protein patterns and profiles of indigenous plasmids. To determine if such differences might reflect a genetically based symbiont-strain/host-species specificity, we profiled the genomes of P. leiognathi strains from leiognathid fishes using PFGE. Individual strains from 10 species of leiognathid fishes exhibited substantial genomic polymorphism, with no obvious similarity among strains; these strains were nonetheless identified as P. leiognathi by 16S rDNA sequence analysis. Profiling of multiple strains from individual host specimens revealed an oligoclonal structure to the symbiont populations; typically one or two genomotypes dominated each population. However, analysis of multiple strains from multiple specimens of the same host species, to determine if the same strain types consistently colonize a host species, demonstrated substantial heterogeneity, with the same genomotype only rarely observed among the symbiont populations of different specimens of the same host species. Colonization of the leiognathid light organ to initiate the symbiosis therefore is likely to be oligoclonal, and specificity of the P. leiognathi/leiognathid fish symbiosis apparently is maintained at the bacterial species level rather than at the level of individual, genomotypically defined strain types.

  1. Development of cleaved amplified polymorphic sequence markers and a CAPS-based genetic linkage map in watermelon (Citrullus lanatus [Thunb.] Matsum. and Nakai) constructed using whole-genome re-sequencing data.

    Science.gov (United States)

    Liu, Shi; Gao, Peng; Zhu, Qianglong; Luan, Feishi; Davis, Angela R; Wang, Xiaolu

    2016-03-01

    Cleaved amplified polymorphic sequence (CAPS) markers are useful tools for detecting single nucleotide polymorphisms (SNPs). This study detected and converted SNP sites into CAPS markers based on high-throughput re-sequencing data in watermelon, for linkage map construction and quantitative trait locus (QTL) analysis. Two inbred lines, Cream of Saskatchewan (COS) and LSW-177, had been re-sequenced and analyzed with a self-compiled Perl script for CAPS marker development. 88.7% and 78.5% of the assembled sequences of the two parental materials could map to the reference watermelon genome, respectively. Comparative assembled genome data analysis provided 225,693 and 19,268 SNPs and indels between the two materials. 532 pairs of CAPS markers were designed with 16 restriction enzymes, among which 271 pairs of primers gave distinct bands of the expected length and polymorphic bands, via PCR and enzyme digestion, with a polymorphic rate of 50.94%. Using the new CAPS markers, an initial CAPS-based genetic linkage map was constructed with the F2 population, spanning 1836.51 cM with 11 linkage groups and 301 markers. 12 QTLs were detected related to fruit flesh color, length, width, shape index, and brix content. These newly developed CAPS markers will be a valuable resource for breeding programs and genetic studies of watermelon.
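
    A CAPS marker works because a SNP creates or destroys a restriction site, so the two alleles give different fragment patterns after digestion. The Python sketch below illustrates that idea only; it is not the study's Perl script. The amplicon sequences are invented, and EcoRI (recognition site GAATTC, cutting after the first base) is used purely as a familiar example enzyme.

```python
def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset: number of bases into the site at which the enzyme cuts.
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


# Hypothetical 26 bp amplicons that differ by one SNP inside the site.
allele_a = "ACGTACGTAC" + "GAATTC" + "TGCATGCATG"   # site intact
allele_b = "ACGTACGTAC" + "GAGTTC" + "TGCATGCATG"   # SNP removes the site

bands_a = digest(allele_a, "GAATTC", 1)
bands_b = digest(allele_b, "GAATTC", 1)
is_caps_candidate = bands_a != bands_b

# Polymorphic rate reported in the abstract: 271 of 532 primer pairs.
polymorphic_rate = round(271 / 532 * 100, 2)
```

    The last line reproduces the 50.94% polymorphic rate quoted in the abstract from its own counts.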

  2. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
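
    The abstract does not say which tool was used to locate SSR loci. As a hedged illustration of the general idea, the Python sketch below scans a sequence for perfect di- and tri-nucleotide repeats with regular expressions. The minimum repeat counts (6 and 5) and the test sequence are assumptions chosen for the example, not parameters from the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect SSRs; returns sorted (start, unit, repeat_count) tuples.

    min_repeats maps unit length to the minimum number of tandem copies.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}  # assumed thresholds, for illustration only
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured unit followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical sequence containing (AT)7 and (CAG)5.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
ssrs = find_ssrs(example)
```

    Marker density is then the scanned sequence length divided by the number of loci found; the reported 4,792 loci at one SSR per 22.95 kb correspond to roughly 110 Mb of sequence.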

  3. Increasing the number of single nucleotide polymorphisms used in genomic evaluation of dairy cattle

    Science.gov (United States)

    GeneSeek designed a new version of the GeneSeek Genomic Profiler HD BeadChip for Dairy Cattle, which had >77,000 single nucleotide polymorphisms (SNPs). A set of >140,000 SNPs was selected that included all SNPs on the existing GeneSeek chip, all SNPs used in U.S. national genomic evaluations, SNPs ...

  4. NIG_MoG: a mouse genome navigator for exploring intersubspecific genetic polymorphisms.

    Science.gov (United States)

    Takada, Toyoyuki; Yoshiki, Atsushi; Obata, Yuichi; Yamazaki, Yukiko; Shiroishi, Toshihiko

    2015-08-01

    The National Institute of Genetics Mouse Genome database (NIG_MoG; http://molossinus.lab.nig.ac.jp/msmdb/) primarily comprises the whole-genome sequence data of two inbred mouse strains, MSM/Ms and JF1/Ms. These strains were established at NIG and originated from the Japanese subspecies Mus musculus molossinus. NIG_MoG provides visualized genome polymorphism information, browsing single-nucleotide polymorphisms and short insertions and deletions in the genomes of MSM/Ms and JF1/Ms with respect to C57BL/6J (whose genome is predominantly derived from the West European subspecies M. m. domesticus). This allows users, especially wet-lab biologists, to intuitively recognize intersubspecific genome divergence in these mouse strains using visual data. The database also supports the in silico screening of bacterial artificial chromosome (BAC) clones that contain genomic DNA from MSM/Ms and the standard classical laboratory strain C57BL/6N. NIG_MoG is thus a valuable navigator for exploring mouse genome polymorphisms and BAC clones that are useful for studies of gene function and regulation based on intersubspecific genome divergence.

  5. Hopeful (protein InDel) monsters?

    Science.gov (United States)

    Tóth-Petróczy, Agnes; Tawfik, Dan S

    2014-06-10

    In this issue of Structure, Arpino and colleagues describe in atomic detail how a protein stomachs a deletion within a helix, an event that rarely occurs in nature or in the lab. Can insertions and deletions (InDels) trigger dramatic structural transitions?

  6. Prediction of protein-destabilizing polymorphisms by manual curation with protein structure.

    Directory of Open Access Journals (Sweden)

    Craig Alan Gough

    The relationship between sequence polymorphisms and human disease has been studied mostly in terms of effects of single nucleotide polymorphisms (SNPs) leading to single amino acid substitutions that change protein structure and function. However, less attention has been paid to more drastic sequence polymorphisms which cause premature termination of a protein's sequence or large changes, insertions, or deletions in the sequence. We have analyzed a large set (n = 512) of insertions and deletions (indels) and single nucleotide polymorphisms causing premature termination of translation in disease-related genes. Prediction of protein-destabilization effects was performed by graphical presentation of the locations of polymorphisms in the protein structure, using the Genomes TO Protein (GTOP) database, and manual annotation with a set of specific criteria. Protein-destabilization was predicted for 44.4% of the nonsense SNPs, 32.4% of the frameshifting indels, and 9.1% of the non-frameshifting indels. A prediction of nonsense-mediated decay allowed us to infer which truncated proteins would actually be translated as defective proteins. These cases included the proteins linked to diseases inherited dominantly, suggesting a relation between these diseases and toxic aggregation. Our approach would be useful in identifying potentially aggregation-inducing polymorphisms that may have pathological effects.

  7. Fast and sensitive detection of indels induced by precise gene targeting

    DEFF Research Database (Denmark)

    Yang, Zhang; Steentoft, Catharina; Hauge, Camilla

    2015-01-01

    The nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high throughput studies. A major limitation in application of precise gene editing lies in lack of sensitive and fast methods to detect and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining leading to introduction of insertions and deletions (indels) at the target site. These indels are often small and difficult and laborious to detect...

  8. Detection of genome-wide polymorphisms in the AT-rich Plasmodium falciparum genome using a high-density microarray

    Directory of Open Access Journals (Sweden)

    Huyen Yentram

    2008-08-01

    Background: Genetic mapping is a powerful method to identify mutations that cause drug resistance and other phenotypic changes in the human malaria parasite Plasmodium falciparum. For efficient mapping of a target gene, it is often necessary to genotype a large number of polymorphic markers. Currently, a community effort is underway to collect single nucleotide polymorphisms (SNP) from the parasite genome. Here we evaluate polymorphism detection accuracy of a high-density 'tiling' microarray with 2.56 million probes by comparing single feature polymorphism (SFP) calls from the microarray with known SNP among parasite isolates. Results: We found that probe GC content, SNP position in a probe, probe coverage, and signal ratio cutoff values were important factors for accurate detection of SFP in the parasite genome. We established a set of SFP calling parameters that could predict mSFP (SFP called by multiple overlapping probes) with high accuracy (≥ 94%) and identified 121,087 mSFP genome-wide from five parasite isolates including 40,354 unique mSFP (excluding those from multi-gene families) and ~18,000 new mSFP, producing a genetic map with an average of one unique mSFP per 570 bp. Genomic copy number variation (CNV) among the parasites was also cataloged and compared. Conclusion: A large number of mSFP were discovered from the P. falciparum genome using a high-density microarray, most of which were in clusters of highly polymorphic genes at chromosome ends. Our method for accurate mSFP detection and the mSFP identified will greatly facilitate large-scale studies of genome variation in the P. falciparum parasite and provide useful resources for mapping important parasite traits.

  9. Two sequence alterations, a 136 bp InDel and an A/C polymorphic site, in the S5 locus are associated with spikelet fertility of indica-japonica hybrid in rice.

    Science.gov (United States)

    Ji, Qing; Lu, Jufei; Chao, Qing; Zhang, Yan; Zhang, Meijing; Gu, Minghong; Xu, Mingliang

    2010-01-01

    The rice indica/japonica hybrid shows strong heterosis. However, such inter-subspecific hybrids cannot be used directly in rice production because of their low spikelet fertility. The S5 locus has been shown to be associated with the fertility of indica/japonica hybrids, and its S5n allele from wide-compatibility varieties (WCVs) is able to overcome the fertility barrier. In the present study, we report the causal sites in the S5 locus responsible for compatibility of the indica/japonica hybrid. Fine-mapping of the S5 locus using 11 test-cross families pinpoints a candidate S5 locus encoding aspartic protease (Asp). Intragenic recombination within the Asp gene happened in a number of recombinants, resulting in chimeric S5j-S5n alleles. Just like S5n, the chimeric S5j-S5n allele displayed higher spikelet fertility when combined with the S5i allele. In the complementary test, however, the S5n allele from WCVs failed to enhance fertilities of the indica/japonica hybrids. Compared to both indica and japonica varieties, all nine WCVs from different resources are characterized by a 136 bp deletion in the Asp N-terminus, which probably renders the S5n allele non-functional. Furthermore, an A/C polymorphic site is detected 1,233 bp downstream of the Asp start codon. The heterozygous A/C site of the Asp gene in the indica/japonica hybrid is believed to be the causal factor for partial sterility. The functional markers based on the two polymorphic sites will be broadly used in developing wide-compatibility rice varieties.

  10. Diversity Suppression-Subtractive Hybridization Array for Profiling Genomic DNA Polymorphisms

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Genomic DNA polymorphisms are very useful for tracing genetic traits and studying biological diversity among species. Here, we present a method we call the "diversity suppression-subtractive hybridization array" for effectively profiling genomic DNA polymorphisms. The method first obtains the subtracted gDNA fragments between any two species by suppression subtraction hybridization (SSH) to establish a subtracted gDNA library, from which diversity SSH arrays are created with the selected subtracted clones. The diversity SSH array hybridizes with the DIG-labeled genomic DNA of the organism to be assayed. Six closely related Dendrobium species were studied as model samples. Four Dendrobium species as testers were used to perform SSH. A total of 617 subtracted positive clones were obtained from four Dendrobium species, and the average ratio of positive clones was 80.3%. We demonstrated that the average percentage of polymorphic fragments of pairwise comparisons of four Dendrobium species was up to 42.4%. A dendrogram of the relatedness of six Dendrobium species was produced according to their polymorphic profiles. The results revealed that the diversity SSH array is a highly effective platform for profiling genomic DNA polymorphisms and dendrograms.

  11. Utilising polymorphisms to achieve allele-specific genome editing in zebrafish

    Directory of Open Access Journals (Sweden)

    Samuel J. Capon

    2017-01-01

    The advent of genome editing has significantly altered genetic research, including research using the zebrafish model. To better understand the selectivity of the commonly used CRISPR/Cas9 system, we investigated single base pair mismatches in target sites and examined how they affect genome editing in the zebrafish model. Using two different zebrafish strains that have been deep sequenced, CRISPR/Cas9 target sites containing polymorphisms between the two strains were identified. These strains were crossed (creating heterozygotes at polymorphic sites) and CRISPR/Cas9 complexes that perfectly complement one strain were injected. Sequencing of targeted sites showed biased, allele-specific editing for the perfectly complementary sequence in the majority of cases (14/19). To test utility, we examined whether phenotypes generated by F0 injection could be internally controlled with such polymorphisms. Targeting of genes bmp7a and chordin showed reduction in the frequency of phenotypes in injected ‘heterozygotes’ compared with injecting the strain with perfect complementarity. Next, injecting CRISPR/Cas9 complexes targeting two separate sites created deletions, but deletions were biased to selected chromosomes when one CRISPR/Cas9 target contained a polymorphism. Finally, integration of loxP sequences occurred preferentially in alleles with perfect complementarity. These experiments demonstrate that single nucleotide polymorphisms (SNPs) present throughout the genome can be utilised to increase the efficiency of in cis genome editing using CRISPR/Cas9 in the zebrafish model.

  12. Association between the polymorphisms of angiotensin converting enzyme (Peptidyl-Dipeptidase A) INDEL mutation (I/D) and Angiotensin II type I receptor (A1166C) and breast cancer among post menopausal Egyptian females

    Directory of Open Access Journals (Sweden)

    Rania Mohamed El Sharkawy

    2014-09-01

    Results: A statistically significant difference in AT1R A1166C SNP genotype frequencies was found among the studied groups. The patients group showed higher frequency of “CC” (2.9% vs 0%) and “AC” (44.3% vs 24%) and lower frequency of “AA” genotype (52.9% vs 76%) than controls. The patients also showed significantly higher frequency of allele “C” (25% vs 12%), which was associated with increased breast cancer risk with an Odds ratio of 2.4444 (95% CI: 1.1967–4.9931). Testing the dominant model of inheritance revealed a statistically higher frequency of exposed genotypes “AC and CC” among the patients group (47.1% vs 24%, respectively; p = 0.013) with substantial increase in breast cancer risk among the exposed genotypes with an Odds ratio of 2.8243 (95% CI: 1.2679–6.2913). The present study demonstrated that the AC and CC genotypes of AT1R A1166C SNP and increased BMI can be considered as predictors for breast cancer risk among post menopausal Egyptian females. Results also revealed that A1166C SNP of AT1R gene and ACE/ID polymorphism could not be considered as predictors for breast cancer prognosis.
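
To make the reported odds ratios easier to follow, here is a small sketch of an odds ratio with a Wald 95% confidence interval computed from a 2x2 table. The counts are an assumption: the abstract gives only percentages, and the figures below (70 patients, 50 controls) were back-calculated as one set of whole numbers consistent with them. The choice of the Wald interval is likewise an assumption about how the interval was obtained.

```python
from math import exp, log, sqrt

def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table."""
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se = sqrt(1 / exp_cases + 1 / unexp_cases
              + 1 / exp_controls + 1 / unexp_controls)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts (assumption, not stated in the abstract):
# allele model: C vs A alleles among 140 patient and 100 control chromosomes
allele_model = odds_ratio_ci(35, 105, 12, 88)
# dominant model: AC+CC vs AA genotypes among 70 patients and 50 controls
dominant_model = odds_ratio_ci(33, 37, 12, 38)
```

With these assumed counts the function gives values that agree with the odds ratios and intervals quoted in the abstract to about three decimal places.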

  13. Diversification and genetic differentiation of cultivated melon inferred from sequence polymorphism in the chloroplast genome

    OpenAIRE

    Tanaka, Katsunori; Akashi, Yukari; FUKUNAGA, Kenji; Yamamoto, Tatsuya; Aierken, Yasheng; Nishida, Hidetaka; Long, Chun Lin; Yoshino, Hiromichi; Sato, Yo-Ichiro; KATO, Kenji

    2013-01-01

    Molecular analysis encouraged discovery of genetic diversity and relationships of cultivated melon (Cucumis melo L.). We sequenced nine inter- and intra-genic regions of the chloroplast genome, about 5500 bp, using 60 melon accessions and six reference accessions of wild species of Cucumis to show intra-specific variation of the chloroplast genome. Sequence polymorphisms were detected among melon accessions and other Cucumis species, indicating intra-specific diversification of the chloroplas...

  14. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; Rosario, del M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

    2009-01-01

    Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole g

  15. Microsatellite interruptions stabilize primate genomes and exist as population-specific single nucleotide polymorphisms within individual human genomes.

    Science.gov (United States)

    Ananda, Guruprasad; Hile, Suzanne E; Breski, Amanda; Wang, Yanli; Kelkar, Yogeshwar; Makova, Kateryna D; Eckert, Kristin A

    2014-07-01

    Interruptions of microsatellite sequences impact genome evolution and can alter disease manifestation. However, human polymorphism levels at interrupted microsatellites (iMSs) are not known at a genome-wide scale, and the pathways for gaining interruptions are poorly understood. Using the 1000 Genomes Phase-1 variant call set, we interrogated mono-, di-, tri-, and tetranucleotide repeats up to 10 units in length. We detected ∼26,000-40,000 iMSs within each of four human population groups (African, European, East Asian, and American). We identified population-specific iMSs within exonic regions, and discovered that known disease-associated iMSs contain alleles present at differing frequencies among the populations. By analyzing longer microsatellites in primate genomes, we demonstrate that single interruptions result in a genome-wide average two- to six-fold reduction in microsatellite mutability, as compared with perfect microsatellites. Centrally located interruptions lowered mutability dramatically, by two to three orders of magnitude. Using a biochemical approach, we tested directly whether the mutability of a specific iMS is lower because of decreased DNA polymerase strand slippage errors. Modeling the adenomatous polyposis coli tumor suppressor gene sequence, we observed that a single base substitution interruption reduced strand slippage error rates five- to 50-fold, relative to a perfect repeat, during synthesis by DNA polymerases α, β, or η. Computationally, we demonstrate that iMSs arise primarily by base substitution mutations within individual human genomes. Our biochemical survey of human DNA polymerase α, β, δ, κ, and η error rates within certain microsatellites suggests that interruptions are created most frequently by low fidelity polymerases. Our combined computational and biochemical results demonstrate that iMSs are abundant in human genomes and are sources of population-specific genetic variation that may affect genome stability. The

  16. Intron Derived Size Polymorphism in the Mitochondrial Genomes of Closely Related Chrysoporthe Species.

    Science.gov (United States)

    Kanzi, Aquillah Mumo; Wingfield, Brenda Diana; Steenkamp, Emma Theodora; Naidoo, Sanushka; van der Merwe, Nicolaas Albertus

    2016-01-01

    In this study, the complete mitochondrial (mt) genomes of Chrysoporthe austroafricana (190,834 bp), C. cubensis (89,084 bp) and C. deuterocubensis (124,412 bp) were determined. Additionally, the mitochondrial genome of another member of the Cryphonectriaceae, namely Cryphonectria parasitica (158,902 bp), was retrieved and annotated for comparative purposes. These genomes showed high levels of synteny, especially in regions including genes involved in oxidative phosphorylation and electron transfer, unique open reading frames (uORFs), ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs), as well as intron positions. Comparative analyses revealed signatures of duplication events, intron number and length variation, and varying intronic ORFs which highlighted the genetic diversity of mt genomes among the Cryphonectriaceae. These mt genomes showed remarkable size polymorphism. The size polymorphism in the mt genomes of these closely related Chrysoporthe species was attributed to the varying number and length of introns, coding sequences and to a lesser extent, intergenic sequences. Compared to publicly available fungal mt genomes, the C. austroafricana mt genome is the second largest in the Ascomycetes thus far.

  17. Human Xq28 Inversion Polymorphism: From Sex Linkage to Genomics--A Genetic Mother Lode

    Science.gov (United States)

    Kirby, Cait S.; Kolber, Natalie; Salih Almohaidi, Asmaa M.; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to…

  18. "Islands of Divergence" in the Atlantic Cod Genome Represent Polymorphic Chromosomal Rearrangements.

    Science.gov (United States)

    Sodeland, Marte; Jorde, Per Erik; Lien, Sigbjørn; Jentoft, Sissel; Berg, Paul R; Grove, Harald; Kent, Matthew P; Arnyasi, Mariann; Olsen, Esben Moland; Knutsen, Halvor

    2016-04-11

    In several species genetic differentiation across environmental gradients or between geographically separate populations has been reported to center at "genomic islands of divergence," resulting in heterogeneous differentiation patterns across genomes. Here, genomic regions of elevated divergence were observed on three chromosomes of the highly mobile fish Atlantic cod (Gadus morhua) within geographically fine-scaled coastal areas. The "genomic islands" extended at least 5, 9.5, and 13 megabases on linkage groups 2, 7, and 12, respectively, and coincided with large blocks of linkage disequilibrium. For each of these three chromosomes, pairs of segregating, highly divergent alleles were identified, with little or no gene exchange between them. These patterns of recombination and divergence mirror genomic signatures previously described for large polymorphic inversions, which have been shown to repress recombination across extensive chromosomal segments. The lack of genetic exchange permits divergence between noninverted and inverted chromosomes in spite of gene flow. For the rearrangements on linkage groups 2 and 12, allelic frequency shifts between coastal and oceanic environments suggest a role in ecological adaptation, in agreement with recently reported associations between molecular variation within these genomic regions and temperature, oxygen, and salinity levels. Elevated genetic differentiation in these genomic regions has previously been described on both sides of the Atlantic Ocean, and we therefore suggest that these polymorphisms are involved in adaptive divergence across the species distributional range. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    Directory of Open Access Journals (Sweden)

    Gopala Krishnan S

    BACKGROUND: Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. RESULTS: We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. CONCLUSIONS: Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  20. Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing.

    Science.gov (United States)

    Vianna, Juliana A; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E; Dantas, Gisele P M

    Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. The penguin genus Pygoscelis comprises three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large number of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 polymorphic loci in at least one species and 15 polymorphic loci in all three species, which allows comparative studies to be performed. Polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species.

  1. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    Science.gov (United States)

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other species of prokaryotes have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.

  2. Polymorphism within the nuclear and 2 micron genomes of Saccharomyces cerevisiae.

    Science.gov (United States)

    Rank, G H; Casey, G P; Xiao, W; Pringle, A T

    1991-08-01

    Seven strains of bakers' yeast were obtained as a representative sample of the Spanish baking industry. The nuclear genome was monitored for polymorphism by transverse alternating field electrophoresis (TAFE) and restriction maps of 2 micron DNA were produced. All seven strains were uniquely different when evaluated by their total chromosomal lengths whereas only two 2 micron variants were defined. There was no apparent correlation between chromosomal and plasmid polymorphism. The extensive chromosomal polymorphism within one 2 micron DNA type indicates the rapid and relatively recent evolution of the nuclear genome. The hybrid origin (S. cerevisiae-S. monacensis) of lager yeast was critically evaluated by TAFE analysis of S. cerevisiae and S. carlsbergensis chromosomes. The absence of corresponding S. cerevisiae chromosomes III and XIII in S. carlsbergensis argued against the hybrid origin of lager strains. We discuss limitations of the hybrid origin hypothesis of industrial yeasts and propose that the molecular coevolution observed in 2 micron DNA serves as a useful additional mechanism for rationalization of some of the structural polymorphism of the nuclear genome.

  3. DNA indels in coding regions reveal selective constraints on protein evolution in the human lineage

    Directory of Open Access Journals (Sweden)

    Messer Philipp W

    2007-10-01

    Background: Insertions and deletions of DNA segments (indels) are, together with substitutions, the major mutational processes that generate genetic variation. Here we focus on recent DNA insertions and deletions in protein coding regions of the human genome to investigate selective constraints on indels in protein evolution. Results: Frequencies of inserted and deleted amino acids differ from background amino acid frequencies in the human proteome. Small amino acids are overrepresented, while hydrophobic, aliphatic and aromatic amino acids are strongly suppressed. Indels are found to be preferentially located in protein regions that do not form important structural domains. Amino acid insertion and deletion rates in genes associated with elementary biochemical reactions (e.g. catalytic activity, ligase activity, electron transport, or catabolic process) are lower compared to those in other genes and are therefore subject to stronger purifying selection. Conclusion: Our analysis indicates that indels in human protein coding regions are subject to distinct levels of selective pressure with regard to their structural impact on the amino acid sequence, as well as to general properties of the genes they are located in. These findings confirm that many commonly accepted characteristics of selective constraints for substitutions are also valid for amino acid insertions and deletions.

  4. PrimeIndel: four-prime-number genetic code for indel decryption and sequence read alignment.

    Science.gov (United States)

    Lam, Ching-Wan

    2014-09-25

    To decrypt a doubly heterozygous sequence (DHS) in order to define the indel mutation for mutation reporting, an algorithm recursively searching the overlapped nucleotide using an offset of nucleotide positions can decrypt the indel without using a reference sequence. However, as genetic code is letter-based, special computer programs are required to run the decryption algorithm. The previous text-based algorithm was converted to a number-based algorithm by expressing DNA sequence from a 4-letter genetic code to a 4-prime-number genetic code, i.e., converting A, C, G, T to 2, 3, 5, and 7. This algorithm based on prime-number genetic code is called PrimeIndel and is executable by spreadsheet. Using prime number coded DNA sequence, the overlapped nucleotide between any 2 positions of the DHS is represented by the greatest common divisor (GCD) of the multiplication product of 2 prime numbers. This algorithm can also be used for aligning multiple overlapping sequence reads by in-silico DHS formation. The indel size of the in-silico formed DHS indicates the positions in the paired sequences for correct alignment. DHSs were successfully decrypted by the prime number-based algorithm and sequence reads were aligned correctly. DNA sequence expressed in prime numbers can be used for the decryption of DHS and the alignment of sequence reads using a well-known mathematical function GCD of a spreadsheet program. PrimeIndel is a useful tool for mutation reporting in clinical laboratories. The software is downloadable from http://www.patho.hku.hk/staff/list/cwlam.htm. Copyright © 2014 Elsevier B.V. All rights reserved.
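
The prime-number encoding described above can be illustrated with a short sketch. This is a simplified reading of the idea, not the published PrimeIndel spreadsheet: it assumes the heterozygous region has already been isolated, that each position of the doubly heterozygous sequence (DHS) is stored as the product of the two primes seen there, and that the indel size is the smallest offset at which every pair of positions shares a common factor. The function names and the `max_size` parameter are hypothetical.

```python
from math import gcd

# Four-prime-number genetic code as given in the abstract.
PRIME_CODE = {"A": 2, "C": 3, "G": 5, "T": 7}

def form_dhs(allele, shift):
    """Superimpose a sequence on a copy of itself shifted by `shift` bases.

    Each DHS position is the product of the two primes observed there.
    """
    codes = [PRIME_CODE[base] for base in allele.upper()]
    return [codes[i] * codes[i + shift] for i in range(len(codes) - shift)]

def infer_indel_size(dhs, max_size):
    """Return the smallest offset at which all position pairs share a base."""
    for offset in range(1, min(max_size, len(dhs) - 1) + 1):
        # gcd > 1 means the two positions have at least one nucleotide in common.
        if all(gcd(a, b) > 1 for a, b in zip(dhs, dhs[offset:])):
            return offset
    return None

dhs = form_dhs("AGAGTCCATG", 2)      # simulated 2-bp heterozygous indel
indel_size = infer_indel_size(dhs, 5)
```

For short or repetitive sequences a smaller offset can pass the check by chance, so a practical tool would need longer reads or additional validation.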

  5. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  6. CAG-encoded polyglutamine length polymorphism in the human genome

    Directory of Open Access Journals (Sweden)

    Hayden Michael R

    2007-05-01

    Background: Expansion of polyglutamine-encoding CAG trinucleotide repeats has been identified as the pathogenic mutation in nine different genes associated with neurodegenerative disorders. The majority of individuals clinically diagnosed with spinocerebellar ataxia do not have mutations within known disease genes, and it is likely that additional ataxias or Huntington disease-like disorders will be found to be caused by this common mutational mechanism. We set out to determine the length distributions of CAG-polyglutamine tracts for the entire human genome in a set of healthy individuals in order to characterize the nature of polyglutamine repeat length variation across the human genome, to establish the background against which pathogenic repeat expansions can be detected, and to prioritize candidate genes for repeat expansion disorders. Results: We found that repeats, including those in known disease genes, have unique distributions of glutamine tract lengths, as measured by fragment analysis of PCR-amplified repeat regions. This emphasizes the need to characterize each distribution and avoid making generalizations between loci. The best predictors of known disease genes were occurrence of a long CAG-tract uninterrupted by CAA codons in their reference genome sequence, and high glutamine tract length variance in the normal population. We used these parameters to identify eight priority candidate genes for polyglutamine expansion disorders. Twelve CAG-polyglutamine repeats were invariant and these can likely be excluded as candidates. We outline some confusion in the literature about this type of data, difficulties in comparing such data between publications, and its application to studies of disease prevalence in different populations. Analysis of Gene Ontology-based functions of CAG-polyglutamine-containing genes provided a visual framework for interpretation of these genes' functions. All nine known disease genes were involved in DNA

  7. DivStat: a user-friendly tool for single nucleotide polymorphism analysis of genomic diversity.

    Directory of Open Access Journals (Sweden)

    Inês Soares

    Recent developments have led to an enormous increase in publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large-scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.

  8. Intra-genomic ribosomal RNA polymorphism and morphological variation in Elphidium macellum suggests inter-specific hybridization in foraminifera.

    Directory of Open Access Journals (Sweden)

    Loïc Pillet

    Elphidium macellum is a benthic foraminifer commonly found in the Patagonian fjords. To test whether its highly variable morphotypes are ecophenotypes or different genotypes, we analysed 70 sequences of the SSU rRNA gene from 25 specimens. Unexpectedly, we identified 11 distinct ribotypes, with up to 5 ribotypes co-occurring within the same specimen. The ribotypes differ by varying blocks of sequence located at the end of stem-loop motifs in the three expansion segments specific to foraminifera. These changes, distinct from typical SNPs and indels, directly affect the structure of the expansion segments. Their mosaic distribution suggests that ribotypes originated by recombination of two or more clusters of ribosomal genes. We propose that this expansion segment polymorphism (ESP) could originate from hybridization of morphologically different populations of Patagonian Elphidium. We speculate that the complex geological history of Patagonia enhanced divergence of coastal foraminiferal species and contributed to increasing genetic and morphological variation.

  9. High-resolution genomic fingerprinting of Campylobacter jejuni and Campylobacter coli by analysis of amplified fragment length polymorphisms

    DEFF Research Database (Denmark)

    Kokotovic, Branko; On, Stephen L.W.

    1999-01-01

    A method for high-resolution genomic fingerprinting of the enteric pathogens Campylobacter jejuni and Campylobacter coli, based on the determination of amplified fragment length polymorphism, is described. The potential of this method for molecular epidemiological studies of these species...

  10. Human tyrosine hydroxylase (TH) genomic fragment (pHGTH4) identifies a PstI polymorphism

    Energy Technology Data Exchange (ETDEWEB)

    Kelsoe, J.R.; Stubblefield, B.K.; Ginns, E.I. (National Institute of Mental Health, Bethesda, MD (USA))

    1988-08-11

    pHGTH4 is a 2.3 kb Bam HI genomic fragment of tyrosine hydroxylase isolated from a lambda EMBL3 Sau3A partial digest human genomic library prepared from lymphoblasts. The fragment was subcloned into the Bam HI site of pBLUESCRIPT. The absence of a polymorphic Pst I site results in the 1.3 kb fragment, A1; whereas its presence results in a 0.7 kb and a 0.6 kb fragment, A2, as shown in the figure. It was located at 11p15 by in situ hybridization. Mendelian inheritance was demonstrated in a 23 member family with 12 children. Pst I was not polymorphic (all homoallelic for A2) in a panel of 6 subjects from Amish pedigree 110 (IMR 884), in which two other chromosome 11 RFLP's have been reported to be linked to manic-depressive illness.

  11. Genomic relations among 31 species of Mammillaria haworth (Cactaceae) using random amplified polymorphic DNA.

    Science.gov (United States)

    Mattagajasingh, Ilwola; Mukherjee, Arup Kumar; Das, Premananda

    2006-01-01

    Thirty-one species of Mammillaria were selected to study the molecular phylogeny using random amplified polymorphic DNA (RAPD) markers. High amount of mucilage (gelling polysaccharides) present in Mammillaria was a major obstacle in isolating good quality genomic DNA. The CTAB (cetyl trimethyl ammonium bromide) method was modified to obtain good quality genomic DNA. Twenty-two random decamer primers resulted in 621 bands, all of which were polymorphic. The similarity matrix value varied from 0.109 to 0.622 indicating wide variability among the studied species. The dendrogram obtained from the unweighted pair group method using arithmetic averages (UPGMA) analysis revealed that some of the species did not follow the conventional classification. The present work shows the usefulness of RAPD markers for genetic characterization to establish phylogenetic relations among Mammillaria species.

  12. Heteropolymeric triplex-based genomic assay to detect pathogens or single-nucleotide polymorphisms in human genomic samples.

    Directory of Open Access Journals (Sweden)

    Jasmine I Daksis

    Human genomic samples are complex and are considered difficult to assay directly without denaturation or PCR amplification. We report the use of a base-specific heteropolymeric triplex, formed by native duplex genomic target and an oligonucleotide third strand probe, to assay for low copy pathogen genomes present in a sample also containing human genomic duplex DNA, or to assay human genomic duplex DNA for Single Nucleotide Polymorphisms (SNPs), without PCR amplification. Wild-type and mutant probes are used to identify triplexes containing FVL G1691A, MTHFR C677T and CFTR mutations. The specific triplex structure forms rapidly at room temperature in solution and may be detected without a separation step. YOYO-1, a fluorescent bis-intercalator, promotes and signals the formation of the specific triplex. Genomic duplexes may be assayed homogeneously with single base pair resolution. The specific triple-stranded structures of the assay may approximate homologous recombination intermediates, which various models suggest may form in either the major or minor groove of the duplex. The bases of the stable duplex target are rendered specifically reactive to the bases of the probe because of the activity of intercalated YOYO-1, which is known to decondense duplex locally 1.3 fold. This may approximate the local decondensation effected by recombination proteins such as RecA in vivo. Our assay, while involving triplex formation, is sui generis, as it is not homopurine sequence-dependent, as are "canonical triplexes". Rather, the base pair-specific heteropolymeric triplex of the assay is conformation-dependent. The highly sensitive diagnostic assay we present allows for the direct detection of base sequence in genomic duplex samples, including those containing human genomic duplex DNA, thereby bypassing the inherent problems and cost associated with conventional PCR based diagnostic assays.

  13. Exploration of presence/absence variation and corresponding polymorphic markers in soybean genome

    Institute of Scientific and Technical Information of China (English)

    Yufeng Wang; Tuanjie Zhao; Junyi Gai; Jiangjie Lu; Shouyi Chen; Liping Shu; Reid G Palmer; Guangnan Xing; Yan Li; Shouping Yang; Deyue Yu

    2014-01-01

    This study was designed to reveal the genome-wide distribution of presence/absence variation (PAV) and to establish a database of polymorphic PAV markers in soybean. The 33 soybean whole-genome sequences were compared to each other with that of Williams 82 as a reference genome. A total of 33,127 PAVs were detected and 28,912 PAV markers with their primer sequences were designed as the database NJAUSoyPAV_1.0. The PAVs were scattered across the whole genome, while only 518 (1.8%) overlapped with simple sequence repeats (SSRs) in the BARCSOYSSR_1.0 database. In a random sample of 800 PAVs, 713 (89.13%) showed polymorphism among the 12 differential genotypes. Using 126 PAVs and 108 SSRs to test a Chinese soybean germplasm collection composed of 828 Glycine soja Sieb. et Zucc. and Glycine max (L.) Merr. accessions, the per locus allele number and its variation appeared lower in PAVs than in SSRs. The distinctness among alleles/bands of PCR (polymerase chain reaction) products was better in PAVs than in SSRs, indicating potential for accurate marker-assisted allele selection. The association mapping results showed that SSR + PAV was more powerful than either single marker system. The NJAUSoyPAV_1.0 database has enriched the source of PCR markers, and may fit materials with a range of per locus allele numbers, if jointly used with SSR markers.

  14. Genome-wide DNA polymorphisms in Kavuni, a traditional rice cultivar with nutritional and therapeutic properties.

    Science.gov (United States)

    Rathinasabapathi, Pasupathi; Purushothaman, Natarajan; Parani, Madasamy

    2016-05-01

    Although rice genome was sequenced in the year 2002, efforts in resequencing the large number of available accessions, landraces, traditional cultivars, and improved varieties of this important food crop are limited. We have initiated resequencing of the traditional cultivars from India. Kavuni is an important traditional rice cultivar from South India that attracts premium price for its nutritional and therapeutic properties. Whole-genome sequencing of Kavuni using Illumina platform and SNPs analysis using Nipponbare reference genome identified 1 150 711 SNPs of which 377 381 SNPs were located in the genic regions. Non-synonymous SNPs (62 708) were distributed in 19 251 genes, and their number varied between 1 and 115 per gene. Large-effect DNA polymorphisms (7769) were present in 3475 genes. Pathway mapping of these polymorphisms revealed the involvement of genes related to carbohydrate metabolism, translation, protein-folding, and cell death. Analysis of the starch biosynthesis related genes revealed that the granule-bound starch synthase I gene had T/G SNPs at the first intron/exon junction and a two-nucleotide combination, which were reported to favour high amylose content and low glycemic index. The present study provided a valuable genomics resource to study the rice varieties with nutritional and medicinal properties.

  15. Effects of As2O3 on DNA methylation, genomic instability, and LTR retrotransposon polymorphism in Zea mays.

    Science.gov (United States)

    Erturk, Filiz Aygun; Aydin, Murat; Sigmaz, Burcu; Taspinar, M Sinan; Arslan, Esra; Agar, Guleray; Yagci, Semra

    2015-12-01

    Arsenic is a well-known toxic substance for living organisms. However, limited efforts have been made to study its capacity to alter DNA methylation, genomic instability, and long terminal repeat (LTR) retrotransposon polymorphism in different crops. In the present study, the effects of As2O3 (arsenic trioxide) on LTR retrotransposon polymorphism and DNA methylation as well as DNA damage in Zea mays seedlings were investigated. The results showed that all arsenic doses decreased genomic template stability (GTS) and increased changes in Random Amplified Polymorphic DNA (RAPD) profiles (DNA damage). In addition, increases in DNA methylation and LTR retrotransposon polymorphism, which characterize a model explaining epigenetic changes in gene expression, were also found. The results of this experiment clearly show that arsenic has an epigenetic effect as well as a genotoxic effect. In particular, the increased polymorphism of some LTR retrotransposons under arsenic stress may be part of the defense system against the stress.

  16. Substitutions of short heterologous DNA segments of intragenomic or extragenomic origins produce clustered genomic polymorphisms

    DEFF Research Database (Denmark)

    Harms, Klaus; Lunnan, Asbjørn; Hülter, Nils;

    2016-01-01

    In a screen for unexplained mutation events we identified a previously unrecognized mechanism generating clustered DNA polymorphisms such as microindels and cumulative SNPs. The mechanism, short-patch double illegitimate recombination (SPDIR), facilitates short single-stranded DNA molecules...... to invade and replace genomic DNA through two joint illegitimate recombination events. SPDIR is controlled by key components of the cellular genome maintenance machinery in the gram-negative bacterium Acinetobacter baylyi. The source DNA is primarily intragenomic but can also be acquired through horizontal...... gene transfer. The DNA replacements are nonreciprocal and locus independent. Bioinformatic approaches reveal occurrence of SPDIR events in the gram-positive human pathogen Streptococcus pneumoniae and in the human genome....

  17. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic regions. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species, whose genomic sequences have been assembled. This technology will particularly facilitate the development of high density molecular marker maps, essential for

  18. Using Allele-Specific PCR with Molecular Beams as a Means for Genotyping the Diallelic Indels

    Energy Technology Data Exchange (ETDEWEB)

    Doktycz, M.J.; Weber, J.L. (Marshfield Medical Research Foundation)

    2000-06-01

    The first Specific Aim for this grant was to identify and characterize an average of 500 human insertion/deletion polymorphisms per grant year (1500 total). This task was carried out entirely at MMRF. They substantially exceeded this goal by confirming about 2,300 diallelic indels. Complete characterization information for these polymorphisms is available from the Marshfield web site. A manuscript describing results for the first 2,000 diallelic indels was published earlier this year in the American Journal of Human Genetics. The Second Specific Aim of the grant was to investigate and develop improved methods for analysis of diallelic polymorphisms using miniaturized DNA arrays. The initial genotyping technology efforts focused on various hybridization and extension protocols with oligo arrays on flow-through channel glass. Channel glass is a porous material that permits reagents to be passed through the arrays. They devoted roughly 19 months at the beginning of the grant in pursuit of this methodology, but for various technological reasons, progress was limited.

  19. Draft genome sequence of Coxiella burnetii Dog Utad, a strain isolated from a dog-related outbreak of Q fever

    Directory of Open Access Journals (Sweden)

    F. D’amato

    2014-07-01

    Coxiella burnetii Dog Utad, with a 2 008 938 bp genome, is a strain isolated from a parturient dog responsible for a human familial outbreak of acute Q fever in Nova Scotia, Canada. Its genotype, determined by multispacer typing, is 21, the only one found in Canada that includes Q212, which causes endocarditis. Only 107 single nucleotide polymorphisms and 16 INDELs differed from Q212, suggesting a recent clonal radiation.

  20. Genomic lineages of Rhizobium etli revealed by the extent of nucleotide polymorphisms and low recombination

    Directory of Open Access Journals (Sweden)

    González Víctor

    2011-10-01

    Background: Most of the DNA variations found in bacterial species are in the form of single nucleotide polymorphisms (SNPs), but there is some debate regarding how much of this variation comes from mutation versus recombination. The nitrogen-fixing symbiotic bacterium Rhizobium etli is highly variable in both genomic structure and gene content. However, no previous report has provided a detailed genomic analysis of this variation at the nucleotide level or the role of recombination in generating diversity in this bacterium. Here, we compared draft genomic sequences versus complete genomic sequences to obtain reliable measures of genetic diversity and then estimated the role of recombination in the generation of genomic diversity among Rhizobium etli. Results: We identified high levels of DNA polymorphism in R. etli, and found that there was an average divergence of 4% to 6% among the tested strain pairs. DNA recombination events were estimated to affect 3% to 10% of the genomic sample analyzed. In most instances, the nucleotide diversity (π) was greater in DNA segments with recombinant events than in non-recombinant segments. However, this degree of recombination was not sufficiently large to disrupt the congruence of the phylogenetic trees, and further evaluation of recombination in strain quartets indicated that the recombination levels in this species are proportionally low. Conclusion: Our data suggest that R. etli is a species composed of separated lineages with low homologous recombination among the strains. Horizontal gene transfer, particularly via the symbiotic plasmid characteristic of this species, seems to play an important role in diversity but the lineages maintain their evolutionary cohesiveness.

  1. Whole-genome single-nucleotide-polymorphism analysis for discrimination of Clostridium botulinum group I strains.

    Science.gov (United States)

    Gonzalez-Escalona, Narjol; Timme, Ruth; Raphael, Brian H; Zink, Donald; Sharma, Shashi K

    2014-04-01

    Clostridium botulinum is a genetically diverse Gram-positive bacterium producing extremely potent neurotoxins (botulinum neurotoxins A through G [BoNT/A-G]). The complete genome sequences of three strains harboring only the BoNT/A1 nucleotide sequence are publicly available. Although these strains contain a toxin cluster (HA(+) OrfX(-)) associated with hemagglutinin genes, little is known about the genomes of subtype A1 strains (termed HA(-) OrfX(+)) that lack hemagglutinin genes in the toxin gene cluster. We sequenced the genomes of three BoNT/A1-producing C. botulinum strains: two strains with the HA(+) OrfX(-) cluster (69A and 32A) and one strain with the HA(-) OrfX(+) cluster (CDC297). Whole-genome phylogenic single-nucleotide-polymorphism (SNP) analysis of these strains along with other publicly available C. botulinum group I strains revealed five distinct lineages. Strains 69A and 32A clustered with the C. botulinum type A1 Hall group, and strain CDC297 clustered with the C. botulinum type Ba4 strain 657. This study reports the use of whole-genome SNP sequence analysis for discrimination of C. botulinum group I strains and demonstrates the utility of this analysis in quickly differentiating C. botulinum strains harboring identical toxin gene subtypes. This analysis further supports previous work showing that strains CDC297 and 657 likely evolved from a common ancestor and independently acquired separate BoNT/A1 toxin gene clusters at distinct genomic locations.

  2. Genomic variation in rice: genesis of highly polymorphic linkage blocks during domestication.

    Directory of Open Access Journals (Sweden)

    Tian Tang

    2006-11-01

    Genomic regions that are unusually divergent between closely related species or racial groups can be particularly informative about the process of speciation or the operation of natural selection. The two sequenced genomes of cultivated Asian rice, Oryza sativa, reveal that at least 6% of the genomes are unusually divergent. Sequencing of ten unlinked loci from the highly divergent regions consistently identified two highly divergent haplotypes with each locus in nearly complete linkage disequilibrium among 25 O. sativa cultivars and 35 lines from six wild species. The existence of two highly divergent haplotypes in high divergence regions in species from all geographical areas (Africa, Asia, and Oceania) was in contrast to the low polymorphism and low linkage disequilibrium that were observed in other parts of the genome, represented by ten reference loci. While several natural processes are likely to contribute to this pattern of genomic variation, domestication may have greatly exaggerated the trend. In this hypothesis, divergent haplotypes that were adapted to different geographical and ecological environments migrated along with humans during the development of domesticated varieties. If true, these high divergence regions of the genome would be enriched for loci that contribute to the enormous range of phenotypic variation observed among domesticated breeds.

  3. Extensive sequence-influenced DNA methylation polymorphism in the human genome

    Directory of Open Access Journals (Sweden)

    Hellman Asaf

    2010-05-01

    Background: Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also interesting patterns of CpG methylation found outside of CpG islands. Results: We compared DNA methylation patterns on both alleles between many pairs (and larger groups) of related and unrelated individuals. Direct observation and simulation experiments revealed that around 10% of common single nucleotide polymorphisms (SNPs) reside in regions with differences in the propensity for local DNA methylation between the two alleles. We further showed that for the most common form of SNP, a polymorphism at a CpG dinucleotide, the presence of the CpG at the SNP positively affected local DNA methylation in cis. Conclusions: Taken together with the known effect of DNA methylation on mutation rate, our results suggest an interesting interdependence between genetics and epigenetics underlying diversity in the human genome.

  4. Efficient human paternity testing with a panel of 40 short insertion-deletion polymorphisms.

    Science.gov (United States)

    Pimenta, J R; Pena, S D J

    2010-03-30

    We developed a panel of 40 multiplexed short insertion-deletion (indel) polymorphic loci with widespread chromosomal locations and allele frequencies close to 0.50 in the European population. We genotyped these markers in 360 unrelated self-classified White Brazilians and 50 mother-child-probable father trios with proven paternity. The average heterozygosity (gene diversity) per locus was 0.48, and the combined probability of identity (matching probability) for the 40-locus set was 3.48 x 10(-17). The combined power of exclusion of the indel panel was 0.9997. The efficiency of the 40 indel set in the exclusion of falsely accused individuals in paternity casework was equivalent to the CODIS set of 13 microsatellites. The geometric mean of the paternity indices of the 50 mother-child-probable father trios was 17,607. This panel of 40 short indels was found to have excellent performance. Thus, especially because of its simplicity and low cost, and the fact that it is composed of genomic markers that have very low mutation rates, it represents a useful new tool for human paternity testing.
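
The summary statistics quoted in this abstract can be related to allele frequencies with a short worked sketch. It assumes biallelic loci in Hardy-Weinberg equilibrium and independence between loci; the function names and the idealized frequency of exactly 0.50 at every locus are illustrative assumptions, not the authors' data or method.

```python
def heterozygosity(p):
    """Expected heterozygosity (gene diversity) of a biallelic locus: 2pq."""
    return 2.0 * p * (1.0 - p)


def probability_of_identity(p):
    """Chance that two unrelated individuals share a genotype at one
    biallelic locus, i.e. the sum of squared genotype frequencies."""
    q = 1.0 - p
    return (p * p) ** 2 + (2.0 * p * q) ** 2 + (q * q) ** 2


def combined_probability_of_identity(freqs):
    """Product of per-locus values, assuming the loci are independent."""
    combined = 1.0
    for p in freqs:
        combined *= probability_of_identity(p)
    return combined


# Idealized panel: 40 loci, each with allele frequency exactly 0.50.
ideal_panel = [0.5] * 40
per_locus_het = heterozygosity(0.5)            # 0.5
per_locus_pi = probability_of_identity(0.5)    # 0.375
panel_pi = combined_probability_of_identity(ideal_panel)
```

Under these idealized assumptions the combined value falls on the order of 1e-17, the same order of magnitude as the 3.48 x 10(-17) reported for the real panel, whose average heterozygosity of 0.48 is slightly below the 0.50 maximum.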

  5. Phenotypic Plasticity Promotes Balanced Polymorphism in Periodic Environments by a Genomic Storage Effect.

    Science.gov (United States)

    Gulisija, Davorka; Kim, Yuseob; Plotkin, Joshua B

    2016-04-01

    Phenotypic plasticity is known to evolve in perturbed habitats, where it alleviates the deleterious effects of selection. But the effects of plasticity on levels of genetic polymorphism, an important precursor to adaptation in temporally varying environments, are unclear. Here we develop a haploid, two-locus population-genetic model to describe the interplay between a plasticity modifier locus and a target locus subject to periodically varying selection. We find that the interplay between these two loci can produce a "genomic storage effect" that promotes balanced polymorphism over a large range of parameters, in the absence of all other conditions known to maintain genetic variation. The genomic storage effect arises as recombination allows alleles at the two loci to escape more harmful genetic backgrounds and associate in haplotypes that persist until environmental conditions change. Using both Monte Carlo simulations and analytical approximations we quantify the strength of the genomic storage effect across a range of selection pressures, recombination rates, plasticity modifier effect sizes, and environmental periods.

  6. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost...... mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome....

  7. Prediction of maize phenotype based on whole-genome single nucleotide polymorphisms using deep belief networks

    Science.gov (United States)

    Rachmatia, H.; Kusuma, W. A.; Hasibuan, L. S.

    2017-05-01

    Selection in plant breeding could be more effective and more efficient if it were based on genomic data. Genomic selection (GS) is a new approach for plant-breeding selection that exploits genomic data through a mechanism called genomic prediction (GP). Most GP models use linear methods that ignore the effects of interactions among genes and of higher-order nonlinearities. The deep belief network (DBN), one of the architectures used in deep learning, is able to model data at a high level of abstraction that captures nonlinear effects in the data. This study implemented a DBN to develop a GP model using whole-genome single nucleotide polymorphisms (SNPs) as training and testing data. The case study was a set of traits in maize. The maize dataset was acquired from the Global Maize Program of CIMMYT (International Maize and Wheat Improvement Center). Based on Pearson correlation, DBN outperformed the other methods, reproducing kernel Hilbert space (RKHS) regression, Bayesian LASSO (BL), and best linear unbiased predictor (BLUP), for traits presumed to be non-additive. DBN achieved a correlation of 0.579 on a scale of -1 to 1.
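
The evaluation metric named in this abstract, the Pearson correlation between predicted and observed trait values, can be sketched as follows. The function name and the toy numbers are assumptions for illustration only; they are not taken from the study's data or code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical observed and predicted phenotype values for five lines.
observed = [4.1, 5.0, 3.2, 6.3, 4.8]
predicted = [3.9, 5.4, 3.6, 5.8, 4.4]
accuracy = pearson(observed, predicted)
```

A value near 1 means predictions rank and scale the lines much as the observations do, a value near 0 means no linear relationship, and negative values indicate an inverse relationship.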

  8. Sequence length variation, indel costs, and congruence in sensitivity analysis

    DEFF Research Database (Denmark)

    Aagesen, Lone; Petersen, Gitte; Seberg, Ole

    2005-01-01

    the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously...... preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...... in the data set. Dominance was easily detected, as the character-based congruence measures approached their optimal value when indel costs were incremented. Dominance of a fragment or data partition was overwhelmed when new sequence length-variable fragments or data partitions were added....

  9. Developing market class specific InDel markers from next generation sequence data in Phaseolus vulgaris L.

    Science.gov (United States)

    Moghaddam, Samira Mafi; Song, Qijian; Mamidi, Sujan; Schmutz, Jeremy; Lee, Rian; Cregan, Perry; Osorno, Juan M; McClean, Phillip E

    2014-01-01

    Next generation sequence data provides valuable information and tools for genetic and genomic research and offers new insights useful for marker development. This data is useful for the design of accurate and user-friendly molecular tools. Common bean (Phaseolus vulgaris L.) is a diverse crop in which separate domestication events happened in each gene pool followed by race and market class diversification that has resulted in different morphological characteristics in each commercial market class. This has led to essentially independent breeding programs within each market class which in turn has resulted in limited within market class sequence variation. Sequence data from selected genotypes of five bean market classes (pinto, black, navy, and light and dark red kidney) were used to develop InDel-based markers specific to each market class. Design of the InDel markers was conducted through a combination of assembly, alignment and primer design software using 1.6× to 5.1× coverage of Illumina GAII sequence data for each of the selected genotypes. The procedure we developed for primer design is fast, accurate, less error prone, and higher throughput than when they are designed manually. All InDel markers are easy to run and score with no need for PCR optimization. A total of 2687 InDel markers distributed across the genome were developed. To highlight their usefulness, they were employed to construct a phylogenetic tree and a genetic map, showing that InDel markers are reliable, simple, and accurate.

  10. Draft genome of the sea cucumber Apostichopus japonicus and genetic polymorphism among color variants.

    Science.gov (United States)

    Jo, Jihoon; Oh, Jooseong; Lee, Hyun-Gwan; Hong, Hyun-Hee; Lee, Sung-Gwon; Cheon, Seongmin; Kern, Elizabeth M A; Jin, Soyeong; Cho, Sung-Jin; Park, Joong-Ki; Park, Chungoo

    2017-01-01

    The Japanese sea cucumber (Apostichopus japonicus Selenka 1867) is an economically important species as a source of seafood and ingredient in traditional medicine. It is mainly found off the coasts of northeast Asia. Recently, substantial exploitation and widespread biotic diseases in A. japonicus have generated increasing conservation concern. However, the genomic knowledge base and resources available for researchers to use in managing this natural resource and to establish genetically based breeding systems for sea cucumber aquaculture are still in a nascent stage. A total of 312 Gb of raw sequences were generated using the Illumina HiSeq 2000 platform and assembled to a final size of 0.66 Gb, which is about 80.5% of the estimated genome size (0.82 Gb). We observed nucleotide-level heterozygosity within the assembled genome to be 0.986%. The resulting draft genome assembly comprising 132 607 scaffolds with an N50 value of 10.5 kb contains a total of 21 771 predicted protein-coding genes. We identified 6.6-14.5 million heterozygous single nucleotide polymorphisms in the assembled genome of the three natural color variants (green, red, and black), resulting in an estimated nucleotide diversity of 0.00146. We report the first draft genome of A. japonicus and provide a general overview of the genetic variation in the three major color variants of A. japonicus. These data will help provide a comprehensive view of the genetic, physiological, and evolutionary relationships among color variants in A. japonicus, and will be invaluable resources for sea cucumber genomic research. © The Author 2017. Published by Oxford University Press.
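    The N50 statistic quoted for this assembly can be sketched as follows. This is a minimal illustration using one common definition of N50 (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly); the scaffold lengths in the example are made up and are not from the study.

```python
def n50(lengths):
    """Return the length L such that scaffolds of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in bp.
print(n50([2, 3, 4, 5, 6]))

# Assembled fraction reported in the abstract: 0.66 Gb of an estimated 0.82 Gb.
print(round(0.66 / 0.82 * 100, 1))
```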

  11. Typing of 30 insertion/deletions in Danes using the first commercial indel kit-Mentype® DIPplex

    DEFF Research Database (Denmark)

    Friis, Susanne Lunøe; Børsting, Claus; Rockenbauer, Eszter

    2012-01-01

    In this study, we tested the first commercial kit with insertion/deletion (indel) polymorphisms, the Mentype® DIPplex PCR Amplification Kit (DIPplex kit). A total of 30 biallelic autosomal indels and Amelogenin were amplified with the DIPplex kit. All loci were amplified in one PCR multiplex and all amplicon lengths were shorter than 160 bp. Full indel profiles were generated from as little as 100 pg of DNA. A total of 117 individuals from Danish paternity cases were successfully typed. No deviation from Hardy-Weinberg equilibrium was observed for any of the indels. The combined mean match probability was 3.3×10^-13, the mean paternity exclusion probability was 99.7% and the typical paternity indices for trios and duos were 2350 and 165, respectively. Furthermore, we typed five highly degraded DNA samples with the DIPplex kit, the AmpFlSTR® SGM Plus kit and the AmpFlSTR® SEfiler Plus kit...
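    A Hardy-Weinberg check for a single biallelic indel, of the kind mentioned in this record, could look like the sketch below. It uses a plain chi-square goodness-of-fit statistic, which is an assumption (the abstract does not say which test was applied), and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_ins_ins: int, n_ins_del: int, n_del_del: int) -> float:
    """Chi-square statistic comparing observed genotype counts with Hardy-Weinberg expectations.

    Assumes a polymorphic locus; a monomorphic locus would give zero expected counts.
    """
    n = n_ins_ins + n_ins_del + n_del_del
    p = (2 * n_ins_ins + n_ins_del) / (2 * n)  # frequency of the insertion allele
    q = 1 - p                                  # frequency of the deletion allele
    observed = (n_ins_ins, n_ins_del, n_del_del)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts for one locus in 117 individuals.
chi2 = hwe_chi_square(30, 57, 30)
# With 1 degree of freedom, values below about 3.84 give no evidence of deviation at the 5% level.
print(chi2 < 3.84)
```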

  12. Whole-genome sequencing of six Mauritian Cynomolgus macaques (Macaca fascicularis) reveals a genome-wide pattern of polymorphisms under extreme population bottleneck.

    Science.gov (United States)

    Osada, Naoki; Hettiarachchi, Nilmini; Adeyemi Babarinde, Isaac; Saitou, Naruya; Blancher, Antoine

    2015-03-23

    Cynomolgus macaques (Macaca fascicularis) were introduced to the island of Mauritius by humans around the 16th century. The unique demographic history of the Mauritian cynomolgus macaques provides the opportunity to not only examine the genetic background of well-established nonhuman primates for biomedical research but also understand the effect of an extreme population bottleneck on the pattern of polymorphisms in genomes. We sequenced the whole genomes of six Mauritian cynomolgus macaques and obtained an average of 20-fold coverage of the genome sequences for each individual. The overall level of nucleotide diversity was 23% smaller than that of the Malaysian cynomolgus macaques, and a reduction of low-frequency polymorphisms was observed. In addition, we also confirmed that the Mauritian cynomolgus macaques were genetically closer to a representative of the Malaysian population than to a representative of the Indochinese population. Excess of nonsynonymous polymorphisms in low frequency, which has been observed in many other species, was not very strong in the Mauritian samples, and the proportion of heterozygous nonsynonymous polymorphisms relative to synonymous polymorphisms is higher within individuals in Mauritian than Malaysian cynomolgus macaques. Those patterns indicate that the extreme population bottleneck made purifying selection overwhelmed by the power of genetic drift in the population. Finally, we estimated the number of founding individuals by using the genome-wide site frequency spectrum of the six samples. Assuming a simple demographic scenario with a single bottleneck followed by exponential growth, the estimated number of founders (∼20 individuals) is largely consistent with previous estimates.

  13. Incorporating indel information into phylogeny estimation for rapidly emerging pathogens

    Directory of Open Access Journals (Sweden)

    Suchard Marc A

    2007-03-01

    Background: Phylogenies of rapidly evolving pathogens can be difficult to resolve because of the small number of substitutions that accumulate in the short times since divergence. To improve resolution of such phylogenies we propose using insertion and deletion (indel) information in addition to substitution information. We accomplish this through joint estimation of alignment and phylogeny in a Bayesian framework, drawing inference using Markov chain Monte Carlo. Joint estimation of alignment and phylogeny sidesteps biases that stem from conditioning on a single alignment by taking into account the ensemble of near-optimal alignments. Results: We introduce a novel Markov chain transition kernel that improves computational efficiency by proposing non-local topology rearrangements and by block sampling alignment and topology parameters. In addition, we extend our previous indel model to increase biological realism by placing indels preferentially on longer branches. We demonstrate the ability of indel information to increase phylogenetic resolution in examples drawn from within-host viral sequence samples. We also demonstrate the importance of taking alignment uncertainty into account when using such information. Finally, we show that codon-based substitution models can significantly affect alignment quality and phylogenetic inference by unrealistically forcing indels to begin and end between codons. Conclusion: These results indicate that indel information can improve phylogenetic resolution of recently diverged pathogens and that alignment uncertainty should be considered in such analyses.

  14. Analysis of Nucleosome Positioning in The Vicinity of Sites of Nucleotide Polymorphism in Human Genome

    Institute of Scientific and Technical Information of China (English)

    刘宏德; 孙啸

    2011-01-01

    of polymorphism sites has an intimate relationship with nucleosome positioning. Further studies suggest that most single nucleotide polymorphism sites are at the two ends of core DNA, while sites of insertion, deletion, and insertion and deletion (in-del) tend to be in nucleosome-depleted regions. The equally-spaced configuration of nucleosomes downstream of TSS causes the periodic distribution of polymorphism sites. The studies suggest genome variations occur in different regions relative to nucleosomes, and nucleosome positioning has a role in forming nucleotide polymorphism.

  15. The complete chloroplast genome sequences for four Amaranthus species (Amaranthaceae)

    Science.gov (United States)

    Chaney, Lindsay; Mangelson, Ryan; Ramaraj, Thiruvarangan; Jellen, Eric N.; Maughan, Peter J.

    2016-01-01

    Premise of the study: The amaranth genus contains many important grain and weedy species. We further our understanding of the genus through the development of a complete reference chloroplast genome. Methods and Results: A high-quality Amaranthus hypochondriacus (Amaranthaceae) chloroplast genome assembly was developed using long-read technology. This reference genome was used to reconstruct the chloroplast genomes for two closely related grain species (A. cruentus and A. caudatus) and their putative progenitor (A. hybridus). The reference genome was 150,518 bp and possesses a circular structure of two inverted repeats (24,352 bp) separated by small (17,941 bp) and large (83,873 bp) single-copy regions; it encodes 111 genes, 72 for proteins. Relative to the reference chloroplast genome, an average of 210 single-nucleotide polymorphisms (SNPs) and 122 insertion/deletion polymorphisms (indels) were identified across the analyzed genomes. Conclusions: This reference chloroplast genome, along with the reported simple sequence repeats, SNPs, and indels, is an invaluable genetic resource for studying the phylogeny and genetic diversity within the amaranth genus. PMID:27672525

  16. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  17. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  18. Chloroplast genome sequence of the moss Tortula ruralis: gene content, polymorphism, and structural arrangement relative to other green plant chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Wolf Paul G

    2010-02-01

    Background: Tortula ruralis, a widely distributed species in the moss family Pottiaceae, is increasingly used as a model organism for the study of desiccation tolerance and mechanisms of cellular repair. In this paper, we present the chloroplast genome sequence of T. ruralis, only the second published chloroplast genome for a moss, and the first for a vegetatively desiccation-tolerant plant. Results: The Tortula chloroplast genome is ~123,500 bp, and differs in a number of ways from that of Physcomitrella patens, the first published moss chloroplast genome. For example, Tortula lacks the ~71 kb inversion found in the large single copy region of the Physcomitrella genome and other members of the Funariales. Also, the Tortula chloroplast genome lacks petN, a gene found in all known land plant plastid genomes. In addition, an unusual case of nucleotide polymorphism was discovered. Conclusions: Although the chloroplast genome of Tortula ruralis differs from that of the only other sequenced moss, Physcomitrella patens, we have yet to determine the biological significance of the differences. The polymorphisms we have uncovered in the sequencing of the genome offer a rare possibility (for mosses) of the generation of DNA markers for fine-level phylogenetic studies, or to investigate individual variation within populations.

  19. A method for the analysis of 32 X chromosome insertion deletion polymorphisms in a single PCR

    DEFF Research Database (Denmark)

    Pereira, Rui; Pereira, Vania; Gomes, Iva

    2012-01-01

    Studies of human genetic variation predominantly use short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) but Insertion deletion polymorphisms (Indels) are being increasingly explored. They combine desirable characteristics of other genetic markers, especially the possibility of...

  20. Population genomics of divergence among extreme and intermediate color forms in a polymorphic insect.

    Science.gov (United States)

    Lozier, Jeffrey D; Jackson, Jason M; Dillon, Michael E; Strange, James P

    2016-02-01

    Geographic variation in insect coloration is among the most intriguing examples of rapid phenotypic evolution and provides opportunities to study mechanisms of phenotypic change and diversification in closely related lineages. The bumble bee Bombus bifarius comprises two geographically disparate color groups characterized by red-banded and black-banded abdominal pigmentation, but with a range of spatially and phenotypically intermediate populations across western North America. Microsatellite analyses have revealed that B. bifarius in the USA are structured into two major groups concordant with geography and color pattern, but also suggest ongoing gene flow among regional populations. In this study, we better resolve the relationships among major color groups to better understand evolutionary mechanisms promoting and maintaining such polymorphism. We analyze >90,000 and >25,000 single-nucleotide polymorphisms derived from transcriptome (RNAseq) and double digest restriction site associated DNA sequencing (ddRAD), respectively, in representative samples from spatial and color pattern extremes in B. bifarius as well as phenotypic and geographic intermediates. Both ddRAD and RNAseq data illustrate substantial genome-wide differentiation of the red-banded (eastern) color form from both black-banded (western) and intermediate (central) phenotypes and negligible differentiation among the latter populations, with no obvious admixture among bees from the two major lineages. Results thus indicate much stronger background differentiation among B. bifarius lineages than expected, highlighting potential challenges for revealing loci underlying color polymorphism from population genetic data alone. These findings will have significance for resolving taxonomic confusion in this species and in future efforts to investigate color-pattern evolution in B. bifarius and other polymorphic bumble bee species.

  1. Amplified fragment length polymorphism: an adept technique for genome mapping, genetic differentiation, and intraspecific variation in protozoan parasites.

    Science.gov (United States)

    Kumar, Awanish; Misra, Pragya; Dube, Anuradha

    2013-02-01

    With the advent of polymerase chain reaction (PCR), genetic markers are now accessible for all organisms, including parasites. Amplified fragment length polymorphism (AFLP) is a PCR-based marker for the rapid screening of genetic diversity and intraspecific variation. It is a potent fingerprinting technique for genomic DNAs of any origin or complexity and rapidly generates a number of highly replicable markers that allow high-resolution genotyping. AFLPs are convenient and reliable in comparison to other markers like random amplified polymorphic DNA, restriction fragment length polymorphism, and simple sequence repeat in terms of time and cost efficiency, reproducibility, and resolution, as the technique does not require template DNA sequencing. In addition, AFLP essentially probes the entire genome at random, without prior sequence knowledge. AFLP markers have therefore emerged as an advanced type of genetic marker with broad application in genomic mapping, population genetics, and DNA fingerprinting and are ideally suited as a screening tool for molecular markers linked with biological and clinical traits. This review describes the AFLP procedure, its applications, and its use in genome fingerprinting, which is currently applied in parasite genome research. We outline the AFLP procedure adapted for Leishmania genome study and discuss the benefits of AFLPs for assessing genetic variation and genome mapping over other existing molecular techniques. We highlight the possible use of AFLPs as genetic markers with broad application in parasitological research because the technique allows random screening of the entire genome for linkage with genetic and clinical properties of the parasite. In this review, we have taken a pragmatic approach to the study of AFLP for genome mapping and polymorphism in protozoan parasites and conclude that AFLP is a very useful tool.

  2. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    Science.gov (United States)

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-02-17

    The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  3. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare

    Directory of Open Access Journals (Sweden)

    Doan Ryan

    2012-02-01

    Background: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Results: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. Conclusions: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  4. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  5. Genomic comparison of invasive and rare non-invasive strains reveals Porphyromonas gingivalis genetic polymorphisms

    Directory of Open Access Journals (Sweden)

    Svetlana Dolgilevich

    2011-03-01

    Porphyromonas gingivalis strains are shown to invade human cells in vitro with different invasion efficiencies, varying by up to three orders of magnitude. We tested the hypothesis that invasion-associated interstrain genomic polymorphisms are present in P. gingivalis and that putative invasion-associated genes can contribute to P. gingivalis invasion. Using an invasive (W83) and the only available non-invasive P. gingivalis strain (AJW4) and whole genome microarrays followed by two separate software tools, we carried out comparative genomic hybridization (CGH) analysis. We identified 68 annotated and 51 hypothetical open reading frames (ORFs) that are polymorphic between these strains. Among these are surface proteins, lipoproteins, capsular polysaccharide biosynthesis enzymes, regulatory and immunoreactive proteins, integrases, and transposases often with abnormal GC content and clustered on the chromosome. Amplification of selected ORFs was used to validate the approach and the selection. Eleven clinical strains were investigated for the presence of selected ORFs. The putative invasion-associated ORFs were present in 10 of the isolates. The invasion ability of three isogenic mutants, carrying deletions in PG0185, PG0186, and PG0982 was tested. The PG0185 (ragA) and PG0186 (ragB) mutants had 5.1×10^3-fold and 3.6×10^3-fold decreased in vitro invasion ability, respectively. The annotation of divergent ORFs suggests deficiency in multiple genes as a basis for P. gingivalis non-invasive phenotype.

  6. Population Differentiations and Phylogenetic Analysis of Tibet and Qinghai Tibetan Groups Based on 30 InDel Loci.

    Science.gov (United States)

    Guo, Yuxin; Shen, Chunmei; Meng, Haotian; Dong, Qian; Kong, Tingting; Yang, Chunhua; Wang, Hongdan; Jin, Rui; Zhu, Bofeng

    2016-12-01

    In recent years, Insertion/Deletion (InDel) polymorphisms have become a hot area of forensic research. In this study, 30 InDel loci were selected to investigate the genetic polymorphisms of Tibetan groups, which are from Tibet Autonomous Region and Qinghai province of China, and explore the genetic relationships between Tibetan groups and other groups. Allele frequencies of the 30 InDel loci ranged from 0.1219 (HLD111) to 0.5609 (HLD57) in the Tibet Tibetan group and 0.1639 (HLD118) to 0.5655 (HLD124) in the Qinghai Tibetan group. The combined power of discrimination, matching probability, and power of exclusion were 0.999999999986, 0.999999988, and 0.9913 in the Tibet Tibetan group, respectively, and 0.99999999999204, 0.9999999796, and 0.9862 in the Qinghai Tibetan group. The results of principal component analysis, phylogenetic tree, and population structure demonstrated that the four Tibetan groups (Tibetan1, Tibetan2, Tibet, and Qinghai Tibetan groups) clustered together and had relatively close genetic relationships with nine Asian groups and then European and Amerindian groups.
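    Combined forensic parameters such as those reported in this record are conventionally obtained by multiplying per-locus values across independent loci. The sketch below shows that standard combination rule; the per-locus values are hypothetical and are not taken from the study, and independence between loci is assumed.

```python
from math import prod


def combined_power(per_locus_values):
    """Combine per-locus powers (discrimination or exclusion) as 1 - prod(1 - x_i)."""
    return 1 - prod(1 - x for x in per_locus_values)


def combined_match_probability(per_locus_match_probs):
    """Combine per-locus random match probabilities as their product."""
    return prod(per_locus_match_probs)


# Hypothetical panel: 30 loci, each with power of discrimination 0.6 (match probability 0.4).
pd_values = [0.6] * 30
print(combined_power(pd_values))
print(combined_match_probability([1 - x for x in pd_values]))
```

    Because each additional independent locus multiplies the residual match probability by a factor below 1, even modestly informative biallelic markers reach very high combined discrimination when 30 of them are typed together.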

  7. Genomic diversity amongst Vibrio isolates from different sources determined by fluorescent amplified fragment length polymorphism.

    Science.gov (United States)

    Thompson, F L; Hoste, B; Vandemeulebroecke, K; Swings, J

    2001-12-01

    The genomic diversity among 506 strains of the family Vibrionaceae was analysed using Fluorescent Amplified Fragment Length Polymorphism (FAFLP). Isolates were from different sources (e.g. fish, mollusc, shrimp, rotifers, artemia, and their culture water) in different countries, mainly from the aquacultural environment. Clustering of the FAFLP band patterns resulted in 69 clusters. A majority of the currently known species of the family Vibrionaceae formed separate clusters. Certain species, e.g. V. alginolyticus, V. cholerae, V. cincinnatiensis, V. diabolicus, V. diazotrophicus, V. harveyi, V. logei, V. natriegens, V. nereis, V. splendidus and V. tubiashii, were found to be ubiquitous, whereas V. halioticoli, V. ichthyoenteri, V. pectenicida and V. wodanis appear to be exclusively associated with a particular host or geographical region. Three main categories of isolates could be distinguished: (1) isolates with genomes related (i.e. with ≥45% FAFLP pattern similarity) to one of the known type strains; (2) isolates clustering (≥45% pattern similarity) with more than one type strain; (3) isolates with genomes unrelated (<45% pattern similarity) to any of the type strains. The latter group consisted of 236 isolates distributed in 31 clusters, indicating that many culturable taxa of the Vibrionaceae remain to be described.

  8. Self-similar characteristics of single nucleotide polymorphisms in the rice genome

    Science.gov (United States)

    Lee, Chang-Yong

    2016-11-01

    With single nucleotide polymorphism (SNP) data from the 3,000 rice genome project, we investigate the mutational characteristics of the rice genome from the perspective of statistical physics. From the frequency distributions of the space between adjacent SNPs, we present evidence that SNPs are not spaced randomly, but clustered across the genome. The clustering property is related to a long-range correlation in SNP locations, suggesting that a mutation occurring in a locus may affect other mutations far away along the sequence in a chromosome. In addition, the reliability of the existence of the long-range correlation is supported by the agreement between the results of two independent analysis methods. The highly-skewed and long-tailed distribution of SNP spaces is further characterized by a multi-fractal, showing that SNP spaces possess a rich structure of a statistical self-similarity. These results can be used for an optimal design of a microarray assay and a primer, as well as for genotyping quality control.
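
    The idea of testing whether SNPs are spaced randomly can be sketched with a simple statistic. The example below is only an illustration, not the paper's analysis: it uses simulated positions (hypothetical data) and compares the coefficient of variation of adjacent-SNP spacings, which is close to 1 for random placement and larger when SNPs cluster.

```python
import random
from statistics import mean, pstdev


def adjacent_spacings(positions):
    """Distances between neighbouring SNPs after sorting by position."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


random.seed(0)
genome_length = 1_000_000  # hypothetical chromosome length in bp

# Random placement: 2000 SNPs drawn uniformly along the sequence.
uniform_snps = random.sample(range(genome_length), 2000)

# Clustered placement: 50 cluster centres, 40 SNPs within 500 bp of each.
centres = random.sample(range(genome_length - 500), 50)
clustered_snps = [c + random.randint(0, 500) for c in centres for _ in range(40)]

cv_uniform = coefficient_of_variation(adjacent_spacings(uniform_snps))
cv_clustered = coefficient_of_variation(adjacent_spacings(clustered_snps))
print(f"uniform CV={cv_uniform:.2f}, clustered CV={cv_clustered:.2f}")
```

    A heavier-tailed spacing distribution, as described in the abstract, shows up here as a markedly larger coefficient of variation for the clustered set.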

  9. Genome-based polymorphic microsatellite development and validation in the mosquito Aedes aegypti and application to population genetics in Haiti

    Directory of Open Access Journals (Sweden)

    Streit Thomas G

    2009-12-01

    Background: Microsatellite markers have proven useful in genetic studies in many organisms, yet microsatellite-based studies of the dengue and yellow fever vector mosquito Aedes aegypti have been limited by the number of assayable and polymorphic loci available, despite multiple independent efforts to identify them. Here we present strategies for efficient identification and development of useful microsatellites with broad coverage across the Aedes aegypti genome, development of multiplex-ready PCR groups of microsatellite loci, and validation of their utility for population analysis with field collections from Haiti. Results: From 79 putative microsatellite loci representing 31 motifs identified in 42 whole genome sequence supercontig assemblies in the Aedes aegypti genome, 33 microsatellites providing genome-wide coverage amplified as single copy sequences in four lab strains, with a range of 2-6 alleles per locus. The tri-nucleotide motifs represented the majority (51%) of the polymorphic single copy loci, and none of these was located within a putative open reading frame. Seven groups of 4-5 microsatellite loci each were developed for multiplex-ready PCR. Four multiplex-ready groups were used to investigate population genetics of Aedes aegypti populations sampled in Haiti. Of the 23 loci represented in these groups, 20 were polymorphic with a range of 3-24 alleles per locus (mean = 8.75). Allelic polymorphic information content varied from 0.171 to 0.867 (mean = 0.545). Most loci met Hardy-Weinberg expectations across populations and pairwise FST comparisons identified significant genetic differentiation between some populations. No evidence for genetic isolation by distance was observed. Conclusion: Despite limited success in previous reports, we demonstrate that the Aedes aegypti genome is well-populated with single copy, polymorphic microsatellite loci that can be uncovered using the strategy developed here for rapid and efficient

  10. Single strand conformation polymorphism of genomic and EST-SSRs marker and its utility in genetic evaluation of sugarcane.

    Science.gov (United States)

    Kalwade, Sachin B; Devarumath, Rachayya M

    2014-07-01

    Sugarcane is an important crop that produces around 75% of the world's sugar and is used as a first-generation biofuel. In the present study, genomic and gene-based microsatellite markers were analyzed by the low-cost Single Strand Conformation Polymorphism (SSCP) technique for genetic evaluation of 22 selected sugarcane genotypes. A total of 16 genomic and 12 Expressed Sequence Tag (EST)-derived markers amplified the selected sugarcane genotypes. In total, 138 alleles were amplified, of which 99 (72%) were polymorphic, with an average of 4.9 alleles per locus. Microsatellite markers VCSSR7 and VCSSR 12 showed monomorphic alleles with frequency 7.1 % over the average of 3.5 obtained for polymorphic locus. The Polymorphic Information Content (PIC) varied from 0.09 in VCSSR 6 to 0.88 in VCSSR 11, with a mean of 0.49. Genomic SSRs showed more polymorphism than EST-SSR markers in the selected sugarcane genotypes, whereas the genetic similarity indices calculated by Jaccard's similarity coefficient varied from 0.55 to 0.81, indicating a high level of genetic similarity among the genotypes that was mainly attributed to intraspecific diversity. Hence, the SSR-SSCP technique helped to identify genetically diverse clones that could be used in crossing programs for introgression of sugar- and stress-related traits in hybrid sugarcane.
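
    The two summary statistics named in this abstract, PIC and Jaccard's similarity coefficient, can be sketched as follows. The abstract does not state which PIC formula was used; the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2 is assumed here, and the allele frequencies and band sets are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic Information Content for one locus (assumed formula:
    1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2)."""
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


def jaccard(bands_a, bands_b):
    """Shared bands divided by all bands observed in either genotype."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical data: one locus with two equally frequent alleles, and
# two genotypes scored for the presence of band sizes (bp).
print(pic([0.5, 0.5]))
print(jaccard({120, 134, 150}, {134, 150, 162}))
```

    A monomorphic locus gives a PIC of 0, and PIC approaches 1 as the number of equally frequent alleles grows.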

  11. Capillary electrophoresis of 38 noncoding biallelic mini-Indels for degraded samples and as complementary tool in paternity testing.

    Science.gov (United States)

    Pereira, Rui; Gusmão, Leonor

    2012-01-01

    This work describes the main advantages and the steps involved in the optimization of a multiplex system able to characterize 38 noncoding biallelic Insertion Deletion Polymorphisms (Indels). With this methodology, all markers are amplified in a single PCR, using short amplicons (up to 160 bp) in order to improve its performance in degraded samples. Alleles are easily detected using capillary electrophoresis. The Indel multiplex typing strategy here described has the same desirable characteristics of forensic SNP assays, including genetic markers (a) with low mutation rates, increasing their usefulness in some kinship cases where few or single incompatibilities can be explained by mutation, and (b) that can be typed using a short amplicon strategy, increasing their usefulness in cases where degraded samples are available. Moreover, this approach uses simple and well-established methodologies already applied in forensic STR assays.

  12. Completion of a worldwide reference panel of samples for an ancestry informative Indel assay.

    Science.gov (United States)

    Santos, Carla; Phillips, Christopher; Oldoni, Fabio; Amigo, Jorge; Fondevila, Manuel; Pereira, Rui; Carracedo, Ángel; Lareu, Maria Victoria

    2015-07-01

    The use of ancestry informative markers (AIMs) in forensic analysis is of considerable utility since ancestry inference can progress an investigation when no identification has been made of DNA from the crime-scene. Short-amplicon markers, including insertion deletion polymorphisms, are particularly useful in forensic analysis due to their mutational stability, capacity to amplify degraded samples and straightforward amplification technique. In this study we report the completion of H952 HGDP-CEPH panel genotyping with a set of 46 AIM-Indels. The study adds Central South Asian and Middle Eastern population data, allowing a comparison of patterns of variation in Eurasia for these markers, in order to enhance their use in forensic analyses, particularly when combined with sets of ancestry informative SNPs. Ancestry analysis using principal component analysis and Bayesian methods indicates that a proportion of classification error occurs with European-Middle East population comparisons, but the 46 AIM-Indels have the capability to differentiate six major population groups when European-Central South Asian comparisons are made. These findings have relevance for forensic ancestry analyses in countries where South Asians form much of the demographic profile, including the UK, USA and South Africa. A novel third allele detected in MID-548 was characterized - despite a low frequency in the HGDP-CEPH panel samples, it appears confined to Central South Asian populations, increasing the ability to differentiate this population group. The H952 data set was implemented in a new open access SPSmart frequency browser - forInDel: Forensic Indel browser.

  13. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Crooijmans, R.P.M.A.; Veenendaal, A.; Dibbits, B.W.; Chin-A-Woeng, T.F.C.; Dunnen, den J.T.; Groenen, M.A.M.

    2009-01-01

    Background - The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set-up an analysis pipeline based on a

  14. A new single-nucleotide polymorphisms database for rainbow trout generated through whole genome resequencing of selected samples

    Science.gov (United States)

    Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...

  15. Selection of Unique Escherichia coli Clones by Random Amplified Polymorphic DNA (RAPD): Evaluation by Whole Genome Sequencing

    Science.gov (United States)

    Nielsen, Karen L.; Godfrey, Paul A.; Stegger, Marc; Andersen, Paal S.; Feldgarden, Michael; Frimodt-Møller, Niels

    2014-01-01

    Identifying and characterizing clonal diversity is important when analysing fecal flora. We evaluated random amplified polymorphic DNA (RAPD) PCR, applied for selection of Escherichia coli isolates, by whole genome sequencing. RAPD was fast, and reproducible as screening method for selection of distinct E. coli clones in fecal swabs. PMID:24912108

  16. Application of real-time PCR of sex-independent insertion-deletion polymorphisms to determine fetal sex using cell-free fetal DNA from maternal plasma.

    Science.gov (United States)

    Ho, Sherry Sze Yee; Barrett, Angela; Thadani, Henna; Asibal, Cecille Laureano; Koay, Evelyn Siew-Chuan; Choolani, Mahesh

    2015-07-01

    Prenatal diagnosis of sex-linked disorders requires invasive procedures, carrying a risk of miscarriage of up to 1%. Cell-free fetal DNA (cffDNA) present in cell-free DNA (cfDNA) from maternal plasma offers a non-invasive source of fetal genetic material for analysis. Detection of Y-chromosome sequences in cfDNA indicates presence of a male fetus; in the absence of a Y-chromosome signal a female fetus is inferred. We aimed to validate the clinical utility of insertion-deletion polymorphisms (INDELs) to confirm presence of a female fetus using cffDNA. Quantitative real-time PCR (qPCR) for the Y-chromosome-specific sequence, SRY, was performed on cfDNA from 82 samples at 6-39 gestational weeks. In samples without detectable SRY, qPCRs for eight INDELs were performed on maternal genomic DNA and cfDNA. Detection of paternally inherited fetal alleles in cfDNA negative for SRY confirmed a female fetus. Fetal sex was correctly determined in 77/82 (93.9%) cfDNA samples. SRY was detected in all 39 samples from male-bearing pregnancies, and none of the 43 female-bearing pregnancies (sensitivity and specificity of SRY qPCR is therefore 100%; 95% CI 91%-100%). Paternally inherited fetal alleles were detected in 38/43 samples with no SRY signal, confirming the presence of a female fetus (INDEL assay sensitivity is therefore 88.4%; 95% CI 74.1%-95.6%). Since paternally inherited fetal INDELs were not used in women bearing male fetuses, the specificity of INDELs cannot be calculated. Five cfDNA samples were negative for both SRY and INDELS. We have validated a non-invasive prenatal test to confirm fetal sex as early as 6 gestational weeks using cffDNA from maternal plasma.
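
    The percentages quoted in this abstract follow directly from the reported counts, as the short check below shows. The helper name is hypothetical; the confidence intervals are not reproduced because the abstract does not state which interval method was used.

```python
def proportion_percent(successes: int, total: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100.0 * successes / total, 1)


# Counts taken from the abstract.
overall_accuracy = proportion_percent(77, 82)   # fetal sex correctly determined
indel_sensitivity = proportion_percent(38, 43)  # paternal INDEL alleles detected
sry_sensitivity = proportion_percent(39, 39)    # SRY detected in male pregnancies

# The 77 correct calls are the 39 SRY-positive plus the 38 INDEL-confirmed samples.
assert 39 + 38 == 77

print(overall_accuracy, indel_sensitivity, sry_sensitivity)
```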

  17. Detection of a new 20-bp insertion/deletion (indel) within sheep PRND gene using mathematical expectation (ME) method.

    Science.gov (United States)

    Li, Jie; Zhu, Xichun; Ma, Lin; Xu, Hongwei; Cao, Xin; Luo, Renyun; Chen, Hong; Sun, Xiuzhu; Cai, Yong; Lan, Xianyong

    2017-03-31

    The prion-related protein doppel gene (PRND), an essential member of the mammalian prion gene family, is associated with scrapie susceptibility as well as phenotypic traits, so polymorphisms of PRND, including single nucleotide polymorphisms and insertions/deletions (indels), have recently attracted considerable attention. The objective of the present study was therefore to examine a novel indel variant using the mathematical expectation (ME) detection method and to explore its associations with phenotypic traits. A novel 20-bp indel was verified in 623 tested individuals representing four diverse sheep breeds. Three genotypes were detected, and the minor allele frequency was 0.008 (Lanzhou Fat-Tail sheep), 0.084 (Small Tail Han sheep), 0.021 (Tong sheep) and 0.083 (Hu sheep). Compared with the traditional method of testing samples one by one, the ME method reduced the number of reactions by 36.22% (STHS), 37.00% (HS), 68.67% (TS) and 83.33% (LFTS), respectively. In addition, this locus was significantly associated with cannon circumference index (P = 0.012) and trunk index (P = 0.037) in the Hu sheep breed. Notably, the result of DNA sequencing (GCTGTCCCTGCAGGGCTTCT) was not concordant with dbSNPase of NCBI (NC_443194: g.46184887-46184906delCTGCTGTCCCTGCAGGGCTT). This is the first time the 20-bp indel of the sheep PRND gene has been detected by the ME strategy, which may provide a valuable theoretical basis for marker-assisted selection in sheep genetics and breeding.
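
    The abstract does not describe the ME procedure itself. As an illustration of why pooled screening can reduce the number of reactions for a rare allele, the sketch below uses classic two-stage pooling (test a pool once, then retest its members individually only if the pool is positive). This is an assumed stand-in rather than the authors' method, and the carrier frequency and pool sizes are hypothetical.

```python
def expected_reactions_per_sample(carrier_freq: float, pool_size: int) -> float:
    """Expected reactions per individual under two-stage pooling.

    Each pool costs one reaction; if any member carries the variant
    (probability 1 - (1 - carrier_freq) ** pool_size), every member is
    then tested individually.
    """
    if pool_size < 2:
        return 1.0  # no pooling: one reaction per sample
    p_pool_negative = (1.0 - carrier_freq) ** pool_size
    return 1.0 / pool_size + (1.0 - p_pool_negative)


def best_pool_size(carrier_freq: float, max_pool: int = 20) -> int:
    """Pool size (2..max_pool) that minimises the expected reaction count."""
    return min(range(2, max_pool + 1),
               key=lambda k: expected_reactions_per_sample(carrier_freq, k))


# Hypothetical carrier frequency for a rare indel allele.
carrier_freq = 0.016
k = best_pool_size(carrier_freq)
saving = 1.0 - expected_reactions_per_sample(carrier_freq, k)
print(f"pool size {k}, about {saving:.0%} fewer reactions")
```

    Under this model the saving shrinks as the variant becomes more common, which is in line with the abstract reporting the largest reduction in the breed with the lowest minor allele frequency.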

  18. Evolution of paralogous genes: Reconstruction of genome rearrangements through comparison of multiple genomes within Staphylococcus aureus.

    Science.gov (United States)

    Tsuru, Takeshi; Kawai, Mikihiko; Mizutani-Ui, Yoko; Uchiyama, Ikuo; Kobayashi, Ichizo

    2006-06-01

    Analysis of evolution of paralogous genes in a genome is central to our understanding of genome evolution. Comparison of closely related bacterial genomes, which has provided clues as to how genome sequences evolve under natural conditions, would help in such an analysis. With species Staphylococcus aureus, whole-genome sequences have been decoded for seven strains. We compared their DNA sequences to detect large genome polymorphisms and to deduce mechanisms of genome rearrangements that have formed each of them. We first compared strains N315 and Mu50, which make one of the most closely related strain pairs, at the single-nucleotide resolution to catalogue all the middle-sized (more than 10 bp) to large genome polymorphisms such as indels and substitutions. These polymorphisms include two paralogous gene sets, one in a tandem paralogue gene cluster for toxins in a genomic island and the other in a ribosomal RNA operon. We also focused on two other tandem paralogue gene clusters and type I restriction-modification (RM) genes on the genomic islands. Then we reconstructed rearrangement events responsible for these polymorphisms, in the paralogous genes and the others, with reference to the other five genomes. For the tandem paralogue gene clusters, we were able to infer sequences for homologous recombination generating the change in the repeat number. These sequences were conserved among the repeated paralogous units likely because of their functional importance. The sequence specificity (S) subunit of type I RM systems showed recombination, likely at the homology of a conserved region, between the two variable regions for sequence specificity. We also noticed novel alleles in the ribosomal RNA operons and suggested a role for illegitimate recombination in their formation. These results revealed importance of recombination involving long conserved sequence in the evolution of paralogous genes in the genome.

  19. Whole-genome linkage analysis in mapping alcoholism genes using single-nucleotide polymorphisms and microsatellites.

    Science.gov (United States)

    Wang, Shuang; Huang, Song; Liu, Nianjun; Chen, Liang; Oh, Cheongeun; Zhao, Hongyu

    2005-12-30

    There is currently a great interest in using single-nucleotide polymorphisms (SNPs) in genetic linkage and association studies because of the abundance of SNPs as well as the availability of high-throughput genotyping technologies. In this study, we compared the performance of whole-genome scans using SNPs with microsatellites on 143 pedigrees from the Collaborative Studies on Genetics of Alcoholism provided by Genetic Analysis Workshop 14. A total of 315 microsatellites and 10,081 SNPs from Affymetrix on 22 autosomal chromosomes were used in our analyses. We found that the results from the two scans had good overall concordance. One region on chromosome 2 and two regions on chromosome 7 showed significant linkage signals (i.e., NPL ≥ 2) for alcoholism from both the SNP and microsatellite scans. The different results observed between the two scans may be explained by the difference observed in information content between the SNPs and the microsatellites.

  20. A barcode of organellar genome polymorphisms identifies the geographic origin of Plasmodium falciparum strains

    KAUST Repository

    Preston, Mark D.

    2014-06-13

    Malaria is a major public health problem that is actively being addressed in a global eradication campaign. Increased population mobility through international air travel has elevated the risk of re-introducing parasites to elimination areas and dispersing drug-resistant parasites to new regions. A simple genetic marker that quickly and accurately identifies the geographic origin of infections would be a valuable public health tool for locating the source of imported outbreaks. Here we analyse the mitochondrion and apicoplast genomes of 711 Plasmodium falciparum isolates from 14 countries, and find evidence that they are non-recombining and co-inherited. The high degree of linkage produces a panel of relatively few single-nucleotide polymorphisms (SNPs) that is geographically informative. We design a 23-SNP barcode that is highly predictive (~92%) and easily adapted to aid case management in the field and survey parasite migration worldwide. 2014 Macmillan Publishers Limited. All rights reserved.

  1. Application of genome-wide single nucleotide polymorphism typing: simple association and beyond.

    Directory of Open Access Journals (Sweden)

    J Raphael Gibbs

    2006-10-01

    The International HapMap Project and the arrival of technologies that type more than 100,000 SNPs in a single experiment have made genome-wide single nucleotide polymorphism (GW-SNP) assay a realistic endeavor. This has sparked considerable debate regarding the promise of GW-SNP typing to identify genetic association in disease. As has already been shown, this approach has the potential to localize common genetic variation underlying disease risk. The data provided by this technology also lend themselves to several other lines of investigation: autozygosity mapping in consanguineous families and outbred populations, direct detection of structural variation, admixture analysis, and other population genetic approaches. In this review we will discuss the potential uses and practical application of GW-SNP typing including those above and beyond simple association testing.

  2. Genome-wide identification of novel genetic markers from RNA sequencing assembly of diverse Aegilops tauschii accessions.

    Science.gov (United States)

    Nishijima, Ryo; Yoshida, Kentaro; Motoi, Yuka; Sato, Kazuhiro; Takumi, Shigeo

    2016-08-01

    The wild species in the Triticeae tribe are tremendous resources for crop breeding due to their abundant natural variation. However, their huge and highly repetitive genomes have hindered the establishment of physical maps and the completeness of their genome sequences. To develop molecular markers for the efficient utilization of their valuable traits while avoiding their genome complexity, we assembled RNA sequences of ten representative accessions of Aegilops tauschii, the progenitor of the wheat D genome, and estimated single nucleotide polymorphisms (SNPs) and insertions/deletions (indels). The deduced unigenes were anchored to the chromosomes of Ae. tauschii and barley. The SNPs and indels in the anchored unigenes, covering entire chromosomes, were sufficient for linkage map construction, even in combinations between the genetically closest accessions. Interestingly, the resolution of SNP and indel distribution on barley chromosomes was slightly higher than on Ae. tauschii chromosomes. Since barley chromosomes are regarded as virtual chromosomes of Triticeae species, our strategy allows capture of genetic markers arranged on the chromosomes in order based on the conserved synteny. The resolution of these genetic markers will be comparable to that of the Ae. tauschii whose draft genome sequence is available. Our procedure should be applicable to marker development for Triticeae species, which have no draft sequences available.

  3. A comparison in association and linkage genome-wide scans for alcoholism susceptibility genes using single-nucleotide polymorphisms.

    Science.gov (United States)

    Chiu, Yen-Feng; Liu, Su-Yun; Tsai, Ya-Yu

    2005-12-30

    We conducted genome-wide linkage scans using both microsatellite and single-nucleotide polymorphism (SNP) markers. Regions showing the strongest evidence of linkage to alcoholism susceptibility genes were identified. Haplotype analyses using a sliding-window approach for SNPs in these regions were performed. In addition, we performed a genome-wide association scan using SNP data. SNPs in these regions with evidence of association (P alcoholism (the most significant SNP had a p-value of 0.030) as those identified from association genomic screening (the most significant SNP had a p-value of 2.0 × 10⁻⁸).

  4. A Probabilistic Model for Sequence Alignment with Context-Sensitive Indels

    Science.gov (United States)

    Hickey, Glenn; Blanchette, Mathieu

    Probabilistic approaches for sequence alignment are usually based on pair Hidden Markov Models (HMMs) or Stochastic Context Free Grammars (SCFGs). Recent studies have shown a significant correlation between the content of short indels and their flanking regions, which by definition cannot be modelled by the above two approaches. In this work, we present a context-sensitive indel model based on a pair Tree-Adjoining Grammar (TAG), along with accompanying algorithms for efficient alignment and parameter estimation. The increased precision and statistical power of this model is shown on simulated and real genomic data. As the cost of sequencing plummets, the usefulness of comparative analysis is becoming limited by alignment accuracy rather than data availability. Our results will therefore have an impact on any type of downstream comparative genomics analyses that rely on alignments. Fine-grained studies of small functional regions or disease markers, for example, could be significantly improved by our method. The implementation is available at http://www.mcb.mcgill.ca/~blanchem/software.html

  5. Polymorphic microsatellite loci from two enriched genomic libraries for the genetic analysis of the miiuy croaker, Miichthys miiuy (Sciaenidae).

    Science.gov (United States)

    Wang, R X; Xu, T J; Sun, Y N; He, G Y

    2010-05-18

    Twelve polymorphic microsatellites from the (AG)(13) and (CA)(13) enriched genomic libraries of Miichthys miiuy were isolated and characterized in a test population; the number of alleles ranged from two to nine. The observed and expected heterozygosities ranged from 0.1923 to 1.0000 and from 0.2633 to 0.8337, respectively. Three loci deviated from Hardy-Weinberg equilibrium, and linkage disequilibrium between five pairs of loci was significant. These polymorphic microsatellite loci can be used for genetic diversity analysis and molecular-assisted breeding of M. miiuy.
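
    Observed and expected heterozygosity, as reported for these loci, can be computed from genotype calls along the following lines. This is a minimal sketch with hypothetical genotypes (allele sizes in bp); it uses the uncorrected expected heterozygosity 1 - sum(p_i^2), whereas published studies often apply a small-sample correction.

```python
from collections import Counter


def heterozygosities(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1.0 - sum((count / total_alleles) ** 2
                         for count in allele_counts.values())
    return observed, expected


# Hypothetical genotypes at one microsatellite locus.
sample = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(heterozygosities(sample))
```

    A large gap between the two values at a locus is one sign of the Hardy-Weinberg deviations mentioned in the abstract.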

  6. [Effects of Cu2+ stress on DNA polymorphism of genome in foxtail millet of different genotypes].

    Science.gov (United States)

    Zhang, Yi-Xian; Fu, Ya-Ping; Xiao, Zhi-Hua; Zhang, Xi-Wen; Li, Ping

    2013-10-01

    Cu2+ is an essential element for plant growth, and is one of the major elements in the environment. In order to investigate the physiological characteristics and genotoxic effects of foxtail millet (Setaria italica (L) Beauv) under different levels of Cu2+ stress, four genotypes of foxtail millet (Zhaogu, Huangmi, An06, D2-8) from Shanxi, China were cultivated for 30 days in pots filled with soil containing different mass concentrations of Cu2+ (0, 50, 100, 200, 400 mg·kg⁻¹). Effects of Cu2+ stress on DNA damage of the genome in foxtail millet were studied using random amplified polymorphic DNA (RAPD), and the contents of soluble sugar, proline and MDA were tested. The results showed that the content of soluble sugar initially increased and then declined in all four foxtail millet seedlings in response to the rising Cu2+ concentration, with the maximum at 50 mg·kg⁻¹. At Cu2+ concentrations of 200 mg·kg⁻¹ or more, the soluble sugar content in the four kinds of millet showed an average reduction of 32.44% to 56.5% compared to that of the control group. The results showed that proline synthesis was enhanced at low concentrations (less than 50 mg·kg⁻¹), but inhibited at high concentrations (more than 100 mg·kg⁻¹), and the contents of MDA in the four genotypes of foxtail millet were significantly increased compared with the control group (P different genotypes of millet showed different response in the physiological and genetic damage under Cu2+ stress. The change of DNA polymorphism detected by the RAPD technique could be used as a biomarker of the genotoxic effects of Cu2+.

  7. Genomic polymorphism and protein changes of soybean mutant induced by space environment

    Science.gov (United States)

    He, J.; Gao, Y.; Sun, Y.

    Soybean 194 4126, with excellent agricultural qualities such as high yield and rounder and wider leaves, was selected in the sixth generation after 15 days aboard a recoverable satellite in 1996 from Soybean 72163, which features long leaves, white blossoms, grey hair and infinitude-podding. To explore the mechanisms of plant mutation induced by the space environment, we have experimented at the genome and proteome level on Soybean 194 4126 and its control Soybean 72163. Amplified Fragment Length Polymorphism (AFLP) was used to identify mutated sites, and the result shows that 36 polymorphic bands varying between 100 and 900 bp in 2022 DNA bands varying between 100 and 1500 bp have been amplified out of 64 pairs of primer combinations between mutant Soybean 194 4126 and the control plant. So the mutation degree of DNA is 3 56. The protein two-dimensional electrophoresis (2-DE) and peptide mass fingerprint (PMF) assays were used to investigate the difference of proteins in fruits and leaves between Soybean 194 4126 and its control. Results indicate that 62 protein dots specially appear in Soybean 72163 and 39 dots specially in the mutant Soybean 194 4126 by image analysis software PDQuest in the 2-DE maps of soybean seeds. Using PMF assay and protein database searching to investigate two distinct protein dots, we found that the protein specially expressed in the seed of mutant Soybean 194 4126 may be Dehydrin and the other protein specially expressed in the seed of the control Soybean 72163 may be maturation-associated protein MAT1. Because Dehydrin and MAT1 are

  8. Genomic landscapes of Chinese hamster ovary cell lines as revealed by the Cricetulus griseus draft genome

    DEFF Research Database (Denmark)

    Lewis, Nathan E; Liu, Xin; Li, Yuxiang;

    2013-01-01

    Chinese hamster ovary (CHO) cells, first isolated in 1957, are the preferred production host for many therapeutic proteins. Although genetic heterogeneity among CHO cell lines has been well documented, a systematic, nucleotide-resolution characterization of their genotypic differences has been stymied by the lack of a unifying genomic resource for CHO cells. Here we report a 2.4-Gb draft genome sequence of a female Chinese hamster, Cricetulus griseus, harboring 24,044 genes. We also resequenced and analyzed the genomes of six CHO cell lines from the CHO-K1, DG44 and CHO-S lineages. This analysis identified hamster genes missing in different CHO cell lines, and detected >3.7 million single-nucleotide polymorphisms (SNPs), 551,240 indels and 7,063 copy number variations. Many mutations are located in genes with functions relevant to bioprocessing, such as apoptosis. The details...

  9. Genotyping of FCN and MBL2 polymorphisms using pyrosequencing

    DEFF Research Database (Denmark)

    Munthe-Fog, Lea; Madsen, Hans O.; Garred, Peter

    2014-01-01

    Pyrosequencing represents one of the most thorough methods used to analyze polymorphisms. One advantage of using pyrosequencing for genotyping is the ability to identify not only single-nucleotide polymorphisms (SNPs) but also tri-allelic variations, insertions and deletions (InDels). In contrast to most other genotyping assays the sequence surrounding the polymorphism provides an internal control making this method highly reliable....

  10. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  11. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  12. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains : a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Full Text Available Abstract Background: Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila and Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results: In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions: We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial

  13. MTHFR Functional Polymorphism C677T and Genomic Instability in the Etiology of Idiopathic Autism in Simplex Families

    Science.gov (United States)

    2014-12-01

    AWARD NUMBER: W81XWH-12-1-0298. TITLE: MTHFR Functional Polymorphism C677T and Genomic Instability in the Etiology of Idiopathic Autism in Simplex Families. AUTHOR(S): Xudong Liu, PhD. DISTRIBUTION / AVAILABILITY STATEMENT: Approved for Public Release; Distribution Unlimited. ABSTRACT: Autism Spectrum Disorder (ASD

  14. Genetic predisposition to neuroblastoma mediated by a LMO1 super-enhancer polymorphism | Office of Cancer Genomics

    Science.gov (United States)

    Neuroblastoma is a paediatric malignancy that typically arises in early childhood, and is derived from the developing sympathetic nervous system. Clinical phenotypes range from localized tumours with excellent outcomes to widely metastatic disease in which long-term survival is approximately 40% despite intensive therapy. A previous genome-wide association study identified common polymorphisms at the LMO1 gene locus that are highly associated with neuroblastoma susceptibility and oncogenic addiction to LMO1 in the tumour cells.

  15. Whole genome sequence analysis of the TALLYHO/Jng mouse.

    Science.gov (United States)

    Denvir, James; Boskovic, Goran; Fan, Jun; Primerano, Donald A; Parkman, Jacaline K; Kim, Jung Han

    2016-11-11

    The TALLYHO/Jng (TH) mouse is a polygenic model for obesity and type 2 diabetes first described in the literature in 2001. The origin of the TH strain is an outbred colony of the Theiler Original strain and mice derived from this source were selectively bred for male hyperglycemia establishing an inbred strain at The Jackson Laboratory. TH mice manifest many of the disease phenotypes observed in human obesity and type 2 diabetes. We sequenced the whole genome of TH mice maintained at Marshall University to a depth of approximately 64.8X coverage using data from three next generation sequencing runs. Genome-wide, we found approximately 4.31 million homozygous single nucleotide polymorphisms (SNPs) and 1.10 million homozygous small insertions and deletions (indels) of which 98,899 SNPs and 163,720 indels were unique to the TH strain compared to 28 previously sequenced inbred mouse strains. In order to identify potentially clinically-relevant genes, we intersected our list of SNP and indel variants with human orthologous genes in which variants were associated in GWAS studies with obesity, diabetes, and metabolic syndrome, and with genes previously shown to confer a monogenic obesity phenotype in humans, and found several candidate variants that could be functionally tested using TH mice. Further, we filtered our list of variants to those occurring in an obesity quantitative trait locus, tabw2, identified in TH mice and found a missense polymorphism in the Cidec gene and characterized this variant's effect on protein function. We generated a complete catalog of variants in TH mice using the data from whole genome sequencing. Our findings will facilitate the identification of causal variants that underlie metabolic diseases in TH mice and will enable identification of candidate susceptibility genes for complex human obesity and type 2 diabetes.

  16. A germline polymorphism of DNA polymerase beta induces genomic instability and cellular transformation.

    Directory of Open Access Journals (Sweden)

    Jennifer Yamtich

    Full Text Available Several germline single nucleotide polymorphisms (SNPs) have been identified in the POLB gene, but little is known about their cellular and biochemical impact. DNA Polymerase β (Pol β), encoded by the POLB gene, is the main gap-filling polymerase involved in base excision repair (BER), a pathway that protects the genome from the consequences of oxidative DNA damage. In this study we tested the hypothesis that expression of the POLB germline coding SNP (rs3136797) in mammalian cells could induce a cancerous phenotype. Expression of this SNP in both human and mouse cells induced double-strand breaks, chromosomal aberrations, and cellular transformation. Following treatment with an alkylating agent, cells expressing this coding SNP accumulated BER intermediate substrates, including single-strand and double-strand breaks. The rs3136797 SNP encodes the P242R variant Pol β protein and biochemical analysis showed that P242R protein had a slower catalytic rate than WT, although P242R binds DNA similarly to WT. Our results suggest that people who carry the rs3136797 germline SNP may be at an increased risk for cancer susceptibility.

  17. Structural variations in pig genomes

    NARCIS (Netherlands)

    Paudel, Y.

    2015-01-01

    Abstract Paudel, Y. (2015). Structural variations in pig genomes. PhD thesis, Wageningen University, the Netherlands. Structural variations are chromosomal rearrangements such as insertions-deletions (INDELs), duplications, inversions, translocations, and copy number variations (CNVs

  18. Whole-genome analyses of Korean native and Holstein cattle breeds by massively parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Jung-Woo Choi

    Full Text Available A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea--Hanwoo, Jeju Heugu, and Korean Holstein--using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions-deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein, respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding.

  19. Polymorphism of the prion protein gene (PRNP) in Polish cattle affected by classical bovine spongiform encephalopathy.

    Science.gov (United States)

    Gurgul, Artur; Czarnik, Urszula; Larska, Magdalena; Polak, Mirosław P; Strychalski, Janusz; Słota, Ewa

    2012-05-01

    Recent attempts to discover genetic factors affecting cattle resistance/susceptibility to bovine spongiform encephalopathy (BSE) have led to the identification of two insertion/deletion (indel) polymorphisms, located within the promoter and intron 1 of the prion protein gene PRNP, showing a significant association with the occurrence of the classical form of the disease. Because the effect of the polymorphisms has been studied in only a few populations, in this study we investigated whether the previously described association of PRNP indel polymorphisms with BSE susceptibility in cattle is also present in the Polish cattle population. We found a significant relation between the investigated PRNP indel polymorphisms (23 and 12 bp indels) and susceptibility of Polish Holstein-Friesian cattle to classical BSE (P < 0.05). The deletion variants of both polymorphisms were related to increased susceptibility, whereas insertion variants were protective against BSE.

  20. Large-scale parsimony analysis of metazoan indels in protein-coding genes.

    Science.gov (United States)

    Belinky, Frida; Cohen, Ofir; Huchon, Dorothée

    2010-02-01

    Insertions and deletions (indels) are considered to be rare evolutionary events, the analysis of which may resolve controversial phylogenetic relationships. Indeed, indel characters are often assumed to be less homoplastic than amino acid and nucleotide substitutions and, consequently, more reliable markers for phylogenetic reconstruction. In this study, we analyzed indels from over 1,000 metazoan orthologous genes. We studied the impact of different species sampling, ortholog data sets, lengths of included indels, and indel-coding methods on the resulting metazoan tree. Our results show that, similar to sequence substitutions, indels are homoplastic characters, and their analysis is sensitive to the long-branch attraction artifact. Furthermore, improving the taxon sampling and choosing a closely related outgroup greatly impact the phylogenetic inference. Our indel-based inferences support the Ecdysozoa hypothesis over the Coelomata hypothesis and suggest that sponges are a sister clade to other animals.

  1. Rapid Genome-wide Single Nucleotide Polymorphism Discovery in Soybean and Rice via Deep Resequencing of Reduced Representation Libraries with the Illumina Genome Analyzer

    Directory of Open Access Journals (Sweden)

    Stéphane Deschamps

    2010-07-01

    Full Text Available Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs) among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs) using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with a 4 bp restriction endonuclease. This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the secondary digestion serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0% in rice and soybean, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.

  2. E Unibus Plurum: genomic analysis of an experimentally evolved polymorphism in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Margie A Kinnersley

    2009-11-01

    Full Text Available Microbial populations founded by a single clone and propagated under resource limitation can become polymorphic. We sought to elucidate genetic mechanisms whereby a polymorphism evolved in Escherichia coli under glucose limitation and persisted because of cross-feeding among multiple adaptive clones. Apart from a 29 kb deletion in the dominant clone, no large-scale genomic changes distinguished evolved clones from their common ancestor. Using transcriptional profiling on co-evolved clones cultured separately under glucose limitation we identified 180 genes significantly altered in expression relative to the common ancestor grown under similar conditions. Ninety of these were similarly expressed in all clones, and many of the genes affected (e.g., mglBAC, mglD, and lamB) are in operons coordinately regulated by CRP and/or rpoS. While the remaining significant expression differences were clone-specific, 93% were exhibited by the majority clone, many of which are controlled by global regulators, CRP and CpxR. When transcriptional profiling was performed on adaptive clones cultured together, many expression differences that distinguished the majority clone cultured in isolation were absent, suggesting that CpxR may be activated by overflow metabolites removed by cross-feeding strains in co-culture. Relative to their common ancestor, shared expression differences among adaptive clones were partly attributable to early-arising shared mutations in the trans-acting global regulator, rpoS, and the cis-acting regulator, mglO. Gene expression differences that distinguished clones may in part be explained by mutations in trans-acting regulators malT and glpK, and in cis-acting sequences of acs. In the founder, a cis-regulatory mutation in acs (acetyl CoA synthetase) and a structural mutation in glpR (glycerol-3-phosphate repressor) likely favored evolution of specialists that thrive on overflow metabolites. Later-arising mutations that led to specialization

  3. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    Science.gov (United States)

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-01-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species. PMID:26901189

  4. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    Directory of Open Access Journals (Sweden)

    Diego Robledo

    2016-02-01

    Full Text Available Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species.

  5. Development of an ultra-dense genetic map of the sunflower genome based on single-feature polymorphisms.

    Directory of Open Access Journals (Sweden)

    John E Bowers

    Full Text Available The development of ultra-dense genetic maps has the potential to facilitate detailed comparative genomic analyses and whole genome sequence assemblies. Here we describe the use of a custom Affymetrix GeneChip containing nearly 2.4 million features (25 bp sequences) targeting 86,023 unigenes from sunflower (Helianthus annuus L.) and related species to test for single-feature polymorphisms (SFPs) in a recombinant inbred line (RIL) mapping population derived from a cross between confectionery and oilseed sunflower lines (RHA280×RHA801). We then employed an existing genetic map derived from this same population to rigorously filter out low quality data and place 67,486 features corresponding to 22,481 unigenes on the sunflower genetic map. The resulting map contains a substantial fraction of all sunflower genes and will thus facilitate a number of downstream applications, including genome assembly and the identification of candidate genes underlying QTL or traits of interest.

  6. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes

    OpenAIRE

    Kumar, Pankaj; Chaitanya, Pasumarthy S.; Nagarajaram, Hampapathalu A

    2010-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1–6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in s...

  7. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    Directory of Open Access Journals (Sweden)

    den Dunnen Johan T

    2009-10-01

    Full Text Available Abstract Background: The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. Results: A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. Conclusion: We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even

  8. Rice SNP-seek database update: new SNPs, indels, and queries

    Science.gov (United States)

    Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A.; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L.; Alexandrov, Nickolai

    2017-01-01

    We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. PMID:27899667

  9. Development and Validation of 697 Novel Polymorphic Genomic and EST-SSR Markers in the American Cranberry (Vaccinium macrocarpon Ait.)

    Directory of Open Access Journals (Sweden)

    Brandon Schlautman

    2015-01-01

    Full Text Available The American cranberry, Vaccinium macrocarpon Ait., is an economically important North American fruit crop that is consumed because of its unique flavor and potential health benefits. However, a lack of abundant, genome-wide molecular markers has limited the adoption of modern molecular assisted selection approaches in cranberry breeding programs. To increase the number of available markers in the species, this study identified, tested, and validated microsatellite markers from existing nuclear and transcriptome sequencing data. In total, new primers were designed, synthesized, and tested for 979 SSR loci; 697 of the markers amplified allele patterns consistent with single locus segregation in a diploid organism and were considered polymorphic. Of the 697 polymorphic loci, 507 were selected for additional genetic diversity and segregation analyses in 29 cranberry genotypes. More than 95% of the 507 loci did not display segregation distortion at the p < 0.05 level, and contained moderate to high levels of polymorphism with a polymorphic information content >0.25. This comprehensive collection of developed and validated microsatellite loci represents a substantial addition to the molecular tools available for geneticists, genomicists, and breeders in cranberry and Vaccinium.

  10. A resource of genome-wide single-nucleotide polymorphisms generated by RAD tag sequencing in the critically endangered European eel

    DEFF Research Database (Denmark)

    Pujolar, J.M.; Jacobsen, M.W.; Frydenberg, J.

    2013-01-01

    Reduced representation genome sequencing such as restriction-site-associated DNA (RAD) sequencing is finding increased use to identify and genotype large numbers of single-nucleotide polymorphisms (SNPs) in model and nonmodel species. We generated a unique resource of novel SNP markers for the European eel using the RAD sequencing approach that was simultaneously identified and scored in a genome-wide scan of 30 individuals. Whereas genomic resources are increasingly becoming available for this species, including the recent release of a draft genome, no genome-wide set of SNP markers...

  11. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using whole genome re-sequencing approach. A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on the 35 chickpea genotypes, with an average sequencing depth of ~10X for each line. On average, 92.18% of reads from each genotype were aligned to the chickpea reference genome with 82.17% coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21%) followed by ICCV 97105 (12%). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  12. Genome-wide macrosynteny among Fusarium species in the Gibberella fujikuroi complex revealed by amplified fragment length polymorphisms.

    Science.gov (United States)

    De Vos, Lieschen; Steenkamp, Emma T; Martin, Simon H; Santana, Quentin C; Fourie, Gerda; van der Merwe, Nicolaas A; Wingfield, Michael J; Wingfield, Brenda D

    2014-01-01

    The Gibberella fujikuroi complex includes many Fusarium species that cause significant losses in yield and quality of agricultural and forestry crops. Due to their economic importance, whole-genome sequence information has rapidly become available for species including Fusarium circinatum, Fusarium fujikuroi and Fusarium verticillioides, each of which represent one of the three main clades known in this complex. However, no previous studies have explored the genomic commonalities and differences among these fungi. In this study, a previously completed genetic linkage map for an interspecific cross between Fusarium temperatum and F. circinatum, together with genomic sequence data, was utilized to consider the level of synteny between the three Fusarium genomes. Regions that are homologous amongst the Fusarium genomes examined were identified using in silico and pyrosequenced amplified fragment length polymorphism (AFLP) fragment analyses. Homology was determined using BLAST analysis of the sequences, with 777 homologous regions aligned to F. fujikuroi and F. verticillioides. This also made it possible to assign the linkage groups from the interspecific cross to their corresponding chromosomes in F. verticillioides and F. fujikuroi, as well as to assign two previously unmapped supercontigs of F. verticillioides to probable chromosomal locations. We further found evidence of a reciprocal translocation between the distal ends of chromosome 8 and 11, which apparently originated before the divergence of F. circinatum and F. temperatum. Overall, a remarkable level of macrosynteny was observed among the three Fusarium genomes, when comparing AFLP fragments. This study not only demonstrates how in silico AFLPs can aid in the integration of a genetic linkage map to the physical genome, but it also highlights the benefits of using this tool to study genomic synteny and architecture.

  13. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne’s disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of them synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylogenomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.

  14. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
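    The 22% figure can be checked against the counts quoted in the abstract above. The following is only a rough sketch: it assumes that the five enumerated variant classes make up "all events", and it leaves out segmental duplications and copy number variation regions because the abstract gives no counts for them. The dictionary keys and the function name are illustrative, not taken from the paper.

```python
# Variant counts as quoted in the abstract (events, not bases).
variant_counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}


def non_snp_fraction(counts):
    """Return the fraction of variant events that are not SNPs."""
    total = sum(counts.values())
    non_snp = total - counts["snp"]
    return non_snp / total


total_events = sum(variant_counts.values())
fraction = non_snp_fraction(variant_counts)

print(f"total events: {total_events:,}")
print(f"non-SNP share of events: {fraction:.1%}")
```

    Under these assumptions the enumerated classes sum to just over 4.1 million events, consistent with the total stated in the abstract.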

  15. Whole genome sequencing and comparative genomics of closely related Fusarium Head Blight fungi: Fusarium graminearum, F. meridionale and F. asiaticum.

    Science.gov (United States)

    Walkowiak, Sean; Rowland, Owen; Rodrigue, Nicolas; Subramaniam, Rajagopal

    2016-12-09

    The Fusarium graminearum species complex is composed of many distinct fungal species that cause several diseases in economically important crops, including Fusarium Head Blight of wheat. Despite being closely related, these species and individuals within species have distinct phenotypic differences in toxin production and pathogenicity, with some isolates reported as non-pathogenic on certain hosts. In this report, we compare genomes and gene content of six new isolates from the species complex, including the first available genomes of F. asiaticum and F. meridionale, with four other genomes reported in previous studies. A comparison of genome structure and gene content revealed a 93-99% overlap across all ten genomes. We identified more than 700 k base pairs (kb) of single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) within common regions of the genome, which validated the species and genetic populations reported within species. We constructed a non-redundant pan gene list containing 15,297 genes from the ten genomes and among them 1827 genes or 12% were absent in at least one genome. These genes were co-localized in telomeric regions and select regions within chromosomes with a corresponding increase in SNPs and indels. Many are also predicted to encode for proteins involved in secondary metabolism and other functions associated with disease. Genes that were common between isolates contained high levels of nucleotide variation and may be pseudogenes, allelic, or under diversifying selection. The genomic resources we have contributed will be useful for the identification of genes that contribute to the phenotypic variation and niche specialization that have been reported among members of the F. graminearum species complex.

  16. Recombination drives vertebrate genome contraction.

    Directory of Open Access Journals (Sweden)

    Kiwoong Nam

    Selective and/or neutral processes may govern variation in DNA content and, ultimately, genome size. The observation in several organisms of a negative correlation between recombination rate and intron size could be compatible with a neutral model in which recombination is mutagenic for length changes. We used whole-genome data on small insertions and deletions within transposable elements from chicken and zebra finch to demonstrate clear links between recombination rate and a number of attributes of reduced DNA content. Recombination rate was negatively correlated with the length of introns, transposable elements, and intergenic spacer and with the rate of short insertions. Importantly, it was positively correlated with gene density, the rate of short deletions, the deletion bias, and the net change in sequence length. All these observations point at a pattern of more condensed genome structure in regions of high recombination. Based on the observed rates of small insertions and deletions and assuming that these rates are representative for the whole genome, we estimate that the genome of the most recent common ancestor of birds and lizards has lost nearly 20% of its DNA content up until the present. Expansion of transposable elements can counteract the effect of deletions in an equilibrium mutation model; however, since the activity of transposable elements has been low in the avian lineage, the deletion bias is likely to have had a significant effect on genome size evolution in dinosaurs and birds, contributing to the maintenance of a small genome. We also demonstrate that most of the observed correlations between recombination rate and genome contraction parameters are seen in the human genome, including for segregating indel polymorphisms. Our data are compatible with a neutral model in which recombination drives vertebrate genome size evolution and give no direct support for a role of natural selection in this process.

  17. Genomic DNA enrichment using sequence capture microarrays: a novel approach to discover sequence nucleotide polymorphisms (SNP) in Brassica napus L.

    Directory of Open Access Journals (Sweden)

    Wayne E Clarke

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci -QTL- analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and, to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty-eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome, generating a novel data resource for research and breeding of this crop species.
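    The transition/transversion split reported above follows the standard definition: a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and any other single-base change is a transversion. Below is a minimal sketch of that classification. The function names and the five example SNPs are hypothetical and are not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref, alt):
    """Label a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_share(snps):
    """Return the fraction of (ref, alt) pairs that are transitions."""
    labels = [classify_snp(ref, alt) for ref, alt in snps]
    return labels.count("transition") / len(labels)


# Hypothetical example calls, chosen only to illustrate the bookkeeping.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(transition_share(example_snps))
```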

  18. Extensive sequence-influenced DNA methylation polymorphism in the human genome

    OpenAIRE

    Hellman Asaf; Chess Andrew

    2010-01-01

    Background: Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also int...

  19. A survey of genomic properties for the detection of regulatory polymorphisms.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2007-06-01

    Advances in the computational identification of functional noncoding polymorphisms will aid in cataloging novel determinants of health and identifying genetic variants that explain human evolution. To date, however, the development and evaluation of such techniques has been limited by the availability of known regulatory polymorphisms. We have attempted to address this by assembling, from the literature, a computationally tractable set of regulatory polymorphisms within the ORegAnno database (http://www.oreganno.org). We have further used 104 regulatory single-nucleotide polymorphisms from this set and 951 polymorphisms of unknown function, from 2-kb and 152-bp noncoding upstream regions of genes, to investigate the discriminatory potential of 23 properties related to gene regulation and population genetics. Among the most important properties detected in this region are distance to transcription start site, local repetitive content, sequence conservation, minor and derived allele frequencies, and presence of a CpG island. We further used the entire set of properties to evaluate their collective performance in detecting regulatory polymorphisms. Using a 10-fold cross-validation approach, we were able to achieve a sensitivity and specificity of 0.82 and 0.71, respectively, and we show that this performance is strongly influenced by the distance to the transcription start site.

  20. Genotyping of FCN and MBL2 Polymorphisms Using Pyrosequencing

    DEFF Research Database (Denmark)

    Munthe-Fog, Lea; Madsen, Hans Ole; Garred, Peter

    2014-01-01

    Pyrosequencing represents one of the most thorough methods used to analyze polymorphisms. One advantage of using pyrosequencing for genotyping is the ability to identify not only single-nucleotide polymorphisms (SNPs) but also tri-allelic variations, insertions and deletions (InDels). In contrast...

  1. Blast-Resistance Inheritance of Space-Induced Rice Lines and Their Genomic Polymorphism by Microsatellite Markers

    Institute of Scientific and Technical Information of China (English)

    XIAO Wu-ming; YANG Qi-yun; CHEN Zhi-qiang; WANG Hui; GUO Tao; LIU Yong-zhu; ZHU Xiao-yuan

    2009-01-01

    To understand the resistance inheritance basis of space-induced rice lines to blast, and to probe mutants' genomic DNA polymorphism compared with ground control by microsatellite markers, three space-induced lines were crossed with a highly susceptible variety LTH, and their F1 and F2 populations were inoculated by two representative blast isolates with broad pathogenicity to analyze their resistance inheritance basis. Meanwhile three mutant lines and the ground control were analyzed by 225 rice SSR (simple sequence repeat) primer pairs selected throughout the 12 chromosomes of whole rice genome, to scan the mutagenesis in genome of the mutant lines. The results indicated the blast-resistant genes harbored in these mutant lines were dominant. It was demonstrated that the resistance of mutant H1 to isolate GD0193 and GD3286 was controlled by a single gene, respectively; while mutants H2 and H3 were controlled by two pairs of major genes against isolate GD3286 and H2 showed complicated genetic mechanism to isolate GD0193. H3's resistance to isolate GD0193 was verified to be controlled by a single gene. According to the results of SSR analysis, three mutant lines showed different mutant rates as compared with the ground control, and the mutant rates also varied. Resistance genes can be induced from rice by space mutation, and different genomic variations were detected in blast-resistant lines.

  2. Systematic analysis of short internal indels and their impact on protein folding

    Directory of Open Access Journals (Sweden)

    Guo Jun-tao

    2010-08-01

    Background: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered the major means of expanding structural and functional diversity in eukaryotes. Knowledge of the structural changes due to indels is critical to our understanding of the evolution of protein structure and function. In addition, it can help us probe the evolution of alternative splicing and the diversity of functional isoforms. However, little is known about the effects of indels, in particular the ones involving core secondary structures, on the folding of protein structures. The long-term goal of our study is to accurately predict the protein AS isoform structures. As a first step towards this goal, we performed a systematic analysis of the structural changes caused by short internal indels through mining highly homologous proteins in the Protein Data Bank (PDB). Results: We compiled a non-redundant dataset of short internal indels (2-40 amino acids) from highly homologous protein pairs and analyzed the sequence and structural features of the indels. We found that about one third of indel residues are in a disordered state and the majority of the residues are exposed to solvent, suggesting that these indels are generally located on the surface of proteins. Though naturally occurring indels are fewer than engineered ones in the dataset, there are no statistically significant differences in terms of amino acid frequencies and secondary structure types between the "Natural" indels and "All" indels in the dataset. Structural comparisons show that all the protein pairs with short internal indels in the dataset preserve the structural folds and about 85% of protein pairs have global RMSDs (root mean square deviations) of 2 Å or less, suggesting that protein structures tend to be conserved and can tolerate short insertions and deletions. A few pairs

  3. Relative effects of mutability and selection on single nucleotide polymorphisms in transcribed regions of the human genome

    Directory of Open Access Journals (Sweden)

    Amos Christopher I

    2008-06-01

    Motivation: Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation in humans. However, the factors that affect SNP density are poorly understood. The goal of this study was to estimate the relative effects of mutability and selection on SNP density in transcribed regions of human genes. This is important for predicting the regions that harbor functional polymorphisms. Results: We used frequency-validated SNPs resulting from single-nucleotide substitutions. SNPs were subdivided into five functional categories: (i) 5' untranslated region (UTR) SNPs, (ii) 3' UTR SNPs, (iii) synonymous SNPs, (iv) SNPs producing conservative missense mutations, and (v) SNPs producing radical missense mutations. Each of these categories was further subdivided into nine mutational categories on the basis of the single-nucleotide substitution type. Thus, 45 functional/mutational categories were analyzed. The relative mutation rate in each mutational category was estimated on the basis of published data. The proportion of segregating sites (PSSs) for each functional/mutational category was estimated by dividing the observed number of SNPs by the number of potential sites in the genome for a given functional/mutational category. By analyzing each functional group separately, we found significant positive correlations between PSSs and relative mutation rates (Spearman's correlation coefficient, at least r = 0.96, df = 9, P P = 0.001, suggesting that selection affects SNP density in transcribed regions of the genome.
We used analyses of variance and covariance to estimate the relative effects of selection (functional category) and mutability (relative mutation rate) on the PSSs and found that approximately 87% of variation in PSS was due to variation in the mutation rate and approximately 13% was due to selection, suggesting that the probability that a site located in a transcribed region of a gene is polymorphic mostly depends on the mutability

  4. Whole-Genome Resequencing of a Cucumber Chromosome Segment Substitution Line and Its Recurrent Parent to Identify Candidate Genes Governing Powdery Mildew Resistance

    Science.gov (United States)

    Yu, Ting; Xu, Xuewen; Yan, Yali; Qi, Xiaohua; Chen, Xuehao

    2016-01-01

    Cucumber is an economically important vegetable crop worldwide. Powdery mildew (PM) is one of the most severe diseases that can affect cucumber crops. There have been several research efforts to isolate PM resistance genes for breeding PM-resistant cucumber. In the present study, we used a chromosome segment substitution line, SSL508-28, which carried PM resistance genes from the donor parent, JIN5-508, through twelve generations of backcrossing with a PM-susceptible inbred line, D8. We performed whole-genome resequencing of SSL508-28 and D8 to identify single nucleotide polymorphisms (SNPs), and insertions and deletions (indels). When compared against the reference genome of the inbred cucumber line 9930, a total of 468,616 SNPs and 67,259 indels were identified in SSL508-28, and 537,352 SNPs and 91,698 indels were identified in D8. Of these, 3,014 non-synonymous SNPs and 226 frameshift indels in SSL508-28, and 3,104 non-synonymous SNPs and 251 frameshift indels in D8, were identified. Bioinformatics analysis of these variations revealed a total of 15,682 SNPs and 6,262 indels between SSL508-28 and D8, among which 120 non-synonymous SNPs and 30 frameshift indels in 94 genes were detected. Finally, out of these 94 genes, five resistance genes with nucleotide-binding sites and leucine-rich repeat domains were selected for qRT-PCR analysis. This revealed an upregulation of two transcripts, Csa2M435460.1 and Csa5M579560.1, in SSL508-28. Furthermore, the results of qRT-PCR analysis of these two genes in ten PM-resistant and ten PM-susceptible cucumber lines showed that, when exposed to PM, Csa2M435460.1 and Csa5M579560.1 exhibited higher expression levels in resistant lines than in susceptible lines. This indicates that Csa2M435460.1 and Csa5M579560.1 are candidate genes for PM resistance in cucumber.
In addition, the non-synonymous SNPs in Csa2M435460.1 and Csa5M579560.1, identified in SSL508-28 and D8, might be the key to high PM-resistance in

  5. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  6. Genetic Diversity and Population Structure in Native Chicken Populations from Myanmar, Thailand and Laos by Using 102 Indels Markers

    Directory of Open Access Journals (Sweden)

    A. A. Maw

    2015-01-01

    The genetic diversity of native chicken populations from Myanmar, Thailand, and Laos was examined by using 102 insertion and/or deletion (indel) markers. Most of the indel loci were polymorphic (71% to 96%), and the genetic variability was similar in all populations. The average observed heterozygosities (HO) and expected heterozygosities (HE) ranged from 0.205 to 0.263 and 0.239 to 0.381, respectively. The coefficient of genetic differentiation (Gst) for all cumulated populations was 0.125, and the Thai native chickens showed higher Gst (0.088) than Myanmar (0.041) and Laotian (0.024) populations. The pairwise Fst distances ranged from 0.144 to 0.308 among populations. A neighbor-joining (NJ) tree, using Nei’s genetic distance, revealed that Thai and Laotian native chicken populations were genetically close, while Myanmar native chickens were distant from the others. The native chickens from these three countries were thought to be descended from three different origins (K = 3) from STRUCTURE analysis. Genetic admixture was observed in Thai and Laotian native chickens, while admixture was absent in Myanmar native chickens.
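    For readers unfamiliar with the HO and HE statistics quoted above, the sketch below computes both for a single biallelic indel locus. It assumes the textbook definitions (HO as the share of heterozygous individuals, HE as one minus the sum of squared allele frequencies, with no small-sample correction). The abstract does not say which estimator the authors used, and the genotypes here are invented for illustration ("I" for the insertion allele, "D" for the deletion allele).

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies (no sample-size correction)."""
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    n_alleles = sum(allele_counts.values())
    return 1.0 - sum((count / n_alleles) ** 2 for count in allele_counts.values())


# Hypothetical locus: 6 I/I, 2 I/D and 2 D/D individuals.
genotypes = [("I", "I")] * 6 + [("I", "D")] * 2 + [("D", "D")] * 2

ho = observed_heterozygosity(genotypes)
he = expected_heterozygosity(genotypes)
print(f"HO = {ho:.3f}, HE = {he:.3f}")
```

    In this made-up sample HO falls below HE, the same direction as the population averages reported in the abstract. Per-population values would be obtained by averaging such per-locus estimates over all markers.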

  7. Phylocomposer and phylodirector: analysis and visualization of transducer indel models.

    Science.gov (United States)

    Holmes, Ian

    2007-12-01

    Finite-state string transducers are probabilistic tools similar to Hidden Markov Models that can be systematically extended to large numbers of sequences related by indel and substitution processes on phylogenetic trees. The number of states in such models grows exponentially with the number of nodes in the tree, with the consequence that even quite small trees can be difficult to analyze or visualize. Here, we present two tools, phylocomposer and phylodirector, for working with string transducers. The former tool implements previously described composition algorithms for extending transducers to arbitrary tree topologies, while the latter generates short animations for arbitrary input alignments and phylogenetic trees, illustrating the state path through the composed transducer. Phylocomposer and phylodirector are freely available at http://biowiki.org/PhyloComposer and http://biowiki.org/PhyloDirector

  8. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    Vattipally B Sreenu; Pankaj Kumar; Javaregowda Nagaraju; Hampapathalu A Nagarajaram

    2007-01-01

    Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences with motifs of length 1–6 bp. They are scattered throughout the genomes of all the known organisms ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity due to microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites and one such mechanism is constituted by post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv) were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC-content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
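    As a rough illustration of what scanning a sequence for SSR tracts involves, the sketch below uses a regular expression with a backreference to find perfect tandem repeats of 1–6 bp motifs. It is not the method used in the study: the function name, the minimum of three repeat units and the example sequence are all assumptions made for this example, and imperfect repeats are ignored.

```python
import re


def find_ssrs(sequence, min_repeats=3, max_motif_len=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the shortest motif win, so 'AAAAAA' is
    reported as motif 'A' x 6 rather than 'AA' x 3.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    tracts = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, repeat_count))
    return tracts


# Hypothetical sequence containing an (AC)4 tract and an (A)5 tract.
example_sequence = "GGCTACACACACTTGAAAAAGC"
for start, motif, count in find_ssrs(example_sequence):
    print(start, motif, count)
```

    Tract density per kb could then be estimated by dividing the number of tracts found by the sequence length in kilobases, though the result depends heavily on the repeat threshold chosen.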

  9. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

    Science.gov (United States)

    Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

    2014-11-01

    Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  10. Genome-wide identification of R genes and exploitation of candidate RGA markers in rice

    Institute of Scientific and Technical Information of China (English)

    WANG Xusheng; WU Weiren; JIN Gulei; ZHU Jun

    2005-01-01

    By scanning the whole genomic sequence of japonica rice using 45 known plant disease resistance (R) genes, we identified 2119 resistance gene homologs or analogs (RGAs) and verified that RGAs are not randomly distributed but tend to cluster in the rice genome. The RGAs were classified into 21 families according to their functional domain based on a Hidden Markov model (HMM). By comparing the RGAs of japonica rice with the whole genomic sequence of indica rice, we found 702 RGAs allelic between the two subspecies and revealed that 671 (95.6%) of them have length differences (InDels) in their genomic sequences (including coding and non-coding regions) between the two subspecies, suggesting that RGAs are highly polymorphic between the two subspecies in rice. We also exploited 402 PCR-based and co-dominant candidate RGA markers by designing primer pairs on the regions flanking the InDels and validating them via e-PCR. The length differences of the candidate RGA markers between the two subspecies range from 1 to 742 bp, with an average of 10.26 bp. All related information of the RGAs is available from our web site (http://ibi.zju.edu.cn/RGAs/index.html).

  11. Drosophila genomes and the development of affordable molecular markers for species genotyping.

    Science.gov (United States)

    Minuk, Leigh; Civetta, Alberto

    2011-04-01

    The recent completion of genome sequencing of 12 species of Drosophila has provided a powerful resource for hypothesis testing, as well as the development of technical tools. Here we take advantage of genome sequence data from two closely related species of Drosophila, Drosophila simulans and Drosophila sechellia, to quickly identify candidate molecular markers for genotyping based on expected insertion or deletion (indel) differences between species. Out of 64 candidate molecular markers selected along the second and third chromosome of Drosophila, 51 molecular markers were validated using PCR and gel electrophoresis. We found that the 20% error rate was due to sequencing errors in the genome data, although we cannot rule out possible indel polymorphisms. The approach has the advantage of being affordable and quick, as it only requires the use of bioinformatics tools for predictions and a PCR and agarose gel based assay for validation. Moreover, the approach could be easily extended to a wide variety of taxa with the only limitation being the availability of complete or partial genome sequence data.
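    The 20% error rate quoted above follows directly from the marker counts in the abstract (64 candidates, 51 validated). A minimal check is sketched below; the function name and dictionary keys are illustrative only.

```python
def validation_summary(n_candidates, n_validated):
    """Return validated and failed fractions for a set of candidate markers."""
    n_failed = n_candidates - n_validated
    return {
        "validated_rate": n_validated / n_candidates,
        "failure_rate": n_failed / n_candidates,
    }


# Counts as stated in the abstract.
summary = validation_summary(n_candidates=64, n_validated=51)
print(f"validated: {summary['validated_rate']:.1%}")
print(f"failed:    {summary['failure_rate']:.1%}")
```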

  12. DNA Slippage Occurs at Microsatellite Loci without Minimal Threshold Length in Humans: A Comparative Genomic Approach

    Science.gov (United States)

    Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

    2010-01-01

    The dynamics of microsatellites, or short tandem repeats (STRs), are well documented for long, polymorphic loci, but much less is known for shorter ones. For example, the issue of a minimum threshold length for DNA slippage remains contentious. Model-fitting methods have generally concluded that slippage only occurs over a threshold length of about eight nucleotides, in contradiction with some direct observations of tandem duplications at shorter repeated sites. Using a comparative analysis of the human and chimpanzee genomes, we examined the mutation patterns at microsatellite loci with lengths as short as one period plus one nucleotide. We found that the rates of tandem insertions and deletions at microsatellite loci strongly deviated from background rates in other parts of the human genome and followed an exponential increase with STR size. More importantly, we detected no lower threshold length for slippage. The rate of tandem duplications at unrepeated sites was higher than expected from random insertions, providing evidence for genome-wide action of indel slippage (an alternative mechanism generating tandem repeats). The rate of point mutations adjacent to STRs did not differ from that estimated elsewhere in the genome, except around dinucleotide loci. Our results suggest that the emergence of STRs depends on DNA slippage, indel slippage, and point mutations. We also found that the dynamics of tandem insertions and deletions differed in both the rates and the sizes at which these mutations take place. We discuss these results in both evolutionary and mechanistic terms. PMID:20624737

  13. Analysis of Complete Nucleotide Sequences of 12 Gossypium Chloroplast Genomes: Origin and Evolution of Allotetraploids

    Science.gov (United States)

    Xu, Qin; Xiong, Guanjun; Li, Pengbo; He, Fei; Huang, Yi; Wang, Kunbo; Li, Zhaohu; Hua, Jinping

    2012-01-01

    Background: Cotton (Gossypium spp.) is a model system for the analysis of polyploidization. Although ascertaining the donor species of allotetraploid cotton has been intensively studied, sequence comparison of Gossypium chloroplast genomes is still of interest to understand the mechanisms underlying the evolution of Gossypium allotetraploids, while it is generally accepted that the parents were A- and D-genome containing species. Here we performed a comparative analysis of 13 Gossypium chloroplast genomes, twelve of which are presented here for the first time. Methodology/Principal Findings: The size of the 12 chloroplast genomes under study varied from 159,959 bp to 160,433 bp. The chromosomes were highly similar, having >98% sequence identity. They encoded the same set of 112 unique genes, which occurred in a uniform order with only slightly different boundary junctions. Divergence due to indels as well as substitutions was examined separately for genome, coding and noncoding sequences. The genome divergence was estimated as 0.374% to 0.583% between allotetraploid species and A-genome, and 0.159% to 0.454% within allotetraploids. Forty protein-coding genes were completely identical at the protein level, and 20 intergenic sequences were completely conserved. The 9 allotetraploids shared 5 insertions and 9 deletions in whole genome, and 7-bp substitutions in protein-coding genes. The phylogenetic tree confirmed a close relationship between allotetraploids and the ancestor of A-genome, and the allotetraploids were divided into four separate groups. Progenitor allotetraploid cotton originated 0.43–0.68 million years ago (MYA). Conclusion: Despite the high degree of conservation between the Gossypium chloroplast genomes, sequence variations among species could still be detected. Gossypium chloroplast genomes showed a preference for 5-bp indels, and 1–3-bp indels are mainly attributed to SSR polymorphisms. This study supports that the common ancestor of diploid A-genome species in

  14. Identification of a novel FGFRL1 MicroRNA target site polymorphism for bone mineral density in meta-analyses of genome-wide association studies

    NARCIS (Netherlands)

    T. Niu (Tianhua); N. Liu (Ning); M. Zhao (Ming); G. Xie (Guie); L. Zhang (Lei); J. Li (Jian); Y.-F. Pei (Yu-Fang); H. Shen (Hui); X. Fu (Xiaoying); H. He (Hao); S. Lu (Shan); X. Chen (Xiangding); L. Tan (Lijun); T.-L. Yang (Tie-Lin); Y. Guo (Yan); P.J. Leo (Paul); E.L. Duncan (Emma); J. Shen (Jie); Y.-F. Guo (Yan-fang); G.C. Nicholson (Geoffrey); R.L. Prince (Richard L.); J.A. Eisman (John); G. Jones (Graeme); P.N. Sambrook (Philip); X. Hu (Xiang); P.M. Das (Partha M.); Q. Tian (Qing); X.-Z. Zhu (Xue-Zhen); C.J. Papasian (Christopher J.); M.A. Brown (Matthew); A.G. Uitterlinden (André G.); Y.-P. Wang (Yu-Ping); S. Xiang (Shuanglin); H.-W. Deng

    2015-01-01

    MicroRNAs (miRNAs) are critical post-transcriptional regulators. Based on a previous genome-wide association (GWA) scan, we conducted a polymorphism in microRNAs' Target Sites (poly-miRTS)-centric multistage meta-analysis for lumbar spine (LS)-, total hip (HIP)-, and femoral neck (FN)-bo

  15. Small indels induced by CRISPR/Cas9 in the 5' region of microRNA lead to its depletion and Drosha processing retardance.

    Science.gov (United States)

    Jiang, Qian; Meng, Xing; Meng, Lingwei; Chang, Nannan; Xiong, Jingwei; Cao, Huiqing; Liang, Zicai

    2014-01-01

    MicroRNA knockout by genome editing technologies is promising. In order to extend the application of the technology and to investigate the function of a specific miRNA, we used CRISPR/Cas9 to deplete human miR-93 from a cluster by targeting its 5' region in HeLa cells. Various small indels were induced in the targeted region containing the Drosha processing site and seed sequences. Interestingly, we found that even a single nucleotide deletion led to complete knockout of the target miRNA with high specificity. Functional knockout was confirmed by phenotype analysis. Furthermore, de novo microRNAs were not found by RNA-seq. Nevertheless, expression of the pri-microRNAs was increased. When combined with structural analysis, the data indicated that biogenesis was impaired. Altogether, we showed that small indels in the 5' region of a microRNA result in sequence depletion as well as retardation of Drosha processing.

  16. Complete mitochondrial genome of the versicoloured emerald hummingbird Amazilia versicolor, a polymorphic species.

    Science.gov (United States)

    Prosdocimi, Francisco; Souto, Helena Magarinos; Ruschi, Piero Angeli; Furtado, Carolina; Jennings, W Bryan

    2016-09-01

    The genome of the versicoloured emerald hummingbird (Amazilia versicolor) was partially sequenced in one-sixth of an Illumina HiSeq lane. The mitochondrial genome was assembled using MIRA and MITObim software, yielding a circular molecule of 16,861 bp in length and deposited in GenBank under the accession number KF624601. The mitogenome contained 13 protein-coding genes, 22 transfer RNAs, 2 ribosomal RNAs and 1 non-coding control region. The molecule was assembled using 21,927 sequencing reads of 100 bp each, resulting in ∼130× coverage of uniformly distributed reads along the genome. This is the fourth mitochondrial genome described for this highly diverse family of birds and may benefit further phylogenetic, phylogeographic, population genetic and species delimitation studies of hummingbirds.
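
    The reported coverage can be reproduced from the figures in the abstract, assuming the usual definition of mean coverage as total sequenced bases divided by genome length (the function name below is illustrative, not from the paper):

```python
def mean_coverage(n_reads: int, read_len_bp: int, genome_len_bp: int) -> float:
    """Mean depth of coverage: total sequenced bases / genome length."""
    return n_reads * read_len_bp / genome_len_bp


# Figures quoted in the abstract: 21,927 reads of 100 bp on a 16,861-bp mitogenome.
coverage = mean_coverage(21_927, 100, 16_861)
print(f"about {coverage:.0f}x")
```

    The result rounds to 130, in line with the ∼130× stated above.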

  17. Genomic variations of Mycoplasma capricolum subsp capripneumoniae detected by amplified fragment length polymorphism (AFLP) analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Bolske, G.; Ahrens, Peter;

    2000-01-01

    The genetic diversity of Mycoplasma capricolum subsp. capripneumoniae strains based on determination of amplified fragment length polymorphisms (AFLP) is described. AFLP fingerprints of 38 strains derived from different countries in Africa and the Middle East consisted of over 100 bands in the size...... found by 16S rDNA analysis. The present data support previous observations regarding genetic homogeneity of M. capricolum subsp. capripneumoniae, and confirm the two evolutionary lines of descent found by analysis of 16S rRNA genes....

  18. Genomic structure and sequence polymorphism of E,E-alpha-farnesene synthase gene in apples (Malus domestica Borkh.)

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Primer pairs were designed to amplify the genomic DNA sequence of the alpha-farnesene synthase (AFS) gene by PCR. The PCR products were sequenced, spliced and compared to cDNA sequences in GenBank (accession No. AY182241). The genomic sequence and intron-exon organization of the AFS gene were thus obtained. The AFS genomic sequence has been registered in GenBank (accession No. DQ901739). It has 6 introns and 7 exons, encoding a protein of 576 amino acids. The sizes of the 6 introns were 108 bp, 113 bp, >1000 bp, 125 bp, 220 bp and 88 bp, and their phases were 0, 1, 2, 2, 0, 0, respectively. The numbers of deduced amino acids encoded by the 7 exons were 57, 89, 127, 73, 48, 83 and 99, respectively. The AFS protein contained three motifs: the RR(X8)W motif encoded by a sequence in exon 1, and the RxR motif and DDxxD motif encoded by two sequences in exon 4. After comparing the AFS genomic sequence (accession No. DQ901739) to the cDNA sequence (accession No. AY523409) in GenBank, it was found that there were 6 single-nucleotide polymorphisms between the two sequences, four of which caused mutations at the amino acid level. Interestingly, one amino acid mutation (291R→G) was found in the RxR motif, and further investigation is needed to determine whether the alpha-farnesene synthesis ability and superficial scald susceptibility of apples are influenced by this amino acid mutation and other mutations.

  19. Phylogeography and adaptation genetics of stickleback from the Haida Gwaii archipelago revealed using genome-wide single nucleotide polymorphism genotyping.

    Science.gov (United States)

    Deagle, Bruce E; Jones, Felicity C; Absher, Devin M; Kingsley, David M; Reimchen, Thomas E

    2013-04-01

    Threespine stickleback populations are model systems for studying adaptive evolution and the underlying genetics. In lakes on the Haida Gwaii archipelago (off western Canada), stickleback have undergone a remarkable local radiation and show phenotypic diversity matching that seen throughout the species distribution. To provide a historical context for this radiation, we surveyed genetic variation at >1000 single nucleotide polymorphism (SNP) loci in stickleback from over 100 populations. SNPs included markers evenly distributed throughout the genome and candidate SNPs tagging adaptive genomic regions. Based on evenly distributed SNPs, the phylogeographic pattern differs substantially from the disjunct pattern previously observed between two highly divergent mtDNA lineages. The SNP tree instead shows extensive within-watershed population clustering and different watersheds separated by short branches deep in the tree. These data are consistent with separate colonizations of most watersheds, despite underlying genetic connections between some independent drainages. This supports previous suppositions that morphological diversity observed between watersheds has been shaped independently, with populations exhibiting complete loss of lateral plates and giant size each occurring in several distinct clades. Throughout the archipelago, we see repeated selection of SNPs tagging candidate freshwater adaptive variants at several genomic regions differentiated between marine-freshwater populations on a global scale (e.g. EDA, Na/K ATPase). In estuarine sites, both marine and freshwater allelic variants were commonly detected. We also found typically marine alleles present in a few freshwater lakes, especially those with completely plated morphology. These results provide a general model for postglacial colonization of freshwater habitat by sticklebacks and illustrate the tremendous potential that genome-wide SNP data sets hold for resolving patterns and processes underlying recent

  20. Using the dog genome to find single nucleotide polymorphisms in red foxes and other distantly related members of the Canidae.

    Science.gov (United States)

    Sacks, Benjamin N; Louie, Susan

    2008-01-01

    Single nucleotide polymorphisms (SNPs) are the ideal marker for characterizing genomic variation but can be difficult to find in nonmodel species. We explored the usefulness of the dog genome for finding SNPs in distantly related nonmodel canids and evaluated so-ascertained SNPs. Using 40 primer pairs designed from randomly selected bacterial artificial chromosome clones from the dog genome, we successfully sequenced 80-88% of loci in a coyote (Canis latrans), grey fox (Urocyon cinereoargenteus), and red fox (Vulpes vulpes), which compared favourably to a 60% success rate for each species using 10 primer pairs conserved across mammals. Loci were minimally heterogeneous with respect to SNP density, which was similar, overall, in a discovery panel of nine red foxes to that previously reported for a panel of eight wolves (Canis lupus). Additionally, individual heterozygosity was similar across the three canids in this study. However, the proportion of SNP sites shared with the dog decreased with phylogenetic divergence, with no SNPs shared between red foxes and dogs. Density of interspecific SNPs increased approximately linearly with divergence time between species. Using red foxes from three populations, we estimated F(ST) based on each of 42 SNPs and 14 microsatellites and simulated null distributions conditioned on each marker type. Relative to SNPs, microsatellites systematically underestimated F(ST) and produced biased null distributions, indicating that SNPs are superior markers for these functions. By reconstituting the frequency spectrum of SNPs discovered in nine red foxes, we discovered an estimated 77-89% of all SNPs (within the region screened) present in North American red foxes. In sum, these findings indicate that information from the dog genome enables easy ascertainment of random and gene-linked SNPs throughout the Canidae and illustrate the value of SNPs in ecological and evolutionary genetics.

  1. Host genome polymorphisms and tuberculosis infection: What we have to say?

    Science.gov (United States)

    Khalilullah, Said Alfin; Harapan, Harapan; Hasan, Nabeeh A.; Winardi, Wira; Ichsan, Ichsan; Mulyadi, Mulyadi

    2015-01-01

    Several epidemiological studies suggest that host genetic factors play important roles in susceptibility, protection and progression of tuberculosis infection. Here we have reviewed the implications of some genetic polymorphisms in pathways related to tuberculosis susceptibility, severity and development. Large case-control studies examining single-nucleotide polymorphisms (SNPs) in genes have been performed in tuberculosis patients in some countries. Polymorphisms in natural resistance-associated macrophage protein 1 (NRAMP1), toll-like receptor 2 (TLR2), interleukin-6 (IL-6), tumor necrosis factor alpha (TNF-α), interleukin-1 receptor antagonist (IL-1RA), IL-10, vitamin D receptor (VDR), dendritic cell-specific ICAM-3-grabbing non-integrin (DC-SIGN), monocyte chemoattractant protein-1 (MCP-1), nucleotide-binding oligomerization domain 2 (NOD2), interferon-gamma (IFN-γ), inducible nitric oxide synthase (iNOS), mannose-binding lectin (MBL) and surfactant protein A (SP-A) have been reviewed. These genes have been variably associated with tuberculosis infection and there is strong evidence indicating that host genetic factors play critical roles in tuberculosis susceptibility, severity and development. PMID:26966339

  2. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with the length ≥ 300 bp. There were 235 contigs from the child genome of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified from our study demonstrated the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.

  3. Meta-analysis of genome-wide association studies identifies common susceptibility polymorphisms for colorectal and endometrial cancer near SH2B3 and TSHZ1.

    Science.gov (United States)

    Cheng, Timothy H T; Thompson, Deborah; Painter, Jodie; O'Mara, Tracy; Gorman, Maggie; Martin, Lynn; Palles, Claire; Jones, Angela; Buchanan, Daniel D; Ko Win, Aung; Hopper, John; Jenkins, Mark; Lindor, Noralane M; Newcomb, Polly A; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Giles, Graham G; Pharoah, Paul; Peto, Julian; Cox, Angela; Swerdlow, Anthony; Couch, Fergus; Cunningham, Julie M; Goode, Ellen L; Winham, Stacey J; Lambrechts, Diether; Fasching, Peter; Burwinkel, Barbara; Brenner, Hermann; Brauch, Hiltrud; Chang-Claude, Jenny; Salvesen, Helga B; Kristensen, Vessela; Darabi, Hatef; Li, Jingmei; Liu, Tao; Lindblom, Annika; Hall, Per; de Polanco, Magdalena Echeverry; Sans, Monica; Carracedo, Angel; Castellvi-Bel, Sergi; Rojas-Martinez, Augusto; Aguiar Jnr, Samuel; Teixeira, Manuel R; Dunning, Alison M; Dennis, Joe; Otton, Geoffrey; Proietto, Tony; Holliday, Elizabeth; Attia, John; Ashton, Katie; Scott, Rodney J; McEvoy, Mark; Dowdy, Sean C; Fridley, Brooke L; Werner, Henrica M J; Trovik, Jone; Njolstad, Tormund S; Tham, Emma; Mints, Miriam; Runnebaum, Ingo; Hillemanns, Peter; Dörk, Thilo; Amant, Frederic; Schrauwen, Stefanie; Hein, Alexander; Beckmann, Matthias W; Ekici, Arif; Czene, Kamila; Meindl, Alfons; Bolla, Manjeet K; Michailidou, Kyriaki; Tyrer, Jonathan P; Wang, Qin; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Annibali, Daniela; Depreeuw, Jeroen; Al-Tassan, Nada A; Harris, Rebecca; Meyer, Brian F; Whiffin, Nicola; Hosking, Fay J; Kinnersley, Ben; Farrington, Susan M; Timofeeva, Maria; Tenesa, Albert; Campbell, Harry; Haile, Robert W; Hodgson, Shirley; Carvajal-Carmona, Luis; Cheadle, Jeremy P; Easton, Douglas; Dunlop, Malcolm; Houlston, Richard; Spurdle, Amanda; Tomlinson, Ian

    2015-12-01

    High-risk mutations in several genes predispose to both colorectal cancer (CRC) and endometrial cancer (EC). We therefore hypothesised that some lower-risk genetic variants might also predispose to both CRC and EC. Using CRC and EC genome-wide association series, totalling 13,265 cancer cases and 40,245 controls, we found that the protective allele [G] at one previously-identified CRC polymorphism, rs2736100 near TERT, was associated with EC risk (odds ratio (OR) = 1.08, P = 0.000167); this polymorphism influences the risk of several other cancers. A further CRC polymorphism near TERC also showed evidence of association with EC (OR = 0.92; P = 0.03). Overall, however, there was no good evidence that the set of CRC polymorphisms was associated with EC risk, and neither of two previously-reported EC polymorphisms was associated with CRC risk. A combined analysis revealed one genome-wide significant polymorphism, rs3184504, on chromosome 12q24 (OR = 1.10, P = 7.23 × 10⁻⁹) with shared effects on CRC and EC risk. This polymorphism, a missense variant in the gene SH2B3, is also associated with haematological and autoimmune disorders, suggesting that it influences cancer risk through the immune response. Another polymorphism, rs12970291 near gene TSHZ1, was associated with both CRC and EC (OR = 1.26, P = 4.82 × 10⁻⁸), with the alleles showing opposite effects on the risks of the two cancers.

  4. Meta-analysis of genome-wide association studies identifies common susceptibility polymorphisms for colorectal and endometrial cancer near SH2B3 and TSHZ1

    Science.gov (United States)

    Cheng, Timothy HT; Thompson, Deborah; Painter, Jodie; O’Mara, Tracy; Gorman, Maggie; Martin, Lynn; Palles, Claire; Jones, Angela; Buchanan, Daniel D.; Ko Win, Aung; Hopper, John; Jenkins, Mark; Lindor, Noralane M.; Newcomb, Polly A.; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Giles, Graham G; Pharoah, Paul; Peto, Julian; Cox, Angela; Swerdlow, Anthony; Couch, Fergus; Cunningham, Julie M; Goode, Ellen L; Winham, Stacey J; Lambrechts, Diether; Fasching, Peter; Burwinkel, Barbara; Brenner, Hermann; Brauch, Hiltrud; Chang-Claude, Jenny; Salvesen, Helga B.; Kristensen, Vessela; Darabi, Hatef; Li, Jingmei; Liu, Tao; Lindblom, Annika; Hall, Per; de Polanco, Magdalena Echeverry; Sans, Monica; Carracedo, Angel; Castellvi-Bel, Sergi; Rojas-Martinez, Augusto; Aguiar Jnr, Samuel; Teixeira, Manuel R.; Dunning, Alison M; Dennis, Joe; Otton, Geoffrey; Proietto, Tony; Holliday, Elizabeth; Attia, John; Ashton, Katie; Scott, Rodney J; McEvoy, Mark; Dowdy, Sean C; Fridley, Brooke L; Werner, Henrica MJ; Trovik, Jone; Njolstad, Tormund S; Tham, Emma; Mints, Miriam; Runnebaum, Ingo; Hillemanns, Peter; Dörk, Thilo; Amant, Frederic; Schrauwen, Stefanie; Hein, Alexander; Beckmann, Matthias W; Ekici, Arif; Czene, Kamila; Meindl, Alfons; Bolla, Manjeet K; Michailidou, Kyriaki; Tyrer, Jonathan P; Wang, Qin; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Annibali, Daniela; Depreeuw, Jeroen; Al-Tassan, Nada A.; Harris, Rebecca; Meyer, Brian F.; Whiffin, Nicola; Hosking, Fay J; Kinnersley, Ben; Farrington, Susan M.; Timofeeva, Maria; Tenesa, Albert; Campbell, Harry; Haile, Robert W.; Hodgson, Shirley; Carvajal-Carmona, Luis; Cheadle, Jeremy P.; Easton, Douglas; Dunlop, Malcolm; Houlston, Richard; Spurdle, Amanda; Tomlinson, Ian

    2015-01-01

    High-risk mutations in several genes predispose to both colorectal cancer (CRC) and endometrial cancer (EC). We therefore hypothesised that some lower-risk genetic variants might also predispose to both CRC and EC. Using CRC and EC genome-wide association series, totalling 13,265 cancer cases and 40,245 controls, we found that the protective allele [G] at one previously-identified CRC polymorphism, rs2736100 near TERT, was associated with EC risk (odds ratio (OR) = 1.08, P = 0.000167); this polymorphism influences the risk of several other cancers. A further CRC polymorphism near TERC also showed evidence of association with EC (OR = 0.92; P = 0.03). Overall, however, there was no good evidence that the set of CRC polymorphisms was associated with EC risk, and neither of two previously-reported EC polymorphisms was associated with CRC risk. A combined analysis revealed one genome-wide significant polymorphism, rs3184504, on chromosome 12q24 (OR = 1.10, P = 7.23 × 10⁻⁹) with shared effects on CRC and EC risk. This polymorphism, a missense variant in the gene SH2B3, is also associated with haematological and autoimmune disorders, suggesting that it influences cancer risk through the immune response. Another polymorphism, rs12970291 near gene TSHZ1, was associated with both CRC and EC (OR = 1.26, P = 4.82 × 10⁻⁸), with the alleles showing opposite effects on the risks of the two cancers. PMID:26621817

  5. Developing market class specific InDel markers from next generation sequence data in Phaseolus vulgaris L.

    Directory of Open Access Journals (Sweden)

    Samira Mafi Moghaddam

    2014-05-01

    Next generation sequence data provides valuable information and tools for genetic and genomic research and offers new insights useful for marker development. This data is useful for the design of accurate and user-friendly molecular tools. Common bean (Phaseolus vulgaris L.) is a diverse crop in which separate domestication events happened in each gene pool, followed by race and market class diversification that has resulted in different morphological characteristics in each commercial market class. This has led to essentially independent breeding programs within each market class, which in turn has resulted in limited within-market-class sequence variation. Sequence data from selected genotypes of five bean market classes (pinto, black, navy, and light and dark red kidney) were used to develop InDel-based markers specific to each market class. Design of the InDel markers was conducted through a combination of assembly, alignment and primer design software using 1.6x to 5.1x coverage of Illumina GAII sequence data for each of the selected genotypes. The procedure we developed for primer design is faster, more accurate, less error prone, and higher throughput than manual design. All InDel markers are easy to run and score with no need for PCR optimization. A total of 2,687 InDel markers distributed across the genome were developed. To highlight their usefulness, they were employed to construct a phylogenetic tree and a genetic map, showing that InDel markers are reliable, simple, and accurate.

  6. Comparative Genomics of Rhodococcus equi Virulence Plasmids Indicates Host-Driven Evolution of the vap Pathogenicity Island.

    Science.gov (United States)

    MacArthur, Iain; Anastasi, Elisa; Alvarez, Sonsiray; Scortti, Mariela; Vázquez-Boland, José A

    2017-05-01

    The conjugative virulence plasmid is a key component of the Rhodococcus equi accessory genome essential for pathogenesis. Three host-associated virulence plasmid types have been identified: the equine pVAPA and porcine pVAPB circular variants, and the linear pVAPN found in bovine (ruminant) isolates. We recently characterized the R. equi pangenome (Anastasi E, et al. 2016. Pangenome and phylogenomic analysis of the pathogenic actinobacterium Rhodococcus equi. Genome Biol Evol. 8:3140-3148.) and we report here the comparative analysis of the virulence plasmid genomes. Plasmids within each host-associated type were highly similar despite their diverse origins. Variation was accounted for by scattered single nucleotide polymorphisms and short nucleotide indels, while larger indels, mostly in the plasticity region near the vap pathogenicity island (PAI), defined plasmid genomic subtypes. Only one of the plasmids analyzed, of pVAPN type, was exceptionally divergent due to accumulation of indels in the housekeeping backbone. Each host-associated plasmid type carried a unique PAI differing in vap gene complement, suggesting animal host-specific evolution of the vap multigene family. Complete conservation of the vap PAI was observed within each host-associated plasmid type. Both diversity of host-associated plasmid types and clonality of specific chromosomal-plasmid genomic type combinations were observed within the same R. equi phylogenomic subclade. Our data indicate that the overall strong conservation of the R. equi host-associated virulence plasmids is the combined result of host-driven selection, lateral transfer between strains, and geographical spread due to international livestock exchanges. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. An efficient way of finding good indel seeds for local homology search

    Institute of Scientific and Technical Information of China (English)

    CHEN Ke; ZHU QingXin; YANG Fan; TANG DongMing

    2009-01-01

    Designing good or optimal seeds is a key factor for local homology search in bioinformatics. Continuous seeds, used by the BLAST series of programs, have existed for nearly 20 years. Recently, spaced seeds, which were introduced by the PatternHunter program, were shown to be more sensitive and faster than continuous seeds at the same similarity level. However, spaced seeds have two main disadvantages: (i) they assume that only matches and mismatches occur within seed alignments, but not insertions and deletions (indels); (ii) calculating optimal spaced seeds is an NP-hard problem. The introduction of indel seeds solved the first problem, but the second becomes much harder because of its higher exponential level. In this paper, we introduce an efficient way of designing good (even optimal) indel seeds under the "indel overlap complexity" model, which can be calculated in polynomial time. We calculate indel seeds of weight 11 to 15. The results show that indel seeds have higher sensitivities than spaced ones and that our algorithm finds good indel seeds very quickly.
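
    The difference between continuous and spaced seeds can be illustrated with a small sketch. It assumes the common notation in which a seed is a string of '1' (position must match) and '0' (position is ignored), with the weight being the number of '1' characters. The function name and the toy sequences are hypothetical, the sequences are taken as already aligned without gaps, and the sketch does not implement indel seeds or the seed-design algorithm described in the abstract.

```python
def seed_hits(a: str, b: str, seed: str) -> list[int]:
    """Start positions where every '1' position of the seed is a match.

    a and b are two equal-length sequences aligned without gaps;
    '0' positions in the seed are "don't care" and may mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    care = [i for i, c in enumerate(seed) if c == "1"]
    hits = []
    for pos in range(len(a) - len(seed) + 1):
        if all(a[pos + i] == b[pos + i] for i in care):
            hits.append(pos)
    return hits


# Toy aligned pair with a single mismatch at index 2 (G vs T).
a = "ACGTAC"
b = "ACTTAC"

# Both seeds have weight 3; only the spaced seed can span the mismatch.
print(seed_hits(a, b, "111"))   # continuous seed
print(seed_hits(a, b, "1101"))  # spaced seed
```

    In this toy pair the continuous seed only hits in the window after the mismatch, whereas the spaced seed also hits at position 0 because its '0' position falls on the mismatched base.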

  8. Single nucleotide polymorphisms and linkage disequilibrium in sunflower.

    Science.gov (United States)

    Kolkman, Judith M; Berry, Simon T; Leon, Alberto J; Slabaugh, Mary B; Tang, Shunxue; Gao, Wenxiang; Shintani, David K; Burke, John M; Knapp, Steven J

    2007-09-01

    Genetic diversity in modern sunflower (Helianthus annuus L.) cultivars (elite oilseed inbred lines) has been shaped by domestication and breeding bottlenecks and by wild and exotic allele introgression, the former narrowing and the latter broadening genetic diversity. To assess single nucleotide polymorphism (SNP) frequencies, nucleotide diversity, and linkage disequilibrium (LD) in modern cultivars, alleles were resequenced from 81 genic loci distributed throughout the sunflower genome. DNA polymorphisms were abundant; 1078 SNPs (1/45.7 bp) and 178 insertions-deletions (INDELs) (1/277.0 bp) were identified in 49.4 kbp of DNA/genotype. SNPs were twofold more frequent in noncoding (1/32.1 bp) than coding (1/62.8 bp) sequences. Nucleotide diversity was only slightly lower in inbred lines (θ = 0.0094) than wild populations (θ = 0.0128). Mean haplotype diversity was 0.74. When extrapolated across the genome (approximately 3500 Mbp), sunflower was predicted to harbor at least 76.4 million common SNPs among modern cultivar alleles. LD decayed more slowly in inbred lines than wild populations (mean LD declined to 0.32 by 5.5 kbp in the former, the maximum physical distance surveyed), a difference attributed to domestication and breeding bottlenecks. SNP frequencies and LD decay are sufficient in modern sunflower cultivars for very high-density genetic mapping and high-resolution association mapping.

  9. Single Nucleotide Polymorphisms and Linkage Disequilibrium in Sunflower

    Science.gov (United States)

    Kolkman, Judith M.; Berry, Simon T.; Leon, Alberto J.; Slabaugh, Mary B.; Tang, Shunxue; Gao, Wenxiang; Shintani, David K.; Burke, John M.; Knapp, Steven J.

    2007-01-01

    Genetic diversity in modern sunflower (Helianthus annuus L.) cultivars (elite oilseed inbred lines) has been shaped by domestication and breeding bottlenecks and by wild and exotic allele introgression, the former narrowing and the latter broadening genetic diversity. To assess single nucleotide polymorphism (SNP) frequencies, nucleotide diversity, and linkage disequilibrium (LD) in modern cultivars, alleles were resequenced from 81 genic loci distributed throughout the sunflower genome. DNA polymorphisms were abundant; 1078 SNPs (1/45.7 bp) and 178 insertions-deletions (INDELs) (1/277.0 bp) were identified in 49.4 kbp of DNA/genotype. SNPs were twofold more frequent in noncoding (1/32.1 bp) than coding (1/62.8 bp) sequences. Nucleotide diversity was only slightly lower in inbred lines (θ = 0.0094) than wild populations (θ = 0.0128). Mean haplotype diversity was 0.74. When extrapolated across the genome (∼3500 Mbp), sunflower was predicted to harbor at least 76.4 million common SNPs among modern cultivar alleles. LD decayed more slowly in inbred lines than wild populations (mean LD declined to 0.32 by 5.5 kbp in the former, the maximum physical distance surveyed), a difference attributed to domestication and breeding bottlenecks. SNP frequencies and LD decay are sufficient in modern sunflower cultivars for very high-density genetic mapping and high-resolution association mapping. PMID:17660563

  10. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics

    Directory of Open Access Journals (Sweden)

    Zhang Dapeng

    2012-06-01

    Background: Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis. Results: Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX), and the poorly characterized “Photorhabdus virulence cassettes (PVC)”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltransferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere from a single to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. Gene-neighborhood-analysis of

  11. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics

    Science.gov (United States)

    2012-01-01

    Background: Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis. Results: Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX), and the poorly characterized “Photorhabdus virulence cassettes (PVC)”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltransferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere from a single to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. Gene-neighborhood-analysis of polymorphic toxin systems

  12. A genome-wide survey reveals a deletion polymorphism associated with resistance to gastrointestinal nematodes in Angus cattle.

    Science.gov (United States)

    Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Van Tassell, Curtis P; Sonstegard, Tad S; Liu, George E

    2014-06-01

    Gastrointestinal (GI) nematode infections are a worldwide threat to human health and animal production. In this study, we performed a genome-wide association study between copy number variations (CNVs) and resistance to GI nematodes in an Angus cattle population. Using a linear regression analysis, we identified one deletion CNV which reaches genome-wide significance after Bonferroni correction. With multiple mapped human olfactory receptor genes but no annotated bovine genes in the region, this significantly associated CNV displays a high population frequency (58.26%) with a length of 104.8 kb on chr7. We further investigated the linkage disequilibrium (LD) relationships between this CNV and its nearby single nucleotide polymorphisms (SNPs) and genes. The underlying haplotype blocks contain immune-related genes such as ZNF496 and NLRP3. As this CNV co-segregates with linked SNPs and associated genes, we suspect that it could contribute to the detected variations in gene expression and thus differences in host parasite resistance.

  13. Genome-wide association mapping for wood characteristics in Populus identifies an array of candidate single nucleotide polymorphisms.

    Science.gov (United States)

    Porth, Ilga; Klapšte, Jaroslav; Skyba, Oleksandr; Hannemann, Jan; McKown, Athena D; Guy, Robert D; DiFazio, Stephen P; Muchero, Wellington; Ranjan, Priya; Tuskan, Gerald A; Friedmann, Michael C; Ehlting, Juergen; Cronk, Quentin C B; El-Kassaby, Yousry A; Douglas, Carl J; Mansfield, Shawn D

    2013-11-01

    Establishing links between phenotypes and molecular variants is of central importance to accelerate genetic improvement of economically important plant species. Our work represents the first genome-wide association study of the inherently complex and currently poorly understood genetic architecture of industrially relevant wood traits. Here, we employed an Illumina Infinium 34K single nucleotide polymorphism (SNP) genotyping array that generated 29,233 high-quality SNPs in c. 3500 broad-based candidate genes within a population of 334 unrelated Populus trichocarpa individuals to establish genome-wide associations. The analysis revealed 141 significant SNPs (α ≤ 0.05) associated with 16 wood chemistry/ultrastructure traits, individually explaining 3-7% of the phenotypic variance. A large set of associations (41% of all hits) occurred in candidate genes preselected for their suggested a priori involvement with secondary growth. For example, an allelic variant in the FRA8 ortholog explained 21% of the total genetic variance in fiber length, when the trait's heritability estimate was considered. The remaining associations identified SNPs in genes not previously implicated in wood or secondary wall formation. Our findings provide unique insights into wood trait architecture and support efforts for population improvement based on desirable allelic variants.

  14. Receptor Polymorphism and Genomic Structure Interact to Shape Bitter Taste Perception.

    Science.gov (United States)

    Roudnitzky, Natacha; Behrens, Maik; Engel, Anika; Kohl, Susann; Thalmann, Sophie; Hübner, Sandra; Lossow, Kristina; Wooding, Stephen P; Meyerhof, Wolfgang

    2015-01-01

    The ability to taste bitterness evolved to safeguard most animals, including humans, against potentially toxic substances, thereby leading to food rejection. Nonetheless, bitter perception is subject to individual variations due to the presence of genetic functional polymorphisms in bitter taste receptor (TAS2R) genes, such as the long-known association between genetic polymorphisms in TAS2R38 and bitter taste perception of phenylthiocarbamide. Yet, due to overlaps in specificities across receptors, such associations with a single TAS2R locus are uncommon. Therefore, to investigate more complex associations, we examined taste responses to six structurally diverse compounds (absinthin, amarogentin, cascarillin, grosheimin, quassin, and quinine) in a sample of the Caucasian population. By sequencing all bitter receptor loci, inferring long-range haplotypes, mapping their effects on phenotype variation, and characterizing functionally causal allelic variants, we deciphered at the molecular level how a subject's genotype for the whole family of TAS2R genes shapes variation in bitter taste perception. Within each haplotype block implicated in phenotypic variation, we provided evidence for at least one locus harboring functional polymorphic alleles, e.g. one locus for sensitivity to amarogentin, one of the most bitter natural compounds known, and two loci for sensitivity to grosheimin, one of the bitter compounds of artichoke. Besides simple associations, our analyses also revealed complex associations of bitterness sensitivity across TAS2R loci. Indeed, even if several putative loci harbored both high- and low-sensitivity alleles, phenotypic variation depended on linkage between these alleles. When sensitive alleles for bitter compounds were maintained in the same linkage phase, genetically driven perceptual differences were obvious, e.g. for grosheimin. On the contrary, when sensitive alleles were in opposite phase, only weak genotype-phenotype associations were seen

  15. Bayesian phylogeny analysis of vertebrate serpins illustrates evolutionary conservation of the intron and indels based six groups classification system from lampreys for ∼500 MY

    Directory of Open Access Journals (Sweden)

    Abhishek Kumar

    2015-06-01

    The serpin superfamily is characterized by proteins that fold into a conserved tertiary structure and exploit a sophisticated and irreversible suicide-mechanism of inhibition. Vertebrate serpins are classified into six groups (V1–V6), based on three independent biological features: genomic organization, diagnostic amino acid sites and rare indels. However, this classification system was based on the limited number of mammalian genomes available. In this study, several non-mammalian genomes are used to validate this classification system using the powerful Bayesian phylogenetic method. This method supports the intron and indel based vertebrate classification and proves that serpins have been maintained from lampreys to humans for about 500 MY. Lampreys have fewer than 10 serpins, which expand into 36 serpins in humans. The two expanding groups V1 and V2 have SERPINB1/SERPINB6 and SERPINA8/SERPIND1 as the ancestral serpins, respectively. Large clusters of serpins are formed by local duplications of these serpins in tetrapod genomes. Interestingly, the ancestral HCII/SERPIND1 locus (nested within PIK4CA) possesses a group V4 serpin (A2APL1, homolog of α2-AP/SERPINF2) of lampreys, which suggests that group V4 might have originated from group V2. Additionally in this study, details of the phylogenetic history and genomic characteristics of vertebrate serpins are revisited.

  16. Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers

    DEFF Research Database (Denmark)

    Su, Guosheng; Christensen, Ole Fredslund; Ostersen, Tage;

    2012-01-01

    Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in human, wild life, model organisms or farm animals. However, non-additive genetic effects may have an important contribution to total...... genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relation matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP) markers. In addition...... of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear models were used: 1) a simple additive genetic model (MA), 2) a model including both additive and additive by additive epistatic genetic effects (MAE), 3) a model including both additive and dominance genetic effects...
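
    The truncated abstract does not say how the additive relationship matrix was built from the SNP markers. One widely used construction, assumed here purely for illustration, is G = ZZ' / (2 Σ p(1 − p)), where Z holds allele counts centered by twice the allele frequency. The sketch below covers only the additive matrix, uses made-up toy genotypes, and is not the authors' implementation.

```python
from typing import List


def additive_relationship_matrix(genotypes: List[List[int]]) -> List[List[float]]:
    """Build G = Z Z' / (2 * sum(p * (1 - p))) from 0/1/2 allele counts.

    genotypes[i][k] is the number of copies of the counted allele that
    individual i carries at SNP k.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    # Frequency of the counted allele at each SNP, estimated from the sample.
    freqs = [sum(row[k] for row in genotypes) / (2 * n_ind) for k in range(n_snp)]
    # Center each genotype by its expected value 2p.
    z = [[row[k] - 2 * freqs[k] for k in range(n_snp)] for row in genotypes]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic; G is undefined")
    return [
        [sum(z[i][k] * z[j][k] for k in range(n_snp)) / scale for j in range(n_ind)]
        for i in range(n_ind)
    ]


# Hypothetical toy data: 3 individuals genotyped at 4 SNPs.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = additive_relationship_matrix(toy_genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

    Because the allele frequencies are estimated from the same individuals, each row of G sums to zero, which is a convenient sanity check on the centering step.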

  17. Genomic impact of CRISPR immunization against bacteriophages.

    Science.gov (United States)

    Barrangou, Rodolphe; Coûté-Monvoisin, Anne-Claire; Stahl, Buffy; Chavichvily, Isabelle; Damange, Florian; Romero, Dennis A; Boyaval, Patrick; Fremaux, Christophe; Horvath, Philippe

    2013-12-01

    CRISPR (clustered regularly interspaced short palindromic repeats) together with CAS (CRISPR-associated) genes form the CRISPR-Cas immune system, which provides sequence-specific adaptive immunity against foreign genetic elements in bacteria and archaea. Immunity is acquired by the integration of short stretches of invasive DNA as novel 'spacers' into CRISPR loci. Subsequently, these immune markers are transcribed and generate small non-coding interfering RNAs that specifically guide nucleases for sequence-specific cleavage of complementary sequences. Among the four CRISPR-Cas systems present in Streptococcus thermophilus, CRISPR1 and CRISPR3 have the ability to readily acquire new spacers following bacteriophage or plasmid exposure. In order to investigate the impact of building CRISPR-encoded immunity on the host chromosome, we determined the genome sequence of a BIM (bacteriophage-insensitive mutant) derived from the DGCC7710 model organism, after four consecutive rounds of bacteriophage challenge. As expected, active CRISPR loci evolved via polarized addition of several novel spacers following exposure to bacteriophages. Although analysis of the draft genome sequence revealed a variety of SNPs (single nucleotide polymorphisms) and INDELs (insertions/deletions), most of the in silico differences were not validated by Sanger re-sequencing. In addition, two SNPs and two small INDELs were identified and tracked in the intermediate variants. Overall, building CRISPR-encoded immunity does not significantly affect the genome, which allows the maintenance of important functional properties in isogenic CRISPR mutants. This is critical for the development and formulation of sustainable and robust next-generation starter cultures with increased industrial lifespans.

  18. Genomic diversity among Danish field strains of Mycoplasma hyosynoviae assessed by amplified fragment length polymorphism analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, Niels F.; Nielsen, Elisabeth O.;

    2002-01-01

    ) were concurrently examined for variance in BglII-MfeI and EcoRI-Csp6I-A AFLP markers. A total of 56 different genomic fingerprints having an overall similarity between 77 and 96% were detected. No correlation between AFLP variability and period of isolation or anatomical site of isolation could...

  19. Genome relationships among Lotus species based on random amplified polymorphic DNA (RAPD).

    Science.gov (United States)

    Campos, L P; Raelson, J V; Grant, W F

    1994-06-01

    The ability of random amplified polymorphic DNA (RAPD) to distinguish among different taxa of Lotus was evaluated for several geographically dispersed accessions of four diploid Lotus species, L. tenuis Waldst. et Kit., L. alpinus Schleich., L. japonicus (Regel) Larsen, and L. uliginosus Schkuhr, and for the tetraploid L. corniculatus L., in order to ascertain whether RAPD data could offer additional evidence concerning the origin of the tetraploid L. corniculatus. Clear bands and several polymorphisms were obtained for 20 primers used for each species/accession. The evolutionary pathways among the species/accessions presented in a cladogram were expressed in terms of treelengths giving the most parsimonious reconstructions. Accessions within the same species grouped closely together. It is considered that L. uliginosus, which is most distantly related to L. corniculatus, may be excluded as a direct progenitor of L. corniculatus, confirming previous results from isoenzyme studies. Lotus alpinus is grouped with accessions of L. corniculatus, which differs from previous studies. With this exception, these findings are in agreement with previous experimental studies in the L. corniculatus group. The value of the RAPD data to theories on the origin of L. corniculatus is discussed.

  20. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Science.gov (United States)

    Khajuria, Yash Paul; Saxena, Maneesha S; Gaur, Rashmi; Chattopadhyay, Debasis; Jain, Mukesh; Parida, Swarup K; Bhatia, Sabhyata

    2015-01-01

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in-silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. This was the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped highest

  1. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea.

    Directory of Open Access Journals (Sweden)

    Yash Paul Khajuria

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in-silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7-23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. This was the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped

  2. Development and Integration of Genome-Wide Polymorphic Microsatellite Markers onto a Reference Linkage Map for Constructing a High-Density Genetic Map of Chickpea

    Science.gov (United States)

    Gaur, Rashmi; Chattopadhyay, Debasis; Jain, Mukesh; Parida, Swarup K.; Bhatia, Sabhyata

    2015-01-01

    The identification of informative in silico polymorphic genomic and genic microsatellite markers by comparing the genome and transcriptome sequences of crop genotypes is a rapid, cost-effective and non-laborious approach for large-scale marker validation and genotyping applications, including construction of high-density genetic maps. We designed 1494 markers, including 1016 genomic and 478 transcript-derived microsatellite markers showing in-silico fragment length polymorphism between two parental genotypes (Cicer arietinum ICC4958 and C. reticulatum PI489777) of an inter-specific reference mapping population. High amplification efficiency (87%), experimental validation success rate (81%) and polymorphic potential (55%) of these microsatellite markers suggest their effective use in various applications of chickpea genetics and breeding. Intra-specific polymorphic potential (48%) detected by microsatellite markers in 22 desi and kabuli chickpea genotypes was lower than inter-specific polymorphic potential (59%). An advanced, high-density, integrated and inter-specific chickpea genetic map (ICC4958 x PI489777) having 1697 map positions spanning 1061.16 cM with an average inter-marker distance of 0.625 cM was constructed by assigning 634 novel informative transcript-derived and genomic microsatellite markers on eight linkage groups (LGs) of our prior documented, 1063 marker-based genetic map. The constructed genome map identified 88, including four major (7–23 cM) longest high-resolution genomic regions on LGs 3, 5 and 8, where the maximum number of novel genomic and genic microsatellite markers were specifically clustered within 1 cM genetic distance. This was the first time in chickpea that in silico FLP analysis at genome-wide level was carried out and such a large number of microsatellite markers were identified, experimentally validated and further used in genetic mapping. To the best of our knowledge, in the presently constructed genetic map, we mapped highest
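
    The map statistics quoted in this abstract are internally consistent, as the short check below shows. It assumes the average inter-marker distance was obtained by dividing the total map length by the number of map positions; this reproduces the quoted value, but the abstract does not spell out the calculation.

```python
# Figures quoted in the abstract above.
PRIOR_MARKERS = 1063        # markers on the previously documented map
NEW_MARKERS = 634           # novel microsatellite markers assigned
MAP_LENGTH_CM = 1061.16     # total map length in centimorgans

total_positions = PRIOR_MARKERS + NEW_MARKERS
# Assumed definition: map length divided by the number of map positions.
avg_distance_cm = MAP_LENGTH_CM / total_positions

print(f"map positions: {total_positions}")
print(f"average inter-marker distance: {avg_distance_cm:.3f} cM")
```

    The sum gives the 1697 map positions reported, and the ratio rounds to the reported 0.625 cM.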

  3. MAF45, a highly polymorphic marker for the pseudoautosomal region of the sheep genome, is not linked to the FecXI (Inverdale) gene.

    Science.gov (United States)

    Swarbrick, P A; Schmack, A E; Crawford, A M

    1992-07-01

    A highly polymorphic dinucleotide repeat, or microsatellite, that shows partial sex-linked inheritance in sheep has been isolated from the sheep genome. Our data indicate that the locus is in the pseudoautosomal region approximately 13 cM from the boundary with the sex-linked regions. The locus, designated MAF45, has 12 alleles with a PIC of 0.84. The same primers amplify a single polymorphic locus in cattle and goats. This locus was not linked to the Inverdale gene, an X-linked gene that increases the ovulation rate in sheep.
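
    PIC (polymorphism information content) is conventionally computed from allele frequencies as PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ². The abstract does not report the MAF45 allele frequencies, so the sketch below uses hypothetical equal frequencies only to show the calculation.

```python
from typing import Sequence


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross_term += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross_term


# Hypothetical case: 12 alleles at equal frequency (not the observed data).
equal_12 = [1 / 12] * 12
print(f"PIC for 12 equally frequent alleles: {pic(equal_12):.3f}")
```

    Twelve equally frequent alleles give a PIC of about 0.91, so the reported 0.84 is consistent with 12 alleles at unequal frequencies.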

  4. Receptor Polymorphism and Genomic Structure Interact to Shape Bitter Taste Perception.

    Directory of Open Access Journals (Sweden)

    Natacha Roudnitzky

    The ability to taste bitterness evolved to safeguard most animals, including humans, against potentially toxic substances, thereby leading to food rejection. Nonetheless, bitter perception is subject to individual variations due to the presence of genetic functional polymorphisms in bitter taste receptor (TAS2R) genes, such as the long-known association between genetic polymorphisms in TAS2R38 and bitter taste perception of phenylthiocarbamide. Yet, due to overlaps in specificities across receptors, such associations with a single TAS2R locus are uncommon. Therefore, to investigate more complex associations, we examined taste responses to six structurally diverse compounds (absinthin, amarogentin, cascarillin, grosheimin, quassin, and quinine) in a sample of the Caucasian population. By sequencing all bitter receptor loci, inferring long-range haplotypes, mapping their effects on phenotype variation, and characterizing functionally causal allelic variants, we deciphered at the molecular level how a subject's genotype for the whole family of TAS2R genes shapes variation in bitter taste perception. Within each haplotype block implicated in phenotypic variation, we provided evidence for at least one locus harboring functional polymorphic alleles, e.g. one locus for sensitivity to amarogentin, one of the most bitter natural compounds known, and two loci for sensitivity to grosheimin, one of the bitter compounds of artichoke. Besides simple associations, our analyses also revealed complex associations of bitterness sensitivity across TAS2R loci. Indeed, even if several putative loci harbored both high- and low-sensitivity alleles, phenotypic variation depended on linkage between these alleles. When sensitive alleles for bitter compounds were maintained in the same linkage phase, genetically driven perceptual differences were obvious, e.g. for grosheimin. On the contrary, when sensitive alleles were in opposite phase, only weak genotype

  5. Comparative mapping of Brassica juncea and Arabidopsis thaliana using Intron Polymorphism (IP) markers: homoeologous relationships, diversification and evolution of the A, B and C Brassica genomes

    Directory of Open Access Journals (Sweden)

    Gupta Vibha

    2008-03-01

    Background: Extensive mapping efforts are currently underway for the establishment of comparative genomics between the model plant, Arabidopsis thaliana and various Brassica species. Most of these studies have deployed RFLP markers, the use of which is a laborious and time-consuming process. We therefore tested the efficacy of PCR-based Intron Polymorphism (IP) markers to analyze genome-wide synteny between the oilseed crop, Brassica juncea (AABB genome) and A. thaliana and analyzed the arrangement of 24 (previously described) genomic block segments in the A, B and C Brassica genomes to study the evolutionary events contributing to karyotype variations in the three diploid Brassica genomes. Results: IP markers were highly efficient and generated easily discernible polymorphisms on agarose gels. Comparative analysis of the segmental organization of the A and B genomes of B. juncea (present study) with the A and B genomes of B. napus and B. nigra respectively (described earlier), revealed a high degree of colinearity suggesting minimal macro-level changes after polyploidization. The ancestral block arrangements that remained unaltered during evolution and the karyotype rearrangements that originated in the Oleracea lineage after its divergence from Rapa lineage were identified. Genomic rearrangements leading to the gain or loss of one chromosome each between the A-B and A-C lineages were deciphered. Complete homoeology in terms of block organization was found between three linkage groups (LG) each for the A-B and A-C genomes. Based on the homoeology shared between the A, B and C genomes, a new nomenclature for the B genome LGs was assigned to establish uniformity in the international Brassica LG nomenclature code. Conclusion: IP markers were highly effective in generating comparative relationships between Arabidopsis and various Brassica species. Comparative genomics between the three Brassica lineages established the major rearrangements

  6. Genome relationship among nine species of Millettieae (Leguminosae: Papilionoideae) based on random amplified polymorphic DNA (RAPD).

    Science.gov (United States)

    Acharya, Laxmikanta; Mukherjee, Arup Kumar; Panda, Pratap Chandra

    2004-01-01

    Random amplified polymorphic DNA (RAPD) markers were used to establish intergeneric classification and phylogeny of the tribe Millettieae sensu Geesink (1984) (Leguminosae: Papilionoideae) and to assess genetic relationships among 9 constituent species belonging to 5 traditionally recognized genera under the tribe. DNA from pooled leaf samples was isolated and RAPD analysis performed using 25 decamer primers. The genetic similarities were derived from the dendrogram constructed by the pooled RAPD data using a similarity index, which supported clear grouping of species under their respective genera, inter- and intra-generic classification and phylogeny and also merger of Pongamia with Millettia. Elevation of Tephrosia purpurea var. pumila to the rank of a species (T. pumila) based on morphological characteristics is also supported through this study of molecular markers.
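
    The abstract does not name the similarity index applied to the pooled RAPD data. As an illustration of how such an index works on band presence/absence profiles, the sketch below uses the Jaccard coefficient, which is an assumption and not necessarily the index the authors used. The taxa and band labels are hypothetical.

```python
from typing import Set


def jaccard_similarity(bands_a: Set[str], bands_b: Set[str]) -> float:
    """Shared bands divided by all bands scored in either profile."""
    union = bands_a | bands_b
    if not union:
        raise ValueError("at least one profile must contain a band")
    return len(bands_a & bands_b) / len(union)


# Hypothetical band profiles, labelled "<primer>-<fragment size in bp>".
taxon_a = {"P1-400", "P1-650", "P2-300", "P3-900"}
taxon_b = {"P1-400", "P2-300", "P3-900", "P3-1200"}
taxon_c = {"P1-650", "P4-500"}

print(f"A vs B: {jaccard_similarity(taxon_a, taxon_b):.2f}")
print(f"A vs C: {jaccard_similarity(taxon_a, taxon_c):.2f}")
print(f"B vs C: {jaccard_similarity(taxon_b, taxon_c):.2f}")
```

    A pairwise matrix of such values is the usual input to a clustering method when a dendrogram is built.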

  7. Protein Subcellular Localization Prediction and Genomic Polymorphism Analysis of the SARS Coronavirus

    Institute of Scientific and Technical Information of China (English)

    季星来; 柳树群; 李岭; 孙之荣

    2004-01-01

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus (CoV). Several sequences of the complete genome of SARS-CoV have been determined. The subcellular localization (SubLocation) of annotated open-reading frames of the SARS-CoV genome was predicted using a support vector machine. Several gene products were predicted to locate in the Golgi body and cell nucleus. The SubLocation information was combined with predicted transmembrane information to develop a model of the viral life cycle. The results show that this information can be used to predict the functions of genes and even the virus pathogenesis. In addition, the entire SARS viral genome sequences currently available in GenBank were compared to identify the sequence variations among different isolates. Some variations in the Hong Kong strains may be related to the special clinical manifestations and provide clues for understanding the relationship between gene functions and evolution. These variations reflect the evolution of the SARS virus in human populations and may help development of a vaccine.

  8. A 2-stage genome-wide association study to identify single nucleotide polymorphisms associated with development of urinary symptoms after radiotherapy for prostate cancer.

    Science.gov (United States)

    Kerns, Sarah L; Stone, Nelson N; Stock, Richard G; Rath, Lynda; Ostrer, Harry; Rosenstein, Barry S

    2013-07-01

    We identified single nucleotide polymorphisms associated with change in the AUA Symptom Score after radiotherapy for prostate cancer. A total of 723 patients treated with brachytherapy with or without external beam radiation therapy were assessed at baseline and annually after radiotherapy using the AUA Symptom Score. A 2-stage genome-wide association study was performed with the primary end point of change in AUA Symptom Score from baseline at each of 4 follow-up periods. Single nucleotide polymorphism associations were assessed using multivariable linear regression adjusting for pre-radiotherapy AUA Symptom Score severity category and clinical variables. Fisher's trend method was used to calculate combined p values from the discovery and replication cohorts. A region on chromosome 9p21.2 containing 8 single nucleotide polymorphisms showed the strongest association with change in AUA Symptom Score (combined p values 8.8×10^-6 to 6.5×10^-7 at 2 to 3 years after radiotherapy). These single nucleotide polymorphisms form a haplotype block that encompasses the inflammation signaling gene IFNK. These single nucleotide polymorphisms were independently associated with change in AUA Symptom Score after adjusting for clinical predictors including smoking history, hypertension, α-blocker use and pre-radiotherapy AUA Symptom Score. An additional 24 single nucleotide polymorphisms showed moderate significance for association with change in AUA Symptom Score. Several of these single nucleotide polymorphisms were more strongly associated with change in specific AUA Symptom Score items, including rs13035033 in the MYO3B gene, which was associated with straining (beta coefficient 0.9, 95% CI 0.6-1.2, p = 5.0×10^-9). If validated, these single nucleotide polymorphisms could provide insight into the biology underlying urinary symptoms following radiotherapy and could lead to development of an assay to identify patients at risk for experiencing these effects.
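
    Combining p values from a discovery and a replication cohort can be illustrated with Fisher's classic combined probability test, in which X² = −2 Σ ln pᵢ follows a chi-square distribution with 2k degrees of freedom. Whether "Fisher's trend method" in the abstract is exactly this test is an assumption. The input p values below are hypothetical and do not come from the study.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p values with Fisher's method.

    Uses the closed form of the chi-square survival function for an even
    number of degrees of freedom (2k):
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X^2 / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term
    return math.exp(-half_stat) * series


# Hypothetical discovery and replication p values for a single SNP.
combined = fisher_combined_p([0.01, 0.001])
print(f"combined p value: {combined:.3e}")
```

    For two cohorts the formula reduces to p₁p₂(1 − ln(p₁p₂)), so the combined value is somewhat larger than the simple product of the two p values.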

  9. Natural variation in SAR11 marine bacterioplankton genomes inferred from metagenomic data

    Directory of Open Access Journals (Sweden)

    Wilhelm Larry J

    2007-11-01

    Full Text Available Abstract Background: One objective of metagenomics is to reconstruct information about specific uncultured organisms from fragmentary environmental DNA sequences. We used the genome of an isolate of the marine alphaproteobacterium SAR11 ('Candidatus Pelagibacter ubique'; strain HTCC1062), obtained from the cold, productive Oregon coast, as a query sequence to study variation in SAR11 metagenome sequence data from the Sargasso Sea, a warm, oligotrophic ocean gyre. Results: The average amino acid identity of SAR11 genes encoded by the metagenomic data to the query genome was only 71%, indicating significant evolutionary divergence between the coastal isolates and Sargasso Sea populations. However, an analysis of gene neighbors indicated that SAR11 genes in the Sargasso Sea metagenomic data match the gene order of the HTCC1062 genome in 96% of cases (> 85,000 observations), and that rearrangements are most frequent at predicted operon boundaries. There were no conserved examples of genes with known functions being found in the coastal isolates, but not the Sargasso Sea metagenomic data, or vice versa, suggesting that core regions of these diverse SAR11 genomes are relatively conserved in gene content. However, four hypervariable regions were observed, which may encode properties associated with variation in SAR11 ecotypes. The largest of these, HVR2, is a 48 kb region flanked by the sole 5S and 23S genes in the HTCC1062 genome, and mainly encodes genes that determine cell surface properties. A comparison of two closely related 'Candidatus Pelagibacter' genomes (HTCC1062 and HTCC1002) revealed a number of "gene indels" in core regions. Most of these were found to be polymorphic in the metagenomic data and showed evidence of purifying selection, suggesting that the same "polymorphic gene indels" are maintained in physically isolated SAR11 populations. Conclusion: These findings suggest that natural selection has conserved many core features of SAR11

  10. Genomic Fingerprinting of the Vaccine Strain of Clostridium Tetani by Restriction Fragment Length Polymorphism Technique

    Directory of Open Access Journals (Sweden)

    Naser Harzandi

    2013-05-01

    Full Text Available Background: Clostridium tetani, or Nicolaier’s bacillus, is an obligately anaerobic, Gram-positive, motile bacterium that forms terminal or subterminal spores. The chromosome of C. tetani contains 2,799,250 bp with a G+C content of 28.6%. The aim of this study was identification and genomic fingerprinting of the vaccine strain of C. tetani. Materials and Methods: The vaccine strain of C. tetani was provided by Razi Vaccine and Serum Research Institute. The seeds were inoculated into Columbia blood agar and grown for 72 h and transferred to thioglycolate broth medium for a further 36 h of culture. The cultures were incubated at 35ºC in anaerobic conditions. DNA was extracted by the phenol/chloroform method. After extraction, the consistency of the DNA was assayed. Next, the vaccine strain was digested with PvuII and incubated at 37ºC overnight. The digested DNA was electrophoresed on a 1% agarose gel for a short time. Then, the gel was studied with the Gel Doc system and transferred to a Hybond N+ membrane using standard DNA blotting techniques. Results: The vaccine strain of C. tetani genome was fingerprinted by the RFLP technique. Our preliminary results showed no divergence in the vaccine strain used for the production of tetanus toxoid during the period 1990-2011. Conclusion: These observations suggest a lack of significant changes in the RFLP genomic fingerprinting profile of the vaccine strain. Therefore, this strain did not lose its efficiency in tetanus vaccine production. RFLP analysis is worthwhile in investigating the nature of the C. tetani vaccine strain.

  11. Genome-Wide Association Study between Single Nucleotide Polymorphisms and Flight Speed in Nellore Cattle

    Science.gov (United States)

    Valente, Tiago Silva; Baldi, Fernando; Sant’Anna, Aline Cristina; Albuquerque, Lucia Galvão; Paranhos da Costa, Mateus José Rodrigues

    2016-01-01

    Introduction Cattle temperament is an important factor that affects the profitability of beef cattle enterprises, due to its relationship with productivity traits, animal welfare and labor safety. Temperament is a complex phenotype often assessed by measuring a series of behavioral traits, which result from the effects of multiple environmental and genetic factors, and their interactions. The aims of this study were to perform a genome-wide association study and detect genomic regions, potential candidate genes and their biological mechanisms underlying temperament, measured by flight speed (FS) test in Nellore cattle. Materials and Methods The genome-wide association study (GWAS) was performed using a single-step procedure (ssGBLUP) which combined simultaneously all 16,600 phenotypes from genotyped and non-genotyped animals, full pedigree information of 162,645 animals and 1,384 genotyped animals in one step. The animals were genotyped with High Density Bovine SNP BeadChip which contains 777,962 SNP markers. After quality control (QC) a total of 455,374 SNPs remained. Results Heritability estimated for FS was 0.21 ± 0.02. Consecutive SNPs explaining 1% or more of the total additive genetic variance were considered as windows associated with FS. Nine candidate regions located on eight different Bos taurus chromosomes (BTA) (1 at 73 Mb, 2 at 65 Mb, 5 at 22 Mb and 119 Mb, 9 at 98 Mb, 11 at 67 Mb, 15 at 16 Mb, 17 at 63 Kb, and 26 at 47 Mb) were identified. The candidate genes identified in these regions were NCKAP5 (BTA2), PARK2 (BTA9), ANTXR1 (BTA11), GUCY1A2 (BTA15), CPE (BTA17) and DOCK1 (BTA26). Among these genes PARK2, GUCY1A2, CPE and DOCK1 are related to dopaminergic system, memory formation, biosynthesis of peptide hormone and neurotransmitter and brain development, respectively. Conclusions Our findings allowed us to identify nine genomic regions (SNP windows) associated with beef cattle temperament, measured by FS test. Within these windows, six promising

  12. Using 50K single nucleotide polymorphisms to elucidate genomic architecture of Line 1 Hereford cattle

    Directory of Open Access Journals (Sweden)

    Yijian Huang

    2012-12-01

    Full Text Available Hereford is a major beef breed in the USA, and a sub-population, known as Line 1 (L1), was established in 1934 using two paternal half-sib bulls and 50 unrelated females. L1 has since been maintained as a closed population and selected for growth to one year of age. Objectives were to characterize the molecular genetic architecture of L1 (n = 240) by comparing a cross-section of L1 with the general U.S. Hereford population (AHA, n = 311), estimating effects of imposed selection within L1 based on allele frequencies at 50K SNP loci, and examining loci-specific effects of heterozygosity on the selection criterion. Animals were genotyped using the Illumina BovineSNP50 Beadchip, and SNP were mapped to UMD3.0 assembly of the bovine genome sequence. Average LD, measured by square of Pearson correlation, of adjacent SNP was 0.36 and 0.16 in L1 and AHA, respectively. Difference in LD between L1 and AHA decreased as SNP spacing increased. Persistence of phase between L1 and AHA decreased from 0.45 to 0.14 as SNP spacing increased from 50 kb to 5,000 kb. Extended haplotype homozygosity was greater in L1 than in AHA for 95.6% of the SNP. Knowledge of selection applied to L1 facilitated a novel approach to QTL discovery. Minor allele frequency was affected (FDR < 0.01) by cumulative selection differential at 191 out of 25,901 SNP. With the FDR relaxed to 0.05, 13 regions on BTA2, 5, 6, 9, 11, 14, 15, 18, 23 and 26 are co-located with previously identified QTL for growth. After adjustment of postweaning gain phenotypes for fixed effects and direct additive genetic effects, regression of residuals on genome-wide heterozygosity was -235.3 ± 91.6 kg. However, no SNP-specific loci where heterozygotes were significantly superior to the average of homozygotes were revealed (FDR ≥ 0.17). In conclusion, genome-wide SNP genotypes clarified effects of selection and inbreeding within L1 and differences in genomic architecture between the population segment L1 and the AHA
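
The LD measure named in this record, the square of the Pearson correlation between adjacent SNP, can be sketched as follows. This is a minimal illustration, assuming genotypes are coded as 0/1/2 allele counts per animal; the function names and the input layout are hypothetical and are not taken from the study.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic SNP: correlation is undefined")
    return sxy * sxy / (sxx * syy)


def mean_adjacent_ld(genotypes):
    """Average r^2 over consecutive SNP pairs.

    genotypes[i] holds the allele counts of SNP i across all animals,
    with SNP ordered by map position.
    """
    return mean(r_squared(a, b) for a, b in zip(genotypes, genotypes[1:]))
```

Monomorphic SNP raise an error here rather than returning zero, since the correlation is undefined for them; a real pipeline would normally filter such markers out beforehand.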

  13. Genome-Wide Association Study between Single Nucleotide Polymorphisms and Flight Speed in Nellore Cattle.

    Directory of Open Access Journals (Sweden)

    Tiago Silva Valente

    Full Text Available Cattle temperament is an important factor that affects the profitability of beef cattle enterprises, due to its relationship with productivity traits, animal welfare and labor safety. Temperament is a complex phenotype often assessed by measuring a series of behavioral traits, which result from the effects of multiple environmental and genetic factors, and their interactions. The aims of this study were to perform a genome-wide association study and detect genomic regions, potential candidate genes and their biological mechanisms underlying temperament, measured by flight speed (FS) test in Nellore cattle. The genome-wide association study (GWAS) was performed using a single-step procedure (ssGBLUP) which combined simultaneously all 16,600 phenotypes from genotyped and non-genotyped animals, full pedigree information of 162,645 animals and 1,384 genotyped animals in one step. The animals were genotyped with High Density Bovine SNP BeadChip which contains 777,962 SNP markers. After quality control (QC) a total of 455,374 SNPs remained. Heritability estimated for FS was 0.21 ± 0.02. Consecutive SNPs explaining 1% or more of the total additive genetic variance were considered as windows associated with FS. Nine candidate regions located on eight different Bos taurus chromosomes (BTA) (1 at 73 Mb, 2 at 65 Mb, 5 at 22 Mb and 119 Mb, 9 at 98 Mb, 11 at 67 Mb, 15 at 16 Mb, 17 at 63 Kb, and 26 at 47 Mb) were identified. The candidate genes identified in these regions were NCKAP5 (BTA2), PARK2 (BTA9), ANTXR1 (BTA11), GUCY1A2 (BTA15), CPE (BTA17) and DOCK1 (BTA26). Among these genes PARK2, GUCY1A2, CPE and DOCK1 are related to dopaminergic system, memory formation, biosynthesis of peptide hormone and neurotransmitter and brain development, respectively. Our findings allowed us to identify nine genomic regions (SNP windows) associated with beef cattle temperament, measured by FS test. Within these windows, six promising candidate genes and their biological functions were

  14. Polymorphisms in the Prion Protein Gene of cattle breeds from Brazil

    Directory of Open Access Journals (Sweden)

    Cristiane C. Sanches

    Full Text Available ABSTRACT: One of the alterations that occur in the PRNP gene in bovines is the insertion/deletion (indel) of base sequences in specific regions, such as indels of 12 base pairs (bp) in intron 1 and of 23 bp in the promoter region. The deletion allele of 23 bp is associated with susceptibility to bovine spongiform encephalopathy (BSE), as is the presence of the deletion allele of 12 bp. In the present study, the variability of nucleotides in the promoter region and intron 1 of the PRNP gene was genotyped for the Angus, Canchim, Nellore and Simmental bovine breeds to identify the genotype profiles of resistance and/or susceptibility to BSE in each animal. Genomic DNA was extracted for amplification of the target regions of the PRNP gene using polymerase chain reaction (PCR) and specific primers. The PCR products were submitted to electrophoresis in 3% agarose gel and sequencing for genotyping. With the exception of the Angus breed, most breeds exhibited a higher frequency of deletion alleles for 12 bp and 23 bp in comparison to their respective insertion alleles for both regions. These results represent an important contribution to understanding the formation process of the Brazilian herd in relation to bovine PRNP gene polymorphisms.

  15. New genetic variants in the CCR5 gene and the distribution of known polymorphisms in Omani population.

    Science.gov (United States)

    Al-Mahruqi, S H; Zadjali, F; Koh, C Y; Balkhair, A; Said, E A; Al-Balushi, M S; Hasson, S S; Al-Jabri, A A

    2014-02-01

    C-C motif chemokine receptor-5 (CCR5) is a pro-inflammatory receptor that binds to chemokines and facilitates the entry of the R5 strain of HIV-1. A number of polymorphisms were identified within the promoter and coding regions of the CCR5 gene, some of which have been found to affect the protein expression and thus receptor function. Although several CCR5 polymorphisms were shown to vary widely in their distribution among different ethnic populations, there has been no study addressing the potential variants of the CCR5 gene in the Omani population. The aim of this study was to identify the polymorphic sites that exist within the CCR5 gene in Omanis. Blood samples were collected from 89 Omani adult individuals, and genomic DNA was amplified by polymerase chain reaction and sequenced to identify the polymorphic sites. The distribution of the detected variants was examined and compared with the previously published data. Four new indels were detected of 32 variable positions, -2973A/-, -2894A/-, -2827TA/- and -2769T/-, and all were located in the 5'UTR. Furthermore, two new mutations, -2248G/A and +658A/G, were observed for the first time; the -2248G/A was detected in the intron 1 region in one subject and +658A/G in the coding region of the CCR5 in another subject. In silico analysis showed that the novel variations in the 5'UTR may have effects on the transcription factor binding sites. Therefore, this study demonstrates the presence of two new SNPs and four novel indels in the CCR5 gene in the Omani population. Our findings support the wide spectrum of genetic diversity reported within the CCR5 gene region among different ethnic groups.

  16. Characterization of porcine ENO3: genomic and cDNA structure, polymorphism and expression

    Directory of Open Access Journals (Sweden)

    Xiong Yuanzhu

    2008-09-01

    Full Text Available Abstract In this study, a full-length cDNA of the porcine ENO3 gene encoding a 434 amino acid protein was isolated. It contains 12 exons over approximately 5.4 kb. Differential splicing in the 5'-untranslated sequence generates two forms of mRNA that differ from each other in the presence or absence of a 142-nucleotide fragment. Expression analysis showed that transcript 1 of ENO3 is highly expressed in liver and lung, while transcript 2 is highly expressed in skeletal muscle and heart. We provide the first evidence that in skeletal muscle expression of ENO3 is different between Yorkshire and Meishan pig breeds. Furthermore, real-time polymerase chain reaction revealed that, in Yorkshire pigs, skeletal muscle expression of transcript 1 is identical at postnatal day-1 and at other stages while that of transcript 2 is higher. Moreover, expression of transcript 1 is lower in skeletal muscle and all other tissue samples than that of transcript 2, with the exception of liver and kidney. Statistical analysis showed the existence of a polymorphism in the ENO3 gene between Chinese indigenous and introduced commercial western pig breeds and that it is associated with fat percentage, average backfat thickness, meat marbling and intramuscular fat in two different populations.

  17. Genome-wide analysis of neuroblastomas using high-density single nucleotide polymorphism arrays.

    Directory of Open Access Journals (Sweden)

    Rani E George

    Full Text Available BACKGROUND: Neuroblastomas are characterized by chromosomal alterations with biological and clinical significance. We analyzed paired blood and primary tumor samples from 22 children with high-risk neuroblastoma for loss of heterozygosity (LOH) and DNA copy number change using the Affymetrix 10K single nucleotide polymorphism (SNP) array. FINDINGS: Multiple areas of LOH and copy number gain were seen. The most commonly observed area of LOH was on chromosome arm 11q (15/22 samples; 68%). Chromosome 11q LOH was highly associated with occurrence of chromosome 3p LOH: 9 of the 15 samples with 11q LOH had concomitant 3p LOH (P = 0.016). Chromosome 1p LOH was seen in one-third of cases. LOH events on chromosomes 11q and 1p were generally accompanied by copy number loss, indicating hemizygous deletion within these regions. The one exception was on chromosome 11p, where LOH in all four cases was accompanied by normal copy number or diploidy, implying uniparental disomy. Gain of copy number was most frequently observed on chromosome arm 17q (21/22 samples; 95%) and was associated with allelic imbalance in six samples. Amplification of MYCN was also noted, as was amplification of a second gene, ALK, in a single case. CONCLUSIONS: This analysis demonstrates the power of SNP arrays for high-resolution determination of LOH and DNA copy number change in neuroblastoma, a tumor in which specific allelic changes drive clinical outcome and selection of therapy.
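
The idea behind LOH detection from paired samples can be sketched in a few lines: a SNP is informative when it is heterozygous in the patient's blood, and it shows LOH when the matching tumor call is homozygous. The snippet below is only an illustration under assumed inputs; the call labels ("AA", "AB", "BB", "NoCall") and both function names are hypothetical, and a real array analysis would also use call confidence, contiguous runs of SNP and copy number.

```python
def loh_sites(normal_calls, tumor_calls):
    """Indices of SNP that are heterozygous in normal DNA but homozygous in tumor."""
    if len(normal_calls) != len(tumor_calls):
        raise ValueError("call lists must cover the same SNP")
    return [
        i
        for i, (n, t) in enumerate(zip(normal_calls, tumor_calls))
        if n == "AB" and t in ("AA", "BB")
    ]


def loh_fraction(normal_calls, tumor_calls):
    """Share of informative (heterozygous in normal) SNP that show LOH."""
    informative = sum(1 for n in normal_calls if n == "AB")
    if informative == 0:
        return 0.0
    return len(loh_sites(normal_calls, tumor_calls)) / informative
```

SNP that are homozygous or uncalled in blood are skipped because they carry no information about allelic loss in the tumor.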

  18. A genome-wide association study for milk production traits in Danish Jersey cattle using a 50K single nucleotide polymorphism chip.

    Science.gov (United States)

    Mai, M D; Sahana, G; Christiansen, F B; Guldbrandtsen, B

    2010-11-01

    Quantitative trait loci for milk production traits in Danish Jersey cattle were mapped by a genome-wide association analysis using a mixed model. The analysis incorporated 1,039 bulls and 33,090 SNP and resulted in 98 detected combinations of QTL and traits on 27 BTA. These QTL comprised 30 for milk index, 50 for fat index, and 18 for protein index. The evidence presents 33 genome-wide QTL on 14 BTA. Of these, 7 had effects on milk index, 21 on fat index, and 5 on protein index. Among the genome-wide QTL, 26 have been previously reported, 2 on BTA4 and BTA5 were new for milk index, and 5 on BTA4, BTA5, BTA13, BTA20, and BTA29 were new QTL for fat index. We found 7 pleiotropic or very closely linked QTL. Most of the QTL were associated with polymorphisms within narrow regions and several may represent the effects of polymorphisms of genes: DGAT1, casein, ARFGAP3, CYP11B1, and CDC-like kinase 4. By a chromosome-wide threshold, 65 additional QTL were detected. Many of them are likely to represent QTL. The results are interesting from a breeding perspective and contribute to the search for the genes causing the polymorphisms important for milk production traits.

  19. Single nucleotide polymorphisms and indel markers from the transcriptome of garlic

    Science.gov (United States)

    Garlic (Allium sativum L.) is cultivated world-wide and widely appreciated for its culinary uses. In spite of primarily being asexually propagated, garlic shows great diversity for adaptation to diverse production environments and bulb phenotypes. Anonymous molecular markers have been used to assess...

  20. Genome-wide survey of artificial mutations induced by ethyl methanesulfonate and gamma rays in tomato.

    Science.gov (United States)

    Shirasawa, Kenta; Hirakawa, Hideki; Nunome, Tsukasa; Tabata, Satoshi; Isobe, Sachiko

    2016-01-01

    Genome-wide mutations induced by ethyl methanesulfonate (EMS) and gamma irradiation in the tomato Micro-Tom genome were identified by a whole-genome shotgun sequencing analysis to estimate the spectrum and distribution of whole-genome DNA mutations and the frequency of deleterious mutations. A total of ~370 Gb of paired-end reads for four EMS-induced mutants and three gamma-ray-irradiated lines as well as a wild-type line were obtained by next-generation sequencing technology. Using bioinformatics analyses, we identified 5920 induced single nucleotide variations and insertion/deletion (indel) mutations. The predominant mutations in the EMS mutants were C/G to T/A transitions, while in the gamma-ray mutants, C/G to T/A transitions, A/T to T/A transversions, A/T to G/C transitions and deletion mutations were equally common. Biases in the base composition flanking mutations differed between the mutagenesis types. Regarding the effects of the mutations on gene function, >90% of the mutations were located in intergenic regions, and only 0.2% were deleterious. In addition, we detected 1,140,687 spontaneous single nucleotide polymorphisms and indel polymorphisms in wild-type Micro-Tom lines. We also found copy number variation, deletions and insertions of chromosomal segments in both the mutant and wild-type lines. The results provide helpful information not only for mutation research, but also for mutant screening methodology with reverse-genetic approaches.

  1. Genomic landscapes of Chinese hamster ovary cell lines as revealed by the Cricetulus griseus draft genome.

    Science.gov (United States)

    Lewis, Nathan E; Liu, Xin; Li, Yuxiang; Nagarajan, Harish; Yerganian, George; O'Brien, Edward; Bordbar, Aarash; Roth, Anne M; Rosenbloom, Jeffrey; Bian, Chao; Xie, Min; Chen, Wenbin; Li, Ning; Baycin-Hizal, Deniz; Latif, Haythem; Forster, Jochen; Betenbaugh, Michael J; Famili, Iman; Xu, Xun; Wang, Jun; Palsson, Bernhard O

    2013-08-01

    Chinese hamster ovary (CHO) cells, first isolated in 1957, are the preferred production host for many therapeutic proteins. Although genetic heterogeneity among CHO cell lines has been well documented, a systematic, nucleotide-resolution characterization of their genotypic differences has been stymied by the lack of a unifying genomic resource for CHO cells. Here we report a 2.4-Gb draft genome sequence of a female Chinese hamster, Cricetulus griseus, harboring 24,044 genes. We also resequenced and analyzed the genomes of six CHO cell lines from the CHO-K1, DG44 and CHO-S lineages. This analysis identified hamster genes missing in different CHO cell lines, and detected >3.7 million single-nucleotide polymorphisms (SNPs), 551,240 indels and 7,063 copy number variations. Many mutations are located in genes with functions relevant to bioprocessing, such as apoptosis. The details of this genetic diversity highlight the value of the hamster genome as the reference upon which CHO cells can be studied and engineered for protein production.

  2. Porcine SOX9 Gene Expression Is Influenced by an 18 bp Indel in the 5'-Untranslated Region.

    Directory of Open Access Journals (Sweden)

    Bertram Brenig

    Full Text Available Sex determining region Y-box 9 (SOX9) is an important regulator of sex and skeletal development and is expressed in a variety of embryonal and adult tissues. Loss or gain of function resulting from mutations within the coding region or chromosomal aberrations of the SOX9 locus lead to a plethora of detrimental phenotypes in humans and animals. One of these phenotypes is the so-called male-to-female or female-to-male sex-reversal which has been observed in several mammals including pig, dog, cat, goat, horse, and deer. In 38,XX sex-reversal French Large White pigs, a genome-wide association study suggested SOX9 as the causal gene, although no functional mutations were identified in affected animals. However, besides others an 18 bp indel had been detected in the 5'-untranslated region of the SOX9 gene by comparing affected animals and controls. We have identified the same indel (Δ18) between position +247 bp and +266 bp downstream the transcription start site of the porcine SOX9 gene in four other pig breeds; i.e., German Large White, Laiwu Black, Bamei, and Erhualian. These animals have been genotyped in an attempt to identify candidate genes for porcine inguinal and/or scrotal hernia. Because the 18 bp segment in the wild type 5'-UTR harbours a highly conserved cAMP-response element (CRE) half-site, we analysed its role in SOX9 expression in vitro. Competition and immunodepletion electromobility shift assays demonstrate that the CRE half-site is specifically recognized by CREB. Both binding of CREB to the wild type as well as the absence of the CRE half-site in Δ18 reduced expression efficiency in HEK293T, PK-15, and ATDC5 cells significantly. Transfection experiments of wild type and Δ18 SOX9 promoter luciferase constructs show a significant reduction of RNA and protein levels depending on the presence or absence of the 18 bp segment. Hence, the data presented here demonstrate that the 18 bp indel in the porcine SOX9 5'-UTR is of functional

  3. Porcine SOX9 Gene Expression Is Influenced by an 18 bp Indel in the 5'-Untranslated Region.

    Science.gov (United States)

    Brenig, Bertram; Duan, Yanyu; Xing, Yuyun; Ding, Nengshui; Huang, Lusheng; Schütz, Ekkehard

    2015-01-01

    Sex determining region Y-box 9 (SOX9) is an important regulator of sex and skeletal development and is expressed in a variety of embryonal and adult tissues. Loss or gain of function resulting from mutations within the coding region or chromosomal aberrations of the SOX9 locus lead to a plethora of detrimental phenotypes in humans and animals. One of these phenotypes is the so-called male-to-female or female-to-male sex-reversal which has been observed in several mammals including pig, dog, cat, goat, horse, and deer. In 38,XX sex-reversal French Large White pigs, a genome-wide association study suggested SOX9 as the causal gene, although no functional mutations were identified in affected animals. However, besides others an 18 bp indel had been detected in the 5'-untranslated region of the SOX9 gene by comparing affected animals and controls. We have identified the same indel (Δ18) between position +247 bp and +266 bp downstream the transcription start site of the porcine SOX9 gene in four other pig breeds; i.e., German Large White, Laiwu Black, Bamei, and Erhualian. These animals have been genotyped in an attempt to identify candidate genes for porcine inguinal and/or scrotal hernia. Because the 18 bp segment in the wild type 5'-UTR harbours a highly conserved cAMP-response element (CRE) half-site, we analysed its role in SOX9 expression in vitro. Competition and immunodepletion electromobility shift assays demonstrate that the CRE half-site is specifically recognized by CREB. Both binding of CREB to the wild type as well as the absence of the CRE half-site in Δ18 reduced expression efficiency in HEK293T, PK-15, and ATDC5 cells significantly. Transfection experiments of wild type and Δ18 SOX9 promoter luciferase constructs show a significant reduction of RNA and protein levels depending on the presence or absence of the 18 bp segment. Hence, the data presented here demonstrate that the 18 bp indel in the porcine SOX9 5'-UTR is of functional importance and may

  4. Genome-wide analysis of single nucleotide polymorphisms uncovers population structure in Northern Europe.

    Directory of Open Access Journals (Sweden)

    Elina Salmela

    Full Text Available BACKGROUND: Genome-wide data provide a powerful tool for inferring patterns of genetic variation and structure of human populations. PRINCIPAL FINDINGS: In this study, we analysed almost 250,000 SNPs from a total of 945 samples from Eastern and Western Finland, Sweden, Northern Germany and Great Britain complemented with HapMap data. Small but statistically significant differences were observed between the European populations (F_ST = 0.0040, p < 10^-4), and also between Eastern and Western Finland (F_ST = 0.0032, p < 10^-3). The latter indicated the existence of a relatively strong autosomal substructure within the country, similar to that observed earlier with smaller numbers of markers. The Germans and British were less differentiated than the Swedes, Western Finns and especially the Eastern Finns who also showed other signs of genetic drift. This is likely caused by the later founding of the northern populations, together with subsequent founder and bottleneck effects, and a smaller population size. Furthermore, our data suggest a small eastern contribution among the Finns, consistent with the historical and linguistic background of the population. SIGNIFICANCE: Our results warn against a priori assumptions of homogeneity among Finns and other seemingly isolated populations. Thus, in association studies in such populations, additional caution for population structure may be necessary. Our results illustrate that population history is often important for patterns of genetic variation, and that the analysis of hundreds of thousands of SNPs provides high resolution also for population genetics.
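
To make the F_ST values in this record more concrete, the sketch below computes the textbook Wright's F_ST for a single biallelic SNP from the allele frequencies of two populations, as (H_T − H_S) / H_T. It is an illustration only: the study most likely used a genome-wide estimator with sample-size corrections, the two populations are assumed to be equally weighted, and the function name is hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # expected heterozygosity, pooled
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_t == 0:
        raise ValueError("SNP is monomorphic in the pooled sample")
    return (h_t - h_s) / h_t
```

Identical allele frequencies give 0 and fixed opposite alleles give 1. Values near 0.003 to 0.004, as reported above, correspond to very small average frequency differences between populations.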

  5. High-efficiency non-mosaic CRISPR-mediated knock-in and indel mutation in F0 Xenopus.

    Science.gov (United States)

    Aslan, Yetki; Tadjuidje, Emmanuel; Zorn, Aaron M; Cha, Sang-Wook

    2017-08-01

    The revolution in CRISPR-mediated genome editing has enabled the mutation and insertion of virtually any DNA sequence, particularly in cell culture where selection can be used to recover relatively rare homologous recombination events. The efficient use of this technology in animal models still presents a number of challenges, including the time to establish mutant lines, mosaic gene editing in founder animals, and low homologous recombination rates. Here we report a method for CRISPR-mediated genome editing in Xenopus oocytes with homology-directed repair (HDR) that provides efficient non-mosaic targeted insertion of small DNA fragments (40-50 nucleotides) in 4.4-25.7% of F0 tadpoles, with germline transmission. For both CRISPR/Cas9-mediated HDR gene editing and indel mutation, the gene-edited F0 embryos are uniformly heterozygous, consistent with a mutation in only the maternal genome. In addition to efficient tagging of proteins in vivo, this HDR methodology will allow researchers to create patient-specific mutations for human disease modeling in Xenopus. © 2017. Published by The Company of Biologists Ltd.

  6. Is gastric lymphoepithelioma-like carcinoma a special subtype of EBV-associated gastric carcinoma? New insight based on clinicopathological features and EBV genome polymorphisms.

    Science.gov (United States)

    Cheng, Na; Hui, Da-yang; Liu, Yong; Zhang, Na-na; Jiang, Ye; Han, Jing; Li, Hai-Gang; Ding, Yun-Gang; Du, Hong; Chen, Jian-Ning; Shao, Chun-Kui

    2015-04-01

    Gastric lymphoepithelioma-like carcinoma (LELC) is a rare entity that is closely associated with Epstein-Barr virus (EBV). However, the EBV latency pattern and genome polymorphisms in gastric LELC have not been systematically explored. The clinicopathological features, EBV latency pattern and genome polymorphisms of EBV-positive gastric LELC in Guangzhou, southern China were investigated and compared with those of ordinary EBV-associated gastric carcinoma (EBVaGC) in the same area. Ten (1.42%) of 702 gastric carcinoma cases were identified as gastric LELC, in which eight (80%) cases were EBV-positive. The clinicopathological characteristics and EBV latency pattern of EBV-positive gastric LELC were similar to those of ordinary EBVaGC. In EBV genotype analysis, type A strain, type F, I, mut-W1/I, XhoI- and del-LMP1 variants were predominant among EBV-positive gastric LELCs, accounting for eight (100%), six (75%), eight (100%), seven (87.5%), five (62.5%) and six (75%) cases, respectively, which are similar to those in ordinary EBVaGC. For EBNA1 polymorphisms, the V-leu and P-ala subtypes were predominant in EBV-positive gastric LELC, which is different from the predominant V-val subtype in ordinary EBVaGC. EBV-positive gastric LELC has a favorable prognosis when compared to ordinary EBVaGC (median survival time 43.0 vs. 18.0 months). Gastric LELC is strongly associated with EBV and EBV-positive gastric LELC should be regarded as a special subtype of EBVaGC. This, to our best knowledge, is the first time in the world that the EBV latency pattern and genome polymorphisms of EBV-positive gastric LELC are systematically revealed.

  7. PRNP promoter polymorphisms are associated with BSE susceptibility in Swiss and German cattle

    Directory of Open Access Journals (Sweden)

    Ziegler Ute

    2007-04-01

    Background: Non-synonymous polymorphisms within the prion protein gene (PRNP) influence the susceptibility and incubation time for transmissible spongiform encephalopathies (TSE) in some species such as sheep and humans. In cattle, none of the known polymorphisms within the PRNP coding region has a major influence on susceptibility to bovine spongiform encephalopathy (BSE). Recently, however, we demonstrated an association between susceptibility to BSE and a 23 bp insertion/deletion (indel) polymorphism and a 12 bp indel polymorphism within the putative PRNP promoter region using 43 German BSE cases and 48 German control cattle. The objective of this study was to extend this work by including a larger number of BSE cases and control cattle of German and Swiss origin. Results: Allele, genotype and haplotype frequencies of the two indel polymorphisms were determined in 449 BSE cattle and 431 unaffected cattle from Switzerland and Germany, including all 43 German BSE and 16 German control animals from the original study. When breeds with similar allele and genotype distributions were compared, the 23 bp indel polymorphism again showed a significant association with susceptibility to BSE. However, some additional breed-specific allele and genotype distributions were identified, mainly related to the Brown breeds. Conclusion: Our study corroborated earlier findings that polymorphisms in the PRNP promoter region have an influence on susceptibility to BSE. However, breed-specific differences exist that need to be accounted for when analyzing such data.

  8. Genome-wide analysis of single nucleotide polymorphisms in patients with atrophic age-related macular degeneration in oldest old Han Chinese.

    Science.gov (United States)

    Zhou, T Q; Guan, H J; Hu, J Y

    2015-12-21

    The aim of this study was to identify disease-associated loci in oldest old Han Chinese with atrophic age-related macular degeneration (AMD). This genome-wide association study (GWAS) only included oldest old (≥95 years old) subjects in Rugao County, China. Thirty atrophic AMD patients and 47 age-matched non-AMD controls were enrolled. The study subjects underwent a complete ophthalmic examination. Genomic DNA was extracted from peripheral blood samples. Single nucleotide polymorphisms (SNPs) were scanned by Genome-Wide Human Mapping SNP 6.0 Arrays and GeneChip Scanner 3000 7G. The results were read and analyzed by the Affymetrix Genotyping Console software. We filtered out the SNPs with a no-call rate ≥10%, MAF P old Han Chinese population. This finding may lead to new strategies for screening of atrophic AMD for Han Chinese.

  9. Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

    Science.gov (United States)

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

    2015-01-01

    We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for the cytoplasmic and nuclear genomes, respectively, based on low-coverage NGS sequence of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp, and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphisms in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692

  10. Pan-genome multilocus sequence typing and outbreak-specific reference-based single nucleotide polymorphism analysis to resolve two concurrent Staphylococcus aureus outbreaks in neonatal services.

    Science.gov (United States)

    Roisin, S; Gaudin, C; De Mendonça, R; Bellon, J; Van Vaerenbergh, K; De Bruyne, K; Byl, B; Pouseele, H; Denis, O; Supply, P

    2016-06-01

    We used a two-step whole genome sequencing analysis for resolving two concurrent outbreaks in two neonatal services in Belgium, caused by exfoliative toxin A-encoding-gene-positive (eta+) methicillin-susceptible Staphylococcus aureus with an otherwise sporadic spa-type t209 (ST-109). Outbreak A involved 19 neonates and one healthcare worker in a Brussels hospital from May 2011 to October 2013. After a first episode interrupted by decolonization procedures applied over 7 months, the outbreak resumed concomitantly with the onset of outbreak B in a hospital in Asse, comprising 11 neonates and one healthcare worker from mid-2012 to January 2013. Pan-genome multilocus sequence typing, defined on the basis of 42 core and accessory reference genomes, and single-nucleotide polymorphisms mapped on an outbreak-specific de novo assembly were used to compare 28 available outbreak isolates and 19 eta+/spa-type t209 isolates identified by routine or nationwide surveillance. Pan-genome multilocus sequence typing showed that the outbreaks were caused by independent clones not closely related to any of the surveillance isolates. Isolates from only ten cases with overlapping stays in outbreak A, including four pairs of twins, showed no or only a single nucleotide polymorphism variation, indicating limited sequential transmission. Detection of larger genomic variation, even from the start of the outbreak, pointed to sporadic seeding from a pre-existing exogenous source, which persisted throughout the whole course of outbreak A. Whole genome sequencing analysis can provide unique fine-tuned insights into transmission pathways of complex outbreaks even at their inception, which, with timely use, could valuably guide efforts for early source identification.

  11. Bovine spongiform encephalopathy associated insertion/deletion polymorphisms of the prion protein gene in the four beef cattle breeds from North China.

    Science.gov (United States)

    Zhu, Xiang-Yuan; Feng, Fu-Ying; Xue, Su-Yuan; Hou, Ting; Liu, Hui-Rong

    2011-10-01

    Two insertion/deletion (indel) polymorphisms of the prion protein gene (PRNP), a 23-bp indel in the putative promoter region and a 12-bp indel within intron I, are associated with susceptibility to bovine spongiform encephalopathy (BSE) in cattle. In the present study, the polymorphism frequencies of the two indels in four main beef cattle breeds (Hereford, Simmental, Black Angus, and Mongolian) from North China were studied. The results showed that the frequencies of deletion genotypes and alleles of the 23- and 12-bp indels were lower, whereas the frequencies of insertion genotypes and alleles of the two indels were higher, in Mongolian cattle than in the other three cattle breeds. In Mongolian cattle, the 23-bp insertion / 12-bp insertion was the major haplotype, whereas in Hereford, Simmental, and Black Angus cattle, the 23-bp deletion / 12-bp deletion was the major haplotype. These results suggest that Mongolian cattle could be more resistant to BSE than the other three cattle breeds because of their relatively low frequencies of deletion genotypes and alleles of the 23- and 12-bp indel polymorphisms. Thus, this breed could be important for selective breeding to improve resistance against BSE in this area.

  12. 2matrix: A Utility for Indel Coding and Phylogenetic Matrix Concatenation

    Directory of Open Access Journals (Sweden)

    Nelson R. Salinas

    2014-01-01

    Premise of the study: Phylogenetic analysis of DNA and amino acid sequences requires the creation of files formatted specifically for each analysis package. Programs currently available cannot simultaneously code inferred insertion/deletion (indel) events in sequence alignments and concatenate data sets. Methods and Results: A novel Perl script, 2matrix, was created to concatenate matrices of non-molecular characters and/or aligned sequences and to code indels. 2matrix outputs a variety of formats compatible with popular phylogenetic programs. Conclusions: 2matrix efficiently codes indels and concatenates matrices of sequences and non-molecular data. It is available for free download under a GPL (General Public License) open source license (https://github.com/nrsalinas/2matrix/archive/master.zip).
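
    One common way to code indels from an alignment is simple gap coding, in which each distinct gap (defined by its start and end columns) becomes a presence/absence character. The Python sketch below illustrates that idea only. The abstract does not state which coding scheme 2matrix uses, and this is not the 2matrix Perl code. The function names (find_gaps, code_indels) and the toy alignment are hypothetical, and leading or trailing gaps are treated like internal ones for brevity.

```python
def find_gaps(seq):
    """Return the set of (start, end) half-open column ranges covered by '-' runs."""
    gaps, start = set(), None
    for i, ch in enumerate(seq):
        if ch == "-":
            if start is None:
                start = i          # a new gap run begins here
        elif start is not None:
            gaps.add((start, i))   # the run ended at the previous column
            start = None
    if start is not None:          # run extends to the end of the sequence
        gaps.add((start, len(seq)))
    return gaps


def code_indels(alignment):
    """Code every distinct gap in an alignment as one character per taxon.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns (sorted list of gaps, dict taxon -> string of '1', '0' or '?').
    '1' = taxon has exactly this gap, '0' = it does not,
    '?' = a longer gap in this taxon spans the region (inapplicable).
    """
    per_taxon = {name: find_gaps(seq) for name, seq in alignment.items()}
    all_gaps = sorted(set().union(*per_taxon.values()))
    matrix = {}
    for name, gaps in per_taxon.items():
        row = []
        for s, e in all_gaps:
            if (s, e) in gaps:
                row.append("1")
            elif any(gs <= s and e <= ge for gs, ge in gaps):
                row.append("?")
            else:
                row.append("0")
        matrix[name] = "".join(row)
    return all_gaps, matrix


# Hypothetical three-taxon alignment, used only to exercise the functions.
toy = {"taxA": "ACGTACGT", "taxB": "AC--ACGT", "taxC": "A----CGT"}
gaps, matrix = code_indels(toy)
for name, row in matrix.items():
    print(name, row)
```

    The resulting strings can be appended to the sequence matrix as extra binary characters, which is the kind of combined output the abstract describes.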

  13. Genome-wide patterns of recombination, linkage disequilibrium and nucleotide diversity from pooled resequencing and single nucleotide polymorphism genotyping unlock the evolutionary history of Eucalyptus grandis.

    Science.gov (United States)

    Silva-Junior, Orzenil B; Grattapaglia, Dario

    2015-11-01

    We used high-density single nucleotide polymorphism (SNP) data and whole-genome pooled resequencing to examine the landscape of population recombination (ρ) and nucleotide diversity (ϴw), assess the extent of linkage disequilibrium (r²) and build the highest density linkage maps for Eucalyptus. At the genome-wide level, linkage disequilibrium (LD) decayed within c. 4-6 kb, slower than previously reported from candidate gene studies, but showing considerable variation from absence to complete LD up to 50 kb. A sharp decrease in the estimate of ρ was seen when going from short to genome-wide inter-SNP distances, highlighting the dependence of this parameter on the scale of observation adopted. Recombination was correlated with nucleotide diversity, gene density and distance from the centromere, with hotspots of recombination enriched for genes involved in chemical reactions and pathways of the normal metabolic processes. The high nucleotide diversity (ϴw = 0.022) of E. grandis revealed that mutation is more important than recombination in shaping its genomic diversity (ρ/ϴw = 0.645). Chromosome-wide ancestral recombination graphs allowed us to date the split of E. grandis (1.7-4.8 million yr ago) and identify a scenario for the recent demographic history of the species. Our results have considerable practical importance to Genome Wide Association Studies (GWAS), while indicating bright prospects for genomic prediction of complex phenotypes in eucalypt breeding.

  14. Identification of a glutamic acid repeat polymorphism of ALMS1 as a novel genetic risk marker for early-onset myocardial infarction by genome-wide linkage analysis.

    Science.gov (United States)

    Ichihara, Sahoko; Yamamoto, Ken; Asano, Hiroyuki; Nakatochi, Masahiro; Sukegawa, Mayo; Ichihara, Gaku; Izawa, Hideo; Hirashiki, Akihiro; Takatsu, Fumimaro; Umeda, Hisashi; Iwase, Mitsunori; Inagaki, Haruo; Hirayama, Haruo; Sone, Takahito; Nishigaki, Kazuhiko; Minatoguchi, Shinya; Cho, Myeong-Chan; Jang, Yangsoo; Kim, Hyo-Soo; Park, Jeong E; Tada-Oikawa, Saeko; Kitajima, Hidetoshi; Matsubara, Tatsuaki; Sunagawa, Kenji; Shimokawa, Hiroaki; Kimura, Akinori; Lee, Jong-Young; Murohara, Toyoaki; Inoue, Ituro; Yokota, Mitsuhiro

    2013-12-01

    Myocardial infarction (MI) is a leading cause of death worldwide. Given that a family history is an independent risk factor for coronary artery disease, genetic variants are thought to contribute directly to the development of this condition. The identification of susceptibility genes for coronary artery disease or MI may thus help to identify high-risk individuals and offer the opportunity for disease prevention. We designed a 5-step protocol, consisting of a genome-wide linkage study followed by association analysis, to identify novel genetic variants that confer susceptibility to coronary artery disease or MI. A genome-wide affected sib-pair linkage study with 221 Japanese families with coronary artery disease yielded a statistically significant logarithm of the odds score of 3.44 for chromosome 2p13 and MI. Further association analysis implicated Alström syndrome 1 gene (ALMS1) as a candidate gene within the linkage region. Validation association analysis revealed that representative single-nucleotide polymorphisms of the ALMS1 promoter region were significantly associated with early-onset MI in both Japanese and Korean populations. Moreover, direct sequencing of the ALMS1 coding region identified a glutamic acid repeat polymorphism in exon 1, which was subsequently found to be associated with early-onset MI. The glutamic acid repeat polymorphism of ALMS1 identified in the present study may provide insight into the pathogenesis of early-onset MI.

  15. A genome-wide association study for milk production traits in Danish Jersey cattle using a 50K single nucleotide polymorphism chip

    DEFF Research Database (Denmark)

    Mai, Duy Minh; Sahana, Goutam; Christiansen, Freddy;

    2010-01-01

    Quantitative trait loci for milk production traits in Danish Jersey cattle were mapped by a genome-wide association analysis using a mixed model. The analysis incorporated 1,039 bulls and 33,090 SNP and resulted in 98 detected combinations of QTL and traits on 27 BTA. These QTL comprised 30 for milk index, 50 for fat index, and 18 for protein index. The evidence presents 33 genome-wide QTL on 14 BTA. Of these, 7 had effects on milk index, 21 on fat index, and 5 on protein index. Among the genome-wide QTL, 26 have been previously reported, 2 on BTA4 and BTA5 were new for milk index, and 5 on BTA4, BTA5, BTA13, BTA20, and BTA29 were new QTL for fat index. We found 7 pleiotropic or very closely linked QTL. Most of the QTL were associated with polymorphisms within narrow regions and several may represent the effects of polymorphisms of genes: DGAT1, casein, ARFGAP3, CYP11B1, and CDC...

  16. Inference of the Genetic Polymorphisms of CYP2D6 in Six Subtribes of the Malaysian Orang Asli from Whole-Genome Sequencing Data.

    Science.gov (United States)

    Yu, Choo Yee; Ang, Geik Yong; Subramaniam, Vinothini; Johari James, Richard; Ahmad, Aminuddin; Abdul Rahman, Thuhairah; Mohd Nor, Fadzilah; Shaari, Syahrul Azlin; Teh, Lay Kek; Salleh, Mohd Zaki

    2017-07-01

    CYP2D6 is one of the major enzymes in the cytochrome P450 monooxygenase system. It metabolizes ∼25% of prescribed drugs and hence, the genetic diversity of a CYP2D6 gene has continued to be of great interest to the medical and pharmaceutical industries. This study was designed to perform a systematic analysis of the CYP2D6 gene in six subtribes of the Malaysian Orang Asli. Genomic DNAs were extracted from the blood samples followed by whole-genome sequencing. The reads were aligned to the reference human genome hg19 and variants in the CYP2D6 gene were analyzed. CYP2D6*5 and duplication of CYP2D6 were analyzed using previously established methods. A total of 72 single nucleotide polymorphisms were identified. CYP2D6*1, *2, *4, *5, *10,*41, and duplication of the gene were found in the Orang Asli, whereby CYP2D6*2 and *41 alleles are reported for the first time in the Malaysian population. The findings in this study provide insights into the genetic polymorphisms of CYP2D6 in the Orang Asli of Peninsular Malaysia.

  17. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Shibaya Taeko

    2010-04-01

    Background: To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information obtained by comparing the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. Results: The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67 051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Conclusions: Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate the genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several

  18. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789.

    Science.gov (United States)

    Wei, Wu; McCusker, John H; Hyman, Richard W; Jones, Ted; Ning, Ye; Cao, Zhiwei; Gu, Zhenglong; Bruno, Dan; Miranda, Molly; Nguyen, Michelle; Wilhelmy, Julie; Komp, Caridad; Tamse, Raquel; Wang, Xiaojing; Jia, Peilin; Luedi, Philippe; Oefner, Peter J; David, Lior; Dietrich, Fred S; Li, Yixue; Davis, Ronald W; Steinmetz, Lars M

    2007-07-31

    We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.

  19. A MITE-based genotyping method to reveal hundreds of DNA polymorphisms in an animal genome after a few generations of artificial selection

    Directory of Open Access Journals (Sweden)

    Tetreau Guillaume

    2008-10-01

    Background: For most organisms, developing hundreds of genetic markers spanning the whole genome still requires excessive if not unrealistic efforts. In this context, there is an obvious need for methodologies allowing the low-cost, fast and high-throughput genotyping of virtually any species, such as the Diversity Arrays Technology (DArT). One of the crucial steps of the DArT technique is the genome complexity reduction, which yields a genomic representation characteristic of the studied DNA sample and necessary for subsequent genotyping. In this article, using the mosquito Aedes aegypti as a study model, we describe a new genome complexity reduction method taking advantage of the abundance of miniature inverted repeat transposable elements (MITEs) in the genome of this species. Results: Ae. aegypti genomic representations were produced following a two-step procedure: (1) restriction digestion of the genomic DNA and simultaneous ligation of a specific adaptor to compatible ends, and (2) amplification of restriction fragments containing a particular MITE element called Pony using two primers, one annealing to the adaptor sequence and one annealing to a conserved sequence motif of the Pony element. Using this protocol, we constructed a library comprising more than 6,000 DArT clones, of which at least 5.70% were highly reliable polymorphic markers for two closely related mosquito strains separated by only a few generations of artificial selection. Within this dataset, linkage disequilibrium was low, and marker redundancy was evaluated at only 2.86%. Most of the detected genetic variability was observed between the two studied mosquito strains, but individuals of the same strain could still be clearly distinguished. Conclusion: The new complexity reduction method was particularly efficient at revealing genetic polymorphisms in Ae. aegypti. Overall, our results testify to the flexibility of the DArT genotyping technique and open new

  20. Deep sequencing revealed genome-wide single-nucleotide polymorphism and plasmid content of Erwinia amylovora strains isolated in Middle Atlas, Morocco.

    Science.gov (United States)

    Hannou, Najat; Mondy, Samuel; Planamente, Sara; Moumni, Mohieddine; Llop, Pablo; López, María; Manceau, Charles; Barny, Marie-Anne; Faure, Denis

    2013-10-01

    Erwinia amylovora causes economic losses that affect pear and apple production in Morocco. Here, we report comparative genomics of four Moroccan E. amylovora strains with the European strain CFBP1430 and North-American strain ATCC49946. Analysis of single nucleotide polymorphisms (SNPs) revealed genetic homogeneity of the Moroccan strains and their proximity to the European strain CFBP1430. Moreover, the collected sequences allowed the assembly of a 65 kbp plasmid, which is highly similar to the plasmid pEI70 harbored by several European E. amylovora isolates. This plasmid was found in 33% of the 40 E. amylovora strains collected from several host plants in 2009 and 2010 in Morocco.

  1. Association Between the 313 bp Indel in Porcine POU1F1 Gene and Reproduction Traits

    Institute of Scientific and Technical Information of China (English)

    WU Han; SONG Chengyi; GAO Bo; TENG Shanghui; WANG Xiaoyang; LIU Ruoyu; CAI Huifen

    2009-01-01

    The study aimed to analyze the distribution of the 313 bp insertion/deletion (indel) in the first intron of POU1F1 and its association with reproduction traits in Sutai pigs using the PCR-DSCP technique. The results showed that in this commercial pig population, the frequency of allele A was 0.6371 and that of allele B was 0.3629; the genotype frequencies of AA, AB and BB were 0.4516, 0.3710 and 0.1774, respectively, and the χ² test showed that the population was in Hardy-Weinberg equilibrium. The SPSS GLM procedure was used to identify the association of the 313 bp indel with reproductive traits. In Sutai pigs, animals with the AA genotype showed higher values for all reproduction traits except the survival rate of piglets at weaning. Higher weaning weight was significantly associated with the AA genotype, and higher survival rate of piglets at weaning was significantly associated with the BB genotype (P0.05); the effects of fixed factors on the different traits were likewise not significant (P>0.05). The results indicated that although this 313 bp indel was significantly associated with weaning weight and survival rate at weaning, no association with the major reproduction traits was observed in Sutai pigs.
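
    The allele frequencies quoted above follow directly from the genotype frequencies, and the Hardy-Weinberg check can be reproduced in outline. The Python sketch below is illustrative only: the function names are hypothetical, and the genotype counts (28, 23, 11) are an assumption chosen because they reproduce the reported frequencies; the abstract does not give the sample size.

```python
def allele_freqs(f_aa, f_ab, f_bb):
    """Allele frequencies (A, B) from genotype frequencies of a biallelic locus."""
    p = f_aa + f_ab / 2   # every AA carries two A alleles, every AB carries one
    q = f_bb + f_ab / 2
    return p, q


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations (1 degree of freedom for two alleles)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Genotype frequencies as reported in the abstract.
p, q = allele_freqs(0.4516, 0.3710, 0.1774)
print(f"A = {p:.4f}, B = {q:.4f}")

# ASSUMPTION: hypothetical counts consistent with the reported frequencies.
chi2 = hwe_chi_square(28, 23, 11)
CRITICAL_1DF_005 = 3.841  # chi-square critical value, 1 df, alpha = 0.05
print(f"chi-square = {chi2:.3f}, consistent with HWE: {chi2 < CRITICAL_1DF_005}")
```

    Under these assumed counts the statistic stays below the 5% critical value, which agrees with the equilibrium reported in the abstract; a different sample size would change the statistic but not the allele frequencies.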

  2. Whole-genome single-nucleotide polymorphism (SNP) marker discovery and association analysis with the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content in Larimichthys crocea

    Directory of Open Access Journals (Sweden)

    Shijun Xiao

    2016-12-01

    Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method to cut the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. 69,845 high-quality SNP markers evenly distributed along the genome were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. The association studies with the muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in teleost genomes. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of the population structure, conservation genetics and genomic selection of large yellow croaker and other fish.

  3. The Genome of Anopheles darlingi, the main neotropical malaria vector

    Science.gov (United States)

    Marinotti, Osvaldo; Cerqueira, Gustavo C.; de Almeida, Luiz Gonzaga Paula; Ferro, Maria Inês Tiraboschi; Loreto, Elgion Lucio da Silva; Zaha, Arnaldo; Teixeira, Santuza M. R.; Wespiser, Adam R.; Almeida e Silva, Alexandre; Schlindwein, Aline Daiane; Pacheco, Ana Carolina Landim; da Silva, Artur Luiz da Costa; Graveley, Brenton R.; Walenz, Brian P.; Lima, Bruna de Araujo; Ribeiro, Carlos Alexandre Gomes; Nunes-Silva, Carlos Gustavo; de Carvalho, Carlos Roberto; Soares, Célia Maria de Almeida; de Menezes, Claudia Beatriz Afonso; Matiolli, Cleverson; Caffrey, Daniel; Araújo, Demetrius Antonio M.; de Oliveira, Diana Magalhães; Golenbock, Douglas; Grisard, Edmundo Carlos; Fantinatti-Garboggini, Fabiana; de Carvalho, Fabíola Marques; Barcellos, Fernando Gomes; Prosdocimi, Francisco; May, Gemma; de Azevedo Junior, Gilson Martins; Guimarães, Giselle Moura; Goldman, Gustavo Henrique; Padilha, Itácio Q. M.; Batista, Jacqueline da Silva; Ferro, Jesus Aparecido; Ribeiro, José M. C.; Fietto, Juliana Lopes Rangel; Dabbas, Karina Maia; Cerdeira, Louise; Agnez-Lima, Lucymara Fassarella; Brocchi, Marcelo; de Carvalho, Marcos Oliveira; Teixeira, Marcus de Melo; Diniz Maia, Maria de Mascena; Goldman, Maria Helena S.; Cruz Schneider, Maria Paula; Felipe, Maria Sueli Soares; Hungria, Mariangela; Nicolás, Marisa Fabiana; Pereira, Maristela; Montes, Martín Alejandro; Cantão, Maurício E.; Vincentz, Michel; Rafael, Miriam Silva; Silverman, Neal; Stoco, Patrícia Hermes; Souza, Rangel Celso; Vicentini, Renato; Gazzinelli, Ricardo Tostes; Neves, Rogério de Oliveira; Silva, Rosane; Astolfi-Filho, Spartaco; Maciel, Talles Eduardo Ferreira; Ürményi, Turán P.; Tadei, Wanderli Pedro; Camargo, Erney Plessmann; de Vasconcelos, Ana Tereza Ribeiro

    2013-01-01

    Anopheles darlingi is the principal neotropical malaria vector, responsible for more than a million cases of malaria per year on the American continent. Anopheles darlingi diverged from the African and Asian malaria vectors ∼100 million years ago (mya) and successfully adapted to the New World environment. Here we present an annotated reference A. darlingi genome, sequenced from a wild population of males and females collected in the Brazilian Amazon. A total of 10 481 predicted protein-coding genes were annotated, 72% of which have their closest counterpart in Anopheles gambiae and 21% have highest similarity with other mosquito species. In spite of a long period of divergent evolution, conserved gene synteny was observed between A. darlingi and A. gambiae. More than 10 million single nucleotide polymorphisms and short indels with potential use as genetic markers were identified. Transposable elements correspond to 2.3% of the A. darlingi genome. Genes associated with hematophagy, immunity and insecticide resistance, directly involved in vector–human and vector–parasite interactions, were identified and discussed. This study represents the first effort to sequence the genome of a neotropical malaria vector, and opens a new window through which we can contemplate the evolutionary history of anopheline mosquitoes. It also provides valuable information that may lead to novel strategies to reduce malaria transmission on the South American continent. The A. darlingi genome is accessible at www.labinfo.lncc.br/index.php/anopheles-darlingi. PMID:23761445

  4. Chloroplast genome sequence confirms distinctness of Australian and Asian wild rice.

    Science.gov (United States)

    Waters, Daniel L E; Nock, Catherine J; Ishikawa, Ryuji; Rice, Nicole; Henry, Robert J

    2012-01-01

    Cultivated rice (Oryza sativa) is an AA genome Oryza species that was most likely domesticated from wild populations of O. rufipogon in Asia. O. rufipogon and O. meridionalis are the only AA genome species found within Australia and occur as widespread populations across northern Australia. The chloroplast genome sequences of O. rufipogon from Asia and Australia, O. meridionalis and O. australiensis (an Australian member of the genus very distant from O. sativa) were obtained by massively parallel sequencing and compared with the chloroplast genome sequence of domesticated O. sativa. Oryza australiensis differed at more than 850 sites (single nucleotide polymorphisms or indels) from each of the other samples. The other wild rice species had only around 100 differences relative to cultivated rice. The chloroplast genomes of Australian O. rufipogon and O. meridionalis were closely related, with only 32 differences. The Asian O. rufipogon chloroplast genome (with only 68 differences) was closer to O. sativa than the Australian taxa (both with more than 100 differences). The chloroplast sequences emphasize the genetic distinctness of the Australian populations and their potential as a source of novel rice germplasm. The Australian O. rufipogon may be a perennial form of O. meridionalis.

  5. Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx)

    DEFF Research Database (Denmark)

    Xia, Qingyou; Guo, Yiran; Zhang, Ze;

    2009-01-01

    A single-base pair resolution silkworm genetic variation map was constructed from 40 domesticated and wild silkworms, each sequenced to approximately threefold coverage, representing 99.88% of the genome. We identified ~16 million single-nucleotide polymorphisms, many indels, and structural variations. We find that the domesticated silkworms are clearly genetically differentiated from the wild ones, but they have maintained large levels of genetic variability, suggesting a short domestication event involving a large number of individuals. We also identified signals of selection at 354 candidate genes that may have been important during domestication, some of which have enriched expression in the silk gland, midgut, and testis. These data add to our understanding of the domestication processes and may have applications in devising pest control strategies and advancing the use of silkworms...

  6. Genomic landscapes of Chinese hamster ovary cell lines as revealed by the Cricetulus griseus draft genome

    DEFF Research Database (Denmark)

    Lewis, Nathan E; Liu, Xin; Li, Yuxiang

    2013-01-01

    ... This analysis identified hamster genes missing in different CHO cell lines, and detected >3.7 million single-nucleotide polymorphisms (SNPs), 551,240 indels and 7,063 copy number variations. Many mutations are located in genes with functions relevant to bioprocessing, such as apoptosis. The details...

  7. Genome-wide association study reveals a polymorphism in the podocyte receptor RANK for the decline of renal function in coronary patients.

    Directory of Open Access Journals (Sweden)

    Andreas Leiherer

    Impaired kidney function is a significant health problem and a major concern in clinical routine and is routinely determined by decreased glomerular filtration rate (GFR). In contrast to a single assessment of a patient's kidney function, which provides only limited information on the patient's health, serial measurements of GFR clearly improve the validity of diagnosis. The decline of kidney function has recently been reported to be predictive for mortality and vascular events in coronary patients. However, it has not been investigated for genetic association in GWA studies. This study investigates for the first time the association of cardiometabolic polymorphisms with the decline of estimated GFR during a 4-year follow-up in 583 coronary patients, using the Cardio-Metabo Chip. We revealed a suggestive association with 3 polymorphisms, surpassing genome-wide significance (p = 4.0 e-7). The top hit rs17069906 (p = 5.6 e-10) is located within the genomic region of RANK, recently demonstrated to be an important player in the adaptive recovery response in podocytes and suggested as a promising therapeutic target in glomerular diseases.

  8. Genome-wide single nucleotide polymorphism of Yersinia pestis

    Institute of Scientific and Technical Information of China (English)

    王娜

    2011-01-01

    Single nucleotide polymorphisms (SNPs) mainly refer to DNA sequence polymorphisms caused by a single nucleotide mutation, including synonymous SNPs and non-synonymous SNPs. With the rapid development of sequencing technology, a large number of bacterial genome sequences have become available, making it possible to identify potential SNP sites by sequencing technology and bioinformatics methods. Because of their own characteristics, SNPs have also been widely used as a new molecular marker in bacterial genotyping, evolution and epidemiology research. In this paper, advances in research on the genome-wide search for SNP sites and on the analysis of Yersinia pestis microevolution based on SNP data are reviewed.

  9. Chloroplast genome sequence of the moss Tortula ruralis: gene content, polymorphism, and structural arrangement relative to other green plant chloroplast genomes

    OpenAIRE

    Wolf Paul G; Everett Karin DE; Mandoli Dina F; Boore Jeffrey L; Kuehl Jennifer V; Mishler Brent D; Murdock Andrew G; Oliver Melvin J; Duffy Aaron M; Karol Kenneth G

    2010-01-01

    Background: Tortula ruralis, a widely distributed species in the moss family Pottiaceae, is increasingly used as a model organism for the study of desiccation tolerance and mechanisms of cellular repair. In this paper, we present the chloroplast genome sequence of T. ruralis, only the second published chloroplast genome for a moss, and the first for a vegetatively desiccation-t...

  10. Genomic profiling of plastid DNA variation in the Mediterranean olive tree

    Directory of Open Access Journals (Sweden)

    Dorado Gabriel

    2011-05-01

    Background: Characterisation of plastid genome (or cpDNA) polymorphisms is commonly used for phylogeographic, population genetic and forensic analyses in plants, but detecting cpDNA variation is sometimes challenging, limiting the applications of such an approach. In the present study, we screened cpDNA polymorphism in the olive tree (Olea europaea L.) by sequencing the complete plastid genome of trees with a distinct cpDNA lineage. Our objective was to develop new markers for a rapid genomic profiling (by Multiplex PCRs) of cpDNA haplotypes in the Mediterranean olive tree. Results: Eight complete cpDNA genomes of Olea were sequenced de novo. The nucleotide divergence between olive cpDNA lineages was low, not exceeding 0.07%. Based on these sequences, markers were developed for studying two single nucleotide substitutions and length polymorphism of 62 regions (with variable microsatellite motifs or other indels). They were then used to genotype the cpDNA variation in cultivated and wild Mediterranean olive trees (315 individuals). Forty polymorphic loci were detected in this sample, allowing the distinction of 22 haplotypes belonging to the three Mediterranean cpDNA lineages known as E1, E2 and E3. The discriminating power of cpDNA variation was particularly low for the cultivated olive tree, with one predominating haplotype, but more diversity was detected in wild populations. Conclusions: We propose a method for a rapid characterisation of the Mediterranean olive germplasm. The low variation in the cultivated olive tree indicated that the utility of cpDNA variation for forensic analyses is limited to rare haplotypes. In contrast, the high cpDNA variation in wild populations demonstrated that our markers may be useful for phylogeographic and population genetic studies in O. europaea.

  11. Construction of an interspecific genetic map based on InDel and SSR for mapping the QTLs affecting the initiation of flower primordia in pepper (Capsicum spp.).

    Directory of Open Access Journals (Sweden)

    Shu Tan

    Re-sequencing permits the mining of genome-wide variations on a large scale and provides excellent resources for the research community. To accelerate the development and application of molecular markers and identify the QTLs affecting the flowering time-related trait in pepper, a total of 1,038 pairs of InDel and 674 SSR primers from different sources were used for genetic mapping using the F2 population (n = 154) derived from a cross between BA3 (C. annuum) and YNXML (C. frutescens). Of these, a total of 224 simple PCR-based markers, including 129 InDels and 95 SSRs, were validated and integrated into a map, which was designated as the BY map. The BY map consisted of 13 linkage groups (LGs) and spanned a total genetic distance of 1,249.77 cM with an average marker distance of 5.60 cM. Comparative analysis of the genetic and physical map based on the anchored markers showed that the BY map covered nearly the whole pepper genome. Based on the BY map, one major and five minor QTLs affecting the number of leaves on the primary axis (Nle) were detected on chromosomes P2, P7, P10 and P11 in 2012. The major QTL on P2 was confirmed based on another subset of the same F2 population (n = 147) in 2014 with selective genotyping of markers from the BY map. With the accomplishment of pepper whole genome sequencing and annotations (release 2.0), 153 candidate genes were predicted to embed in the Nle2.2 region, of which 12 important flowering related genes were obtained. The InDel/SSR-based interspecific genetic map, QTLs and candidate genes obtained by the present study will be useful for the downstream isolation of flowering time-related gene and other genetic applications for pepper.

  12. Construction of an interspecific genetic map based on InDel and SSR for mapping the QTLs affecting the initiation of flower primordia in pepper (Capsicum spp.).

    Science.gov (United States)

    Tan, Shu; Cheng, Jiao-Wen; Zhang, Li; Qin, Cheng; Nong, Ding-Guo; Li, Wei-Peng; Tang, Xin; Wu, Zhi-Ming; Hu, Kai-Lin

    2015-01-01

    Re-sequencing permits the mining of genome-wide variations on a large scale and provides excellent resources for the research community. To accelerate the development and application of molecular markers and identify the QTLs affecting the flowering time-related trait in pepper, a total of 1,038 pairs of InDel and 674 SSR primers from different sources were used for genetic mapping using the F2 population (n = 154) derived from a cross between BA3 (C. annuum) and YNXML (C. frutescens). Of these, a total of 224 simple PCR-based markers, including 129 InDels and 95 SSRs, were validated and integrated into a map, which was designated as the BY map. The BY map consisted of 13 linkage groups (LGs) and spanned a total genetic distance of 1,249.77 cM with an average marker distance of 5.60 cM. Comparative analysis of the genetic and physical map based on the anchored markers showed that the BY map covered nearly the whole pepper genome. Based on the BY map, one major and five minor QTLs affecting the number of leaves on the primary axis (Nle) were detected on chromosomes P2, P7, P10 and P11 in 2012. The major QTL on P2 was confirmed based on another subset of the same F2 population (n = 147) in 2014 with selective genotyping of markers from the BY map. With the accomplishment of pepper whole genome sequencing and annotations (release 2.0), 153 candidate genes were predicted to embed in the Nle2.2 region, of which 12 important flowering related genes were obtained. The InDel/SSR-based interspecific genetic map, QTLs and candidate genes obtained by the present study will be useful for the downstream isolation of flowering time-related gene and other genetic applications for pepper.

  13. Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing

    Directory of Open Access Journals (Sweden)

    Chen Zuozhou

    2010-11-01

    Background: Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing. Results: Pash 3.0 generally matches the accuracy and speed of niche programs for fast mapping of short reads, and exceeds their performance on longer reads generated by a new generation of massively parallel sequencing technologies. By exploiting longer read lengths, Pash 3.0 maps reads onto the large fraction of genomic DNA that contains repetitive elements and polymorphic sites, including indel polymorphisms. Conclusions: We demonstrate the versatility of Pash 3.0 by analyzing the interaction between CpG methylation, CpG SNPs, and imprinting based on publicly available whole-genome shotgun bisulfite sequencing data. Pash 3.0 makes use of gapped k-mer alignment, a non-seed based comparison method, which is implemented using multi-positional hash tables. This allows Pash 3.0 to run on diverse hardware platforms, including individual computers with standard RAM capacity, multi-core hardware architectures and large clusters.

  14. The Mitochondrial Genome of Raphanus sativus and Gene Evolution of Cruciferous Mitochondrial Types

    Institute of Scientific and Technical Information of China (English)

    Shengxin Chang; Jianmei Chen; Yankun Wang; Bingchao Gu; Jianbo He; Pu Chu; Rongzhan Guan

    2013-01-01

    To explore the mitochondrial genes of the Cruciferae family, the mitochondrial genome of Raphanus sativus (sat) was sequenced and annotated. The circular mitochondrial genome of sat is 239,723 bp and includes 33 protein-coding genes, three rRNA genes and 17 tRNA genes. The mitochondrial genome also contains a pair of large repeat sequences 5.9 kb in length, which may mediate genome reorganization into two sub-genomic circles, with predicted sizes of 124.8 kb and 115.0 kb, respectively. Furthermore, gene evolution of mitochondrial genomes within the Cruciferae family was analyzed using sat mitochondrial type (mitotype), together with six other reported mitotypes. The cruciferous mitochondrial genomes have maintained almost the same set of functional genes. Compared with Cycas taitungensis (a representative gymnosperm), the mitochondrial genomes of the Cruciferae have lost nine protein-coding genes and seven mitochondrial-like tRNA genes, but acquired six chloroplast-like tRNAs. Among the Cruciferae, to maintain the same set of genes that are necessary for mitochondrial function, the exons of the genes have changed at the lowest rates, as indicated by the numbers of single nucleotide polymorphisms. The open reading frames (ORFs) of unknown function in the cruciferous genomes are not conserved. Evolutionary events, such as mutations, genome reorganizations and sequence insertions or deletions (indels), have resulted in the nonconserved ORFs in the cruciferous mitochondrial genomes, which is becoming significantly different among mitotypes. This work represents the first phylogenic explanation of the evolution of genes of known function in the Cruciferae family. It revealed significant variation in ORFs and the causes of such variation.

  15. The mitochondrial genome of Raphanus sativus and gene evolution of cruciferous mitochondrial types.

    Science.gov (United States)

    Chang, Shengxin; Chen, Jianmei; Wang, Yankun; Gu, Bingchao; He, Jianbo; Chu, Pu; Guan, Rongzhan

    2013-03-20

    To explore the mitochondrial genes of the Cruciferae family, the mitochondrial genome of Raphanus sativus (sat) was sequenced and annotated. The circular mitochondrial genome of sat is 239,723 bp and includes 33 protein-coding genes, three rRNA genes and 17 tRNA genes. The mitochondrial genome also contains a pair of large repeat sequences 5.9 kb in length, which may mediate genome reorganization into two sub-genomic circles, with predicted sizes of 124.8 kb and 115.0 kb, respectively. Furthermore, gene evolution of mitochondrial genomes within the Cruciferae family was analyzed using sat mitochondrial type (mitotype), together with six other reported mitotypes. The cruciferous mitochondrial genomes have maintained almost the same set of functional genes. Compared with Cycas taitungensis (a representative gymnosperm), the mitochondrial genomes of the Cruciferae have lost nine protein-coding genes and seven mitochondrial-like tRNA genes, but acquired six chloroplast-like tRNAs. Among the Cruciferae, to maintain the same set of genes that are necessary for mitochondrial function, the exons of the genes have changed at the lowest rates, as indicated by the numbers of single nucleotide polymorphisms. The open reading frames (ORFs) of unknown function in the cruciferous genomes are not conserved. Evolutionary events, such as mutations, genome reorganizations and sequence insertions or deletions (indels), have resulted in the non-conserved ORFs in the cruciferous mitochondrial genomes, which is becoming significantly different among mitotypes. This work represents the first phylogenic explanation of the evolution of genes of known function in the Cruciferae family. It revealed significant variation in ORFs and the causes of such variation.

  16. Genome-wide discovery of single nucleotide polymorphisms (SNPs) and single nucleotide variants (SNVs) in deep-sea mussels: Potential use in population genomics and cross-species application

    Science.gov (United States)

    Xu, Ting; Sun, Jin; Lv, Jia; Kayama Watanabe, Hiromi; Li, Tianqi; Zou, Weiwen; Rouse, Greg W.; Wang, Shi; Qian, Pei-Yuan; Bao, Zhenmin; Qiu, Jian-Wen

    2017-03-01

    The present study aimed to generate genome-wide single nucleotide polymorphisms (SNPs) for the deep-sea mussel Bathymodiolus platifrons via a combination of genome survey sequencing and the type IIB endonuclease restriction-site associated DNA (2b-RAD) sequencing, and to assess the potential use of SNPs in detecting fine-scale population genetic structure and signatures of divergent selection, as well as their cross-species application in other bathymodioline mussels. Genome survey sequencing was conducted for one individual of B. platifrons. De novo assembly resulted in 781,720 sequences with a scaffold N50 of 2.9 kb. Using these sequences as a reference, 9307 genome-wide SNPs were identified by 2b-RAD for 28 B. platifrons individuals collected from a seep and a vent population. Among these SNPs, nine outliers showed significant evidence for divergent selection, and their positions in the genes or scaffolds were identified. The FST estimated based on the putative neutral SNPs was low (0.0126), indicating that the two B. platifrons populations have high genetic connectivity. However, the permutation test detected significant differences (P [...] genetic differentiation. The Bayesian clustering analyses and principal component analyses (PCA) performed based on either the putative neutral or outlier SNPs also showed that these two populations were genetically differentiated. In addition, 2b-RAD was also conducted to detect 10,199, 6429, and 3811 single nucleotide variants (SNVs), respectively, in the bathymodioline mussels Bathymodiolus japonicus, Bathymodiolus aduloides and Idas sp., which have different phylogenetic distances from B. platifrons. Overall, our study has demonstrated the feasibility and effectiveness of combining genome survey sequencing and 2b-RAD to rapidly generate genomic resources for use in fine-scale population genetic studies, and various cross-species applications.

  17. Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex genetic alterations in cervical cancer

    Directory of Open Access Journals (Sweden)

    Kenter Gemma G

    2007-02-01

    Background: Cervical carcinoma develops as a result of multiple genetic alterations. Different studies investigated genomic alterations in cervical cancer mainly by means of metaphase comparative genomic hybridization (mCGH) and microsatellite marker analysis for the detection of loss of heterozygosity (LOH). Currently, high throughput methods such as array comparative genomic hybridization (array CGH), single nucleotide polymorphism array (SNP array) and gene expression arrays are available to study genome-wide alterations. Integration of these 3 platforms allows detection of genomic alterations at high resolution and investigation of an association between copy number changes and expression. Results: Genome-wide copy number and genotype analysis of 10 cervical cancer cell lines by array CGH and SNP array showed highly complex large-scale alterations. A comparison between array CGH and SNP array revealed that the overall concordance in detection of the same areas with copy number alterations (CNA) was above 90%. The use of SNP arrays demonstrated that about 75% of LOH events would not have been found by methods which screen for copy number changes, such as array CGH, since these were LOH events without CNA. Regions frequently targeted by CNA, as determined by array CGH, such as amplification of 5p and 20q, and loss of 8p, were confirmed by fluorescent in situ hybridization (FISH). Genome-wide, we did not find a correlation between copy number and gene expression. At chromosome arm 5p, however, 22% of the genes were significantly upregulated in cell lines with amplifications as compared to cell lines without amplifications, as measured by gene expression arrays. For 3 genes, SKP2, ANKH and TRIO, expression differences were confirmed by quantitative real-time PCR (qRT-PCR). Conclusion: This study showed that copy number data retrieved from either array CGH or SNP array are comparable and that the integration of genome-wide LOH, copy number and gene...

  18. Detection of Ribosomal DNA Sequence Polymorphisms in the Protist Plasmodiophora brassicae for the Identification of Geographical Isolates

    Directory of Open Access Journals (Sweden)

    Rawnak Laila

    2017-01-01

    Clubroot is a soil-borne disease caused by the protist Plasmodiophora brassicae (P. brassicae). It is one of the most economically important diseases of Brassica rapa and other cruciferous crops as it can cause remarkable yield reductions. Understanding P. brassicae genetics, and developing efficient molecular markers, is essential for effective detection of harmful races of this pathogen. Samples from 11 Korean field populations of P. brassicae (geographic isolates), collected from nine different locations in South Korea, were used in this study. Genomic DNA was extracted from the clubroot-infected samples to sequence the ribosomal DNA. Primers and probes for P. brassicae were designed using a ribosomal DNA gene sequence from a Japanese strain available in GenBank (accession number AB526843; isolate NGY). The nuclear ribosomal DNA (rDNA) sequence of P. brassicae, comprising 6932 base pairs (bp), was cloned and sequenced and found to include the small subunits (SSUs) and a large subunit (LSU), internal transcribed spacers (ITS1 and ITS2), and a 5.8S. Sequence variation was observed in both the SSU and LSU. Four markers showed useful differences in high-resolution melting analysis to identify nucleotide polymorphisms including single-nucleotide polymorphisms (SNPs), oligonucleotide polymorphisms, and insertions/deletions (InDels). A combination of three markers was able to distinguish the geographical isolates into two groups.

  19. Identification of Genome-Wide Variants and Discovery of Variants Associated with Brassica rapa Clubroot Resistance Gene Rcr1 through Bulked Segregant RNA Sequencing

    Science.gov (United States)

    Yu, Fengqun; Zhang, Xingguo; Huang, Zhen; Chu, Mingguang; Song, Tao; Falk, Kevin C.; Deora, Abhinandan; Chen, Qilin; Zhang, Yan; McGregor, Linda; Gossen, Bruce D.; McDonald, Mary Ruth; Peng, Gary

    2016-01-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease on Brassica species worldwide. A clubroot resistance gene, Rcr1, with efficacy against pathotype 3 of P. brassicae, was previously mapped to chromosome A03 of B. rapa in pak choy cultivar “Flower Nabana”. In the current study, resistance to pathotypes 2, 5 and 6 was shown to be associated with the Rcr1 region on chromosome A03. Bulked segregant RNA sequencing was performed and short read sequences were assembled into 10 chromosomes of the B. rapa reference genome v1.5. For the resistant (R) bulks, a total of 351.8 million (M) sequences, 30,836.5 million bases (Mb) in length, produced 120-fold coverage of the reference genome. For the susceptible (S) bulks, 322.9 M sequences, 28,216.6 Mb in length, produced 109-fold coverage. In total, 776.2 K single nucleotide polymorphisms (SNPs) and 122.2 K insertions/deletions (InDels) in R bulks and 762.8 K SNPs and 118.7 K InDels in S bulks were identified; each chromosome had about 87% SNPs and 13% InDels, with 78% monomorphic and 22% polymorphic variants between the R and S bulks. Polymorphic variants on each chromosome were usually below 23%, but made up 34% of the variants on chromosome A03. There were 35 genes annotated in the Rcr1 target region and variants were identified in 21 genes. The numbers of polymorphic variants differed significantly among the genes. Four of them encode Toll-Interleukin-1 receptor / nucleotide-binding site / leucine-rich-repeat proteins; Bra019409 and Bra019410 harbored the higher numbers of polymorphic variants, which indicates that they are more likely candidates of Rcr1. Fourteen SNP markers in the target region were genotyped using the Kompetitive Allele Specific PCR method and were confirmed to associate with Rcr1. Selected SNP markers were analyzed with 26 recombinants obtained from a segregating population consisting of 1587 plants, indicating that they were completely linked to Rcr1. Nine SNP markers were used for marker
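
    The coverage and variant-share figures in this record are simple ratios, and a short calculation makes them easy to sanity-check. The Python sketch below is illustrative only: the function names are hypothetical, and the reference length is not given in the abstract, so it is back-calculated here from the reported bases and fold coverage rather than taken from the source.

```python
def implied_reference_size_mb(total_bases_mb: float, fold_coverage: float) -> float:
    """Reference length (Mb) implied by coverage = total bases / reference length."""
    return total_bases_mb / fold_coverage


def snp_share(snps_k: float, indels_k: float) -> float:
    """Fraction of all called variants (SNPs + InDels) that are SNPs."""
    return snps_k / (snps_k + indels_k)


# Figures quoted in the abstract (R = resistant bulks, S = susceptible bulks).
r_size = implied_reference_size_mb(30836.5, 120)   # about 257 Mb
s_size = implied_reference_size_mb(28216.6, 109)   # about 259 Mb

r_share = snp_share(776.2, 122.2)   # about 0.864
s_share = snp_share(762.8, 118.7)   # about 0.865

print(f"Implied reference size: R {r_size:.1f} Mb, S {s_size:.1f} Mb")
print(f"SNP share of variants:  R {r_share:.3f}, S {s_share:.3f}")
```

    Both bulks imply nearly the same reference length, which is what one would expect if the two coverage figures were computed against the same genome; the SNP shares are close to the "about 87%" reported per chromosome.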

  20. Indelible Rules of Josephson Coupling Energy and Zero-Point Energy in High-Tc Cuprates

    Institute of Scientific and Technical Information of China (English)

    LIU Fu-Sui; CHEN Wan-Fang

    2004-01-01

    This paper shows that the Josephson coupling energy and the zero-point energy have indelible rules on the superfluid density and the superconductivity in the high-Tc cuprates. This paper also shows that the values of Tc at underdoped and overdoped regions are determined by the damage conditions of the phase coherence in the classical and the quantum XY-models, respectively.

  1. MySSP: non-stationary evolutionary sequence simulation, including indels.

    Science.gov (United States)

    Rosenberg, Michael S

    2007-02-26

    MySSP is a new program for the simulation of DNA sequence evolution across a phylogenetic tree. Although many programs are available for sequence simulation, MySSP is unique in its inclusion of indels, flexibility in allowing for non-stationary patterns, and output of ancestral sequences. Some of these features can individually be found in existing programs, but not all have previously been available in a single package.

  2. Genomic variation and population structure detected by single nucleotide polymorphism arrays in Corriedale, Merino and Creole sheep

    Directory of Open Access Journals (Sweden)

    Andrés N Grasso

    2014-06-01

    The aim of this study was to investigate the genetic diversity within and among three breeds of sheep: Corriedale, Merino and Creole. Sheep from the three breeds (Merino n = 110, Corriedale n = 108 and Creole n = 10) were genotyped using the Illumina Ovine SNP50 beadchip®. Genetic diversity was evaluated by comparing the minor allele frequency (MAF) among breeds. Population structure and genetic differentiation were assessed using STRUCTURE software, principal component analysis (PCA) and fixation index (FST). Fixed markers (MAF = 0) that were different among breeds were identified as specific breed markers. Using a subset of 18,181 single nucleotide polymorphisms (SNPs), PCA and STRUCTURE analysis were able to explain population stratification within breeds. Merino and Corriedale divergent lines showed high levels of polymorphism (89.4% and 86% of polymorphic SNPs, respectively) and moderate genetic differentiation (FST = 0.08) between them. In contrast, Creole had only 69% polymorphic SNPs and showed greater genetic differentiation from the other two breeds (FST = 0.17 for both breeds). Hence, a subset of molecular markers present in the OvineSNP50 is informative enough for breed assignment and population structure analysis of commercial and Creole breeds.
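
    The abstract relies on minor allele frequency (MAF) and on markers fixed for different alleles in different breeds. The Python sketch below shows one way these quantities could be computed for biallelic SNPs. It is an illustration under stated assumptions, not the authors' pipeline: the genotype encoding (two-letter strings, with "--" for a missing call), the function names, and the toy data are all hypothetical.

```python
from collections import Counter

MISSING = "--"  # assumed code for a missing genotype call


def allele_counts(genotypes):
    """Count alleles over diploid genotype strings such as 'AG', skipping missing calls."""
    return Counter(allele for g in genotypes if g != MISSING for allele in g)


def minor_allele_frequency(genotypes):
    """MAF of a biallelic marker; 0.0 if the marker is fixed, None if no calls."""
    counts = allele_counts(genotypes)
    total = sum(counts.values())
    if total == 0:
        return None
    if len(counts) == 1:
        return 0.0
    return min(counts.values()) / total


def is_breed_specific(genotypes_a, genotypes_b):
    """True if the marker is fixed (MAF = 0) in both breeds but for different alleles."""
    a, b = allele_counts(genotypes_a), allele_counts(genotypes_b)
    return len(a) == 1 and len(b) == 1 and set(a) != set(b)


def polymorphic_fraction(mafs):
    """Share of markers with MAF > 0 among markers that have at least one call."""
    called = [m for m in mafs if m is not None]
    return sum(m > 0 for m in called) / len(called)


# Toy data: one marker per list entry, a handful of animals per breed.
maf_example = minor_allele_frequency(["AA", "AG", "GG", "AA", "--"])  # 3 G out of 8 alleles
fixed_example = minor_allele_frequency(["CC", "CC", "CC"])
specific = is_breed_specific(["CC", "CC"], ["TT", "TT", "--"])
share = polymorphic_fraction([maf_example, fixed_example, 0.25, None])

print(maf_example, fixed_example, specific, share)
```

    With real array data, the per-breed polymorphic fraction computed this way would correspond to the percentages of polymorphic SNPs quoted in the abstract.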

  3. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Development of Advanced Classification Algorithm for Genome-Wide Single Nucleotide Polymorphism (SNP) Data Analysis

    Science.gov (United States)

    2011-04-01

    ... al. (2007) “Efficient mapping of mendelian traits in dogs through genome-wide association.” Nat Genet 39:1321-1328. ... collected data to genetically map superior intelligence in the military working dog. A behavioral testing regimen was developed by canine cognitive expert Dr... Subject terms: military working dog; genome-wide association study; genetic marker; intelligence.

  4. Slow DNA loss in the gigantic genomes of salamanders.

    Science.gov (United States)

    Sun, Cheng; López Arriaza, José R; Mueller, Rachel Lockridge

    2012-01-01

    Evolutionary changes in genome size result from the combined effects of mutation, natural selection, and genetic drift. Insertion and deletion mutations (indels) directly impact genome size by adding or removing sequences. Most species lose more DNA through small indels (i.e., ~1-30 bp) than they gain, which can result in genome reduction over time. Because this rate of DNA loss varies across species, small indel dynamics have been suggested to contribute to genome size evolution. Species with extremely large genomes provide interesting test cases for exploring the link between small indels and genome size; however, most large genomes remain relatively unexplored. Here, we examine rates of DNA loss in the tetrapods with the largest genomes: the salamanders. We used low-coverage genomic shotgun sequence data from four salamander species to examine patterns of insertion, deletion, and substitution in neutrally evolving non-long terminal repeat (LTR) retrotransposon sequences. For comparison, we estimated genome-wide DNA loss rates in non-LTR retrotransposon sequences from five other vertebrate genomes: Anolis carolinensis, Danio rerio, Gallus gallus, Homo sapiens, and Xenopus tropicalis. Our results show that salamanders have significantly lower rates of DNA loss than do other vertebrates. More specifically, salamanders experience lower numbers of deletions relative to insertions, and both deletions and insertions are skewed toward smaller sizes. On the basis of these patterns, we conclude that slow DNA loss contributes to genomic gigantism in salamanders. We also identify candidate molecular mechanisms underlying these differences and suggest that natural variation in indel dynamics provides a unique opportunity to study the basis of genome stability.
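
    One common way to express DNA loss through small indels is the net number of base pairs removed per nucleotide substitution, which normalises for how much neutral divergence has accumulated. The Python sketch below illustrates that idea; the abstract does not state the exact estimator used, so the formula, the function name, the 1-30 bp size window applied as a hard filter, and the example numbers are all assumptions made for illustration.

```python
def net_dna_loss_per_substitution(deletion_sizes, insertion_sizes, substitutions,
                                  min_bp=1, max_bp=30):
    """Net bp lost per substitution, counting only small indels within [min_bp, max_bp].

    Positive values mean more DNA is deleted than inserted.
    """
    if substitutions <= 0:
        raise ValueError("substitutions must be positive")
    deleted = sum(s for s in deletion_sizes if min_bp <= s <= max_bp)
    inserted = sum(s for s in insertion_sizes if min_bp <= s <= max_bp)
    return (deleted - inserted) / substitutions


# Hypothetical counts from aligned, neutrally evolving retrotransposon copies.
# The 45 bp deletion falls outside the small-indel window and is ignored.
loss = net_dna_loss_per_substitution(
    deletion_sizes=[1, 2, 5, 12, 45],
    insertion_sizes=[1, 3],
    substitutions=100,
)
print(f"Net loss: {loss:.2f} bp per substitution")
```

    Under this definition, a lineage with fewer deletions relative to insertions, or with indels skewed toward smaller sizes, yields a lower value, which matches the direction of the pattern the abstract reports for salamanders.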

  5. Comparative analysis of single nucleotide polymorphisms in the nuclear, chloroplast, and mitochondrial genomes in identification of phylogenetic association among seven melon (Cucumis melo L.) cultivars.

    Science.gov (United States)

    Zhu, Qianglong; Gao, Peng; Liu, Shi; Amanullah, Sikandar; Luan, Feishi

    2016-12-01

    A variety of melons are cultivated worldwide, and their specific biological properties make them an attractive model for molecular studies. This study aimed to investigate the single nucleotide polymorphisms (SNPs) from the mitochondrial, chloroplast, and nuclear genomes of seven melon accessions (Cucumis melo L.) to identify the phylogenetic relationships among melon cultivars with the Illumina HiSeq 2000 platform and bioinformatical analyses. The data showed that there were a total of 658 mitochondrial SNPs (207-295 in each), while there were 0-60 chloroplast SNPs among these seven melon cultivars, compared to the reference genome. Bioinformatical analysis showed that the mitochondrial tree topology was unable to separate the melon features, whereas the maximum parsimony/neighbor joining (MP/NJ) tree of the chloroplast SNPs could define melon features such as seed length, width, thickness, 100-seed weight, and type. SNPs of the nuclear genome were better than the mitochondrial and chloroplast SNPs in the identification of melon features. The data demonstrated the usefulness of mitochondrial, chloroplast, and nuclear SNPs in identification of phylogenetic associations among these seven melon cultivars.

  6. PolyTB: A genomic variation map for Mycobacterium tuberculosis

    KAUST Repository

    Coll, Francesc

    2014-02-15

    Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) is the second major cause of death from an infectious disease worldwide. Recent advances in DNA sequencing are leading to the ability to generate whole genome information in clinical isolates of M. tuberculosis complex (MTBC). The identification of informative genetic variants such as phylogenetic markers and those associated with drug resistance or virulence will help barcode Mtb in the context of epidemiological, diagnostic and clinical studies. Mtb genomic datasets are increasingly available as raw sequences, which are potentially difficult and computer intensive to process, and compare across studies. Here we have processed the raw sequence data (>1500 isolates, eight studies) to compile a catalogue of SNPs (n = 74,039, 63% non-synonymous, 51.1% in more than one isolate, i.e. non-private), small indels (n = 4810) and larger structural variants (n = 800). We have developed the PolyTB web-based tool (http://pathogenseq.lshtm.ac.uk/polytb) to visualise the resulting variation and important meta-data (e.g. in silico inferred strain-types, location) within geographical map and phylogenetic views. This resource will allow researchers to identify polymorphisms within candidate genes of interest, as well as examine the genomic diversity and distribution of strains. PolyTB source code is freely available to researchers wishing to develop similar tools for their pathogen of interest. 2014 Elsevier Ltd. All rights reserved.

  7. A role for indels in the evolution of Cro protein folds.

    Science.gov (United States)

    Stewart, Katie L; Nelson, Michael R; Eaton, Karen V; Anderson, William J; Cordes, Matthew H J

    2013-11-01

    Insertions and deletions in protein sequences, or indels, can disrupt structure and may result in changes in protein folds during evolution or in association with alternative splicing. Pfl 6 and Xfaso 1 are two proteins in the Cro family that share a common ancestor but have different folds. Sequence alignments of the two proteins show two gaps, one at the N terminus, where the sequence of Xfaso 1 is two residues shorter, and one near the center of the sequence, where the sequence of Pfl 6 is five residues shorter. To test the potential importance of indels in Cro protein evolution, we generated hybrid variants of Pfl 6 and Xfaso 1 with indels in one or both regions, chosen according to several plausible sequence alignments. All but one deletion variant completely unfolded both proteins, showing that a longer N-terminal sequence was critical for Pfl 6 folding and a longer central region sequence was critical for Xfaso 1 folding. By contrast, Xfaso 1 tolerated a longer N-terminal sequence with little destabilization, and Pfl 6 tolerated central region insertions, albeit with substantial effects on thermal stability and some perturbation of the surrounding structure. None of the mutations appeared to convert one stable fold into the other. On the basis of this two-protein comparison, short insertion and deletion mutations probably played a role in evolutionary fold change in the Cro family, but were also not the only factors.

  8. The humankind genome: from genetic diversity to the origin of human diseases.

    Science.gov (United States)

    Belizário, Jose E

    2013-12-01

    Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of coding and noncoding RNAs (transcriptome) being transcribed. These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on humans and model organism phenotypes, we will be able to find candidate genes and new clues for disease's etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.

  9. A high-definition view of functional genetic variation from natural yeast genomes.

    Science.gov (United States)

    Bergström, Anders; Simpson, Jared T; Salinas, Francisco; Barré, Benjamin; Parts, Leopold; Zia, Amin; Nguyen Ba, Alex N; Moses, Alan M; Louis, Edward J; Mustonen, Ville; Warringer, Jonas; Durbin, Richard; Liti, Gianni

    2014-04-01

    The question of how genetic variation in a population influences phenotypic variation and evolution is of major importance in modern biology. Yet much is still unknown about the relative functional importance of different forms of genome variation and how they are shaped by evolutionary processes. Here we address these questions by population level sequencing of 42 strains from the budding yeast Saccharomyces cerevisiae and its closest relative S. paradoxus. We find that genome content variation, in the form of presence or absence as well as copy number of genetic material, is higher within S. cerevisiae than within S. paradoxus, despite genetic distances as measured in single-nucleotide polymorphisms being vastly smaller within the former species. This genome content variation, as well as loss-of-function variation in the form of premature stop codons and frameshifting indels, is heavily enriched in the subtelomeres, strongly reinforcing the relevance of these regions to functional evolution. Genes affected by these likely functional forms of variation are enriched for functions mediating interaction with the external environment (sugar transport and metabolism, flocculation, metal transport, and metabolism). Our results and analyses provide a comprehensive view of genomic diversity in budding yeast and expose surprising and pronounced differences between the variation within S. cerevisiae and that within S. paradoxus. We also believe that the sequence data and de novo assemblies will constitute a useful resource for further evolutionary and population genomics studies.

  10. [Sequence polymorphism and mapping of wheat Ca2+-binding protein TaCRT-A gene].

    Science.gov (United States)

    Wang, Ji-Ping; Mao, Xin-Guo; Li, Run-Zhi; Jing, Rui-Lian

    2012-09-01

    Thirty-seven hexaploid wheat (AABBDD) accessions differing in drought resistance at the seedling stage, three wheat species with the A genome (AA), and three tetraploid wheat species (AABB) were used as test materials. Single nucleotide polymorphisms (SNPs) in TaCRT-A were detected by direct sequencing, their relationship with the drought resistance of wheat (Triticum aestivum) at the seedling stage was analyzed, and TaCRT-A was mapped on the wheat chromosomes. The full-length sequence of the TaCRT-A genomic DNA was 3887 bp. A total of 202 nucleotide variant loci were observed in 167141 bp of full-length sequence, comprising 165 SNPs and 37 InDels, with frequencies of 1 SNP/1013 bp and 1 InDel/4517 bp, respectively. The nucleotide diversity (pi) in the coding region of TaCRT-A was lower than that in the non-coding region, suggesting that selection pressure was stronger in the coding region than in the non-coding region. The 43 accessions could be classified into 14 haplotypes (H1-H14) by haplotype analysis. Among these, H1, H2, and H13 each contained one accession that was a donor species of the A genome in common wheat; H16 and H7 each had one highly drought-resistant accession; H8 comprised tetraploid wheat, drought-resistant accessions, and drought-sensitive accessions; and H11 included wheat accessions with drought resistance and medium drought resistance. Although the expression of TaCRT was induced by water stress, no significant relationship was identified between TaCRT-A polymorphism and drought resistance. Using a population of recombinant inbred lines derived from a cross of Opata 85 x W7984, TaCRT-A was mapped between SSR markers Xmwg30 and Xmwg570 on chromosome 3A, at genetic distances of 10.5 cM and 49.6 cM from the flanking markers, respectively.
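    The variant frequencies quoted in this record can be checked with a few lines of arithmetic. The sketch below assumes (this is an inference, not something the abstract states) that the 167141 bp total is the 3887 bp gene length multiplied by the 43 accessions sequenced; the function and constant names are illustrative only.

```python
# Figures quoted in the abstract.
GENE_LENGTH_BP = 3887
N_ACCESSIONS = 43   # 37 hexaploid + 3 A-genome + 3 tetraploid accessions
N_SNP = 165
N_INDEL = 37


def total_sequence_bp(gene_length_bp: int, n_accessions: int) -> int:
    """Total sequenced length, ASSUMING one full-length copy per accession."""
    return gene_length_bp * n_accessions


def bp_per_variant(total_bp: int, n_variants: int) -> float:
    """Average number of base pairs per observed variant."""
    return total_bp / n_variants


total_bp = total_sequence_bp(GENE_LENGTH_BP, N_ACCESSIONS)
snp_spacing = bp_per_variant(total_bp, N_SNP)
indel_spacing = bp_per_variant(total_bp, N_INDEL)

print(f"total sequence: {total_bp} bp")
print(f"1 SNP per {snp_spacing:.0f} bp, 1 InDel per {indel_spacing:.0f} bp")
```

    Under that assumption the computed values agree with the 167141 bp, 1 SNP/1013 bp and 1 InDel/4517 bp reported in the abstract.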

  11. Differentiation of Plum pox virus isolates by single-strand conformation polymorphism and low-stringency single specific primer PCR analysis of HC-Pro genome region.

    Science.gov (United States)

    Gadiou, S; Safárová, D; Navrátil, M

    2009-01-01

    Single-strand conformation polymorphism (SSCP) and low-stringency single specific primer (LSSP)-PCR were assessed for suitability and reliability in genotyping of Plum pox virus (PPV) isolates. Examined PPV isolates included 16 PPV-D, 12 PPV-M, and 14 PPV-Rec isolates collected in the Czech Republic. The analysis was performed on the helper component protease (HC-Pro) region of the PPV genome. Both SSCP and LSSP-PCR allowed the differentiation of PPV strains, but SSCP was not able to distinguish isolates within the same strain. Individual genotyping of each PPV isolate was achieved by LSSP-PCR. Nevertheless, both SSCP and LSSP-PCR techniques are suitable for preliminary screening of genetic variability of plant RNA viruses.

  12. Buccal cells DNA extraction to obtain high quality human genomic DNA suitable for polymorphism genotyping by PCR-RFLP and Real-Time PCR.

    Science.gov (United States)

    Küchler, Erika Calvano; Tannure, Patricia Nivoloni; Falagan-Lotsch, Priscila; Lopes, Taliria Silva; Granjeiro, Jose Mauro; Amorim, Lidia Maria Fonte

    2012-01-01

    The aim of this study was to evaluate, by PCR-RFLP and real-time PCR, the yield and quality of genomic DNA collected from buccal cells by mouthwash after different storage times at room temperature. A group of volunteers was recruited to collect buccal cells using a mouthwash solution. The collected solution was divided into 3 tubes: one tube was used for immediate extraction, and the remaining tubes received ethanol and were kept at room temperature for 4 and 8 days, followed by DNA extraction. The concentration, purity and integrity of the DNA were determined using spectrophotometry and electrophoresis. DNA quality differences among the three incubation times were also evaluated by genotyping the EGF +61 A/G (rs 4444903) polymorphism by PCR-RFLP and the IRF6 polymorphism (rs 17015215) using real-time PCR. There was no significant difference in DNA yield (p=0.75) or purity (p=0.86) among the three different incubation times. DNA obtained from the different incubation times was of high molecular weight. The PCR-RFLP and real-time PCR reactions were successfully performed for all DNA samples, even those extracted after 8 days of incubation. All samples genotyped by real-time PCR presented the C allele for the IRF6 gene polymorphism (homozygous: CC; heterozygous: CT), and the C allele was used as a reference for Ct values. The samples presented the same genotype for the different times in both techniques. We demonstrated that the method described herein is simple and low cost, and that DNA can be extracted and PCR amplified after storage in mouthwash solution at room temperature.

  13. Buccal cells DNA extraction to obtain high quality human genomic DNA suitable for polymorphism genotyping by PCR-RFLP and Real-Time PCR

    Directory of Open Access Journals (Sweden)

    Erika Calvano Küchler

    2012-08-01

    OBJECTIVE: The aim of this study was to evaluate, by PCR-RFLP and real-time PCR, the yield and quality of genomic DNA collected from buccal cells by mouthwash after different storage times at room temperature. MATERIAL AND METHODS: A group of volunteers was recruited to collect buccal cells using a mouthwash solution. The collected solution was divided into 3 tubes: one tube was used for immediate extraction, and the remaining tubes received ethanol and were kept at room temperature for 4 and 8 days, followed by DNA extraction. The concentration, purity and integrity of the DNA were determined using spectrophotometry and electrophoresis. DNA quality differences among the three incubation times were also evaluated by genotyping the EGF +61 A/G (rs 4444903) polymorphism by PCR-RFLP and the IRF6 polymorphism (rs 17015215) using real-time PCR. RESULTS: There was no significant difference in DNA yield (p=0.75) or purity (p=0.86) among the three different incubation times. DNA obtained from the different incubation times was of high molecular weight. The PCR-RFLP and real-time PCR reactions were successfully performed for all DNA samples, even those extracted after 8 days of incubation. All samples genotyped by real-time PCR presented the C allele for the IRF6 gene polymorphism (homozygous: CC; heterozygous: CT), and the C allele was used as a reference for Ct values. The samples presented the same genotype for the different times in both techniques. CONCLUSION: We demonstrated that the method described herein is simple and low cost, and that DNA can be extracted and PCR amplified after storage in mouthwash solution at room temperature.

  14. Nucleotide polymorphisms and haplotype diversity of RTCS gene in China elite maize inbred lines.

    Directory of Open Access Journals (Sweden)

    Enying Zhang

    The maize RTCS gene, encoding a LOB domain transcription factor, plays important roles in the initiation of embryonic seminal and postembryonic shoot-borne roots. In this study, the genomic sequences of this gene in 73 elite Chinese inbred lines, including 63 lines from 5 temperate heterotic groups and 10 tropical germplasms, were obtained, and the nucleotide polymorphisms and haplotype diversity were analyzed. A total of 63 sequence variants, including 44 SNPs and 19 indels, were identified at this locus, and most of them were located in the UTR and intron regions. The coding region of this gene across all tested inbred lines carried 14 haplotypes, encoding 7 different RTCS proteins. Analysis of the polymorphic sites revealed that at least 6 recombination events have occurred. Among all 6 groups tested, only the P heterotic group had a much lower nucleotide diversity than the whole set, and selection analysis also revealed that only this group was under strong negative selection. However, the set of Huangzaosi and its derived lines possessed a higher nucleotide diversity than the whole set, and no selection signal was identified.

  15. Automated discovery of single nucleotide polymorphism and simple sequence repeat molecular genetic markers.

    Science.gov (United States)

    Batley, Jacqueline; Jewell, Erica; Edwards, David

    2007-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate applications in modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.
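    As a rough illustration of what SSR discovery in a DNA sequence involves, the sketch below scans a string for perfect tandem repeats with a regular expression. It is not the SSRPrimer algorithm, which the abstract does not describe; the motif length range (2 to 6 bp), the minimum of four copies, and all names are assumptions chosen for the example.

```python
import re

# A motif of 2-6 bases followed by at least 3 further copies of itself
# (4 copies in total). Both thresholds are illustrative assumptions.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence: str) -> list[tuple[int, str, int]]:
    """Return (start index, motif, copy number) for each perfect SSR found."""
    seq = sequence.upper()
    hits = []
    for match in SSR_PATTERN.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical test sequence: (AC)5 and (AGC)4 embedded in flanking bases.
example = "TTG" + "AC" * 5 + "AGGT" + "AGC" * 4 + "ATT"
ssrs = find_ssrs(example)
for start, motif, copies in ssrs:
    print(f"position {start}: ({motif}){copies}")
```

    A production tool would also handle imperfect repeats and ambiguous bases, and would pass the flanking sequence to a primer-design step, none of which is attempted here.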

  16. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    Energy Technology Data Exchange (ETDEWEB)

    Abraham, Paul E [ORNL; Adams, Rachel M [ORNL; Tuskan, Gerald A [ORNL; Hettich, Robert {Bob} L [ORNL

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphisms in Populus and potentially other species as well.

  17. Detection of Hereditary 1,25-Hydroxyvitamin D-Resistant Rickets Caused by Uniparental Disomy of Chromosome 12 Using Genome-Wide Single Nucleotide Polymorphism Array

    Science.gov (United States)

    Tamura, Mayuko; Isojima, Tsuyoshi; Kawashima, Minae; Yoshida, Hideki; Yamamoto, Keiko; Kitaoka, Taichi; Namba, Noriyuki; Oka, Akira; Ozono, Keiichi; Tokunaga, Katsushi; Kitanaka, Sachiko

    2015-01-01

    Context: Hereditary 1,25-dihydroxyvitamin D-resistant rickets (HVDRR) is an autosomal recessive disease caused by biallelic mutations in the vitamin D receptor (VDR) gene. No patients have been reported with uniparental disomy (UPD). Objective: Using genome-wide single nucleotide polymorphism (SNP) array to confirm whether HVDRR was caused by UPD of chromosome 12. Materials and Methods: A 2-year-old girl with alopecia and short stature and without any family history of consanguinity was diagnosed with HVDRR by typical laboratory data findings and clinical features of rickets. Sequence analysis of VDR was performed, and the origin of the homozygous mutation was investigated by target SNP sequencing, short tandem repeat analysis, and genome-wide SNP array. Results: The patient had a homozygous p.Arg73Ter nonsense mutation. Her mother was heterozygous for the mutation, but her father was negative. We excluded gross deletion of the father’s allele or paternal discordance. Genome-wide SNP array of the family (the patient and her parents) showed complete maternal isodisomy of chromosome 12. She was successfully treated with high-dose oral calcium. Conclusions: This is the first report of HVDRR caused by UPD, and the third case of complete UPD of chromosome 12, in the published literature. Genome-wide SNP array was useful for detecting isodisomy and the parental origin of the allele. Comprehensive examination of the homozygous state is essential for accurate genetic counseling of recurrence risk and appropriate monitoring for other chromosome 12 related disorders. Furthermore, oral calcium therapy was effective as an initial treatment for rickets in this instance. PMID:26153892

  18. Detection of Hereditary 1,25-Hydroxyvitamin D-Resistant Rickets Caused by Uniparental Disomy of Chromosome 12 Using Genome-Wide Single Nucleotide Polymorphism Array.

    Directory of Open Access Journals (Sweden)

    Mayuko Tamura

    Hereditary 1,25-dihydroxyvitamin D-resistant rickets (HVDRR) is an autosomal recessive disease caused by biallelic mutations in the vitamin D receptor (VDR) gene. No patients have been reported with uniparental disomy (UPD). We used a genome-wide single nucleotide polymorphism (SNP) array to confirm whether HVDRR was caused by UPD of chromosome 12. A 2-year-old girl with alopecia and short stature and without any family history of consanguinity was diagnosed with HVDRR by typical laboratory data findings and clinical features of rickets. Sequence analysis of VDR was performed, and the origin of the homozygous mutation was investigated by target SNP sequencing, short tandem repeat analysis, and genome-wide SNP array. The patient had a homozygous p.Arg73Ter nonsense mutation. Her mother was heterozygous for the mutation, but her father was negative. We excluded gross deletion of the father's allele or paternal discordance. Genome-wide SNP array of the family (the patient and her parents) showed complete maternal isodisomy of chromosome 12. She was successfully treated with high-dose oral calcium. This is the first report of HVDRR caused by UPD, and the third case of complete UPD of chromosome 12, in the published literature. Genome-wide SNP array was useful for detecting isodisomy and the parental origin of the allele. Comprehensive examination of the homozygous state is essential for accurate genetic counseling of recurrence risk and appropriate monitoring for other chromosome 12 related disorders. Furthermore, oral calcium therapy was effective as an initial treatment for rickets in this instance.

  19. Analysis of genomic DNA polymorphisms in six shrimp species

    Institute of Scientific and Technical Information of China (English)

    许玉德; 孙晟

    2001-01-01

    Genomic DNA polymorphisms in six shrimp species (Macrobrachium rosenbergii, Aristeus virilis, Penaeus penicillatus, P. japonicus, P. monodon and Metapenaeus joyneri) were detected using the randomly amplified polymorphic DNA (RAPD) method. Amplification with 20 random primers gave 492 reproducible fragments. An index of genetic similarity (F) was calculated from the degree of fragment sharing. The value of (1 - F) was used to evaluate genetic distances between species and to construct a phylogenetic tree. The RAPD analysis is consistent with the existing taxonomic system of shrimp. Overall, the results reflect, at the genomic DNA level, the phylogenetic relationships among shrimp at the family, genus and species ranks, and provide molecular evidence supporting the current classification.
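    The step from shared RAPD fragments to a genetic distance can be sketched as follows. The abstract does not give the similarity formula, so the code assumes the widely used band-sharing index F = 2 Nxy / (Nx + Ny), where Nxy is the number of fragments shared by two samples and Nx, Ny are their fragment counts; the species labels and band names are invented for illustration.

```python
from itertools import combinations


def band_sharing_similarity(bands_x: set[str], bands_y: set[str]) -> float:
    """Similarity index F = 2*Nxy / (Nx + Ny) (ASSUMED formula)."""
    total = len(bands_x) + len(bands_y)
    if total == 0:
        raise ValueError("at least one profile must contain a band")
    return 2 * len(bands_x & bands_y) / total


def pairwise_distances(profiles: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Genetic distance (1 - F) for every pair of profiles."""
    return {
        (a, b): 1 - band_sharing_similarity(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical band profiles: each label is "<primer>_<fragment size>".
profiles = {
    "species_A": {"p1_500", "p1_800", "p2_300"},
    "species_B": {"p1_500", "p2_300", "p2_650"},
    "species_C": {"p1_1200", "p2_650"},
}

dist = pairwise_distances(profiles)
for pair, d in dist.items():
    print(pair, round(d, 3))
```

    The resulting distance table is the kind of input that a clustering method such as UPGMA or neighbor joining would take to build a tree; the abstract does not say which method the authors used.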

  20. Whole genome analysis of an MDR Beijing/W strain of Mycobacterium tuberculosis with large genomic deletions associated with resistance to isoniazid.

    Science.gov (United States)

    Zhang, Qiufen; Wan, Baoshan; Zhou, Aiping; Ni, Jinjing; Xu, Zhihong; Li, Shuxian; Tao, Jing; Yao, YuFeng

    2016-05-15

    Mycobacterium tuberculosis (M.tb) is one of the most prevalent bacterial pathogens in the world. With its wide geographical spread and hypervirulence, the Beijing/W family is the most successful M.tb lineage. China has a high burden of tuberculosis (TB) and multidrug-resistant TB (MDR-TB), and Beijing/W family strains account for the largest share of MDR strains. To study the genetic basis of the virulence and drug resistance of Beijing/W family strains, we performed whole genome sequencing of M.tb strain W146, a clinical Beijing/W genotype MDR strain isolated in Wuxi, Jiangsu province, China. Compared with the genome sequence of M.tb strain H37Rv, strain W146 lacks three large fragments, and the loss of the furA-katG operon confers isoniazid resistance. Besides the loss of the furA-katG operon, strain W146 harbored almost all known drug resistance-associated mutations. Comparative analysis of single nucleotide polymorphisms (SNPs) and indels between strain W146, other Beijing/W genotype strains and non-Beijing/W genotype strains revealed that strain W146 possessed some unique mutations, which may be related to drug resistance, transmission and pathogenicity. These findings will help to understand the large sequence polymorphisms (LSPs) and the genetic characteristics related to transmission and drug resistance of the Beijing/W genotype of M.tb.

  1. Identification and analysis of Single Nucleotide Polymorphisms (SNPs) in the mosquito Anopheles funestus, malaria vector

    Directory of Open Access Journals (Sweden)

    Hemingway Janet

    2007-01-01

    Background: Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in eukaryotic species and have become an important marker for genetic studies. The mosquito Anopheles funestus is one of the major malaria vectors in Africa and yet, prior to this study, no SNPs had been described for this species. Here we report a genome-wide set of SNP markers for use in genetic studies on this important human disease vector. Results: DNA fragments from 50 genes were amplified and sequenced from 21 specimens of An. funestus. A third of the specimens were field collected in Malawi, a third came from a colony of Mozambican origin and a third from a colony of Angolan origin. A total of 494 SNPs, including 303 within the coding regions of genes, and 5 indels were identified. The physical positions of these SNPs in the genome are known. There were on average 7 SNPs per kilobase, similar to that observed in An. gambiae and Drosophila melanogaster. Transitions outnumbered transversions, at a ratio of 2:1. The increased frequency of transition substitutions in coding regions is likely due to the structure of the genetic code and selective constraints. Synonymous sites within coding regions showed a higher polymorphism rate than non-coding introns or 3' and 5' flanking DNA, with most of the substitutions in coding regions being observed at the 3rd codon position. A positive correlation in the level of polymorphism was observed between coding and non-coding regions within a gene. By genotyping a subset of 30 SNPs, we confirmed the validity of the SNPs identified during this study. Conclusion: This set of SNP markers represents a useful tool for genetic studies in An. funestus, and will be useful in identifying candidate genes that affect diverse ranges of phenotypes that impact on vector control, such as insecticide resistance, mosquito behavior and vector competence.
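    The transition to transversion ratio reported in this record rests on a simple classification: a substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition, and any purine to pyrimidine change is a transversion. The sketch below shows that classification on invented data; the function names and the toy substitution list are assumptions, not data from the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions: list[tuple[str, str]]) -> float:
    """Ratio of transitions to transversions in a list of (ref, alt) pairs."""
    counts = Counter(classify_substitution(r, a) for r, a in substitutions)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return counts["transition"] / counts["transversion"]


# Toy data (hypothetical): four transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(toy_snps)
print(f"Ts/Tv = {ratio:.1f}")
```

    The toy list was built to give a ratio of 2.0, matching the 2:1 figure quoted in the abstract only by construction.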

  2. Failure to lyse venous thrombi because of elevated plasminogen activator Inhibitor 1 (PAI-1) and 4G polymorphism of its promotor genome (The PAI-1/4G Syndrome).

    Science.gov (United States)

    Bern, Murray M; McCarthy, Nancy

    2010-10-01

    Plasminogen activator inhibitor 1 (PAI-1) inhibits plasminogen activators, leading to decreased fibrinolysis and increased risk of thromboembolic disease (TED). Shifts in the PAI-1 promoter genome from normal 5G>5G to 4G>5G or 4G>4G alleles are associated with overexpression of PAI-1. In this study, patients with residual venous thrombi were observed to have increased PAI-1 levels and more frequent shifts to 4G alleles. Of the 26 patients with unresolved thrombus, 20 (76.9%) had elevated PAI-1 values. 4G genomic shifts were found in 92.9% of the patients studied. Normal PAI-1 levels were found in 5 patients with 4G polymorphisms. Thus, PAI-1 is often elevated among patients with residual thrombus, with an unexpectedly high prevalence of the 4G polymorphism of the promoter genome. Patients with persistent thrombus should be considered at risk of having constitutively increased PAI-1 due to genomic changes in the PAI-1 promoter genome. Hypotheses are proposed to explain those with normal PAI-1, despite having 4G polymorphisms.

  3. The South Asian genome.

    Science.gov (United States)

    Chambers, John C; Abbott, James; Zhang, Weihua; Turro, Ernest; Scott, William R; Tan, Sian-Tsung; Afzal, Uzma; Afaq, Saima; Loh, Marie; Lehne, Benjamin; O'Reilly, Paul; Gaulton, Kyle J; Pearson, Richard D; Li, Xinzhong; Lavery, Anita; Vandrovcova, Jana; Wass, Mark N; Miller, Kathryn; Sehmi, Joban; Oozageer, Laticia; Kooner, Ishminder K; Al-Hussaini, Abtehale; Mills, Rebecca; Grewal, Jagvir; Panoulas, Vasileios; Lewin, Alexandra M; Northwood, Korrinne; Wander, Gurpreet S; Geoghegan, Frank; Li, Yingrui; Wang, Jun; Aitman, Timothy J; McCarthy, Mark I; Scott, James; Butcher, Sarah; Elliott, Paul; Kooner, Jaspal S

    2014-01-01

    The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease, which are highly prevalent amongst South Asians.

  4. The South Asian genome.

    Directory of Open Access Journals (Sweden)

    John C Chambers

    The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease, which are highly prevalent amongst South Asians.

  5. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh.)

    Directory of Open Access Journals (Sweden)

    Luca Bianco

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  6. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    Science.gov (United States)

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  7. Identification of chloroplast genome loci suitable for high-resolution phylogeographic studies of Colocasia esculenta (L.) Schott (Araceae) and closely related taxa.

    Science.gov (United States)

    Ahmed, Ibrar; Matthews, Peter J; Biggs, Patrick J; Naeem, Muhammad; McLenachan, Patricia A; Lockhart, Peter J

    2013-09-01

    Recently, we reported the chloroplast genome-wide association of oligonucleotide repeats, indels and nucleotide substitutions in aroid chloroplast genomes. We hypothesized that the distribution of oligonucleotide repeat sequences in a single representative genome can be used to identify mutational hotspots and loci suitable for population genetic, phylogenetic and phylogeographic studies. Using information on the location of oligonucleotide repeats in the chloroplast genome of taro (Colocasia esculenta), we designed 30 primer pairs to amplify and sequence polymorphic loci. The primers have been tested in a range of intra-specific to intergeneric comparisons, including ten taro samples (Colocasia esculenta) from diverse geographical locations, four other Colocasia species (C. affinis, C. fallax, C. formosana, C. gigantea) and three other aroid genera (represented by Remusatia vivipara, Alocasia brisbanensis and Amorphophallus konjac). Multiple sequence alignments for the intra-specific comparison revealed nucleotide substitutions (point mutations) at all 30 loci and microsatellite polymorphisms at 14 loci. The primer pairs reported here reveal levels of genetic variation suitable for high-resolution phylogeographic and evolutionary studies of taro and other closely related aroids. Our results confirm that information on repeat distribution can be used to identify loci suitable for such studies, and we expect that this approach can be used in other plant groups.

  8. A new approach to in silico SNP detection and some new SNPs in the Bacillus anthracis genome

    Directory of Open Access Journals (Sweden)

    Francoeur Joe

    2011-04-01

    Background: Bacillus anthracis is one of the most monomorphic pathogens known. Identification of polymorphisms in its genome is essential for taxonomic classification, for determination of recent evolutionary changes, and for evaluation of pathogenic potency. Findings: In this work three strains of the Bacillus anthracis genome are compared and previously unpublished single nucleotide polymorphisms (SNPs) are revealed. Moreover, it is shown that, despite the highly monomorphic nature of Bacillus anthracis, the SNPs are (1) abundant in the genome and (2) distributed relatively uniformly across the sequence. Conclusions: The findings support the proposition that SNPs, together with indels and variable number tandem repeats (VNTRs), can be used effectively not only for the differentiation of perfect strain data, but also for the comparison of moderately incomplete, noisy and, in some cases, unknown Bacillus anthracis strains. In the case when the data is of still lower quality, a new DNA sequence fingerprinting approach based on recently introduced markers, based on combinatorial-analytic concepts and called cyclic difference sets, can be used.

  9. Genome-wide association study identifies single-nucleotide polymorphism in KCNB1 associated with left ventricular mass in humans: The HyperGEN Study

    Directory of Open Access Journals (Sweden)

    Kraemer Rachel

    2009-05-01

    Background: We conducted a genome-wide association study (GWAS) and validation study for left ventricular (LV) mass in the Family Blood Pressure Program – HyperGEN population. LV mass is a sensitive predictor of cardiovascular mortality and morbidity in all genders, races, and ages. Polymorphisms of candidate genes in diverse pathways have been associated with LV mass. However, subsequent studies have often failed to replicate these associations. Genome-wide association studies have unprecedented power to identify potential genes with modest effects on LV mass. We describe here a GWAS for LV mass in Caucasians using the Affymetrix GeneChip Human Mapping 100 k Set. Cases (N = 101) and controls (N = 101) were selected from extreme tails of the LV mass index distribution from 906 individuals in the HyperGEN study. Eleven of 12 promising (Q Results: Despite the relatively small sample, we identified 12 promising SNPs in the GWAS. Eleven SNPs were successfully genotyped in the validation study of 704 Caucasians and 1467 African Americans; 5 SNPs on chromosomes 5, 12, and 20 were significantly (P ≤ 0.05) associated with LV mass after correction for multiple testing. One SNP (rs756529) is intragenic within KCNB1, which is dephosphorylated by calcineurin, a previously reported candidate gene for LV hypertrophy within this population. Conclusion: These findings suggest KCNB1 may be involved in the development of LV hypertrophy in humans.

  10. Polymorphisms in O-methyltransferase genes are associated with stover cell wall digestibility in European maize (Zea mays L.).

    Science.gov (United States)

    Brenner, Everton A; Zein, Imad; Chen, Yongsheng; Andersen, Jeppe R; Wenzel, Gerhard; Ouzunova, Milena; Eder, Joachim; Darnhofer, Birte; Frei, Uschi; Barrière, Yves; Lübberstedt, Thomas

    2010-02-12

    OMT (O-methyltransferase) genes are involved in lignin biosynthesis, which relates to stover cell wall digestibility. Reduced lignin content is an important determinant of both forage quality and ethanol conversion efficiency of maize stover. Variation in genomic sequences coding for COMT, CCoAOMT1, and CCoAOMT2 was analyzed in relation to stover cell wall digestibility for a panel of 40 European forage maize inbred lines, and re-analyzed for a panel of 34 lines from a published French study. Different methodologies for association analysis were performed and compared. Across association methodologies, a total number of 25, 12, 1, 6 COMT polymorphic sites were significantly associated with DNDF, OMD, NDF, and WSC, respectively. Association analysis for CCoAOMT1 and CCoAOMT2 identified substantially fewer polymorphic sites (3 and 2, respectively) associated with the investigated traits. Our re-analysis on the 34 lines from a published French dataset identified 14 polymorphic sites significantly associated with cell wall digestibility, two of them were consistent with our study. Promising polymorphisms putatively causally associated with variability of cell wall digestibility were inferred from the total number of significantly associated SNPs/Indels. Several polymorphic sites for three O-methyltransferase loci were associated with stover cell wall digestibility. All three tested genes seem to be involved in controlling DNDF, in particular COMT. Thus, considerable variation among Bm3 wildtype alleles can be exploited for improving cell-wall digestibility. Target sites for functional markers were identified enabling development of efficient marker-based selection strategies.

  11. Polymorphisms in O-methyltransferase genes are associated with stover cell wall digestibility in European maize (Zea mays L.)

    Directory of Open Access Journals (Sweden)

    Darnhofer Birte

    2010-02-01

    Background: OMT (O-methyltransferase) genes are involved in lignin biosynthesis, which relates to stover cell wall digestibility. Reduced lignin content is an important determinant of both forage quality and ethanol conversion efficiency of maize stover. Results: Variation in genomic sequences coding for COMT, CCoAOMT1, and CCoAOMT2 was analyzed in relation to stover cell wall digestibility for a panel of 40 European forage maize inbred lines, and re-analyzed for a panel of 34 lines from a published French study. Different methodologies for association analysis were performed and compared. Across association methodologies, a total number of 25, 12, 1, 6 COMT polymorphic sites were significantly associated with DNDF, OMD, NDF, and WSC, respectively. Association analysis for CCoAOMT1 and CCoAOMT2 identified substantially fewer polymorphic sites (3 and 2, respectively) associated with the investigated traits. Our re-analysis on the 34 lines from a published French dataset identified 14 polymorphic sites significantly associated with cell wall digestibility, two of them were consistent with our study. Promising polymorphisms putatively causally associated with variability of cell wall digestibility were inferred from the total number of significantly associated SNPs/Indels. Conclusions: Several polymorphic sites for three O-methyltransferase loci were associated with stover cell wall digestibility. All three tested genes seem to be involved in controlling DNDF, in particular COMT. Thus, considerable variation among Bm3 wildtype alleles can be exploited for improving cell-wall digestibility. Target sites for functional markers were identified enabling development of efficient marker-based selection strategies.

  12. Genome-wide association mapping in tomato (Solanum lycopersicum) is possible using genome admixture of Solanum lycopersicum var. cerasiforme.

    Science.gov (United States)

    Ranc, Nicolas; Muños, Stephane; Xu, Jiaxin; Le Paslier, Marie-Christine; Chauveau, Aurélie; Bounon, Rémi; Rolland, Sophie; Bouchet, Jean-Paul; Brunel, Dominique; Causse, Mathilde

    2012-08-01

    Genome-wide association mapping is an efficient way to identify quantitative trait loci controlling the variation of phenotypes, but the approach suffers severe limitations when one is studying inbred crops like cultivated tomato (Solanum lycopersicum). Such crops exhibit low rates of molecular polymorphism and high linkage disequilibrium, which reduces mapping resolution. The cherry type tomato (S. lycopersicum var. cerasiforme) genome has been described as an admixture between the cultivated tomato and its wild ancestor, S. pimpinellifolium. We have thus taken advantage of the properties of this admixture to improve the resolution of association mapping in tomato. As a proof of concept, we sequenced 81 DNA fragments distributed on chromosome 2 at different distances in a core collection of 90 tomato accessions, including mostly cherry type tomato accessions. The 81 Sequence Tag Sites revealed 352 SNPs and indels. Molecular diversity was greatest for S. pimpinellifolium accessions, intermediate for S. l. cerasiforme accessions, and lowest for the cultivated group. We assessed the structure of molecular polymorphism and the extent of linkage disequilibrium over genetic and physical distances. Linkage disequilibrium decreased below r² = 0.3 within 1 cM, and the minimal estimated value (r² = 0.13) was reached within 20 kb over the physical regions studied. Associations between polymorphisms and fruit weight, locule number, and soluble solid content were detected. Several candidate genes and quantitative trait loci previously identified were validated and new associations detected. This study shows the advantages of using a collection of S. l. cerasiforme accessions to overcome the low resolution of association mapping in tomato.
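
    The linkage disequilibrium values quoted in this abstract are squared correlations between pairs of loci. As a minimal sketch, assuming two biallelic loci with known haplotype and allele frequencies, the statistic can be computed as below. The function name `ld_r2` and the example frequencies are hypothetical and are not taken from the study.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return (d * d) / denom


# Hypothetical frequencies for illustration only.
print(ld_r2(0.40, 0.5, 0.5))
```

    Values near 1 indicate that the two loci are almost always inherited together; values near 0 indicate near independence, which is what allows finer mapping resolution.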

  13. Comprehensive identification of single nucleotide polymorphisms associated with beta-lactam resistance within pneumococcal mosaic genes.

    Science.gov (United States)

    Chewapreecha, Claire; Marttinen, Pekka; Croucher, Nicholas J; Salter, Susannah J; Harris, Simon R; Mather, Alison E; Hanage, William P; Goldblatt, David; Nosten, Francois H; Turner, Claudia; Turner, Paul; Bentley, Stephen D; Parkhill, Julian

    2014-08-01

    Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of "mosaic genes" as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.

  14. Comprehensive identification of single nucleotide polymorphisms associated with beta-lactam resistance within pneumococcal mosaic genes.

    Directory of Open Access Journals (Sweden)

    Claire Chewapreecha

    2014-08-01

    Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of "mosaic genes" as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.

  15. Word Reading Fluency: Role of Genome-Wide Single-Nucleotide Polymorphisms in Developmental Stability and Correlations with Print Exposure

    Science.gov (United States)

    Harlaar, Nicole; Trzaskowski, Maciej; Dale, Philip S.; Plomin, Robert

    2014-01-01

    The genetic effects on individual differences in reading development were examined using genome-wide complex trait analysis (GCTA) in a twin sample. In unrelated individuals (one twin per pair, n = 2,942), the GCTA-based heritability of reading fluency was ~20%-29% at ages 7 and 12. GCTA bivariate results showed that the phenotypic stability of…

  16. Isolation and characterization of 13 new polymorphic microsatellite markers in the Phaseolus vulgaris L. (Common Bean) genome.

    Science.gov (United States)

    Wang, Aihua; Ding, Yi; Hu, Zhenhua; Lin, Chufa; Wang, Shuzhen; Wang, Bingcai; Zhang, Hongyuan; Zhou, Guolin

    2012-01-01

    In this study, 13 polymorphic microsatellite markers were isolated from Phaseolus vulgaris L. (common bean) by using the Fast Isolation by AFLP of Sequence COntaining Repeats (FIASCO) protocol. These markers revealed two to seven alleles, with an average of 3.64 alleles per locus. The polymorphic information content (PIC) values ranged from 0.055 to 0.721 over 13 loci, with a mean value of 0.492, and 7 loci having PIC greater than 0.5. The expected heterozygosity (H(E)) and observed heterozygosity (H(O)) levels ranged from 0.057 to 0.814 and from 0.026 to 0.531, respectively. Cross-species amplification of the 13 primer pairs was performed in the related species Vigna unguiculata L. Seven of these markers showed cross-species transferability. These markers will be useful for future genetic diversity and population genetics studies for this agricultural species and its related species.
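
    The expected heterozygosity and PIC statistics reported above are both functions of the allele frequencies at a locus. The sketch below uses the commonly cited definitions, H_E = 1 − Σ p_i² and PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². Whether the authors used exactly these estimators (for example, with or without a sample-size correction) is an assumption, and the function names and example frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """H_E = 1 - sum(p_i^2) for the allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical three-allele locus, not data from the study.
example = [0.5, 0.3, 0.2]
print(round(expected_heterozygosity(example), 3), round(pic(example), 3))
```

    Under these definitions a biallelic marker cannot exceed PIC = 0.375 (reached at equal allele frequencies), which is why PIC values above 0.5 require multi-allelic markers such as microsatellites.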

  17. Isolation and Characterization of 13 New Polymorphic Microsatellite Markers in the Phaseolus vulgaris L. (Common Bean) Genome

    Directory of Open Access Journals (Sweden)

    Aihua Wang

    2012-09-01

    In this study, 13 polymorphic microsatellite markers were isolated from Phaseolus vulgaris L. (common bean) by using the Fast Isolation by AFLP of Sequence COntaining Repeats (FIASCO) protocol. These markers revealed two to seven alleles, with an average of 3.64 alleles per locus. The polymorphic information content (PIC) values ranged from 0.055 to 0.721 over 13 loci, with a mean value of 0.492, and 7 loci having PIC greater than 0.5. The expected heterozygosity (HE) and observed heterozygosity (HO) levels ranged from 0.057 to 0.814 and from 0.026 to 0.531, respectively. Cross-species amplification of the 13 primer pairs was performed in the related species Vigna unguiculata L. Seven of these markers showed cross-species transferability. These markers will be useful for future genetic diversity and population genetics studies for this agricultural species and its related species.

  18. A 2-Stage Genome-Wide Association Study to Identify Single Nucleotide Polymorphisms Associated With Development of Erectile Dysfunction Following Radiation Therapy for Prostate Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Kerns, Sarah L. [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); Departments of Pathology and Genetics, Albert Einstein College of Medicine, Bronx, New York (United States); Stock, Richard [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); Stone, Nelson [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); Department of Urology, Mount Sinai School of Medicine, New York, New York (United States); Buckstein, Michael [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); Shao, Yongzhao [Division of Biostatistics, New York University School of Medicine, New York, New York (United States); Campbell, Christopher [Departments of Pathology and Genetics, Albert Einstein College of Medicine, Bronx, New York (United States); Rath, Lynda [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); De Ruysscher, Dirk; Lammering, Guido [Department of Radiation Oncology, Maastricht University Medical Center, Maastricht (Netherlands); Hixson, Rosetta; Cesaretti, Jamie; Terk, Mitchell [Florida Radiation Oncology Group, Jacksonville, Florida (United States); Ostrer, Harry [Departments of Pathology and Genetics, Albert Einstein College of Medicine, Bronx, New York (United States); Rosenstein, Barry S., E-mail: barry.rosenstein@mssm.edu [Department of Radiation Oncology, Mount Sinai School of Medicine, New York, New York (United States); Department of Radiation Oncology, New York University School of Medicine, New York, New York (United States); Departments of Dermatology and Preventive Medicine, Mount Sinai School of Medicine, New York, New York (United States)

    2013-01-01

    Purpose: To identify single nucleotide polymorphisms (SNPs) associated with development of erectile dysfunction (ED) among prostate cancer patients treated with radiation therapy. Methods and Materials: A 2-stage genome-wide association study was performed. Patients were split randomly into a stage I discovery cohort (132 cases, 103 controls) and a stage II replication cohort (128 cases, 102 controls). The discovery cohort was genotyped using Affymetrix 6.0 genome-wide arrays. The 940 top ranking SNPs selected from the discovery cohort were genotyped in the replication cohort using Illumina iSelect custom SNP arrays. Results: Twelve SNPs identified in the discovery cohort and validated in the replication cohort were associated with development of ED following radiation therapy (Fisher combined P values 2.1 × 10⁻⁵ to 6.2 × 10⁻⁴). Notably, these 12 SNPs lie in or near genes involved in erectile function or other normal cellular functions (adhesion and signaling) rather than DNA damage repair. In a multivariable model including nongenetic risk factors, the odds ratios for these SNPs ranged from 1.6 to 5.6 in the pooled cohort. There was a striking relationship between the cumulative number of SNP risk alleles an individual possessed and ED status (Somers' D P value = 1.7 × 10⁻²⁹). A 1-allele increase in cumulative SNP score increased the odds for developing ED by a factor of 2.2 (P value = 2.1 × 10⁻¹⁹). The cumulative SNP score model had a sensitivity of 84% and specificity of 75% for prediction of developing ED at the radiation therapy planning stage. Conclusions: This genome-wide association study identified a set of SNPs that are associated with development of ED following radiation therapy. These candidate genetic predictors warrant more definitive validation in an independent cohort.
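
    The per-allele odds factor of 2.2 reported above compounds multiplicatively if the cumulative SNP score enters a logistic model linearly. That linearity is an assumption made here for illustration, as are the function names and the baseline odds, none of which come from the study. A minimal sketch:

```python
def odds_multiplier(extra_risk_alleles, per_allele_odds_ratio=2.2):
    """Factor by which the odds change after adding risk alleles.

    Assumes a logistic model that is linear in the cumulative score,
    so each additional allele multiplies the odds by the same ratio.
    """
    return per_allele_odds_ratio ** extra_risk_alleles


def probability_from_odds(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


# Hypothetical baseline odds, chosen only to make the example concrete.
baseline_odds = 0.25
for k in range(4):
    odds = baseline_odds * odds_multiplier(k)
    print(k, round(odds, 3), round(probability_from_odds(odds), 3))
```

    Because the effect is multiplicative on the odds scale, a few additional risk alleles can move a patient from a low to a high predicted probability, which is consistent with the strong score-to-status relationship the abstract describes.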

  19. A method for the analysis of 32 X chromosome insertion deletion polymorphisms in a single PCR

    DEFF Research Database (Denmark)

    Pereira, Rui; Pereira, Vania; Gomes, Iva;

    2012-01-01

    , given its special transmission pattern. The X chromosome markers brought new insights into the history of modern human populations and also proved useful in forensic kinship investigations, namely in deficient relationship cases and in cases where autosomes are uninformative. This work describes an X......-Indel multiplex system amplifying 32 biallelic markers in one single PCR. The multiplex includes X-Indels shown to be polymorphic in the major human population groups and follows a short amplicon strategy. The set was applied in the genetic characterization of sub-Saharan African, European and East Asian...

  20. The role of genome and gene regulatory network canalization in the evolution of multi-trait polymorphisms and sympatric speciation

    Directory of Open Access Journals (Sweden)

    Hogeweg Paulien

    2009-07-01

    Background: Sexual reproduction has classically been considered as a barrier to the buildup of discrete phenotypic differentiation. This notion has been confirmed by models of sympatric speciation in which a fixed genetic architecture and a linear genotype phenotype mapping were assumed. In this paper we study the influence of a flexible genetic architecture and non-linear genotype phenotype map on differentiation under sexual reproduction. We use an individual based model in which organisms have a genome containing genes and transcription factor binding sites. Mutations involve single genes or binding sites or stretches of genome. The genome codes for a regulatory network that determines the gene expression pattern and hence the phenotype of the organism, resulting in a non-linear genotype phenotype map. The organisms compete in a multi-niche environment, imposing selection for phenotypic differentiation. Results: We find as a generic outcome the evolution of discrete clusters of organisms adapted to different niches, despite random mating. Organisms from different clusters are distinct on the genotypic, the network and the phenotypic level. However, the genome and network differences are constrained to a subset of the genome locations, a process we call genotypic canalization. We demonstrate how this canalization leads to an increased robustness to recombination and increasing hybrid fitness. Finally, in case of assortative mating, we explain how this canalization increases the effectiveness of assortativeness. Conclusion: We conclude that in case of a flexible genetic architecture and a non-linear genotype phenotype mapping, sexual reproduction does not constrain phenotypic differentiation, but instead constrains the genotypic differences underlying it. We hypothesize that, as genotypic canalization enables differentiation despite random mating and increases the effectiveness of assortative mating, sympatric speciation is more likely

  1. Phylogenetic inference under varying proportions of indel-induced alignment gaps

    Directory of Open Access Journals (Sweden)

    Gadagkar Sudhindra R

    2009-08-01

    Background: The effect of alignment gaps on phylogenetic accuracy has been the subject of numerous studies. In this study, we investigated the relationship between the total number of gapped sites and phylogenetic accuracy, when the gaps were introduced (by means of computer simulation) to reflect indel (insertion/deletion) events during the evolution of DNA sequences. The resulting (true) alignments were subjected to commonly used gap treatment and phylogenetic inference methods. Results: (1) In general, there was a strong – almost deterministic – relationship between the amount of gap in the data and the level of phylogenetic accuracy when the alignments were very "gappy", (2) gaps resulting from deletions (as opposed to insertions) contributed more to the inaccuracy of phylogenetic inference, (3) the probabilistic methods (Bayesian, PhyML & "MLε," a method implemented in DNAML in PHYLIP) performed better at most levels of gap percentage when compared to parsimony (MP) and distance (NJ) methods, with Bayesian analysis being clearly the best, (4) methods that treat gapped sites as missing data yielded less accurate trees when compared to those that attribute phylogenetic signal to the gapped sites (by coding them as binary character data – presence/absence, or as in the MLε method), and (5) in general, the accuracy of phylogenetic inference depended upon the amount of available data when the gaps resulted from mainly deletion events, and the amount of missing data when insertion events were equally likely to have caused the alignment gaps. Conclusion: When gaps in an alignment are a consequence of indel events in the evolution of the sequences, the accuracy of phylogenetic analysis is likely to improve if: (1) alignment gaps are categorized as arising from insertion events or deletion events and then treated separately in the analysis, (2) the evolutionary signal provided by indels is harnessed in the phylogenetic analysis, and (3) methods that

  2. Polymorphic amplified typing sequences (PATS) and pulsed-field gel electrophoresis (PFGE) yield comparable results in the strain typing of a diverse set of bovine Escherichia coli O157 isolates

    Science.gov (United States)

    The PCR-based Escherichia coli O157 (O157) strain typing system, Polymorphic Amplified Typing Sequences (PATS), targets insertions-deletions (Indels) and single nucleotide polymorphisms (SNPs) at the XbaI and AvrII(BlnI) restriction enzyme sites, respectively, besides amplifying four known virulenc...

  3. Development and Characterization of New Single Nucleotide Polymorphism Markers from Expressed Sequence Tags in Common Carp (Cyprinus carpio)

    Directory of Open Access Journals (Sweden)

    Xiaomu Yu

    2012-06-01

    The common carp (Cyprinus carpio) is an important aquaculture fish worldwide but only limited single nucleotide polymorphism (SNP) markers are characterized from expressed sequence tags (ESTs) in this species. In this study, 1487 putative SNPs were bioinformatically mined from 14,066 online ESTs mainly from the European common carp, with the occurrence rate of about one SNP every 173 bp. One hundred and twenty-one of these SNPs were selected for validation using PCR fragment sequencing, and 48 out of 81 primers could amplify the expected fragments in the Chinese common carp genome. Only 26 (21.5%) putative SNPs were validated; however, 508 new SNPs and 68 indels were identified. The ratios of transitions to transversions were 1.77 for exon SNPs and 1.05 for intron SNPs. All the 23 SNPs selected for population tests were polymorphic, with the observed heterozygosity (Ho) ranging from 0.053 to 0.526 (mean 0.262), polymorphism information content (PIC) from 0.095 to 0.357 (mean 0.246), and 21 SNPs were in Hardy–Weinberg equilibrium. These results suggest that different common carp populations with geographic isolation have significant genetic variation at the SNP level, and these new EST-SNP markers are readily available for genetics and breeding studies in common carp.
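
    The transition-to-transversion ratios quoted above come from classifying each SNP by its base change: purine to purine (A↔G) or pyrimidine to pyrimidine (C↔T) is a transition, and any purine to pyrimidine change is a transversion. The sketch below illustrates that classification. The function names and the example SNP list are hypothetical and are not data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T changes, False for transversions."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    # A transition keeps the base within the same chemical class.
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for a list of (ref, alt) pairs."""
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("ratio is undefined without transversions")
    return transitions / transversions


# Hypothetical SNP calls for illustration only.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "T"), ("C", "G")]))
```

    A ratio well above 1 in exons and close to 1 in introns, as reported here, is the kind of contrast this classification makes visible.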

  4. Whole-Genome Sequencing for National Surveillance of Shigella flexneri

    Directory of Open Access Journals (Sweden)

    Marie A. Chattaway

    2017-09-01

    National surveillance of Shigella flexneri ensures the rapid detection of outbreaks to facilitate public health investigation and intervention strategies. In this study, we used whole-genome sequencing (WGS) to type S. flexneri in order to detect linked cases and support epidemiological investigations. We prospectively analyzed 330 isolates of S. flexneri received at the Gastrointestinal Bacteria Reference Unit at Public Health England between August 2015 and January 2016. Traditional phenotypic and WGS sub-typing methods were compared. PCR was carried out on isolates exhibiting phenotypic/genotypic discrepancies with respect to serotype. Phylogenetic relationships between isolates were analyzed by WGS using single nucleotide polymorphism (SNP) typing to facilitate cluster detection. For 306/330 (93%) isolates there was concordance between serotype derived from the genome and phenotypic serology. Discrepant results between the phenotypic and genotypic tests were attributed to novel O-antigen synthesis/modification gene combinations or indels identified in O-antigen synthesis/modification genes rendering them dysfunctional. SNP typing identified 36 clusters of two isolates or more. WGS provided microbiological evidence of epidemiologically linked clusters and detected novel O-antigen synthesis/modification gene combinations associated with two outbreaks. WGS provided reliable and robust data for monitoring trends in the incidence of different serotypes over time. SNP typing can be used to facilitate outbreak investigations in real time, thereby informing surveillance strategies and providing opportunities for implementing timely public health interventions.

  5. Genomic diversity of Mycobacterium tuberculosis Beijing strains isolated in Tuscany, Italy, based on large sequence deletions, SNPs in putative DNA repair genes and MIRU-VNTR polymorphisms.

    Science.gov (United States)

    Garzelli, Carlo; Lari, Nicoletta; Rindi, Laura

    2016-03-01

    The Beijing genotype of Mycobacterium tuberculosis is a cause of global concern as it is rapidly spreading worldwide, is considered hypervirulent, and is most often associated with massive spread of MDR/XDR TB, although these epidemiological or pathological properties have not been confirmed for all strains and in all geographic settings. In this paper, to gain new insights into the biogeographical heterogeneity of the Beijing family, we investigated a global sample of Beijing strains (22% from Italian-born, 78% from foreign-born patients) by determining large sequence polymorphism of regions RD105, RD181, RD150 and RD142, single nucleotide polymorphism of putative DNA repair genes mutT4 and mutT2 and MIRU-VNTR profiles based on 11 discriminative loci. We found that, although our sample of Beijing strains showed a considerable genomic heterogeneity, yielding both ancient and recent phylogenetic strains, the prevalent successful Beijing subsets were characterized by deletions of RD105 and RD181 and by one nucleotide substitution in one or both mutT genes. MIRU-VNTR analysis revealed 47 unique patterns and 9 clusters including a total of 33 isolates (41% of total isolates); the relatively high proportion of Italian-born Beijing TB patients, often occurring in mixed clusters, supports the possibility of an ongoing cross-transmission of the Beijing genotype to the autochthonous population. High rates of extra-pulmonary localization and drug resistance, particularly MDR, frequently reported for Beijing strains in other settings, were not observed in our survey.

  6. Genetic analysis and molecular characterization of Chinese sesame (Sesamum indicum L.) cultivars using insertion-deletion (InDel) and simple sequence repeat (SSR) markers.

    Science.gov (United States)

    Wu, Kun; Yang, Minmin; Liu, Hongyan; Tao, Ye; Mei, Ju; Zhao, Yingzhong

    2014-03-19

    Sesame is an important and ancient oil crop in tropical and subtropical areas. China is one of the most important sesame producing countries with many germplasm accessions and excellent cultivars. Domestication and modern plant breeding have presumably narrowed the genetic basis of cultivated sesame. Several modern sesame cultivars were bred with a limited number of landrace cultivars in their pedigree. The genetic variation was subsequently reduced by genetic drift and selection. Characterization of genetic diversity of these cultivars by molecular markers is of great value to assist parental line selection and breeding strategy design. Three hundred and forty-nine simple sequence repeat (SSR) and 79 insertion-deletion (InDel) markers were developed from a cDNA library and reduced-representation sequencing of the sesame cultivar Zhongzhi 14, respectively. Combined with previously published SSR markers, 88 polymorphic markers were used to assess the genetic diversity, phylogenetic relationships, population structure, and allele distribution among 130 Chinese sesame accessions including 82 cultivars, 44 landraces and 4 wild germplasm accessions. A total of 325 alleles were detected, with an average gene diversity of 0.432. Model-based structure analysis revealed the presence of five subgroups belonging to two main groups, which were consistent with the results from principal coordinate analysis (PCA), phylogenetic clustering and analysis of molecular variance (AMOVA). Several missing or unique alleles were identified from particular types, subgroups or families, even though they share one or both parental/progenitor lines. This report presents by far the most comprehensive characterization of the molecular and genetic diversity of sesame cultivars in China. InDels are more polymorphic than SSRs, but their ability for deciphering genetic diversity compared to the latter. Improved sesame cultivars have narrower genetic basis than landraces, reflecting the effect of genetic

  7. Insertion/deletion polymorphisms in the ΔNp63 promoter are a risk factor for bladder exstrophy epispadias complex.

    Directory of Open Access Journals (Sweden)

    Simon Wilkins

    Bladder exstrophy epispadias complex (BEEC) is a severe congenital anomaly; however, the genetic and molecular mechanisms underlying the formation of BEEC remain unclear. TP63, a member of the TP53 tumor suppressor gene family, is expressed in bladder urothelium and skin over the external genitalia during mammalian development. It plays a role in bladder development. We have previously shown that p63(-/-) mouse embryos developed a bladder exstrophy phenotype identical to human BEEC. We hypothesised that TP63 is involved in human BEEC pathogenesis. RNA was extracted from BEEC foreskin specimens and, as in mice, ΔNp63 was the predominant p63 isoform. ΔNp63 expression in the foreskin and bladder epithelium of BEEC patients was reduced. DNA was sequenced from 163 BEEC patients and 285 ethnicity-matched controls. No exon mutations were detected. Sequencing of the ΔNp63 promoter showed 7 single nucleotide polymorphisms and 4 insertion/deletion (indel) polymorphisms. Indel polymorphisms were associated with an increased risk of BEEC. Significantly, the sites of indel polymorphisms differed between Caucasian and non-Caucasian populations. A 12-base-pair deletion was associated with an increased risk only in Caucasian patients (p = 0.0052; odds ratio (OR) = 18.33), whereas a 4-base-pair insertion was associated only with non-Caucasian patients (p = 0.0259; OR = 4.583). We found a consistent and statistically significant reduction in transcriptional efficiencies of the promoter sequences containing indel polymorphisms in luciferase assays. These findings suggest that indel polymorphisms of the ΔNp63 promoter lead to a reduction in p63 expression, which could lead to BEEC.

  8. Role of ACE and AGT gene polymorphisms in genetic susceptibility to diabetes mellitus type 2 in a Brazilian sample.

    Science.gov (United States)

    Wollinger, L M; Dal Bosco, S M; Rempe, C; Almeida, S E M; Berlese, D B; Castoldi, R P; Arndt, M E; Contini, V; Genro, J P

    2015-12-29

    The aim of the current study was to investigate the association between the InDel polymorphism in the angiotensin I-converting enzyme gene (ACE) and the rs699 polymorphism in the angiotensinogen gene (AGT) and diabetes mellitus type 2 (DM2) in a sample population from Southern Brazil. A case-control study was conducted with 228 patients with DM2 and 183 controls without DM2. The ACE InDel polymorphism was genotyped by polymerase chain reaction (PCR) with specific primers, followed by electrophoresis on 1.5% agarose gel. The AGT rs699 polymorphism was genotyped using a real-time PCR assay. No significant association between the ACE InDel polymorphism and DM2 was detected (P = 0.97). However, regarding the AGT rs699 polymorphism, DM2 patients had a significantly higher frequency of the AG genotype and lower frequency of the GG genotype when compared to the controls (P = 0.03). Our results suggest that there is an association between the AGT rs699 polymorphism and DM2 in a Brazilian sample.

  9. Identification of Nucleotide-Level Changes Impacting Gene Content and Genome Evolution in Orthopoxviruses

    Science.gov (United States)

    Hatcher, Eneida L.; Hendrickson, Robert Curtis

    2014-01-01

    ABSTRACT Poxviruses are composed of large double-stranded DNA (dsDNA) genomes coding for several hundred genes whose variation has supported virus adaptation to a wide variety of hosts over their long evolutionary history. Comparative genomics has suggested that the Orthopoxvirus genus in particular has undergone reductive evolution, with the most recent common ancestor likely possessing a gene complement consisting of all genes present in any existing modern-day orthopoxvirus species, similar to the current Cowpox virus species. As orthopoxviruses adapt to new environments, the selection pressure on individual genes may be altered, driving sequence divergence and possible loss of function. This is evidenced by accumulation of mutations and loss of protein-coding open reading frames (ORFs) that progress from individual missense mutations to gene truncation through the introduction of early stop mutations (ESMs), gene fragmentation, and in some cases, a total loss of the ORF. In this study, we have constructed a whole-genome alignment for representative isolates from each Orthopoxvirus species and used it to identify the nucleotide-level changes that have led to gene content variation. By identifying the changes that have led to ESMs, we were able to determine that short indels were the major cause of gene truncations and that the genome length is inversely proportional to the number of ESMs present. We also identified the number and types of protein functional motifs still present in truncated genes to assess their functional significance. IMPORTANCE This work contributes to our understanding of reductive evolution in poxviruses by identifying genomic remnants such as single nucleotide polymorphisms (SNPs) and indels left behind by evolutionary processes. Our comprehensive analysis of the genomic changes leading to gene truncation and fragmentation was able to detect some of the remnants of these evolutionary processes still present in orthopoxvirus genomes and

  10. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology

    Science.gov (United States)

    Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Pierzchała, Mariusz; Feng, Yaping; Kadarmideen, Haja N.; Kumar, Dibyendu

    2017-01-01

    Background: RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. Results: The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNP and insertion and deletion (indel) positions expressed in the bovine liver, with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and a 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by Kompetitive allele-specific PCR (KASP™) SNP genotyping assay
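
As a rough consistency check on the counts quoted in this abstract, the sketch below re-adds the per-breed figures. The digit strings are read as plain integers, and two points are assumptions rather than statements from the paper: that the per-breed raw SNP counts can simply be summed, and that the gap to the combined SNP-plus-indel total corresponds to the removed indels. The variable names are illustrative only.

```python
# Paired-end reads per breed, in millions, as quoted in the abstract.
reads_million = {"Polish Red": 345.06, "Polish HF": 290.04, "Hereford": 436.03}
total_reads_million = round(sum(reads_million.values()), 2)

# Raw SNP counts per breed after indel removal, read as plain integers.
raw_snps = {"Polish Red": 1953804, "Polish HF": 1527120, "Hereford": 2053184}
total_raw_snps = sum(raw_snps.values())

# Combined SNP and indel positions, and the quoted per-animal average.
variant_positions = 5641401
average_per_bull = 313411

# Assumption: the difference between the combined total and the summed raw SNPs
# approximates the number of indel positions that were removed.
implied_indels = variant_positions - total_raw_snps

# Assumption: dividing the total by the per-animal average recovers the number
# of animals sequenced; the abstract itself does not state this number.
implied_animals = round(variant_positions / average_per_bull)
```

The per-breed read totals add up to about 1,071 million, which matches the overall raw read count when its digits are grouped as 1,071,144,072.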

  11. Sampling strategy and potential utility of indels for DNA barcoding of closely related plant species: a case study in Taxus.

    Science.gov (United States)

    Liu, Jie; Provan, Jim; Gao, Lian-Ming; Li, De-Zhu

    2012-01-01

    Although DNA barcoding has become a useful tool for species identification and biodiversity surveys in plant sciences, there remains little consensus concerning appropriate sampling strategies and the treatment of indels. To address these two issues, we sampled 39 populations for nine Taxus species across their entire ranges, with two to three individuals per population randomly sampled. We sequenced one core DNA barcode (matK) and three supplementary regions (trnH-psbA, trnL-trnF and ITS) for all samples to test the effects of sampling design and the utility of indels. Our results suggested that increasing sampling within populations did not change the clustering of individuals, and that mean within-population P-distances were zero for most populations in all regions. Based on the markers tested here, comparison of methods either including or excluding indels indicated that discrimination and nodal support of monophyletic groups were significantly increased when indels were included. Thus we concluded that one individual per population was adequate to represent the within-population variation in these species for DNA barcoding, and that intra-specific sampling was best focused on representing the entire ranges of certain taxa. We also found that indels occurring in the chloroplast trnL-trnF and trnH-psbA regions were informative for differentiating among closely related taxa, and we proposed that indel-coding methods should be considered for use in future DNA barcoding projects on closely related plant species at or below the generic level.

  12. The Role of Genetic Polymorphisms as Related to One-Carbon Metabolism, Vitamin B6, and Gene–Nutrient Interactions in Maintaining Genomic Stability and Cell Viability in Chinese Breast Cancer Patients

    Directory of Open Access Journals (Sweden)

    Xiayu Wu

    2016-06-01

    Folate-mediated one-carbon metabolism (FMOCM) is linked to DNA synthesis, methylation, and cell proliferation. Vitamin B6 (B6) is a cofactor, and genetic polymorphisms of related key enzymes, such as serine hydroxymethyltransferase (SHMT), methionine synthase reductase (MTRR), and methionine synthase (MS), in FMOCM may govern the bioavailability of metabolites and play important roles in the maintenance of genomic stability and cell viability (GSACV). To evaluate the influences of B6, genetic polymorphisms of these enzymes, and gene–nutrient interactions on GSACV, we utilized the cytokinesis-block micronucleus assay (CBMN) and PCR-restriction fragment length polymorphism (PCR-RFLP) techniques in the lymphocytes from female breast cancer cases and controls. GSACV showed a significantly positive correlation with B6 concentration, and 48 nmol/L of B6 was the most suitable concentration for maintaining GSACV in vitro. The GSACV indexes showed significantly different sensitivity to B6 deficiency between cases and controls; the B6 effect on the GSACV variance contribution of each index was significantly higher than that of genetic polymorphisms and the sample state (tumor state). SHMT C1420T mutations may reduce breast cancer susceptibility, whereas MTRR A66G and MS A2756G mutations may increase breast cancer susceptibility. The role of SHMT, MS, and MTRR genotype polymorphisms in GSACV is reduced compared with that of B6. The results appear to suggest that the long-term lack of B6 under these conditions may increase genetic damage and cell injury and that individuals with various genotypes have different sensitivities to B6 deficiency. FMOCM metabolic enzyme gene polymorphism may be related to breast cancer susceptibility to a certain extent due to the effect of other factors such as stress, hormones, cancer therapies, psychological conditions, and diet. Adequate B6 intake may be good for maintaining genome health and preventing breast cancer.

  14. Genome-Wide Single Nucleotide Polymorphism Discovery and the Construction of a High-Density Genetic Map for Melon (Cucumis melo L.) Using Genotyping-by-Sequencing

    Science.gov (United States)

    Chang, Che-Wei; Wang, Yu-Hua; Tung, Chih-Wei

    2017-01-01

    Although genotyping-by-sequencing (GBS) enables the efficient and low-cost generation of large numbers of markers, the utility of the resultant genotypes is limited because they are highly error-prone and contain high proportions of missing data. In this study, we generated single nucleotide polymorphism (SNP) markers for 109 recombinant inbred lines of melon (Cucumis melo L.) using the GBS approach and ordered them according to their physical position on the draft double haploid line DHL92 genome. Next, by investigating associations between these SNPs, we discovered that some segments on the physical map conflict with linkage relationships. Therefore, to filter out error-prone loci, 4,110 SNPs in which we have a high degree of confidence were selected as anchors to test independence with respect to unselected markers, and the resultant dataset was then analyzed using the Full-Sib Family Haplotype (FSFHap) algorithm in the software TASSEL 5.2. On the basis of this analysis, 22,933 loci that have an average rate of missing data of 0.281% were used to construct a genetic map, which spans 1,088.3 cM across 12 chromosomes and has a maximum spacing of 6.0 cM. Use of this high-quality linkage map enabled the identification of several quantitative trait loci (QTL) known to control traits in fruit and validated our approach. This study highlights the utility of GBS markers for the identification of trait-associated QTLs in melon and facilitates further investigation of genome structure. PMID:28220139

  15. Genome-wide single nucleotide polymorphism-based assay for high-resolution epidemiological analysis of the methicillin-resistant Staphylococcus aureus hospital clone EMRSA-15.

    Science.gov (United States)

    Holmes, A; McAllister, G; McAdam, P R; Hsien Choi, S; Girvan, K; Robb, A; Edwards, G; Templeton, K; Fitzgerald, J R

    2014-02-01

    The EMRSA-15 clone is a major cause of nosocomial methicillin-resistant Staphylococcus aureus (MRSA) infections in the UK and elsewhere, but existing typing methodologies have limited capacity to discriminate closely related strains and are often poorly reproducible between laboratories. Here, we report the design, development and validation of a genome-wide single nucleotide polymorphism (SNP) typing method and compare it to established methods for typing of EMRSA-15. In order to identify discriminatory SNPs, the genomes of 17 EMRSA-15 strains, selected to represent the breadth of genotypic and phenotypic diversity of EMRSA-15 isolates in Scotland, were determined and phylogenetic reconstruction was carried out. In addition to 17 phylogenetically informative SNPs, five binary markers were included to form the basis of an EMRSA-15 genotyping assay. The SNP-based typing assay was as discriminatory as pulsed-field gel electrophoresis, and significantly more discriminatory than staphylococcal protein A (spa) typing for typing of a representative panel of diverse EMRSA-15 strains, isolates from two EMRSA-15 hospital outbreak investigations, and a panel of bacteraemia isolates obtained in healthcare facilities in the east of Scotland during a 12-month period. The assay is a rapid and reproducible approach for epidemiological analysis of EMRSA-15 clinical isolates in Scotland. Unlike established methods, the DNA sequence-based method is ideally suited for inter-laboratory comparison of identified genotypes, and its flexibility lends itself to supplementation with additional SNPs or markers for the identification of novel S. aureus strains in other regions of the world.

  16. Phylogeny and molecular signatures (conserved proteins and indels) that are specific for the Bacteroidetes and Chlorobi species

    Directory of Open Access Journals (Sweden)

    Lorenzini Emily

    2007-05-01

    Background: The Bacteroidetes and Chlorobi species constitute two main groups of the Bacteria that are closely related in phylogenetic trees. The Bacteroidetes species are widely distributed and include many important periodontal pathogens. In contrast, all Chlorobi are anoxygenic obligate photoautotrophs. Very few (or no) biochemical or molecular characteristics are known that are distinctive characteristics of these bacteria, or are commonly shared by them. Results: Systematic blast searches were performed on each open reading frame in the genomes of Porphyromonas gingivalis W83, Bacteroides fragilis YCH46, B. thetaiotaomicron VPI-5482, Gramella forsetii KT0803, Chlorobium luteolum (formerly Pelodictyon luteolum) DSM 273 and Chlorobaculum tepidum (formerly Chlorobium tepidum) TLS to search for proteins that are uniquely present in either all or certain subgroups of Bacteroidetes and Chlorobi. These studies have identified > 600 proteins for which homologues are not found in other organisms. This includes 27 and 51 proteins that are specific for most of the sequenced Bacteroidetes and Chlorobi genomes, respectively; 52 and 38 proteins that are limited to species from the Bacteroidales and Flavobacteriales orders, respectively, and 5 proteins that are common to species from these two orders; 185 proteins that are specific for the Bacteroides genus. Additionally, 6 proteins that are uniquely shared by species from the Bacteroidetes and Chlorobi phyla (one of them also present in the Fibrobacteres) have also been identified. This work also describes two large conserved inserts in DNA polymerase III (DnaE) and alanyl-tRNA synthetase that are distinctive characteristics of the Chlorobi species and a 3 aa deletion in ClpB chaperone that is mainly found in various Bacteroidales, Flavobacteriales and Flexebacteraceae, but generally not found in the homologs from other organisms. Phylogenetic analyses of the Bacteroidetes and Chlorobi species is also

  17. PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

    Science.gov (United States)

    Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

    2016-10-06

    With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. 
PGen workflow has been optimized for the most

  18. Global genomic diversity of human papillomavirus 6 based on 724 isolates and 190 complete genome sequences.

    Science.gov (United States)

    Jelen, Mateja M; Chen, Zigui; Kocjan, Boštjan J; Burt, Felicity J; Chan, Paul K S; Chouhy, Diego; Combrinck, Catharina E; Coutlée, François; Estrade, Christine; Ferenczy, Alex; Fiander, Alison; Franco, Eduardo L; Garland, Suzanne M; Giri, Adriana A; González, Joaquín Víctor; Gröning, Arndt; Heidrich, Kerstin; Hibbitts, Sam; Hošnjak, Lea; Luk, Tommy N M; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Richardson, Harriet; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y; Seme, Katja; Severini, Alberto; Sinchi, Jessica L; Smahelova, Jana; Tabrizi, Sepehr N; Tachezy, Ruth; Tohme, Sarah; Uloza, Virgilijus; Vitkauskiene, Astra; Wong, Yong Wee; Zidovec Lepej, Snježana; Burk, Robert D; Poljak, Mario

    2014-07-01

    Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV6 variant lineages

  19. Footprint of positive selection in Treponema pallidum subsp. pallidum genome sequences suggests adaptive microevolution of the syphilis pathogen.

    Science.gov (United States)

    Giacani, Lorenzo; Chattopadhyay, Sujay; Centurion-Lara, Arturo; Jeffrey, Brendan M; Le, Hoavan T; Molini, Barbara J; Lukehart, Sheila A; Sokurenko, Evgeni V; Rockey, Daniel D

    2012-01-01

    In the rabbit model of syphilis, infection phenotypes associated with the Nichols and Chicago strains of Treponema pallidum (T. pallidum), though similar, are not identical. Between these strains, significant differences are found in expression of, and antibody responses to, some candidate virulence factors, suggesting the existence of functional genetic differences between isolates. The Chicago strain genome was therefore sequenced and compared to the Nichols genome, available since 1998. Initial comparative analysis suggested the presence of 44 single nucleotide polymorphisms (SNPs), 103 small (≤3 nucleotides) indels, and 1 large (1204 bp) insertion in the Chicago genome with respect to the Nichols genome. To confirm the above findings, Sanger sequencing was performed on most loci carrying differences using DNA from Chicago and the Nichols strain used in the original T. pallidum genome project. A majority of the previously identified differences were found to be due to errors in the published Nichols genome, while the accuracy of the Chicago genome was confirmed. However, 20 SNPs were confirmed between the two genomes, and 16 (80.0%) were found in coding regions, with all being of non-synonymous nature, strongly indicating action of positive selection. Sequencing of 16 genomic loci harboring SNPs in 12 additional T. pallidum strains (SS14, Bal 3, Bal 7, Bal 9, Sea 81-3, Sea 81-8, Sea 86-1, Sea 87-1, Mexico A, UW231B, UW236B, and UW249C) was used to identify "Chicago-" or "Nichols-specific" differences. All but one of the 16 SNPs were "Nichols-specific", with Chicago having identical sequences at these positions to almost all of the additional strains examined. These mutations could reflect differential adaptation of the Nichols strain to the rabbit host or pathoadaptive mutations acquired during human infection. Our findings indicate that SNPs among T. pallidum strains emerge under positive selection and, therefore, are likely to be functional in nature.

  20. A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array

    Directory of Open Access Journals (Sweden)

    Bailey Dione K

    2007-05-01

    Background: DNA copy number aberration (CNA) is one of the key characteristics of cancer cells. Recent studies demonstrated the feasibility of utilizing high-density single nucleotide polymorphism (SNP) genotyping arrays to detect CNA. Compared with two-color array-based comparative genomic hybridization (array-CGH), SNP arrays offer much higher probe density and lower signal-to-noise ratio at the single SNP level. To accurately identify small segments of CNA from SNP array data, segmentation methods that are sensitive to CNA while resistant to noise are required. Results: We have developed a highly sensitive algorithm for edge detection in copy number data that is especially suitable for SNP array-based copy number data. The method consists of an over-sensitive edge-detection step and a test-based forward-backward edge selection step. Conclusion: Using simulations constructed from real experimental data, the method shows high sensitivity and specificity in detecting small copy number changes in focused regions. The method is implemented in an R package FASeg, which includes data processing and visualization utilities, as well as libraries for processing Affymetrix SNP array data.
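    The two-step idea in this abstract (flag many candidate edges, then keep only those that pass a test) can be illustrated with a minimal Python sketch. This is not the FASeg implementation: it prunes backward only, assumes a known noise level `sigma`, and uses a simple z-like statistic; the function names and the thresholds `min_jump` and `z_min` are hypothetical.

```python
from math import sqrt
from statistics import mean


def candidate_edges(signal, min_jump):
    """Over-sensitive step: flag every index i where the signal jumps
    by more than min_jump relative to index i - 1."""
    return [i for i in range(1, len(signal))
            if abs(signal[i] - signal[i - 1]) > min_jump]


def edge_score(signal, left, edge, right, sigma):
    """z-like statistic comparing the mean of signal[left:edge]
    with the mean of signal[edge:right], given noise level sigma."""
    seg_l, seg_r = signal[left:edge], signal[edge:right]
    se = sigma * sqrt(1 / len(seg_l) + 1 / len(seg_r))
    return abs(mean(seg_l) - mean(seg_r)) / se


def select_edges(signal, edges, sigma, z_min):
    """Test-based pruning: repeatedly drop the weakest edge until
    every remaining edge scores at least z_min."""
    edges = sorted(edges)
    while edges:
        bounds = [0] + edges + [len(signal)]
        scores = [edge_score(signal, bounds[k], bounds[k + 1], bounds[k + 2], sigma)
                  for k in range(len(edges))]
        weakest = min(range(len(edges)), key=scores.__getitem__)
        if scores[weakest] >= z_min:
            break
        del edges[weakest]
    return edges


# Toy log-ratio profile with a single copy number change at index 6.
profile = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
rough = candidate_edges(profile, min_jump=0.12)
kept = select_edges(profile, rough, sigma=0.1, z_min=3.0)
```

    The first step deliberately over-calls, so small jumps caused by noise are also flagged; the pruning step then merges segments whose means are not clearly different.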

  1. Genomic profiling of thousands of candidate polymorphisms predicts risk of relapse in 778 Danish and German childhood acute lymphoblastic leukemia patients.

    Science.gov (United States)

    Wesołowska-Andersen, A; Borst, L; Dalgaard, M D; Yadav, R; Rasmussen, K K; Wehner, P S; Rasmussen, M; Ørntoft, T F; Nordentoft, I; Koehler, R; Bartram, C R; Schrappe, M; Sicheritz-Ponten, T; Gautier, L; Marquart, H; Madsen, H O; Brunak, S; Stanulla, M; Gupta, R; Schmiegelow, K

    2015-02-01

    Childhood acute lymphoblastic leukemia survival approaches 90%. New strategies are needed to identify the 10-15% who evade cure. We applied targeted, sequencing-based genotyping of 25 000 to 34 000 preselected potentially clinically relevant single-nucleotide polymorphisms (SNPs) to identify host genome profiles associated with relapse risk in 352 patients from the Nordic ALL92/2000 protocols and 426 patients from the German Berlin-Frankfurt-Munster (BFM) ALL2000 protocol. Patients were enrolled between 1992 and 2008 (median follow-up: 7.6 years). Eleven cross-validated SNPs were significantly associated with risk of relapse across protocols. SNP and biologic pathway level analyses associated relapse risk with leukemia aggressiveness, glucocorticosteroid pharmacology/response and drug transport/metabolism pathways. Classification and regression tree analysis identified three distinct risk groups defined by end of induction residual leukemia, white blood cell count and variants in myeloperoxidase (MPO), estrogen receptor 1 (ESR1), lamin B1 (LMNB1) and matrix metalloproteinase-7 (MMP7) genes, ATP-binding cassette transporters and glucocorticosteroid transcription regulation pathways. Relapse rates ranged from 4% (95% confidence interval (CI): 1.6-6.3%) for the best group (72% of patients) to 76% (95% CI: 41-90%) for the worst group (5% of patients, Prisk-based treatments adaptation.

  2. Base composition, selection, and phylogenetic significance of indels in the recombination activating gene-1 in vertebrates

    Directory of Open Access Journals (Sweden)

    Vences Miguel

    2009-12-01

    Background: The Recombination Activating Proteins, RAG1 and RAG2, play a crucial role in the immune response in vertebrates. Among the nuclear markers currently used for phylogenetic purposes, Rag1 has especially enjoyed enormous popularity, since it successfully contributed to elucidating the relationships among and within a large variety of vertebrate lineages. We here report on a comparative investigation of the genetic variation, base composition, presence of indels, and selection in Rag1 in different vertebrate lineages (Actinopterygii, Amphibia, Aves, Chondrichthyes, Crocodylia, Lepidosauria, Mammalia, and Testudines) through the analysis of 582 sequences obtained from GenBank. We also analyze possible differences between distinct parts of the gene with different types of protein function. Results: In the vertebrate lineages studied, Rag1 is over 3 kb long. We observed a high level of heterogeneity in base composition at the 3rd codon position in some of the studied vertebrate lineages and in some specific taxa. This result is also paralleled by taxonomic differences in the GC content at the same codon position. Moreover, positive selection occurs at some sites in Aves, Lepidosauria and Testudines. Indels, which are often used as phylogenetic characters, are more informative across vertebrates in the 5' than in the 3'-end of the gene. When the entire gene is considered, the use of indels as phylogenetic characters only recovers one major vertebrate clade, the Actinopterygii. However, in numerous cases insertions or deletions are specific to a monophyletic group. Conclusions: Rag1 is a phylogenetic marker of undoubted quality. Our study points to the need to carry out a preliminary investigation of the base composition and the possible existence of sites under selection in this gene within the groups studied, to avoid misleading resolution. The gene shows highly heterogeneous base composition, which affects some taxa in particular and

  3. Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster.

    Science.gov (United States)

    Schrider, Daniel R; Houle, David; Lynch, Michael; Hahn, Matthew W

    2013-08-01

    Because spontaneous mutation is the source of all genetic diversity, measuring mutation rates can reveal how natural selection drives patterns of variation within and between species. We sequenced eight genomes produced by a mutation-accumulation experiment in Drosophila melanogaster. Our analysis reveals that point mutation and small indel rates vary significantly between the two different genetic backgrounds examined. We also find evidence that ∼2% of mutational events affect multiple closely spaced nucleotides. Unlike previous similar experiments, we were able to estimate genome-wide rates of large deletions and tandem duplications. These results suggest that, at least in inbred lines like those examined here, mutational pressures may result in net growth rather than contraction of the Drosophila genome. By comparing our mutation rate estimates to polymorphism data, we are able to estimate the fraction of new mutations that are eliminated by purifying selection. These results suggest that ∼99% of duplications and deletions are deleterious--making them 10 times more likely to be removed by selection than nonsynonymous mutations. Our results illuminate not only the rates of new small- and large-scale mutations, but also the selective forces that they encounter once they arise.

  4. A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes.

    Science.gov (United States)

    Wheeler, Nicole E; Barquist, Lars; Kingsley, Robert A; Gardner, Paul P

    2016-12-01

    Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics. We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms. A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS. CONTACT: nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. Genomic single-nucleotide polymorphisms confirm that Gunnison and Greater sage-grouse are genetically well differentiated and that the Bi-State population is distinct

    Science.gov (United States)

    Oyler-McCance, Sara J.; Cornman, Robert S.; Jones, Kenneth L.; Fike, Jennifer

    2015-01-01

    Sage-grouse are iconic, declining inhabitants of sagebrush habitats in western North America, and their management depends on an understanding of genetic variation across the landscape. Two distinct species of sage-grouse have been recognized, Greater (Centrocercus urophasianus) and Gunnison sage-grouse (C. minimus), based on morphology, behavior, and variation at neutral genetic markers. A parapatric group of Greater Sage-Grouse along the border of California and Nevada ("Bi-State") is also genetically distinct at the same neutral genetic markers, yet not different in behavior or morphology. Because delineating taxonomic boundaries and defining conservation units is often difficult in recently diverged taxa and can be further complicated by highly skewed mating systems, we took advantage of new genomic methods that improve our ability to characterize genetic variation at a much finer resolution. We identified thousands of single-nucleotide polymorphisms (SNPs) among Gunnison, Greater, and Bi-State sage-grouse and used them to comprehensively examine levels of genetic diversity and differentiation among these groups. The pairwise multilocus fixation index (FST) was high (0.49) between Gunnison and Greater sage-grouse, and both principal coordinates analysis and model-based clustering grouped samples unequivocally by species. Standing genetic variation was lower within the Gunnison Sage-Grouse. The Bi-State population was also significantly differentiated from Greater Sage-Grouse, albeit more weakly (FST = 0.09), and genetic clustering results were consistent with reduced gene flow with Greater Sage-Grouse. No comparable genetic divisions were found within the Greater Sage-Grouse sample, which spanned the southern half of the range. Thus, we provide much stronger genetic evidence supporting the recognition of Gunnison Sage-Grouse as a distinct species with low genetic diversity. Further, our work confirms that the Bi-State population is differentiated from other
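    The abstract reports multilocus FST values (0.49 and 0.09) without stating which estimator was used. As an illustration only, the following Python sketch computes a heterozygosity-based FST = (HT - HS) / HT for two populations from biallelic allele frequencies, pooling loci as a ratio of sums. It assumes equal population weights and applies no sample-size correction; the function name and the input frequencies are hypothetical.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Multilocus FST for two populations from per-locus frequencies of
    one allele at biallelic loci, computed as sum(HT - HS) / sum(HT)."""
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need two non-empty frequency lists of equal length")
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2                      # pooled allele frequency
        h_total = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2  # mean within-population
        numerator += h_total - h_within
        denominator += h_total
    # Loci fixed for the same allele in both populations carry no information.
    return numerator / denominator if denominator > 0 else 0.0


# Made-up frequencies: identical populations give 0, fixed differences give 1.
same = fst_two_populations([0.3, 0.6], [0.3, 0.6])
fixed = fst_two_populations([1.0, 0.0], [0.0, 1.0])
partial = fst_two_populations([0.8], [0.2])
```

    Values near 0 indicate little differentiation and values near 1 indicate nearly fixed differences, which is the scale on which the reported 0.49 and 0.09 are interpreted.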

  6. Genome-wide association study identifies single nucleotide polymorphism in DYRK1A associated with replication of HIV-1 in monocyte-derived macrophages.

    Directory of Open Access Journals (Sweden)

    Sebastiaan M Bol

    BACKGROUND: HIV-1 infected macrophages play an important role in rendering resting T cells permissive for infection, in spreading HIV-1 to T cells, and in the pathogenesis of AIDS dementia. During highly active anti-retroviral treatment (HAART), macrophages keep producing virus because tissue penetration of antiretrovirals is suboptimal and the efficacy of some is reduced. Thus, to cure HIV-1 infection with antiretrovirals we will also need to efficiently inhibit viral replication in macrophages. The majority of the current drugs block the action of viral enzymes, whereas there is an abundance of yet unidentified host factors that could be targeted. We here present results from a genome-wide association study identifying novel genetic polymorphisms that affect in vitro HIV-1 replication in macrophages. METHODOLOGY/PRINCIPAL FINDINGS: Monocyte-derived macrophages from 393 blood donors were infected with HIV-1 and viral replication was determined using Gag p24 antigen levels. Genomic DNA from individuals with macrophages that had relatively low (n = 96) or high (n = 96) p24 production was used for SNP genotyping with the Illumina 610 Quad beadchip. A total of 494,656 SNPs that passed quality control were tested for association with HIV-1 replication in macrophages, using linear regression. We found a strong association between in vitro HIV-1 replication in monocyte-derived macrophages and SNP rs12483205 in DYRK1A (p = 2.16 × 10^-5). While the association was not genome-wide significant (p < 1 × 10^-7), we could replicate this association using monocyte-derived macrophages from an independent group of 31 individuals (p = 0.0034). Combined analysis of the initial and replication cohort increased the strength of the association (p = 4.84 × 10^-6). In addition, we found this SNP to be associated with HIV-1 disease progression in vivo in two independent cohort studies (p = 0.035 and p = 0.0048). CONCLUSIONS/SIGNIFICANCE: These findings suggest that the kinase

  7. The prion protein gene polymorphisms associated with bovine spongiform encephalopathy susceptibility differ significantly between cattle and buffalo.

    Science.gov (United States)

    Zhao, Hui; Du, Yanli; Chen, Shunmei; Qing, Lili; Wang, Xiaoyan; Huang, Jingfei; Wu, Dongdong; Zhang, Yaping

    2015-12-01

    Prion protein, encoded by the prion protein gene (PRNP), plays a crucial role in the pathogenesis of transmissible spongiform encephalopathies (TSEs). Several polymorphisms within PRNP are known to be associated with bovine spongiform encephalopathy (BSE) susceptibility in cattle, namely two insertion/deletion (indel) polymorphisms (a 23-bp indel in the putative promoter and a 12-bp indel in intron 1), the number of octapeptide repeats (octarepeats) present in the coding sequence (CDS) and amino acid polymorphisms. The domestic buffalo, Bubalus bubalis, is a ruminant involved in various aspects of agriculture. It is of interest to ask whether the PRNP polymorphisms differ between cattle and buffalo. In this study, we analyzed the previously reported polymorphisms associated with BSE susceptibility in Chinese buffalo breeds, and compared these polymorphisms in cattle with BSE, healthy cattle and buffalo by pooling data from the literature. Our analysis revealed three significant findings in buffalo: 1) extraordinarily low deletion allele frequencies of the 23- and 12-bp indel polymorphisms; 2) significantly low allelic frequencies of six octarepeats in CDS and 3) the presence of S4R, A16V, P54S, G108S, V123M, S154N and F257L substitutions in buffalo CDSs. Sequence alignments comparing the buffalo coding sequence to other species were analyzed using the McDonald-Kreitman test to reveal five groups (Bison bonasus, Bos indicus, Bos gaurus, Boselaphus tragocamelus, Syncerus caffer caffer) with significantly divergent non-synonymous substitutions from buffalo, suggesting potential divergence of buffalo PRNP and others. To the best of our knowledge this is the first study of PRNP polymorphisms associated with BSE susceptibility in Chinese buffalo. Our findings have provided evidence that buffaloes have a unique genetic background in the PRNP gene in comparison with cattle.

  8. Introgression browser: high-throughput whole-genome SNP visualization.

    Science.gov (United States)

    Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A

    2015-04-01

    Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool.
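    The filtering step described in this abstract (keep homozygous SNPs from VCF data; discard heterozygous SNPs, MNPs and InDels) can be sketched in plain Python. This is not iBrowser code: it reads only the first sample column, treats "homozygous SNP" as a diploid homozygous-alternate call at a biallelic single-base site, and the function names are hypothetical.

```python
def is_homozygous_snp(vcf_line):
    """Return True if a VCF data line is a biallelic single-base substitution
    whose first sample is homozygous for the alternate allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:          # need the 8 fixed columns, FORMAT and one sample
        return False
    ref, alt, fmt, sample = fields[3], fields[4], fields[8], fields[9]
    # InDels and MNPs have REF or ALT longer than one base; multi-allelic
    # sites have a comma-separated ALT, which is also longer than one character.
    if len(ref) != 1 or len(alt) != 1 or alt not in "ACGT":
        return False
    keys = fmt.split(":")
    if "GT" not in keys:
        return False
    genotype = sample.split(":")[keys.index("GT")]
    alleles = genotype.replace("|", "/").split("/")
    return len(alleles) == 2 and alleles[0] == alleles[1] == "1"


def filter_homozygous_snps(lines):
    """Keep only data lines (not '#' headers) that pass is_homozygous_snp."""
    return [ln for ln in lines if not ln.startswith("#") and is_homozygous_snp(ln)]


# Made-up records for illustration.
records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1",        # homozygous SNP: kept
    "chr1\t200\t.\tA\tG\t50\tPASS\t.\tGT\t0/1",        # heterozygous SNP: dropped
    "chr1\t300\t.\tAT\tA\t50\tPASS\t.\tGT\t1/1",       # deletion: dropped
    "chr1\t400\t.\tAC\tGT\t50\tPASS\t.\tGT:DP\t1/1:12",  # MNP: dropped
    "chr1\t500\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1|1:9",   # phased homozygous SNP: kept
]
kept_positions = [ln.split("\t")[1] for ln in filter_homozygous_snps(records)]
```

    A production workflow would normally rely on a dedicated VCF parser; the string handling here only makes the selection criteria explicit.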

  9. Association of Novel Nonsynonymous Single Nucleotide Polymorphisms in ampD with Cephalosporin Resistance and Phylogenetic Variations in ampC, ampR, ompF, and ompC in Enterobacter cloacae Isolates That Are Highly Resistant to Carbapenems

    Science.gov (United States)

    Ellington, Matthew J.; Hopkins, Katie L.; Turton, Jane F.; Doumith, Michel; Loy, Richard; Staves, Peter; Hinic, Vladimira; Frei, Reno; Woodford, Neil

    2016-01-01

    In Enterobacter cloacae, the genetic lesions associated with derepression of the AmpC β-lactamase include diverse single nucleotide polymorphisms (SNPs) and/or indels in the ampD and ampR genes and SNPs in ampC, while diverse SNPs in the promoter region or SNPs/indels within the coding sequence of outer membrane proteins have been described to alter porin production leading to carbapenem resistance. We sought to define the underlying mechanisms conferring cephalosporin and carbapenem resistance in a collection of E. cloacae isolates with unusually high carbapenem resistance and no known carbapenemase and, in contrast to many previous studies, considered the SNPs we detected in relation to the multilocus sequence type (MLST)-based phylogeny of our collection. Whole-genome sequencing was applied on the most resistant isolates to seek novel carbapenemases, expression of ampC was measured by reverse transcriptase PCR, and porin translation was detected by SDS-PAGE. SNPs occurring in ampC, ampR, ompF, and ompC genes (and their promoter regions) were mostly phylogenetic variations, relating to the isolates' sequence types, whereas nonsynonymous SNPs in ampD were associated with derepression of AmpC and cephalosporin resistance. The additional loss of porins resulted in high-level carbapenem resistance, underlining the clinical importance of chromosomal mutations among carbapenem-resistant E. cloacae. PMID:26856839

  10. RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP) ANALYSIS OF GENOMIC DNA OF 5 STRAINS OF TRICHINELLA SPIRALIS IN CHINA

    Institute of Scientific and Technical Information of China (English)

    王虹; 张月清; 劳为德; 吴赵永

    1995-01-01

    Five restriction endonucleases were used to digest genomic DNA from 5 isolates of Trichinella spiralis obtained from Changchun, Tianjin, Xian, Henan and Yunnan. All the isolates were secured from pigs except the Changchun strain, which came from a dog. The DNA fragments digested by endonuclease were separated by agarose gel electrophoresis. The Changchun isolate had an EcoRI band at 1.12 kb and a DraI band at 1.97 kb which were unique to this isolate. A cloned specific repetitive DNA sequence (1.12 kb) from the Changchun strain was selected to prepare a probe for the Southern blotting of EcoRI restriction DNA fragments for the 5 isolates. The 1.12 kb hybridizing band did not appear except in the Changchun isolate. These results seem to indicate that there are differences between the isolates obtained from hosts in different geographical regions.

  11. An empirical test of the treatment of indels during optimization alignment based on the phylogeny of the genus Secale (Poaceae)

    DEFF Research Database (Denmark)

    Petersen, Gitte; Seberg, Ole; Aagesen, Lone

    2004-01-01

    The ability of the program POY, implementing optimization alignment, to deal with major indels is explored and discussed in connection with a phylogenetic analysis of the genus Secale based on partial Adh1 sequences. The Adh1 sequences used span exons 2-4. Nearly all variation is found in intron 2...

  12. In silico detection of phylogenetic informative Y-chromosomal single nucleotide polymorphisms from whole genome sequencing data.

    Science.gov (United States)

    Van Geystelen, Anneleen; Wenseleers, Tom; Decorte, Ronny; Caspers, Maarten J L; Larmuseau, Maarten H D

    2014-11-01

    A state-of-the-art phylogeny of the human Y-chromosome is an essential tool for forensic genetics. The explosion of whole genome sequencing (WGS) data due to the rapid progress of next-generation sequencing facilities is useful to optimize and to increase the resolution of the phylogenetic Y-chromosomal tree. The most interesting Y-chromosomal variants for refining the phylogeny are SNPs (Y-SNPs), especially since the software to call them in WGS data and to genotype them in forensic assays has been optimized over the past years. The PENNY software presented here detects potentially phylogenetically interesting Y-SNPs in silico based on SNP calling data files and classifies them into different types according to their position in the currently used Y-chromosomal tree. The software utilized 790 available male WGS samples, of which 172 had a high SNP calling quality. In total, 1269 Y-SNPs potentially capable of increasing the resolution of the Y-chromosomal phylogenetic tree were detected based on a first run with PENNY. Based on a test panel of 57 high-quality and 618 low-quality WGS samples, we could prove that these newly added Y-SNPs indeed increased the resolution of the phylogenetic Y-chromosomal analysis substantially. Finally, we performed a second run with PENNY in which all samples, including those of the test panel, were used, and this resulted in 509 additional phylogenetically promising Y-SNPs. By including these additional Y-SNPs, a final update of the present phylogenetic Y-chromosomal tree, which is useful for forensic applications, was generated. In order to find more convincing forensically interesting Y-SNPs with the PENNY software, the number of samples and the variety of the haplogroups to which these samples belong need to increase. The PENNY software (including the user manual) is freely available on the website http://bio.kuleuven.be/eeb/lbeg/software.

  13. Single Nucleotide Polymorphism

    DEFF Research Database (Denmark)

    Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification and bri...

  14. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2012-01-01

    Background: The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results: In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P ® MassARRAY. No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions: The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving

  15. DNA sequence polymorphisms within the bovine guanine nucleotide-binding protein Gs subunit alpha (Gsα)-encoding (GNAS) genomic imprinting domain are associated with performance traits

    Directory of Open Access Journals (Sweden)

    Mullen Michael P

    2011-01-01

    Background: Genes which are epigenetically regulated via genomic imprinting can be potential targets for artificial selection during animal breeding. Indeed, imprinted loci have been shown to underlie some important quantitative traits in domestic mammals, most notably muscle mass and fat deposition. In this candidate gene study, we have identified novel associations between six validated single nucleotide polymorphisms (SNPs) spanning a 97.6 kb region within the bovine guanine nucleotide-binding protein Gs subunit alpha gene (GNAS) domain on bovine chromosome 13 and genetic merit for a range of performance traits in 848 progeny-tested Holstein-Friesian sires. The mammalian GNAS domain consists of a number of reciprocally-imprinted, alternatively-spliced genes which can play a major role in growth, development and disease in mice and humans. Based on the current annotation of the bovine GNAS domain, four of the SNPs analysed (rs43101491, rs43101493, rs43101485 and rs43101486) were located upstream of the GNAS gene, while one SNP (rs41694646) was located in the second intron of the GNAS gene. The final SNP (rs41694656) was located in the first exon of transcripts encoding the putative bovine neuroendocrine-specific protein NESP55, resulting in an aspartic acid-to-asparagine amino acid substitution at amino acid position 192. Results: SNP genotype-phenotype association analyses indicate that the single intronic GNAS SNP (rs41694646) is associated (P ≤ 0.05) with a range of performance traits including milk yield, milk protein yield, the content of fat and protein in milk, culled cow carcass weight and progeny carcass conformation, measures of animal body size, direct calving difficulty (i.e. difficulty in calving due to the size of the calf) and gestation length. Association (P ≤ 0.01) with direct calving difficulty (i.e. due to calf size) and maternal calving difficulty (i.e. due to the maternal pelvic width size) was also observed at the rs

  16. DNA sequence polymorphisms within the bovine guanine nucleotide-binding protein Gs subunit alpha (Gsα)-encoding (GNAS) genomic imprinting domain are associated with performance traits

    Science.gov (United States)

    2011-01-01

    Background Genes which are epigenetically regulated via genomic imprinting can be potential targets for artificial selection during animal breeding. Indeed, imprinted loci have been shown to underlie some important quantitative traits in domestic mammals, most notably muscle mass and fat deposition. In this candidate gene study, we have identified novel associations between six validated single nucleotide polymorphisms (SNPs) spanning a 97.6 kb region within the bovine guanine nucleotide-binding protein Gs subunit alpha gene (GNAS) domain on bovine chromosome 13 and genetic merit for a range of performance traits in 848 progeny-tested Holstein-Friesian sires. The mammalian GNAS domain consists of a number of reciprocally-imprinted, alternatively-spliced genes which can play a major role in growth, development and disease in mice and humans. Based on the current annotation of the bovine GNAS domain, four of the SNPs analysed (rs43101491, rs43101493, rs43101485 and rs43101486) were located upstream of the GNAS gene, while one SNP (rs41694646) was located in the second intron of the GNAS gene. The final SNP (rs41694656) was located in the first exon of transcripts encoding the putative bovine neuroendocrine-specific protein NESP55, resulting in an aspartic acid-to-asparagine amino acid substitution at amino acid position 192. Results SNP genotype-phenotype association analyses indicate that the single intronic GNAS SNP (rs41694646) is associated (P ≤ 0.05) with a range of performance traits including milk yield, milk protein yield, the content of fat and protein in milk, culled cow carcass weight and progeny carcass conformation, measures of animal body size, direct calving difficulty (i.e. difficulty in calving due to the size of the calf) and gestation length. Association (P ≤ 0.01) with direct calving difficulty (i.e. due to calf size) and maternal calving difficulty (i.e. due to the maternal pelvic width size) was also observed at the rs43101491 SNP. Following

  18. Identification of Sesame Genomic Variations from Genome Comparison of Landrace and Variety

    Science.gov (United States)

    Wei, Xin; Zhu, Xiaodong; Yu, Jingyin; Wang, Linhai; Zhang, Yanxin; Li, Donghua; Zhou, Rong; Zhang, Xiurong

    2016-01-01

    Sesame (Sesamum indicum L.) is one of the main oilseed crops, providing vegetable oil and protein to humans. Landraces are the gene source for modern varieties and carry many desirable alleles for genetic improvement. Despite their importance, the genomes of sesame landraces remain unexplored, and the genomic variation between landraces and varieties is still unclear. To identify this variation, two representative sesame landrace accessions, “Baizhima” and “Mishuozhima,” were selected and re-sequenced. Genome sequencing and de novo assembly of the two landraces resulted in draft genomes of 267 Mb and 254 Mb, respectively, with contig N50 values of more than 47 kb. In total, 1,332,025 SNPs and 506,245 InDels were identified in the genomes of “Baizhima” and “Mishuozhima” by comparison with the genome of the variety “Zhongzhi13.” Among these variations, 70,018 SNPs and 8311 InDels were located in the coding regions of genes. Genomic variations may contribute to variation in sesame agronomic traits such as flowering time, plant height, and oil content. The identified variations were successfully used in QTL mapping, and the black pigment synthesis gene PPO was found to be the candidate gene for sesame seed coat color. The comparison of the genomes of sesame landraces and a modern variety produced a large amount of useful genomic information, constituting a powerful tool to support genetic research and molecular breeding of sesame. PMID:27536315
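
    To put the sesame counts above in proportion, the short sketch below computes the share of SNPs and InDels that fall in coding regions. It uses only the four counts quoted in the abstract; the variable names are illustrative and not taken from the study.

```python
# Counts quoted in the abstract above.
snps_total, snps_coding = 1_332_025, 70_018
indels_total, indels_coding = 506_245, 8_311

# Percentage of each variant class located in coding regions.
snp_coding_pct = 100 * snps_coding / snps_total
indel_coding_pct = 100 * indels_coding / indels_total

print(f"SNPs in coding regions:   {snp_coding_pct:.2f}%")
print(f"InDels in coding regions: {indel_coding_pct:.2f}%")
```

    On these figures, coding regions hold a noticeably smaller share of the InDels than of the SNPs.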

  19. Genome-wide association study for colorectal cancer identifies risk polymorphisms in German familial cases and implicates MAPK signalling pathways in disease susceptibility.

    Science.gov (United States)

    Lascorz, Jesús; Försti, Asta; Chen, Bowang; Buch, Stephan; Steinke, Verena; Rahner, Nils; Holinski-Feder, Elke; Morak, Monika; Schackert, Hans K; Görgens, Heike; Schulmann, Karsten; Goecke, Timm; Kloor, Matthias; Engel, Cristoph; Büttner, Reinhard; Kunkel, Nelli; Weires, Marianne; Hoffmeister, Michael; Pardini, Barbara; Naccarati, Alessio; Vodickova, Ludmila; Novotny, Jan; Schreiber, Stefan; Krawczak, Michael; Bröring, Clemens D; Völzke, Henry; Schafmayer, Clemens; Vodicka, Pavel; Chang-Claude, Jenny; Brenner, Hermann; Burwinkel, Barbara; Propping, Peter; Hampe, Jochen; Hemminki, Kari

    2010-09-01

    Genetic susceptibility accounts for approximately 35% of all colorectal cancer (CRC). Ten common low-risk variants contributing to CRC risk have been identified through genome-wide association studies (GWASs). In our GWAS, 610 664 genotyped single-nucleotide polymorphisms (SNPs) passed the quality control filtering in 371 German familial CRC patients and 1263 controls, and replication studies were conducted in four additional case-control sets (4915 cases and 5607 controls). Known risk loci at 8q24.21 and 11q23 were confirmed, and a previously unreported association, rs12701937, located between the genes GLI3 (GLI family zinc finger 3) and INHBA (inhibin, beta A) [P = 1.1 × 10^-3, odds ratio (OR) 1.14, 95% confidence interval (CI) 1.05-1.23, dominant model in the combined cohort], was identified. The association was stronger in familial cases compared with unselected cases (P = 2.0 × 10^-4, OR 1.36, 95% CI 1.16-1.60, dominant model). Two other unreported SNPs, rs6038071, 40 kb upstream of CSNK2A1 (casein kinase 2, alpha 1 polypeptide) and an intronic marker in MYO3A (myosin IIIA), rs11014993, associated with CRC only in the familial CRC cases (P = 2.5 × 10^-3, recessive model, and P = 2.7 × 10^-4, dominant model). Three software tools successfully pointed to the overrepresentation of genes related to the mitogen-activated protein kinase (MAPK) signalling pathways among the 1340 most strongly associated markers from the GWAS (allelic P value ... genes involved in MAPK signalling events (P(trend) = 2.2 × 10^-16, OR(per allele) = 1.34, 95% CI 1.11-1.61).
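
    The odds ratios, confidence intervals and P values reported for rs12701937 can be cross-checked against each other. The sketch below is not the authors' analysis: it assumes a Wald test on the log-odds scale and a symmetric 95% interval, and the function name is hypothetical. Under those assumptions, the standard error is recovered from the interval width and turned into a two-sided P value.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from an OR and its 95% CI.

    Assumes a Wald interval on the log scale:
    ln(OR) +/- z_crit * SE, so SE = (ln(high) - ln(low)) / (2 * z_crit).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

# Values quoted in the abstract (dominant model).
p_combined = p_from_or_ci(1.14, 1.05, 1.23)   # combined cohort
p_familial = p_from_or_ci(1.36, 1.16, 1.60)   # familial cases only

print(f"combined: {p_combined:.1e}, familial: {p_familial:.1e}")
```

    Because the published interval bounds are rounded to two decimals, the recovered values should land close to, but not exactly on, the reported P values.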

  20. Genomic variability of Mycobacterium tuberculosis strains of the Euro-American lineage based on large sequence deletions and 15-locus MIRU-VNTR polymorphism.

    Directory of Open Access Journals (Sweden)

    Laura Rindi

    A sample of 260 Mycobacterium tuberculosis strains assigned to the Euro-American family was studied to identify phylogenetically informative genomic regions of difference (RD). Mutually exclusive deletions of regions RD115, RD122, RD174, RD182, RD183, RD193, RD219, RD726 and RD761 were found in 202 strains; the RD(Rio) deletion was detected exclusively among the RD174-deleted strains. Although certain deletions were found more frequently in certain spoligotype families (i.e., deletion RD115 in T and LAM, RD174 in LAM, RD182 in Haarlem, RD219 in T and RD726 in the "Cameroon" family), the RD-defined sublineages did not specifically match with spoligotype-defined families, thus arguing against the use of spoligotyping for establishing exact phylogenetic relationships between strains. Notably, when tested for katG463/gyrA95 polymorphism, all the RD-defined sublineages belonged to Principal Genotypic Group (PGG) 2, except sublineage RD219, which belonged exclusively to PGG3; the 58 Euro-American strains with no deletion were of either PGG2 or 3. A representative sample of 197 isolates was then analyzed by standard 15-locus MIRU-VNTR typing, a suitable approach to independently assess genetic relationships among the strains. Analysis of the MIRU-VNTR typing results by using a minimum spanning tree (MST) and a classical dendrogram showed groupings that were largely concordant with those obtained by RD-based analysis. Isolates of a given RD profile show, in addition to closely related MIRU-VNTR profiles, related spoligotype profiles that can serve as a basis for better spoligotype-based classification.
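
    The minimum spanning tree step mentioned above can be illustrated with a small sketch. The abstract does not say which distance or software was used. The version below assumes a categorical distance (the number of loci whose repeat counts differ) and builds the tree with Prim's algorithm. The four 15-locus profiles are invented for illustration and are not data from the study.

```python
def locus_distance(a, b):
    """Number of loci at which two repeat-count profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns edges as (distance, from_name, to_name)."""
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge linking a node in the tree to one outside it.
        best = min(
            (locus_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges

# Hypothetical 15-locus MIRU-VNTR profiles (repeat counts per locus).
profiles = {
    "iso1": (2, 3, 4, 2, 5, 3, 2, 4, 3, 2, 6, 3, 2, 4, 5),
    "iso2": (2, 3, 4, 2, 5, 3, 2, 4, 3, 2, 6, 3, 2, 4, 4),
    "iso3": (2, 3, 4, 2, 5, 3, 2, 4, 3, 2, 6, 2, 2, 3, 4),
    "iso4": (1, 3, 2, 2, 5, 1, 2, 4, 3, 2, 6, 2, 2, 3, 4),
}

edges = minimum_spanning_tree(profiles)
for distance, src, dst in edges:
    print(f"{src} -> {dst}: {distance} differing loci")
```

    Isolates joined by short edges differ at few loci, which is the sense in which MST groupings can be compared with RD-defined sublineages.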

  1. Analyzing Somatic Genome Rearrangements in Human Cancers by Using Whole-Exome Sequencing | Office of Cancer Genomics

    Science.gov (United States)

    Although exome sequencing data are generated primarily to detect single-nucleotide variants and indels, they can also be used to identify a subset of genomic rearrangements whose breakpoints are located in or near exons. Using >4,600 tumor and normal pairs across 15 cancer types, we identified over 9,000 high confidence somatic rearrangements, including a large number of gene fusions.

  2. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    Science.gov (United States)

    2011-09-01

    Almasy, L, Blangero, J. (2009) Human QTL linkage mapping. Genetica 136:333-340. Amos, CI. (2007) Successful design and conduct of genome-wide...quantitative trait loci. Genetica 136:237-243. Skol AD, Scott LJ, Abecasis GR, Boehnke M. (2006) Joint analysis is more efficient than replication

  4. Genome-Wide Association Analysis for Blood Lipid Traits Measured in Three Pig Populations Reveals a Substantial Level of Genetic Heterogeneity.

    Science.gov (United States)

    Yang, Hui; Huang, Xiaochang; Zeng, Zhijun; Zhang, Wanchang; Liu, Chenlong; Fang, Shaoming; Huang, Lusheng; Chen, Congying

    2015-01-01

    Serum lipids are associated with myocardial infarction and cardiovascular disease in humans. Here we dissected the genetic architecture of blood lipid traits by applying genome-wide association studies (GWAS) in 1,256 pigs from Laiwu, Erhualian and Duroc × (Landrace × Yorkshire) populations, and a meta-analysis of GWAS in more than 2,400 pigs from five diverse populations. A total of 22 genomic loci surpassing the suggestive significance level were detected on 11 pig chromosomes (SSC) for six blood lipid traits. Meta-analysis of GWAS identified 5 novel loci associated with blood lipid traits. Comparison of GWAS loci across the tested populations revealed a substantial level of genetic heterogeneity for porcine blood lipid levels. We further evaluated the causality of nine polymorphisms near or within the APOB gene on SSC3 for serum LDL-C and TC levels. Of the 9 polymorphisms, an indel showed the most significant association with LDL-C and TC in Laiwu pigs. However, this association was not detected in the White Duroc × Erhualian F2 resource population, in which the QTL for LDL-C and TC was also detected on SSC3. This indicates that population-specific signals may exist for the SSC3 QTL. Further investigations are warranted to validate this assumption.

  5. Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly

    Directory of Open Access Journals (Sweden)

    Shultz Jeffry

    2008-07-01

    Background: Many of the world's most important food crops have either polyploid genomes or homeologous regions derived from segmental shuffling following polyploid formation. The soybean (Glycine max) genome has been shown to be composed of approximately four thousand short interspersed homeologous regions with 1, 2 or 4 copies per haploid genome by RFLP analysis, microsatellite anchors to BACs and by contigs formed from BAC fingerprints. Despite these similar regions, the genome has been sequenced by whole genome shotgun sequencing (WGS). Here the aim was to use BAC end sequences (BES) derived from three minimum tile paths (MTP) to examine the extent and homogeneity of polyploid-like regions within contigs and the extent of correlation between the polyploid-like regions inferred from fingerprinting and the polyploid-like sequences inferred from WGS matches. Results: When sequence divergence was 1–10%, the copy number of homeologous regions could be identified from sequence variation in WGS reads overlapping BES. Homeolog sequence variants (HSVs) were single nucleotide polymorphisms (SNPs; 89%) and single nucleotide indels (SNIs; 10%). Larger indels were rare but present (1%). Simulations that had predicted that fingerprints of homeologous regions could be separated when divergence exceeded 2% were shown to be false. We show that a 5–10% sequence divergence is necessary to separate homeologs by fingerprinting. BES compared to WGS traces showed that polyploid-like regions with less than 1% sequence divergence exist at 2.3% of the locations assayed. Conclusion: The use of HSVs like SNPs and SNIs to characterize BACs will improve contig building methods. The implications for bioinformatic and functional annotation of polyploid and paleopolyploid genomes show that a combined approach of BAC fingerprint based physical maps, WGS sequence and HSV-based partitioning of BAC clones from homeologous regions to separate contigs will allow reliable de

  6. Development and Utilization of InDel Markers to Identify Peanut (Arachis hypogaea) Disease Resistance

    OpenAIRE

    Liu, Lifeng; Dang, Phat M.; Charles Y Chen

    2015-01-01

    Peanut diseases, such as leaf spot and spotted wilt caused by Tomato Spotted Wilt Virus, can significantly reduce yield and quality. Application of marker assisted plant breeding requires the development and validation of different types of DNA molecular markers. Nearly 10,000 SSR-based molecular markers have been identified by various research groups around the world, but less than 14.5% showed polymorphism in peanut and only 6.4% have been mapped. Low levels of polymorphism limit the applic...

  8. Direct Injection of CRISPR/Cas9-Related mRNA into Cytoplasm of Parthenogenetically Activated Porcine Oocytes Causes Frequent Mosaicism for Indel Mutations

    Directory of Open Access Journals (Sweden)

    Masahiro Sato

    2015-08-01

    Some reports demonstrated successful genome editing in pigs by one-step zygote microinjection of mRNA of CRISPR/Cas9-related components. Given the relatively long gestation periods and the high cost of housing, the establishment of a single blastocyst-based assay for rapid optimization of the above system is required. As a proof-of-concept, we attempted to disrupt a gene (GGTA1) encoding the α-1,3-galactosyltransferase that synthesizes the α-Gal epitope using parthenogenetically activated porcine oocytes. The lack of α-Gal epitope expression can be monitored by staining with fluorescently labeled isolectin BS-I-B4 (IB4), which binds specifically to the α-Gal epitope. When oocytes were injected with guide RNA specific to GGTA1 together with enhanced green fluorescent protein (EGFP) and human Cas9 mRNAs, 65% (24/37) of the developing blastocysts exhibited green fluorescence, although almost all (96%, 23/24) showed a mosaic fluorescent pattern. Staining with IB4 revealed that the green fluorescent area often had a reduced binding activity to IB4. Of the 16 samples tested, six (five fluorescent and one non-fluorescent blastocysts) had indel mutations, suggesting a correlation between EGFP expression and mutation induction. Furthermore, it is suggested that zygote microinjection of mRNAs might lead to the production of piglets with cells harboring various mutation types.

  9. Rapid genome evolution in Pms1 region of rice revealed by comparative sequence analysis

    Institute of Scientific and Technical Information of China (English)

    YU JinSheng; FAN YouRong; LIU Nan; SHAN Yan; LI XiangHua; ZHANG QiFa

    2007-01-01

    Pms1, a locus for photoperiod sensitive genic male sterility in rice, was identified and mapped to chromosome 7 in previous studies. Here we report an effort to identify the candidate genes for Pms1 by comparative sequencing of BAC clones from two cultivars Minghui 63 and Nongken 58, the parents for the initial mapping population. Annotation and comparison of the sequences of the two clones resulted in a total of five potential candidates which should be functionally tested. We also conducted comparative analysis of sequences of these two cultivars with two other cultivars, Nipponbare and 93-11, for which sequence data were available in public databases. The analysis revealed large differences in sequence composition among the four genotypes in the Pms1 region primarily due to retroelement activity leading to rapid recent growth and divergence of the genomes. High levels of polymorphism in the forms of indels and SNPs were found both in intra- and inter-subspecific comparisons. Dating analysis using LTRs of the retroelements in this region showed that the substitution rate of LTRs was much higher than reported in the literature. The results provided strong evidence for rapid genomic evolution of this region as a consequence of natural and artificial selection.

  10. Involvement of the Ventrolateral Prefrontal Cortex in Learning Others' Bad Reputations and Indelible Distrust.

    Science.gov (United States)

    Suzuki, Atsunobu; Ito, Yuichi; Kiyama, Sachiko; Kunimi, Mitsunobu; Ohira, Hideki; Kawaguchi, Jun; Tanabe, Hiroki C; Nakai, Toshiharu

    2016-01-01

    A bad reputation can persistently affect judgments of an individual even when it turns out to be invalid and ought to be disregarded. Such indelible distrust may reflect that the negative evaluation elicited by a bad reputation transfers to a person. Consequently, the person him/herself may come to activate this negative evaluation irrespective of the accuracy of the reputation. If this theoretical model is correct, an evaluation-related brain region will be activated when witnessing a person whose bad reputation one has learned about, regardless of whether the reputation is deemed valid or not. Here, we tested this neural hypothesis with functional magnetic resonance imaging (fMRI). Participants memorized faces paired with either a good or a bad reputation. Next, they viewed the faces alone and inferred whether each person was likely to cooperate, first while retrieving the reputations, and then while trying to disregard them as false. A region of the left ventrolateral prefrontal cortex (vlPFC), which may be involved in negative evaluation, was activated by faces previously paired with bad reputations, irrespective of whether participants attempted to retrieve or disregard these reputations. Furthermore, participants showing greater activity of the left ventrolateral prefrontal region in response to the faces with bad reputations were more likely to infer that these individuals would not cooperate. Thus, once associated with a bad reputation, a person may elicit evaluation-related brain responses on their own, thereby evoking distrust independently of their reputation.

  11. Genomic Medicine: Polymorphisms and microarray applications

    Directory of Open Access Journals (Sweden)

    Monica P. Spalvieri

    2004-12-01

    This update aims to present a new approach to DNA variation among individuals and to discuss the new technologies for detecting it. The complete sequencing of the human genome is the starting point for understanding genetic diversity. The recognized unit of measure of this variability is the single nucleotide polymorphism (SNP). The study of SNPs is currently restricted to research, but the numerous publications on the subject suggest that it will enter clinical practice. Examples are presented of the use of SNPs as molecular markers in ethnic genotyping, in the gene expression of diseases, and as potential pharmacological targets. The array technique is discussed; it facilitates the study of multiple gene sequences using specifically designed chips. Conventional methods analyze up to a maximum of 20 genes, whereas a single microarray provides information on tens of thousands of genes simultaneously, with rapid and accurate genotyping. Advances in biotechnology will make it possible to know, in addition to the sequence of each gene, the frequency and exact location of SNPs and their influence on cellular behavior. Although the validity of the results and the efficiency of microarrays are still controversial, knowledge and characterization of a patient's genetic profile will surely drive a radical change in the prevention, diagnosis, prognosis and treatment of human diseases.

  12. Three indel variants in chicken LPIN1 exon 6/flanking region are associated with performance and carcass traits.

    Science.gov (United States)

    Wang, R; Wang, T; Lu, W; Zhang, W; Chen, W; Kang, X; Huang, Y

    2015-01-01

    LPIN1 is a Mg(2+)-dependent phosphatidic acid phosphatase. Variation in chicken LPIN1 exon 6 and its flanking regions was identified, and three indel variants in six breeds and their associations with performance traits were studied. Seven variants were detected in the six breeds, including a synonymous tri-allelic variant (c.924A/T/C) and three indels. The exon 6 variants detected in chicken breeds were conserved among bird species. The indel variation frequency differed clearly among breeds. Two coding indels (c.1014-1018del3 and c.1125-1138del12) were multiples of three nucleotides and maintained the open reading frames of LPIN1 proteins. However, they were predicted to clearly change the RNA secondary structure of chicken LPIN1 exon 6 and the LPIN1 protein conformation. The association analysis showed that the c.871-15-22del6 variation had a significant effect on body weight at hatch (BW0) and at 2 weeks (BW2); the c.1014-1018del3 variation had a significant effect on BW4, BW6, caecum length and gizzard weight (GW); and the c.1125-1138del12 variation had a significant effect on BW12, shank length at 4 weeks (SL4), carcass weight, and lactate dehydrogenase (LDH), glucose (GLU) and albumin (ALB) traits. The genotype combination for c.1014-1018del3 and c.1125-1138del12 also had significant effects on SL4, SL8, GW, leg muscle weight, ALB, GLU and LDH. The study demonstrated that chicken LPIN1 has an important effect on body, carcass and organ weight and on serum LDH, GLU and ALB levels.
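
    The statement that the two coding indels keep the open reading frame follows from their lengths being multiples of three. A minimal sketch of that check is given below. It assumes the variant names end in "delN", where N is the number of deleted nucleotides, as in the abstract. The helper names are hypothetical and this is not a general HGVS parser.

```python
import re

def deletion_length(variant):
    """Extract N from a name ending in 'delN', e.g. 'c.1014-1018del3' -> 3."""
    match = re.search(r"del(\d+)$", variant)
    if match is None:
        raise ValueError(f"no deletion length found in {variant!r}")
    return int(match.group(1))

def preserves_reading_frame(variant):
    """A coding deletion is in-frame when its length is a multiple of 3."""
    return deletion_length(variant) % 3 == 0

# The two coding indels named in the abstract.
coding_indels = ["c.1014-1018del3", "c.1125-1138del12"]
in_frame = {v: preserves_reading_frame(v) for v in coding_indels}

for variant, ok in in_frame.items():
    print(variant, "in-frame" if ok else "frameshift")
```

    The check covers only the reading frame; the predicted changes in RNA secondary structure and protein conformation mentioned in the abstract are a separate question.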

  13. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    Directory of Open Access Journals (Sweden)

    Jung Kyongyong

    2009-04-01

    Background: Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results: The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated into IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion: The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site http://www.imgd.org/.

  14. High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak

    Directory of Open Access Journals (Sweden)

    Trout-Yakel Keri M

    2010-02-01

    Background: A large, multi-province outbreak of listeriosis associated with ready-to-eat meat products contaminated with Listeria monocytogenes serotype 1/2a occurred in Canada in 2008. Subtyping of outbreak-associated isolates using pulsed-field gel electrophoresis (PFGE) revealed two similar but distinct AscI PFGE patterns. High-throughput pyrosequencing of two L. monocytogenes isolates was used to rapidly provide the genome sequence of the primary outbreak strain and to investigate the extent of genetic diversity associated with a change of a single restriction enzyme fragment during PFGE. Results: The chromosomes were collinear, but differences included 28 single nucleotide polymorphisms (SNPs) and three indels, including a 33 kbp prophage that accounted for the observed difference in AscI PFGE patterns. The distribution of these traits was assessed within further clinical, environmental and food isolates associated with the outbreak, and this comparison indicated that three distinct, but highly related strains may have been involved in this nationwide outbreak. Notably, these two isolates were found to harbor a 50 kbp putative mobile genomic island encoding translocation and efflux functions that has not been observed in other Listeria genomes. Conclusions: High-throughput genome sequencing provided a more detailed real-time assessment of genetic traits characteristic of the outbreak strains than could be achieved with routine subtyping methods. This study confirms that the latest generation of DNA sequencing technologies can be applied during high priority public health events, and laboratories need to prepare for this inevitability and assess how to properly analyze and interpret whole genome sequences in the context of molecular epidemiology.

  15. Genome-wide copy number profiling on high-density bacterial artificial chromosomes, single-nucleotide polymorphisms, and oligonucleotide microarrays: a platform comparison based on statistical power analysis.

    NARCIS (Netherlands)

    Hehir-Kwa, J.Y.; Egmont-Peterson, M.; Janssen, I.M.; Smeets, D.F.C.M.; Geurts van Kessel, A.H.M.; Veltman, J.A.

    2007-01-01

    Recently, comparative genomic hybridization onto bacterial artificial chromosome (BAC) arrays (array-based comparative genomic hybridization) has proved to be successful for the detection of submicroscopic DNA copy-number variations in health and disease. Technological improvements to achieve a

  16. Rapid, economical single-nucleotide polymorphism and microsatellite discovery based on de novo assembly of a reduced representation genome in a non-model organism: a case study of Atlantic cod Gadus morhua.

    Science.gov (United States)

    Carlsson, J; Gauthier, D T; Carlsson, J E L; Coughlan, J P; Dillane, E; Fitzgerald, R D; Keating, U; McGinnity, P; Mirimin, L; Cross, T F

    2013-03-01

    By combining next-generation sequencing technology (454) and reduced representation library (RRL) construction, the rapid and economical isolation of over 25 000 potential single-nucleotide polymorphisms (SNP) and >6000 putative microsatellite loci from c. 2% of the genome of the non-model teleost, Atlantic cod Gadus morhua from the Celtic Sea, south of Ireland, was demonstrated. A small-scale validation of markers indicated that 80% (11 of 14) of SNP loci and 40% (6 of 15) of the microsatellite loci could be amplified and showed variability. The results clearly show that small-scale next-generation sequencing of RRL genomes is an economical and rapid approach for simultaneous SNP and microsatellite discovery that is applicable to any species. The low cost and relatively small investment in time allows for positive exploitation of ascertainment bias to design markers applicable to specific populations and study questions.

  17. Analysis of genomic polymorphisms of Bordetella pertussis isolates

    Institute of Scientific and Technical Information of China (English)

    徐颖华; 卫辰; 王丽婵; 骆鹏; 侯启明; 张庶民

    2012-01-01

    Objective: To investigate the genomic characteristics of B. pertussis isolates. Methods: Genomic polymorphisms of recent B. pertussis isolates were determined by two internationally used molecular typing methods for B. pertussis, allelic analysis and multiple-locus variable-number tandem repeat analysis (MLVA). The diversity indices of the typing methods were analyzed statistically. Results: Sequence analysis of pertactin (Prn), tracheal colonization factor (tcfA) and the pertussis toxin promoter region (ptxP) showed that three allelic profiles, Prn1/ptxP1/tcfA2, Prn2/ptxP3/tcfA2 and Prn3/ptxP1/tcfA2, were found at frequencies of 71.43%, 19.05% and 9.52% in the strains analyzed, respectively. The 21 isolates produced 12 different B. pertussis MLVA profiles, including one new profile. The main MLVA profiles were MLVA-136 and MLVA-152, which accounted for 28.57% and 19.05%, respectively. The results of statistical analysis suggested that MLVA was better than allelic analysis for molecular typing of B. pertussis. Conclusion: These results demonstrated that B. pertussis epidemic strains in some regions of China were different from those in European countries. This work could provide more insight into the molecular epidemiological patterns of B. pertussis strains, facilitating better control of pertussis spread and new vaccine prevention strategies in China.
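
    The abstract does not name the diversity index that was used. As an illustration only, the sketch below assumes the Hunter-Gaston discriminatory index, a common choice for comparing typing methods, and applies it to the allelic typing result. The counts 15, 4 and 2 are back-calculated from the reported frequencies among 21 isolates and are not stated directly in the abstract.

```python
def hunter_gaston_index(type_counts):
    """Hunter-Gaston index: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = sum(type_counts)
    return 1 - sum(c * (c - 1) for c in type_counts) / (n * (n - 1))

# Assumed counts per allelic profile, inferred from 71.43%, 19.05% and 9.52% of 21.
allelic_counts = [15, 4, 2]

# Confirm that the inferred counts reproduce the reported percentages.
percentages = [round(100 * c / sum(allelic_counts), 2) for c in allelic_counts]

d_allelic = hunter_gaston_index(allelic_counts)
print(percentages, round(d_allelic, 3))
```

    The MLVA index cannot be computed the same way from the abstract, since only two of the 12 profile frequencies are reported.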

  18. Polymorphism Analysis of Ten Functional Genes in Brassica napus Using SSCP Method

    Institute of Scientific and Technical Information of China (English)

    李媛媛; 陈庆芳; 傅廷栋; 马朝芝

    2012-01-01

    survey polymorphisms between SI-1300 and Eagle. All primers showed polymorphisms, resulting in ten polymorphic loci. Subsequently, ten polymorphic bands were randomly selected, sequenced and aligned, using the bl2seq software, with the gene sequences from which the primers were designed. The results indicated that the average identity was 98%, and that the sequenced fragments differed from their functional genes by only 2.3 bases on average. Furthermore, the polymorphic fragments amplified by five primer pairs were compared between SI-1300 and Eagle. The polymorphic fragments were highly conserved between SI-1300 and Eagle, with 39 single-nucleotide polymorphisms (SNPs) and five insertion-deletions (INDELs) in the DNA fragments amplified by the five primer pairs. The average frequency of sequence polymorphism was estimated to be one SNP every 30 bp and one INDEL every 233 bp. In conclusion, the sequences of functional genes that can be amplified by specific primers are highly conserved among different cultivars of B. napus, and SNPs are the most basic form of genetic variation in functional genes. This study will provide a foundation for investigating the molecular basis of important traits in rapeseed using comparative genomics.
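
    The reported polymorphism frequencies can be checked for internal consistency. The abstract does not state the total length of the compared fragments. The sketch below assumes a length of 1,165 bp, back-calculated from five INDELs at one per 233 bp, and confirms that the same length gives roughly one SNP per 30 bp for 39 SNPs.

```python
# Assumed total compared length in bp. It is not stated in the abstract and is
# back-calculated from 5 INDELs at one INDEL per 233 bp.
ASSUMED_ALIGNED_BP = 5 * 233

def bp_per_variant(total_bp, n_variants):
    """Average spacing between variants, in bp."""
    return total_bp / n_variants

snp_spacing = bp_per_variant(ASSUMED_ALIGNED_BP, 39)    # 39 SNPs reported
indel_spacing = bp_per_variant(ASSUMED_ALIGNED_BP, 5)   # 5 INDELs reported

print(f"one SNP per {snp_spacing:.1f} bp, one INDEL per {indel_spacing:.0f} bp")
```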

  19. NGS meta data analysis for identification of SNP and INDEL patterns in human airway transcriptome: A preliminary indicator for lung cancer

    Directory of Open Access Journals (Sweden)

    Sathya B.

    2015-03-01

    High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. It is also an efficient way to discover coding SNPs, and when multiple individuals with different genetic backgrounds are used, RNA-Seq is very effective for the identification of SNPs. The objective of this study was to perform SNP and INDEL discovery in the human airway transcriptome of healthy never smokers, healthy current smokers, smokers without lung cancer and smokers with lung cancer. Preliminary comparative analysis of these four data sets is expected to yield SNP and INDEL patterns associated with lung cancer. A total of 85,028 SNPs and 5738 INDELs in healthy never smokers, 32,671 SNPs and 1561 INDELs in healthy current smokers, 50,205 SNPs and 3008 INDELs in smokers without lung cancer and 51,299 SNPs and 3138 INDELs in smokers with lung cancer were identified. The SNPs and INDELs in genes previously reported as differentially expressed were also analyzed. Smokers were found to have SNPs at positions 62,186,542 and 62,190,293 in the SCGB1A1 gene and 180,017,251, 180,017,252, and 180,017,597 in the SCGB3A1 gene, and INDELs at position 35,871,168 in the NFKBIA gene and 180,017,797 in the SCGB3A1 gene. The SNPs identified in this study provide a resource for genetic studies in smokers and should contribute to the development of personalized medicine. This study is only preliminary, and more rigorous data analysis and wet lab validation are required.

  20. Identification of Single Nucleotide Polymorphisms and analysis of Linkage Disequilibrium in sunflower elite inbred lines using the candidate gene approach

    Directory of Open Access Journals (Sweden)

    Heinz Ruth A

    2008-01-01

    Background: Association analysis is a powerful tool to identify gene loci that may contribute to phenotypic variation. This includes the estimation of nucleotide diversity, the assessment of linkage disequilibrium (LD) structure and the evaluation of selection processes. Trait mapping by allele association requires a high-density map, which could be obtained by the addition of Single Nucleotide Polymorphisms (SNPs) and short insertions and/or deletions (indels) to SSR and AFLP genetic maps. Nucleotide diversity analysis of randomly selected candidate regions is a promising approach for the success of association analysis and fine mapping in the sunflower genome. Moreover, knowledge of the distance over which LD persists, in agronomically meaningful sunflower accessions, is important to establish the density of markers and the experimental design for association analysis. Results: A set of 28 candidate genes related to biotic and abiotic stresses were studied in 19 sunflower inbred lines. A total of 14,348 bp of sequence alignment was analyzed per individual. On average, 1 SNP was found per 69 nucleotides and 38 indels were identified in the complete data set. The mean nucleotide polymorphism was moderate (θ = 0.0056), as expected for inbred materials. The number of haplotypes per region ranged from 1 to 9 (mean = 3.54 ± 1.88). Model-based population structure analysis allowed detection of admixed individuals within the set of accessions examined. Two putative gene pools were identified (G1 and G2), with a large proportion of the inbred lines being assigned to one of them (G1). Consistent with the absence of population sub-structuring, LD for G1 decayed more rapidly (r² = 0.48 at 643 bp; trend line, pooled data) than the LD trend line for the entire set of 19 individuals (r² = 0.64 for the same distance). Conclusion: Knowledge about the patterns of diversity and the genetic relationships between breeding materials could be an invaluable aid in crop
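
    The r² values quoted above measure linkage disequilibrium between pairs of polymorphic sites. The record does not describe how they were computed; the sketch below shows the textbook definition for two biallelic loci, assuming each inbred line contributes a single haplotype with alleles coded 0/1. The function name and the toy data are hypothetical.

```python
def ld_r2(locus_a, locus_b):
    """Squared allelic correlation r^2 between two biallelic loci.

    locus_a, locus_b: equal-length sequences of 0/1 alleles, one entry per
    inbred line (assumption: each line is treated as a single haplotype).
    """
    if len(locus_a) != len(locus_b) or len(locus_a) == 0:
        raise ValueError("loci must be non-empty and of equal length")
    n = len(locus_a)
    p_a = sum(locus_a) / n  # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n  # frequency of allele 1 at locus B
    # Frequency of the haplotype carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Toy data for four lines (not from the study).
complete_ld = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

    Plotting such pairwise r² values against the distance between sites gives the kind of LD decay trend line the abstract refers to.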

  1. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome

    Directory of Open Access Journals (Sweden)

    Mei Lingling

    2011-11-01

    Background: To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance. Results: Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA. Conclusions: AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis. The method, by enabling an examination of gene-enriched regions containing exons, introns, and

  2. Genomic expression catalogue of a global collection of BCG vaccine strains show evidence for highly diverged metabolic and cell-wall adaptations

    KAUST Repository

    Abdallah, Abdallah

    2015-10-21

    Although Bacillus Calmette-Guérin (BCG) vaccines against tuberculosis have been available for more than 90 years, their effectiveness has been hindered by variable protective efficacy and a lack of lasting memory responses. One factor contributing to this variability may be the diversity of the BCG strains that are used around the world, in part from genomic changes accumulated during vaccine production and their resulting differences in gene expression. We have compared the genomes and transcriptomes of a global collection of fourteen of the most widely used BCG strains at single base-pair resolution. We have also used quantitative proteomics to identify key differences in expression of proteins across five representative BCG strains of the four tandem duplication (DU) groups. We provide a comprehensive map of single nucleotide polymorphisms (SNPs), copy number variation and insertions and deletions (indels) across fourteen BCG strains. Genome-wide SNP characterization allowed the construction of a new and robust phylogenic genealogy of BCG strains. Transcriptional and proteomic profiling revealed a metabolic remodeling in BCG strains that may be reflected by altered immunogenicity and possibly vaccine efficacy. Together, these integrated-omic data represent the most comprehensive catalogue of genetic variation across a global collection of BCG strains.

  3. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
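
    The abstract reports a haplotype N50 of 484 kb. For readers unfamiliar with the statistic, the sketch below computes an N50 from a list of lengths: the length L such that pieces of length L or longer cover at least half of the total. For a haplotype N50 the inputs would be phased block lengths rather than contig lengths. The function name and example lengths are illustrative and not taken from the paper's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths (contigs, scaffolds or
    phased haplotype blocks): the largest length L such that pieces of
    length >= L together cover at least half of the summed length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no lengths given")


# Toy example (not data from the study): total = 32, half = 16.
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```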

  4. De novo assembly of a haplotype-resolved human genome

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang

    2015-01-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should...

  5. Comparative Genome of GK and Wistar Rats Reveals Genetic Basis of Type 2 Diabetes.

    Directory of Open Access Journals (Sweden)

    Tiancheng Liu

    The Goto-Kakizaki (GK) rat, which was developed by repeated inbreeding of glucose-intolerant Wistar rats, is the most widely studied rat model for Type 2 diabetes (T2D). However, the detailed genetic background of the T2D phenotype in GK rats is still largely unknown. We report a survey of T2D susceptibility variations based on high-quality whole genome sequencing of GK and Wistar rats, which generated a list of GK-specific variations (228 structural variations, 2660 CNV amplifications and 2834 CNV deletions, 1796 protein-affecting SNVs or indels) by comparative genome analysis and identified 192 potential T2D-associated genes. The genes with variants were further refined with prior knowledge and public resources, including variant polymorphism of rat strains, protein-protein interactions and differential gene expression. Finally, we identified 15 mutated genes, which include seven known T2D-related genes (Tnfrsf1b, Scg5, Fgb, Sell, Dpp4, Icam1, and Pkd2l1) and eight high-confidence new candidate genes (Ldlr, Ccl2, Erbb3, Akr1b1, Pik3c2a, Cd5, Eef2k, and Cpd). Our results indicate that the T2D phenotype may be caused by the accumulation of multiple variations in the GK rat, and that the mutated genes may affect biological functions including adipocytokine signaling, glycerolipid metabolism, PPAR signaling, T cell receptor signaling and insulin signaling pathways. We present the genomic differences between two closely related rat strains (GK and Wistar) and narrow down the scope of susceptibility loci. Further experimental study is required to understand and validate the relationship between our candidate variants and the T2D phenotype. Our findings highlight the importance of sequence-based comparative genomics for investigating disease susceptibility loci in inbred animal models.

  6. 26 polymorphic microsatellite markers screened from the genome of guinea pigs

    Institute of Scientific and Technical Information of China (English)

    刘迪文; 杨伟伟; 吴宝金

    2014-01-01

    Objective: To screen microsatellite DNA markers from the genome of guinea pigs for genetic quality control and gene mapping of this species. Methods: Microsatellite sequences were obtained by magnetic bead enrichment and genome database screening, and candidate loci were chosen for primer design. Genomic DNA from 5 different guinea pig strains was then used to select polymorphic microsatellite DNA markers based on PCR amplification results. Results: Magnetic bead enrichment yielded 304 microsatellite sequences, for which 125 primer pairs were designed; one polymorphic microsatellite DNA marker and 17 specific loci (with no polymorphism found so far) were identified. Genome database screening yielded 292 microsatellite sequences, for which 178 primer pairs were designed and synthesized; 25 polymorphic microsatellite DNA markers and 28 specific loci (with no polymorphism found so far) were discovered. Conclusions: We obtained 26 polymorphic microsatellite DNA markers and 45 potential markers in guinea pigs, which may lay a foundation for the application of microsatellite DNA markers in genetic quality control and gene mapping of guinea pigs.

  7. Discovering and verifying DNA polymorphisms in a mung bean [V. radiata (L.) R. Wilczek] collection by EcoTILLING and sequencing

    Directory of Open Access Journals (Sweden)

    Dean Rob E

    2008-06-01

    Background: Vigna radiata, which is classified in the family Fabaceae, is an important economic crop and a dietary staple in many developing countries. The species radiata can be further subdivided into varieties, of which the variety sublobata is currently acknowledged as the putative progenitor of radiata. EcoTILLING was employed to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELs) in a collection of Vigna radiata accessions. Findings: A total of 157 DNA polymorphisms in the collection were produced from ten primer sets when using V. radiata var. sublobata as the reference. The majority of polymorphisms detected were found in putative introns. The banding patterns varied from simple to complex as the number of DNA polymorphisms between two pooled samples increased. Numerous SNPs and INDELs, ranging from 4–24 and 1–6, respectively, were detected in all fragments when pooling V. radiata var. sublobata with V. radiata var. radiata. On the other hand, when accessions of V. radiata var. radiata were mixed together and digested with CEL I, relatively few SNPs and no INDELs were detected. Conclusion: EcoTILLING was utilized to identify polymorphisms in a collection of mung bean, which previously showed limited molecular genetic diversity and limited morphological diversity in the flowers and pod descriptors. Overall, EcoTILLING proved to be a powerful genetic analysis tool providing the rapid identification of naturally occurring variation.

  8. Identification and characterization of polymorphisms within the 5' flanking region, first exon and part of first intron of bovine GH gene.

    Science.gov (United States)

    Ferraz, A L J; Bortolossi, J C; Curi, R A; Ferro, M I T; Ferro, J A; Furlan, L R

    2006-06-01

    The aim of the present study was to identify and characterize polymorphisms within the 5' flanking region, first exon and part of first intron of the bovine growth hormone gene among different beef cattle breeds: Nelore (n = 25), Simmental (n = 39), Simbrasil (n = 24), Simmental x Nelore (n = 30), Canchim x Nelore (n = 30) and Angus x Nelore (n = 30). Two DNA fragments (GH1, 464 bp and GH2, 453 bp) were amplified by polymerase chain reaction and then used for polymorphism identification by SSCP. Within the GH1 fragment, five polymorphisms were identified, corresponding to three different alleles: GH1.1, GH1.2 and GH1.3 (GenBank: AY662648, AY662649 and AY662650, respectively). These allele sequences were aligned and compared with bovine GH gene nucleotide sequence (GenBank: M57764 and AF118837), resulting in the identification of five insertion/deletions (INDELs) and five single nucleotide polymorphisms (SNPs). In the GH2 fragment two alleles were identified, GH2.1 and GH2.2 (GenBank: AY662651 and AY662652, respectively). The allele sequences were compared with GenBank sequences (M57764, AF007750 and AH009106) and three INDELs and four SNPs were identified. In conclusion, we were able to identify six new polymorphisms of the bovine GH gene (one INDEL and five SNPs), which can be used as molecular markers in genetic studies.

  9. Family Polymorphism

    DEFF Research Database (Denmark)

    Ernst, Erik

    2001-01-01

    safety and flexibility at the level of multi-object systems. We are granted the flexibility of using different families of kinds of objects, and we are guaranteed the safety of the combination. This paper highlights the inability of traditional polymorphism to handle multiple objects, and presents family polymorphism as a way to overcome this problem. Family polymorphism has been implemented in the programming language gbeta, a generalized version of Beta, and the source code of this implementation is available under GPL....

  11. De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology

    NARCIS (Netherlands)

    Nijkamp, J.F.; Van den Broek, M.A.; Datema, E.; De Kok, S.; Bosman, L.; Luttik, M.A.H.; Daran-Lapujade, P.A.S.; Vongsangnak, W.; Nielsen, J.; Heijne, W.H.M.; Klaassen, P.; Paddon, C.J.; Platt, D.; Kötter, P.; Van Ham, R.C.; Reinders, M.J.T.; Pronk, J.T.; De Ridder, D.; Daran, J.M.

    2012-01-01

    Saccharomyces cerevisiae CEN.PK 113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNV), insertions/deletions (indels) and differences in genome organization

  12. Functional Analysis of In-frame Indel ARID1A Mutations Reveals New Regulatory Mechanisms of Its Tumor Suppressor Functions

    Directory of Open Access Journals (Sweden)

    Bin Guan

    2012-10-01

    AT-rich interactive domain 1A (ARID1A) has emerged as a new tumor suppressor in which frequent somatic mutations have been identified in several types of human cancers. Although most ARID1A somatic mutations are frame-shift or nonsense mutations that contribute to mRNA decay and loss of protein expression, 5% of ARID1A mutations are in-frame insertions or deletions (indels) that involve only a small stretch of peptides. Naturally occurring in-frame indel mutations provide unique and useful models to explore the biology and regulatory role of ARID1A. In this study, we analyzed indel mutations identified in gynecological cancers to determine how these mutations affect the tumor suppressor function of ARID1A. Our results demonstrate that all in-frame mutants analyzed lost their ability to inhibit cellular proliferation or activate transcription of CDKN1A, which encodes p21, a downstream effector of ARID1A. We also showed that ARID1A is a nucleocytoplasmic protein whose stability depends on its subcellular localization. Nuclear ARID1A is less stable than cytoplasmic ARID1A because ARID1A is rapidly degraded by the ubiquitin-proteasome system in the nucleus. In-frame deletions affecting the consensus nuclear export signal reduce steady-state protein levels of ARID1A. This defect in nuclear exportation leads to nuclear retention and subsequent degradation. Our findings delineate a mechanism underlying the regulation of ARID1A subcellular distribution and protein stability and suggest that targeting the nuclear ubiquitin-proteasome system can increase the amount of the ARID1A protein in the nucleus and restore its tumor suppressor functions.

  13. Genome-wide copy number profiling on high-density bacterial artificial chromosomes, single-nucleotide polymorphisms, and oligonucleotide microarrays: a platform comparison based on statistical power analysis.

    NARCIS (Netherlands)

    Hehir-Kwa, J.Y.; Egmont-Peterson, M.; Janssen, I.M.; Smeets, D.F.C.M.; Geurts van Kessel, A.H.M.; Veltman, J.A.

    2007-01-01

    Recently, comparative genomic hybridization onto bacterial artificial chromosome (BAC) arrays (array-based comparative genomic hybridization) has proved to be successful for the detection of submicroscopic DNA copy-number variations in health and disease. Technological improvements to achieve a high

  14. The location of a disease-associated polymorphism and genomic structure of the human 52-kDa Ro/SSA locus (SSA1)

    Energy Technology Data Exchange (ETDEWEB)

    Tsugu, H.; Horowitz, R.; Gibson, N. [Univ. of Oklahoma Health Sciences Center, Oklahoma City, OK (United States)] [and others

    1994-12-01

    Sera from approximately 30% of patients with systemic lupus erythematosus (SLE) contain high titers of autoantibodies that bind to the 52-kDa Ro/SSA protein. We previously detected polymorphisms in the 52-kDa Ro/SSA gene (SSA1) with restriction enzymes, one of which is strongly associated with the presence of SLE (P < 0.0005) in African Americans. A higher disease frequency and more severe forms of the disease are commonly noted among these female patients. To determine the location and nature of this polymorphism, we obtained two clones that span 8.5 kb of the 52-kDa Ro/SSA locus including its upstream regulatory region. Six exons were identified, and their nucleotide sequences plus adjacent noncoding regions were determined. No differences were found between these exons and the coding region of one of the reported cDNAs. The disease-associated polymorphic site suggested by a restriction enzyme map and confirmed by DNA amplification and nucleotide sequencing was present upstream of exon 1. This polymorphism may be a genetic marker for a disease-related variation in the coding region for the protein or in the upstream regulatory region of this gene. Although this RFLP is present in Japanese, it is not associated with lupus in this race. 41 refs., 4 figs., 2 tabs.

  15. It Is Not All about Single Nucleotide Polymorphisms: Comparison of Mobile Genetic Elements and Deletions in Listeria monocytogenes Genomes Links Cases of Hospital-Acquired Listeriosis to the Environmental Source.

    Science.gov (United States)

    Wang, Qinning; Holmes, Nadine; Martinez, Elena; Howard, Peter; Hill-Cawthorne, Grant; Sintchenko, Vitali

    2015-11-01

    The control of food-borne outbreaks caused by Listeria monocytogenes in humans relies on the timely identification of food or environmental sources and the differentiation of outbreak-related isolates from unrelated ones. This study illustrates the utility of whole-genome sequencing for examining the link between clinical and environmental isolates of L. monocytogenes associated with an outbreak of hospital-acquired listeriosis in Sydney, Australia. Comparative genomic analysis confirmed an epidemiological link between the three clinical and two environmental isolates. Single nucleotide polymorphism (SNP) analysis showed that only two SNPs separated the three human outbreak isolates, which differed by 19 to 20 SNPs from the environmental isolates and 71 to >10,000 SNPs from sporadic L. monocytogenes isolates. The chromosomes of all human outbreak isolates and the two suspected environmental isolates were syntenic. In contrast to the genomes of background sporadic isolates, all epidemiologically linked isolates contained two novel prophages and a previously unreported clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) locus subtype sequence. The mobile genetic element (MGE) profile of these isolates was distinct from that of the other serotype 1/2b reference strains and sporadic isolates. The identification of SNPs and clonally distinctive MGEs strengthened evidence to distinguish outbreak-related isolates of L. monocytogenes from cocirculating endemic strains.

  16. New traits in crops produced by genome editing techniques based on deletions

    NARCIS (Netherlands)

    Wiel, van de C.C.M.; Schaart, J.G.; Lotz, L.A.P.; Smulders, M.J.M.

    2017-01-01

    One of the most promising New Plant Breeding Techniques is genome editing (also called gene editing) with the help of a programmable site-directed nuclease (SDN). In this review, we focus on SDN-1, which is the generation of small deletions or insertions (indels) at a precisely defined location in t

  18. Porcine growth differentiation factor 9 gene polymorphisms and their associations with litter size

    Institute of Scientific and Technical Information of China (English)

    Yushan Zhang; Hongli Du; Jing Chen; Guanfu Yang; Xiquan Zhang

    2008-01-01

    Growth differentiation factor 9 (GDF9) is expressed in oocytes and is thought to be required for ovarian folliculogenesis. Given this function, GDF9 may be considered a candidate gene controlling pig ovulation rate. In this study, the complete coding sequence (encoding 444 amino acids), the intron sequence and part of the 5'-UTR of pig GDF9 were cloned. RT-PCR results showed that GDF9 mRNA is expressed in a wide range of tissues of the Erhualian pig in estrus. The expression levels of GDF9 mRNA in pituitary, ovary, uterus and oviduct are higher in Erhualian pigs than in Duroc pigs, with a significant difference in pituitary (P<0.05). Comparative sequencing revealed 12 polymorphisms, including 8 single nucleotide polymorphisms (SNPs) and one 314 bp indel in noncoding regions, and the other 3 SNPs in coding regions. Four polymorphisms, G359C, C1801T, T1806C and the 314 bp indel, were developed as markers for further use in population variation and association studies. The G359C polymorphism segregates only in the Chinese native breeds Erhualian and Dahuabai; in contrast, the 314 bp indel segregates only in Duroc and Landrace. The C1801T and T1806C sites seem to be completely linked and segregate in Erhualian, Dahuabai and Landrace. In summary, GDF9 may not be associated with pig litter size in extensive populations, according to the allele distributions of the four polymorphisms and a pilot association study in four breeds.

  19. A De Novo Genome Sequence Assembly of the Arabidopsis thaliana Accession Niederzenz-1 Displays Presence/Absence Variation and Strong Synteny

    Science.gov (United States)

    Pucker, Boas; Holtgräwe, Daniela; Rosleff Sörensen, Thomas; Stracke, Ralf; Viehöver, Prisca

    2016-01-01

    Arabidopsis thaliana is the most important model organism for fundamental plant biology. The genome diversity of different accessions of this species has been intensively studied, for example in the 1001 genome project, which led to the identification of many single nucleotide polymorphisms (SNPs) and small insertions and deletions (InDels). In addition, presence/absence variation (PAV), copy number variation (CNV) and mobile genetic elements contribute to genomic differences between A. thaliana accessions. To address larger genome rearrangements between the A. thaliana reference accession Columbia-0 (Col-0) and another accession of about average distance to Col-0, we created a de novo next generation sequencing (NGS)-based assembly from the accession Niederzenz-1 (Nd-1). The result was evaluated with respect to assembly strategy and synteny to Col-0. We provide a high quality genome sequence of the A. thaliana accession (Nd-1, LXSY01000000). The assembly displays an N50 of 0.590 Mbp and covers 99% of the Col-0 reference sequence. Scaffolds from the de novo assembly were positioned on the basis of sequence similarity to the reference. Errors in this automatic scaffold anchoring were manually corrected based on analyzing reciprocal best BLAST hits (RBHs) of genes. Comparison of the final Nd-1 assembly to the reference revealed duplications and deletions (PAV). We identified 826 insertions and 746 deletions in Nd-1. Randomly selected candidates of PAV were experimentally validated. Our Nd-1 de novo assembly allowed reliable identification of larger genic and intergenic variants, which was difficult or error-prone by short read mapping approaches alone. While overall sequence similarity as well as synteny is very high, we detected short and larger (affecting more than 100 bp) differences between Col-0 and Nd-1 based on bi-directional comparisons. The de novo assembly provided here and additional assemblies that will certainly be published in the future will allow to

  20. Genome-Wide Single-Nucleotide Polymorphisms in CMS and Restorer Lines Discovered by Genotyping Using Sequencing and Association with Marker-Combining Ability for 12 Yield-Related Traits in Oryza sativa L. subsp. Japonica

    Science.gov (United States)

    Zaid, Imdad U.; Tang, Weijie; Liu, Erbao; Khan, Sana U.; Wang, Hui; Mawuli, Edzesi W.; Hong, Delin

    2017-01-01

    Heterosis, or hybrid vigor, is closely related to the general combining ability (GCA) of parents and the specific combining ability (SCA) of combinations. Evaluating GCA and SCA facilitates the selection of parents and combinations in heterosis breeding. To improve combining ability (CA) by marker-assisted selection, it is necessary to identify marker loci associated with CA. To identify single nucleotide polymorphism (SNP) loci associated with CA in the parental genomes of japonica rice, genome-wide SNP loci were tested for association with the CA of 18 parents for 12 yield-related traits. In this study, 81 hybrids were created and evaluated to calculate the CA of 18 parents. The parents were sequenced by the genotyping by sequencing (GBS) method to identify genome-wide SNPs. The GBS analysis showed that mapping 9.86 × 10^6 short reads to the Nipponbare reference genome identified 39,001 SNPs in the parental genomes at 11,085 chromosomal positions. The discovered SNPs were non-randomly distributed within and among the 12 chromosomes of rice. Overall, 20.4% (8026) of the discovered SNPs were coding types, and 8.6% (3344) and 9.9% (3951) of the SNPs were synonymous and non-synonymous changes, respectively, which provides valuable knowledge about the underlying performance of the parents. Furthermore, the associations between SNPs and CA indicated that 362 SNP loci were significantly related to the CA of 12 parental traits. The SNP loci identified for CA in our study were distributed genome-wide and had a positive or negative effect on the CA of traits. For yield-related traits such as grain thickness, days to heading, panicle length, grain length and 1000-grain weight, the maximum number of positive SNP loci for CA was found in CMS A171 and in the restorers LC64 and LR27. On an individual basis, some of the associated loci that resided on chromosomes 2, 5, 7, 9, and 11 recorded maximum positive values for the CA of traits

  1. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    Science.gov (United States)

    Dong, Chun-nan; Yang, Ya-dong; Li, Shu-jin; Yang, Ya-ran; Zhang, Xiao-jing; Fang, Xiang-dong; Yan, Jiang-wei; Cong, Bin

    2016-01-01

    In the case of mass disasters, missing persons and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these “nucleosome protected STRs” (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  2. Development of genome-specific primers for homoeologous genes in allopolyploid species: the waxy and starch synthase II genes in allohexaploid wheat (Triticum aestivum L.) as examples

    Directory of Open Access Journals (Sweden)

    Brûlé-Babel Anita

    2010-05-01

    Full Text Available Abstract Background In allopolyploid crops, homoeologous genes in different genomes exhibit a very high sequence similarity, especially in the coding regions of genes. This makes it difficult to design genome-specific primers to amplify individual genes from different genomes. Development of genome-specific primers for agronomically important genes in allopolyploid crops is very important and useful not only for the study of sequence diversity and association mapping of genes in natural populations, but also for the development of gene-based functional markers for marker-assisted breeding. Here we report on a useful approach for the development of genome-specific primers in allohexaploid wheat. Findings In the present study, three genome-specific primer sets for the waxy (Wx) genes and four genome-specific primer sets for the starch synthase II (SSII) genes were developed mainly from single nucleotide polymorphisms (SNPs) and/or insertions or deletions (Indels) in introns and intron-exon junctions. The size of a single PCR product ranged from 750 bp to 1657 bp. The total length of amplified PCR products by these genome-specific primer sets accounted for 72.6%-87.0% of the Wx genes and 59.5%-61.6% of the SSII genes. Five genome-specific primer sets for the Wx genes (one for Wx-7A, three for Wx-4A and one for Wx-7D) could distinguish the wild type wheat and partial waxy wheat lines. These genome-specific primer sets for the Wx and SSII genes produced amplifications in hexaploid wheat, cultivated durum wheat, and Aegilops tauschii accessions, but failed to generate amplification in the majority of wild diploid and tetraploid accessions. Conclusions For the first time, we report on the development of genome-specific primers from three homoeologous Wx and SSII genes covering the majority of the genes in allohexaploid wheat. 
These genome-specific primers are being used for the study of sequence diversity and association mapping of the three homoeologous Wx

  3. Polymorphisms associated with ventricular tachyarrhythmias: rationale, design, and endpoints of the 'diagnostic data influence on disease management and relation of genomics to ventricular tachyarrhythmias in implantable cardioverter/defibrillator patients (DISCOVERY)' study

    DEFF Research Database (Denmark)

    Garcia, Javier; Wieneke, Heinrich; Spencker, Sebastian;

    2010-01-01

    (SNPs) are DNA sequence variations occurring when a single nucleotide in the genome differs among members of a species. A novel concept has emerged being that these common genetic variations might modify the susceptibility of a certain population to specific diseases. Thus, genetic factors may also...... modulate the risk for arrhythmias and sudden cardiac death, and identification of common variants could help to better identify patients at risk. The DISCOVERY study is an interventional, longitudinal, prospective, multi-centre diagnostic study that will enrol 1287 patients in approximately 80 European...... centres. In the genetic part of the DISCOVERY study, candidate gene polymorphisms involved in coding of the G-protein subunits will be correlated with the occurrence of ventricular arrhythmias in patients receiving an ICD for primary prevention. Furthermore, in order to search for additional sequence...

  4. Parsimony and Model-Based Analyses of Indels in Avian Nuclear Genes Reveal Congruent and Incongruent Phylogenetic Signals

    Directory of Open Access Journals (Sweden)

    Frederick H. Sheldon

    2013-03-01

    Full Text Available Insertion/deletion (indel) mutations, which are represented by gaps in multiple sequence alignments, have been used to examine phylogenetic hypotheses for some time. However, most analyses combine gap data with the nucleotide sequences in which they are embedded, probably because most phylogenetic datasets include few gap characters. Here, we report analyses of 12,030 gap characters from an alignment of avian nuclear genes using maximum parsimony (MP) and a simple maximum likelihood (ML) framework. Both trees were similar, and they exhibited almost all of the strongly supported relationships in the nucleotide tree, although neither gap tree supported many relationships that have proven difficult to recover in previous studies. Moreover, independent lines of evidence typically corroborated the nucleotide topology instead of the gap topology when they disagreed, although the number of conflicting nodes with high bootstrap support was limited. Filtering to remove short indels did not substantially reduce homoplasy or reduce conflict. Combined analyses of nucleotides and gaps resulted in the nucleotide topology, but with increased support, suggesting that gap data may prove most useful when analyzed in combination with nucleotide substitutions.

  5. Association and Genetic Identification of Loci for Four Fruit Traits in Tomato Using InDel Markers

    Directory of Open Access Journals (Sweden)

    Xiaoxi Liu

    2017-07-01

    Full Text Available Tomato (Solanum lycopersicum) fruit weight (FW), soluble solid content (SSC), fruit shape and fruit color are crucial for yield, quality and consumer acceptability. In this study, a 192-accession tomato association panel comprising a mixture of wild species, cherry tomato, landraces, and modern varieties collected worldwide was genotyped with 547 InDel markers evenly distributed on 12 chromosomes and scored for FW, SSC, fruit shape index (FSI), and color parameters over 2 years with three replications each year. The association panel was sorted into two subpopulations. Linkage disequilibrium ranged from 3.0 to 47.2 Mb across 12 chromosomes. A set of 102 markers significantly (p < 1.19–1.30 × 10^-4) associated with SSC, FW, fruit shape, and fruit color was identified on 11 of the 12 chromosomes using a mixed linear model. The associations were compared with the known gene/QTLs for the same traits. Genetic analysis using F2 populations detected 14 and 4 markers significantly (p < 0.05) associated with SSC and FW, respectively. Some loci were commonly detected by both association and linkage analysis. Particularly, one novel locus for FW on chromosome 4 detected by association analysis was also identified in F2 populations. The results demonstrated that association mapping using a limited number of InDel markers and a relatively small population could not only complement and enhance previous QTL information, but also identify novel loci for marker-assisted selection of fruit traits in tomato.

  6. A genomic scale map of genetic diversity in Trypanosoma cruzi

    Directory of Open Access Journals (Sweden)

    Ackermann Alejandro A

    2012-12-01

    Full Text Available Abstract Background Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~ 97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. 
The analysis covers an estimated ~ 60% of the genetic diversity present in the

  7. Epstein-Barr virus genome polymorphisms of Epstein-Barr virus-associated gastric carcinoma in gastric remnant carcinoma in Guangzhou, southern China, an endemic area of nasopharyngeal carcinoma.

    Science.gov (United States)

    Chen, Jian-ning; Jiang, Ye; Li, Hai-gang; Ding, Yun-gang; Fan, Xin-juan; Xiao, Lin; Han, Jing; Du, Hong; Shao, Chun-kui

    2011-09-01

    Epstein-Barr virus (EBV) is associated with a subset of gastric carcinoma which was defined as EBV-associated gastric carcinoma (EBVaGC). The proportion of EBVaGC in gastric remnant carcinoma (GRC) was apparently higher than that in conventional gastric carcinoma (CGC), which occurs in the intact stomach. To clarify the possible mechanisms, 26 GRC cases from Guangzhou were investigated for the presence of EBV, and the EBV genome polymorphisms of EBVaGC in GRC were analyzed. In addition, the clinicopathologic characteristics and EBV latency pattern of EBVaGC in GRC were also investigated. Eight (30.8%) out of 26 cases were identified as EBVaGCs. Type A strain, prototype F, type I, mut-W1/I1, XhoI- and del-LMP1 variants were predominant among EBVaGC patients, accounting for 7 (87.5%), 7 (87.5%), 8 (100%), 6 (75%), 5 (62.5%) and 8 (100%) cases, respectively. All EBVaGC cases were male and with the histology of diffuse-type carcinoma. The tumor cells expressed EBNA1 (87.5%) and LMP2A (62.5%) but not LMP1, EBNA2 and ZEBRA. Thus, the EBV latency pattern was latency I. These were similar to those in CGC, except for the significantly higher proportion of EBVaGC in GRC than in CGC, suggesting that there is no more aggressive EBV variant in EBVaGC in GRC, and the injuries of gastric mucosa and/or changes of the microenvironment within the remnant stomach may be involved in the development of EBVaGC in GRC. This, to our knowledge, is the first study concerning the EBV genome polymorphisms of EBVaGC in GRC in the world.

  8. Genomic inflation factors under polygenic inheritance

    NARCIS (Netherlands)

    Yang, Jian; Weedon, Michael N.; Purcell, Shaun; Lettre, Guillaume; Estrada, Karol; Willer, Cristen J.; Smith, Albert V.; Ingelsson, Erik; O'Connell, Jeffrey R.; Mangino, Massimo; Maegi, Reedik; Madden, Pamela A.; Heath, Andrew C.; Nyholt, Dale R.; Martin, Nicholas G.; Montgomery, Grant W.; Frayling, Timothy M.; Hirschhorn, Joel N.; McCarthy, Mark I.; Goddard, Michael E.; Visscher, Peter M.

    2011-01-01

    Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide-polymorphisms across the genome i
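
    The "scaled median or mean test statistic" mentioned in this record is usually reported as a genomic inflation factor (lambda). The following is a minimal Python sketch of that calculation, assuming the inputs are 1-degree-of-freedom chi-square association statistics; the function names are illustrative and are not taken from the paper.

```python
from statistics import mean, median

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def lambda_median(chi2_stats):
    """Observed median statistic divided by the expected median under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def lambda_mean(chi2_stats):
    """Observed mean statistic divided by the expected mean under the null (1.0 for 1 df)."""
    return mean(chi2_stats) / 1.0
```

    A value near 1 indicates no inflation relative to the null expectation; how much inflation is attributable to stratification rather than polygenic signal is the question the record itself addresses.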

  9. High Resolution Melt (HRM) analysis is an efficient tool to genotype EMS mutants in complex crop genomes

    Directory of Open Access Journals (Sweden)

    Lochlainn Seosamh Ó

    2011-12-01

    Full Text Available Abstract Background Targeting Induced Local Lesions IN Genomes (TILLING) is increasingly being used to generate and identify mutations in target genes of crop genomes. TILLING populations of several thousand lines have been generated in a number of crop species including Brassica rapa. Genetic analysis of mutants identified by TILLING requires an efficient, high-throughput and cost effective genotyping method to track the mutations through numerous generations. High resolution melt (HRM) analysis has been used in a number of systems to identify single nucleotide polymorphisms (SNPs) and insertion/deletions (IN/DELs), enabling the genotyping of different types of samples. HRM is ideally suited to high-throughput genotyping of multiple TILLING mutants in complex crop genomes. To date it has been used to identify mutants and genotype single mutations. The aim of this study was to determine if HRM can facilitate downstream analysis of multiple mutant lines identified by TILLING in order to characterise allelic series of EMS induced mutations in target genes across a number of generations in complex crop genomes. Results We demonstrate that HRM can be used to genotype allelic series of mutations in two genes, BraA.CAX1.a and BraA.MET1.a, in Brassica rapa. We analysed 12 mutations in BraA.CAX1.a and five in BraA.MET1.a over two generations including a back-cross to the wild-type. Using a commercially available HRM kit and the Lightscanner™ system we were able to detect mutations in heterozygous and homozygous states for both genes. Conclusions Using HRM genotyping on TILLING derived mutants, it is possible to generate an allelic series of mutations within multiple target genes rapidly. Lines suitable for phenotypic analysis can be isolated approximately 8-9 months (3 generations) from receiving M3 seed of Brassica rapa from the RevGenUK TILLING service.

  10. Rapid identification of genetic modifications in Bacillus anthracis using whole genome draft sequences generated by 454 pyrosequencing.

    Directory of Open Access Journals (Sweden)

    Peter E Chen

    Full Text Available BACKGROUND: The anthrax letter attacks of 2001 highlighted the need for rapid identification of biothreat agents not only for epidemiological surveillance of the intentional outbreak but also for implementing appropriate countermeasures, such as antibiotic treatment, in a timely manner to prevent further casualties. It is clear from the 2001 cases that survival may be markedly improved by administration of antimicrobial therapy during the early symptomatic phase of the illness; i.e., within 3 days of appearance of symptoms. Microbiological detection methods are feasible only for organisms that can be cultured in vitro and cannot detect all genetic modifications with the exception of antibiotic resistance. Currently available immuno or nucleic acid-based rapid detection assays utilize known, organism-specific proteins or genomic DNA signatures respectively. Hence, these assays lack the ability to detect novel natural variations or intentional genetic modifications that circumvent the targets of the detection assays or in the case of a biological attack using an antibiotic resistant or virulence enhanced Bacillus anthracis, to advise on therapeutic treatments. METHODOLOGY/PRINCIPAL FINDINGS: We show here that the Roche 454-based pyrosequencing can generate whole genome draft sequences of deep and broad enough coverage of a bacterial genome in less than 24 hours. Furthermore, using the unfinished draft sequences, we demonstrate that unbiased identification of known as well as heretofore-unreported genetic modifications that include indels and single nucleotide polymorphisms conferring antibiotic and phage resistances is feasible within the next 12 hours. CONCLUSIONS/SIGNIFICANCE: Second generation sequencing technologies have paved the way for sequence-based rapid identification of both known and previously undocumented genetic modifications in cultured, conventional and newly emerging biothreat agents. 
Our findings have significant implications in

  11. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus.

    Science.gov (United States)

    Chen, Chunxian; Gmitter, Fred G

    2013-11-01

    Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered - 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had "no hits found", 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. 
High-quality EST-SNPs from different citrus genotypes were detected, and
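
    The transition/transversion/indel breakdown reported in this record follows the standard base-pair classes named in the abstract (AG and CT are transitions; AC, GT, CG and AT are transversions). A minimal Python sketch of that classification is given below; the function name and the convention that alleles of unequal length count as indels are assumptions for illustration, not the authors' pipeline.

```python
# Unordered purine-purine and pyrimidine-pyrimidine pairs.
TRANSITION_PAIRS = {frozenset("AG"), frozenset("CT")}


def classify_variant(ref, alt):
    """Label a two-allele variant as 'transition', 'transversion' or 'indel'."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != len(alt):
        return "indel"  # assumed convention: a length change marks an indel
    if ref == alt:
        raise ValueError("ref and alt are identical; not a variant")
    if frozenset((ref, alt)) in TRANSITION_PAIRS:
        return "transition"
    return "transversion"


# Counts as reported in the abstract; they sum to the stated total.
REPORTED_COUNTS = {"transition": 15010, "transversion": 9114, "indel": 1293}
REPORTED_TOTAL = sum(REPORTED_COUNTS.values())
```

    Calling classify_variant on each mined allele pair and tallying the labels would reproduce a table of the same shape as the reported counts.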

  12. Comparative genomics of Australian isolates of the wheat stem rust pathogen Puccinia graminis f. sp. tritici reveals extensive polymorphism in candidate effector genes

    Directory of Open Access Journals (Sweden)

    Narayana Mithur Upadhyaya

    2015-01-01

    Full Text Available The wheat stem rust fungus Puccinia graminis f. sp. tritici (Pgt) is one of the most destructive pathogens of wheat. In this study, a draft genome was built for a founder Australian Pgt isolate of pathotype (pt.) 21-0 (collected in 1954) by next generation DNA sequencing. A combination of reference-based assembly using the genome of the previously sequenced American Pgt isolate CDL 75-36-700-3 (p7a) and de novo assembly was performed, resulting in a 92 Mbp reference genome for Pgt isolate 21-0. Approximately 13 Mbp of de novo assembled sequence in this genome is not present in the p7a reference assembly. This novel sequence is not specific to 21-0 as it is also present in three other Pgt rust isolates of independent origin. The new reference genome was subsequently used to build a pan-genome based on five Australian Pgt isolates. Transcriptomes from germinated urediniospores and haustoria were separately assembled for pt. 21-0 and comparison of gene expression profiles showed differential expression in ~10% of the genes each in germinated spores and haustoria. A total of 1,924 secreted proteins were predicted from the 21-0 transcriptome, of which 520 were classified as haustorial secreted proteins (HSPs). Comparison of 21-0 with two presumed clonal field derivatives of this lineage (collected in 1982 and 1984) that had evolved virulence on four additional resistance genes (Sr5, Sr11, Sr27, SrSatu) identified mutations in 25 HSP effector candidates, some of which could explain their novel virulence phenotypes.

  13. PRNP and SPRN genes polymorphism in atypical bovine spongiform encephalopathy cases diagnosed in Polish cattle.

    Science.gov (United States)

    Gurgul, Artur; Polak, Mirosław Paweł; Larska, Magdalena; Słota, Ewa

    2012-08-01

    Polymorphisms in the coding region of the prion protein gene (PRNP) have been associated with the susceptibility and incubation period of prion diseases in humans and sheep. However, polymorphisms in this part of the bovine PRNP gene do not affect the classical bovine spongiform encephalopathy (BSE) susceptibility in cattle. Studies carried out in Germany have shown that insertion/deletion-type polymorphisms located in the promoter region of the bovine prion gene are possible genetic factors modulating BSE susceptibility by changing the level of PRNP expression. No such association was observed for atypical BSE cases; however, due to the rare nature of the disease, these results should be confirmed. Additionally, a single nonsynonymous mutation in PRNP codon 211 (E211K) was described in one H-type BSE case in the USA; however, it was not found in any other cases. Here, we performed genetic characterization of PRNP promoter indel variations and determined the polymorphism of open reading frames (ORFs) of PRNP and bovine prion-like Shadoo (SPRN) genes in six Polish atypical BSE cases and compared these results to the population of clinically healthy Polish Holstein cattle. No potentially pathogenic mutations were found in the PRNP ORF in atypical BSE-affected cattle, but our study showed a high frequency of deletions at the indel loci of PRNP promoter in these animals. Additionally, a rare sequence variation in the SPRN protein-coding sequence was found in one L-type atypical BSE-affected animal.

  14. Evolution of the P-type II ATPase gene family in the fungi and presence of structural genomic changes among isolates of Glomus intraradices

    Directory of Open Access Journals (Sweden)

    Sanders Ian R

    2006-03-01

    Full Text Available Abstract Background The P-type II ATPase gene family encodes proteins with an important role in adaptation of the cell to variation in external K+, Ca2+ and Na+ concentrations. The presence of P-type II gene subfamilies that are specific for certain kingdoms has been reported but was sometimes contradicted by discovery of previously unknown homologous sequences in newly sequenced genomes. Members of this gene family have been sampled in all of the fungal phyla except the arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota), which are known to play a key role in terrestrial ecosystems and to be genetically highly variable within populations. Here we used highly degenerate primers on AMF genomic DNA to increase the sampling of fungal P-Type II ATPases and to test previous predictions about their evolution. In parallel, homologous sequences of the P-type II ATPases have been used to determine the nature and amount of polymorphism that is present at these loci among isolates of Glomus intraradices harvested from the same field. Results In this study, four P-type II ATPase sub-families have been isolated from three AMF species. We show that, contrary to previous predictions, P-type IIC ATPases are present in all basal fungal taxa. Additionally, P-Type IIE ATPases should no longer be considered as exclusive to the Ascomycota and the Basidiomycota, since we also demonstrate their presence in the Zygomycota. Finally, a comparison of homologous sequences encoding P-type IID ATPases showed unexpectedly that indel mutations among coding regions, as well as specific gene duplications, occur among AMF individuals within the same field. Conclusion On the basis of these results we suggest that the diversification of P-Type IIC and E ATPases followed the diversification of the extant fungal phyla with independent events of gene gains and losses. 
Consistent with recent findings on the human genome, but at a much smaller geographic scale, we provided evidence

  15. SpeedSeq: Ultra-fast personal genome analysis and interpretation

    Science.gov (United States)

    Chiang, Colby; Layer, Ryan M.; Faust, Gregory G.; Lindberg, Michael R.; Rose, David B.; Garrison, Erik P.; Marth, Gabor T.; Quinlan, Aaron R.; Hall, Ira M.

    2015-01-01

    SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 hours on a low-cost server, alleviating a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers competitive or superior performance to current methods for detecting germline and somatic single nucleotide variants, indels, and structural variants, and includes novel functionality for streamlined interpretation. PMID:26258291

  16. Genomic resources for water yam (Dioscorea alata L.): analyses of EST-Sequences, De Novo sequencing and GBS libraries

    Science.gov (United States)

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources such as SSRs, SNPs and InDels in several model and non-model plant species. Yam (Dioscorea spp.) i...

  17. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    Full Text Available We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compared this with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number variation of the 21-bp tandem repeat varied in F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid have highly conserved coding sequence with about 98% homology and four genes--rpoC2, ycf3, accD, and clpP--have high synonymous (Ks) value. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  18. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Science.gov (United States)

    Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

    2015-01-01

    We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compared this with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number variation of the 21-bp tandem repeat varied in F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid have highly conserved coding sequence with about 98% homology and four genes--rpoC2, ycf3, accD, and clpP--have high synonymous (Ks) value. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  19. Development of a RAD-Seq Based DNA Polymorphism Identification Software, AgroMarker Finder, and Its Application in Rice Marker-Assisted Breeding.

    Science.gov (United States)

    Fan, Wei; Zong, Jie; Luo, Zhijing; Chen, Mingjiao; Zhao, Xiangxiang; Zhang, Dabing; Qi, Yiping; Yuan, Zheng

    2016-01-01

    Rapid and accurate genome-wide marker detection is essential to the marker-assisted breeding and functional genomics studies. In this work, we developed an integrated software, AgroMarker Finder (AMF: http://erp.novelbio.com/AMF), for providing graphical user interface (GUI) to facilitate the recently developed restriction-site associated DNA (RAD) sequencing data analysis in rice. By application of AMF, a total of 90,743 high-quality markers (82,878 SNPs and 7,865 InDels) were detected between rice varieties JP69 and Jiaoyuan5A. The density of the identified markers is 0.2 per Kb for SNP markers, and 0.02 per Kb for InDel markers. Sequencing validation revealed that the accuracy of genome-wide marker detection by AMF is 93%. In addition, a validated subset of 82 SNPs and 31 InDels were found to be closely linked to 117 important agronomic trait genes, providing a basis for subsequent marker-assisted selection (MAS) and variety identification. Furthermore, we selected 12 markers from 31 validated InDel markers to identify seed authenticity of variety Jiaoyuanyou69, and we also identified 10 markers closely linked to the fragrant gene BADH2 to minimize linkage drag for Wuxiang075 (BADH2 donor)/Jiachang1 recombinants selection. Therefore, this software provides an efficient approach for marker identification from RAD-seq data, and it would be a valuable tool for plant MAS and variety protection.

  20. Accelerating genome editing in CHO cells using CRISPR Cas9 and CRISPy, a web-based target finding tool.

    Science.gov (United States)

    Ronda, Carlotta; Pedersen, Lasse Ebdrup; Hansen, Henning Gram; Kallehauge, Thomas Beuchert; Betenbaugh, Michael J; Nielsen, Alex Toftgaard; Kildegaard, Helene Faustrup

    2014-08-01

    Chinese hamster ovary (CHO) cells are widely used in the biopharmaceutical industry as a host for the production of complex pharmaceutical proteins. Thus genome engineering of CHO cells for improved product quality and yield is of great interest. Here, we demonstrate for the first time the efficacy of the CRISPR Cas9 technology in CHO cells by generating site-specific gene disruptions in COSMC and FUT8, both of which encode proteins involved in glycosylation. The tested single guide RNAs (sgRNAs) created an indel frequency up to 47.3% in COSMC, while an indel frequency up to 99.7% in FUT8 was achieved by applying lectin selection. All eight sgRNAs examined in this study resulted in relatively high indel frequencies, demonstrating that the Cas9 system is a robust and efficient genome-editing methodology in CHO cells. Deep sequencing revealed that 85% of the indels created by Cas9 resulted in frameshift mutations at the target sites, with a strong preference for single base indels. Finally, we have developed a user-friendly bioinformatics tool, named "CRISPy" for rapid identification of sgRNA target sequences in the CHO-K1 genome. The CRISPy tool identified 1,970,449 CRISPR targets divided into 27,553 genes and lists the number of off-target sites in the genome. In conclusion, the proven functionality of Cas9 to edit CHO genomes combined with our CRISPy database have the potential to accelerate genome editing and synthetic biology efforts in CHO cells.
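
    The frameshift figure rests on a simple rule: an indel within a coding sequence shifts the reading frame when its length is not a multiple of three. A minimal sketch of that rule follows. The indel lengths are made up for illustration and are not data from the study, and the function names are hypothetical.

```python
def is_frameshift(indel_length: int) -> bool:
    """Return True if an indel of this length (in bp) shifts the reading frame."""
    if indel_length == 0:
        raise ValueError("an indel must have non-zero length")
    return abs(indel_length) % 3 != 0


def frameshift_fraction(indel_lengths):
    """Fraction of the given indels that cause a frameshift."""
    if not indel_lengths:
        raise ValueError("no indels given")
    shifted = sum(is_frameshift(n) for n in indel_lengths)
    return shifted / len(indel_lengths)


# Illustrative lengths only: positive = insertion, negative = deletion.
example_lengths = [1, -1, 1, -2, -3, 6, -7, 1]
print(frameshift_fraction(example_lengths))
```

    Single-base indels always shift the frame, which is consistent with a high frameshift rate when such events dominate.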

  1. Comparison of Genomic and Epigenomic Expression in Monozygotic Twins Discordant for Rett Syndrome.

    Directory of Open Access Journals (Sweden)

    Kunio Miyake

    Monozygotic (identical) twins have been widely used in genetic studies to determine the relative contributions of heredity and the environment in human diseases. Discordance in disease manifestation between affected monozygotic twins has been attributed to either environmental factors or different patterns of X chromosome inactivation (XCI). However, recent studies have identified genetic and epigenetic differences between monozygotic twins, thereby challenging the accepted experimental model for distinguishing the effects of nature and nurture. Here, we report the genomic and epigenomic sequences in skin fibroblasts of a discordant monozygotic twin pair with Rett syndrome, an X-linked neurodevelopmental disorder characterized by autistic features, epileptic seizures, gait ataxia and stereotypical hand movements. The twins shared the same de novo mutation in exon 4 of the MECP2 gene (G269AfsX288), which was paternal in origin and occurred during spermatogenesis. The XCI patterns in the twins did not differ in lymphocytes, skin fibroblasts, and hair cells (which originate from ectoderm, as does neuronal tissue). No reproducible differences were detected between the twins in single nucleotide polymorphisms (SNPs), insertion-deletion polymorphisms (indels), or copy number variations. Differences in DNA methylation between the twins were detected in fibroblasts in the upstream regions of genes involved in brain function and skeletal tissues such as Mohawk Homeobox (MKX), Brain-type Creatine Kinase (CKB), and FYN Tyrosine Kinase Protooncogene (FYN). The level of methylation in these upstream regions was inversely correlated with the level of gene expression. Thus, differences in DNA methylation patterns likely underlie the discordance in Rett phenotypes between the twins.

  2. Discovery of Single Nucleotide Polymorphisms and Mutations by Pyrosequencing

    OpenAIRE

    2006-01-01

    Comparative genomics, analyzing variation among individual genomes, is an area of intense investigation. DNA sequencing is usually employed to look for polymorphisms and mutations. Pyrosequencing, a real-time DNA sequencing method, is emerging as a popular platform for comparative genomics. Here we review the use of this technology for mutation scanning, polymorphism discovery and chemical haplotyping. We describe the methodology and accuracy of this technique and discuss how t...

  3. Psoriasis-Associated Genetic Polymorphism in North Indian Population in the CCHCR1 Gene and in a Genomic Segment Flanking the HLA-C Region

    Directory of Open Access Journals (Sweden)

    G. Gandhi

    2011-01-01

    Psoriasis is a common, chronic, recurrent, inflammatory, hyperproliferative disorder of the skin, which has a relatively high prevalence in the general population (0.6–4.8%). Linkage and association analyses in various populations have revealed a major locus for psoriasis susceptibility, PSORS1, at 6p21.3. Association of the disease with human leukocyte antigen (HLA) Cw6, corneodesmosin (CDSN) and the coiled-coil alpha-helical rod protein-1 (CCHCR1) has also been reported. Although the PSORS1 locus accounts for 30–50% of familial psoriasis in various global population groups, no studies have been published from the North Indian population. Some of the SNPs in HLA-C and CCHCR1 genes have been reported as markers for disease susceptibility. Therefore, in the present study, DNA samples from psoriasis patients from North India were genotyped for polymorphisms in CCHCR1 and HLA-C genes. The allele frequencies were calculated for patients and controls, and were compared for odds ratio and confidence interval values. SNPn.7*22222 (rs12208888), SNPn.7*22333 (rs12216025), SNPn.9*24118 (rs10456057), CCHCR1_386 (rs130065), CCHCR1_404 (rs130076) and CCHCR1_1364 (rs130071) were found to be significant in psoriasis patients. Linkage disequilibrium analysis revealed two haplotypes (rs12208888, rs2844608, rs12216025, rs10456057, rs130065, rs130066, rs130068, rs130269; and rs12208888, rs2844608, rs12216025, rs130076, rs130066, rs130068, rs130269, rs130071) as highly susceptible haplotypes for psoriasis in the cohort studied. Preliminary analysis of the data also suggests the possibility of ethnic-group-specific disease-related polymorphisms, pending validation in future studies.
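
    The case-control comparison described here is commonly summarized as an allelic odds ratio with a confidence interval. The sketch below uses the standard Woolf (log-scale) interval. The abstract does not say which method the authors used, so this choice is an assumption, and the allele counts are hypothetical rather than taken from the study.

```python
import math


def odds_ratio_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale) confidence interval.

    Arguments are allele counts; z=1.96 gives an approximate 95% interval.
    """
    counts = (case_alt, case_ref, ctrl_alt, ctrl_ref)
    if any(c <= 0 for c in counts):
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of ln(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(60, 40, 30, 70)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

    An interval that excludes 1 is the usual informal indication that an allele is associated with disease status.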

  4. A mixture detection method based on separate amplification using primer specific alleles of INDELs-a study based on two person's DNA mixture.

    Science.gov (United States)

    Liu, Jinding; Wang, Jiaqi; Zhang, Xiaojia; Li, Zeqin; Yun, Keming; Liu, Zhizhen; Zhang, Gengqian

    2017-02-01

    Samples containing unbalanced DNA mixtures from individuals often occur in forensic DNA examination and clinical detection. Because of PCR amplification bias, the minor contributor DNA is often masked by the major contributor DNA when using traditional STR or SNP typing techniques. Here we propose a method based on allele-specific Insertion/Deletion (INDEL) genotyping to detect DNA mixtures in forensic samples. Fourteen INDELs were surveyed in the Chinese Han population of Shanxi Province. The INDELs were amplified using two separate primer-specific reactions by real-time PCR. The difference in Ct value between the 2 reactions (D-value) was used to determine the INDEL types of single-source DNA, which were further confirmed by electrophoretic separation. The minor allele frequency (MAF) was above 0.2 in 10 INDELs. The detection limit was 0.3125 ng-1.25 ng template DNA for real-time PCR in all 14 INDEL markers. For single-source 10 ng DNA, the average D-value was 0.31 ± 0.14 for LS type, 6.96 ± 1.05 for LL type and 7.20 ± 1.09 for SS type. For the series of simulated DNA mixtures, the Ct value varied between the ranges of single-source DNA, depending on their INDEL typing and mixture ratios. This method can detect the specific allele of the minor DNA contributor at ratios as low as 1:50 in rs397782455 and rs397696936, and 1:100 in rs397832665, rs397822382 and rs397897230; the detection limit of the minor DNA contributor was as low as 1:500-1:1000 in the remaining INDEL markers, a much higher sensitivity than traditional STR typing. The D-value variation depended on the alteration of dilution ratio and INDEL types. When the dilution was 1:1000, the maximum and minimum D-values were 8.84 ± 0.11 in rs397897230 and 4.27 ± 0.19 in rs397897239 for LL and SS type mixtures, and 9.32 ± 0.54 in rs397897230 and 4.38 ± 0.26 in rs397897239 for LL(SS) and LS type mixtures, respectively. Any D-value between 0.86 and 5.11 in the 14

  5. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    Directory of Open Access Journals (Sweden)

    Walker M Andrew

    2006-09-01

    Background: The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results: There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10^-2 per base pair of DNA and the average INDEL frequency was 2.06 × 10^-2 per base pair of DNA. On average, 60.33% of the SNPs were synonymous while 39.67% were non-synonymous. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A subset of the strain-specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage-related functions. Conclusion: INDELs and strain specific genes

  6. Whole genome and exome sequencing realignment supports the assignment of KCNJ12, KCNJ17, and KCNJ18 paralogous genes in thyrotoxic periodic paralysis locus: functional characterization of two polymorphic Kir2.6 isoforms.

    Science.gov (United States)

    Paninka, Rolf M; Mazzotti, Diego R; Kizys, Marina M L; Vidi, Angela C; Rodrigues, Hélio; Silva, Silas P; Kunii, Ilda S; Furuzawa, Gilberto K; Arcisio-Miranda, Manoel; Dias-da-Silva, Magnus R

    2016-08-01

    Next-generation sequencing (NGS) has enriched the understanding of the human genome. However, homologous or repetitive sequences shared among genes frequently produce dubious alignments and can puzzle NGS mutation analysis, especially for paralogous potassium channels. Potassium inward rectifier (Kir) channels are important for establishing the resting membrane potential and regulating muscle excitability. Mutations in Kir channels cause disorders affecting the heart and skeletal muscle, such as arrhythmia and periodic paralysis. Recently, a susceptibility muscle channelopathy, thyrotoxic periodic paralysis (TPP), has been related to the Kir2.6 channel (KCNJ18 gene). Due to their high nucleotide sequence homology, variants found in the potassium channels Kir2.6 and Kir2.5 have been mistakenly attributed to Kir2.2 polymorphisms or mutations. We aimed at elucidating nucleotide misalignments by performing realignment of whole exome sequencing (WES) and whole genome sequencing (WGS) reads to specific Kir2.2, Kir2.5, and Kir2.6 cDNA sequences using a BWA-MEM/GATK pipeline. WES/WGS reads correctly aligned 26.9/43.2, 37.6/31.0, and 35.4/25.8 % to Kir2.2, Kir2.5, and Kir2.6, respectively. Realignment was able to reduce over 94 % of misalignments. No putative mutations of Kir2.6 were identified for the three TPP patients included in the cohort of 36 healthy controls using either WES or WGS. We also distinguished sequences for a single Kir2.2, a single Kir2.5 sequence, and two Kir2.6 isoforms, whose haplotypes were named RRAI and QHEV, based on changes at residues 39, 40, 56, and 249. Electrophysiology records on both Kir2.6_RRAI and _QHEV showed typical rectifying currents. In our study, the reduction of misalignments allowed the elucidation of paralogous gene sequences and two distinct Kir2.6 haplotypes, and pointed to the need for checking the frequency of these polymorphisms in other populations with different genetic backgrounds.

  7. Genomic sequencing analysis of Cryptococcus neoformans var grubii strains of two genotypes with different virulence and selection of virulence-associated genes

    Institute of Scientific and Technical Information of China (English)

    刘涛华; 王颜颜; 陈玉如; 赵亮; 吕倩; 牟丽丽; 康颖倩

    2016-01-01

    Objective: To analyze the genomic sequences of Cryptococcus neoformans var grubii strains of two genotypes with different virulence and to screen out the virulence-associated genes. Methods: A clinical strain (IFM56800) with the strongest virulence and an environmental strain (IFM56731) with the weakest virulence were selected for whole genome sequencing analysis. The sequencing results were comprehensively analyzed using comparative genomics. Genetic variations were extensively screened using the strategies of non-synonymous single nucleotide polymorphisms (nsSNPs), nonsense SNPs and insertions or deletions (InDels) causing frameshift mutations. The filtered genes were sequenced in 20 experimental strains. Total RNA was extracted and the full-length cDNAs were then sequenced using the rapid amplification of 5′ and 3′ cDNA ends (RACE) method. Results: By whole genome sequencing, valid data with high coverage (127 times and 111 times) were obtained for both the environmental strain IFM56731 and the clinical strain IFM56800. The InDel and SNP data were each statistically analyzed. Six genes were chosen for further analysis based on the strategies of nonsense SNPs and InDels causing frameshift mutations. The six genes were amplified and sequenced in all of the experimental strains, and three of them were further analyzed with cDNA sequencing. Ultimately, the location and structure of the CNAG_01032 gene were determined. The predicted nonsense mutation locus was verified to be present in the actual mRNA. Conclusion: The strategies of nonsense SNPs and InDels causing frameshift mutations were highly efficient in screening potential virulence-associated genes. The CNAG_01032 gene was screened out as a novel virulence-associated gene.

  8. Genomic profiling of thousands of candidate polymorphisms predicts risk of relapse in 778 Danish and German childhood acute lymphoblastic leukemia patients

    DEFF Research Database (Denmark)

    Wesolowska, Agata; Borst, L.; Dalgaard, Marlene Danner

    2015-01-01

    genome profiles associated with relapse risk in 352 patients from the Nordic ALL92/2000 protocols and 426 patients from the German Berlin–Frankfurt–Munster (BFM) ALL2000 protocol. Patients were enrolled between 1992 and 2008 (median follow-up: 7.6 years). Eleven cross-validated SNPs were significantly...... associated with risk of relapse across protocols. SNP and biologic pathway level analyses associated relapse risk with leukemia aggressiveness, glucocorticosteroid pharmacology/response and drug transport/metabolism pathways. Classification and regression tree analysis identified three distinct risk groups...

  9. Candidate driver genes involved in genome maintenance and DNA repair in Sézary syndrome.

    Science.gov (United States)

    Woollard, Wesley J; Pullabhatla, Venu; Lorenc, Anna; Patel, Varsha M; Butler, Rosie M; Bayega, Anthony; Begum, Nelema; Bakr, Farrah; Dedhia, Kiran; Fisher, Joshua; Aguilar-Duran, Silvia; Flanagan, Charlotte; Ghasemi, Aria A; Hoffmann, Ricarda M; Castillo-Mosquera, Nubia; Nuttall, Elisabeth A; Paul, Arisa; Roberts, Ceri A; Solomonidis, Emmanouil G; Tarrant, Rebecca; Yoxall, Antoinette; Beyers, Carl Z; Ferreira, Silvia; Tosi, Isabella; Simpson, Michael A; de Rinaldis, Emanuele; Mitchell, Tracey J; Whittaker, Sean J

    2016-06-30

    Sézary syndrome (SS) is a leukemic variant of cutaneous T-cell lymphoma (CTCL) and represents an ideal model for study of T-cell transformation. We describe whole-exome and single-nucleotide polymorphism array-based copy number analyses of CD4(+) tumor cells from untreated patients at diagnosis and targeted resequencing of 101 SS cases. A total of 824 somatic nonsynonymous gene variants were identified including indels, stop-gain/loss, splice variants, and recurrent gene variants indicative of considerable molecular heterogeneity. Driver genes identified using MutSigCV include POT1, which has not been previously reported in CTCL; and TP53 and DNMT3A, which were also identified consistent with previous reports. Mutations in PLCG1 were detected in 11% of tumors including novel variants not previously described in SS. This study is also the first to show BRCA2 defects in a significant proportion (14%) of SS tumors. Aberrations in PRKCQ were found to occur in 20% of tumors highlighting selection for activation of T-cell receptor/NF-κB signaling. A complex but consistent pattern of copy number variants (CNVs) was detected and many CNVs involved genes identified as putative drivers. Frequent defects involving the POT1 and ATM genes responsible for telomere maintenance were detected and may contribute to genomic instability in SS. Genomic aberrations identified were enriched for genes implicated in cell survival and fate, specifically PDGFR, ERK, JAK STAT, MAPK, and TCR/NF-κB signaling; epigenetic regulation (DNMT3A, ASLX3, TET1-3); and homologous recombination (RAD51C, BRCA2, POLD1). This study now provides the basis for a detailed functional analysis of malignant transformation of mature T cells and improved patient stratification and treatment.

  10. Comparative genomics of Tunisian Leishmania major isolates causing human cutaneous leishmaniasis with contrasting clinical severity.

    Science.gov (United States)

    Ghouila, Amel; Guerfali, Fatma Z; Atri, Chiraz; Bali, Aymen; Attia, Hanene; Sghaier, Rabiaa M; Mkannez, Ghada; Dickens, Nicholas J; Laouini, Dhafer

    2017-06-01

    Zoonotic cutaneous leishmaniasis caused by Leishmania (L.) major parasites affects urban and suburban areas in the center and south of Tunisia where the disease is endemo-epidemic. Several cases were reported in human patients for which infection due to L. major induced lesions with a broad range of severity. However, very little is known about the mechanisms underlying this diversity. Our hypothesis is that parasite genomic variability could, in addition to the host immunological background, contribute to the intra-species clinical variability observed in patients and explain the lesion size differences observed in the experimental model. Based on several epidemiological, in vivo and in vitro experiments, we focused on two clinical isolates showing contrasting severity in patients and in the BALB/c experimental mouse model. We used DNA-seq as a high-throughput technology to facilitate the identification of genetic variants with discriminating potential between both isolates. Our results demonstrate that various levels of heterogeneity could be found between both L. major isolates in terms of chromosome or gene copy number variation (CNV), and that the intra-species divergence could surprisingly be related to single nucleotide polymorphisms (SNPs) and insertion/deletion (InDel) events. Interestingly, we particularly focused here on genes affected by both types of variants and correlated them with the observed gene CNV. Whether these differences are sufficient to explain the severity in patients is obviously still open to debate, but we do believe that additional layers of -omic information are needed to complement the genomic screen in order to draw a more complete map of severity determinants.

  11. Genotyping by sequencing reveals the interspecific C. maxima / C. reticulata admixture along the genomes of modern citrus varieties of mandarins, tangors, tangelos, orangelos and grapefruits.

    Science.gov (United States)

    Oueslati, Amel; Salhi-Hannachi, Amel; Luro, François; Vignes, Hélène; Mournet, Pierre; Ollitrault, Patrick

    2017-01-01

    The mandarin horticultural group is an important component of world citrus production for the fresh fruit market. This group formerly classified as C. reticulata is highly polymorphic and recent molecular studies have suggested that numerous cultivated mandarins were introgressed by C. maxima (the pummelos). C. maxima and C. reticulata are also the ancestors of sweet and sour oranges, grapefruit, and therefore of all the "small citrus" modern varieties (mandarins, tangors, tangelos) derived from sexual hybridization between these horticultural groups. Recently, NGS technologies have greatly modified how plant evolution and genomic structure are analyzed, moving from phylogenetics to phylogenomics. The objective of this work was to develop a workflow for phylogenomic inference from Genotyping By Sequencing (GBS) data and to analyze the interspecific admixture along the nine citrus chromosomes for horticultural groups and recent varieties resulting from the combination of the C. reticulata and C. maxima gene pools. A GBS library was established from 55 citrus varieties, using the ApeKI restriction enzyme and selective PCR to improve the read depth. Diagnostic polymorphisms (DPs) of C. reticulata/C. maxima differentiation were identified and used to decipher the phylogenomic structure of the 55 varieties. The GBS approach was powerful and revealed 30,289 SNPs and 8,794 Indels with 12.6% of missing data. 11,133 DPs were selected covering the nine chromosomes with a higher density in genic regions. GBS combined with the detection of DPs was powerful for deciphering the "phylogenomic karyotypes" of cultivars derived from admixture of the two ancestral species after a limited number of interspecific recombinations. All the mandarins, mandarin hybrids, tangelos and tangors analyzed displayed introgression of C. maxima in different parts of the genome. C. reticulata/C. maxima admixture should be a major component of the high phenotypic variability of this germplasm opening

  12. Evolutionary Analyses of Hanwoo (Korean Cattle)-Specific Single-Nucleotide Polymorphisms and Genes Using Whole-Genome Resequencing Data of a Hanwoo Population

    Science.gov (United States)

    Lee, Daehwan; Cho, Minah; Hong, Woon-young; Lim, Dajeong; Kim, Hyung-Chul; Cho, Yong-Min; Jeong, Jin-Young; Choi, Bong-Hwan; Ko, Younhee; Kim, Jaebum

    2016-01-01

    Advances in next generation sequencing (NGS) technologies have enabled population-level studies for many animals to unravel the relationships between genotypic differences and traits of specific populations. The objective of this study was to perform evolutionary analysis of single nucleotide polymorphisms (SNP) in genes of Korean native cattle Hanwoo in comparison to SNP data from four other cattle breeds (Jersey, Simmental, Angus, and Holstein) and four related species (pig, horse, human, and mouse) obtained from public databases through NGS-based resequencing. We analyzed population structures and differentiation levels for the five cattle breeds and estimated species-specific SNPs with their origins and phylogenetic relationships among species. In addition, we identified Hanwoo-specific genes and proteins, and determined distinct changes in protein-protein interactions among five species (cattle, pig, horse, human, mouse) in the STRING network database by additionally considering indirect protein interactions. We found that the Hanwoo population was clearly different from the other four cattle populations. There were Hanwoo-specific genes related to its meat trait. Protein interaction rewiring analysis also confirmed that there were Hanwoo-specific protein-protein interactions that might have contributed to its unique meat quality. PMID:27640093

  13. Achromobacter xylosoxidans genomic characterization and correlation of randomly amplified polymorphic DNA profiles with relevant clinical features [corrected] of cystic fibrosis patients.

    Science.gov (United States)

    Magni, Annarita; Trancassini, Maria; Varesi, Paola; Iebba, Valerio; Curci, Anna; Pecoraro, Claudia; Cimino, Giuseppe; Schippa, Serena; Quattrucci, Serena

    2010-04-01

    Achromobacter xylosoxidans is an emerging pathogen increasingly being isolated from respiratory samples of cystic fibrosis (CF) patients. Its role and clinical significance in lung pathogenesis have not yet been clarified. The aim of the present study was to genetically characterize A. xylosoxidans strains isolated from CF patients by use of randomly amplified polymorphic DNA (RAPD) profiles and to look for a possible correlation between RAPD profiles and the patients' clinical features, such as their spirometry values, the presence of concomitant chronic bacterial flora at the time of isolation, and the persistent or intermittent presence of A. xylosoxidans strains. A set of 106 strains of A. xylosoxidans were typed by RAPD analysis, and their profiles were analyzed by agglomerative hierarchical classification (AHC) and associated with the patient characteristics mentioned above by factorial discriminant analysis (FDA). The overall results obtained in this study showed that (i) there is a marked genetic relationship between strains isolated from the same patients at different times, (ii) characteristic RAPD profiles are associated with different predicted classes for forced expiratory volume in 1 s (FEV1%), (iii) some characteristic RAPD profiles are associated with different concomitant chronic flora (CCF) profiles, and (iv) there is a significant division of RAPD profiles into "persistent strains" and "intermittent strains" of A. xylosoxidans. These findings seem to imply that the lung habitats found in CF patients are capable of shaping and selecting the colonizing bacterial flora, as seems to be the case for the A. xylosoxidans strains studied.

  14. Evolutionary Analyses of Hanwoo (Korean Cattle)-Specific Single-Nucleotide Polymorphisms and Genes Using Whole-Genome Resequencing Data of a Hanwoo Population.

    Science.gov (United States)

    Lee, Daehwan; Cho, Minah; Hong, Woon-Young; Lim, Dajeong; Kim, Hyung-Chul; Cho, Yong-Min; Jeong, Jin-Young; Choi, Bong-Hwan; Ko, Younhee; Kim, Jaebum

    2016-09-01

    Advances in next generation sequencing (NGS) technologies have enabled population-level studies for many animals to unravel the relationships between genotypic differences and traits of specific populations. The objective of this study was to perform evolutionary analysis of single nucleotide polymorphisms (SNP) in genes of Korean native cattle Hanwoo in comparison to SNP data from four other cattle breeds (Jersey, Simmental, Angus, and Holstein) and four related species (pig, horse, human, and mouse) obtained from public databases through NGS-based resequencing. We analyzed population structures and differentiation levels for the five cattle breeds and estimated species-specific SNPs with their origins and phylogenetic relationships among species. In addition, we identified Hanwoo-specific genes and proteins, and determined distinct changes in protein-protein interactions among five species (cattle, pig, horse, human, mouse) in the STRING network database by additionally considering indirect protein interactions. We found that the Hanwoo population was clearly different from the other four cattle populations. There were Hanwoo-specific genes related to its meat trait. Protein interaction rewiring analysis also confirmed that there were Hanwoo-specific protein-protein interactions that might have contributed to its unique meat quality.

  15. Prospects and progress on single nucleotide polymorphisms in human genome

    Institute of Scientific and Technical Information of China (English)

    王娟

    2006-01-01

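    Genomic DNA is the material basis of the various physiological and pathological traits of an organism. About 90% of human DNA sequence variation takes the form of single nucleotide polymorphisms (SNPs), a common type of genetic variation that is widespread in the human genome and is considered a decisive factor in human disease susceptibility and drug response. This paper introduces the classification and characteristics of SNPs, the current state of research on SNPs in the human genome, and the practical applications of SNPs, as well as research prospects for SNPs in genetic mapping, medicine, genetic susceptibility, and personalized medicine, and discusses problems in current SNP research.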

  16. Targeted Porcine Genome Engineering with TALENs

    DEFF Research Database (Denmark)

    Luo, Yonglun; Lin, Lin; Golas, Mariola Monika

    2015-01-01

    , including construction of sequence-specific TALENs, delivery of TALENs into primary porcine fibroblasts, and detection of TALEN-mediated cleavage, is described. This chapter is useful for scientists who are inexperienced with TALEN engineering of porcine cells as well as of other large animals....... confers precisely editing (e.g., mutations or indels) or insertion of a functional transgenic cassette to user-designed loci. Techniques for targeted genome engineering are growing dramatically and include, e.g., zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs...

  17. Skin Barrier Function Is Not Impaired and Kallikrein 7 Gene Polymorphism Is Frequently Observed in Korean X-linked Ichthyosis Patients Diagnosed by Fluorescence in Situ Hybridization and Array Comparative Genomic Hybridization.

    Science.gov (United States)

    Lee, Noo Ri; Yoon, Na Young; Jung, Minyoung; Kim, Ji-Yun; Seo, Seong Jun; Wang, Hye-Young; Lee, Hyeyoung; Sohn, Young Bae; Choi, Eung Ho

    2016-08-01

    X-linked ichthyosis (XLI) is a recessively inherited ichthyosis. The skin barrier function of XLI patients reported in Western countries has been described as normal or minimally abnormal. Here, we evaluated the skin barrier properties and a skin barrier-related gene mutation in 16 Korean XLI patients who were diagnosed by fluorescence in situ hybridization and array comparative genomic hybridization analysis. Skin barrier properties were measured, cytokine expression levels in the stratum corneum (SC) were evaluated in tape-stripped specimens from the skin surface, and a genetic test was done on blood. XLI patients showed significantly lower SC hydration, but normal basal trans-epidermal water loss and skin surface pH as compared to a healthy control group. Histopathology of ichthyosis epidermis showed no acanthosis, and levels of the pro-inflammatory cytokines in the corneal layer did not differ between control and lesional/non-lesional skin of XLI patients. Among the mutations in filaggrin (FLG), kallikrein 7 (KLK7), and SPINK5 genes, the prevalence of KLK7 gene mutations was significantly higher in XLI patients (50%) than in controls (0%), whereas FLG and SPINK5 prevalence was comparable. Korean XLI patients exhibited unimpaired skin barrier function and frequent association with the KLK7 gene polymorphism, which may differentiate them from Western XLI patients.

  18. Drd4 gene polymorphisms are associated with personality variation in a passerine bird.

    Science.gov (United States)

    Fidler, Andrew E; van Oers, Kees; Drent, Piet J; Kuhn, Sylvia; Mueller, Jakob C; Kempenaers, Bart

    2007-07-22

    Polymorphisms in several neurotransmitter-associated genes have been associated with variation in human personality traits. Among the more promising of such associations is that between the human dopamine receptor D4 gene (Drd4) variants and novelty-seeking behaviour. However, genetic epistasis, genotype-environment interactions and confounding environmental factors all act to obscure genotype-personality relationships. Such problems can be addressed by measuring personality under standardized conditions and by selection experiments, with both approaches only feasible with non-human animals. Looking for similar Drd4 genotype-personality associations in a free-living bird, the great tit (Parus major), we detected 73 polymorphisms (66 SNPs, 7 indels) in the P. major Drd4 orthologue. Two of the P. major Drd4 gene polymorphisms were investigated for evidence of association with novelty-seeking behaviour: a coding region synonymous single nucleotide polymorphism (SNP830) and a 15bp indel (ID15) located 5' to the putative transcription initiation site. Frequencies of the three Drd4 SNP830 genotypes, but not the ID15 genotypes, differed significantly between two P. major lines selected over four generations for divergent levels of 'early exploratory behaviour' (EEB). Strong corroborating evidence for the significance of this finding comes from the analysis of free-living, unselected birds where we found a significant association between SNP830 genotypes and differing mean EEB levels. These findings suggest that an association between Drd4 gene polymorphisms and animal personality variation predates the divergence of the avian and mammalian lineages. Furthermore, this work heralds the possibility of following microevolutionary changes in frequencies of behaviourally relevant Drd4 polymorphisms within populations where natural selection acts differentially on different personality types.

  19. Identification of Single-Nucleotide Polymorphic Loci Associated with Biomass Yield under Water Deficit in Alfalfa (Medicago sativa L.) Using Genome-Wide Sequencing and Association Mapping.

    Science.gov (United States)

    Yu, Long-Xi

    2017-01-01

    Alfalfa is a forage crop grown worldwide and is important due to its high biomass production and nutritional value. However, the production of alfalfa is challenged by adverse environmental factors such as drought and other stresses. Developing drought-resistant alfalfa is an important breeding target for enhancing alfalfa productivity in arid and semi-arid regions. In the present study, we used genotyping-by-sequencing and genome-wide association to identify marker loci associated with biomass yield under drought in the field in a panel of diverse alfalfa germplasm. A total of 28 markers at 22 genetic loci were associated with yield under water deficit, whereas only four markers were associated with the same trait under well-watered conditions. Comparisons of marker-trait associations between water-deficit and well-watered conditions showed no similarity, with one exception. Most of the markers were identical across harvest periods within the treatment, although different levels of significance were found among the three harvests. The loci associated with biomass yield under water deficit were located throughout all chromosomes in the alfalfa genome, in agreement with previous reports. Our results suggest that biomass yield under drought is a complex quantitative trait with polygenic inheritance and may involve a mechanism different from that under non-stress conditions. BLAST searches of the flanking sequences of the associated loci against DNA databases revealed several stress-responsive genes linked to the drought resistance loci, including leucine-rich repeat receptor-like kinase, B3 DNA-binding domain protein, translation initiation factor IF2, and phospholipase-like protein. With further investigation, those markers closely linked to drought resistance can be used for marker-assisted selection (MAS) to accelerate the development of new alfalfa cultivars with improved resistance to drought and other abiotic stresses.

  20. Identification of Single-Nucleotide Polymorphic Loci Associated with Biomass Yield under Water Deficit in Alfalfa (Medicago sativa L.) Using Genome-Wide Sequencing and Association Mapping

    Directory of Open Access Journals (Sweden)

    Long-Xi Yu

    2017-06-01

    Alfalfa is a forage crop grown worldwide and is important due to its high biomass production and nutritional value. However, the production of alfalfa is challenged by adverse environmental factors such as drought and other stresses. Developing drought-resistant alfalfa is an important breeding target for enhancing alfalfa productivity in arid and semi-arid regions. In the present study, we used genotyping-by-sequencing and genome-wide association to identify marker loci associated with biomass yield under drought in the field in a panel of diverse alfalfa germplasm. A total of 28 markers at 22 genetic loci were associated with yield under water deficit, whereas only four markers were associated with the same trait under well-watered conditions. Comparisons of marker-trait associations between water-deficit and well-watered conditions showed no similarity, with one exception. Most of the markers were identical across harvest periods within the treatment, although different levels of significance were found among the three harvests. The loci associated with biomass yield under water deficit were located throughout all chromosomes in the alfalfa genome, in agreement with previous reports. Our results suggest that biomass yield under drought is a complex quantitative trait with polygenic inheritance and may involve a mechanism different from that under non-stress conditions. BLAST searches of the flanking sequences of the associated loci against DNA databases revealed several stress-responsive genes linked to the drought resistance loci, including leucine-rich repeat receptor-like kinase, B3 DNA-binding domain protein, translation initiation factor IF2, and phospholipase-like protein. With further investigation, those markers closely linked to drought resistance can be used for marker-assisted selection (MAS) to accelerate the development of new alfalfa cultivars with improved resistance to drought and other abiotic stresses.