WorldWideScience

Sample records for gene sequence variation

  1. Sequence variations in the FAD2 gene in seeded pumpkins.

    Science.gov (United States)

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-12-21

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones were obtained from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia (1152 bp in total; a single gene without introns). Nucleotide identities of more than 90% were detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion or deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase), with amino acid sequence identities from 91.7 to 100% over 384 amino acids. The same main functional domain, spanning amino acids 47 to 329, was identified. Clustering by the unweighted pair group method with arithmetic mean (UPGMA), based on the sequence differences, placed the four species in separate clusters. Geographic origin and species were found to be closely related to sequence variation in FAD2.
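
    The identity figures quoted in this abstract (more than 90% at the nucleotide level, 91.7 to 100% at the amino acid level) are pairwise comparisons between aligned sequences. As a minimal sketch of how such a figure can be computed, assuming the two sequences are already aligned to equal length and that gapped columns are skipped (the abstract does not say how the authors treated gaps, and the function name `percent_identity` is hypothetical):

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same alignment, so they have equal
    length and '-' marks a gap. Columns containing a gap are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues agree
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

    For example, `percent_identity("ATGCATGC", "ATGAATGC")` compares eight columns, finds seven matches, and returns 87.5. Other conventions, such as counting gapped columns as mismatches, would give lower values for sequences with insertions or deletions.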

  2. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    Science.gov (United States)

    Barbier, P; Ishihama, A

    1990-07-01

    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  3. Tandem gene arrays in Trypanosoma brucei: Comparative phylogenomic analysis of duplicate sequence variation

    Directory of Open Access Journals (Sweden)

    Jackson Andrew P

    2007-04-01

    Background: The genome sequence of the protistan parasite Trypanosoma brucei contains many tandem gene arrays. Gene duplicates are created through tandem duplication and are expressed through polycistronic transcription, suggesting that the primary purpose of long, tandem arrays is to increase gene dosage in an environment where individual gene promoters are absent. This report presents the first account of the tandem gene arrays in the T. brucei genome, employing several related genome sequences to establish how variation is created and removed. Results: A systematic survey of tandem gene arrays showed that substantial sequence variation existed across the genome; variation from different regions of an array often produced inconsistent phylogenetic affinities. Phylogenetic relationships of gene duplicates were consistent with concerted evolution being a widespread homogenising force. However, tandem duplicates were not usually identical; therefore, any homogenising effect was coincident with divergence among duplicates. Allelic gene conversion was detected using various criteria and was apparently able to both remove and introduce sequence variation. Tandem arrays containing structural heterogeneity demonstrated how sequence homogenisation and differentiation can occur within a single locus. Conclusion: The use of multiple genome sequences in a comparative analysis of tandem gene arrays identified substantial sequence variation among gene duplicates. The distribution of sequence variation is determined by a dynamic balance of conservative and innovative evolutionary forces. Gene trees from various species showed that intraspecific duplicates evolve in concert, perhaps through frequent gene conversion, although this does not prevent sequence divergence, especially where structural heterogeneity physically separates a duplicate from its neighbours. In describing dynamics of sequence variation that have consequences beyond gene dosage, this

  4. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like 1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. Using MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for CCL3L1 alone. The four available assays measured median copies of 2 and 3-4 in European and African American subjects, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  5. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    Directory of Open Access Journals (Sweden)

    Qing-Ming An

    2015-11-01

    The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in each of region-1 and region-2, with seven and six SNPs being revealed, respectively. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) were observed in each of region-4 and region-5, with two SNPs and one SNP, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed among the eight breeds. Across region-1 and region-3, nine haplotypes were identified, and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were the most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to determine whether the variation in the gene is associated with animal production traits.

  6. NUCLEOTIDE SEQUENCE VARIATION IN LEPTIN GENE OF MURRAH BUFFALO (BUBALUS BUBALIS)

    Directory of Open Access Journals (Sweden)

    Sanjoy Datta

    2012-12-01

    Leptin is a 16 kD protein synthesized by adipose tissue and involved in the regulation of feed intake, energy balance, fertility and immune functions. The present study was undertaken to characterize the sequence and study the nucleotide variation of the leptin gene in Murrah buffalo. The leptin gene consists of three exons and two introns and spans about 18.9 kb; the first exon is not translated into protein. In buffaloes, the leptin gene is located on chromosome eight and maps to BBU 8q32. The leptin gene was amplified by PCR using oligonucleotide primers to obtain a 289 bp fragment comprising exon 2 and a 405 bp fragment containing exon 3. The amplicons were sequenced to identify variation at the nucleotide level. Sequence comparison of buffalo with cattle reveals variation at five nucleotide positions (983, 1083, 1147, 1152 and 1221); all the SNPs are synonymous, resulting in no change in amino acids. Three of these eight nucleotide variations have been reported for the first time in buffalo. The results indicate conservation of DNA sequence between cattle and buffalo. Comparison of the leptin gene sequences of Bubalus bubalis and Bos taurus revealed 97% nucleotide identity.

  7. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    Directory of Open Access Journals (Sweden)

    Slavé Petrovski

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding

  8. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    Science.gov (United States)

    Petrovski, Slavé; Gussow, Ayal B; Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H; Allen, Andrew S; Goldstein, David B

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, nc

  9. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    Directory of Open Access Journals (Sweden)

    Kaas Rolf S

    2012-10-01

    Background: Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results: We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment, and a pan-genome tree, based on gene presence/absence, map the relatedness of the 186 sequenced E. coli genomes. The core-gene tree displays high confidence and divides the E. coli strains into the observed MLST type clades and also separates defined phylotypes. Conclusion: The results of comparing a large and diverse E. coli dataset support the theory that reliable and good resolution phylogenies can be inferred from the core-genome. The results further suggest that the resolution at the isolate level may subsequently be improved by targeting more variable genes. The use of whole genome sequencing will make it possible to eliminate, or at least reduce, the need for several typing steps used in traditional epidemiology.
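
    The distinction this abstract draws between the strict core (clusters present in 100% of genomes) and the 'soft core' (clusters present in at least 95% of genomes) amounts to a threshold on a presence/absence table. The following is a minimal sketch under stated assumptions: the input layout (a mapping from cluster name to the set of genomes containing it) and the function name `clusters_at_threshold` are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Set


def clusters_at_threshold(
    presence: Dict[str, Set[str]],
    n_genomes: int,
    min_fraction: float = 1.0,
) -> Set[str]:
    """Return the gene clusters found in at least `min_fraction` of the genomes.

    presence     -- hypothetical layout: cluster name -> set of genome names
    n_genomes    -- total number of genomes in the comparison
    min_fraction -- 1.0 gives the strict core, 0.95 the 'soft core'
    """
    if n_genomes <= 0:
        raise ValueError("n_genomes must be positive")
    return {
        cluster
        for cluster, genomes in presence.items()
        if len(genomes) / n_genomes >= min_fraction
    }
```

    Calling the function once with `min_fraction=1.0` and once with `min_fraction=0.95` yields the two sets described in the abstract; the union of all cluster names in `presence` corresponds to the pan-genome. A lower threshold tolerates genes missing from a few draft-quality assemblies, which is the motivation the abstract gives for the soft core.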

  10. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    Energy Technology Data Exchange (ETDEWEB)

    Macke, J.P.; Nathans, J.; King, V.L. (Johns Hopkins Univ., Baltimore, MD (United States)); Hu, N.; Hu, S.; Hamer, D.; Bailey, M. (Northwestern Univ., Evanston, IL (United States)); Brown, T. (Johns Hopkins Univ. School of Hygiene and Public Health, Baltimore, MD (United States))

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, Ser205-to-Arg and Glu793-to-Asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  11. The Sequence Variations of Intron-3 of the α-Amylase Gene in Adzuki Bean

    Institute of Scientific and Technical Information of China (English)

    JIN Wen-lin; Yamaguchi Hirofumi; Isigami Matiko; Yasuda Kentaro

    2003-01-01

    This study describes variation in intron-3 of the α-amylase gene from 156 breeds of adzuki beans using SSCP (single-strand conformation polymorphism) analysis. Based on the α-amylase gene structure and sequence, a pair of PCR primers, F (CCTACATTCTAACACACCCT) and R (GCATATTGTGCCAGTACAAT), was designed to amplify intron-3 fragments of the α-amylase gene. Fourteen variant types were detected, including 13, 9, 10 and 4 variant types in the wild, weed, locally cultivated and modern bred adzuki beans, respectively; 9, 8 and 7 variant types in the wild adzuki beans from Japan, China and Korea, respectively; and some other variant types in the local adzuki beans from China and Bhutan. In the experiment, 60% of the cultivated races were found to be of the EE type. In addition, sequence analysis of intron-3 of the α-amylase gene from 8 variant types reveals the evolutionary process of the various variant types in adzuki beans.

  12. Exome sequencing and arrayCGH detection of gene sequence and copy number variation between ILS and ISS mouse strains.

    Science.gov (United States)

    Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M

    2014-06-01

    It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to

  13. Variation in the sequence and modification state of the human insulin gene flanking regions.

    Science.gov (United States)

    Ullrich, A; Dull, T J; Gray, A; Philips, J A; Peter, S

    1982-04-10

    The nucleotide sequence of a highly repetitive sequence region upstream from the human insulin gene is reported. The length of this region varies between alleles in the population, and appears to be stably transmitted to the next generation in a Mendelian fashion. There is no significant correlation between the length of this sequence and two types of diabetes mellitus. We observe variation in the cleavability of a BglI recognition site downstream from the human insulin gene, which is probably due to variable nucleotide modification. This presumed modification state appears not to be inherited, and varies between tissues within an individual and between individuals for a given tissue. Both alleles in a given tissue DNA sample are modified to the same extent.

  14. Understanding gene sequence variation in the context of transcription regulation in yeast.

    Directory of Open Access Journals (Sweden)

    Irit Gat-Viks

    2010-01-01

    DNA sequence polymorphism in a regulatory protein can have a widespread transcriptional effect. Here we present a computational approach for analyzing modules of genes with a common regulation that are affected by specific DNA polymorphisms. We identify such regulatory-linkage modules by integrating genotypic and expression data for individuals in a segregating population with complementary expression data of strains mutated in a variety of regulatory proteins. Our procedure searches simultaneously for groups of co-expressed genes, for their common underlying linkage interval, and for their shared regulatory proteins. We applied the method to a cross between laboratory and wild strains of S. cerevisiae, demonstrating its ability to correctly suggest modules and to outperform extant approaches. Our results suggest that middle sporulation genes are under the control of polymorphism in the sporulation-specific tertiary complex Sum1p/Rfm1p/Hst1p. In another example, our analysis reveals novel inter-relations between Swi3 and two mitochondrial inner membrane proteins underlying variation in a module of aerobic cellular respiration genes. Overall, our findings demonstrate that this approach provides a useful framework for the systematic mapping of quantitative trait loci and their role in gene expression variation.

  15. Natural variation in CBF gene sequence, gene expression and freezing tolerance in the Versailles core collection of Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Brunel Dominique

    2008-10-01

    Background: Plants from temperate regions are able to withstand freezing temperatures due to a process known as cold acclimation, which is a prior exposure to low, but non-freezing temperatures. During acclimation, a large number of genes are induced, bringing about biochemical changes in the plant, thought to be responsible for the subsequent increase in freezing tolerance. Key regulatory proteins in this process are the CBF1, 2 and 3 transcription factors which control the expression of a set of target genes referred to as the "CBF regulon". Results: To assess the role of the CBF genes in cold acclimation and freezing tolerance of Arabidopsis thaliana, the CBF genes and their promoters were sequenced in the Versailles core collection, a set of 48 accessions that maximizes the naturally-occurring genetic diversity, as well as in the commonly used accessions Col-0 and WS. Extensive polymorphism was found in all three genes. Freezing tolerance was measured in all accessions to assess the variability in acclimated freezing tolerance. The effect of sequence polymorphism was investigated by evaluating the kinetics of CBF gene expression, as well as that of a subset of the target COR genes, in a set of eight accessions with contrasting freezing tolerance. Our data indicate that CBF genes as well as the selected COR genes are cold induced in all accessions, irrespective of their freezing tolerance. Although we observed different levels of expression in different accessions, CBF or COR gene expression was not closely correlated with freezing tolerance. Conclusion: Our results indicate that the Versailles core collection contains significant natural variation with respect to freezing tolerance, polymorphism in the CBF genes and CBF and COR gene expression. Although there tends to be more CBF and COR gene expression in tolerant accessions, there are exceptions, reinforcing the idea that a complex network of genes is involved in freezing tolerance

  16. Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure

    Directory of Open Access Journals (Sweden)

    Olfa Siala

    2010-01-01

    In this study, we detected new sequence variations in the LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed, followed by bioinformatics analyses using ESEfinder and MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085+12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisian, Moroccan, Algerian, Lebanese and French populations, suggesting that they represent novel polymorphisms within the LAMA2 and SGCG gene sequences. ESEfinder showed that the c.*102A/C substitution created a new exon splicing enhancer in the 3'UTR of the SGCG gene, whereas the c.6085+12delA deletion was situated in the base-pairing region between LAMA2 mRNA and the U1 snRNA spliceosomal component. The RNA structure analyses showed that both variations modulated RNA secondary structure. Our results are suggestive of correlations between mRNA folding and the recruitment of spliceosomal components mediating splicing, including SR proteins. Understanding how common sequence variations contribute to mRNA structural and functional diversity will support better study of gene expression.

  17. A molecular footprint of limb loss: sequence variation of the autopodial identity gene Hoxa-13.

    Science.gov (United States)

    Kohlsdorf, Tiana; Cummings, Michael P; Lynch, Vincent J; Stopper, Geffrey F; Takahashi, Kazuhiko; Wagner, Günter P

    2008-12-01

    The homeobox gene Hoxa-13 codes for a transcription factor involved in multiple functions, including body axis and hand/foot development in tetrapods. In this study we investigate whether the loss of one function (e.g., limb loss in snakes) left a molecular footprint in exon 1 of Hoxa-13 that could be associated with the release of functional constraints caused by limb loss. Fragments of the Hoxa-13 exon 1 were sequenced from 13 species and analyzed, with additional published sequences of the same region, using relative rates and likelihood-ratio tests. Five amino acid sites in exon 1 of Hoxa-13 were detected as evolving under positive selection in the stem lineage of snakes. To further investigate whether there is an association between limb loss and sequence variation in Hoxa-13, we used the random forest method on an alignment that included shark, basal fish lineages, and "eu-tetrapods" such as mammals, turtle, alligator, and birds. The random forest method approaches the problem as one of classification, where we seek to predict the presence or absence of autopodium based on amino acid variation in Hoxa-13 sequences. Different alignments tested were associated with similar error rates (18.42%). The random forest method suggested that phenotypic states (autopodium present and absent) can often be correctly predicted based on Hoxa-13 sequences. Basal, nontetrapod gnathostomes that never had an autopodium were consistently classified as limbless together with the snakes, while eu-tetrapods without any history of limb loss in their phylogeny were also consistently classified as having a limb. Misclassifications affected mostly lizards, which, as a group, have a history of limb loss and limb re-evolution, and the urodele and caecilian in our sample. We conclude that a molecular footprint can be detected in Hoxa-13 that is associated with the lack of an autopodium; groups with classification ambiguity (lizards) are characterized by a history of repeated limb loss

  18. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  19. Plasmodium falciparum antigenic variation. Mapping mosaic var gene sequences onto a network of shared, highly polymorphic sequence blocks.

    Science.gov (United States)

    Bull, Peter C; Buckee, Caroline O; Kyes, Sue; Kortok, Moses M; Thathy, Vandana; Guyah, Bernard; Stoute, José A; Newbold, Chris I; Marsh, Kevin

    2008-06-01

    Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) is a potentially important family of immune targets, encoded by an extremely diverse gene family called var. Understanding of the genetic organization of var genes is hampered by sequence mosaicism that results from a long history of non-homologous recombination. Here we have used software designed to analyse social networks to visualize the relationships between large collections of short var sequence tags sampled from clinical parasite isolates. In this approach, two sequences are connected if they share one or more highly polymorphic sequence blocks. The results show that the majority of analysed sequences, including several var-like sequences from the chimpanzee parasite Plasmodium reichenowi, can be either directly or indirectly linked together in a single unbroken network. However, the network is highly structured and contains putative subgroups of recombining sequences. The major subgroup contains the previously described group A var genes, previously proposed to be genetically distinct. Another subgroup contains sequences found to be associated with rosetting, a parasite virulence phenotype. The mosaic structure of the sequences and their division into subgroups may reflect the conflicting problems of maximizing antigenic diversity and minimizing epitope sharing between variants while maintaining their host cell binding functions.
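
    The linking rule stated in this abstract, that two sequences are connected if they share one or more polymorphic sequence blocks, defines a graph whose connected components are the sets of sequences that are directly or indirectly linked. The sketch below illustrates only that rule. It is not the social-network software the authors used, and the input layout (sequence name mapped to a set of block identifiers) and the function name `linked_groups` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def linked_groups(blocks_by_seq: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group sequences that are directly or indirectly linked by shared blocks.

    blocks_by_seq -- hypothetical layout: sequence name -> set of block ids.
    Two sequences are neighbours if their block sets intersect; the returned
    groups are the connected components of that graph.
    """
    # Reverse index so neighbours can be found without comparing all pairs.
    seqs_by_block: Dict[str, Set[str]] = defaultdict(set)
    for seq, blocks in blocks_by_seq.items():
        for block in blocks:
            seqs_by_block[block].add(seq)

    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in blocks_by_seq:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited sequence.
        group: Set[str] = set()
        stack = [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            group.add(current)
            for block in blocks_by_seq[current]:
                for neighbour in seqs_by_block[block]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        groups.append(group)
    return groups
```

    With this representation, the finding that most sequences fall into "a single unbroken network" would correspond to one group containing the majority of the input sequences. Detecting the structured subgroups the abstract mentions would need a further community-detection step that this sketch does not attempt.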

  20. Analysis of the sequence variations in the Mhc DRB1-like gene of the endangered Humboldt penguin (Spheniscus humboldti).

    Science.gov (United States)

    Kikkawa, Eri F; Tsuda, Tomi T; Naruse, Taeko K; Sumiyama, Daisuke; Fukuda, Michio; Kurita, Masanori; Murata, Koichi; Wilson, Rory P; LeMaho, Yvon; Tsuda, Michio; Kulski, Jerzy K; Inoko, Hidetoshi

    2005-04-01

    The Major Histocompatibility Complex (Mhc) genomic region of many vertebrates is known to contain at least one highly polymorphic class II gene that is homologous in sequence to one or other of the human Mhc DRB1 class II genes. The diversity of the avian Mhc class II gene sequences has been extensively studied in chickens, quails, and some songbirds, but has been largely ignored in the oceanic birds, including the flightless penguins. We have previously reported that several penguin species have a high degree of polymorphism in exon 2 of the Mhc class II DRB1-like gene. In this study, we present for the first time the complete nucleotide sequences of exon 2, intron 2, and exon 3 of the DRB1-like gene of 20 Humboldt penguins, a species that is presently vulnerable to the dangers of extinction. The Humboldt DRB1-like nucleotide and amino acid sequences reveal at least eight unique alleles. Phylogenetic analysis of all the available avian DRB-like sequences showed that, of five penguin species and nine other bird species, the sequences of the Humboldt penguins grouped most closely to the Little penguin and the mallard, respectively. The present analysis confirms that the sequence variations of the Mhc class II gene, DRB1, are useful for discriminating among individuals within the same penguin population as well as those within different penguin population groups and species.

  1. Multiple Cis-Acting Sequences Contribute to Evolved Regulatory Variation for Drosophila Adh Genes

    Science.gov (United States)

    Fang, X. M.; Brennan, M. D.

    1992-01-01

    Drosophila affinidisjuncta and Drosophila hawaiiensis are closely related species that display distinct tissue-specific expression patterns for their homologous alcohol dehydrogenase genes (Adh genes). In Drosophila melanogaster transformants, both genes are expressed at high levels in the larval and adult fat bodies, but the D. affinidisjuncta gene is expressed 10-50-fold more strongly in the larval and adult midguts and Malpighian tubules. The present study reports the mapping of cis-acting sequences contributing to the regulatory differences between these two genes in transformants. Chimeric genes were constructed and introduced into the germ line of D. melanogaster. Stage- and tissue-specific expression patterns were determined by measuring steady-state RNA levels in larvae and adults. Three portions of the promoter region make distinct contributions to the tissue-specific regulatory differences between the native genes. Sequences immediately upstream of the distal promoter have a strong effect in the adult Malpighian tubules, while sequences between the two promoters are relatively important in the larval Malpighian tubules. A third gene segment, immediately upstream of the proximal promoter, influences levels of the proximal Adh transcript in all tissues and developmental stages examined, and largely accounts for the regulatory difference in the larval and adult midguts. However, these as well as other sequences make smaller contributions to various aspects of the tissue-specific regulatory differences. In addition, some chimeric genes display aberrant RNA levels for the whole organism, suggesting close physical association between sequences involved in tissue-specific regulatory differences and those important for Adh expression in the larval and adult fat bodies. PMID:1644276

  2. Molecular phylogenetic and sequence variation analysis of dimeric α-amylase inhibitor genes in wheat and its wild relative species

    Directory of Open Access Journals (Sweden)

    Bharati Pandey

    2016-06-01

    Dimeric alpha-amylase inhibitors provide protection against insects that are highly dependent on starch for their energy. To study their molecular evolution and sequence variation, we sequenced dimeric α-amylase inhibitor genes from different genomes in Triticeae, including Indian bread and durum wheat genotypes. In BLAST searches, the obtained sequences showed very high homology with other inhibitors available in the GenBank database and shared 10 conserved cysteine residues. The frequency of significant SNPs in the α-amylase inhibitor gene was 1 per 60 bases. Phylogenetic analysis based on deduced amino acid sequences revealed that the genes encoding dimeric α-amylase inhibitors formed three groups and that genes isolated from Indian bread wheat clustered with 0.19 inhibitors. In addition, we predicted that dimeric α-amylase inhibitors co-localize to chloroplasts and mitochondria, except for the sequences isolated from Aegilops tauschii. Fingerprinting analysis done with ScanProsite confirmed biologically meaningful signatures. Multiple sequence alignment of dimeric α-amylase proteins from different plant species revealed a conserved secondary structure region, indicating homology at the sequence and structural levels. The protein sequences obtained from wheat and its wild relatives are very similar, indicating high conservation of these proteins.

  3. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Science.gov (United States)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  4. Sequence variation in the alpha-toxin encoding plc gene of Clostridium perfringens strains isolated from diseased and healthy chickens

    DEFF Research Database (Denmark)

    Abildgaard, L; Engberg, RM; Pedersen, Karl

    2009-01-01

    The aim of the present study was to analyse the genetic diversity of the alpha-toxin encoding plc gene and the variation in alpha-toxin production of Clostridium perfringens type A strains isolated from presumably healthy chickens and chickens suffering from either necrotic enteritis (NE) or cholangio-hepatitis. The alpha-toxin encoding plc genes from 60 different pulsed-field gel electrophoresis (PFGE) types (strains) of C. perfringens were sequenced and translated in silico to amino acid sequences and the alpha-toxin production was investigated in batch cultures of 45 of the strains using an enzyme...

  5. DNA sequence and haplotype variation in two candidate genes for dilated cardiomyopathy in the turkey Meleagris gallopavo.

    Science.gov (United States)

    Lin, Kuan-chin; Xu, Jun; Kamara, Davida; Geng, Tuoyu; Gyenai, Kwaku; Reed, Kent M; Smith, Edward J

    2007-05-01

    Determining variation in genes is fundamental to understanding their function in the disease state. Cardiac troponin T (cTnT) and phospholamban (PLN) genes have been implicated in dilated cardiomyopathy (DCM) in human and model species. To investigate the role of these 2 candidate genes in DCM in the turkey Meleagris gallopavo, understanding sequence variants and map position distribution is necessary. To this end, a total of 1854 and 1771 bp of cTnT and PLN gene sequences, respectively, were scanned for single nucleotide polymorphisms (SNPs) in a randomly bred population. A total of 15 SNPs was identified in the cTnT and PLN genomic sequences. Nine haplotypes, 5 in cTnT and 4 in PLN, were identified. Observed heterozygosities (0.02-0.39) in the turkey population were low for both genes. Within each gene, 1 SNP corresponding to a restriction enzyme site was identified and used to develop a PCR-restriction fragment length polymorphism (RFLP) genotyping assay. The PLN gene was genetically mapped to turkey chromosome 2, equivalent to Gallus gallus chromosome 3, and cTnT mapped to a turkey microchromosome. Although limited because of the relatively small sample size of 55 birds, the data from this SNP analysis of PLN and cTnT provide a foundation from which to evaluate the function of cTnT and PLN in the turkey. Information about the distribution of the SNPs and haplotypes will facilitate future association and linkage studies.

  6. Capturing sequence variation among flowering-time regulatory gene homologues in the allopolyploid crop species Brassica napus

    Directory of Open Access Journals (Sweden)

    Sarah Schiessl

    2014-08-01

    Flowering, the transition from the vegetative to the generative phase, is a decisive time point in the lifecycle of a plant. Flowering is controlled by a complex network of transcription factors, photoreceptors, enzymes and miRNAs. In recent years, several studies gave rise to the hypothesis that this network is also strongly involved in the regulation of other important lifecycle processes ranging from germination and seed development through to fundamental developmental and yield-related traits. In the allopolyploid crop species Brassica napus (genome AACC), homoeologous copies of flowering time regulatory genes are implicated in major phenological variation within the species; however, the extent and control of intraspecific and intergenomic variation among flowering-time regulators is still unclear. To investigate differences among B. napus morphotypes in relation to flowering-time gene variation, we performed targeted deep sequencing of 29 regulatory flowering-time genes in four genetically and phenologically diverse B. napus accessions. The genotype panel included a winter-type oilseed rape, a winter fodder rape, a spring-type oilseed rape (all B. napus ssp. napus) and a swede (B. napus ssp. napobrassica), which show extreme differences in winter-hardiness, vernalization requirement and flowering behaviour. A broad range of genetic variation was detected in the targeted genes for the different morphotypes, including non-synonymous SNPs, copy number variation and presence-absence variation. The results suggest that this broad variation in vernalisation, clock and signaling genes could be a key driver of morphological differentiation for flowering-related traits in this recent allopolyploid crop species.

  7. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  8. Sequence variation within the neuropeptide Y gene and obesity in Mexican Americans.

    Science.gov (United States)

    Bray, M S; Boerwinkle, E; Hanis, C L

    2000-05-01

    Recently, we reported evidence for linkage between neuropeptide Y (NPY) and both obesity and several obesity-related quantitative measures in a sample of Mexican Americans from Starr County, Texas. The purpose of this study was to investigate putative variation within the coding and promoter regions of NPY. Five young, obese individuals (body mass index [BMI] 33 to 45 kg/m2, age 14 to 30 years); five adult, lean individuals (BMI 20 to 26 kg/m2, age 39 to 65 years); and five sibling pairs sharing no alleles that were identical by descent at a marker locus proximal to NPY were selected for fluorescence-based sequencing of approximately 1100 base pairs (bp) immediately 5' from the start site and all four exons of NPY. We identified a total of eight variant sites, including a 2-bp insertion/deletion (I/D) within a putative negative regulatory region (-880I/D) and a 17-bp deletion at the exon 1/intron 1 junction (69I/D). The -880I/D and 69I/D variants were typed in a separate random sample of Mexican Americans (N = 914) from Starr County, Texas. Analyses of variance resulted in a significant association between -880I/D and waist-to-hip ratio (p = 0.041) in the entire sample and between -880I/D and BMI (p = 0.031), abdominal circumference (p = 0.044), and waist-to-hip ratio (p = 0.041) in a non-obese subsample (BMI [...] sequence variation within the regulatory and coding sequence of NPY. Several variants were observed, and of those tested, the -880I/D promoter region variant may influence body fat patterning in non-obese individuals but does not appear to play a major role in the etiology of common forms of obesity in this population.

  9. The thermostable direct hemolysin-related hemolysin (trh) gene of Vibrio parahaemolyticus: Sequence variation and implications for detection and function.

    Science.gov (United States)

    Nilsson, William B; Turner, Jeffrey W

    2016-07-01

    Vibrio parahaemolyticus is a leading cause of bacterial food-related illness associated with the consumption of undercooked seafood. Only a small subset of strains is pathogenic. Most clinical strains encode the thermostable direct hemolysin (TDH) and/or the TDH-related hemolysin (TRH). In this work, we amplify and sequence the trh gene from over 80 trh+ strains of this bacterium and identify thirteen genetically distinct alleles, most of which have not been deposited in GenBank previously. Sequence data were used to design new primers for more reliable detection of trh by endpoint PCR. We also designed a new quantitative PCR assay to target a more conserved gene that is genetically linked to trh. This gene, ureR, encodes the transcriptional regulator for the urease gene cluster immediately upstream of trh. We propose that this ureR assay can be a useful screening tool as a surrogate for direct detection of trh that circumvents challenges associated with trh sequence variation.

  10. De Novo Assembly of Bitter Gourd Transcriptomes: Gene Expression and Sequence Variations in Gynoecious and Monoecious Lines.

    Science.gov (United States)

    Shukla, Anjali; Singh, V K; Bharadwaj, D R; Kumar, Rajesh; Rai, Ashutosh; Rai, A K; Mugasimangalam, Raja; Parameswaran, Sriram; Singh, Major; Naik, P S

    2015-01-01

    Bitter gourd (Momordica charantia L.) is a nutritious vegetable crop of Asian origin, used as a medicinal herb in Indian and Chinese traditional medicine. Molecular breeding in bitter gourd is in its infancy, due to limited molecular resources, particularly functional markers for traits such as gynoecy. We performed de novo transcriptome sequencing of bitter gourd using an Illumina next-generation sequencer, from root, flower bud, stem and leaf samples of a gynoecious line (Gy323) and a monoecious line (DRAR1). A total of 65,540 transcripts for Gy323 and 61,490 for DRAR1 were obtained. Comparisons revealed SNP and SSR variations between these lines and allowed the identification of gene classes. Based on available transcripts we identified 80 WRKY transcription factors, several of which have been reported in responses to biotic and abiotic stresses, and 56 ARF genes, which play a pivotal role in auxin-regulated gene expression and development. The data presented will be useful in both functional studies and breeding programs in bitter gourd.

  11. Leveraging long sequencing reads to investigate R-gene clustering and variation in sugar beet

    Science.gov (United States)

    Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....

  12. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    Science.gov (United States)

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway.

  13. Genetic variation among the Mapuche Indians from the Patagonian region of Argentina: mitochondrial DNA sequence variation and allele frequencies of several nuclear genes.

    Science.gov (United States)

    Ginther, C; Corach, D; Penacino, G A; Rey, J A; Carnese, F R; Hutz, M H; Anderson, A; Just, J; Salzano, F M; King, M C

    1993-01-01

    DNA samples from 60 Mapuche Indians, representing 39 maternal lineages, were genetically characterized for (1) nucleotide sequences of the mtDNA control region; (2) presence or absence of a nine base duplication in mtDNA region V; (3) HLA loci DRB1 and DQA1; (4) variation at three nuclear genes with short tandem repeats; and (5) variation at the polymorphic marker D2S44. The genetic profile of the Mapuche population was compared to other Amerinds and to worldwide populations. Two highly polymorphic portions of the mtDNA control region, comprising 650 nucleotides, were amplified by the polymerase chain reaction (PCR) and directly sequenced. The 39 maternal lineages were defined by two or three generation families identified by the Mapuches. These 39 lineages included 19 different mtDNA sequences that could be grouped into four classes. The same classes of sequences appear in other Amerinds from North, Central, and South American populations separated by thousands of miles, suggesting that the origin of the mtDNA patterns predates the migration to the Americas. The mtDNA sequence similarity between Amerind populations suggests that the migration throughout the Americas occurred rapidly relative to the mtDNA mutation rate. HLA DRB1 alleles 1602 and 1402 were frequent among the Mapuches. These alleles also occur at high frequency among other Amerinds in North and South America, but not among Spanish, Chinese or African-American populations. The high frequency of these alleles throughout the Americas, and their specificity to the Americas, supports the hypothesis that Mapuches and other Amerind groups are closely related.(ABSTRACT TRUNCATED AT 250 WORDS)

  14. Genetic diversity and its seasonal variation of Jiaozhou Bay phytoplankton determined by rbcL gene sequencing

    Institute of Scientific and Technical Information of China (English)

    LIU Yongjian; YANG Guanpin; GUAN Xiaojing; MEN Rongxin

    2006-01-01

    The ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit gene (rbcL) of Jiaozhou Bay phytoplankton was amplified from spring, summer and autumn surface seawater DNAs and cloned separately for each season. About 50 clones were randomly selected from each library and sequenced. If identical amino acid sequences are considered as the same operational taxonomic unit (OTU), 61 OTUs are identified according to inferred amino acid sequences: 21 from spring seawater, 15 from summer seawater and 25 from autumn seawater. The Shannon index calculated from OTU abundances reflects the genetic diversity of a community. The indexes of spring, summer and autumn surface seawater phytoplankton are 2.69, 2.44 and 2.76 respectively, indicating that phytoplankton genetic diversity of autumn seawater is the richest. Seasonal variation of the phytoplankton community is significant; the community compositions of the three seasons are almost completely different except for the two OTUs shared by summer and autumn. Surface seawater phytoplankton communities are possibly metacommunities that differ spatially and temporally.
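
    For readers unfamiliar with the Shannon index mentioned above, it is H = -Σ p_i ln(p_i), where p_i is the fraction of clones assigned to OTU i. The Python sketch below shows the calculation under the assumption that the natural logarithm is used (the abstract does not state the base); the function name and the clone counts are hypothetical and are not the study's data.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTU clone counts.

    counts: iterable of non-negative clone counts, one entry per OTU.
    OTUs with zero clones contribute nothing to the sum.
    """
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one clone is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical library: 4 OTUs with equal clone counts.
even_library = [10, 10, 10, 10]
h_even = shannon_index(even_library)  # equals ln(4) for a perfectly even library

# Hypothetical library dominated by one OTU has lower diversity.
skewed_library = [37, 1, 1, 1]
h_skewed = shannon_index(skewed_library)
```

    With equal total clone numbers, the index rises both with the number of OTUs and with how evenly clones are spread among them, which is why a library with more OTUs is not automatically the one with the highest index.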

  15. Bardet-Biedl syndrome in Denmark-report of 13 novel sequence variations in six genes

    DEFF Research Database (Denmark)

    Hjortshøj, Tina Duelund; Grønskov, Karen; Philp, Alisdair R

    2010-01-01

    Bardet-Biedl syndrome (BBS) is an autosomal recessive disease characterized by retinal dystrophy, polydactyly, obesity, learning disabilities, renal involvement, and male hypogenitalism. BBS is genetically heterogeneous with mutations of 14 genes, accounting for approximately 70% of cases. Triall...

  16. Sequence analysis and identification of new variations in the coding sequence of melatonin receptor gene (MTNR1A) of Indian Chokla sheep breed

    Directory of Open Access Journals (Sweden)

    Vijay Kumar Saxena

    2014-12-01

    The melatonin receptor 1A gene encodes the prime receptor mediating the effect of melatonin at the neuroendocrine level for control of seasonal reproduction in sheep. The aims of this study were to examine the polymorphism pattern of the coding sequence of the MTNR1A gene in Chokla sheep, a breed of the Indian arid tract, and to identify new variations in relation to its aseasonal status. Genomic DNAs of 101 Chokla sheep were collected and an 824 bp coding sequence of Exon II was amplified. RFLP was performed with the enzymes RsaI and MnlI to assess the presence of polymorphism at positions C606T and G612A, respectively. Genotyping revealed significantly higher frequency of M and R alleles than m and r alleles. RR and MM were found to be dominantly present in the studied population. Cloning and sequencing of Exon II followed by mutation/polymorphism analysis revealed ten mutations, of which three were non-synonymous (G706A, C893A, G931C). G706A leads to substitution of valine by isoleucine, Val125I (U14109), in the fifth transmembrane domain. C893A leads to substitution of alanine by aspartic acid in the third extracellular loop. The G931C mutation brings about substitution of the amino acid alanine by proline in the seventh transmembrane helix and can affect the conformational stability of the molecule. Polyphen-2 analysis revealed that the polymorphism at position 931 is potentially damaging while the mutations at positions 706 and 893 were benign. It is concluded that the G931C mutation of the MTNR1A gene may explain, in part, the importance of melatonin structure integrity in influencing seasonality in sheep.
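
    The distinction between synonymous and non-synonymous mutations drawn in this abstract can be illustrated with the standard genetic code. The sketch below is a hedged example, not the authors' analysis: the function name is hypothetical, and the example codons are assumptions chosen only to reproduce the reported amino acid changes (valine to isoleucine, alanine to aspartic acid, alanine to proline); the abstract does not give the actual codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change within one codon.

    codon: three-letter DNA codon (coding strand).
    position: 0, 1 or 2, the index of the changed base within the codon.
    new_base: the substituted base.
    Returns (reference amino acid, altered amino acid, label).
    """
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, label


# Assumed codons (third base is arbitrary) matching the reported changes.
val_to_ile = classify_substitution("GTC", 0, "A")  # G>A at first position
ala_to_asp = classify_substitution("GCC", 1, "A")  # C>A at second position
ala_to_pro = classify_substitution("GCC", 0, "C")  # G>C at first position
silent = classify_substitution("GTC", 2, "T")      # third-position change
```

    The last call shows a third-position change that leaves valine unchanged, the kind of substitution that would be counted among the synonymous mutations.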

  17. Sequence Variation of the Pertussis Toxin S1 Subunit Encoding Gene in the Clinical Isolates of Bordetella pertussis in Iran

    Directory of Open Access Journals (Sweden)

    Hosseinpour

    2015-08-01

    Background: Whooping cough (pertussis) is an acute respiratory disease caused by Bordetella pertussis (B. pertussis). Pertussis toxin is an important virulence factor of B. pertussis and plays a major role in the immune and inflammatory responses. Likewise, allelic variations in the genes of virulence factors have led to the non-responsiveness of the new strains to both whole-cell and acellular vaccines. Given the importance of pertussis vaccine, we sought to address the lack of fundamental studies on the polymorphisms of the virulence genes of B. pertussis in Iran. Objectives: The aim of this study was to identify the polymorphisms of the pertussis toxin S1 subunit (ptxS1) gene in the circulating strains and compare them to the vaccine strain. Patients and Methods: In this study, 50 strains of B. pertussis isolated from patients with pertussis were investigated in the pertussis reference laboratory of the Pasteur Institute of Iran. Cultivation, biochemical tests, and specific antisera were used to confirm B. pertussis. Sequencing of the polymerase chain reaction products was performed to determine the ptxS1 alleles, and B. pertussis 134 was studied as the vaccine strain. Results: The results showed that all the strains had the dominant allele ptxS1A. There were differences between the alleles of the clinical strains and the vaccine strain. Conclusions: In recent years, a significant increase in the incidence of pertussis has been reported worldwide. Our findings regarding the allelic shift of the ptxS1 gene are similar to those reported in many European and American countries showing the difference of the dominant allele of ptxS1 between the circulating isolates and the vaccine strains.

  18. Sequence variation at the phenylalanine hydroxylase gene in the British Isles

    Energy Technology Data Exchange (ETDEWEB)

    Tyfield, L.A. [Southmead Hospital, Bristol (United Kingdom); Univ. of Bristol (United Kingdom)]; Stephenson, A. [Southmead Hospital, Bristol (United Kingdom)]; Cockburn, F. [Royal Hospital for Sick Children, Glasgow (United Kingdom)] [and others]

    1997-02-01

    Using mutation and haplotype analysis, we have examined the phenylalanine hydroxylase gene in the phenylketonuria populations of four geographical areas of the British Isles: the west of Scotland, southern Wales, and southwestern and southeastern England. The enormous genetic diversity of this locus within the British Isles is demonstrated in the large number of different mutations characterized and in the variety of genetic backgrounds on which individual mutations are found. Allele frequencies of the more common mutations exhibited significant nonrandom distribution in a north/south differentiation. Differences between the west of Scotland and southwestern England may be related to different events in the recent and past histories of their respective populations. Similarities between southern Wales and southeastern England are likely to reflect the heterogeneity that is seen in and around two large capital cities. Finally, comparison with more recently colonized areas of the world corroborates the genealogical origin by range expansion of several mutations. 38 refs., 2 tabs.

  19. Sequence variation in the B1 gene among Toxoplasma gondii isolates from swine and cats in Italy.

    Science.gov (United States)

    Santoro, Azzurra; Veronesi, Fabrizia; Milardi, Giovanni Luigi; Ranucci, David; Branciari, Raffaella; Diaferia, Manuela; Gabrielli, Simona

    2017-07-01

    The evaluation of the genetic variations of Toxoplasma gondii among isolates of a wide variety of animal hosts can provide significant information for better understanding the epidemiology and population structure of the parasite in different geographical areas. The aim of this study was to provide information on T. gondii genetic diversity in host species living in central Italy, which could act as a potential source of human infection. Seventy-seven feline faecal samples, and 36 and 20 diaphragm pillar tissue samples from pigs and wild boars were collected in Umbria (central Italy). The samples were tested by a nested-PCR protocol amplifying an informative region within the B1 gene, a multi-copy genetic target, showing a good rate of variability. Thirty-six specimens (27.07%) belonging to 10 pigs, 13 wild boars and 13 cats, tested positive to the B1 nested-PCR screening. Of these, 23 good quality sequences (8 from wild boars, 5 from pigs, and 10 from cats) were analyzed. A comparison of the B1 DNA sequences showed that a single homogeneous nucleotide substitution (C/T) was present at position 31 in the isolates from pigs and wild boars compared with the sampled cats and other hosts (including humans) available in GenBank™. The present results suggest the existence of a T. gondii genetic diversity for swine host species, based on a SNP (C/T) of the B1 gene. Further studies are needed to draw more solid conclusions on the discriminatory power of the B1 target by collecting more swine samples from much broader geographical areas. Copyright © 2017. Published by Elsevier Ltd.

  20. Update on Pneumocystis carinii f. sp. hominis typing based on nucleotide sequence variations in internal transcribed spacer regions of rRNA genes

    DEFF Research Database (Denmark)

    Lee, C H; Helweg-Larsen, J; Tang, X

    1998-01-01

    Pneumocystis carinii f. sp. hominis isolates from 207 clinical specimens from nine countries were typed based on nucleotide sequence variations in the internal transcribed spacer regions I and II (ITS1 and ITS2, respectively) of rRNA genes. The number of ITS1 nucleotides has been revised from the...

  1. cis sequence effects on gene expression

    Directory of Open Access Journals (Sweden)

    Jacobs Kevin

    2007-08-01

    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p [...] cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  2. Sequence variation identified in the 18S rRNA gene of Theileria mutans and Theileria velifera from the African buffalo (Syncerus caffer).

    Science.gov (United States)

    Chaisi, Mamohale E; Collins, Nicola E; Potgieter, Fred T; Oosthuizen, Marinda C

    2013-01-16

    The African buffalo (Syncerus caffer) is a natural reservoir host for both pathogenic and non-pathogenic Theileria species. These often occur naturally as mixed infections in buffalo. Although the benign and mildly pathogenic forms do not have any significant economic importance, their presence could complicate the interpretation of diagnostic test results aimed at the specific diagnosis of the pathogenic Theileria parva in cattle and buffalo in South Africa. The 18S rRNA gene has been used as the target in a quantitative real-time PCR (qPCR) assay for the detection of T. parva infections. However, the extent of sequence variation within this gene in the non-pathogenic Theileria spp. of the African buffalo is not well known. The aim of this study was, therefore, to characterise the full-length 18S rRNA genes of Theileria mutans, Theileria sp. (strain MSD) and T. velifera and to determine the possible influence of any sequence variation on the specific detection of T. parva using the 18S rRNA qPCR. The reverse line blot (RLB) hybridization assay was used to select samples which either tested positive for several different Theileria spp., or which hybridised only with the Babesia/Theileria genus-specific probe and not with any of the Babesia or Theileria species-specific probes. The full-length 18S rRNA genes from 14 samples, originating from 13 buffalo and one bovine from different localities in South Africa, were amplified, cloned and the resulting recombinants sequenced. Variations in the 18S rRNA gene sequences were identified in T. mutans, Theileria sp. (strain MSD) and T. velifera, with the greatest diversity observed amongst the T. mutans variants. This variation possibly explained why the RLB hybridization assay failed to detect T. mutans and T. velifera in some of the analysed samples.

  3. Analysis of sequence variations in low-density lipoprotein receptor gene among Malaysian patients with familial hypercholesterolemia.

    Science.gov (United States)

    Al-Khateeb, Alyaa; Zahri, Mohd K; Mohamed, Mohd S; Sasongko, Teguh H; Ibrahim, Suhairi; Yusof, Zurkurnai; Zilfalil, Bin A

    2011-03-19

    Familial hypercholesterolemia is a genetic disorder mainly caused by defects in the low-density lipoprotein receptor gene. Few and limited analyses of familial hypercholesterolemia have been performed in Malaysia, and the underlying mutations therefore remain largely unknown. We studied a group of 154 unrelated FH patients from a northern area of Malaysia (Kelantan). The promoter region and exons 2-15 of the LDLR gene were screened by denaturing high-performance liquid chromatography to detect short deletions and nucleotide substitutions, and by multiplex ligation-dependent probe amplification to detect large rearrangements. A total of 29 gene sequence variants were reported in 117 (76.0%) of the studied subjects. These comprised eight different mutations (1 large rearrangement, 1 short deletion, 5 missense mutations, and 1 splice site mutation) and 21 variants. Eight gene sequence variants were reported for the first time and they were noticed in familial hypercholesterolemic patients, but not in controls (p.Asp100Asp, p.Asp139His, p.Arg471Gly, c.1705+117T>G, c.1186+41T>A, c.1705+112C>G, Dup exon 12 and p.Trp666ProfsX45). The incidence of the p.Arg471Gly variant was 11%. Patients with pathogenic mutations were younger, had significantly higher incidences of cardiovascular disease, xanthomas, and family history of hyperlipidemia, together with significantly higher total cholesterol and low density lipoprotein levels than patients with non-pathogenic variants. Twenty-nine gene sequence variants occurred among FH patients; those with predicted pathogenicity were associated with higher incidences of cardiovascular diseases, tendon xanthomas, and higher total and low density lipoprotein levels compared to the rest. These results provide preliminary information on the mutation spectrum of this gene among patients with FH in Malaysia.

  4. Analysis of sequence variations in low-density lipoprotein receptor gene among Malaysian patients with familial hypercholesterolemia

    Directory of Open Access Journals (Sweden)

    Yusof Zurkurnai

    2011-03-01

    Background: Familial hypercholesterolemia is a genetic disorder mainly caused by defects in the low-density lipoprotein receptor gene. Few and limited analyses of familial hypercholesterolemia have been performed in Malaysia, and the underlying mutations therefore remain largely unknown. We studied a group of 154 unrelated FH patients from a northern area of Malaysia (Kelantan). The promoter region and exons 2-15 of the LDLR gene were screened by denaturing high-performance liquid chromatography to detect short deletions and nucleotide substitutions, and by multiplex ligation-dependent probe amplification to detect large rearrangements. Results: A total of 29 gene sequence variants were reported in 117 (76.0%) of the studied subjects. These comprised eight different mutations (1 large rearrangement, 1 short deletion, 5 missense mutations, and 1 splice site mutation) and 21 variants. Eight gene sequence variants were reported for the first time and they were noticed in familial hypercholesterolemic patients, but not in controls (p.Asp100Asp, p.Asp139His, p.Arg471Gly, c.1705+117T>G, c.1186+41T>A, c.1705+112C>G, Dup exon 12 and p.Trp666ProfsX45). The incidence of the p.Arg471Gly variant was 11%. Patients with pathogenic mutations were younger, had significantly higher incidences of cardiovascular disease, xanthomas, and family history of hyperlipidemia, together with significantly higher total cholesterol and low density lipoprotein levels than patients with non-pathogenic variants. Conclusions: Twenty-nine gene sequence variants occurred among FH patients; those with predicted pathogenicity were associated with higher incidences of cardiovascular diseases, tendon xanthomas, and higher total and low density lipoprotein levels compared to the rest. These results provide preliminary information on the mutation spectrum of this gene among patients with FH in Malaysia.

  5. A combination of PhP typing and β-d-glucuronidase gene sequence variation analysis for differentiation of Escherichia coli from humans and animals.

    Science.gov (United States)

    Masters, N; Christie, M; Katouli, M; Stratton, H

    2015-06-01

    We investigated the usefulness of the β-d-glucuronidase gene variance in Escherichia coli as a microbial source tracking tool using a novel algorithm for comparison of sequences from a prescreened set of host-specific isolates using a high-resolution PhP typing method. A total of 65 common biochemical phenotypes belonging to 318 E. coli strains isolated from humans and domestic and wild animals were analysed for nucleotide variations at 10 loci along a 518 bp fragment of the 1812 bp β-d-glucuronidase gene. Neighbour-joining analysis of loci variations revealed that 86 (76.8%) human isolates and 91.2% of animal isolates were correctly identified. Pairwise hierarchical clustering improved assignment, with 92 (82.1%) human and 204 (99%) animal strains assigned to their respective clusters. Our data show that initial typing of isolates and selection of common types from different hosts prior to analysis of the β-d-glucuronidase gene sequence improves source identification. We also concluded that numerical profiling of the nucleotide variations can be used as a valuable approach to differentiate human from animal E. coli. This study signifies the usefulness of the β-d-glucuronidase gene as a marker for differentiating human faecal pollution from animal sources.

  6. Screening for K-Casein (CSN3) Gene Variation in Carpathian Goat Breed by Isoelectric focusing (IEF) and DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Adrian Valentin Balteanu

    2015-05-01

    In goats, the k-casein (CSN3) locus is highly polymorphic, with up to 16 alleles currently characterized. They produce 13 protein variants (CSN3) that were classified in two groups (AIEF and BIEF), according to their isoelectric point. Isoelectric focusing (IEF) of milk samples allows the detection of these two CSN3 groups, but for correct identification of CSN3 alleles, DNA-based genotyping methods are needed. Therefore, the objective of this study was to identify the types of alleles occurring at the CSN3 locus in the Carpathian goat breed by using a combined IEF and DNA sequencing approach. IEF analysis of milk samples collected from two Carpathian goat populations reared in Romania revealed two distinct CSN3 patterns. Amplification and sequencing of CSN3 cDNA obtained from these goats revealed four polymorphic sites located in exon 4 that are responsible for amino acid substitutions, as compared with the reference sequence of the A allele. By comparative analysis of IEF and cDNA sequencing data obtained from the two populations, we showed that AIEF alleles are represented by the B allele, while BIEF alleles are represented by the D allele. However, the variation of the CSN3 locus in the Carpathian goat breed could be more complex; therefore, further studies are needed to characterize it.

  7. Mutation screening of patients with Leber Congenital Amaurosis or the enhanced S-Cone Syndrome reveals a lack of sequence variations in the NRL gene.

    Science.gov (United States)

    Acar, Ceren; Mears, Alan J; Yashar, Beverly M; Maheshwary, Anjali S; Andreasson, Sten; Baldi, Alfonso; Sieving, Paul A; Iannaccone, Alessandro; Musarella, Maria A; Jacobson, Samuel G; Swaroop, Anand

    2003-01-24

    To determine if mutations in the retinal transcription factor gene NRL are associated with retinopathies other than autosomal dominant retinitis pigmentosa (adRP). Genomic DNA was isolated from blood samples obtained from 50 patients with Leber Congenital Amaurosis (LCA), 17 patients with the Enhanced S-Cone Syndrome (ESCS), and a patient with an atypical retinal degeneration that causes photoreceptor rosettes with blue cone opsin. The 5' upstream region (putative promoter), untranslated exon 1, coding exons 2 and 3, and exon-intron boundaries of the NRL gene were analyzed by direct sequencing of the PCR-amplified products. Complete sequencing of the NRL gene in DNA samples from this cohort of patients revealed only one nucleotide change. The C->G transversion at nucleotide 711 of NRL exon 3 was detected in one LCA patient; however, this change did not alter the amino acid (L237L). No potential disease causing mutation was identified in the NRL gene in patients with LCA, ESCS, or the atypical retinal degeneration. Together with previous studies, our results demonstrate that mutations in the NRL gene are not a major cause of retinopathy. To date, only missense changes have been reported in adRP patients, and sequence variations are rare. It is possible that the loss of NRL function in humans is associated with a more complex clinical phenotype due to its expression in pineal gland in addition to rod photoreceptors.

  8. Genomic Sequence Variation Markup Language (GSVML).

    Science.gov (United States)

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. GSVML was developed as

  9. Prevalence of Marek's disease virus in different chicken populations in Iraq and indicative virulence based on sequence variation in the ecoRI-q (meq) gene.

    Science.gov (United States)

    Wajid, Salih J; Katz, Margaret E; Renz, Katrin G; Walkden-Brown, Stephen W

    2013-06-01

    limited meq gene sequence variation, that all sequenced samples had a short meq with two four-proline repeats, and that this is consistent with a high level of virulence.

  10. Prediction model for sequence variation in the glycoprotein gene of infectious hematopoietic necrosis virus in California, U.S.A.

    Science.gov (United States)

    Kelley, Garry O; Garabed, Rebecca; Branscum, Adam; Perez, Andres; Thurmond, Mark

    2007-12-13

    The influence of spatio-temporal factors on genetic variation of infectious hematopoietic necrosis virus (IHNV) is an active area of research. Using host-isolate pairs collected from 1966 to 2004 for 237 IHNV isolates from California and southern Oregon, we examined genetic variation of the mid-G gene of IHNV that could be quantified across times and geographic locations. Information hypothesized to influence genetic variation was environmental and/or fish host demographic factors, viz. location (inland or coastal), year of isolation, habitat (river, lake, or hatchery), the agent factors of subgroup (LI or LII) and serotype (1, 2, or 3), and the host factors of fish age (juvenile or adult), sex (male or female), and season of spawning run (spring, fall, late fall, winter). Inverse distance weighting (IDW) was performed to create isopleth maps of the genetic distances of each subgroup. IDW maps showed that more genetic divergence was predicted for isolates found inland (for both subgroups: LI and LII) than for coastal watershed isolates. A mixed-effect beta regression with a logit link function was used to seek associations between genetic distances and hypothesized explanatory factors. The model that best described genetic distance contained the factors of location, year of isolation, and the interaction between location and year. Our model suggests that genetic distance was greater for isolates collected from 1966 to 2004 at inland locations than for isolates found in coastal watersheds during the same years. The agreement between the IDW and beta regression analyses quantitatively supports our conclusion that, during this time period, more genetic variation existed within subgroup LII in inland watersheds than within coastal LI isolates.
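    The inverse distance weighting step mentioned above can be illustrated with a minimal sketch. The abstract does not state the power parameter or software the authors used, so the function name `idw_estimate`, the default power of 2, and the sample coordinates and distances below are all hypothetical.

```python
import math

def idw_estimate(samples, query, power=2.0):
    """Estimate a value at `query` (x, y) from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby samples
    dominate the estimate. The power of 2 is a common default, assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(x - query[0], y - query[1])
        if distance == 0.0:
            # The query coincides with a sample: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical genetic distances observed at three sampling sites.
sites = [(0.0, 0.0, 0.010), (10.0, 0.0, 0.030), (0.0, 10.0, 0.020)]
midpoint = idw_estimate(sites[:2], (5.0, 0.0))   # equidistant from two sites
at_site = idw_estimate(sites, (0.0, 10.0))       # exactly on the third site
```

    Evaluating such an estimate over a grid of query points and contouring the result is one way to obtain an isopleth map of the kind the abstract describes.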

  11. Patterns of nucleotide sequence variation in ICAM1 and TNF genes in twelve ethnic groups of India: roles of demographic history and natural selection

    Indian Academy of Sciences (India)

    Sanghamitra Sengupta; Shabana Farheen; Neelanjana Mukherjee; Partha P. Majumder

    2007-12-01

    We have studied DNA sequence variation in and around the genes ICAM1 and TNF, which play functional and correlated roles in inflammatory processes and immune cell responses, in 12 diverse ethnic groups of India, with a view to investigating the relative roles of demographic history and natural selection in shaping the observed patterns of variation. The total numbers of single nucleotide polymorphisms (SNPs) detected at the ICAM1 and TNF loci were 29 and 12, respectively. Haplotype and allele frequencies differed significantly across populations. The site frequency spectra at these loci were significantly different from those expected under neutrality, and showed an excess of intermediate-frequency variants consistent with balancing selection. However, as expected under balancing selection, there was no significant reduction of $F_{ST}$ values compared to neutral autosomal loci. Mismatch distributions were consistent with population expansion for both loci. On the other hand, the phylogenetic network among haplotypes for the TNF locus was similar to expectations under population expansion, while that for the ICAM1 was as expected under balancing selection. Nucleotide diversity at the ICAM1 locus was an order of magnitude lower in the promoter region, compared to the introns or exons, but no such difference was noted for the TNF gene. Thus, we conclude that the pattern of nucleotide variation in these genes has been modulated by both demographic history and selection. This is not surprising in view of the known allelic associations of several polymorphisms in these genes with various diseases, both infectious and noninfectious.
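    As a rough illustration of the nucleotide diversity statistic compared across promoter, intron and exon regions above, the sketch below computes the average number of pairwise differences per site for a set of aligned sequences. The function name and the toy alignment are hypothetical, and the abstract does not say which estimator or software the authors used.

```python
from itertools import combinations

def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site for equal-length aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatched positions over every pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)

# Toy alignment: three sequences of four sites, one of which carries a variant.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_alignment)
```

    In this toy case two of the three pairs differ at one site, so the statistic is 2 differences over 3 pairs and 4 sites.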

  12. Association between sequence variations of the Mediterranean fever gene and the risk of migraine: a case–control study

    Directory of Open Access Journals (Sweden)

    Coşkun S

    2016-08-01

    Salih Coşkun,1 Sefer Varol,2 Hasan H Özdemir,2 Sercan Bulut Çelik,3 Metin Balduz,4 Mehmet Akif Camkurt,5 Abdullah Çim,1 Demet Arslan,2 Mehmet Uğur Çevik2 1Department of Medical Genetics, 2Department of Neurology, Faculty of Medicine, Dicle University, Diyarbakir, 3Family Health Center, Batman, 4Department of Neurology, Şanlıurfa Education and Research Hospital, Şanlıurfa, 5Department of Psychiatry, Afsin State Hospital, Kahramanmaraş, Turkey. Abstract: Migraine pathogenesis involves a complex interaction between hormones, neurotransmitters, and inflammatory pathways, which also influence the migraine phenotype. The Mediterranean fever gene (MEFV) encodes the pyrin protein. The major role of pyrin appears to be in the regulation of inflammation activity and the processing of the cytokine pro-interleukin-1β, and this cytokine plays a part in migraine pathogenesis. This study included 220 migraine patients and 228 healthy controls. Eight common missense mutations of the MEFV gene, known as M694V, M694I, M680I, V726A, R761H, K695R, P369S, and E148Q, were genotyped using real-time polymerase chain reaction with 5' nuclease assays, which include sequence specific primers, and probes with a reporter dye. When mutations were evaluated separately among the patient and control groups, only the heterozygote E148Q carrier was found to be significantly higher in the control group than in the patient group (P=0.029, odds ratio [95% confidence interval] =0.45 [0.21–0.94]). In addition, the frequency of the homozygote and the compound heterozygote genotype carrier was found to be significantly higher in patients (n=8, 3.6%) than in the control group (n=1, 0.4%) (P=0.016, odds ratio [95% confidence interval] =8.57 [1.06–69.07]). However, there was no statistically significant difference in the allele frequencies of MEFV mutations between the patients and the healthy control group (P=0.964). In conclusion, the results of the present study suggest that

  13. Variations in gene organization and DNA uptake signal sequence in the folP region between commensal and pathogenic Neisseria species

    Directory of Open Access Journals (Sweden)

    Qvarnstrom Yvonne

    2006-02-01

    Background: Horizontal gene transfer is an important source of genetic variation among Neisseria species and has contributed to the spread of resistance to penicillin and sulfonamide drugs in the pathogen Neisseria meningitidis. Sulfonamide resistance in Neisseria meningitidis is mediated by altered chromosomal folP genes. At least some folP alleles conferring resistance have been horizontally acquired from other species, presumably from commensal Neisseriae. In this work, the DNA sequence surrounding folP in commensal Neisseria species was determined and compared to corresponding regions in pathogenic Neisseriae, in order to elucidate the potential for inter-species DNA transfer within this region. Results: The upstream region of folP displayed differences in gene order between species, including an insertion of a complete Correia element in Neisseria lactamica and an inversion of a larger genomic segment in Neisseria sicca, Neisseria subflava and Neisseria mucosa. The latter species also had DNA uptake signal sequences (DUS) in this region that were one base different from the DUS in pathogenic Neisseriae. Another interesting finding was evidence of a horizontal transfer event from Neisseria lactamica or Neisseria cinerea that introduced a novel folP allele to the meningococcal population. Conclusion: Genetic recombination events immediately upstream of folP and horizontal transfer have resulted in sequence differences in the folP region between the Neisseria species. This variability could be a consequence of the selective pressure on this region exerted by the use of sulfonamide drugs.

  14. Association between sequence variations of the Mediterranean fever gene and the risk of migraine: a case–control study

    Science.gov (United States)

    Coşkun, Salih; Varol, Sefer; Özdemir, Hasan H; Çelik, Sercan Bulut; Balduz, Metin; Camkurt, Mehmet Akif; Çim, Abdullah; Arslan, Demet; Çevik, Mehmet Uğur

    2016-01-01

    Migraine pathogenesis involves a complex interaction between hormones, neurotransmitters, and inflammatory pathways, which also influence the migraine phenotype. The Mediterranean fever gene (MEFV) encodes the pyrin protein. The major role of pyrin appears to be in the regulation of inflammation activity and the processing of the cytokine pro-interleukin-1β, and this cytokine plays a part in migraine pathogenesis. This study included 220 migraine patients and 228 healthy controls. Eight common missense mutations of the MEFV gene, known as M694V, M694I, M680I, V726A, R761H, K695R, P369S, and E148Q, were genotyped using real-time polymerase chain reaction with 5′ nuclease assays, which include sequence specific primers, and probes with a reporter dye. When mutations were evaluated separately among the patient and control groups, only the heterozygote E148Q carrier was found to be significantly higher in the control group than in the patient group (P=0.029, odds ratio [95% confidence interval] =0.45 [0.21–0.94]). In addition, the frequency of the homozygote and the compound heterozygote genotype carrier was found to be significantly higher in patients (n=8, 3.6%) than in the control group (n=1, 0.4%) (P=0.016, odds ratio [95% confidence interval] =8.57 [1.06–69.07]). However, there was no statistically significant difference in the allele frequencies of MEFV mutations between the patients and the healthy control group (P=0.964). In conclusion, the results of the present study suggest that biallelic mutations in the MEFV gene could be associated with a risk of migraine in the Turkish population. Moreover, MEFV mutations could be related to increased frequency and short durations of migraine attacks (P=0.043 and P=0.021, respectively). Future studies in larger groups and expression analysis of MEFV are required to clarify the role of the MEFV gene in migraine susceptibility. PMID:27621632
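    The odds ratio reported for biallelic carriers can be checked from the counts given in the abstract (8 of 220 patients versus 1 of 228 controls). The sketch below assumes the standard log (Woolf) method for the 95% confidence interval; the abstract does not state which method the authors used, and the function name is hypothetical.

```python
import math

def odds_ratio_with_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table."""
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    # Standard error of ln(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from the abstract: 8 carriers among 220 patients, 1 among 228 controls.
patients_total, controls_total = 220, 228
carriers_patients, carriers_controls = 8, 1

odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(
    carriers_patients,
    patients_total - carriers_patients,   # 212 non-carrier patients
    carriers_controls,
    controls_total - carriers_controls,   # 227 non-carrier controls
)
print(f"OR = {odds_ratio:.2f} [{ci_lower:.2f}-{ci_upper:.2f}]")
```

    Under these assumptions the calculation gives 8.57 [1.06-69.07], matching the reported values. The wide interval reflects the single carrier in the control group.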

  15. Association between sequence variations of the Mediterranean fever gene and the risk of migraine: a case-control study.

    Science.gov (United States)

    Coşkun, Salih; Varol, Sefer; Özdemir, Hasan H; Çelik, Sercan Bulut; Balduz, Metin; Camkurt, Mehmet Akif; Çim, Abdullah; Arslan, Demet; Çevik, Mehmet Uğur

    2016-01-01

    Migraine pathogenesis involves a complex interaction between hormones, neurotransmitters, and inflammatory pathways, which also influence the migraine phenotype. The Mediterranean fever gene (MEFV) encodes the pyrin protein. The major role of pyrin appears to be in the regulation of inflammation activity and the processing of the cytokine pro-interleukin-1β, and this cytokine plays a part in migraine pathogenesis. This study included 220 migraine patients and 228 healthy controls. Eight common missense mutations of the MEFV gene, known as M694V, M694I, M680I, V726A, R761H, K695R, P369S, and E148Q, were genotyped using real-time polymerase chain reaction with 5' nuclease assays, which include sequence specific primers, and probes with a reporter dye. When mutations were evaluated separately among the patient and control groups, only the heterozygote E148Q carrier was found to be significantly higher in the control group than in the patient group (P=0.029, odds ratio [95% confidence interval] =0.45 [0.21-0.94]). In addition, the frequency of the homozygote and the compound heterozygote genotype carrier was found to be significantly higher in patients (n=8, 3.6%) than in the control group (n=1, 0.4%) (P=0.016, odds ratio [95% confidence interval] =8.57 [1.06-69.07]). However, there was no statistically significant difference in the allele frequencies of MEFV mutations between the patients and the healthy control group (P=0.964). In conclusion, the results of the present study suggest that biallelic mutations in the MEFV gene could be associated with a risk of migraine in the Turkish population. Moreover, MEFV mutations could be related to increased frequency and short durations of migraine attacks (P=0.043 and P=0.021, respectively). Future studies in larger groups and expression analysis of MEFV are required to clarify the role of the MEFV gene in migraine susceptibility.

  16. Genetic Variation of Cassava Mealybug, Phenacoccus manihoti (Hemiptera: Pseudococcidae), Based on DNA Sequences from Mitochondrial and Nuclear Genes

    Directory of Open Access Journals (Sweden)

    Atsalek RATTANAWANNEE

    2016-02-01

    The present study aimed to investigate the genetic variation and genetic structure of Phenacoccus manihoti Matile-Ferrero, one of the most serious insect pests of cassava worldwide, in populations in Thailand, using mitochondrial and nuclear DNA sequence based analysis. The samples of P. manihoti were collected from 28 major cassava-growing areas within 18 provinces in Thailand. Our field survey results showed that the northeastern and eastern regions of Thailand were widely and highly infested with P. manihoti. Phylogenetic analysis revealed 2 mitochondrial clades and a single nuclear clade, which corresponded to low genetic variability. This suggests that P. manihoti has a high potential to spread aggressively throughout the cassava-growing areas in Thailand, in which it was first found in 2008. In addition, the generally low genetic divergence observed may be due to the highly prevalent parthenogenetic reproduction of this insect pest species. Further research is therefore necessary to develop proportional prevention and surveillance programs for early detection and rapid response. In addition, the genetic structure and variability of P. manihoti populations from neighboring countries should be studied.

  17. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models

    Science.gov (United States)

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y. Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  18. Geographically Distinct and Domain-Specific Sequence Variations in the Alleles of Rice Blast Resistance Gene Pib.

    Science.gov (United States)

    Vasudevan, Kumar; Vera Cruz, Casiana M; Gruissem, Wilhelm; Bhullar, Navreet K

    2016-01-01

    Rice blast is caused by Magnaporthe oryzae, which is the most destructive fungal pathogen affecting rice growing regions worldwide. The rice blast resistance gene Pib confers broad-spectrum resistance against Southeast Asian M. oryzae races. We investigated the allelic diversity of Pib in rice germplasm originating from 12 major rice growing countries. Twenty-five new Pib alleles were identified that have unique single nucleotide polymorphisms (SNPs), insertions and/or deletions, in addition to the polymorphic nucleotides that are shared between the different alleles. These partially or completely shared polymorphic nucleotides indicate frequent sequence exchange events between the Pib alleles. In some of the new Pib alleles, nucleotide diversity is high in the LRR domain, whereas, in others it is distributed among the NB-ARC and LRR domains. Most of the polymorphic amino acids in LRR and NB-ARC2 domains are predicted as solvent-exposed. Several of the alleles and the unique SNPs are country specific, suggesting a diversifying selection of alleles in various geographical locations in response to the locally prevalent M. oryzae population. Together, the new Pib alleles are an important genetic resource for rice blast resistance breeding programs and provide new information on rice-M. oryzae interactions at the molecular level.

  19. Strait of Gibraltar: an effective gene-flow barrier for wind-pollinated Carex helodes (Cyperaceae) as revealed by DNA sequences, AFLP, and cytogenetic variation.

    Science.gov (United States)

    Escudero, Marcial; Vargas, Pablo; Valcárcel, Virginia; Luceño, Modesto

    2008-06-01

    The Strait of Gibraltar is the most important barrier disconnecting the landmasses of Europe and Africa on the western Mediterranean extreme. Carex helodes is a wind-pollinated species endemic to the western Mediterranean. Because molecular and cytogenetic data allow the inference of its evolutionary history, we analyzed variations in chromosome number, including meiotic chromosome behavior, amplified fragment length polymorphism (AFLP) fingerprints, and nucleotide substitutions in plastid and nuclear DNA sequences. Cytogeographic results showed that the African populations have stabilized at a single chromosome number of 2n = 74, whereas the most frequent cytotype in Iberia is 2n = 72. Phylogenetic reconstructions of 17 sequences from nine closely related species revealed that C. helodes is monophyletic and that the Moroccan populations are embedded in the Iberian lineages. The haplotype network is also consistent with a European origin of the northern African haplotype. AFLP analysis also revealed hierarchical levels of genetic variation compatible with a founder effect process responsible for the African populations. All sources of evidence support the hypothesis that the Strait of Gibraltar has been an effective gene-flow barrier, generating two isolated evolutionary lineages after their dispersal. Recent connections between the two lineages appear unlikely, whereas active gene flow occurs among populations within the two lineages.

  20. Deep sequencing of the viral phoH gene reveals temporal variation, depth-specific composition, and persistent dominance of the same viral phoH genes in the Sargasso Sea

    Directory of Open Access Journals (Sweden)

    Dawn B. Goldsmith

    2015-06-01

    Deep sequencing of the viral phoH gene, a host-derived auxiliary metabolic gene, was used to track viral diversity throughout the water column at the Bermuda Atlantic Time-series Study (BATS) site in the summer (September) and winter (March) of three years. Viral phoH sequences reveal differences in the viral communities throughout a depth profile and between seasons in the same year. Variation was also detected between the same seasons in subsequent years, though these differences were not as great as the summer/winter distinctions. Over 3,600 phoH operational taxonomic units (OTUs; 97% sequence identity) were identified. Despite high richness, most phoH sequences belong to a few large, common OTUs whereas the majority of the OTUs are small and rare. While many OTUs make sporadic appearances at just a few times or depths, a small number of OTUs dominate the community throughout the seasons, depths, and years.
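    To show what grouping sequences into OTUs at 97% identity involves, the sketch below uses a simple greedy centroid approach on equal-length sequences. This is an illustrative assumption only: the abstract does not name the clustering tool or algorithm the authors used, real pipelines align sequences of varying length, and the function names and toy reads are hypothetical.

```python
def sequence_identity(first, second):
    """Fraction of matching positions between two equal-length sequences."""
    if len(first) != len(second):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(a == b for a, b in zip(first, second)) / len(first)

def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first OTU whose centroid it matches.

    The centroid of an OTU is simply the first sequence placed in it.
    """
    otus = []  # each OTU is a list whose first element is the centroid
    for seq in sequences:
        for members in otus:
            if sequence_identity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            # No existing centroid is close enough: start a new OTU.
            otus.append([seq])
    return otus

# Toy 100-base reads: a reference, a 98%-identical read, a 95%-identical read.
reference = "A" * 100
near_read = "C" * 2 + "A" * 98
far_read = "C" * 5 + "A" * 95
otus = greedy_otus([reference, near_read, far_read])
```

    Here the 98%-identical read joins the reference OTU, while the 95%-identical read falls below the threshold and seeds a second OTU.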

  1. Typing of Panton-Valentine Leukocidin-encoding Phages and lukSF-PV Gene Sequence Variation in Staphylococcus aureus from China

    Directory of Open Access Journals (Sweden)

    Huanqiang Zhao

    2016-08-01

    Full Text Available Panton-Valentine leucocidin (PVL, encoded by lukSF-PV genes), a bi-component and pore-forming toxin, is carried by different staphylococcal bacteriophages. The prevalence of PVL in Staphylococcus aureus (S. aureus) has been reported around the globe. However, the data on PVL-encoding phage types, lukSF-PV gene variation and chromosomal phage insertion sites for PVL-positive S. aureus are limited, especially in China. In order to obtain a more complete understanding of the molecular epidemiology of PVL-positive S. aureus, an integrated and modified PCR-based scheme was applied to detect the PVL-encoding phage types. Phage insertion locus and the lukSF-PV variant were determined by PCR and sequencing. Meanwhile, the genetic background was characterized by staphylococcal cassette chromosome mec (SCCmec) typing, staphylococcal protein A (spa) gene polymorphisms typing, pulsed-field gel electrophoresis (PFGE) typing, accessory gene regulator (agr) locus typing and multilocus sequence typing (MLST). Seventy-eight (78/1175, 6.6%) isolates possessed the lukSF-PV genes and 59.0% (46/78) of PVL-positive strains belonged to CC59 lineage. Eight known different PVL-encoding phage types were detected, and Φ7247PVL/ΦST5967PVL (n=13) and ΦPVL (n=12) were the most prevalent among them. However, 25 (25/78, 32.1%) isolates, belonging to ST30 and ST59 clones, could not be typed by the modified PCR-based scheme. Single nucleotide polymorphisms (SNPs) were identified at five locations in the lukSF-PV genes, two of which were non-synonymous. Maximum-likelihood tree analysis of attachment site sequences detected six SNP profiles for attR and eight for attL, respectively. In conclusion, the PVL-positive S. aureus mainly harbored Φ7247PVL/ΦST5967PVL and ΦPVL in the regions studied. lukSF-PV gene sequences, PVL-encoding phages and phage insertion locus generally varied with lineages. Moreover, PVL-positive clones that have emerged worldwide likely carry distinct phages.

  2. Typing of Panton-Valentine Leukocidin-Encoding Phages and lukSF-PV Gene Sequence Variation in Staphylococcus aureus from China.

    Science.gov (United States)

    Zhao, Huanqiang; Hu, Fupin; Jin, Shu; Xu, Xiaogang; Zou, Yuhan; Ding, Baixing; He, Chunyan; Gong, Fang; Liu, Qingzhong

    2016-01-01

    Panton-Valentine leukocidin (PVL, encoded by lukSF-PV genes), a bi-component and pore-forming toxin, is carried by different staphylococcal bacteriophages. The prevalence of PVL in Staphylococcus aureus has been reported around the globe. However, the data on PVL-encoding phage types, lukSF-PV gene variation and chromosomal phage insertion sites for PVL-positive S. aureus are limited, especially in China. In order to obtain a more complete understanding of the molecular epidemiology of PVL-positive S. aureus, an integrated and modified PCR-based scheme was applied to detect the PVL-encoding phage types. Phage insertion locus and the lukSF-PV variant were determined by PCR and sequencing. Meanwhile, the genetic background was characterized by staphylococcal cassette chromosome mec (SCCmec) typing, staphylococcal protein A (spa) gene polymorphisms typing, pulsed-field gel electrophoresis (PFGE) typing, accessory gene regulator (agr) locus typing and multilocus sequence typing (MLST). Seventy-eight (78/1175, 6.6%) isolates possessed the lukSF-PV genes and 59.0% (46/78) of PVL-positive strains belonged to CC59 lineage. Eight known different PVL-encoding phage types were detected, and Φ7247PVL/ΦST5967PVL (n = 13) and ΦPVL (n = 12) were the most prevalent among them. However, 25 (25/78, 32.1%) isolates, belonging to ST30 and ST59 clones, could not be typed by the modified PCR-based scheme. Single nucleotide polymorphisms (SNPs) were identified at five locations in the lukSF-PV genes, two of which were non-synonymous. Maximum-likelihood tree analysis of attachment site sequences detected six SNP profiles for attR and eight for attL, respectively. In conclusion, the PVL-positive S. aureus mainly harbored Φ7247PVL/ΦST5967PVL and ΦPVL in the regions studied. lukSF-PV gene sequences, PVL-encoding phages, and phage insertion locus generally varied with lineages. Moreover, PVL-positive clones that have emerged worldwide likely carry distinct phages.
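
    The proportions quoted in this abstract can be rechecked from the raw counts. The following is a minimal sketch; the helper name percent is an illustrative assumption and is not part of the study.

```python
def percent(part: int, whole: int, ndigits: int = 1) -> float:
    """Return part/whole as a percentage rounded to ndigits decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, ndigits)


# Counts quoted in the abstract; the variable names are illustrative.
pvl_positive = percent(78, 1175)  # isolates carrying the lukSF-PV genes
cc59_share = percent(46, 78)      # PVL-positive strains in the CC59 lineage
untyped = percent(25, 78)         # isolates not typed by the PCR-based scheme

print(pvl_positive, cc59_share, untyped)
```

    The three values agree with the 6.6%, 59.0% and 32.1% reported above.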

  3. Sequence variation in the melanocortin-1 receptor (MC1R) pigmentation gene and its role in the cryptic coloration of two South American sand lizards

    Directory of Open Access Journals (Sweden)

    Josmael Corso

    2012-01-01

    Full Text Available In reptiles, dorsal body darkness often varies with substrate color or temperature environment, and is generally presumed to be an adaptation for crypsis or thermoregulation. However, the genetic basis of pigmentation is poorly known in this group. In this study we analyzed the coding region of the melanocortin-1-receptor (MC1R) gene, and therefore its role underlying the dorsal color variation in two sympatric species of sand lizards (Liolaemus) that inhabit the southeastern coast of South America: L. occipitalis and L. arambarensis. The first is light-colored and occupies aeolic pale sand dunes, while the second is brownish and lives in a darker sandy habitat. We sequenced 630 base pairs of MC1R in both species. In total, 12 nucleotide polymorphisms were observed, and four amino acid replacement sites, but none of them could be associated with a color pattern. Comparative analysis indicated that these taxa are monomorphic for amino acid sites that were previously identified as functionally important in other reptiles. Thus, our results indicate that MC1R is not involved in the pigmentation pattern observed in Liolaemus lizards. Therefore, structural differences in other genes, such as ASIP, or variation in regulatory regions of MC1R may be responsible for this variation. Alternatively, the phenotypic differences observed might be a consequence of non-genetic factors, such as thermoregulatory mechanisms.

  4. Detection of genomic variations in BRCA1 and BRCA2 genes by long-range PCR and next-generation sequencing.

    Science.gov (United States)

    Hernan, Imma; Borràs, Emma; de Sousa Dias, Miguel; Gamundi, María José; Mañé, Begoña; Llort, Gemma; Agúndez, José A G; Blanca, Miguel; Carballo, Miguel

    2012-01-01

    Advances in sequencing technologies, such as next-generation sequencing (NGS), represent an opportunity to perform genetic testing in a clinical scenario. In this study, we developed and tested a method for the detection of mutations in the large BRCA1 and BRCA2 tumor suppressor genes, using long-range PCR (LR-PCR) and NGS, in samples from individuals with a personal and/or family history of breast and/or ovarian cancer. Eleven LR-PCR fragments, between 3000 and 15,300 bp, containing all coding exons and flanking splice junctions of BRCA1 and BRCA2, were obtained from DNA samples of five individuals carrying mutations in either BRCA1 or BRCA2. Libraries for NGS were prepared using an enzymatic (Nextera technology) method. We analyzed five individual samples in parallel by NGS and obtained complete coverage of all LR-PCR fragments, with an average coding sequence depth for each nucleotide of >30 reads, running from ×7 (in exon 22 of BRCA1) to >×150. We detected and confirmed 100% of the mutations that predispose to the risk of cancer, together with other genomic variations in BRCA1 and BRCA2. Our approach demonstrates that genomic LR-PCR, together with NGS, using the GS Junior 454 System platform, is an effective method for patient sample analysis of BRCA1 and BRCA2 genes. In addition, this method could be performed in regular molecular genetics laboratories.

  5. INTRASPECIFIC VARIATIONS OF 16S MITOCHONDRIAL GENE SEQUENCES OF YELLOW RICE STEM BORER, Scirpophaga incertulas (LEPIDOPTERA: CRAMBIDAE) FROM WEST JAVA

    Directory of Open Access Journals (Sweden)

    RIKA RAFFIUDIN

    2011-01-01

    Full Text Available Yellow rice stem borer (Scirpophaga incertulas) is one of the most important rice pest insects in Asia, including Indonesia. However, there is a lack of genetic data for this important agricultural insect. Therefore, this study was conducted to explore intraspecific differentiation of partial 16S mitochondrial gene from Bogor, Karawang, Indramayu and Cirebon (West Java, Indonesia). Here, we reported a total of 325 bp of 16S mitochondrial gene of S. incertulas from the obtained samples. Among all DNA sequences, three haplotypes of 16S mitochondrial gene were observed and submitted to GenBank under Accession Number of GU191881, GU191882, GU191883, respectively for haplotype 1, 2, and 3. The haplotype 1 was found in all surveyed locations, except Bogor. Haplotype 2 and 3 were found only in S. incertulas from Cirebon and Bogor samples. These haplotype variations can be applied as DNA markers for early larva detection method among other rice stem borers. Hence, further explorations of the mitochondrial variations of S. incertulas in Java and other parts of Indonesia are needed. Keywords: moth, haplotypes, genetic differentiations, molecular identification.

  6. DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L.

    Science.gov (United States)

    González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B

    2006-03-01

    Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (pi(sil) = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r2 of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
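
    The pairwise r2 statistic used above to summarize within-gene LD can be illustrated with a short calculation. This is a minimal sketch that assumes phased haplotypes coded 0/1 at two biallelic sites; the function name ld_r2 and the toy data are illustrative assumptions, not the authors' pipeline.

```python
from typing import Sequence


def ld_r2(site_a: Sequence[int], site_b: Sequence[int]) -> float:
    """Squared allele correlation (r^2) between two biallelic sites.

    Each sequence holds one 0/1 allele code per phased haplotype.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n  # frequency of allele 1 at site A
    p_b = sum(site_b) / n  # frequency of allele 1 at site B
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


# Toy haplotypes (hypothetical): complete LD, then no association.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

    Averaging such values over all site pairs within a gene, and plotting them against the distance between sites, gives the kind of LD decay summary described in the abstract.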

  7. Genetic diversity and the biogeographical process of Acheilognathus macropterus revealed by sequence variations of mitochondrial cytochrome b gene

    Institute of Scientific and Technical Information of China (English)

    ZHU Yurong; LIU Huanzhang

    2007-01-01

    In this study, thirty-six individuals of Acheilognathus macropterus were collected from the Heilongjiang River, the Yangtze River, and the Nandujiang River. A partial mitochondrial cytochrome b gene region (636 base pairs) was sequenced for these samples and 22 haplotypes were found. With A. chankaensis and A. tokinensis as outgroups, their relationships were analyzed. The p-distances were calculated with Mega software and a molecular phylogenetic tree was constructed using the neighbor-joining (NJ) method. The proportions of main morphological characters were compared as well. P-distances showed that the genetic differences in A. macropterus samples were far smaller than those between these samples and the outgroups. The molecular phylogenetic tree shows that samples with barbels and those without barbels were intermingled. There was no distinctive difference in proportions of morphological characteristics among them. These results suggested that samples with barbels and those without barbels (formally identified as A. taenianalis) are the same species; A. taenianalis is synonymous with A. macropterus. The thirty-six individuals were grouped into five clades and the positions of the samples in the clades were correspondingly grouped within their geographical distributions. Among the five clades, clades 1 and 5 included samples from the Heilongjiang River and Nandujiang River, respectively. The samples from the Yangtze River scattered into clades 2, 3, and 4. There were distinctive genetic differences (> 5%) among them. Interestingly, the distributions of the 21 samples in these three clades were not correlated to their geographical distributions. It is postulated that these genetic differences were due to the bitterlings' mating choice mechanism, the prezygotic isolation. The genetic differences between the fish from Nandujiang River and those from the mainland indicated that they were separated early. However, the small genetic differences among the samples and the positions of the fish from the
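
    The p-distance mentioned in this abstract is the proportion of aligned sites at which two sequences differ. The following is a minimal sketch, assuming pre-aligned sequences and skipping any site where either sequence has a gap or ambiguous base; the function name and the toy sequences are illustrative assumptions, and the gap handling may differ from the options the authors used in their software.

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences.

    Sites where either sequence is not A, C, G or T are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: site not compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments (hypothetical, not cytochrome b data).
print(p_distance("ACGTACGTAC", "ACGTACGTTC"))
```

    On a 636-site alignment with no gaps, a p-distance above 5% corresponds to at least 32 differing sites.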

  8. Characterization of expressed Pgip genes in rice and wheat reveals similar extent of sequence variation to dicot PGIPs and identifies an active PGIP lacking an entire LRR repeat.

    Science.gov (United States)

    Janni, Michela; Di Giovanni, Michela; Roberti, Serena; Capodicasa, Cristina; D'Ovidio, Renato

    2006-11-01

    Polygalacturonase-inhibiting proteins (PGIPs) are leucine-rich repeat (LRR) proteins involved in plant defence. A number of PGIPs have been characterized from dicot species, whereas only a few data are available from monocots. Database searches and genome-specific cloning strategies allowed the identification of four rice (Oryza sativa L.) and two wheat (Triticum aestivum L.) Pgip genes. The rice Pgip genes (Ospgip1, Ospgip2, Ospgip3 and Ospgip4) are distributed over a 30 kbp region of the short arm of chromosome 5, whereas the wheat Pgip genes, Tapgip1 and Tapgip2, are localized on the short arm of chromosome 7B and 7D, respectively. Deduced amino acid sequences show the typical LRR modular organization and a conserved distribution of the eight cysteines at the N- and C-terminal regions. Sequence comparison suggests that monocot and dicot PGIPs form two separate clusters sharing about 40% identity and shows that this value is close to the extent of variability observed within each cluster. Gene-specific RT-PCR and biochemical analyses demonstrate that both Ospgips and Tapgips are expressed in the whole plant or in a tissue-specific manner, and that OsPGIP1, lacking an entire LRR repeat, is an active inhibitor of fungal polygalacturonases. This last finding can contribute to define the molecular features of PG-PGIP interactions and highlights that the genetic events that can generate variability at the Pgip locus are not only limited to substitutions or small insertions/deletions, as so far reported, but can also involve variation in the number of LRRs.

  9. Sequence Variations in the Bovine IGF-I and IGFBP3 Genes and Their Association with Growth and Development Traits in Chinese Beef Cattle

    Institute of Scientific and Technical Information of China (English)

    GAO Xue; SHI Ming-yan; XU Xiu-rong; LI Jun-ya; REN Hong-yan; XU Shang-zhong

    2009-01-01

    The objective of this study was to determine the genotype effects of the bovine insulin-like growth factor I (IGF-I) and its binding protein 3 (IGFBP3) genes on growth and development traits in beef cows, including 130 Chinese Simmental, 42 Nanyang, and 47 Luxi Yellow cattle. Sequence variations in the bovine IGF-I and IGFBP3 genes were investigated by single strand conformation polymorphism (SSCP). SSCPs were detected in 6 fragments, namely the 5'-flanking region, the 2nd exon, the 5th exon, and the 5th intron of the IGF-I gene, and the 2nd exon and the 3rd exon of the IGFBP3 gene. Two polymorphisms, an A-to-G transition in the 2nd exon of the IGF-I gene and a T-to-C transition in the 2nd exon of the IGFBP3 gene, were detected in the 3 breeds. The allele frequencies of the 2 polymorphisms were 0.0411 (A), 0.9589 (B), and 0.7237 (A), 0.2763 (B), respectively. These 2 loci were analyzed for association with body weight, height at withers, body length, heart girth, rump width, and beef production index (BPI) at 0, 6, 12, 24, and 36 months of age. The IGFBP3 locus was shown to be associated with rump width and heart girth at 24 months and 36 months. Animals with BB genotype had higher rump width, (24.86±0.47) cm at 24 months and (27.50±0.63) cm at 36 months. The heart girth was highest for the individuals with BB genotype (171.33±1.84) cm and higher than those with AB genotype (166.68±1.13) cm (P<0.05) at 36 months.
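
    Allele frequencies such as those quoted above are obtained by counting alleles across genotypes. The following is a minimal sketch; the abstract does not report genotype counts, so the counts below are hypothetical and were chosen only to total the 219 animals mentioned (130 + 42 + 47) and to reproduce the first reported pair of frequencies.

```python
def allele_frequencies(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Return (freq_A, freq_B) from counts of AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("at least one genotyped animal is required")
    # Each AA animal carries two A alleles; each AB animal carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1.0 - freq_a


# Hypothetical genotype counts (NOT from the paper): 0 AA, 18 AB, 201 BB.
freq_a, freq_b = allele_frequencies(0, 18, 201)
print(round(freq_a, 4), round(freq_b, 4))
```

    Other genotype splits with the same number of A alleles would give the same frequencies, so the counts above should not be read as the study's data.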

  10. Combination of real-time PCR and sequencing to detect multiple clinically relevant genetic variations in the lactase gene

    DEFF Research Database (Denmark)

    Brasen, Claus Lohman; Frischknecht, Lone; Ørnskov, Dorthe;

    2017-01-01

    genotyping of the -13910C > T variant. By using a quality value of 99% and sequencing the undetermined samples we improved the ability of the assay to identify variants other than -13910C > T. This resulted in a reduction of the diagnostic error rate by a factor of 2.4 while increasing the expenses only 3...

  11. Comparative analysis of Mycobacterium tuberculosis pe and ppe genes reveals high sequence variation and an apparent absence of selective constraints.

    NARCIS (Netherlands)

    McEvoy, C.R.; Cloete, R.; Müller, B.; Schürch, A.C.; Helden, P.D. van; Gagneux, S.; Warren, R.M.; Gey van Pittius, N.C.

    2012-01-01

    Mycobacterium tuberculosis complex (MTBC) genomes contain 2 large gene families termed pe and ppe. The function of pe/ppe proteins remains enigmatic but studies suggest that they are secreted or cell surface associated and are involved in bacterial virulence. Previous studies have also shown that so

  12. Presence of sequence and SNP variation in the IRF6 gene in healthy residents of Guangdong Province

    Directory of Open Access Journals (Sweden)

    Wu Wenli

    2016-01-01

    Full Text Available This study aimed to investigate the single nucleotide polymorphism (SNP) in the interferon regulatory factor 6 (IRF6) gene in healthy residents of Guangdong Province, China, for further analysis of their associations with the development of cleft lip with or without palate (CL/P).

  13. In silico detection of sequence variations modifying transcriptional regulation.

    Directory of Open Access Journals (Sweden)

    Malin C Andersen

    2008-01-01

    Full Text Available Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers. The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation.
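
    The binding-site prediction step described above can be illustrated by scoring the two alleles of a variant against a position weight matrix and comparing the best scores. This is a minimal sketch of that one idea only; it omits phylogenetic footprinting, it is not the RAVEN implementation, and the count matrix, sequences and function names are all hypothetical.

```python
import math

# Hypothetical 4-column count matrix for an imaginary factor (not from RAVEN).
COUNTS = {
    "A": [8, 1, 1, 7],
    "C": [1, 1, 8, 1],
    "G": [1, 8, 1, 1],
    "T": [2, 2, 2, 3],
}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into per-position log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        column_total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2((counts[base][i] + pseudocount) / column_total / background)
            for base in "ACGT"
        })
    return matrix


def best_site_score(seq, matrix):
    """Highest log-odds score over all windows of seq (uppercase A/C/G/T only)."""
    width = len(matrix)
    if len(seq) < width:
        raise ValueError("sequence shorter than the matrix")
    return max(
        sum(matrix[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def allele_score_difference(ref_seq, alt_seq, matrix):
    """Best score of the alternative allele minus best score of the reference allele."""
    return best_site_score(alt_seq, matrix) - best_site_score(ref_seq, matrix)


matrix = log_odds_matrix(COUNTS)
# Hypothetical SNP: C to T in the third position of an AGCA site.
print(allele_score_difference("TTAGCATT", "TTAGTATT", matrix))
```

    A negative difference suggests the alternative allele weakens the predicted site. As the abstract notes, such scores have poor specificity on their own and are more useful when other data point to the transcription factor involved.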

  14. A sequence variation scan of the coagulation factor VIII (FVIII) structural gene and associations with plasma FVIII activity levels.

    Science.gov (United States)

    Viel, Kevin R; Machiah, Deepa K; Warren, Diane M; Khachidze, Manana; Buil, Alfonso; Fernstrom, Karl; Souto, Juan C; Peralta, Juan M; Smith, Todd; Blangero, John; Porter, Sandra; Warren, Stephen T; Fontcuberta, Jordi; Soria, Jose M; Flanders, W Dana; Almasy, Laura; Howard, Tom E

    2007-05-01

    Plasma factor VIII coagulant activity (FVIII:C) level is a highly heritable quantitative trait that is strongly correlated with thrombosis risk. Polymorphisms within only 1 gene, the ABO blood-group locus, have been unequivocally demonstrated to contribute to the broad population variability observed for this trait. Because less than 2.5% of the structural FVIII gene (F8) has been examined previously, we resequenced all known functional regions in 222 potentially distinct alleles from 137 unrelated nonhemophilic individuals representing 7 racial groups. Eighteen of the 47 variants identified, including 17 single-nucleotide polymorphisms (SNPs), were previously unknown. As the degree of linkage disequilibrium across F8 was weak overall, we used measured-genotype association analysis to evaluate the influence of each polymorphism on the FVIII:C levels in 398 subjects from 21 pedigrees known as the Genetic Analysis of Idiopathic Thrombophilia project (GAIT). Our results suggested that 92714C>G, a nonsynonymous SNP encoding the B-domain substitution D1241E, was significantly associated with FVIII:C level. After accounting for important covariates, including age and ABO genotype, the association persisted with each C-allele additively increasing the FVIII:C level by 14.3 IU dL(-1) (P = .016). Nevertheless, because the alleles of 56010G>A, a SNP within the 3' splice junction of intron 7, are strongly associated with 92714C>G in GAIT, additional studies are required to determine whether D1241E is itself a functional variant.

  15. Sequence variation in virulence-related genes of Bordetella pertussis isolates from Poland in the period 1959-2013.

    Science.gov (United States)

    Mosiej, E; Zawadka, M; Krysztopa-Grzybowska, K; Polak, M; Augustynowicz, E; Piekarska, K; Lutyńska, A

    2015-01-01

    This study aimed to characterise Bordetella pertussis isolates circulating in Poland since 1959. Sequence analysis of ptxA, ptxC, prn, tcfA, fim2, fim3 and ptxP for 175 clinical isolates and currently and previously used vaccine strains was performed. Clinical isolates from the period 1995-2013 were found to be different to three currently used vaccine strains harbouring the allelic combination ptxA2-ptxC1-ptxP1-prn1-tcfA2-fim2-1-fim3-1, seen frequently in Poland in the early pertussis vaccination period but not found after 1995. Generally, among B. pertussis isolates from the period 2000-2013, two genotypes predominated, ptxA1-ptxC1-ptxP1-prn1-tcfA2-fim2-2-fim3-1 and ptxA1-ptxC1-ptxP1-prn2-tcfA2-fim2-1-fim3-1, with frequencies of 45% and 32.5%, respectively. The isolates harbouring ptxA1-ptxC2-ptxP3-prn2-tcfA2-fim2-1-fim3-2 and ptxA1-ptxC2-ptxP3-prn2-tcfA2-fim2-1-fim3-1 profiles, currently highly prevalent within other European Union (EU) countries, were rarely found in Poland, as they circulated in the period 2000-2013 with frequencies of 10% and 5%, respectively. We hypothesise that several previous changes of strain composition in whole-cell pertussis vaccine produced locally and used since 1960 in Poland resulted in a more diverse immune pressure in the population, resulting in different prevalence of alleles compared to elsewhere.

  16. Sequence and expression variations in 23 genes involved in mitochondrial and non-mitochondrial apoptotic pathways and risk of oral leukoplakia and cancer.

    Science.gov (United States)

    Datta, Sayantan; Ray, Anindita; Singh, Richa; Mondal, Pinaki; Basu, Analabha; De Sarkar, Navonil; Majumder, Mousumi; Maiti, Guruparasad; Baral, Aradhita; Jha, Ganga Nath; Mukhopadhyay, Indranil; Panda, Chinmay; Chowdhury, Shantanu; Ghosh, Saurabh; Roychoudhury, Susanta; Roy, Bidyut

    2015-11-01

    Oral cancer is usually preceded by a pre-cancerous lesion and is related to tobacco abuse. Tobacco carcinogens damage DNA, and cells harboring such damaged DNA normally undergo apoptotic death, but cancer cells are exceptionally resistant to apoptosis. Here we studied the association between sequence and expression variations in apoptotic pathway genes and risk of oral cancer and precancer. Ninety-nine tag SNPs in 23 genes, involved in mitochondrial and non-mitochondrial apoptotic pathways, were genotyped in 525 cancer and 253 leukoplakia patients and 538 healthy controls using the Illumina Golden Gate assay. Six SNPs (rs1473418 at BCL2; rs1950252 at BCL2L2; rs8190315 at BID; rs511044 at CASP1; rs2227310 at CASP7 and rs13010627 at CASP10) significantly modified risk of oral cancer, but only SNPs at BCL2, CASP1 and CASP10 modulated risk of leukoplakia. Combinations of SNPs showed a steep increase in risk of cancer with increase in the "effective" number of risk alleles. In silico analysis of a published data set and our unpublished RNAseq data suggests that change in expression of BID and CASP7 may have affected risk of cancer. In conclusion, three SNPs, rs1473418 in BCL2, rs1950252 in BCL2L2 and rs511044 in CASP1, are implicated for the first time in oral cancer. Since SNPs at BCL2, CASP1 and CASP10 modulated risk of both leukoplakia and cancer, they should be studied in more detail as possible biomarkers in the transition of leukoplakia to cancer. This study also implies the importance of mitochondrial apoptotic pathway genes (such as BCL2) in progression of leukoplakia to oral cancer.

  17. Analysis and validation of genome-specific DNA variations in 5' flanking conserved sequences of wheat low-molecular-weight glutenin subunit genes

    Institute of Scientific and Technical Information of China (English)

    LONG Hai; WEI Yuming

    2006-01-01

    The thirty-three 5' flanking conserved sequences of the known low-molecular-weight glutenin subunit (LMW-GS) genes have been divided into eight clusters, which was in agreement with the classification based on the deduced N-terminal protein sequences. The DNA polymorphism between the eight clusters was obtained by sequence alignment, and a total of 34 polymorphic positions were observed in the approximately 200 bp regions, among which 18 polymorphic positions were candidate SNPs. Seven cluster-specific primer sets were designed for the seven out of eight clusters containing cluster-specific bases, and these were used with the genomic DNA of the ditelosomic lines of group 1 chromosomes of the wheat variety 'Chinese Spring' to carry out chromosome assignment. The subsequent cloning and DNA sequencing of PCR fragments validated the sequence specificity of the 5' flanking conserved sequences between LMW-GS gene groups in different genomes. These results suggested that the coding and 5' flanking regions of LMW-GS genes are likely to have evolved in a concerted fashion. The seven primer sets developed in this study could be used to isolate the complete ORFs of seven groups of LMW-GS genes, respectively, and therefore possess great value for further research on the contributions of a single LMW-GS gene to wheat quality in the complex genetic background and for the efficient selection of quality-related components in breeding programs.

  18. Identification of Sequence Variation in the Apolipoprotein A2 Gene and Their Relationship with Serum High-Density Lipoprotein Cholesterol Levels

    Science.gov (United States)

    Bandarian, Fatemeh; Daneshpour, Maryam Sadat; Hedayati, Mehdi; Naseri, Mohsen; Azizi, Fereidoun

    2016-01-01

    Background: Apolipoprotein A2 (APOA2) is the second major apolipoprotein of the high-density lipoprotein cholesterol (HDL-C). The study aim was to identify APOA2 gene variation in individuals within two extreme tails of HDL-C levels and its relationship with HDL-C level. Methods: This cross-sectional survey was conducted on participants from the Tehran Lipid and Glucose Study (TLGS) at the Research Institute for Endocrine Sciences, Tehran, Iran from April 2012 to February 2013. In total, 79 individuals with extreme low HDL-C levels (≤5th percentile for age and gender) and 63 individuals with extreme high HDL-C levels (≥95th percentile for age and gender) were selected. Variants were identified using DNA amplification and direct sequencing. Results: Screening of all exons and the core promoter region of the APOA2 gene identified nine single nucleotide substitutions and one microsatellite; five of which were known and four were new variants. Of these nine variants, two were common tag single nucleotide polymorphisms (SNPs) and seven were rare SNPs. Both exonic substitutions were missense mutations and caused an amino acid change. There was a significant association between the new missense mutation (variant Chr.1:16119226, Ala98Pro) and HDL-C level. Conclusion: Neither of the two common tag SNPs, rs6413453 and rs5082, contributes to the HDL-C trait in the Iranian population, but a new missense mutation in APOA2 in our population has a significant association with HDL-C. PMID:26590203

  19. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Full Text Available Abstract. Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  20. Structure, variation and expression analysis of glutenin gene promoters from Triticum aestivum cultivar Chinese Spring shows the distal region of promoter 1Bx7 is key regulatory sequence.

    Science.gov (United States)

    Wang, Kai; Zhang, Xue; Zhao, Ying; Chen, Fanguo; Xia, Guangmin

    2013-09-25

    In this study, ten glutenin gene promoters were isolated from model wheat (Triticum aestivum L. cv. Chinese Spring) using a genomic PCR strategy with gene-specific primers. Six belonged to high-molecular-weight glutenin subunit (HMW-GS) gene promoters, and four to low-molecular-weight glutenin subunit (LMW-GS). Sequence lengths varied from 1,361 to 2,554 bp. We show that the glutenin gene promoter motifs are conserved in diverse sequences in this study, with HMW-GS and LMW-GS gene promoters characterized by distinct conserved motif combinations. Our findings show that HMW-GS promoters contain more functional motifs in the distal region of the glutenin gene promoter (> -700 bp) compared with LMW-GS. The y-type HMW-GS gene promoters possess unique motifs including RY repeat and as-2 box compared to the x-type. We also identified important motifs in the distal region of HMW-GS gene promoters including the 5'-UTR Py-rich stretch motif and the as-2 box motif. We found that cis-acting elements in the distal region of promoter 1Bx7 enhanced the expression of HMW-GS gene 1Bx7. Taken together, these data support efforts in designing molecular breeding strategies aiming to improve wheat quality. Our results offer insight into the regulatory mechanisms of glutenin gene expression.

  1. Genetic variation in genes affecting milk composition and quality

    DEFF Research Database (Denmark)

    Bertelsen, Henriette Pasgaard

    In the past decade major advances in next generation sequencing technologies have provided new opportunities for the detection of genetic variation. Combining the knowledge of genetic variation with phenotypic distributions provides considerable possibilities for detection of candidate genes....... In addition, exploring genetic variation related to the major milk proteins of bovine milk identified genetic variations with positive effects on milk coagulation...

  2. Sequence Variation of COI Gene of Dendrolimus tabulaeformis in Different Types of Stand

    Institute of Scientific and Technical Information of China (English)

    夏明瑞; 周国娜; 高宝嘉

    2012-01-01

    In order to clarify the effect of different stand types on the genetic structure of Dendrolimus tabulaeformis, we determined a 554 bp segment of the mitochondrial cytochrome oxidase subunit (COI) gene sequence of D. tabulaeformis populations from 4 types of stands, and then analyzed the sequence variability and genetic differentiation. The results showed that in the gene sequence of 544 bp, the average contents of A, T, C and G were 30.6%, 39.4%, 15.3% and 14.7%, respectively, and the contents of A + T accounted for 70% of the total bases and were obviously higher than that of C + G, with an obvious A/T bias. In this sequence fragment, 64 nucleotide sites showed variation and the variability was 11.7%. Genetic distance of the nucleotide sequences was 0.002-0.046, indicating a low genetic variation. A dendrogram was constructed by using the NJ and MP methods, and the results showed that the genetic differentiations were correlated with the type of stands. Furthermore, the Fst coefficients of gene differentiation were between -0.128 and 0.117, and the number of migrants per generation was greater than 1 in all cases, indicating that there was some degree of both genetic differentiation and gene exchange between D. tabulaeformis populations from different types of stands.
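
    The base-composition and variability figures quoted in this record are simple counts over aligned sequences. The following is a minimal Python sketch of that kind of calculation, not the authors' pipeline; the function names and the toy alignment are hypothetical and are not data from the study.

```python
from collections import Counter


def base_composition(seqs):
    """Return the percentage of A, T, C and G pooled over all sequences."""
    counts = Counter()
    for s in seqs:
        counts.update(b for b in s.upper() if b in "ATCG")
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "ATCG"}


def variable_sites(aligned):
    """Return the alignment columns in which more than one base occurs."""
    return [i for i, col in enumerate(zip(*aligned)) if len(set(col)) > 1]


# Hypothetical toy alignment of three equal-length sequences.
toy = ["ATTAC", "ATTAG", "TTTAC"]
comp = base_composition(toy)
sites = variable_sites(toy)
variability = 100.0 * len(sites) / len(toy[0])  # percent of variable columns
```

    Here `variability` is the share of alignment columns that differ among sequences, which is how a figure such as "64 variable sites" is usually turned into a percentage.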

  3. Tracing outbreaks of Streptococcus equi infection (strangles) in horses using sequence variation in the seM gene and pulsed-field gel electrophoresis.

    Science.gov (United States)

    Lindahl, Susanne; Söderlund, Robert; Frosth, Sara; Pringle, John; Båverud, Viveca; Aspán, Anna

    2011-11-21

    Strangles is a serious respiratory disease in horses caused by Streptococcus equi subspecies equi (S. equi). Transmission of the disease occurs by direct contact with an infected horse or contaminated equipment. Genetically, S. equi strains are highly homogenous and differentiation of strains has proven difficult. However, the S. equi M-protein SeM contains a variable N-terminal region and has been proposed as a target gene to distinguish between different strains of S. equi and determine the source of an outbreak. In this study, strains of S. equi (n=60) from 32 strangles outbreaks in Sweden during 1998-2003 and 2008-2009 were genetically characterized by sequencing the SeM protein gene (seM), and by pulsed-field gel electrophoresis (PFGE). Swedish strains belonged to 10 different seM types, of which five have not previously been described. Most were identical or highly similar to allele types from strangles outbreaks in the UK. Outbreaks in 2008/2009 sharing the same seM type were associated by geographic location and/or type of usage of the horses (racing stables). Sequencing of the seM gene generally agreed with pulsed-field gel electrophoresis profiles. Our data suggest that seM sequencing as an epidemiological tool is supported by the agreement between seM and PFGE and that sequencing of the SeM protein gene is more sensitive than PFGE in discriminating strains of S. equi.

  4. Recombination Rate Variation Modulates Gene Sequence Evolution Mainly via GC-Biased Gene Conversion, Not Hill-Robertson Interference, in an Avian System.

    Science.gov (United States)

    Bolívar, Paulina; Mugal, Carina F; Nater, Alexander; Ellegren, Hans

    2016-01-01

    The ratio of nonsynonymous to synonymous substitution rates (ω) is often used to measure the strength of natural selection. However, ω may be influenced by linkage among different targets of selection, that is, Hill-Robertson interference (HRI), which reduces the efficacy of selection. Recombination modulates the extent of HRI but may also affect ω by means of GC-biased gene conversion (gBGC), a process leading to a preferential fixation of G:C ("strong," S) over A:T ("weak," W) alleles. As HRI and gBGC can have opposing effects on ω, it is essential to understand their relative impact to make proper inferences of ω. We used a model that separately estimated S-to-S, S-to-W, W-to-S, and W-to-W substitution rates in 8,423 avian genes in the Ficedula flycatcher lineage. We found that the W-to-S substitution rate was positively, and the S-to-W rate negatively, correlated with recombination rate, in accordance with gBGC but not predicted by HRI. The W-to-S rate further showed the strongest impact on both dN and dS. However, since the effects were stronger at 4-fold than at 0-fold degenerated sites, likely because the GC content of these sites is farther away from its equilibrium, ω slightly decreases with increasing recombination rate, which could falsely be interpreted as a consequence of HRI. We corroborated this hypothesis analytically and demonstrate that under particular conditions, ω can decrease with increasing recombination rate. Analyses of the site-frequency spectrum showed that W-to-S mutations were skewed toward high, and S-to-W mutations toward low, frequencies, consistent with a prevalent gBGC-driven fixation bias.
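
    The four substitution categories named in this record follow directly from whether each base is "strong" (G or C) or "weak" (A or T). As an illustration only, a short Python sketch that labels and counts single-base changes is given here; it does not estimate substitution rates as the authors' model does, and the function names and toy data are hypothetical.

```python
STRONG = set("GC")  # G:C pairs ("strong", S)
WEAK = set("AT")    # A:T pairs ("weak", W)


def substitution_class(ancestral, derived):
    """Label a single-base change as S-to-S, S-to-W, W-to-S or W-to-W."""
    a, d = ancestral.upper(), derived.upper()
    valid = STRONG | WEAK
    if a == d or a not in valid or d not in valid:
        raise ValueError("expected two different bases from A, C, G, T")
    src = "S" if a in STRONG else "W"
    dst = "S" if d in STRONG else "W"
    return f"{src}-to-{dst}"


def count_classes(changes):
    """Count (ancestral, derived) pairs per substitution category."""
    counts = {"S-to-S": 0, "S-to-W": 0, "W-to-S": 0, "W-to-W": 0}
    for anc, der in changes:
        counts[substitution_class(anc, der)] += 1
    return counts


# Hypothetical list of inferred changes (ancestral base, derived base).
toy_changes = [("A", "G"), ("T", "C"), ("G", "A"), ("C", "G"), ("A", "T")]
toy_counts = count_classes(toy_changes)
```

    Under gBGC, the W-to-S category is the one expected to be favored as recombination rate increases.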

  5. A sequence-based variation map of zebrafish.

    Science.gov (United States)

    Patowary, Ashok; Purkanti, Ramya; Singh, Meghna; Chauhan, Rajendra; Singh, Angom Ramcharan; Swarnkar, Mohit; Singh, Naresh; Pandey, Vikas; Torroja, Carlos; Clark, Matthew D; Kocher, Jean-Pierre; Clark, Karl J; Stemple, Derek L; Klee, Eric W; Ekker, Stephen C; Scaria, Vinod; Sivasubbu, Sridhar

    2013-03-01

    Zebrafish (Danio rerio) is a popular vertebrate model organism largely deployed using outbred laboratory animals. The nonisogenic nature of the zebrafish as a model system offers the opportunity to understand natural variations and their effect in modulating phenotype. In an effort to better characterize the range of natural variation in this model system and to complement the zebrafish reference genome project, the whole genome sequence of a wild zebrafish at 39-fold genome coverage was determined. Comparative analysis with the zebrafish reference genome revealed approximately 5.2 million single nucleotide variations and over 1.6 million insertion-deletion variations. This dataset thus represents a new catalog of genetic variations in the zebrafish genome. Further analysis revealed selective enrichment for variations in genes involved in immune function and response to the environment, suggesting genome-level adaptations to environmental niches. We also show that human disease gene orthologs in the sequenced wild zebrafish genome show a lower ratio of nonsynonymous to synonymous single nucleotide variations.

  6. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    Science.gov (United States)

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advanced third-stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae of the typical type and of three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  7. Sequence variations in the 5' flanking and IVS-II regions of the G gamma- and A gamma-globin genes of beta S chromosomes with five different haplotypes.

    Science.gov (United States)

    Lanclos, K D; Oner, C; Dimovski, A J; Gu, Y C; Huisman, T H

    1991-06-01

    We have amplified and sequenced the 5' flanking and the second intervening sequence (IVS-II) regions of both the G gamma- and A gamma-globin genes of the beta S chromosomes from sickle cell anemia (SS) patients with homozygosities for five different haplotypes. The sequencing data, compared with previously published sequences for the normal chromosomes A and B, show many similarities to chromosome B for haplotypes 19, 20, and 17, while haplotypes 3 and 31 are remarkably similar to chromosome A and also similar to each other. Several unique mutations were found in the 5' flanking regions (G gamma and A gamma) of haplotypes 19 and 20 and in the IVS-II segments of the same genes of haplotypes 19, 20, and 17; the IVS-II of haplotypes 3 and 31 were identical to those of chromosome A. Dot-blot analyses of amplified DNA from additional SS patients with specific probes have confirmed that these mutations are unique for each haplotype. The two general patterns that have been observed among the five haplotypes have most probably arisen by gene conversion events between the A and B type chromosomes in the African population. These patterns correlate with high and low fetal hemoglobin expression, and it is speculated that these and other yet unknown gene conversions may contribute to the variations in hemoglobin F and G gamma levels observed among SS patients. In vitro expression experiments involving the approximately 1.3-kb 5' flanking regions of the G gamma- and A gamma-globin genes of the beta S chromosomes with the five different haplotypes failed to detect differences between the levels of expression, suggesting that the sequence variations observed between these segments of DNA are not the primary cause of the differences in hemoglobin F levels among the SS patients.

  8. Genomic variation in Salmonella enterica core genes for epidemiological typing

    Directory of Open Access Journals (Sweden)

    Leekitcharoenphon Pimlapas

    2012-03-01

    Full Text Available. Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over time. The core genes--the genes that are conserved in all (or most) members of a genus or species--are potentially good candidates for investigating genomic variation in phylogeny and epidemiology. Results: We identify a set of 2,882 core gene clusters based on 73 publicly available Salmonella enterica genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher confidence. The core genes can be divided into two categories: a few highly variable genes and a larger set of conserved core genes, with low variance. For the most variable core genes, the variance in amino acid sequences is higher than for the corresponding nucleotide sequences, suggesting that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important especially in trend analysis.
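
    The notion of core genes used in this record (genes conserved in all, or most, genomes) can be sketched as a presence count over gene clusters. The Python below is an illustrative sketch only: the `min_fraction` threshold, the function name and the toy cluster identifiers are assumptions, and the record does not say how the authors defined "most" or how clusters were built.

```python
def core_genes(genomes, min_fraction=1.0):
    """Return gene clusters present in at least min_fraction of the genomes.

    genomes maps a genome name to the set of gene-cluster IDs found in it.
    min_fraction=1.0 gives the strict core (present in every genome).
    """
    n = len(genomes)
    counts = {}
    for clusters in genomes.values():
        for cluster in set(clusters):
            counts[cluster] = counts.get(cluster, 0) + 1
    return {c for c, k in counts.items() if k / n >= min_fraction}


# Hypothetical presence/absence data for three genomes.
toy = {
    "genome_1": {"c1", "c2", "c3"},
    "genome_2": {"c1", "c2", "c4"},
    "genome_3": {"c1", "c3", "c4"},
}
strict_core = core_genes(toy)                       # in all genomes
relaxed_core = core_genes(toy, min_fraction=2 / 3)  # in at least 2 of 3
```

    Relaxing the threshold enlarges the core set, which is the trade-off behind the "all (or most)" wording.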

  9. Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: mollusca)

    Science.gov (United States)

    Hoy, Marshal S.; Rodriguez, Rusty J.

    2013-01-01

    Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.                   

  10. Variation in seed fatty acid composition and sequence divergence in the FAD2 gene coding region between wild and cultivated sesame.

    Science.gov (United States)

    Chen, Zhenbang; Tonnis, Brandon; Morris, Brad; Wang, Richard B; Zhang, Amy L; Pinnow, David; Wang, Ming Li

    2014-12-03

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examination of fatty acid composition. The coding region of the FAD2 gene for fatty acid desaturase (FAD) in these accessions was also sequenced. Cultivated sesame accessions flowered and matured earlier than the wild species. The cultivated sesame seeds contained a significantly higher percentage of oleic acid (40.4%) than the seeds of the wild species (26.1%). Nucleotide polymorphisms were identified in the FAD2 gene coding region between wild and cultivated species. Some nucleotide polymorphisms led to amino acid changes, one of which was located in the enzyme active site and may contribute to the altered fatty acid composition. Based on the morphology observation, chemical analysis, and sequence analysis, it was determined that two accessions were misnamed and need to be reclassified. The results obtained from this study are useful for sesame improvement in molecular breeding programs.

  11. Sequence variation of the glycoprotein gene identifies three distinct lineages within field isolates of viral hemorrhagic septicemia virus, a fish rhabdovirus

    Science.gov (United States)

    Benmansour, A.; Bascuro, B.; Monnier, A.F.; Vende, P.; Winton, J.R.; de Kinkelin, P.

    1997-01-01

    To evaluate the genetic diversity of viral haemorrhagic septicaemia virus (VHSV), the sequences of the glycoprotein genes (G) of 11 North American and European isolates were determined. Comparison with the G protein of representative members of the family Rhabdoviridae suggested that VHSV was a different virus species from infectious haematopoietic necrosis virus (IHNV) and Hirame rhabdovirus (HIRRV). At a higher taxonomic level, VHSV, IHNV and HIRRV formed a group which was genetically closest to the genus Lyssavirus. Compared with each other, the G genes of VHSV displayed a dissimilar overall genetic diversity which correlated with differences in geographical origin. The multiple sequence alignment of the complete G protein showed that the divergent positions were not uniformly distributed along the sequence. A central region (amino acid position 245-300) accumulated substitutions and appeared to be highly variable. The genetic heterogeneity within a single isolate was high, with an apparent internal mutation frequency of 1.2 x 10^-3 per nucleotide site, attesting the quasispecies nature of the viral population. The phylogeny separated VHSV strains according to the major geographical area of isolation: genotype I for continental Europe, genotype II for the British Isles, and genotype III for North America. Isolates from continental Europe exhibited the highest genetic variability, with sub-groups correlated partially with the serological classification. Neither neutralizing polyclonal sera nor monoclonal antibodies were able to discriminate between the genotypes. The overall structure of the phylogenetic tree suggests that VHSV genetic diversity and evolution fit within the model of random change and positive selection operating on quasispecies.

  12. Dynamics of Lewis b binding and sequence variation of the babA adhesin gene during chronic Helicobacter pylori infection in humans.

    Science.gov (United States)

    Nell, Sandra; Kennemann, Lynn; Schwarz, Sandra; Josenhans, Christine; Suerbaum, Sebastian

    2014-12-16

    Helicobacter pylori undergoes rapid microevolution during chronic infection, but very little is known about how this affects host interaction factors. The best-studied adhesin of H. pylori is BabA, which mediates binding to the blood group antigen Lewis b [Le(b)]. To study the dynamics of Le(b) adherence during human infection, we analyzed paired H. pylori isolates obtained sequentially from chronically infected individuals. A complete loss or significant reduction of Le(b) binding was observed in strains from 5 out of 23 individuals, indicating that the Le(b) binding phenotype is quite stable during chronic human infection. Sequence comparisons of babA identified differences due to mutation and/or recombination in 12 out of 16 strain pairs analyzed. Most amino acid changes were found in the putative N-terminal extracellular adhesion domain. One strain pair that had changed from a Le(b) binding to a nonbinding phenotype was used to study the role of distinct sequence changes in Le(b) binding. By transformations of the nonbinding strain with a babA gene amplified from the binding strain, H. pylori strains with mosaic babA genes were generated. Recombinants were enriched for a gain of Le(b) binding by biopanning or for BabA expression on the bacterial surface by pulldown assay. With this approach, we identified several amino acid residues affecting the strength of Le(b) binding. Additionally, the data showed that the C terminus of BabA, which is predicted to encode an outer membrane β-barrel domain, plays an essential role in the biogenesis of this protein. Helicobacter pylori causes a chronic infection of the human stomach that can lead to ulcers and cancer. The bacterium can bind to gastric epithelial cells with specialized outer membrane proteins. The best-studied protein is the BabA adhesin which binds to the Lewis b blood group antigen. Since H. pylori is a bacterium with very high genetic variability, we asked whether babA evolves during chronic infection and

  13. SEQUENCE POLYMORPHISMS OF FOUR CHLOROPLAST GENES IN FOUR ACACIA SPECIES

    Directory of Open Access Journals (Sweden)

    Anthonius Y.P.B.C. Widyatmoko

    2011-06-01

    Full Text Available. Sequence polymorphisms among and within four Acacia species, A. aulacocarpa, A. auriculiformis, A. crassicarpa, and A. mangium, were investigated using four chloroplast DNA genes (atpA, petA, rbcL, and rpoA). The phylogenetic relationship among these species is discussed in light of the results of the sequence information. No intraspecific sequence variation was found in the four genes of the four species, and a conservative rate of mutation of the chloroplast DNA genes was also confirmed in the Acacia species. In the atpA and petA of the four genes, all four species possessed identical sequences, and no sequence variation was found among the four Acacia species. In the rbcL and rpoA genes, however, sequence polymorphisms were revealed among these species. Acacia aulacocarpa and A. crassicarpa shared an identical sequence, and A. auriculiformis and A. mangium also showed no sequence variation. The fact that A. mangium and A. auriculiformis shared identical sequences, as did A. aulacocarpa and A. crassicarpa, indicated that the two respective species were extremely closely related. Although a putative natural hybrid of A. aulacocarpa and A. auriculiformis has been reported, our results suggested that natural hybridization should be further verified using molecular markers.

  14. Variations in gut microbiota and fecal metabolic phenotype associated with depression by 16S rRNA gene sequencing and LC/MS-based metabolomics.

    Science.gov (United States)

    Yu, Meng; Jia, Hongmei; Zhou, Chao; Yang, Yong; Zhao, Yang; Yang, Maohua; Zou, Zhongmei

    2017-05-10

    As a prevalent, life-threatening and highly recurrent psychiatric illness, depression is characterized by a wide range of pathological changes; however, its etiology remains incompletely understood. Accumulating evidence supports that gut microbiota affects not only gastrointestinal physiology but also central nervous system (CNS) function and behavior through the microbiota-gut-brain axis. To assess the impact of gut microbiota on fecal metabolic phenotype in depressive conditions, an integrated approach of 16S rRNA gene sequencing combined with ultra high-performance liquid chromatography-mass spectrometry (UHPLC-MS) based metabolomics was performed in chronic variable stress (CVS)-induced depression rat model. Interestingly, depression led to significant gut microbiota changes, at the phylum and genus levels in rats treated with CVS compared to controls. The relative abundances of the bacterial genera Marvinbryantia, Corynebacterium, Psychrobacter, Christensenella, Lactobacillus, Peptostreptococcaceae incertae sedis, Anaerovorax, Clostridiales incertae sedis and Coprococcus were significantly decreased, whereas Candidatus Arthromitus and Oscillibacter were markedly increased in model rats compared with normal controls. Meanwhile, distinct changes in fecal metabolic phenotype of depressive rats were also found, including lower levels of amino acids, and fatty acids, and higher amounts of bile acids, hypoxanthine and stercobilins. Moreover, there were substantial associations of perturbed gut microbiota genera with the altered fecal metabolites, especially compounds involved in the metabolism of tryptophan and bile acids. These results showed that the gut microbiota was altered in association with fecal metabolism in depressive conditions. These findings suggest that the 16S rRNA gene sequencing and LC-MS based metabolomics approach can be further applied to assess pathogenesis of depression.

  15. Phylogenetic analysis, based on EPIYA repeats in the cagA gene of Indian Helicobacter pylori, and the implications of sequence variation in tyrosine phosphorylation motifs on determining the clinical outcome

    Directory of Open Access Journals (Sweden)

    Santosh K. Tiwari

    2011-01-01

    Full Text Available. The population of India harbors one of the world's most highly diverse gene pools, owing to the influx of successive waves of immigrants over regular periods in time. Several phylogenetic studies involving mitochondrial DNA and Y chromosomal variation have demonstrated Europeans to have been the first settlers in India. Nevertheless, certain controversy exists, due to the support given to the thesis that colonization was by the Austro-Asiatic group, prior to the Europeans. Thus, the aim was to investigate pre-historic colonization of India by anatomically modern humans, using conserved stretches of five-amino-acid (EPIYA) sequences in the cagA gene of Helicobacter pylori. Simultaneously, the existence of a pathogenic relationship of tyrosine phosphorylation motifs (TPMs), in 32 H. pylori strains isolated from subjects with several forms of gastric diseases, was also explored. High resolution sequence analysis of the above described genes was performed. The nucleotide sequences obtained were translated into amino acids using MEGA (version 4.0) software for EPIYA. An MJ-Network was constructed for obtaining TPM haplotypes by using NETWORK (version 4.5) software. The findings of the study suggest that Indian H. pylori strains share a common ancestry with Europeans. No specific association of haplotypes with the outcome of disease was revealed through additional network analysis of TPMs.
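
    The step of translating nucleotide sequences and locating EPIYA motifs can be illustrated with a few lines of Python. This is a hedged sketch using the standard genetic code, not a reproduction of the MEGA workflow used in the study; the function names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 0; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def find_motif(protein, motif="EPIYA"):
    """Return the 0-based start positions of every occurrence of motif."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        start = protein.find(motif, start + 1)
    return positions


# Hypothetical toy sequence encoding two EPIYA motifs separated by one lysine.
toy_dna = "GAACCTATTTATGCTAAAGAGCCAATCTACGCG"
toy_protein = translate(toy_dna)
toy_hits = find_motif(toy_protein)
```

    The number and spacing of motif hits per strain is the kind of feature that can then be compared across isolates.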

  16. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece

    2014-04-03

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima's D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp.
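
    Tajima's D, one of the metrics listed in this record, compares the mean number of pairwise differences with the number of segregating sites. The Python sketch below follows the standard published formula and is independent of SVAMP, whose implementation is in Qt/C++ and is not reproduced here; the function name and the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences.

    Returns None when the statistic is undefined (fewer than four
    sequences, or no segregating sites).
    """
    n = len(seqs)
    seg = sum(1 for col in zip(*seqs) if len(set(col)) > 1)  # segregating sites
    if n < 4 or seg == 0:
        return None
    # Mean number of pairwise differences (pi).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Hypothetical toy alignments: only rare variants vs. balanced variants.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
d_balanced = tajimas_d(["AA", "AA", "TT", "TT"])
```

    An excess of rare variants pushes D below zero, while variants at intermediate frequency push it above zero.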

  17. Genomic variation in Salmonella enterica core genes for epidemiological typing

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis

    2012-01-01

    Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...

  18. Sequence variation and genetic diversity in the giant panda

    Institute of Scientific and Technical Information of China (English)

    张亚平; Oliver A.Ryder; 范志勇; 张和明; 何廷美; 何光昕; 张安居; 费立松; 钟顺隆; 陈红; 张成林; 杨明海; 朱飞兵; 彭真信; 普天春; 陈玉村; 姚敏达; 郭伟

    1997-01-01

    About 336-444 bp of the mitochondrial D-loop region and tRNA gene were sequenced for 40 giant panda individuals collected from Mabian, Meigu, Yuexi, Baoxing, Pingwu, Qingchuan, Nanping and Baishuijiang. Nine haplotypes were found in 21 founders. The results showed that the giant panda has low genetic variation, and that there is no notable genetic isolation among geographical populations. The ancestor of the living giant panda population perhaps appeared in the late Pleistocene and, unfortunately, might have suffered population bottlenecks. Afterwards, its genetic diversity seemed to recover to some extent.

  19. Identification and sequence analysis of Tapasin gene in guinea fowl

    Directory of Open Access Journals (Sweden)

    Varuna P. Panicker

    2014-12-01

    Full Text Available. Aim: An attempt has been made to identify and study the nucleotide sequence variability in the exon 5 - exon 6 regions of the guinea fowl Tapasin gene. Materials and Methods: Blood samples were collected from randomly selected birds (12 guinea fowl birds) and the Tapasin gene was amplified using chicken-specific primers designed from GenBank submitted sequences. Polymerase chain reaction conditions were standardized so as to get only single amplicons. Obtained products were then cloned and sequenced; sequences were then analyzed using suitable software. Results: Amplicon size of the Tapasin gene in guinea fowl was the same as reported in chicken, with areas of transitions and transversions. The sequence variations reported in these coding sequences might have influence on the protein structure, which may be correlated with the increased immune status of the bird when compared with chicken breeds. Conclusion: Tapasin is an immunologically important gene that plays an important role in the immune status of the bird, so sequence variations in the gene can be correlated with the altered immune status of the bird.

  20. Targeted sequencing of cancer-related genes in colorectal cancer using next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Sae-Won Han

    Full Text Available Recent advances in sequencing technology have enabled comprehensive profiling of genetic alterations in cancer. We have established a targeted sequencing platform using next-generation sequencing (NGS) technology for clinical use, which can provide mutation and copy number variation data. NGS was performed with a paired-end library enriched with exons of 183 cancer-related genes. Normal and tumor tissue pairs of 60 colorectal adenocarcinomas were used to test feasibility. Somatic mutation and copy number alteration were analyzed. A total of 526 somatic non-synonymous sequence variations were found in 113 genes. Among these, 278 single nucleotide variations (SNVs) represented 232 different somatic point mutations, 216 SNVs corresponded to 79 known single nucleotide polymorphisms in dbSNP, and 32 indels represented 28 different indel mutations. The median number of mutated genes per tumor was 4 (range 0-23). Copy number gain (>×2 fold) was found in 65 genes in 40 patients, whereas copy number loss (genes in 39 patients. The most frequently altered genes (mutation and/or copy number alteration) were APC in 35 patients (58%), TP53 in 34 (57%), and KRAS in 24 (40%). The altered gene list revealed the ErbB signaling pathway as the most commonly involved pathway (25 patients, 42%). A targeted sequencing platform using NGS technology is feasible for clinical use and provides comprehensive genetic alteration data.

  1. Strait of Gibraltar: an effective gene-flow barrier for wind-pollinated Carex helodes (Cyperaceae) as revealed by DNA sequences, AFLP, and cytogenetic variation

    National Research Council Canada - National Science Library

    Escudero, Marcial; Vargas, Pablo; Valcarcel, Virginia; Luceno, Modesto

    2008-01-01

    ...) fingerprints, and nucleotide substitutions in plastid and nuclear DNA sequences. Cytogeographic results showed that the African populations have stabilized at a single chromosome number of 2 n...

  2. Sequence variation and selection of small RNAs in domesticated rice

    Directory of Open Access Journals (Sweden)

    Cai Daguang

    2010-04-01

    Full Text Available Background: Endogenous non-coding small RNAs (21-24 nt) play an important role in post-transcriptional gene regulation in plants. Domestication selection is the most important evolutionary force in shaping crop genomes. The extent of polymorphism at small RNA loci in domesticated rice and whether small RNA loci are targets of domestication selection have not yet been determined. Results: A polymorphism survey of 94 small RNA loci (88 MIRNAs, four TAS3 loci and two miRNA-like long hairpins) was conducted in domesticated rice, generating 2 Mb of sequence data. Many mutations (substitution or insertion/deletion) were observed at small RNA loci in domesticated rice, e.g. 12 mutation sites were observed in the mature miRNA sequences of 11 MIRNAs (12.5% of the investigated MIRNAs). Several small RNA loci showed significant signals for positive selection and/or potential domestication selection. Conclusions: Sequence variation at miRNAs and other small RNAs is higher than expected in domesticated rice. Like protein-coding genes, non-coding small RNA loci could be targets of domestication selection and play an important role in rice domestication and improvement.

  3. Using multilocus sequence typing to study bacterial variation: prospects in the genomic era.

    Science.gov (United States)

    Jolley, Keith A; Maiden, Martin C J

    2014-01-01

    Multilocus sequence typing (MLST) indexes the sequence variation present in a small number (usually seven) of housekeeping gene fragments located around the bacterial genome. Unique alleles at these loci are assigned arbitrary integer identifiers, which effectively summarizes the variation present in several thousand base pairs of genome sequence information as a series of numbers. Comparing bacterial isolates using allele-based methods efficiently corrects for the effects of lateral gene transfer present in many bacterial populations and is computationally efficient. This 'gene-by-gene' approach can be applied to larger collections of loci, such as the ribosomal protein genes used in ribosomal MLST (rMLST), up to and including the complete set of coding sequences present in a genome, whole-genome MLST (wgMLST), providing scalable, efficient and readily interpreted genome analysis.
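
    The allele-numbering idea described in this record can be sketched in a few lines. The example below is only an illustration under stated assumptions: the locus names, isolate names and short sequences are invented toy data, and the helper names `build_profiles` and `allelic_distance` are hypothetical, not part of any MLST software. Each distinct sequence at a locus gets an arbitrary integer, and isolates are compared by counting loci whose allele numbers differ.

```python
from typing import Dict, List, Tuple

def build_profiles(
    isolates: Dict[str, Dict[str, str]], loci: List[str]
) -> Tuple[Dict[str, Tuple[int, ...]], Dict[str, Dict[str, int]]]:
    """Assign arbitrary integer IDs to unique alleles and build allelic profiles."""
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            seq = seqs[locus]
            if seq not in table:
                # IDs are handed out in order of first appearance.
                table[seq] = len(table) + 1
            profile.append(table[seq])
        profiles[name] = tuple(profile)
    return profiles, allele_ids

def allelic_distance(a: Tuple[int, ...], b: Tuple[int, ...]) -> int:
    """Number of loci at which two profiles carry different alleles."""
    return sum(x != y for x, y in zip(a, b))

# Toy data (hypothetical loci and sequences, for illustration only).
LOCI = ["locusA", "locusB", "locusC"]
ISOLATES = {
    "iso1": {"locusA": "ACGT", "locusB": "GGCA", "locusC": "TTAG"},
    "iso2": {"locusA": "ACGT", "locusB": "GGCT", "locusC": "TTAG"},
    "iso3": {"locusA": "TCGA", "locusB": "GGCT", "locusC": "TTAG"},
}

profiles, allele_ids = build_profiles(ISOLATES, LOCI)
for name, profile in profiles.items():
    print(name, profile)
print(allelic_distance(profiles["iso1"], profiles["iso3"]))
```

    In this toy data the locusA allele of iso3 differs from that of iso1 at two nucleotide positions, yet it contributes a single allelic difference. That is the sense in which allele-based comparison dampens the effect of events that change several bases at once.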

  4. Disease gene identification strategies for exome sequencing

    NARCIS (Netherlands)

    Gilissen, C.; Hoischen, A.; Brunner, H.G.; Veltman, J.A.

    2012-01-01

    Next generation sequencing can be used to search for Mendelian disease genes in an unbiased manner by sequencing the entire protein-coding sequence, known as the exome, or even the entire human genome. Identifying the pathogenic mutation amongst thousands to millions of genomic variants is a major c

  5. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    Science.gov (United States)

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).

  6. Variation in the Major Surface Glycoprotein Genes in Pneumocystis jirovecii

    Science.gov (United States)

    Kutty, Geetha; Maldarelli, Frank; Achaz, Guillaume; Kovacs, Joseph A.

    2008-01-01

    The genome of Pneumocystis, which causes life-threatening pneumonia in immunosuppressed patients, contains a multi-copy gene family that encodes the major surface glycoprotein (Msg). Pneumocystis can vary the expressed Msg, presumably as a mechanism to avoid host immune responses. Analysis of 24 msg gene sequences obtained from a single human Pneumocystis isolate demonstrated that the sequences segregate into two branches. Based on a number of analyses, recombination among msg genes appears to be an important mechanism for generating msg diversity. Intra-branch recombination occurred more frequently than inter-branch recombination. Restriction fragment length polymorphism analysis demonstrated substantial variation in the repertoire of the msg gene family among isolates of human Pneumocystis, which was not observed in laboratory isolates of rat or mouse Pneumocystis; this may be the result of examining outbred vs. captive populations. Increased diversity in the Msg repertoire, generated in part by recombination, increases the potential for antigenic variation in this abundant surface protein. PMID:18627244

  7. A simple method for analyzing exome sequencing data shows distinct levels of nonsynonymous variation for human immune and nervous system genes.

    Directory of Open Access Journals (Sweden)

    Jan Freudenberg

    Full Text Available To measure the strength of natural selection that acts upon single nucleotide variants (SNVs) in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs) per nonsynonymous site and synonymous SNVs (sSNVs) per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv) in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs), nervous system genes (NSGs), randomly sampled genes (RSGs), and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
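
    As a rough illustration of the two calculations this abstract describes, the sketch below computes a density ratio and fits a straight line across allele frequency bins. It rests on assumptions the record does not confirm: the correction factor f is treated as a plain multiplier supplied by the caller (the abstract does not say how f is derived), the function names `rdnsv` and `fit_line` are hypothetical, and all counts and bin values are invented toy numbers.

```python
from typing import Sequence, Tuple

def rdnsv(n_ns: int, ns_sites: float, n_s: int, s_sites: float, f: float = 1.0) -> float:
    """Relative density of nsSNVs: (nsSNVs per ns site) / (sSNVs per s site), scaled by f.

    Treating f as a multiplicative factor is an assumption made for this sketch.
    """
    return f * (n_ns / ns_sites) / (n_s / s_sites)

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, y-intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy counts: 30 nsSNVs over 3000 nonsynonymous sites, 50 sSNVs over 1000 synonymous sites.
example_ratio = rdnsv(30, 3000, 50, 1000)

# Toy rdnsv estimates per allele frequency bin (bin midpoints on the x-axis).
bin_midpoints = [0.05, 0.15, 0.25]
bin_rdnsv = [0.30, 0.25, 0.20]
slope, intercept = fit_line(bin_midpoints, bin_rdnsv)
print(example_ratio, slope, intercept)
```

    With these toy numbers the slope is negative, which is the pattern the abstract associates with purifying selection, and the intercept plays the role of the predicted rdnsv at an allele frequency close to 0.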

  8. Association of sequence variations in vitamin K epoxide reductase and gamma-glutamyl carboxylas genes with biochemical measures of vitamin K status

    Science.gov (United States)

    Genetic factors, specifically the VKORC1 and GGCX genes, have been shown to contribute to the interindividual variability in response to the vitamin K-antagonist, warfarin, which influences the dose required to achieve the desired anticoagulation response. These differences in warfarin sensitivity ...

  9. Parkinson's disease and mitochondrial gene variations

    DEFF Research Database (Denmark)

    Andalib, Sasan; Vafaee, Manouchehr Seyedi; Gjedde, Albert

    2014-01-01

    Parkinson's disease (PD) is a common disorder of the central nervous system in the elderly. The pathogenesis of PD is a complex process, with genetics as an important contributing factor. This factor may stem from mitochondrial gene variations and mutations as well as from nuclear gene variations...... and mutations. More recently, a particular role of mitochondrial dysfunction has been suggested, arising from mitochondrial DNA variations or acquired mutations in PD pathogenesis. The present review summarizes and weighs the evidence in support of mitochondrial DNA (mtDNA) variations as important contributors...

  10. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Full Text Available Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  11. Mapping of a chromosome 12 region associated with airway hyperresponsiveness in a recombinant congenic mouse strain and selection of potential candidate genes by expression and sequence variation analyses.

    Directory of Open Access Journals (Sweden)

    Cynthia Kanagaratham

    Full Text Available In a previous study we determined that BcA86 mice, a strain belonging to a panel of AcB/BcA recombinant congenic strains, have an airway responsiveness phenotype resembling mice from the airway hyperresponsive A/J strain. The majority of the BcA86 genome is, however, from the hyporesponsive C57BL/6J strain. The aim of this study was to identify candidate regions and genes associated with airway hyperresponsiveness (AHR) by quantitative trait locus (QTL) analysis using the BcA86 strain. Airway responsiveness of 205 F2 mice generated from backcrossing the BcA86 strain to the C57BL/6J strain was measured and used for QTL analysis to identify genomic regions in linkage with AHR. Consomic mice for the QTL-containing chromosomes were phenotyped to study the contribution of each chromosome to lung responsiveness. Candidate genes within the QTL were selected based on expression differences in mRNA from whole lungs, and the presence of coding non-synonymous mutations that were predicted to have a functional effect by amino acid substitution prediction tools. One QTL for AHR was identified on Chromosome 12 with its 95% confidence interval ranging from 54.6 to 82.6 Mbp and a maximum LOD score of 5.11 (p = 3.68 × 10^-3). We confirmed that the genotype of mouse Chromosome 12 is an important determinant of lung responsiveness using a Chromosome 12 substitution strain. Mice with an A/J Chromosome 12 on a C57BL/6J background have an AHR phenotype similar to hyperresponsive strains A/J and BcA86. Within the QTL, genes with deleterious coding variants, such as Foxa1, and genes with expression differences, such as Mettl21d and Snapc1, were selected as possible candidates for the AHR phenotype. Overall, through QTL analysis of a recombinant congenic strain, microarray analysis and coding variant analysis we identified Chromosome 12 and three potential candidate genes to be in linkage with airway responsiveness.

  12. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    Directory of Open Access Journals (Sweden)

    Sophia Johler

    2016-06-01

    Full Text Available Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation are still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed-harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences.

  13. Sequence variations of the pancreatic islet/liver glucose transporter (GLUT2) gene in Japanese subjects with noninsulin dependent diabetes mellitus

    Energy Technology Data Exchange (ETDEWEB)

    Matsubara, Atsushi; Tanizawa, Yukio; Matsutani, Akira [Yamaguchi Univ. School of Medicine (Japan)] [and others

    1995-10-01

    To assess the genetic susceptibility to noninsulin-dependent diabetes mellitus (NIDDM) in Japanese subjects, we investigated the role of GLUT2 gene defects in patients with NIDDM. When the allelic frequency of a simple tandem repeat polymorphism in the GLUT2 gene was compared, the allele with 155 base pairs was more common in NIDDM patients (n = 99) than in controls (n = 89; 5.1% vs. 0.6%; P = 0.0118, by Fisher's exact test), whereas this was not significant after the correction for multiple comparisons. To directly identify mutations, we then analyzed each of 11 exons by the polymerase chain reaction-single strand conformation polymorphism analysis in 60 NIDDM patients. We found 2 missense mutations in exon 3: CCC→CTC (Pro68→Leu) in 1 patient and ACT→ATT (Thr110→Ile) in 3 patients, all in the heterozygous state. These mutations were found in 60 control subjects. To evaluate the significance of the Pro68→Leu mutation, the family members of the proband were studied. The mutation did not appear to be associated with the disease or other clinical parameters including change in immunoreactive insulin/change in plasma glucose or oral glucose load. The other mutation (Thr110→Ile) is known to be functionally insignificant. We identified 4 additional nucleotide changes, all of which appeared to be silent. We concluded that the mutations in the GLUT2 gene were not major determinants of genetic susceptibility to NIDDM in Japanese. 34 refs., 2 figs., 3 tabs.
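
    The allele-frequency comparison in this record uses Fisher's exact test, which can be sketched with the standard library alone. The allele counts below are an assumption, not data from the record: they are back-calculated from the reported percentages on the premise of two alleles per subject (about 10 of 198 alleles in patients and 1 of 178 in controls). The function name `fisher_exact_two_sided` is likewise just a label for this sketch.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same margins)
    that are no more likely than the observed one.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )

# Assumed counts (inferred from 5.1% of 198 and 0.6% of 178 alleles; not stated in the record).
p_value = fisher_exact_two_sided(10, 188, 1, 177)
print(p_value)
```

    Under those assumed counts the result should fall close to the P value quoted in the abstract, which also illustrates why a single nominally significant test can lose significance once several alleles are tested and a multiple-comparison correction is applied.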

  14. Variations on strongly lacunary quasi Cauchy sequences

    Science.gov (United States)

    Kaplan, Huseyin; Cakalli, Huseyin

    2016-08-01

    We introduce a new function space, namely the space of N_θ(p)-ward continuous functions, which turns out to be a closed subspace of the space of continuous functions for each positive integer p. N_θ^α(p)-ward continuity is also introduced and investigated for any fixed 0 ... (k_{r-1}, k_r], and θ = (k_r) is a lacunary sequence, i.e. an increasing sequence of positive integers such that k_0 ≠ 0, and h_r: k_r - k_{r-1} → ∞.

  15. Phylogenetic relationships of Acheilognathidae (Cypriniformes: Cyprinoidea) as revealed from evidence of both nuclear and mitochondrial gene sequence variation: evidence for necessary taxonomic revision in the family and the identification of cryptic species.

    Science.gov (United States)

    Chang, Chia-Hao; Li, Fan; Shao, Kwang-Tsao; Lin, Yeong-Shin; Morosawa, Takahiro; Kim, Sungmin; Koo, Hyeyoung; Kim, Won; Lee, Jae-Seong; He, Shunping; Smith, Carl; Reichard, Martin; Miya, Masaki; Sado, Tetsuya; Uehara, Kazuhiko; Lavoué, Sébastien; Chen, Wei-Jen; Mayden, Richard L

    2014-12-01

    Bitterlings are relatively small cypriniform species and extremely interesting evolutionarily due to their unusual reproductive behaviors and their coevolutionary relationships with freshwater mussels. As a group, they have attracted a great deal of attention in biological studies. Understanding the origin and evolution of their mating system demands a well-corroborated hypothesis of their evolutionary relationships. In this study, we provide the most comprehensive phylogenetic reconstruction of species relationships of the group based on partitioned maximum likelihood and Bayesian methods using DNA sequence variation of nuclear and mitochondrial genes on 41 species, several subspecies and three undescribed species. Our findings support the monophyly of the Acheilognathidae. Two of the three currently recognized genera are not monophyletic and the family can be subdivided into six clades. These clades are further regarded as genera based on both their phylogenetic relationships and a reappraisal of morphological characters. We present a revised classification for the Acheilognathidae with five genera/lineages: Rhodeus, Acheilognathus (new constitution), Tanakia (new constitution), Paratanakia gen. nov., and Pseudorhodeus gen. nov. and an unnamed clade containing five species currently referred to as "Acheilognathus". Gene trees of several bitterling species indicate that the taxa are not monophyletic. This result highlights a potentially dramatic underestimation of species diversity in this family. Using our new phylogenetic framework, we discuss the evolution of the Acheilognathidae relative to classification, taxonomy and biogeography. Copyright © 2014 Elsevier Inc. All rights reserved.

  16. GENE SEQUENCE HOMOLOGY OF CHEMOKINES ACROSS SPECIES

    Science.gov (United States)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-react...

  17. Algorithm of detecting structural variations in DNA sequences

    Science.gov (United States)

    Nałecz-Charkiewicz, Katarzyna; Nowak, Robert

    2014-11-01

    Whole genome sequencing enables the use of the longest common subsequence algorithm to detect genetic structure variations. We propose to search for the positions of short unique fragments, genetic markers, to achieve acceptable time and space complexity. The markers are generated by algorithms searching the genetic sequence or its Fourier transformation. The presented methods are checked on structural variations generated in silico on bacterial genomes, giving comparable or better results than other solutions.
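
    To make the role of the longest common subsequence (LCS) concrete, here is a minimal sketch of the textbook dynamic-programming LCS applied to ordered lists of marker identifiers. It is not the authors' implementation: the marker names and the idea of flagging markers that fall outside the LCS as candidate rearrangements are assumptions made for illustration.

```python
from typing import List, Sequence

def longest_common_subsequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Classic O(len(a) * len(b)) dynamic-programming LCS with traceback."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover one LCS.
    result: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return result

# Hypothetical marker order in a reference genome and in a sample where m4 has moved.
reference_markers = ["m1", "m2", "m3", "m4", "m5"]
sample_markers = ["m1", "m4", "m2", "m3", "m5"]

collinear = longest_common_subsequence(reference_markers, sample_markers)
displaced = [m for m in reference_markers if m not in collinear]
print(collinear, displaced)
```

    Working on a short list of marker positions instead of every base is what keeps the quadratic table small, which is one way to read the abstract's remark about acceptable time and space complexity.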

  18. ChickVD: a sequence variation database for the chicken genome

    DEFF Research Database (Denmark)

    Wang, Jing; He, Ximiao; Ruan, Jue

    2005-01-01

    Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DNA...... from domestic breeds. Using the Red Jungle Fowl genome sequence as a reference, we identified 3.1 million non-redundant DNA sequence variants. To facilitate the application of our data to avian genetics and to provide a foundation for functional and evolutionary studies, we created the 'Chicken...... Variation Database' (ChickVD). A graphical MapView shows variants mapped onto the chicken genome in the context of gene annotations and other features, including genetic markers, trait loci, cDNAs, chicken orthologs of human disease genes and raw sequence traces. ChickVD also stores information...

  19. Network of tRNA Gene Sequences

    Institute of Scientific and Technical Information of China (English)

    WEI Fang-ping; LI Sheng; MA Hong-ru

    2008-01-01

    A network of 3719 tRNA gene sequences was constructed using simplest alignment. Its topology, degree distribution and clustering coefficient were studied. The behaviors of the network shift from fluctuated distribution to scale-free distribution when the similarity degree of the tRNA gene sequences increases. The tRNA gene sequences with the same anticodon identity are more self-organized than those with different anticodon identities and form local clusters in the network. Some vertices of the local cluster have a high connection with other local clusters, and the probable reason was given. Moreover, a network constructed by the same number of random tRNA sequences was used to make comparisons. The relationships between the properties of the tRNA similarity network and the characters of tRNA evolutionary history were discussed.
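
    The kind of similarity network described here can be sketched as follows. This is a toy under explicit assumptions: the record does not define its "simplest alignment", so position-by-position identity of equal-length strings stands in for it, and the sequences, thresholds and function names (`identity`, `build_network`, `clustering_coefficient`) are invented for illustration.

```python
from itertools import combinations
from typing import Dict, Set

def identity(a: str, b: str) -> float:
    """Fraction of matching positions (a stand-in for the unspecified alignment score)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def build_network(seqs: Dict[str, str], threshold: float) -> Dict[str, Set[str]]:
    """Connect two sequences when their identity reaches the threshold."""
    adjacency: Dict[str, Set[str]] = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency

def clustering_coefficient(adjacency: Dict[str, Set[str]], node: str) -> float:
    """Share of a node's neighbour pairs that are themselves connected."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

# Toy sequences (not real tRNA genes).
SEQS = {
    "a": "ACGTACGT",
    "b": "ACGTACGA",
    "c": "ACGTACCA",
    "d": "TTTTTTTT",
}

loose = build_network(SEQS, threshold=0.75)
strict = build_network(SEQS, threshold=0.85)
print({name: len(nbrs) for name, nbrs in loose.items()})
print(clustering_coefficient(loose, "a"), clustering_coefficient(strict, "b"))
```

    Raising the threshold removes the weaker a-c link in this toy data, so the triangle opens and the clustering coefficient drops, a small-scale version of how network properties shift with the chosen similarity degree.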

  20. Targeted Sequencing Reveals Large-Scale Sequence Polymorphism in Maize Candidate Genes for Biomass Production and Composition.

    Directory of Open Access Journals (Sweden)

    Moses M Muraya

    Full Text Available A major goal of maize genomic research is to identify sequence polymorphisms responsible for phenotypic variation in traits of economic importance. Large-scale detection of sequence variation is critical for linking genes, or genomic regions, to phenotypes. However, due to its size and complexity, it remains expensive to generate whole genome sequences of sufficient coverage for divergent maize lines, even with access to next generation sequencing (NGS) technology. Because methods involving reduction of genome complexity, such as genotyping-by-sequencing (GBS), assess only a limited fraction of sequence variation, targeted sequencing of selected genomic loci offers an attractive alternative. We therefore designed a sequence capture assay to target 29 Mb genomic regions and surveyed a total of 4,648 genes possibly affecting biomass production in 21 diverse inbred maize lines (7 flints, 14 dents). Captured and enriched genomic DNA was sequenced using the 454 NGS platform to 19.6-fold average depth coverage, and a broad evaluation of read alignment and variant calling methods was performed to select optimal procedures for variant discovery. Sequence alignment with the B73 reference and de novo assembly identified 383,145 putative single nucleotide polymorphisms (SNPs), of which 42,685 were non-synonymous alterations and 7,139 caused frameshifts. Presence/absence variation (PAV) of genes was also detected. We found that substantial sequence variation exists among genomic regions targeted in this study, which was particularly evident within coding regions. This diversification has the potential to broaden functional diversity and generate phenotypic variation that may lead to new adaptations and the modification of important agronomic traits. Further, annotated SNPs identified here will serve as useful genetic tools and as candidates in searches for phenotype-altering DNA variation. 
In summary, we demonstrated that sequencing of captured DNA is a powerful

  1. Variational multi-valued velocity field estimation for transparent sequences

    DEFF Research Database (Denmark)

    Ramirez-Manzanares, Alonso; Rivera, Mariano; Kornprobst, Pierre;

    2011-01-01

    Motion estimation in sequences with transparencies is an important problem in robotics and medical imaging applications. In this work we propose a variational approach for estimating multi-valued velocity fields in transparent sequences. Starting from existing local motion estimators, we derive a...

  2. Mitochondrial DNA sequence variation in Greeks.

    Science.gov (United States)

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with those for other European or neighboring populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.

  3. Using chaos to generate variations on movement sequences

    Science.gov (United States)

    Bradley, Elizabeth; Stuart, Joshua

    1998-12-01

    We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.

  4. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    Energy Technology Data Exchange (ETDEWEB)

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-12-31

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than in the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  5. Molecular phylogenetics of the family Cyprinidae (Actinopterygii: Cypriniformes) as evidenced by sequence variation in the first intron of S7 ribosomal protein-coding gene: further evidence from a nuclear gene of the systematic chaos in the family.

    Science.gov (United States)

    He, Shunping; Mayden, Richard L; Wang, Xuzheng; Wang, Wei; Tang, Kevin L; Chen, Wei-Jen; Chen, Yiyu

    2008-03-01

    The family Cyprinidae is the largest freshwater fish group in the world, including over 200 genera and 2100 species. The phylogenetic relationships of major clades within this family are simply poorly understood, largely because of the overwhelming diversity of the group; however, several investigators have advanced different hypotheses of relationships that pre- and post-date the use of shared-derived characters as advocated through phylogenetic systematics. As expected, most previous investigations used morphological characters. Recently, mitochondrial DNA (mtDNA) sequences and combined morphological and mtDNA investigations have been used to explore and advance our understanding of species relationships and test monophyletic groupings. Limitations of these studies include limited taxon sampling and a strict reliance upon maternally inherited mtDNA variation. The present study is the first endeavor to recover the phylogenetic relationships of the 12 previously recognized monophyletic subfamilies within the Cyprinidae using newly sequenced nuclear DNA (nDNA) for over 50 species representing members of the different previously hypothesized subfamily and family groupings within the Cyprinidae and from other cypriniform families as outgroup taxa. Hypothesized phylogenetic relationships are constructed using maximum parsimony and Bayesian analyses of 1042 sites, of which 971 sites were variable and 790 were phylogenetically informative. Using other appropriate cypriniform taxa of the families Catostomidae (Myxocyprinus asiaticus), Gyrinocheilidae (Gyrinocheilus aymonieri), and Balitoridae (Nemacheilus sp. and Beaufortia kweichowensis) as outgroups, the Cyprinidae is resolved as a monophyletic group. Within the family the genera Raiamas, Barilius, Danio, and Rasbora, representing many of the tropical cyprinids, represent basal members of the family. 
All other species can be classified into variably supported and resolved monophyletic lineages, depending upon analysis

  6. Nemertean toxin genes revealed through transcriptome sequencing.

    Science.gov (United States)

    Whelan, Nathan V; Kocot, Kevin M; Santos, Scott R; Halanych, Kenneth M

    2014-11-27

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63-74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Forward Genetics by Sequencing EMS Variation-Induced Inbred Lines

    Directory of Open Access Journals (Sweden)

    Charles Addo-Quaye

    2017-02-01

    In order to leverage novel sequencing techniques for cloning genes in eukaryotic organisms with complex genomes, the false positive rate of variant discovery must be controlled for by experimental design and informatics. We sequenced five lines from three pedigrees of ethyl methanesulfonate (EMS)-mutagenized Sorghum bicolor, including a pedigree segregating a recessive dwarf mutant. Comparing the sequences of the lines, we were able to identify and eliminate error-prone positions. One genomic region contained EMS mutant alleles in dwarfs that were homozygous reference sequences in wild-type siblings and heterozygous in segregating families. This region contained a single nonsynonymous change that cosegregated with dwarfism in a validation population and caused a premature stop codon in the Sorghum ortholog encoding the gibberellic acid (GA) biosynthetic enzyme ent-kaurene oxidase. Application of exogenous GA rescued the mutant phenotype. Our method for mapping did not require outcrossing and introduced no segregation variance. This enables work when line crossing is complicated by life history, permitting gene discovery outside of genetic models. This inverts the historical approach of first using recombination to define a locus and then sequencing genes. Our formally identical approach first sequences all the genes and then seeks cosegregation with the trait. Mutagenized lines lacking obvious phenotypic alterations are available for an extension of this approach: mapping with a known marker set in a line that is phenotypically identical to starting material for EMS mutant generation.

  8. Deep sequence analysis of non-small cell lung cancer: Integrated analysis of gene expression, alternative splicing, and single nucleotide variations in lung adenocarcinomas with and without oncogenic KRAS mutations

    Directory of Open Access Journals (Sweden)

    Krishna R Kalari

    2012-02-01

    KRAS mutations are highly prevalent in non-small cell lung cancer (NSCLC), and tumors harboring these mutations tend to be aggressive and resistant to chemotherapy. We used next-generation sequencing technology to identify pathways that are specifically altered in lung tumors harboring a KRAS mutation. Paired-end RNA-sequencing of 15 primary lung adenocarcinoma tumors (8 harboring mutant KRAS and 7 with wild-type KRAS) was performed. Sequences were mapped to the human genome, and genomic features, including differentially expressed genes, alternate splicing isoforms and single nucleotide variants, were determined for tumors with and without KRAS mutation using a variety of computational methods. Network analysis was carried out on genes showing differential expression (374 genes), alternate splicing (259 genes) and SNV-related changes (65 genes) in NSCLC tumors harboring a KRAS mutation. Genes exhibiting two or more connections from the lung adenocarcinoma network were used to carry out integrated pathway analysis. The most significant signaling pathways identified through this analysis were the NFkB, ERK1/2 and AKT pathways. A 27-gene mutant KRAS-specific subnetwork was extracted based on gene-gene connections within the integrated network, and interrogated for druggable targets. Our results confirm previous evidence that mutant KRAS tumors exhibit activated NFkB, ERK1/2 and AKT pathways and may be preferentially sensitive to therapeutics targeted toward these pathways. In addition, our analysis indicates novel, previously unappreciated links between mutant KRAS and the TNFR and PPARγ signaling pathways, suggesting that targeted PPARγ antagonists and TNFR inhibitors may be useful therapeutic strategies for treatment of mutant KRAS lung tumors. Our study is the first to integrate genomic features from RNA-Seq data from NSCLC and to define a first draft genomic landscape model that is unique to tumors with oncogenic KRAS mutations.

  9. ENIGMA-Evidence-based network for the interpretation of germline mutant alleles: An international initiative to evaluate risk and clinical significance associated with sequence variation in BRCA1 and BRCA2 genes

    DEFF Research Database (Denmark)

    Spurdle, Amanda B; Healey, Sue; Devereau, Andrew;

    2012-01-01

    As genetic testing for predisposition to human diseases has become an increasingly common practice in medicine, the need for clear interpretation of the test results is apparent. However, for many disease genes, including the breast cancer susceptibility genes BRCA1 and BRCA2, a significant......, and coordinately develop and apply algorithms for classification of variants in BRCA1 and BRCA2. It is envisaged that the research and clinical application of models developed by ENIGMA will be relevant to the interpretation of sequence variants in other disease genes....

  10. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    Science.gov (United States)

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds.

  11. Sequencing and Gene Expression Analysis of Leishmania tropica LACK Gene.

    Directory of Open Access Journals (Sweden)

    Nour Hammoudeh

    2014-12-01

    Leishmania Homologue of receptors for Activated C Kinase (LACK) antigen is a 36-kDa protein, which provokes a very early immune response against Leishmania infection. There are several reports on the expression of LACK through different life-cycle stages of genus Leishmania, but only a few of them have focused on L. tropica. The present study provides details of the cloning, DNA sequencing and gene expression of LACK in this parasite species. First, several local isolates of Leishmania parasites were typed in our laboratory using PCR to verify the Leishmania species. After that, the LACK gene was amplified and cloned into a vector for sequencing. Finally, the expression of this molecule in logarithmic and stationary growth phase promastigotes, as well as in amastigotes, was evaluated by Reverse Transcription-PCR (RT-PCR). The typing result confirmed that all our local isolates belong to L. tropica. The LACK gene sequence was determined and high similarity was observed with the sequences of other Leishmania species. Furthermore, the expression of the LACK gene in both promastigote and amastigote forms was confirmed. Overall, the data set the stage for future studies of the properties and immune role of LACK gene products.

  12. Sequence validation of candidates for selectively important genes in sunflower.

    Directory of Open Access Journals (Sweden)

    Mark A Chapman

    Analyses aimed at identifying genes that have been targeted by past selection provide a powerful means for investigating the molecular basis of adaptive differentiation. In the case of crop plants, such studies have the potential to not only shed light on important evolutionary processes, but also to identify genes of agronomic interest. In this study, we test for evidence of positive selection at the DNA sequence level in a set of candidate genes previously identified in a genome-wide scan for genotypic evidence of selection during the evolution of cultivated sunflower. In the majority of cases, we were able to confirm the effects of selection in shaping diversity at these loci. Notably, the genes that were found to be under selection via our sequence-based analyses were devoid of variation in the cultivated sunflower gene pool. This result confirms a possible strategy for streamlining the search for adaptively important loci by pre-screening the derived population to identify the strongest candidates before sequencing them in the ancestral population.

  13. The nucleotide sequences of two leghemoglobin genes from soybean

    DEFF Research Database (Denmark)

    Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O

    1982-01-01

    We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggest that the two genes...

  14. Anchored pseudo-de novo assembly of human genomes identifies extensive sequence variation from unmapped sequence reads.

    Science.gov (United States)

    Faber-Hammond, Joshua J; Brown, Kim H

    2016-07-01

    The human genome reference (HGR) completion marked the genomics era beginning, yet despite its utility universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads failing to map within the HGR. Sequences failing to map generally represent 2-5 % of total reads, which may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. 30,879 contigs are represented in multiple individuals with ~40 % showing high sequence complexity. Genomic coordinates were generated for 99.9 %, with 52.5 % exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly our data highlight that with this method low coverage (~10-20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine.

  15. Cloning and sequencing genes related to preeclampsia

    Institute of Scientific and Technical Information of China (English)

    SHI Juan-zi; LIU Yan-fang; YAO Yuan-qing; YAN Wei; ZHU Feng; ZHAO Zhong-liang

    2001-01-01

    To clone genes specifically expressed in the placenta of patients with preeclampsia, and to explain the mechanism in the etiopathology of preeclampsia. Methods: The placentae of preeclamptic and normotensive subjects with pregnancy were used as models, and the cDNA library was constructed and 20 differentially expressed fragments were cloned after a new version of PCR-based subtractive hybridization. The false positive clones were identified by reverse dot blot analysis. With one of the obtained genes taken as the probe, the placentas of 10 normal pregnant women and 10 preeclamptic patients were studied by using dot hybridization methods. Results: Six false positive clones were identified by reverse dot blot, and the remaining 14 clones were identified as preeclampsia-related genes. These clones were sequenced, and analyzed with the BLAST analysis system. Eleven of the 14 clones were genes already known, among which one belongs to the necdin family; the remaining 3 were identified as novel genes. These 3 genes were acknowledged by GenBank, with the accession numbers AF232216, AF232217, AF233648. The results of dot hybridization using the necdin gene as probe were as follows: (1) This mRNA was present in the placental tissues of normal pregnancy as well as in those of preeclampsia. (2) The intensity of transcription of this mRNA in the placental tissues of preeclampsia increased significantly compared with that of the normal pregnancy (P<0.05). Conclusions: This study for the first time reported this group of genes, especially the necdin-expressing gene, which are related to the etiopathology of preeclampsia. In addition, overtranscription of the necdin gene has been found in preeclampsia. It is helpful in further studies of the etiology of preeclampsia.

  16. Characterization of sulphonamide-resistant Escherichia coli using comparison of sul2 gene sequences and multilocus sequence typing

    DEFF Research Database (Denmark)

    Trobos, Margarita; Christensen, Henrik; Sunde, Marianne

    2009-01-01

    The sul2 gene encodes sulphonamide resistance (Sul(R)) and is commonly found in Escherichia coli from different hosts. We typed E coli isolates by multilocus sequence typing (MLST) and compared the results to sequence variation of sul2, in order to investigate the relation to host origin...... of pathogenic and commensal E coli strains and to investigate whether transfer of sul2 into different genomic lineages has happened multiple times. Sixty-eight E coli isolated in Denmark and Norway from different hosts and years were MLST typed and sul2 PCR products were sequenced and compared. PFGE...

  17. Predicting gene expression from sequence: a reexamination.

    Directory of Open Access Journals (Sweden)

    Yuan Yuan

    2007-11-01

    Although much of the information regarding genes' expressions is encoded in the genome, deciphering such information has been very challenging. We reexamined Beer and Tavazoie's (BT) approach to predict mRNA expression patterns of 2,587 genes in Saccharomyces cerevisiae from the information in their respective promoter sequences. Instead of fitting complex Bayesian network models, we trained naïve Bayes classifiers using only the sequence-motif matching scores provided by BT. Our simple models correctly predict expression patterns for 79% of the genes, based on the same criterion and the same cross-validation (CV) procedure as BT, which compares favorably to the 73% accuracy of BT. The fact that our approach did not use position and orientation information of the predicted binding sites but achieved a higher prediction accuracy motivated us to investigate a few biological predictions made by BT. We found that some of their predictions, especially those related to motif orientations and positions, are at best circumstantial. For example, the combinatorial rules suggested by BT for the PAC and RRPE motifs are not unique to the cluster of genes from which the predictive model was inferred, and there are simpler rules that are statistically more significant than BT's ones. We also show that the CV procedure used by BT to estimate their method's prediction accuracy is inappropriate and may have overestimated the prediction accuracy by about 10%.
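The idea of classifying genes from motif matching scores alone can be illustrated with a small Gaussian naive Bayes classifier. This is only a sketch under stated assumptions: the abstract does not say which naive Bayes variant the authors used, and the function names (`fit_gaussian_nb`, `predict`), the two-motif feature layout, the cluster labels, and the toy scores below are all hypothetical, not data from the study.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a class prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small variance floor avoids division by zero on constant features.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-6)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior, treating features as independent."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: each row holds matching scores for two motifs,
# and each label is an invented expression cluster.
X_train = [[5.0, 0.2], [4.5, 0.1], [0.3, 4.8], [0.1, 5.2]]
y_train = ["cluster_A", "cluster_A", "cluster_B", "cluster_B"]
nb_model = fit_gaussian_nb(X_train, y_train)
print(predict(nb_model, [4.8, 0.3]))
```

Note that the features here carry no position or orientation information, which mirrors the point made in the abstract; an honest accuracy estimate would additionally require a held-out cross-validation split, which this sketch omits.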

  18. Partitioning of genetic variation between regulatory and coding gene segments: the predominance of software variation in genes encoding introvert proteins.

    Science.gov (United States)

    Mitchison, A

    1997-01-01

    In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics.

  19. Coordinated effects of sequence variation on DNA binding, chromatin structure, and transcription.

    Science.gov (United States)

    Kilpinen, Helena; Waszak, Sebastian M; Gschwind, Andreas R; Raghav, Sunil K; Witwicki, Robert M; Orioli, Andrea; Migliavacca, Eugenia; Wiederkehr, Michaël; Gutierrez-Arcelus, Maria; Panousis, Nikolaos I; Yurovsky, Alisa; Lappalainen, Tuuli; Romano-Palumbo, Luciana; Planchon, Alexandra; Bielser, Deborah; Bryois, Julien; Padioleau, Ismael; Udin, Gilles; Thurnheer, Sarah; Hacker, David; Core, Leighton J; Lis, John T; Hernandez, Nouria; Reymond, Alexandre; Deplancke, Bart; Dermitzakis, Emmanouil T

    2013-11-08

    DNA sequence variation has been associated with quantitative changes in molecular phenotypes such as gene expression, but its impact on chromatin states is poorly characterized. To understand the interplay between chromatin and genetic control of gene regulation, we quantified allelic variability in transcription factor binding, histone modifications, and gene expression within humans. We found abundant allelic specificity in chromatin and extensive local, short-range, and long-range allelic coordination among the studied molecular phenotypes. We observed genetic influence on most of these phenotypes, with histone modifications exhibiting strong context-dependent behavior. Our results implicate transcription factors as primary mediators of sequence-specific regulation of gene expression programs, with histone modifications frequently reflecting the primary regulatory event.

  20. Genomic and genic sequence variation in synthetic hexaploid wheat (AABBDD) as compared to their parental species

    Institute of Scientific and Technical Information of China (English)

    Lihong Nie; Zongfu Han; Lahu Lu; Yingyin Yao; Qixin Sun; Zhongfu Ni

    2008-01-01

    In order to understand the genomic changes during the evolution of hexaploid wheat, two sets of synthetic hexaploid wheat from hybridization between maternal tetraploid wheat (AABB) and paternal diploid goat grass (DD) were used for DNA-AFLP and single strand conformation polymorphism (SSCP) analysis to determine the genomic and genic variation in the synthetic hexaploid wheat. Results indicated that more DNA sequences from the paternal diploid species were eliminated in the synthetic hexaploid wheat than from the maternal tetraploid wheat, suggesting that the genome from the parental species of lower ploidy tends to be eliminated preferentially. However, sequence variation detected by the SSCP procedure was much lower than that detected by DNA-AFLP, which indicated that much less variation in the genic regions occurred in the synthetic hexaploid wheat, and sequence variations detected by DNA-AFLP could be derived mostly from non-coding regions and repetitive sequences. Our results also indicated that sequence variation in 4 genes can be detected in hybrid F1, which suggested that this type of sequence variation could have resulted from distant hybridization. It was interesting to note that 3 out of the 4 genes were mapped and clustered on the long arm of chromosome 2D, which indicated that variation in genic sequences in synthetic hexaploid wheat might not be a randomized process.

  1. Mitochondrial DNA sequence variation in the Anatolian Peninsula (Turkey)

    Indian Academy of Sciences (India)

    Hatice Mergen; Reyhan Öner; Cihan Öner

    2004-04-01

    Throughout human history, the region known today as the Anatolian peninsula (Turkey) has served as a junction connecting the Middle East, Europe and Central Asia, and, thus, has been subject to major population movements. The present study is undertaken to obtain information about the distribution of the existing mitochondrial D-loop sequence variations in the Turkish population of Anatolia. A few studies have previously reported mtDNA sequences in Turks. We attempted to extend these results by analysing a cohort that is not only larger, but also more representative of the Turkish population living in Anatolia. In order to obtain a descriptive picture for the phylogenetic distribution of the mitochondrial genome within Turkey, we analysed mitochondrial D-loop region sequence variations in 75 individuals from different parts of Anatolia by direct sequencing. Analysis of the two hypervariable segments within the noncoding region of the mitochondrial genome revealed the existence of 81 nucleotide mutations at 79 sites. The neighbour-joining tree of Kimura’s distance matrix has revealed the presence of six main clusters, of which H and U are the most common. The data obtained are also compared with several European and Turkic Central Asian populations.
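A pairwise distance of the kind that feeds a neighbour-joining tree can be sketched as follows. The abstract only says "Kimura's distance"; this sketch assumes the Kimura two-parameter model, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` and the example sequences are hypothetical and are not taken from the study.

```python
import math

# Purine-purine and pyrimidine-pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Skip alignment columns containing gaps or ambiguous bases.
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n
    q = sum(1 for a, b in pairs if a != b and (a, b) not in TRANSITIONS) / n
    # math.log raises ValueError when the sequences are too divergent
    # for the model (a non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical 10-base fragments differing by a single A->G transition.
example_distance = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
print(round(example_distance, 4))
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same site.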

  2. Detecting gene mutations in Japanese Alzheimer's patients by semiconductor sequencing.

    Science.gov (United States)

    Yagi, Ryoichi; Miyamoto, Ryosuke; Morino, Hiroyuki; Izumi, Yuishin; Kuramochi, Masahito; Kurashige, Takashi; Maruyama, Hirofumi; Mizuno, Noriyoshi; Kurihara, Hidemi; Kawakami, Hideshi

    2014-07-01

    Alzheimer's disease (AD) is the most common form of dementia. To date, several genes have been identified as the cause of AD, including PSEN1, PSEN2, and APP. The association between APOE and late-onset AD has also been reported. We here used a bench top next-generation sequencer, which uses an integrated semiconductor device, detects hydrogen ions, and operates at a high-speed using nonoptical technology. We examined 45 Japanese AD patients with positive family histories, and 29 sporadic patients with early onset (useful for detecting genetic variations in familial AD.

  3. Sequence variations of the locus-specific 5' untranslated regions of SLA class I genes and the development of a comprehensive genomic DNA-based high-resolution typing method for SLA-2.

    Science.gov (United States)

    Choi, H; Le, M T; Lee, H; Choi, M-K; Cho, H-S; Nagasundarapandian, S; Kwon, O-J; Kim, J-H; Seo, K; Park, J-K; Lee, J-H; Ho, C-S; Park, C

    2015-10-01

    The genetic diversity of the major histocompatibility complex (MHC) class I molecules of pigs has not been well characterized. Therefore, the influence of MHC genetic diversity on the immune-related traits of pigs, including disease resistance and other MHC-dependent traits, is not well understood. Here, we attempted to develop an efficient method for systemic analysis of the polymorphisms in the epitope-binding region of swine leukocyte antigens (SLA) class I genes. We performed a comparative analysis of the last 92 bp of the 5' untranslated region (UTR) to the beginning of exon 4 of six SLA classical class I-related genes, SLA-1, -2, -3, -4, -5, and -9, from 36 different sequences. Based on this information, we developed a genomic polymerase chain reaction (PCR) and direct sequencing-based comprehensive typing method for SLA-2. We successfully typed SLA-2 from 400 pigs and 8 cell lines, consisting of 9 different pig breeds, and identified 49 SLA-2 alleles, including 31 previously reported alleles and 18 new alleles. We observed differences in the composition of SLA-2 alleles among different breeds. Our method can be used to study other SLA class I loci and to deepen our knowledge of MHC class I genes in pigs. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  4. Phylogenetic analysis of vibrios and related species by means of atpA gene sequences.

    Science.gov (United States)

    Thompson, Cristiane C; Thompson, Fabiano L; Vicente, Ana Carolina P; Swings, Jean

    2007-11-01

    We investigated the use of atpA gene sequences as alternative phylogenetic and identification markers for vibrios. A fragment of 1322 bp (corresponding to approximately 88% of the coding region) was analysed in 151 strains of vibrios. The relationships observed were in agreement with the phylogeny inferred from 16S rRNA gene sequence analysis. For instance, the Vibrio cholerae, Vibrio halioticoli, Vibrio harveyi and Vibrio splendidus species groups appeared in the atpA gene phylogenetic analyses, suggesting that these groups may be considered as separate genera within the current Vibrio genus. Overall, atpA gene sequences appeared to be more discriminatory for species differentiation than 16S rRNA gene sequences. 16S rRNA gene sequence similarities above 97% corresponded to atpA gene sequences similarities above 80%. The intraspecies variation in the atpA gene sequence was about 99% sequence similarity. The results showed clearly that atpA gene sequences are a suitable alternative for the identification and phylogenetic study of vibrios.

  5. Streptococcus mutans clonal variation revealed by multilocus sequence typing.

    Science.gov (United States)

    Nakano, Kazuhiko; Lapirattanakul, Jinthana; Nomura, Ryota; Nemoto, Hirotoshi; Alaluusua, Satu; Grönroos, Lisa; Vaara, Martti; Hamada, Shigeyuki; Ooshima, Takashi; Nakagawa, Ichiro

    2007-08-01

    Streptococcus mutans is the major pathogen of dental caries, a biofilm-dependent infectious disease, and occasionally causes infective endocarditis. S. mutans strains have been classified into four serotypes (c, e, f, and k). However, little is known about the S. mutans population, including the clonal relationships among strains of S. mutans, in relation to the particular clones that cause systemic diseases. To address this issue, we have developed a multilocus sequence typing (MLST) scheme for S. mutans. Eight housekeeping gene fragments were sequenced from each of 102 S. mutans isolates collected from the four serotypes in Japan and Finland. Between 14 and 23 alleles per locus were identified, allowing us theoretically to distinguish more than 1.2 × 10^10 sequence types. We identified 92 sequence types in these 102 isolates, indicating that S. mutans contains a diverse population. Whereas serotype c strains were widely distributed in the dendrogram, serotype e, f, and k strains were differentiated into clonal complexes. Therefore, we conclude that the ancestral strain of S. mutans was serotype c. No geographic specificity was identified. However, the distribution of the collagen-binding protein gene (cnm) and direct evidence of mother-to-child transmission were clearly evident. In conclusion, the superior discriminatory capacity of this MLST scheme for S. mutans may have important practical implications.
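The theoretical number of sequence types in an MLST scheme is the product of the allele counts at each locus, which is presumably how the figure quoted above was obtained. The abstract gives only the range of allele counts (14 to 23) across the eight loci, not the individual counts, so this sketch computes just the lower and upper bounds; the function name `max_sequence_types` is hypothetical.

```python
from math import prod


def max_sequence_types(alleles_per_locus):
    """Number of distinct allele combinations across all loci."""
    return prod(alleles_per_locus)


N_LOCI = 8  # eight housekeeping gene fragments, as stated in the abstract

# Bounds only: the per-locus counts are not given, so assume every locus
# has the minimum (14) or the maximum (23) reported number of alleles.
lower_bound = max_sequence_types([14] * N_LOCI)
upper_bound = max_sequence_types([23] * N_LOCI)

print(f"{lower_bound:.2e} to {upper_bound:.2e} possible sequence types")
```

The reported value of more than 1.2 × 10^10 lies between these two bounds, as expected for a mix of per-locus counts within the stated range, and the 92 sequence types actually observed are a very small fraction of that theoretical space.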

  6. PAX6 gene variations associated with aniridia in south India

    Directory of Open Access Journals (Sweden)

    Shashikant Shetty

    2004-04-01

    Background: Mutations in the transcription factor gene PAX6 have been shown to be the cause of the aniridia phenotype. The purpose of this study was to analyze patients with aniridia to uncover PAX6 gene mutations in the south Indian population. Methods: Total genomic DNA was isolated from peripheral blood of twenty-eight members of six clinically diagnosed aniridia families and 60 normal healthy controls. The coding exons of the human PAX6 gene were amplified by PCR and allele-specific variations were detected by single strand conformation polymorphism (SSCP) followed by automated sequencing. Results: The sequencing results revealed novel PAX6 mutations in three patients with sporadic aniridia: c.715ins5, [c.1201delA; c.1239A>G] and c.901delA. Two previously reported nonsense mutations were also found: c.482C>A, c.830G>A. A neutral polymorphism was detected (IVS9-12C>T) at the boundary of intron 9 and exon 10. The two nonsense mutations found in the coding region of the human PAX6 gene are reported for the first time in the south Indian population. Conclusion: The genetic analysis confirms that haploinsufficiency of the PAX6 gene causes the classic aniridia phenotype. Most of the point mutations detected in our study result in stop codons. Here we add three novel PAX6 gene mutations from the south Indian population, which is not a well-studied ethnic group, to the existing spectrum of mutations. Our study supports the hypothesis that a mutation in the PAX6 gene correlates with expression of aniridia.

  7. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  8. STR allele sequence variation: Current knowledge and future issues.

    Science.gov (United States)

    Gettings, Katherine Butler; Aponte, Rachel A; Vallone, Peter M; Butler, John M

    2015-09-01

    This article reviews what is currently known about short tandem repeat (STR) allelic sequence variation in and around the twenty-four loci most commonly used throughout the world to perform forensic DNA investigations. These STR loci include D1S1656, TPOX, D2S441, D2S1338, D3S1358, FGA, CSF1PO, D5S818, SE33, D6S1043, D7S820, D8S1179, D10S1248, TH01, vWA, D12S391, D13S317, Penta E, D16S539, D18S51, D19S433, D21S11, Penta D, and D22S1045. All known reported variant alleles are compiled along with genomic information available from GenBank, dbSNP, and the 1000 Genomes Project. Supplementary files are included which provide annotated reference sequences for each STR locus, characterize genomic variation around the STR repeat region, and compare alleles present in currently available STR kit allelic ladders. Looking to the future, STR allele nomenclature options are discussed as they relate to next generation sequencing efforts underway.

  9. Genotyping common and rare variation using overlapping pool sequencing

    Directory of Open Access Journals (Sweden)

    Pasaniuc Bogdan

    2011-07-01

    Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions: In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.
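
    To make the pooling idea concrete, the sketch below shows only the forward model that any deconvolution has to invert: given diploid genotypes and an overlapping pool design, it computes the alternate-allele fraction expected in each pool. It is not the likelihood and linear-programming algorithm of the paper; the function name, the pool design and the genotypes are hypothetical, and equal DNA contribution per individual is assumed.

```python
def expected_pool_alt_fraction(genotypes, pools):
    """Expected alternate-allele fraction per pool.

    genotypes: alternate-allele count (0, 1 or 2) for each diploid individual.
    pools: list of pools, each a list of individual indices (pools may overlap).
    Assumes every individual contributes the same amount of DNA to a pool.
    """
    fractions = []
    for members in pools:
        alt_alleles = sum(genotypes[i] for i in members)
        fractions.append(alt_alleles / (2 * len(members)))
    return fractions


# Hypothetical design: four individuals, each present in exactly two pools.
genotypes = [0, 1, 0, 2]
pools = [[0, 1], [1, 2], [2, 3], [0, 3]]
print(expected_pool_alt_fraction(genotypes, pools))
```

    Deconvolution runs this relationship backwards: starting from observed pool fractions (which are noisy in practice), it searches for the genotype vector that best explains them.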

  10. The use of high-throughput DNA sequencing in the investigation of antigenic variation: application to Neisseria species.

    Directory of Open Access Journals (Sweden)

    John K Davies

    Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci) into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3' end of the silent loci (copy 1) as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11) are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition, when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally, we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species.

  11. Genome-wide sequence variations among Mycobacterium avium subspecies paratuberculosis.

    Directory of Open Access Journals (Sweden)

    Chung-Yi Hsu

    2011-12-01

    Mycobacterium avium subspecies paratuberculosis (M. ap), the causative agent of Johne's disease (JD), infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium) strains isolated from various hosts and environments. Using next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98%) to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77) showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels) were not detected, and only unique single nucleotide polymorphisms (SNPs) were observed among the 6 M. ap strains. While most of the SNPs (~100) in M. ap genomes were non-synonymous, a total of ~6000 SNPs were detected among M. avium genomes, most of them synonymous, suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNP-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10) strain while the human isolate (M. ap 4B) is closely related to the environmental strains, indicating an environmental source of human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne's disease in animals.

  12. Genetic variation of human papillomavirus type 16 in individual clinical specimens revealed by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Iwao Kukimoto

    Viral genetic diversity within infected cells or tissues, called viral quasispecies, has been mostly studied for RNA viruses, but has also been described among DNA viruses, including human papillomavirus type 16 (HPV16) present in cervical precancerous lesions. However, the extent of HPV genetic variation in cervical specimens, and its involvement in HPV-induced carcinogenesis, remains unclear. Here, we employ deep sequencing to comprehensively analyze genetic variation in the HPV16 genome isolated from individual clinical specimens. Through overlapping full-circle PCR, approximately 8-kb DNA fragments covering the whole HPV16 genome were amplified from HPV16-positive cervical exfoliated cells collected from patients with either low-grade squamous intraepithelial lesion (LSIL) or invasive cervical cancer (ICC). Deep sequencing of the amplified HPV16 DNA enabled de novo assembly of the full-length HPV16 genome sequence for each of 7 specimens (5 LSIL and 2 ICC samples). Subsequent alignment of read sequences to the assembled HPV16 sequence revealed that 2 LSILs and 1 ICC contained nucleotide variations within E6, E1 and the non-coding region between E5 and L2 with mutation frequencies of 0.60% to 5.42%. In transient replication assays, a novel E1 mutant found in ICC, E1 Q381E, showed reduced ability to support HPV16 origin-dependent replication. In addition, partially deleted E2 genes were detected in 1 LSIL sample in a mixed state with the intact E2 gene. Thus, the methods used in this study provide a fundamental framework for investigating the influence of HPV somatic genetic variation on cervical carcinogenesis.

  13. Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici

    Directory of Open Access Journals (Sweden)

    Megan C. McDonald

    2016-04-01

    Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 weeks, and is followed by extensive host cell death and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified < 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene.

  14. Genome-wide analysis reveals diverged patterns of codon bias, gene expression, and rates of sequence evolution in picea gene families.

    Science.gov (United States)

    De La Torre, Amanda R; Lin, Yao-Cheng; Van de Peer, Yves; Ingvarsson, Pär K

    2015-03-05

    The recent sequencing of several gymnosperm genomes has greatly facilitated studying the evolution of their genes and gene families. In this study, we examine the evidence for expression-mediated selection in the first two fully sequenced representatives of the gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of gene expression (>50,000 expressed genes) to study the relationship between gene expression, codon bias, rates of sequence divergence, protein length, and gene duplication. We found that gene expression is correlated with rates of sequence divergence and codon bias, suggesting that natural selection is acting on Picea protein-coding genes for translational efficiency. Gene expression, rates of sequence divergence, and codon bias are correlated with the size of gene families, with large multicopy gene families having, on average, a lower expression level and breadth, lower codon bias, and higher rates of sequence divergence than single-copy gene families. Tissue-specific patterns of gene expression were more common in large gene families with large gene expression divergence than in single-copy families. Recent family expansions combined with large gene expression variation in paralogs and increased rates of sequence evolution suggest that some Picea gene families are rapidly evolving to cope with biotic and abiotic stress. Our study highlights the importance of gene expression and natural selection in shaping the evolution of protein-coding genes in Picea species, and sets the ground for further studies investigating the evolution of individual gene families in gymnosperms.

  15. Sequence Variations in the Flagellar Antigen Genes fliCH25 and fliCH28 of Escherichia coli and Their Use in Identification and Characterization of Enterohemorrhagic E. coli (EHEC) O145:H25 and O145:H28.

    Directory of Open Access Journals (Sweden)

    Lothar Beutin

    Enterohemorrhagic E. coli (EHEC) serogroup O145 is regarded as one of the major EHEC serogroups involved in severe infections in humans. EHEC O145 encompasses motile and non-motile strains of serotypes O145:H25 and O145:H28. Sequencing the fliC-genes associated with the flagellar antigens H25 and H28 revealed the genetic diversity of the fliCH25 and fliCH28 gene sequences in E. coli. Based on allele discrimination of these fliC-genes, real-time PCR tests were designed for identification of EHEC O145:H25 and O145:H28. The fliCH25 genes present in O145:H25 were found to be very similar to those present in E. coli serogroups O2, O100, O165, O172 and O177, pointing to their common evolution, but were different from fliCH25 genes of multiple other E. coli serotypes. In a similar way, EHEC O145:H28 harbors a characteristic fliCH28 allele which, apart from EHEC O145:H28, was only found in enteropathogenic (EPEC) O28:H28 strains that shared some common traits with EHEC O145:H28. The real-time PCR assays targeting these fliCH25[O145] and fliCH28[O145] alleles allow better characterization of EHEC O145:H25 and EHEC O145:H28. Evaluation of these PCR assays in spiked ready-to-eat salad samples resulted in specific detection of both types of EHEC O145 strains even when low spiking levels of 1-10 cfu/g were used. Furthermore, these PCR assays allowed identification of non-motile E. coli strains which are serologically not typable for their H-antigens. The combined use of O-antigen genotyping (O145wzy) and detection of the respective fliCH25[O145] and fliCH28[O145] allele types contributes to improved identification and molecular serotyping of E. coli O145 isolates.

  16. Sequence Variations in the Flagellar Antigen Genes fliCH25 and fliCH28 of Escherichia coli and Their Use in Identification and Characterization of Enterohemorrhagic E. coli (EHEC) O145:H25 and O145:H28

    Science.gov (United States)

    Beutin, Lothar; Delannoy, Sabine; Fach, Patrick

    2015-01-01

    Enterohemorrhagic E. coli (EHEC) serogroup O145 is regarded as one of the major EHEC serogroups involved in severe infections in humans. EHEC O145 encompasses motile and non-motile strains of serotypes O145:H25 and O145:H28. Sequencing the fliC-genes associated with the flagellar antigens H25 and H28 revealed the genetic diversity of the fliCH25 and fliCH28 gene sequences in E. coli. Based on allele discrimination of these fliC-genes, real-time PCR tests were designed for identification of EHEC O145:H25 and O145:H28. The fliCH25 genes present in O145:H25 were found to be very similar to those present in E. coli serogroups O2, O100, O165, O172 and O177, pointing to their common evolution, but were different from fliCH25 genes of multiple other E. coli serotypes. In a similar way, EHEC O145:H28 harbors a characteristic fliCH28 allele which, apart from EHEC O145:H28, was only found in enteropathogenic (EPEC) O28:H28 strains that shared some common traits with EHEC O145:H28. The real-time PCR assays targeting these fliCH25[O145] and fliCH28[O145] alleles allow better characterization of EHEC O145:H25 and EHEC O145:H28. Evaluation of these PCR assays in spiked ready-to-eat salad samples resulted in specific detection of both types of EHEC O145 strains even when low spiking levels of 1–10 cfu/g were used. Furthermore, these PCR assays allowed identification of non-motile E. coli strains which are serologically not typable for their H-antigens. The combined use of O-antigen genotyping (O145wzy) and detection of the respective fliCH25[O145] and fliCH28[O145] allele types contributes to improved identification and molecular serotyping of E. coli O145 isolates. PMID:26000885

  18. Sequence polymorphism and evolution of three cetacean MHC genes.

    Science.gov (United States)

    Xu, Shi Xia; Ren, Wen Hua; Li, Shu Zhen; Wei, Fu Wen; Zhou, Kai Ya; Yang, Guang

    2009-09-01

    Sequence variability at three major histocompatibility complex (MHC) genes (DQB, DRA, and MHC-I) of cetaceans was investigated in order to gain an overall understanding of cetacean MHC evolution. Little sequence variation was detected at the DRA locus, while extensive variability was found at the MHC-I and DQB loci. Phylogenetic reconstruction and sequence comparison revealed extensive sharing of identical MHC alleles among different species at the three MHC loci examined. Comparisons of phylogenetic trees for these MHC loci with trees reconstructed from non-PBR sites only revealed that allelic similarity/identity possibly reflected common ancestry and was not due to adaptive convergence. At the same time, there was evidence of trans-species evolution: according to the relaxed molecular clock, the allelic diversity of the three MHC loci clearly pre-dated species divergence events. Balancing selection may be the force maintaining the high sequence variability and the identical alleles shared in a trans-specific manner at the MHC-I and DQB loci.

  19. Ribosomal DNA copy number loss and sequence variation in cancer.

    Science.gov (United States)

    Xu, Baoshan; Li, Hua; Perry, John M; Singh, Vijay Pratap; Unruh, Jay; Yu, Zulin; Zakari, Musinu; McDowell, William; Li, Linheng; Gerton, Jennifer L

    2017-06-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments.

  20. CODEX: a normalization and copy number variation detection method for whole exome sequencing.

    Science.gov (United States)

    Jiang, Yuchao; Oldridge, Derek A; Diskin, Sharon J; Zhang, Nancy R

    2015-03-31

    High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but detecting and characterizing CNV from exome sequencing is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for whole exome sequencing data. The Poisson latent factor model in CODEX includes terms that specifically remove biases due to GC content, exon capture and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data. CODEX is compared to existing methods on a population analysis of HapMap samples from the 1000 Genomes Project, and shown to be more accurate on three microarray-based validation data sets. We further evaluate performance on 222 neuroblastoma samples with matched normals and focus on a well-studied rare somatic CNV within the ATRX gene. We show that the cross-sample normalization procedure of CODEX removes more noise than normalizing the tumor against the matched normal and that the segmentation procedure performs well in detecting CNVs with nested structures.
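
    As a rough illustration of what a Poisson likelihood-based test on count data can look like, the sketch below scores one candidate segment of exons against the null hypothesis of no copy number change. It is a generic likelihood ratio under a Poisson model and is not claimed to be the statistic CODEX uses; the function name and the numbers are hypothetical, and the expected counts are assumed to come from some upstream normalization step.

```python
import math


def poisson_segment_llr(observed, expected):
    """Copy ratio estimate and log-likelihood ratio for one candidate segment.

    observed: read counts per exon in the segment.
    expected: normalized expected counts per exon under no copy number change.
    Model: observed_i ~ Poisson(ratio * expected_i); the null hypothesis is ratio = 1.
    """
    total_obs = sum(observed)
    total_exp = sum(expected)
    if total_exp <= 0:
        raise ValueError("expected counts must sum to a positive value")
    ratio = total_obs / total_exp  # maximum-likelihood estimate of the copy ratio
    if total_obs == 0:
        return ratio, total_exp  # limit of the expression below as total_obs -> 0
    llr = total_obs * math.log(ratio) - (total_obs - total_exp)
    return ratio, llr


# Hypothetical segment with half the expected coverage (e.g. a heterozygous deletion).
ratio, llr = poisson_segment_llr([50, 50], [100, 100])
print(f"copy ratio {ratio:.2f}, log-likelihood ratio {llr:.2f}")
```

    A recursive segmentation would evaluate such a score over many candidate segments and keep those that exceed a chosen threshold.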

  1. Gene and translation initiation site prediction in metagenomic sequences

    Energy Technology Data Exchange (ETDEWEB)

    Hyatt, Philip Douglas [ORNL; LoCascio, Philip F [ORNL; Hauser, Loren John [ORNL; Uberbacher, Edward C [ORNL

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.
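
    The point about alternate genetic codes can be illustrated with a toy open reading frame scan. The sketch below is not MetaProdigal's algorithm: it only looks for ATG-to-stop stretches on the forward strand, and the function and constant names are hypothetical. The two stop codon sets correspond to the standard code and to NCBI translation table 4, in which TGA encodes tryptophan rather than a stop.

```python
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})  # standard genetic code
TABLE4_STOPS = frozenset({"TAA", "TAG"})           # translation table 4: TGA codes for Trp


def find_orfs(seq, stops=STANDARD_STOPS, start="ATG", min_len=0):
    """Return (start, end) half-open coordinates of forward-strand ORFs.

    An ORF runs from the first start codon seen in a frame through the next
    in-frame stop codon (inclusive). Only the three forward frames are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == start:
                orf_start = i
            elif orf_start is not None and codon in stops:
                end = i + 3
                if end - orf_start >= min_len:
                    orfs.append((orf_start, end))
                orf_start = None
    return sorted(orfs)


fragment = "ATGAAATGATTTTAA"  # hypothetical short read
print(find_orfs(fragment))                      # TGA read as a stop codon
print(find_orfs(fragment, stops=TABLE4_STOPS))  # TGA read through as Trp
```

    The same fragment yields a shorter or a longer call depending on the code assumed, which is why a predictor working on anonymous fragments has to consider more than one genetic code.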

  2. SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations.

    Directory of Open Access Journals (Sweden)

    Steven N Hart

    BACKGROUND: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1) not requiring secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.
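
    Soft-clipped bases are recorded in the CIGAR string of an aligned read, so a first step in any soft-clip based approach is to read them out. The sketch below shows one way to do that with the standard library only. It is not SoftSearch's code; the function names and the `min_clip` threshold are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation letter from the SAM format.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clipped lengths for a SAM CIGAR string."""
    ops = [(int(length), op) for length, op in _CIGAR_OP.findall(cigar)]
    # Hard clips (H) can only be the outermost operations; set them aside first.
    if ops and ops[0][1] == "H":
        ops = ops[1:]
    if ops and ops[-1][1] == "H":
        ops = ops[:-1]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def is_breakpoint_candidate(cigar, min_clip=5):
    """True if either end of the read has at least min_clip soft-clipped bases."""
    return max(soft_clip_lengths(cigar)) >= min_clip


for cigar in ["5S95M", "90M10S", "100M", "3H7S90M"]:
    print(cigar, soft_clip_lengths(cigar), is_breakpoint_candidate(cigar))
```

    Reads flagged this way would then be grouped by clip position and combined with discordant read-pair evidence, as the abstract describes.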

  3. HLA DNA sequence variation among human populations: molecular signatures of demographic and selective events.

    Directory of Open Access Journals (Sweden)

    Stéphane Buhler

    Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used
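
    Tajima's test, mentioned in the abstract above, compares two estimates of nucleotide diversity: one from the mean number of pairwise differences and one from the number of segregating sites. The sketch below computes the standard Tajima's D statistic for a small alignment. It is a minimal illustration, not the pipeline used in the study: the function names and the toy alignment are hypothetical, sequences are assumed to be aligned and of equal length, and gaps are not treated specially.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct character."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing positions over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment in which every variant is a singleton, which pushes D below zero.
alignment = ["AAAA", "AAAT", "AATA", "ATAA"]
print(segregating_sites(alignment), mean_pairwise_differences(alignment))
print(round(tajimas_d(alignment), 3))
```

    Negative values indicate an excess of rare variants, while positive values indicate an excess of intermediate-frequency variants, the pattern often associated with balancing selection.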

  4. Sequence length variation, indel costs, and congruence in sensitivity analysis

    DEFF Research Database (Denmark)

    Aagesen, Lone; Petersen, Gitte; Seberg, Ole

    2005-01-01

    the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously...... preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...... in the data set. Dominance was easily detected, as the character-based congruence measures approached their optimal value when indel costs were incremented. Dominance of a fragment or data partition was overwhelmed when new sequence length-variable fragments or data partitions were added....

  5. Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species

    Indian Academy of Sciences (India)

    Guang-Rong Li; Tao Lang; En-Nian Yang; Cheng Liu; Zu-Jun Yang

    2014-12-01

    Although the unique properties of the wheat α-gliadin gene family are well characterized, little is known about the evolution and genomic divergence of the α-gliadin gene family within the Triticeae. We isolated a total of 203 α-gliadin gene sequences from 11 representative diploid and polyploid Triticeae species, and found 108 sequences to be putatively functional. Our results indicate that α-gliadin genes may have originated from wild Secale species, where the sequences contain the shortest repetitive domains and display minimum variation. A miniature inverted-repeat transposable element insertion is reported for the first time in an α-gliadin gene sequence of Thinopyrum intermedium in this study, indicating that the transposable element might have contributed to the diversification of the α-gliadin gene family among Triticeae genomes. The phylogenetic analyses revealed that the α-gliadin gene sequences of Dasypyrum, Australopyrum, Lophopyrum, Eremopyrum and Pseudoroegneria species have amplified several times. A search for four typical toxic epitopes for celiac disease within the Triticeae α-gliadin gene sequences showed that the α-gliadins of wild Secale, Australopyrum and Agropyron genomes lack all four epitopes, while other Triticeae species have accumulated these epitopes, suggesting that the evolution of these toxic epitope sequences occurred during the course of speciation, domestication or polyploidization of Triticeae.

  6. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  7. Mitochondrial Cytochrome c Oxidase Subunit 1 Sequence Variation in Prostate Cancer

    Directory of Open Access Journals (Sweden)

    Takara A. Scott

    2012-01-01

    Full Text Available Purpose. Mitochondrial DNA (mtDNA) mutations have been described in every adult neoplasm including prostate cancer. There are marked racial differences in mutations within the cytochrome c oxidase subunit 1 (COI) gene in individuals with prostate cancer (PCa). The purpose of this study was to identify the variation in COI gene sequence in African and Caucasian Americans with prostate cancer. Methods. We sequenced the COI gene from peripheral blood in 482 prostate cancer patients and 189 controls. All bases that differed from the revised Cambridge Reference Sequence (rCRS) were classified as either silent or missense and the compiled alterations were then compared between races and published reports. Results and Conclusions. We found inherited mtDNA COI missense variants in 8.8% of Caucasian prostate cancer patients (vs. 0.0% of controls) and 72.8% of African-American prostate cancer patients (vs. 64.3% of controls). A total of 144 COI variants were identified, of which 30 were missense mutations. Of 482 PCa patients, 116 (24.1%) had one or more missense mutations. Further evaluation of this gene and these mutations may allow for the identification of genetically at-risk populations. The high rate of COI mutations in African-Americans may account for some of the racial disparity observed in prostate cancer.

  8. Sequence Variation in the E2-Binding Domain of HPV16 and Biological Function Evaluation in Tunisian Cervical Cancers

    Directory of Open Access Journals (Sweden)

    Saloua Kahla

    2014-01-01

    Full Text Available HPV16 E2 variants have different effects on the transcriptional activity of the LCR. In this study, we examined the nucleotide and amino acid sequence variation within the HPV16 E2 gene and correlated it with disease progression. E2 gene disruption was detected by PCR amplification of the entire E2 gene using a single set of primers. Nucleotide variations were analyzed by bidirectional sequencing. mRNA expression patterns of E6 and E7 gene transcripts were evaluated by a reverse transcriptase-PCR method (RT-PCR). The detection of intact E2 genes was significantly higher among controls than cases (81.8% versus 37.5%, resp., PA results in the amino acid substitution T310K and was more common among the E2 undisrupted cases (7/9; 77.7%) compared to controls (2/9; 22.2%). In addition, a specific sequence variation identified in the E2 ORF at position 3684 (C>A) was associated with increased production of the viral oncogenes E6 and E7. Besides HPV16 E2 disruption, the 3684 C>A variation within undisrupted E2 genes could be involved in an alternative mechanism for deregulating the expression of the HPV16 E6 and E7 oncogenes and appears to be a major factor contributing to the development of cervical cancer in Tunisian women.

  9. Comprehensive assessment of sequence variation within the copy number variable defensin cluster on 8p23 by target enriched in-depth 454 sequencing

    Directory of Open Access Journals (Sweden)

    Zhang Xinmin

    2011-05-01

    Full Text Available Abstract Background In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are practically restricted to tiny fractions, and next-generation sequencing (NGS) approaches to whole individual genomes, e.g. by the 1000 Genomes Project, are limited by the affordable sequence depth. Combining target enrichment with NGS may represent a feasible approach. Results As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. 6,651 differences to the human reference genome were found. Comparison to HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations. Using error probabilities for rigorous filtering revealed 2,886 unique single nucleotide variations (SNVs) including 358 putative novel ones. DEFB CN determinations by haplotype ratios were in agreement with alternative methods. Conclusion Although currently labor intensive and costly, target enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals considerable amounts of putative novel variations and simultaneously allows CN estimation.

  10. Marker2sequence, mine your QTL regions for candidate genes

    NARCIS (Netherlands)

    Chibon, P.Y.F.R.P.; Schoof, H.; Visser, R.G.F.; Finkers, H.J.

    2012-01-01

    Marker2sequence (M2S) aims at mining quantitative trait loci (QTLs) for candidate genes. For each gene, within the QTL region, M2S uses data integration technology to integrate putative gene function with associated gene ontology terms, proteins, pathways and literature. As a typical QTL region

  11. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    We introduce a new type of probabilistic sequence model that models the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...
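
    The idea of combining a per-candidate gene finder score with the order of reading frames along the genome can be illustrated with a small dynamic program. The sketch below is not the authors' model: the function names, the two-way accept/reject labelling, and the `same_strand` transition probability are all assumptions made for illustration only.

```python
import math


def strand(frame):
    """Frames are encoded as +1, +2, +3 (forward) or -1, -2, -3 (reverse)."""
    return 1 if frame > 0 else -1


def transition(prev_frame, frame, same_strand=0.7):
    """Hypothetical frame-to-frame probability that favours staying on one strand."""
    if prev_frame is None:          # no gene accepted yet
        return 0.5
    return same_strand if strand(prev_frame) == strand(frame) else 1.0 - same_strand


def best_annotation(candidates):
    """Pick the most probable accept/reject labelling of gene candidates.

    candidates: list of (frame, score) ordered by genome position, where
    score is a gene finder probability strictly between 0 and 1.
    Returns one boolean per candidate (True = annotated as a gene).
    """
    # State = frame of the last accepted gene; value = (log-probability, decisions).
    best = {None: (0.0, [])}
    for frame, score in candidates:
        nxt = {}
        for state, (logp, path) in best.items():
            options = [
                # Reject the candidate: the state is unchanged.
                (state, logp + math.log(1.0 - score), path + [False]),
                # Accept it: pay the gene finder score and the frame transition.
                (frame,
                 logp + math.log(score) + math.log(transition(state, frame)),
                 path + [True]),
            ]
            for new_state, new_logp, new_path in options:
                if new_state not in nxt or new_logp > nxt[new_state][0]:
                    nxt[new_state] = (new_logp, new_path)
        best = nxt
    return max(best.values(), key=lambda v: v[0])[1]


# A weak reverse-strand candidate flanked by two strong forward-strand ones.
labels = best_annotation([(1, 0.9), (-2, 0.55), (2, 0.9)])
print(labels)
```

    In this toy input the middle candidate would pass a simple 0.5 score threshold on its own, but the frame context makes rejecting it the more probable labelling.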

  12. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    Directory of Open Access Journals (Sweden)

    R. Lakshmi

    2016-06-01

    Full Text Available Aim: To analyze the promoter sequence of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle, maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The sequence of the promoter region of TLR2 of Vechur cattle showed 98% similarity with the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The sequence of the promoter region of TLR4 of Vechur cattle revealed 99% similarity with the B. taurus sequence but revealed no significant variants in motif regions. However, two heterozygous loci were observed from the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  13. Poly(T) variation within mitochondrial protein-coding genes in Globodera (Nematoda: Heteroderidae).

    Science.gov (United States)

    Riepsamen, Angelique H; Blok, Vivian C; Phillips, Mark; Gibson, Tracey; Dowton, Mark

    2008-03-01

    We sequenced a mitochondrial subgenome from the nematode Globodera rostochiensis, in two overlapping pieces. The subgenome was 9210 bp and contained four protein-coding genes (ND4, COIII, ND3, Cytb) and two tRNA genes (tRNA(Thr), tRNA(Gln)). Genome organization was similar to that of Globodera pallida, which is multipartite. Together with the small number of genes on this subgenome, this suggests that the mitochondrial genome of G. rostochiensis is also multipartite. In the initial clones sequenced, COIII and ND3 were full-length, while ND4 and Cytb were interrupted by premature stop codons and contained point indels that disrupted the reading frame. However, sequencing of multiple clones, from DNA extracted both from multiple individuals and from single cysts, revealed a predominant source of variation: the length of polythymidine tracts. Comparison of our genomic sequences with ESTs similarly revealed variation in the length of polythymidine tracts. We subsequently sequenced both genomic DNA and mRNA from populations of G. pallida. In each case, variation in the length of polythymidine tracts was observed. The levels of expression of mitochondrial genes in G. pallida were representative of the subgenomes present: little evidence of differential expression was observed. These observations are consistent with the operation of posttranscriptional editing in Globodera mitochondria, although this is difficult to show conclusively in the presence of intraindividual gene sequence variation. Further, alternative explanations cannot be discounted; these include the operation of slippage during translation or that genomic copies of most genes are pseudogenes with a small proportion of full-length sequences able to maintain mitochondrial function.
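
    Variation in polythymidine tract length between clones can be tabulated with a few lines of code. The following is a minimal sketch, not the method used in the study: the function name, the `min_len` threshold of five bases, and the toy clone sequences are assumptions for illustration.

```python
import re


def polyt_tracts(seq, min_len=5):
    """Return (start, length) for every run of at least min_len consecutive T bases."""
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Two hypothetical clones of the same region differing only in tract length.
clone_a = "ATGGC" + "T" * 8 + "GACC"
clone_b = "ATGGC" + "T" * 9 + "GACC"

for name, seq in (("clone_a", clone_a), ("clone_b", clone_b)):
    print(name, polyt_tracts(seq))
```

    A one-base difference in tract length, as between the two toy clones, shifts the downstream reading frame, which is how such variation can produce the apparent frame-disrupting indels described above.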

  14. Identification of genetic variations of a Chinese family with paramyotonia congenita via whole exome sequencing

    Directory of Open Access Journals (Sweden)

    Jinxin Li

    2015-06-01

    Full Text Available Paramyotonia congenita (PC) is a rare autosomal dominant neuromuscular disorder characterized by juvenile onset and development of cold-induced myotonia after repeated activities. The disease is mostly caused by genetic mutations of the sodium channel, voltage-gated, type IV, alpha subunit (SCN4A) gene. This study intended to systematically identify the causative genetic variations of a Chinese Han PC family. Seven members of this PC family, including four patients and three healthy controls, were selected for whole exome sequencing (WES) using the Illumina HiSeq platform. Sequence variations were identified using the SoftGenetics program. The mutation R1448C of SCN4A was found to be the only causative mutation. This study applied WES technology to sequence multiple members of a large PC family and was the first to systematically confirm that the genetic change in SCN4A is the only causative variation in this PC family and that the SCN4A mutation is sufficient to lead to PC.

  15. Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray).

    Science.gov (United States)

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-02-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  16. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    Science.gov (United States)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  17. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    Science.gov (United States)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A.C.T; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  18. Sequencing genes in silico using single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Zhang Xinyi

    2012-01-01

    Full Text Available Abstract Background The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  19. Correlating Traits of Gene Retention, Sequence Divergence, Duplicability and Essentiality in Vertebrates, Arthropods, and Fungi

    Science.gov (United States)

    Waterhouse, Robert M.; Zdobnov, Evgeny M.; Kriventseva, Evgenia V.

    2011-01-01

    Delineating ancestral gene relations among a large set of sequenced eukaryotic genomes allowed us to rigorously examine links between evolutionary and functional traits. We classified 86% of over 1.36 million protein-coding genes from 40 vertebrates, 23 arthropods, and 32 fungi into orthologous groups and linked over 90% of them to Gene Ontology or InterPro annotations. Quantifying properties of ortholog phyletic retention, copy-number variation, and sequence conservation, we examined correlations with gene essentiality and functional traits. More than half of vertebrate, arthropod, and fungal orthologs are universally present across each lineage. These universal orthologs are preferentially distributed in groups with almost all single-copy or all multicopy genes, and sequence evolution of the predominantly single-copy orthologous groups is markedly more constrained. Essential genes from representative model organisms, Mus musculus, Drosophila melanogaster, and Saccharomyces cerevisiae, are significantly enriched in universal orthologs within each lineage, and essential-gene-containing groups consistently exhibit greater sequence conservation than those without. This study of eukaryotic gene repertoire evolution identifies shared fundamental principles and highlights lineage-specific features; it also confirms that essential genes are highly retained and conclusively supports the “knockout-rate prediction” of stronger constraints on essential gene sequence evolution. However, the distinction between sequence conservation of single- versus multicopy orthologs is quantitatively more prominent than between orthologous groups with and without essential genes. The previously underappreciated difference in the tolerance of gene duplications and contrasting evolutionary modes of “single-copy control” versus “multicopy license” may reflect a major evolutionary mechanism that allows extended exploration of gene sequence space. PMID:21148284

  20. Simple sequence repeats provide a substrate for phenotypic variation in the Neurospora crassa circadian clock.

    Directory of Open Access Journals (Sweden)

    Todd P Michael

    Full Text Available BACKGROUND: WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this is consistent with this simple sequence repeat (SSR) being essential for clock function. METHODOLOGY/PRINCIPAL FINDINGS: Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. CONCLUSIONS/SIGNIFICANCE: Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments.

  1. Mitochondrial DNA sequence variation in Finnish patients with matrilineal diabetes mellitus

    Directory of Open Access Journals (Sweden)

    Soini Heidi K

    2012-07-01

    Full Text Available Abstract Background The genetic background of type 2 diabetes is complex, involving contributions from both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA). Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T>C and m.3010 G>A was found to be more frequent in patients with diabetes than in controls. Conclusions Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.

  2. Patterns of variation among distinct alleles of the Flag silk gene from Nephila clavipes.

    Science.gov (United States)

    Higgins, Linden E; White, Sheryl; Nuñez-Farfán, Juan; Vargas, Jesus

    2007-02-20

    Spider silk proteins and their genes are very attractive to researchers in a wide range of disciplines because they permit linking many levels of organization. However, hypotheses of silk gene evolution have been built primarily upon single sequences of each gene in each species, and little is known about allelic variation within a species. Silk genes are known for their repeat structure with high levels of homogenization of nucleotide and amino acid sequence among repeated units. One common explanation for this homogeneity is gene convergence. To test this model, we sequenced multiple alleles of one intron-exon segment from the Flag gene from four populations of the spider Nephila clavipes and compared the new sequences to a published sequence. Our analysis revealed very high levels of heterozygosity in this gene, with no pattern of population differentiation. There was no evidence of gene convergence within any of these alleles, with high levels of nucleotide and amino acid substitution among the repeating motifs. Our data suggest that, minimally, there is relaxed selection on mutations in this gene and that there may actually be positive selection for heterozygosity.

  3. Expression Characteristics and Sequence Variation Analysis of Glutamine Synthetase Gene in Grain of japonica Rice with Transgressive Variation

    Institute of Scientific and Technical Information of China (English)

    徐振华; 曲莹; 刘海英; 朱立楠; 张忠臣; 金正勋

    2016-01-01

    content and significantly differed between parents and progenies. GS1.3 and GS2 genes, in transgressive variants and their parents with different protein content, exhibited a similar transcription trend during grain filling; that is, the transcription level increased 15-20 days before heading and then decreased gradually. Moreover, the grain protein content was closely related to GS1.3 and GS2 expression levels. The varieties with higher GS transcription levels showed higher protein content compared with their parents. In addition, although GS1.3 and GS2 gene sequences showed high conservation, the gene sequence and protein sequence of GS1.3 and GS2 were not completely identical in different varieties, and there are some single nucleotide polymorphisms. Random base variation as well as changes in codons and amino acids might occur because inter-variety sexual hybridization causes base substitution during segregation and stability.

  4. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    Science.gov (United States)

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10⁻⁷), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts: rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10⁻⁷); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10⁻⁷ and OR = 1.09, P = 7.4 × 10⁻⁸); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10⁻⁹), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10⁻⁶). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10⁻⁴) and DNA mismatch repair genes (P = 6.1 × 10⁻⁴) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  5. A human gut microbial gene catalogue established by metagenomic sequencing

    DEFF Research Database (Denmark)

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn

    2010-01-01

    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence, from faecal samples of 124 European individuals. The gene set, ~150 times larger than the human gene complement, contains an overwhelming majority of the prevalent (more frequent) microbial genes of the cohort and probably includes a large proportion of the prevalent human intestinal microbial genes...

  6. Yeast DNA sequences initiating gene expression in Escherichia coli.

    Science.gov (United States)

    Lewin, Astrid; Tran, Thi Tuyen; Jacob, Daniela; Mayer, Martin; Freytag, Barbara; Appel, Bernd

    2004-01-01

    DNA transfer between pro- and eukaryotes occurs either during natural horizontal gene transfer or as a result of the employment of gene technology. We analysed the capacity of DNA sequences from a eukaryotic donor organism (Saccharomyces cerevisiae) to serve as promoter region in a prokaryotic recipient (Escherichia coli) by creating fusions between promoterless luxAB genes from Vibrio harveyi and random DNA sequences from S. cerevisiae and measuring the luminescence of transformed E. coli. Fifty-four out of 100 randomly analysed S. cerevisiae DNA sequences caused considerable gene expression in E. coli. Determination of transcription start sites within six selected yeast sequences in E. coli confirmed the existence of bacterial -10 and -35 consensus sequences at appropriate distances upstream from transcription initiation sites. Our results demonstrate that the probability of transcription of transferred eukaryotic DNA in bacteria is extremely high and does not require the insertion of the transferred DNA behind a promoter of the recipient genome.

  7. Global properties and functional complexity of human gene regulatory variation.

    Directory of Open Access Journals (Sweden)

    Daniel J Gaffney

    2013-05-01

    Identification and functional interpretation of gene regulatory variants is a major focus of modern genomics. The application of genetic mapping to molecular and cellular traits has enabled the detection of regulatory variation on genome-wide scales and revealed an enormous diversity of regulatory architecture in humans and other species. In this review I summarise the insights gained and questions raised by a decade of genetic mapping of gene expression variation. I discuss recent extensions of this approach using alternative molecular phenotypes that have revealed some of the biological mechanisms that drive gene expression variation between individuals. Finally, I highlight outstanding problems and future directions for development.

  8. Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes.

    Science.gov (United States)

    Shamblott, M J; Litman, G W

    1989-06-01

    Antibody to Heterodontus francisci (horned shark) immunoglobulin light chain was used to screen a spleen cDNA expression library, and recombinant clones encoding light chain genes were isolated. The complete sequences of the mature coding regions of two light chain genes in this phylogenetically distant vertebrate have been determined and are reported here. Comparisons of the sequences are consistent with the presence of mammalian-like framework and complementarity-determining regions. The predicted amino acid sequences of the genes are more related to mammalian lambda than to kappa light chains. The nucleotide sequences of the genes are most related to mammalian T-cell antigen receptor beta chain. Heterodontus light chain genes may reflect characteristics of the common ancestor of immunoglobulin and T-cell antigen receptors before its evolutionary diversification.

  9. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    Directory of Open Access Journals (Sweden)

    Tong-Jian Liu

    2016-06-01

    Primula L., a species-rich genus, is a typical plant group for studying genetic variation between species at different levels of relatedness. Chloroplast genome sequences are commonly used as the information resource for quantifying this variation and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with those of related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length with a GC content of 37.2%. Two inverted repeat (IR) regions (25,535 bp each) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes, which lack intact open reading frames (ORFs), were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, in comparison with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave an identical result that was consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region, and a putative plastid transit peptide was identified at its N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis.
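
    The region lengths and gene counts quoted in the abstract above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the cited study: the function names `quadripartite_length` and `gc_fraction` are hypothetical helpers, and the check assumes the quoted 25,535 bp applies to each of the two inverted repeats.

```python
# Figures quoted in the abstract (lengths in bp).
TOTAL_LENGTH = 150_859
IR_LENGTH = 25_535    # assumed to be the length of EACH inverted repeat
LSC_LENGTH = 82_064   # large single-copy region
SSC_LENGTH = 17_725   # small single-copy region


def quadripartite_length(lsc, ssc, ir):
    """Length of a circular plastome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide string (hypothetical helper)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# The four regions add up to the reported genome size.
assert quadripartite_length(LSC_LENGTH, SSC_LENGTH, IR_LENGTH) == TOTAL_LENGTH
# Protein-coding + tRNA + rRNA genes add up to the reported 112 genes.
assert 78 + 30 + 4 == 112
```

    Given the full genome sequence as a string, `gc_fraction` would be the natural way to reproduce the reported 37.2% GC content; the sequence itself is not part of this record.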

  10. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  11. Sequence and secondary structure of the mitochondrial 16S ribosomal RNA gene of Ixodes scapularis.

    Science.gov (United States)

    Krakowetz, Chantel N; Chilton, Neil B

    2015-02-01

    The complete DNA sequences and secondary structure of the mitochondrial (mt) 16S ribosomal (r) RNA gene were determined for six Ixodes scapularis adults. There were 44 variable nucleotide positions in the 1252 bp sequence alignment. Most (95%) nucleotide alterations did not affect the integrity of the secondary structure of the gene because they either occurred at unpaired positions or represented compensatory changes that maintained the base pairing in helices. A large proportion (75%) of the intraspecific variation in DNA sequence occurred within Domains I, II and VI of the 16S gene. Therefore, several regions within this gene may be highly informative for studies of the population genetics and phylogeography of I. scapularis, a major vector of pathogens of humans and domestic animals in North America.
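
    Counting variable positions in an alignment, as reported above (44 of 1252 aligned positions, about 3.5%), can be sketched as follows. This is an illustration only: the function name `variable_positions` and the toy alignment are hypothetical, and treating gaps and `N` as missing data is an assumption rather than the authors' stated procedure.

```python
def variable_positions(alignment, ignore="-N"):
    """Return 0-based column indices at which aligned sequences differ.

    `alignment` is a list of equal-length strings. Characters listed in
    `ignore` (gaps, ambiguous bases) are treated as missing data.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    variable = []
    for i in range(length):
        # Distinct informative characters observed in this column.
        column = {seq[i].upper() for seq in alignment} - set(ignore)
        if len(column) > 1:
            variable.append(i)
    return variable


# Toy alignment (made-up data, not from the study).
toy = ["ACGTAC", "ACGTTC", "ATGTAC"]
sites = variable_positions(toy)
proportion = len(sites) / len(toy[0])
```

    For the toy alignment, columns 1 and 4 vary, so a third of the positions are variable.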

  12. Variation of partial transferrin sequences and phylogenetic relationships among hares (Lepus capensis, Lagomorpha) from Tunisia.

    Science.gov (United States)

    Awadi, Asma; Suchentrunk, Franz; Makni, Mohamed; Ben Slimen, Hichem

    2016-10-01

    North African hares are currently included in cape hares, Lepus capensis sensu lato, a taxon that may be considered a superspecies or a complex of closely related species. The existing molecular data, however, are not unequivocal, with mtDNA control region sequences suggesting a separate species status and nuclear loci (allozymes, microsatellites) revealing conspecificity of L. capensis and L. europaeus. Here, we study sequence variation in intron 6 (468 bp) of the nuclear transferrin (TF) gene in 105 hares with different coat colours from different regions in Tunisia, with respect to genetic diversity and differentiation as well as their phylogenetic status. Forty-six haplotypes (alleles) were revealed and compared phylogenetically to all available TF haplotypes of various Lepus species retrieved from GenBank. Maximum likelihood, neighbor-joining and median-joining network analyses concordantly grouped all currently obtained haplotypes together with haplotypes belonging to six different Chinese hare species and the African scrub hare L. saxatilis. Moreover, two Tunisian haplotypes were shared with L. capensis, L. timidus, L. sinensis, L. yarkandensis, and L. hainanus from China. These results indicated the evolutionary complexity of the genus Lepus, with the mixing of nuclear gene haplotypes resulting from introgressive hybridization and/or shared ancestral polymorphism. We report the presence of shared ancestral polymorphism between North African and Chinese hares. This has not been detected earlier in the mtDNA sequences of the same individuals. Genetic diversity of the TF sequences from the Tunisian populations was relatively high compared to other hare populations. However, genetic differentiation and gene flow analyses (AMOVA, FST, Nm) indicated little divergence, with the absence of geographically meaningful phylogroups and lack of clustering with coat colour types. These results confirm the presence of a single hare species in Tunisia, but a sound inference on...

  13. Theories of Population Variation in Genes and Genomes

    DEFF Research Database (Denmark)

    Christiansen, Freddy

    genetics, while emphasizing the close interplay between theory and empiricism. Traditional topics such as genetic and phenotypic variation, mutation, migration, and linkage are covered and advanced by contemporary coalescent theory, which describes the genealogy of genes in a population, ultimately...

  14. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.

    Science.gov (United States)

    Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander

    2014-10-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.

  15. Population genetic variation in gene expression is associated with phenotypic variation in Saccharomyces cerevisiae

    Energy Technology Data Exchange (ETDEWEB)

    Fay, Justin C.; McCullough, Heather L.; Sniegowski, Paul D.; Eisen, Michael B.

    2004-02-25

    The relationship between genetic variation in gene expression and phenotypic variation observable in nature is not well understood. Identifying how many phenotypes are associated with differences in gene expression and how many gene-expression differences are associated with a phenotype is important to understanding the molecular basis and evolution of complex traits. Results: We compared levels of gene expression among nine natural isolates of Saccharomyces cerevisiae grown either in the presence or absence of copper sulfate. Of the nine strains, two show a reduced growth rate and two others are rust colored in the presence of copper sulfate. We identified 633 genes that show significant differences in expression among strains. Of these genes, 20 were correlated with resistance to copper sulfate and 24 were correlated with rust coloration. The function of these genes in combination with their expression pattern suggests the presence of both correlative and causative expression differences. But the majority of differentially expressed genes were not correlated with either phenotype and showed the same expression pattern both in the presence and absence of copper sulfate. To determine whether these expression differences may contribute to phenotypic variation under other environmental conditions, we examined one phenotype, freeze tolerance, predicted by the differential expression of the aquaporin gene AQY2. We found freeze tolerance is associated with the expression of AQY2. Conclusions: Gene expression differences provide substantial insight into the molecular basis of naturally occurring traits and can be used to predict environment dependent phenotypic variation.

  16. MEFV gene variations in patients with systemic lupus erythematosus.

    Science.gov (United States)

    Erer, Burak; Cosan, Fulya; Oku, Basar; Ustek, Duran; Inanc, Murat; Aral, Orhan; Gul, Ahmet

    2014-01-01

    The aim of this study was to investigate the frequency of familial Mediterranean fever (FMF)-associated MEFV gene variations in patients with systemic lupus erythematosus (SLE). The study group comprised 190 SLE patients and 101 healthy controls of Turkish origin with no clinical features of FMF. All individuals were genotyped for the four most common MEFV gene variations (M694V, M680I, V726A and E148Q) by PCR-restriction fragment length polymorphism analysis. The frequency of carrying any of the four MEFV gene variations under study was 15 % in patients with SLE and 10 % in the healthy controls (p = 0.23). After the exclusion of the less penetrant E148Q variation, re-analysis for the three penetrant mutations revealed a significant association between exon 10 variations and pericarditis [p = 0.038, odds ratio (OR) 3.5, 95 % confidence interval (CI) 1.0-12.1], and pleural effusion (p = 0.043, OR 5.2, 95 % CI 0.8-30.9). No significant association was detected between the MEFV gene variations and a higher acute phase response. The MEFV gene variations analyzed in our study do not seem to increase the overall susceptibility to SLE and do not have any strong association with its clinical manifestations. The possibility of a modest effect of penetrant exon 10 MEFV variants on the development of serosal effusions needs to be explored in a larger series of patients.

  17. Sequence signatures involved in targeting the male-specific lethal complex to X-chromosomal genes in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Philip Philge

    2012-03-01

    Background: In Drosophila melanogaster, the dosage-compensation system that equalizes X-linked gene expression between males and females, thereby assuring that an appropriate balance is maintained between the expression of genes on the X chromosome(s) and the autosomes, is at least partially mediated by the Male-Specific Lethal (MSL) complex. This complex binds to genes with a preference for exons on the male X chromosome with a 3' bias, and it targets most expressed genes on the X chromosome. However, a number of genes are expressed but not targeted by the complex. High affinity sites seem to be responsible for initial recruitment of the complex to the X chromosome, but the targeting to and within individual genes is poorly understood. Results: We have extensively examined X chromosome sequence variation within five types of gene features (promoters, 5' UTRs, coding sequences, introns, 3' UTRs) and intergenic sequences, and assessed its potential involvement in dosage compensation. Presented results show that: the X chromosome has a distinct sequence composition within its gene features; some of the detected variation correlates with genes targeted by the MSL-complex; the insulator protein BEAF-32 preferentially binds upstream of MSL-bound genes; BEAF-32 and MOF co-localize in promoters; and that bound genes have a distinct sequence composition that shows a 3' bias within coding sequence. Conclusions: Although many strongly bound genes are close to a high affinity site (HAS), neither our promoter motif nor our coding sequence signatures show any correlation to HAS. Based on the results presented here, we believe that there are sequences in the promoters and coding sequences of targeted genes that have the potential to direct the secondary spreading of the MSL-complex to nearby genes.

  18. Sequence signatures involved in targeting the Male-Specific Lethal complex to X-chromosomal genes in Drosophila melanogaster.

    Science.gov (United States)

    Philip, Philge; Pettersson, Fredrik; Stenberg, Per

    2012-03-19

    In Drosophila melanogaster, the dosage-compensation system that equalizes X-linked gene expression between males and females, thereby assuring that an appropriate balance is maintained between the expression of genes on the X chromosome(s) and the autosomes, is at least partially mediated by the Male-Specific Lethal (MSL) complex. This complex binds to genes with a preference for exons on the male X chromosome with a 3' bias, and it targets most expressed genes on the X chromosome. However, a number of genes are expressed but not targeted by the complex. High affinity sites seem to be responsible for initial recruitment of the complex to the X chromosome, but the targeting to and within individual genes is poorly understood. We have extensively examined X chromosome sequence variation within five types of gene features (promoters, 5' UTRs, coding sequences, introns, 3' UTRs) and intergenic sequences, and assessed its potential involvement in dosage compensation. Presented results show that: the X chromosome has a distinct sequence composition within its gene features; some of the detected variation correlates with genes targeted by the MSL-complex; the insulator protein BEAF-32 preferentially binds upstream of MSL-bound genes; BEAF-32 and MOF co-localize in promoters; and that bound genes have a distinct sequence composition that shows a 3' bias within coding sequence. Although many strongly bound genes are close to a high affinity site (HAS), neither our promoter motif nor our coding sequence signatures show any correlation to HAS. Based on the results presented here, we believe that there are sequences in the promoters and coding sequences of targeted genes that have the potential to direct the secondary spreading of the MSL-complex to nearby genes.

  19. Degenerative primer design and gene sequencing validation for select turkey genes.

    Science.gov (United States)

    Hutsko, Stephanie L; Lilburn, Michael S; Wick, Macdonald

    2016-06-01

    We successfully designed and validated degenerative primers for the turkey genes MUC2, RPS13, TBP and TFF2 based on chicken sequences, in order to use gene transcription analysis to evaluate (quantify) mucin transcription in response to probiotic supplementation in turkeys. Primers were designed for the genes MUC2, TFF2, RPS13 and TBP using a degenerative primer design method based on the available Gallus gallus sequences. All primer sets produced a single PCR amplicon of the expected size; the amplicons were cloned into the TOPO® vector and then transformed into TOP10® competent cells. Plasmid DNA was isolated from the TOP10® cell culture and sent for sequencing. Sequences were analyzed using NCBI BLAST. All genes sequenced had over 90% homology with both the chicken and predicted turkey sequences. The sequences were used to design new 100% homologous primer sets for the genes of interest. © 2016 Poultry Science Association Inc.

  20. Regulatory sequence of cupin family gene

    Energy Technology Data Exchange (ETDEWEB)

    Hood, Elizabeth; Teoh, Thomas

    2017-07-25

    This invention is in the field of plant biology and agriculture and relates to novel seed specific promoter regions. The present invention further provides methods of producing proteins and other products of interest and methods of controlling expression of nucleic acid sequences of interest using the seed specific promoter regions.

  1. Sequence Variation and Molecular Phylogeny of Mitochondrial COI Gene Segments from Five Species of Gobiidae Family

    Institute of Scientific and Technical Information of China (English)

    廖健; 张顺; 龙水生; 黄承勤; 郭昱嵩; 王中铎; 刘楚吾

    2016-01-01

    Forty-five mitochondrial COI gene sequences were obtained from juveniles of five goby species from the mangrove waters of the Leizhou Peninsula; the average contents of T, C, A and G were 29.9%, 28.4%, 23.9% and 17.8%, respectively. A total of 23 haplotypes were defined, with 184 variable nucleotide positions; the haplotype proportion was highest in Tridentiger bifasciatus (80.0%) and lowest in Gobiopterus lacustris (18.2%). The inter- and intraspecific genetic distances (Kimura two-parameter, K2P) ranged from 0.1376 to 0.2635 and from 0.0000 to 0.0208, respectively. Neighbor-joining and maximum-likelihood phylogenetic trees both showed that, among the gobies clustered into five branches, Mugilogobius abei and M. chulae were the most closely related, while Gobiopterus lacustris was somewhat more distant from the other four species. Neutrality tests based on within-population allele frequency distributions indicated that the T. bifasciatus, Pseudogobius masago, M. abei and M. chulae populations carried a large number of low-frequency allelic loci, whereas the G. lacustris population was dominated by moderate-frequency alleles.
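
    The Kimura 2-parameter (K2P) distances reported above are computed from the proportions of transitions (P) and transversions (Q) between two aligned sequences, as d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The following is a minimal sketch of that standard formula; the function name `k2p_distance` and the toy sequences are hypothetical, and skipping gapped or ambiguous positions is an assumption, not necessarily what the authors' software does.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy pair (made-up data): one transition (A->G) in ten sites.
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

    With one transition in ten sites (P = 0.1, Q = 0), the corrected distance is slightly larger than the raw proportion of differences, which is the point of the correction.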

  2. Cloning, sequencing and variability analysis of the gap gene from Mycoplasma hominis

    DEFF Research Database (Denmark)

    Mygind, Tina; Jacobsen, Iben Søgaard; Melkova, Renata

    2000-01-01

    The gap gene encodes the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The gene was cloned and sequenced from the Mycoplasma hominis type strain PG21(T). The intraspecies variability was investigated by inspection of restriction fragment length polymorphism (RFLP) patterns after polymerase chain reaction (PCR) amplification of the gap gene from 15 strains and furthermore by sequencing of part of the gene in eight strains. The M. hominis gap gene was found to vary more than the Escherichia coli counterpart, but the variation at nucleotide level gave rise to only a few...... to a 104-kDa band in addition to the expected 36-kDa band. The protein reacting at 104 kDa is a M. hominis protein with either an epitope similar to one on GAPDH, or it is an immunoglobulin binding protein...

  3. EEG Sequence Imaging: A Markov Prior for the Variational Garrote

    DEFF Research Database (Denmark)

    Hansen, Sofie Therese; Hansen, Lars Kai

    2013-01-01

    We propose the following generalization of the Variational Garrote for sequential EEG imaging: A Markov prior to promote sparse, but temporally smooth source dynamics. We derive a set of modified Variational Garrote updates and analyze the role of the prior's hyperparameters. An experimental evalua...

  4. A role for gene duplication and natural variation of gene expression in the evolution of metabolism.

    Directory of Open Access Journals (Sweden)

    Daniel J Kliebenstein

    BACKGROUND: Most eukaryotic genomes have undergone whole genome duplications during their evolutionary history. Recent studies have shown that the function of these duplicated genes can diverge from the ancestral gene via neo- or sub-functionalization within single genotypes. An additional possibility is that gene duplicates may also undergo partitioning of function among different genotypes of a species leading to genetic differentiation. Finally, the ability of gene duplicates to diverge may be limited by their biological function. METHODOLOGY/PRINCIPAL FINDINGS: To test these hypotheses, I estimated the impact of gene duplication and metabolic function upon intraspecific gene expression variation of segmental and tandem duplicated genes within Arabidopsis thaliana. In all instances, the younger tandem duplicated genes showed higher intraspecific gene expression variation than the average Arabidopsis gene. Surprisingly, the older segmental duplicates also showed evidence of elevated intraspecific gene expression variation albeit typically lower than for the tandem duplicates. The specific biological function of the gene as defined by metabolic pathway also modulated the level of intraspecific gene expression variation. The major energy metabolism and biosynthetic pathways showed decreased variation, suggesting that they are constrained in their ability to accumulate gene expression variation. In contrast, a major herbivory defense pathway showed significantly elevated intraspecific variation suggesting that it may be under pressure to maintain and/or generate diversity in response to fluctuating insect herbivory pressures. CONCLUSION: These data show that intraspecific variation in gene expression is facilitated by an interaction of gene duplication and biological activity. Further, this plays a role in controlling diversity of plant metabolism.

  5. Mechanism of Gene Amplification via Yeast Autonomously Replicating Sequences

    Directory of Open Access Journals (Sweden)

    Shelly Sehgal

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. The interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR amongst these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification was that AT rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification.

  6. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group...
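
    The kind of search described above (finding the most frequent 9-10mers and comparing motif density between gene groups) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names `kmer_counts` and `motif_density_per_kb` are hypothetical, the toy sequence and motif are made up, and only the forward strand is scanned.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def motif_density_per_kb(seq, motif):
    """Overlapping occurrences of `motif` per 1000 bp of `seq`."""
    if not seq:
        return 0.0
    hits = kmer_counts(seq, len(motif))[motif.upper()]
    return 1000 * hits / len(seq)


# Toy data (made-up): a 10 bp sequence and a 4 bp motif.
toy_seq = "ACGTACGTAC"
counts = kmer_counts(toy_seq, 4)
density = motif_density_per_kb(toy_seq, "ACGT")
```

    Comparing `motif_density_per_kb` across the concatenated coding sequences of each annotated gene group would give the per-group densities; a real analysis would also scan the reverse complement and test the differences statistically.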

  7. Pathogenicity gene variations within the order Entomophthorales

    DEFF Research Database (Denmark)

    Grell, Morten Nedergaard; Jensen, Annette Bruun; Lange, Lene

    Fungi within the order Entomophthorales (subphylum Entomophthoromycotina) are obligate biotrophic pathogens of arthropods with a remarkably narrow host range. Infection takes place through the cuticle when conidia hit a susceptible host, facilitated by enzymatic and mechanical mechanisms. In the ...... pathogenicity genes within genera Entomophthora and Pandora, using fungal genomic DNA originating from field-collected, infected insect host species of dipteran (flies, mosquitoes) or hemipteran (aphid) origin....

  8. Pathogenicity gene variations within the order Entomophthorales

    DEFF Research Database (Denmark)

    Grell, Morten Nedergaard; Jensen, Annette Bruun; Lange, Lene

    , conidia are produced and discharged when humidity gets high—usually during night. In an earlier secretome study of field-collected grain aphids (Sitobion avenae) infected with entomophthoralean fungi, a number of pathogenesis-related, secreted enzymes were discovered (Fungal Genetics and Biology 2011, vol...... pathogenicity genes within genera Entomophthora and Pandora, using fungal genomic DNA originating from field-collected, infected insect host species of dipteran (flies, mosquitoes) or hemipteran (aphid) origin....

  9. Sequence Variation in the Gp120 region of SHIV-CN97001 during in vivo Passage

    Institute of Scientific and Technical Information of China (English)

    Qiang LIU; Gui-bo YANG; Yue MA; Chen-li QIU; Jie-jie DAI; Hui XING; Yi-ming SHAO

    2008-01-01

    SHIV-CN97001, which includes genes of the predominant prevalent HIV-1 strain in China, has played an important role in assessing the immune effect and strategy of AIDS vaccines. In this study, SHIV-CN97001 was serially passaged in vivo to construct a pathogenic SHIV-CN97001/rhesus macaque model. To identify variation in the gp120 region of SHIV-CN97001 during passage, fragments of the gp120 gene were amplified by RT-PCR from the plasma of SHIV-CN97001-infected animals at the time point of peak viral load, and the genetic distances (divergence, diversity) were calculated using DISTANCE. The analysis revealed that the genetic distances of SHIV-CN97001 were highest in the third-passage animals. There was a relationship between viral divergence from the founder strain and viral replication ability. The nucleic acid sequence of the V3 region was highly conserved. All of the SHIV-CN97001 strains had the V3 loop central motif (GPGQ) and were predicted to use the CCR5 co-receptor on the basis of the critical amino acids within the V3 loop. These results show that there was no significant increase in the genetic distance during serial passage, and that the SHIV-CN97001 gp120 gene evolved toward ancestral states upon transmission to a new host. This could partly explain why no pathogenic viral strain was obtained during in vivo passage.

  10. Genetic variation in V gene of class II Newcastle disease virus.

    Science.gov (United States)

    Hao, Huafang; Chen, Shengli; Liu, Peng; Ren, Shanhui; Gao, Xiaolong; Wang, Yanping; Wang, Xinglong; Zhang, Shuxia; Yang, Zengqi

    2016-01-01

    The genetic variation and molecular evolution of the V gene of class II Newcastle disease virus (NDV) isolates of genotypes I-XVIII were determined using bioinformatics. Results indicated that homology was low between viruses of different genotypes and generally high within the same genotype, with possible exceptions within genotypes I, V, VI, and XII. Sequence analysis showed that the genetic variation of the V protein was consistent with virus genotype, and specific signatures on the V protein for nine genotypes were identified. Phylogenetic analysis demonstrated that the phylogenetic trees were highly consistent between the V and F genes, with slight discrepancies in the sub-genotypes. Evolutionary rate analyses based on the V and F genes revealed that evolutionary rates varied among genotypes. These data indicate that the genetic variation of the V protein is genotype-related and will help in elucidating the molecular evolution of NDV.

  11. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    Energy Technology Data Exchange (ETDEWEB)

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.

  12. Associations between Variation in X Chromosome Male Reproductive Genes and Sperm Competitive Ability in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Leah Greenspan

    2011-01-01

    Variation in reproductive success has long been thought to be mediated in part by genes encoding seminal proteins. Here we explore the effect on male reproductive phenotypes of polymorphisms on the X chromosome, which is depauperate in genes encoding seminal proteins. Using 57 X chromosome substitution lines, sperm competition was tested both when the males from the wild-extracted line were the first to mate, followed by a tester male (“defense” crosses), and when extracted-line males were the second to mate, after a tester male (“offense” crosses). We scored the proportion of progeny sired by each male, the fecundity, the remating rate and refractoriness to remating, and tested the significance of variation among lines. Eleven candidate genes were chosen based on previous studies, and portions of these genes were sequenced in all 57 lines. A total of 131 polymorphisms were tested for associations with the reproductive phenotypes using linear models. Nine polymorphisms in 4 genes were found to show significant associations (at a 5% FDR). Overall, it appears that the X chromosomes harbor abundant variation in sperm competition, especially considering the paucity of seminal protein genes. This suggests that much of the male reproductive variation lies outside of genes that encode seminal proteins.
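
    The abstract above reports associations that are significant at a 5% false discovery rate (FDR) but does not name the procedure used. As a hedged illustration, the sketch below applies the Benjamini-Hochberg step-up procedure, which is one common way to control the FDR; whether the authors used it is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are rejected at the given FDR."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five association tests (illustration only).
p_values = [0.01, 0.045, 0.02, 0.005, 0.5]
print(benjamini_hochberg(p_values, fdr=0.05))
```

    Note that a test with p = 0.045 passes an uncorrected 0.05 threshold but is not rejected here, which is the kind of filtering that matters when 131 polymorphisms are tested at once.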

  13. Sequence variation in the guillemot (Alcidae: Cepphus) mitochondrial control region and its nuclear homolog.

    Science.gov (United States)

    Kidd, M G; Friesen, V L

    1998-01-01

    We describe sequence variation in the mitochondrial control region and its nuclear homolog in three species and seven subspecies of guillemots (Cepphus spp.). Nuclear homologs of the 5' end of the control region were found in all individuals. Nuclear sequences were approximately 50% divergent from their mitochondrial counterparts and formed a distinct phylogenetic clade; the mitochondrial-nuclear introgression event must have predated the radiation of Cepphus. As in other vertebrates, the guillemot control region has a relatively conserved central block flanked by hypervariable 5' and 3' ends. Mean pairwise interspecific divergence values among control regions were lower than those in other birds. All individuals were heteroplasmic for the number of simple tandem nucleotide repeats (A(n)C) at the 3' end of the control region. Phylogenetic analyses suggest that black guillemots are basal to pigeon and spectacled guillemots, but evolutionary relationships among subspecies remain unresolved, possibly due to incomplete lineage sorting. Describing molecular variation in nuclear homologs of mitochondrial genes is of general interest in phylogenetics because, if undetected, the homologs may confound interpretations of mitochondrial phylogenies.

  14. Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity.

    Science.gov (United States)

    Hinds-Frey, K R; Nishikata, H; Litman, R T; Litman, G W

    1993-09-01

    In Heterodontus, a phylogenetically primitive shark species, the variable (VH), diversity (DH), joining (JH) segments, and constant (CH) exons are organized in individual approximately 18-20-kb "clusters." A single large VH family with > 90% nucleic acid homology and a monotypic second gene family are identified by extensive screening of a genomic DNA library. Little variation in the nucleotide sequences of DH segments from different germline gene clusters is evident, suggesting that the early role for DH was in promoting junctional diversity rather than contributing unique coding specificities. A gene-specific oligodeoxynucleotide screening method was used to relate specific transcription products (cDNAs) to individual gene clusters and showed that gene rearrangements are intra- rather than intercluster. This provides further evidence for restricted diversity in the immunoglobulin heavy chain of Heterodontus, from which it is inferred that combinatorial diversity is a more recently acquired means for generating diversity. The observed differences between cDNA sequences selected and the sequences of segmental elements derived from conventional genomic libraries as well as from VH segment-specific libraries generated by direct PCR amplification of genomic DNA indicate that the VH repertoire is diversified by both junctional diversity and somatic mutation. Taken together, these findings suggest a heretofore unrecognized contribution of somatic variation that preceded both extensive diversification of the germline repertoire and the combinatorial joining process in the evolution of humoral immunity.

  15. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    Science.gov (United States)

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  16. Genetic analysis of the PKHD1 gene with long-range PCR sequencing.

    Science.gov (United States)

    Tong, Yong-Qing; Liu, Bei; Fu, Chao-Hong; Zheng, Hong-Yun; Gu, Jian; Liu, Hang; Luo, Hong-Bo; Li, Yan

    2016-10-01

    Mutations in the PKHD1 gene are responsible for autosomal recessive polycystic kidney disease (ARPKD). However, detecting these mutations by conventional polymerase chain reaction (PCR) is inconvenient because the open reading frame of PKHD1 is very long. Recently, long-range (LR) PCR followed by direct sequencing has been demonstrated to be a more sensitive mutation screening method for PKHD1. In this study, the entire PKHD1 coding region was amplified in 29 LR PCR reactions, generating products of 1 to 7 kb and avoiding specific PCR amplification of individual exons. This method was compared, in 15 patients with ARPKD, with standard direct sequencing of each individual exon of the gene by a reference laboratory. A total of 37 genetic changes were detected with LR PCR sequencing, which included 33 variations identified by the reference laboratory with standard direct sequencing. LR PCR sequencing had 100% sensitivity, 96% specificity, and 97.0% accuracy, which were higher than those of the standard direct sequencing method. In conclusion, LR PCR sequencing is a reliable method with high sensitivity, specificity and accuracy for detecting genetic variations. It also has more intronic coverage and lower cost, and is an applicable clinical method for complex genetic analyses.

  17. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    Directory of Open Access Journals (Sweden)

    Wu Bai-Lin

    2009-10-01

    Background: One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler, making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process resulting in reduced cost and increased throughput. Results: An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other

  18. MHC class II genes in the European badger (Meles meles) : Characterization, patterns of variation, and transcription analysis

    NARCIS (Netherlands)

    Sin, Yung Wa; Dugdale, Hannah L.; Newman, Chris; Macdonald, David W.; Burke, Terry

    The major histocompatibility complex (MHC) comprises many genes, some of which are polymorphic with numerous alleles. Sequence variation among alleles is most pronounced in exon 2 of the class II genes, which encodes the alpha 1 and beta 1 domains that form the antigen-binding site (ABS) for the

  19. MHC class II genes in the European badger (Meles meles) : Characterization, patterns of variation, and transcription analysis

    NARCIS (Netherlands)

    Sin, Yung Wa; Dugdale, Hannah L.; Newman, Chris; Macdonald, David W.; Burke, Terry

    2012-01-01

    The major histocompatibility complex (MHC) comprises many genes, some of which are polymorphic with numerous alleles. Sequence variation among alleles is most pronounced in exon 2 of the class II genes, which encodes the alpha 1 and beta 1 domains that form the antigen-binding site (ABS) for the pre

  20. Sequence variation of the 16S to 23S rRNA spacer region in Salmonella enterica.

    Science.gov (United States)

    Christensen, H; Møller, P L; Vogensen, F K; Olsen, J E

    2000-01-01

    The possibility for identification of Salmonella enterica serotypes by sequence analysis of the 16S to 23S rRNA internal transcribed spacer was investigated by direct sequencing of polymerase chain reaction-amplified DNA from all operons simultaneously in a collection of 25 strains of 18 different serotypes of S. enterica, and by sequencing individual cloned operons from a single strain. It was only possible to determine the first 117 bases upstream from the 23S rRNA gene by direct sequencing because of variation between the rrn operons. Comparison of sequences from this region allowed separation of only 15 out of the 18 serotypes investigated and was not specific even at the subspecies level of S. enterica. To determine the differences between internal transcribed spacers in more detail, the individual rrn operons of strain JEO 197, serotype IV 43:z4,z23:-, were cloned and sequenced. The strain contained four short internal transcribed spacer fragments of 382-384 bases in length, which were 98.4-99.7% similar to each other and three long fragments of 505 bases with 98.0-99.8% similarity. The study demonstrated a higher degree of interbacterial variation than intrabacterial variation between operons for serotypes of S. enterica.

  1. Epigenetic variation in the Egfr gene generates quantitative variation in a complex trait in ants.

    Science.gov (United States)

    Alvarado, Sebastian; Rajakumar, Rajendhran; Abouheif, Ehab; Szyf, Moshe

    2015-03-11

    Complex quantitative traits, like size and behaviour, are a pervasive feature of natural populations. Quantitative trait variation is the product of both genetic and environmental factors, yet little is known about the mechanisms through which their interaction generates this variation. Epigenetic processes, such as DNA methylation, can mediate gene-by-environment interactions during development to generate discrete phenotypic variation. We therefore investigated the developmental role of DNA methylation in generating continuous size variation of workers in an ant colony, a key trait associated with division of labour. Here we show that, in the carpenter ant Camponotus floridanus, global (genome-wide) DNA methylation indirectly regulates quantitative methylation of the conserved cell-signalling gene Epidermal growth factor receptor to generate continuous size variation of workers. DNA methylation can therefore generate quantitative variation in a complex trait by quantitatively regulating the transcription of a gene. This mechanism, alongside genetic variation, may determine the phenotypic possibilities of loci for generating quantitative trait variation in natural populations.

  2. Mitochondrial D-loop sequence variation among Italian horse breeds

    Directory of Open Access Journals (Sweden)

    Zanotti Marta

    2004-11-01

    The genetic variability of the mitochondrial D-loop DNA sequence in seven horse breeds bred in Italy (Giara, Haflinger, Italian trotter, Lipizzan, Maremmano, Thoroughbred and Sarcidano) was analysed. Five unrelated horses were chosen in each breed and twenty-two haplotypes were identified. The sequences obtained were aligned and compared with a reference sequence and with 27 mtDNA D-loop sequences selected in the GenBank database, representing Spanish, Portuguese, North African, wild horses and an Equus asinus sequence as the outgroup. Kimura two-parameter distances were calculated and a cluster analysis using the Neighbour-joining method was performed to obtain phylogenetic trees among breeds bred in Italy and among Italian and foreign breeds. The cluster analysis indicates that all the breeds but Giara are divided in the two trees, and no clear relationships were revealed between Italian populations and the other breeds. These results could be interpreted as showing the mixed origin of breeds bred in Italy and probably indicate the presence of many ancient maternal lineages with high diversity in mtDNA sequences.
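
    The Kimura two-parameter distance mentioned above can be written as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The following is a minimal sketch of that formula for one pair of aligned sequences; the function name and the toy sequences are illustrative assumptions, not material from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (illustration only): one transition, then one transversion.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
print(k2p_distance("AAAAAAAAAA", "CAAAAAAAAA"))
```

    A matrix of such pairwise distances is the usual input to a neighbour-joining clustering step like the one the abstract describes.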

  3. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    Directory of Open Access Journals (Sweden)

    Huajing Teng

    2016-07-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  4. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    Science.gov (United States)

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  5. Genetic variation in telomere maintenance genes in relation to ovarian cancer survival.

    Science.gov (United States)

    Harris, Holly R; Vivo, Immaculata De; Titus, Linda J; Vitonis, Allison F; Wong, Jason Y Y; Cramer, Daniel W; Terry, Kathryn L

    2012-01-01

    Telomeres are repetitive non-coding DNA sequences at the ends of chromosomes that provide protection against chromosomal instability. Telomere length and stability are influenced by proteins, including telomerase which is partially encoded by the TERT gene. Genetic variation in the TERT gene is associated with ovarian cancer risk, and predicts survival in lung cancer and glioma. We investigated whether genetic variation in five telomere maintenance genes was associated with survival among 1480 cases of invasive epithelial ovarian cancer in the population-based New England Case-Control Study. Cox proportional hazard models were used to calculate hazard ratios and 95% confidence intervals. Overall we observed no significant associations between SNPs in telomere maintenance genes and mortality using a significance threshold of p=0.001. However, we observed some suggestive associations in subgroup analyses. Future studies with larger populations may further our understanding of what role telomeres play in ovarian cancer survival.

  6. Sweet taste receptor gene variation and aspartame taste in primates and other species.

    Science.gov (United States)

    Li, Xia; Bachmanov, Alexander A; Maehashi, Kenji; Li, Weihua; Lim, Raymond; Brand, Joseph G; Beauchamp, Gary K; Reed, Danielle R; Thai, Chloe; Floriano, Wely B

    2011-06-01

    Aspartame is a sweetener added to foods and beverages as a low-calorie sugar replacement. Unlike sugars, which are apparently perceived as sweet and desirable by a range of mammals, the ability to taste aspartame varies, with humans, apes, and Old World monkeys perceiving aspartame as sweet but not other primate species. To investigate whether the ability to perceive the sweetness of aspartame correlates with variations in the DNA sequence of the genes encoding sweet taste receptor proteins, T1R2 and T1R3, we sequenced these genes in 9 aspartame taster and nontaster primate species. We then compared these sequences with sequences of their orthologs in 4 other nontasters species. We identified 9 variant sites in the gene encoding T1R2 and 32 variant sites in the gene encoding T1R3 that distinguish aspartame tasters and nontasters. Molecular docking of aspartame to computer-generated models of the T1R2 + T1R3 receptor dimer suggests that species variation at a secondary, allosteric binding site in the T1R2 protein is the most likely origin of differences in perception of the sweetness of aspartame. These results identified a previously unknown site of aspartame interaction with the sweet receptor and suggest that the ability to taste aspartame might have developed during evolution to exploit a specialized food niche.

  7. Sequence Comparison of Partial Cytochrome b Genes of Two Coilia species

    Institute of Scientific and Technical Information of China (English)

    LIU Jinxian; GAO Tianxiang; WANG Yujiang; ZHANG Yaping

    2005-01-01

    Sequence variation of partial cytochrome b genes between two Coilia species, C. ectenes and C. mystus, was investigated. Of the 402 nucleotides, twenty-seven (6.72%) are polymorphic and all are synonymous substitutions. At the third codon positions of the cytochrome b gene, the two species show an extreme anti-G bias (<4%) and a pronounced bias towards A and C (>68%). There is no amino acid sequence divergence between the partial cytochrome b genes of the two species, indicating a close genetic relationship between them. The Kimura two-parameter (K2P) genetic distance of the partial cytochrome b segment between the two species is 0.072, suggesting that the species separated 3.6 Ma ago, in the middle Pliocene. Our result reveals that the cytochrome b gene is an appropriate marker for studies of population genetic structures and phylogeographic patterns of the two species.
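
    The arithmetic behind the figures above can be checked with a short sketch. The abstract does not state the clock rate it used; a pairwise rate of 0.02 substitutions per site per million years is assumed here because it is the rate implied by the reported numbers (0.072 / 3.6 Ma). The function name is illustrative.

```python
def divergence_time_ma(distance, pairwise_rate_per_ma):
    """Divergence time in Ma from a pairwise distance and a pairwise clock rate."""
    return distance / pairwise_rate_per_ma


# Share of polymorphic sites: 27 of 402 nucleotides.
polymorphic_percent = 27 / 402 * 100

# Assumed pairwise rate (2% per Ma), inferred from the abstract's own figures.
assumed_rate = 0.02
split_time = divergence_time_ma(0.072, assumed_rate)

print(round(polymorphic_percent, 2))
print(round(split_time, 1))
```

    With a different assumed rate the estimated split time changes proportionally, so the 3.6 Ma figure is only as reliable as the clock calibration behind it.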

  8. Protein 3D structure computed from evolutionary sequence variation.

    Directory of Open Access Journals (Sweden)

    Debora S Marks

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å C(α)-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of

  9. Protein 3D structure computed from evolutionary sequence variation.

    Science.gov (United States)

    Marks, Debora S; Colwell, Lucy J; Sheridan, Robert; Hopf, Thomas A; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å C(α)-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures

  10. High frequency of HMW-GS sequence variation through somatic hybridization between Agropyron elongatum and common wheat.

    Science.gov (United States)

    Gao, Xin; Liu, Shu Wei; Sun, Qun; Xia, Guang Min

    2010-01-01

    A symmetric somatic hybridization was performed to combine the protoplasts of tall wheatgrass (Agropyron elongatum) and bread wheat (Triticum aestivum). Fertile regenerants were obtained which were morphologically similar to tall wheatgrass, but which contained some introgression segments from wheat. An SDS-PAGE analysis showed that a number of non-parental high-molecular weight glutenin subunits (HMW-GS) were present in the symmetric somatic hybridization derivatives. These sequences were amplified, cloned and sequenced, to deliver 14 distinct HMW-GS coding sequences, eight of which were of the y-type (Hy1-Hy8) and six of the x-type (Hx1-Hx6). Five of the cloned HMW-GS sequences were successfully expressed in E. coli. The analysis of their deduced peptide sequences showed that they all possessed the typical HMW-GS primary structure. Sequence alignments indicated that Hx5 and Hy1 were probably derived from the tall wheatgrass genes Aex5 and Aey6, while Hy2, Hy3, Hx1 and Hy6 may have resulted from slippage in the replication of a related biparental gene. We found that both symmetric and asymmetric somatic hybridization could promote the emergence of novel alleles. We discuss the origin of allelic variation of HMW-GS genes in somatic hybridization, which might result from the response to genomic shock triggered by the merger and interaction of the biparental genomes.

  11. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    Science.gov (United States)

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions.

  12. Combinatorial pooling enables selective sequencing of the barley gene space.

    Directory of Open Access Journals (Sweden)

    Stefano Lonardi

    2013-04-01

    For the vast majority of species - including many economically or ecologically important organisms - progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
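
    The idea of pooling followed by deconvolution can be sketched in a few lines. This is a deliberately simplified toy, not the protocol from the abstract: it assumes each BAC is placed in a unique combination of pools (its signature) and that a read is assigned to every BAC whose signature is contained in the set of pools where the read was observed. All names and the pool layout are hypothetical.

```python
from itertools import combinations


def assign_signatures(bacs, n_pools, pools_per_bac):
    """Give each BAC a distinct set of pools (its signature)."""
    signatures = combinations(range(n_pools), pools_per_bac)
    design = {bac: frozenset(sig) for bac, sig in zip(bacs, signatures)}
    if len(design) < len(bacs):
        raise ValueError("not enough distinct pool combinations for all BACs")
    return design


def deconvolve(observed_pools, design):
    """Return the BACs whose whole signature appears among the observed pools."""
    observed = frozenset(observed_pools)
    return [bac for bac, sig in design.items() if sig <= observed]


# Hypothetical layout: four BACs, four pools, each BAC placed in two pools.
design = assign_signatures(["bac1", "bac2", "bac3", "bac4"], n_pools=4, pools_per_bac=2)

print(deconvolve([0, 2], design))     # read seen only in pools 0 and 2
print(deconvolve([0, 1, 2], design))  # read shared by overlapping clones
```

    The second call returns several candidates, which illustrates why a real pooling design has to be chosen so that signatures of overlapping clones remain distinguishable.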

  13. Citrus plastid-related gene profiling based on expressed sequence tag analyses

    Directory of Open Access Journals (Sweden)

    Tercilio Calsa Jr.

    2007-01-01

    Full Text Available Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).

  14. AluScan: a method for genome-wide scanning of sequence and structure variations in the human genome

    Directory of Open Access Journals (Sweden)

    Mei Lingling

    2011-11-01

    Full Text Available Abstract Background To complement next-generation sequencing technologies, there is a pressing need for efficient pre-sequencing capture methods with reduced costs and DNA requirement. The Alu family of short interspersed nucleotide elements is the most abundant type of transposable elements in the human genome and a recognized source of genome instability. With over one million Alu elements distributed throughout the genome, they are well positioned to facilitate genome-wide sequence amplification and capture of regions likely to harbor genetic variation hotspots of biological relevance. Results Here we report on the use of inter-Alu PCR with an enhanced range of amplicons in conjunction with next-generation sequencing to generate an Alu-anchored scan, or 'AluScan', of DNA sequences between Alu transposons, where Alu consensus sequence-based 'H-type' PCR primers that elongate outward from the head of an Alu element are combined with 'T-type' primers elongating from the poly-A containing tail to achieve huge amplicon range. To illustrate the method, glioma DNA was compared with white blood cell control DNA of the same patient by means of AluScan. The over 10 Mb sequences obtained, derived from more than 8,000 genes spread over all the chromosomes, revealed a highly reproducible capture of genomic sequences enriched in genic sequences and cancer candidate gene regions. Requiring only sub-micrograms of sample DNA, the power of AluScan as a discovery tool for genetic variations was demonstrated by the identification of 357 instances of loss of heterozygosity, 341 somatic indels, 274 somatic SNVs, and seven potential somatic SNV hotspots between control and glioma DNA. Conclusions AluScan, implemented with just a small number of H-type and T-type inter-Alu PCR primers, provides an effective capture of a diversity of genome-wide sequences for analysis. The method, by enabling an examination of gene-enriched regions containing exons, introns, and

  15. [Characterization of 5S rRNA gene sequence and secondary structure in gymnosperms].

    Science.gov (United States)

    Liu, Zhan-Lin; Zhang, Da-Ming; Wang, Xiao-Ru

    2003-01-01

    In higher plants the primary and the secondary structures of the 5S ribosomal RNA gene are considered highly conserved. Little is known about the 5S rRNA gene structure, organization and variation in gymnosperms. In this study we analyzed sequence and structure variation of the 5S rRNA gene in Pinus through cloning and sequencing multiple copies of 5S rDNA repeats from individual trees of five pines, P. bungeana, P. tabulaeformis, P. yunnanensis, P. massoniana and P. densata. Pinus bungeana is from the subgenus Strobus while the other four are from the subgenus Pinus (diploxylon pines). Our results revealed variations in both primary and secondary structure among copies of 5S rDNA within individual genomes and between species. The 5S rRNA gene in Pinus is 120 bp long in most of the 122 clones we sequenced, except for one or two deletions in three clones. Among these clones 50 unique sequences were identified and they were shared by different pine species. Our sequences were compared to 13 sequences each representing a different gymnosperm species, and to six sequences representing both angiosperm monocots and dicots. Average sequence similarity was 97.1% among Pinus species and 94.3% between Pinus and other gymnosperms. Between gymnosperms and angiosperms the sequence similarity decreased to 88.1%. Similar to other molecular data, significant sequence divergence was found between the two Pinus subgenera. The 5S gene tree (neighbor-joining tree) grouped the four diploxylon pines together and separated them distinctly from P. bungeana. Comparison of sequence divergence within individuals and between species suggested that concerted evolution has been very weak, especially after the divergence of the four diploxylon pines. The phylogenetic information contained in the 5S rRNA gene is limited by its short length, and the difficulties in identifying orthologous and paralogous copies of the rDNA multigene family further complicate its phylogenetic application.
Pinus densata is a

  16. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip;

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, ...

  17. Proteolipid protein 1 gene sequencing of hereditary spastic paraplegia

    Institute of Scientific and Technical Information of China (English)

    Yu Gao; Lumei Chi; Yinshi Jin; Guangxian Nan

    2012-01-01

    PCR amplification and sequencing of whole blood DNA from an individual with hereditary spastic paraplegia, as well as family members, revealed a fragment of proteolipid protein 1 (PLP1) gene exon 1, which excluded the possibility of isomer 1 expression for this family. The fragment sequence of exon 3 and exon 5 was consistent with the proteolipid protein 1 sequence at NCBI. In the proband samples, a PLP1 point mutation in exon 4 was detected at the base at position 844 (T→C, phenylalanine→leucine). In proband samples from a male cousin, the base at position 844 was C, but gene sequencing signals revealed mixed signals of T and C, indicating possible mutation at this locus. Results demonstrated that changes in PLP1 exon 4 amino acids were associated with onset of hereditary spastic paraplegia.

  18. Variation in Symbiodinium ITS2 Sequence Assemblages among Coral Colonies

    Science.gov (United States)

    Stat, Michael; Bird, Christopher E.; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J.; Concepcion, Gregory T.; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J.; Gates, Ruth D.

    2011-01-01

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping. PMID:21246044

  19. Variation in Symbiodinium ITS2 sequence assemblages among coral colonies.

    Directory of Open Access Journals (Sweden)

    Michael Stat

    Full Text Available Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping.

  20. Variation in Symbiodinium ITS2 sequence assemblages among coral colonies.

    Science.gov (United States)

    Stat, Michael; Bird, Christopher E; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J; Concepcion, Gregory T; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J; Gates, Ruth D

    2011-01-05

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping.

  1. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Full Text Available Abstract Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.

  2. A human gut microbial gene catalogue established by metagenomic sequencing

    DEFF Research Database (Denmark)

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn;

    2010-01-01

    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...... gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively....

  3. Propagation of genetic variation in gene regulatory networks.

    Science.gov (United States)

    Plahte, Erik; Gjuvsland, Arne B; Omholt, Stig W

    2013-08-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network's feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

  4. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri Michaeli

    2012-12-01

    Full Text Available High throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  5. Streptococcus mutans Clonal Variation Revealed by Multilocus Sequence Typing

    OpenAIRE

    Nakano, Kazuhiko; Lapirattanakul, Jinthana; Nomura, Ryota; Nemoto, Hirotoshi; Alaluusua, Satu; Grönroos, Lisa; Vaara, Martti; Hamada, Shigeyuki; Ooshima, Takashi; Nakagawa, Ichiro

    2007-01-01

    Streptococcus mutans is the major pathogen of dental caries, a biofilm-dependent infectious disease, and occasionally causes infective endocarditis. S. mutans strains have been classified into four serotypes (c, e, f, and k). However, little is known about the S. mutans population, including the clonal relationships among strains of S. mutans, in relation to the particular clones that cause systemic diseases. To address this issue, we have developed a multilocus sequence typing (MLST) scheme ...

  6. A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data

    KAUST Repository

    Sepúlveda, Nuno

    2013-02-26

    Background: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model. Results: Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates. Conclusions: In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data.
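
    The overdispersion the abstract describes can be made concrete with a small sketch. It is not the authors' hierarchical-model fitting procedure: it swaps in a simple method-of-moments estimate, and the function names and toy coverage values are assumptions. For a Poisson-Gamma mixture with Gamma shape a and scale θ, the mean is aθ and the variance is aθ + aθ², so a variance-to-mean ratio above 1 signals overdispersion relative to a plain Poisson model.

```python
from statistics import mean, pvariance

def dispersion_index(coverage):
    """Variance-to-mean ratio; about 1 for Poisson data, above 1 if overdispersed."""
    return pvariance(coverage) / mean(coverage)

def fit_poisson_gamma(coverage):
    """Method-of-moments (shape, scale) of the Gamma mixing distribution.

    Uses mean = shape * scale and variance = mean + shape * scale**2.
    This is an illustrative shortcut, not a full hierarchical-model fit.
    """
    m = mean(coverage)
    v = pvariance(coverage)
    if v <= m:
        raise ValueError("no overdispersion: variance does not exceed the mean")
    scale = (v - m) / m
    shape = m / scale
    return shape, scale

# Toy per-window read counts (hypothetical values).
toy_coverage = [0, 10]
toy_index = dispersion_index(toy_coverage)
toy_shape, toy_scale = fit_poisson_gamma(toy_coverage)
```

    With the fitted parameters, the counts follow a negative binomial distribution, whose heavier tails make extreme coverage values less surprising than they would be under a Poisson model with the same mean.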

  7. Novel rare variations of the oxytocin receptor (OXTR) gene in autism spectrum disorder individuals.

    Science.gov (United States)

    Liu, Xiaoxi; Kawashima, Minae; Miyagawa, Taku; Otowa, Takeshi; Latt, Khun Zaw; Thiri, Myo; Nishida, Hisami; Sugiyama, Toshiro; Tsurusaki, Yoshinori; Matsumoto, Naomichi; Mabuchi, Akihiko; Tokunaga, Katsushi; Sasaki, Tsukasa

    2015-01-01

    The oxytocin receptor (OXTR) gene has been implicated as a risk gene for autism spectrum disorder (ASD), a neurodevelopmental disorder with essential features of impairments in social communication and reciprocal interaction. The genetic associations between common variations in OXTR and ASD have been reported in multiple ethnic populations. However, little is known about the distribution of rare variations within OXTR in ASD patients. In this study, we resequenced the full length of OXTR in 105 ASD individuals using an approach that combined the power of next-generation sequencing technology, long-range PCR and DNA pooling. We demonstrated that rare variants with minor allele frequency as low as 0.05% could be reliably detected by our method. We identified 28 novel variants including potential functional variants in the intron region and one rare missense variant (R150S). We subsequently performed Sanger sequencing and validated five novel variants located in previously suggested candidate regions in ASD individuals. Further sequencing of 312 healthy subjects showed that the burden of rare variants is significantly higher in ASD individuals than in healthy individuals. Our results suggest that rare variation in the OXTR gene might be involved in ASD.

  8. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    Science.gov (United States)

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  9. Selection of genes associated with variations in the Circle of Willis in gerbils using suppression subtractive hybridization.

    Science.gov (United States)

    Li, Zhenkun; Huo, Xueyun; Zhang, Shuangyue; Lu, Jing; Li, Changlong; Guo, Meng; Fu, Rui; He, Zhengming; Du, Xiaoyan; Chen, Zhenwen

    2015-01-01

    Deformities in the Circle of Willis (CoW) can significantly increase the risk of cerebrovascular disease in humans. However, the molecular mechanisms underlying these deformities have not been understood. Based on our previous studies, variations in the CoW of gerbils are hereditary. A normal CoW is observed in approximately 60% of gerbils, a percentage that also applies to humans. Thus, gerbil is an ideal experimental model for studying variations in the CoW. To study the mechanisms underlying these variations, we selected genes associated with different types of the CoW using suppression subtractive hybridization (SSH). After evaluating the efficiency of SSH using quantitative real-time polymerase chain reaction (qPCR) on subtracted and unsubtracted cDNA and Southern blotting on SSH PCR products, 12 SSH libraries were established. We identified 4 genes (CST3, GNAS, GPx4 and PFN2) associated with variations in the CoW. These genes were identified with qPCR and Western blotting using 70 expressed sequence tags from the SSH libraries. Cloning and sequencing allowed us to demonstrate that the 4 genes were closely related to mouse genes. We may assume that these 4 genes play an important role in the development of variations in the CoW. This study provides a foundation for further research of genes related to development of variations in the CoW and the mechanisms of dysmorphosis of cerebral vessels.

  10. Selection of genes associated with variations in the Circle of Willis in gerbils using suppression subtractive hybridization.

    Directory of Open Access Journals (Sweden)

    Zhenkun Li

    Full Text Available Deformities in the Circle of Willis (CoW) can significantly increase the risk of cerebrovascular disease in humans. However, the molecular mechanisms underlying these deformities have not been understood. Based on our previous studies, variations in the CoW of gerbils are hereditary. A normal CoW is observed in approximately 60% of gerbils, a percentage that also applies to humans. Thus, gerbil is an ideal experimental model for studying variations in the CoW. To study the mechanisms underlying these variations, we selected genes associated with different types of the CoW using suppression subtractive hybridization (SSH). After evaluating the efficiency of SSH using quantitative real-time polymerase chain reaction (qPCR) on subtracted and unsubtracted cDNA and Southern blotting on SSH PCR products, 12 SSH libraries were established. We identified 4 genes (CST3, GNAS, GPx4 and PFN2) associated with variations in the CoW. These genes were identified with qPCR and Western blotting using 70 expressed sequence tags from the SSH libraries. Cloning and sequencing allowed us to demonstrate that the 4 genes were closely related to mouse genes. We may assume that these 4 genes play an important role in the development of variations in the CoW. This study provides a foundation for further research of genes related to development of variations in the CoW and the mechanisms of dysmorphosis of cerebral vessels.

  11. Sequence Analysis of the ank Gene of Granulocytic Ehrlichiae

    OpenAIRE

    2000-01-01

    The ank gene of the agent of human granulocytic ehrlichiosis (HGE) codes for a protein with a predicted molecular size of 131.2 kDa that is recognized by serum from both dogs and humans infected with granulocytic ehrlichiae. As part of an effort to assess the phylogenetic relatedness of granulocytic ehrlichiae from different geographic regions and in different host species, the ank gene was PCR amplified and sequenced from a variety of sources. These included 10 blood specimens from patients ...

  12. Geographic variation in advertisement calls in a tree frog species: gene flow and selection hypotheses.

    Directory of Open Access Journals (Sweden)

    Yikweon Jang

    Full Text Available BACKGROUND: In a species with a large distribution relative to its dispersal capacity, geographic variation in traits may be explained by gene flow, selection, or the combined effects of both. Studies of genetic diversity using neutral molecular markers show that patterns of isolation by distance (IBD) or barrier effect may be evident for geographic variation at the molecular level in amphibian species. However, selective factors such as habitat, predator, or interspecific interactions may be critical for geographic variation in sexual traits. We studied geographic variation in advertisement calls in the tree frog Hyla japonica to understand patterns of variation in these traits across Korea and provide clues about the underlying forces for variation. METHODOLOGY: We recorded calls of H. japonica in three breeding seasons from 17 localities including localities in remote Jeju Island. Call characters analyzed were note repetition rate (NRR), note duration (ND), and dominant frequency (DF), along with snout-to-vent length. RESULTS: The findings of a barrier effect on DF and a longitudinal variation in NRR seemed to suggest that an open sea between the mainland and Jeju Island and mountain ranges dominated by the north-south Taebaek Mountains were related to geographic variation in call characters. Furthermore, there was a pattern of IBD in mitochondrial DNA sequences. However, no comparable pattern of IBD was found between geographic distance and call characters. We also failed to detect any effects of habitat or interspecific interaction on call characters. CONCLUSIONS: Geographic variations in call characters as well as mitochondrial DNA sequences were largely stratified by geographic factors such as distance and barriers in Korean populations of H. japonica. 
Although we did not detect effects of habitat or interspecific interaction, some other selective factors such as sexual selection might still be operating on call characters in conjunction with

  13. Geographic Variation in Advertisement Calls in a Tree Frog Species: Gene Flow and Selection Hypotheses

    Science.gov (United States)

    Jang, Yikweon; Hahm, Eun Hye; Lee, Hyun-Jung; Park, Soyeon; Won, Yong-Jin; Choe, Jae C.

    2011-01-01

    Background In a species with a large distribution relative to its dispersal capacity, geographic variation in traits may be explained by gene flow, selection, or the combined effects of both. Studies of genetic diversity using neutral molecular markers show that patterns of isolation by distance (IBD) or barrier effect may be evident for geographic variation at the molecular level in amphibian species. However, selective factors such as habitat, predator, or interspecific interactions may be critical for geographic variation in sexual traits. We studied geographic variation in advertisement calls in the tree frog Hyla japonica to understand patterns of variation in these traits across Korea and provide clues about the underlying forces for variation. Methodology We recorded calls of H. japonica in three breeding seasons from 17 localities including localities in remote Jeju Island. Call characters analyzed were note repetition rate (NRR), note duration (ND), and dominant frequency (DF), along with snout-to-vent length. Results The findings of a barrier effect on DF and a longitudinal variation in NRR seemed to suggest that an open sea between the mainland and Jeju Island and mountain ranges dominated by the north-south Taebaek Mountains were related to geographic variation in call characters. Furthermore, there was a pattern of IBD in mitochondrial DNA sequences. However, no comparable pattern of IBD was found between geographic distance and call characters. We also failed to detect any effects of habitat or interspecific interaction on call characters. Conclusions Geographic variations in call characters as well as mitochondrial DNA sequences were largely stratified by geographic factors such as distance and barriers in Korean populations of H. japonica. Although we did not detect effects of habitat or interspecific interaction, some other selective factors such as sexual selection might still be operating on call characters in conjunction with restricted gene

  14. The expected variation of random bounded integer sequences of finite length

    OpenAIRE

    Rudolfo Angeles; Don Rawlings; Lawrence Sze; Mark Tiefenbruck

    2005-01-01

    From the enumerative generating function of an abstract adjacency statistic, we deduce the mean and variance of the variation on random permutations, rearrangements, compositions, and bounded integer sequences of finite length.
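    As a small illustration of the quantity in this abstract, the sketch below assumes that the "variation" of a sequence is the sum of absolute differences between adjacent entries, and that each entry is drawn uniformly from {1, ..., n}. Both are assumptions, since the abstract defines neither. The function names are hypothetical. The closed form comes from linearity of expectation, not from the generating-function method the authors describe.

```python
from fractions import Fraction
from itertools import product


def variation(seq):
    """Sum of absolute differences between adjacent entries (assumed definition)."""
    return sum(abs(a - b) for a, b in zip(seq, seq[1:]))


def mean_variation_exact(n, m):
    """Exact mean variation over all n**m sequences of length m on {1..n}."""
    total = sum(variation(s) for s in product(range(1, n + 1), repeat=m))
    return Fraction(total, n ** m)


def mean_variation_closed_form(n, m):
    """(m - 1) * E|X - Y| with X, Y independent and uniform on {1..n}.

    For that distribution E|X - Y| = (n**2 - 1) / (3 * n).
    """
    return Fraction((m - 1) * (n * n - 1), 3 * n)


if __name__ == "__main__":
    # Brute-force enumeration and the closed form should agree.
    for n, m in [(3, 2), (4, 3), (5, 4)]:
        print(n, m, mean_variation_exact(n, m), mean_variation_closed_form(n, m))
```

    Brute-force enumeration is only practical for small n and m; it serves here as a check on the closed form.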

  15. The expected variation of random bounded integer sequences of finite length

    Directory of Open Access Journals (Sweden)

    Rudolfo Angeles

    2005-09-01

    From the enumerative generating function of an abstract adjacency statistic, we deduce the mean and variance of the variation on random permutations, rearrangements, compositions, and bounded integer sequences of finite length.

  16. Sequence and gene expression evolution of paralogous genes in willows.

    Science.gov (United States)

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-12-22

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  17. Natural variation for gene expression responses to abiotic stress in maize.

    Science.gov (United States)

    Waters, Amanda J; Makarevitch, Irina; Noshay, Jaclyn; Burghardt, Liana T; Hirsch, Candice N; Hirsch, Cory D; Springer, Nathan M

    2017-02-01

    Plants respond to abiotic stress through a variety of physiological, biochemical, and transcriptional mechanisms. Many genes exhibit altered levels of expression in response to abiotic stress, which requires concerted action of both cis- and trans-regulatory features. In order to study the variability in transcriptome response to abiotic stress, RNA sequencing was performed using 14-day-old maize seedlings of inbreds B73, Mo17, Oh43, PH207 and B37 under control, cold and heat conditions. Large numbers of genes that responded differentially to stress between parental inbred lines were identified. RNA sequencing was also performed on similar tissues of the F1 hybrids produced by crossing B73 and each of the three other inbred lines. By evaluating allele-specific transcript abundance in the F1 hybrids, we were able to measure the abundance of cis- and trans-regulatory variation between genotypes for both steady-state and stress-responsive expression differences. Although examples of trans-regulatory variation were observed, cis-regulatory variation was more common for both steady-state and stress-responsive expression differences. The genes with cis-allelic variation for response to cold or heat stress provided an opportunity to study the basis for regulatory diversity.
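    The logic of separating cis- from trans-regulatory variation with F1 allele-specific expression can be sketched as follows. This is a commonly used decomposition, not necessarily the authors' exact procedure. The function name and the read counts are hypothetical, and the sketch ignores normalization and statistical testing. It assumes that in the F1 hybrid both alleles share the same trans environment, so the allelic log-ratio reflects cis effects, and whatever remains of the parental difference is attributed to trans effects.

```python
import math


def cis_trans_components(parent_a, parent_b, f1_allele_a, f1_allele_b):
    """Split a parental expression difference into cis and trans parts.

    parent_a, parent_b: expression of the gene in the two inbred parents.
    f1_allele_a, f1_allele_b: allele-specific expression in their F1 hybrid.
    All inputs are assumed to be positive, already-normalized abundances.
    Returns (total, cis, trans) as log2 ratios.
    """
    total = math.log2(parent_a / parent_b)        # parental difference
    cis = math.log2(f1_allele_a / f1_allele_b)    # shared trans environment
    trans = total - cis                           # remainder
    return total, cis, trans


if __name__ == "__main__":
    # Hypothetical counts: the 4-fold parental difference persists in the F1,
    # so the difference is attributed entirely to cis variation.
    print(cis_trans_components(400, 100, 200, 50))
    # Hypothetical counts: alleles are balanced in the F1, so the parental
    # difference is attributed entirely to trans variation.
    print(cis_trans_components(400, 100, 100, 100))
```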

  18. PPARG: Gene Expression Regulation and Next-Generation Sequencing for Unsolved Issues

    Directory of Open Access Journals (Sweden)

    Valerio Costa

    2010-01-01

    Peroxisome proliferator-activated receptor gamma (PPARγ) is one of the most extensively studied ligand-inducible transcription factors (TFs), able to modulate its transcriptional activity through conformational changes. It is of particular interest because of its pleiotropic functions: it plays a crucial role in the expression of key genes involved in adipogenesis, lipid and glucid metabolism, atherosclerosis, inflammation, and cancer. Its protein isoforms, the wide number of PPARγ target genes, ligands, and coregulators contribute to determine the complexity of its function. In addition, the presence of genetic variants is likely to affect expression levels of target genes, although the impact of PPARG gene variations on the expression of target genes is not fully understood. The introduction of massively parallel sequencing platforms in the Next Generation Sequencing (NGS) era has revolutionized the way of investigating the genetic causes of inherited diseases. In this context, DNA-Seq for identifying novel nucleotide variations and haplotypes associated with human diseases within both coding and regulatory regions of the PPARG gene, ChIP-Seq for defining a PPARγ binding map, and RNA-Seq for unraveling the wide and intricate gene pathways regulated by PPARG represent incredible steps toward the understanding of PPARγ in health and disease.

  19. Nucleotide sequence analysis of hypervariable junctions of Haemophilus influenzae pilus gene clusters.

    Science.gov (United States)

    Read, T D; Satola, S W; Farley, M M

    2000-12-01

    Haemophilus influenzae pili are surface structures that promote attachment to human epithelial cells. The five genes that encode pili, hifABCDE, are found inserted in genomes either between pmbA and hpt (hif-1) or between purE and pepN (hif-2). We determined the sequence between the ends of the pilus clusters and bordering genes in a number of H. influenzae strains. The junctions of the hif-1 cluster (limited to biogroup aegyptius isolates) are structurally simple. In contrast, hif-2 junctions are highly diverse, complex assemblies of conserved intergenic sequences (including genes hicA and hicB) with evidence of frequent recombination. Variation at hif-2 junctions seems to be tied to multiple copies of a 23-bp Haemophilus intergenic dyad sequence. The hif-1 cluster appears to have originated in biogroup aegyptius strains from invasion of the hpt-pmbA region by a DNA template containing the hif-2 genes with termini in the hairpin loop of flanking intergenic dyad sequences. The pilus gene clusters are an interesting model of a mobile "pathogenicity island" not associated with a phage, transposon, or insertion element.

  20. Exploiting natural variation to identify insect-resistance genes.

    Science.gov (United States)

    Broekgaarden, Colette; Snoeren, Tjeerd A L; Dicke, Marcel; Vosman, Ben

    2011-10-01

    Herbivorous insects are widespread and often serious constraints to crop production. The use of insect-resistant crops is a very effective way to control insect pests in agriculture, and the development of such crops can be greatly enhanced by knowledge on plant resistance mechanisms and the genes involved. Plants have evolved diverse ways to cope with insect attack that has resulted in natural variation for resistance towards herbivorous insects. Studying the molecular genetics and transcriptional background of this variation has facilitated the identification of resistance genes and processes that lead to resistance against insects. With the development of new technologies, molecular studies are not restricted to model plants anymore. This review addresses the need to exploit natural variation in resistance towards insects to increase our knowledge on resistance mechanisms and the genes involved. We will discuss how this knowledge can be exploited in breeding programmes to provide sustainable crop protection against insect pests. Additionally, we discuss the current status of genetic research on insect-resistance genes. We conclude that insect-resistance mechanisms are still unclear at the molecular level and that exploiting natural variation with novel technologies will contribute greatly to the development of insect-resistant crop varieties.

  1. Gene Identification and Expression Analysis of 86,136 Expressed Sequence Tags (EST) from the Rice Genome

    Institute of Scientific and Technical Information of China (English)

    Yan Zhou; Lin Ye; Li Lin; Jun Li; Xuegang Wang; Hao Xu; Yibin Pan; Wei Lin; Wei Tian; Jing Liu; Liping Wei; Jiabin Tang; Siqi Liu; Huanming Yang; Jun Yu; Jian Wang; Michael G. Walker; Xiuqing Zhang; Jun Wang; Songnian Hu; Huayong Xu; Yajun Deng; Jianhai Dong

    2003-01-01

    Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 are novel. These sequences are classified by molecular function, biological process and pathways according to the Gene Ontology. We compared our sequenced ESTs with the publicly available 95,000 ESTs from japonica, and found little sequence variation, despite the large difference between genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolism pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression patterns in different tissues, developmental stages, and in a conditional sterile mutant, after checking that the libraries are comparable by means of sequence coverage. We also identified some possible library-specific genes and a number of enzymes and transcription factors that contribute to rice development.

  2. Mitochondrial tRNALeu/Lys and ATPase 6/8 gene variations in spinocerebellar ataxias.

    Science.gov (United States)

    Safaei, Sepideh; Houshmand, Massoud; Banoei, Mohammad Mehdi; Panahi, Mehdi Shafa Shariat; Nafisi, Shahriar; Parivar, Kazem; Rostami, Maryam; Shariati, Parvin

    2009-01-01

    The spinocerebellar ataxias (SCA) comprise a heterogeneous group of severe late-onset neurodegenerative diseases that are promoted by the expansion of a tandem-arrayed DNA sequence that modifies the primary structure of the protein. Genomic DNA of 20 patients affected with SCAs was extracted from peripheral blood and screened for deletions in mitochondrial DNA (mtDNA). Sequencing of tRNA(Leu), tRNA(Lys), cytochrome oxidase II, ATPase 6/8 and NADH dehydrogenase I (NDI) genes belonging to mtDNA from patients with SCAs was also carried out to detect the presence of variations. We identified cytosine-adenine-guanine (CAG) trinucleotide repeat expansions in 20 patients. Seven of these patients had at least one nucleotide change in mtDNA. In such cases, 5 nucleotide variations resulted in amino acid changes with two novel variations T8256G and G9010A. SCA patients showed high levels of mtDNA variations in lymphocytes. It can be proposed that the SCA gene proteins (Ataxins) are involved in the complicated intracellular mechanisms that affect cellular organelles and their components, such as the mitochondrial genome. The instability of CAG repeats in polyglutamine diseases such as SCAs and Huntington's disease might be a causative factor in mtDNA variation or possible damage. Copyright 2008 S. Karger AG, Basel.

  3. Detection, Validation, and Downstream Analysis of Allelic Variation in Gene Expression

    Science.gov (United States)

    Ciobanu, Daniel C.; Lu, Lu; Mozhui, Khyobeni; Wang, Xusheng; Jagalur, Manjunatha; Morris, John A.; Taylor, William L.; Dietz, Klaus; Simon, Perikles; Williams, Robert W.

    2010-01-01

    Common sequence variants within a gene often generate important differences in expression of corresponding mRNAs. This high level of local (allelic) control—or cis modulation—rivals that produced by gene targeting, but expression is titrated finely over a range of levels. We are interested in exploiting this allelic variation to study gene function and downstream consequences of differences in expression dosage. We have used several bioinformatics and molecular approaches to estimate error rates in the discovery of cis modulation and to analyze some of the biological and technical confounds that contribute to the variation in gene expression profiling. Our analysis of SNPs and alternative transcripts, combined with eQTL maps and selective gene resequencing, revealed that between 17 and 25% of apparent cis modulation is caused by SNPs that overlap probes rather than by genuine quantitative differences in mRNA levels. This estimate climbs to 40–50% when qualitative differences between isoform variants are included. We have developed an analytical approach to filter differences in expression and improve the yield of genuine cis-modulated transcripts to ∼80%. This improvement is important because the resulting variation can be successfully used to study downstream consequences of altered expression on higher-order phenotypes. Using a systems genetics approach we show that two validated cis-modulated genes, Stk25 and Rasd2, are likely to control expression of downstream targets and affect disease susceptibility. PMID:19884314

  4. Nucleotide Sequence of the Protective Antigen Gene of Bacillus Anthracis

    Science.gov (United States)

    1988-02-02

    Montie, S. Kadis, and S. I. Ajl (ed.), Microbial toxins, vol. 3. Academic Press, Inc., New York. 23. Little, S. F., and G. B. Knudson. 1986...Takkinen, and L. Kaariainen. 1981. Nucleotide sequence of the promoter and NH2-terminal signal peptide region of the α-amylase gene from Bacillus

  5. Copy number variations exploration of multiple genes in Graves' disease.

    Science.gov (United States)

    Song, Rong-Hua; Shao, Xiao-Qing; Li, Ling; Wang, Wen; Zhang, Jin-An

    2017-01-01

    Few published studies have reported that copy number variations of genes affect predisposition to Graves' disease (GD). The aim of this study was to explore the association between the copy number variation (CNV) profile and GD. A preliminary copy number microarray was performed in 6 GD patients to screen for copy number variant genes. Five CNV candidate genes (CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17) were then validated in an independent set of samples (50 GD patients and 50 matched healthy controls) by the Accucopy assay method. The CNV of 2 other genes, TRY6 and CCL3L1, was investigated in 144 GD patients and 144 healthy volunteers by the definitive genotyping technique using Taqman quantitative polymerase chain reaction (Taqman qPCR). The TRY6 gene-associated single nucleotide polymorphism (SNP) rs13230029 was genotyped by the PCR-ligase detection reaction (LDR) in 675 GD patients and 898 healthy controls. There was no correlation of the gene copy number (GCN) of CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17 with GD. Compared with controls, the GCN distribution of TRY6 and CCL3L1 in GD patients did not differ significantly (P > 0.05). Furthermore, the TRY6-related polymorphism (rs13230029) showed no difference between GD patients and controls. No correlation was found between CNV or SNP genotype and clinical phenotypes. Overall, our results indicate that the copy number variations of multiple genes, namely CFH, CFHR1, KIAA0125, UGT2B15, UGT2B17, TRY6, and CCL3L1, were not associated with the development of GD.

  6. Nucleotide Base Variation of Blast Disease Resistance Gene Pi33 in Rice Selected Broad Genetic Background

    OpenAIRE

    DWINITA WIKAN UTAMI; KALIA BARNITA; SITI YURIAH; IDA HANARIDA

    2011-01-01

    Rice is one of the most important crops for human beings, so increases in productivity are continually pursued. Blast disease can reduce the productivity of rice cultivation. Therefore, breeding programs for blast disease-resistant varieties need to be carried out effectively. One broad-spectrum blast disease-resistance gene is Pi33. This study aimed to identify variation in the nucleotide base sequence of the Pi33 gene in five interspecific lines derived from Bio46 (IR64/Oryza rufip...

  7. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    Directory of Open Access Journals (Sweden)

    Dubchak Inna

    2005-12-01

    Background Evolutionarily conserved sequences likely have biological function. Methods To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results Among six conserved non-coding elements (>100 bp, >70% identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P IL4 gene (P IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P Conclusion These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.

  8. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Science.gov (United States)

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.

  9. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    NARCIS (Netherlands)

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F. A.; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Foersti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernandez-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellvi-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P. M.; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions are known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRCs cases and 29,0

  11. Recurrent Coding Sequence Variation Explains only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    NARCIS (Netherlands)

    M.N. Timofeeva (Maria N.); B. Kinnersley (Ben); S.M. Farrington (Susan M.); N. Whiffin (Nicola); C. Palles (Claire); V. Svinti (Victoria); A. Lloyd (Amy); M. Gorman (Maggie); L.-Y. Ooi (Li-Yin); F. Hosking (Fay); E. Barclay (Ella); L. Zgaga (Lina); S.E. Dobbins (Sara E.); L. Martin (Lynn); E. Theodoratou (Evropi); P. Broderick (Peter); A. Tenesa (Albert); C. Smillie (Claire); G. Grimes (Graeme); C. Hayward (Caroline); A. Campbell (Archie); D. Porteous (David); I.J. Deary (Ian J.); S.E. Harris (Sarah); J.B. Northwood (John Blackman); J.H. Barrett (Jennifer H.); G. Smith (Gillian); R. Wolf (Roland); D. Forman (David); H. Morreau (Hans); D. Ruano (Dina); C. Tops (Carli); J.T. Wijnen (Juul); M. Schrumpf (Melanie); A. Boot (Arnoud); H. Vasen (Hans); F.J. Hes (Frederik); T. van Wezel (Tom); A. Franke (Andre); W. Lieb (Wolgang); C. Schafmayer (Clemens); J. Hampe (Jochen); T. Buch (Thorsten); P. Propping (Peter); K. Hemminki (Kari); A. Försti (Asta); H. Westers (Helga); R.M.W. Hofstra (Robert); M. Pinheiro (Manuela); C. Pinto (Carla); P.J. Teixeira; C. Ruiz-Ponte (Clara); C. Fernández-Rozadilla (Ceres); A. Carracedo (Angel); A. Castells; S. Castellví-Bel; H. Campbell (Harry); D.T. Bishop (David Timothy); I. Tomlinson (Ian); M.G. Dunlop (Malcolm); R. Houlston (Richard)

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions are known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRCs ca

  14. Quantitative modeling of a gene's expression from its intergenic sequence.

    Directory of Open Access Journals (Sweden)

    Md Abul Hassan Samee

    2014-03-01

    Modeling a gene's expression from its intergenic locus and trans-regulatory context is a fundamental goal in computational biology. Owing to the distributed nature of cis-regulatory information and the poorly understood mechanisms that integrate such information, gene locus modeling is a more challenging task than modeling individual enhancers. Here we report the first quantitative model of a gene's expression pattern as a function of its locus. We model the expression readout of a locus in two tiers: (1) combinatorial regulation by transcription factors bound to each enhancer is predicted by a thermodynamics-based model and (2) independent contributions from multiple enhancers are linearly combined to fit the gene expression pattern. The model does not require any prior knowledge about enhancers contributing toward a gene's expression. We demonstrate that the model captures the complex multi-domain expression patterns of anterior-posterior patterning genes in the early Drosophila embryo. Altogether, we model the expression patterns of 27 genes; these include several gap genes, pair-rule genes, and anterior, posterior, trunk, and terminal genes. We find that the model-selected enhancers for each gene overlap strongly with its experimentally characterized enhancers. Our findings also suggest the presence of sequence-segments in the locus that would contribute ectopic expression patterns and hence were "shut down" by the model. We applied our model to identify the transcription factors responsible for forming the stripe boundaries of the studied genes. The resulting network of regulatory interactions exhibits a high level of agreement with known regulatory influences on the target genes. Finally, we analyzed whether and why our assumption of enhancer independence was necessary for the genes we studied. We found a deterioration of expression when binding sites in one enhancer were allowed to influence the readout of another enhancer. Thus, interference

  15. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Van Dijk, Freerk; Palamara, Pier Francesco; Elbers, Clara C.; Neerincx, Pieter B. T.; Ye, Kai; Guryev, Victor; Kloosterman, Wigard P.; Deelen, Patrick; Abdellaoui, Abdel; Van Leeuwen, Elisabeth M.; Van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F. J.; Karssen, Lennart C.; Kanterakis, Alexandros; Amin, Najaf; Hottenga, Jouke Jan; Lameijer, Eric-Wubbo; Kattenberg, Mathijs; Dijkstra, Martijn; Byelas, Heorhiy; Van Setten, Jessica; Van Schaik, Barbera D. C.; Bot, Jan; Nijman, Isaac J.; Renkens, Ivo; Marschall, Tobias; Schonhuth, Alexander; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Polak, Paz; Sohail, Mashaal; Vuzman, Dana; Hormozdiari, Fereydoun; Van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Moed, Matthijs H.; Van der Velde, K. Joeri; Rivadeneira, Fernando; Estrada, Karol; Medina-Gomez, Carolina; Isaacs, Aaron; McCarroll, Steven A.; Beekman, Marian; De Craen, Anton J. M.; Suchiman, H. Eka D.; Hofman, Albert; Oostra, Ben; Uitterlinden, Andre G.; Willemsen, Gonneke; Platteel, Mathieu; Veldink, Jan H.; Van den Berg, Leonard H.; Pitts, Steven J.; Potluri, Shobha; Sundar, Purnima; Cox, David R.; Sunyaev, Shamil R.; Den Dunnen, Johan T.; Stoneking, Mark; De Knijff, Peter; Kayser, Manfred; Li, Qibin; Li, Yingrui; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Li, Ning; Cao, Sujie; Wang, Jun; Bovenberg, Jasper A.; Peer, Itsik; Slagboom, P. Eline; Van Duijn, Cornelia M.; Boomsma, Dorret I.; Van Ommen, Gert-Jan B.; De Bakker, Paul I. W.; Swertz, Morris A.; Wijmenga, Cisca

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  16. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

    NARCIS (Netherlands)

    Aflitos, S.; Schijlen, E.; de Jong, H.; de Ridder, D.; Smit, S.; Finkers, R.; Wang, J.; Zhang, G.; Li, N.; Mao, L.; Bakker, F.; Dirks, R.; Breit, T.; Gravendeel, B.; Huits, H.; Struss, D.; Swanson-Wagner, R.; van Leeuwen, H.; van Ham, R.C.H.J.; Fito, L.; Guignier, L.; Sevilla, M.; Ellul, P.; Ganko, E.; Kapur, A.; Reclus, E.; de Geus, B.; van de Geest, H.; te Lintel Hekkert, B.; van Haarst, J.; Smits, L.; Koops, A.; Sanchez-Perez, G.; van Heusden, A.W.; Visser, R.; Quan, Z.; Min, J.; Liao, L.; Wang, X.; Wang, G.; Yue, Z.; Yang, X.; Xu, N.; Schranz, E.; Smets, E.; Vos, R.; Rauwerda, J.; Ursem, R.; Schuit, C.; Kerns, M.; van den Berg, J.; Vriezen, W.; Janssen, A.; Datema, E.; Jahrman, T.; Moquet, F.; Bonnet, J.; Peters, S.

    2014-01-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new referen

  17. Unique Features of Germline Variation in Five Egyptian Familial Breast Cancer Families Revealed by Exome Sequencing

    Science.gov (United States)

    Kim, Yeong C.; Soliman, Amr S.; Cui, Jian; Ramadan, Mohamed; Hablas, Ahmed; Abouelhoda, Mohamed; Hussien, Nehal; Ahmed, Ola; Zekri, Abdel-Rahman Nabawy; Seifeldin, Ibrahim A.

    2017-01-01

    Genetic predisposition increases the risk of familial breast cancer. Recent studies indicate that genetic predisposition for familial breast cancer can be ethnic-specific. However, current knowledge of genetic predisposition for the disease is predominantly derived from Western populations. Using this existing information as the sole reference to judge the predisposition in non-Western populations is not adequate and can potentially lead to misdiagnosis. Efforts are required to collect genetic predisposition data from non-Western populations. The Egyptian population has high genetic variation, reflecting its divergent ethnic origins, and the incidence rate of familial breast cancer in Egypt is also higher than the rate in many other populations. Using whole exome sequencing, we investigated genetic predisposition in five Egyptian familial breast cancer families. No pathogenic variants in BRCA1, BRCA2 and other classical breast cancer-predisposition genes were present in these five families. Comparison of the genetic variants with those in Caucasian familial breast cancer showed that variants in the Egyptian families were more variable and heterogeneous than the variants in Caucasian families. Multiple damaging variants in genes of different functional categories were identified either in a single family or shared between families. Our study demonstrates that genetic predisposition in Egyptian breast cancer families may differ from that in other disease populations, and supports a comprehensive screening of local disease families to determine the genetic predisposition in Egyptian familial breast cancer. PMID:28076423

  18. Cloning and sequencing of a Moraxella bovis pilin gene.

    Science.gov (United States)

    Marrs, C F; Schoolnik, G; Koomey, J M; Hardy, J; Rothbard, J; Falkow, S

    1985-07-01

    Moraxella bovis pili have been shown to play a major role in both infectivity and protective immunity of bovine infectious keratoconjunctivitis. Sonicated M. bovis DNA from the piliated strain EPP63 was inserted into the vector lambda gt11 with EcoRI linkers. Recombinant phage were screened with an oligonucleotide probe based on the amino-terminal portion of the DNA sequence of a Neisseria gonorrhoeae pilin gene. Two candidate phages produced a protein that comigrated with EPP63 beta pilin in sodium dodecyl sulfate-polyacrylamide gels and bound anti-pilus antisera. The 1.9-kilobase insert from one of these, lambda gt11M182, was subcloned in both orientations into pBR322, forming the plasmids pMxB7 and pMxB9, both of which produced beta pilin, as did pMxB12, a HindIII deletion derivative of pMxB7. In HB101(pMxB12), the M. bovis pilin protein was shown to be primarily localized in the inner membrane. The entire 939-base-pair insert of pMxB12 was sequenced, revealing a ribosome binding site just upstream of the coding region and an AT-rich region further upstream containing some potential RNA polymerase recognition sites. The translation of the sequence predicts a six-amino-acid leader sequence preceding the phenylalanine that begins the mature protein. Codon usage analysis of the M. bovis beta pilin gene revealed greater use of the CUA codon for leucine than usual for a well-expressed Escherichia coli gene. Comparisons of the M. bovis EPP63 beta pilin protein sequence with other pilin gene sequences are presented.
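
    The codon usage comparison mentioned at the end of this abstract can be illustrated with a short sketch. It is not the authors' procedure: the function name and the toy coding sequence are hypothetical, and the code simply reports what fraction of all leucine codons in a reading frame each of the six synonymous codons accounts for.

```python
from collections import Counter

# The six leucine codons, written in DNA letters (CUA in RNA corresponds to CTA).
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def leucine_codon_usage(cds):
    """Return the fraction of leucine codons contributed by each synonymous codon.

    `cds` is assumed to be an in-frame coding sequence (DNA or RNA letters).
    Trailing bases that do not fill a complete codon are ignored.
    """
    s = cds.upper().replace("U", "T")
    usable = len(s) - len(s) % 3
    codons = [s[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in LEUCINE_CODONS)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in LEUCINE_CODONS}


# Hypothetical toy sequence: ATG CTA CTA CTG TTA TAA
usage = leucine_codon_usage("ATGCTACTACTGTTATAA")
for codon, fraction in usage.items():
    print(codon, round(fraction, 2))
```

    Comparing the CTA fraction obtained for a gene of interest with the same fraction in a set of highly expressed reference genes gives the kind of contrast the abstract describes.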

  19. Phylogenetic Relationships and Genetic Variation in Longidorus and Xiphinema Species (Nematoda: Longidoridae) Using ITS1 Sequences of Nuclear Ribosomal DNA.

    Science.gov (United States)

    Ye, Weimin; Szalanski, Allen L; Robbins, R T

    2004-03-01

    Genetic analyses using DNA sequences of nuclear ribosomal DNA ITS1 were conducted to determine the extent of genetic variation within and among Longidorus and Xiphinema species. DNA sequences were obtained from samples collected from Arkansas, California and Australia as well as 4 Xiphinema DNA sequences from GenBank. The sequences of the ITS1 region including the 3' end of the 18S rDNA gene and the 5' end of the 5.8S rDNA gene ranged from 1020 bp to 1244 bp for the 9 Longidorus species, and from 870 bp to 1354 bp for the 7 Xiphinema species. Nucleotide frequencies were: A = 25.5%, C = 21.0%, G = 26.4%, and T = 27.1%. Genetic variation between the two genera had a maximum divergence of 38.6% between X. chambersi and L. crassus. Genetic variation among Xiphinema species ranged from 3.8% between X. diversicaudatum and X. bakeri to 29.9% between X. chambersi and X. italiae. Within Longidorus, genetic variation ranged from 8.9% between L. crassus and L. grandis to 32.4% between L. fragilis and L. diadecturus. Intraspecific genetic variation in X. americanum sensu lato ranged from 0.3% to 1.9%, while it was 0.8% in L. diadecturus and ranged from 0.6% to 10.9% in L. biformis. Identical sequences were obtained between the two populations of L. grandis, and between the two populations of X. bakeri. Phylogenetic analyses based on the ITS1 DNA sequence data were conducted on each genus separately using both maximum parsimony and maximum likelihood analysis. Among the Longidorus taxa, 4 subgroups are supported: L. grandis, L. crassus, and L. elongatus are in one cluster; L. biformis and L. paralongicaudatus are in a second cluster; L. fragilis and L. breviannulatus are in a third cluster; and L. diadecturus is in a fourth cluster. Among the Xiphinema taxa, 3 subgroups are supported: X. americanum with X. chambersi, X. bakeri with X. diversicaudatum, and X. italiae and X. vuittenezi forming a sister group with X. index. The relationships observed in this study
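
    Percent divergence figures and nucleotide frequencies of the kind quoted in this abstract can be computed from an alignment along the following lines. This is a minimal sketch that assumes an uncorrected pairwise distance (p-distance) with gap columns skipped; the abstract does not say which distance measure the authors used, and the taxon names and sequences below are invented for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Percent of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def base_frequencies(seqs):
    """Percent of A, C, G and T over all sequences, ignoring other symbols."""
    counts = {base: 0 for base in "ACGT"}
    for s in seqs:
        for ch in s.upper():
            if ch in counts:
                counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A, C, G or T characters found")
    return {base: 100.0 * n / total for base, n in counts.items()}


# Hypothetical toy alignment, not data from the study.
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAT",
    "taxon_C": "ACG-ACTTAT",
}
for (name1, seq1), (name2, seq2) in combinations(alignment.items(), 2):
    print(name1, name2, round(p_distance(seq1, seq2), 1))
print(base_frequencies(alignment.values()))
```

    For divergences as large as those reported between genera, a model-corrected distance would normally differ from the raw p-distance, so the numbers from this sketch are only a lower-bound style approximation.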

  20. Mitochondrial DNA sequence variation is associated with free-living activity energy expenditure in the elderly.

    Science.gov (United States)

    Tranah, Gregory J; Lam, Ernest T; Katzman, Shana M; Nalls, Michael A; Zhao, Yiqiang; Evans, Daniel S; Yokoyama, Jennifer S; Pawlikowska, Ludmila; Kwok, Pui-Yan; Mooney, Sean; Kritchevsky, Stephen; Goodpaster, Bret H; Newman, Anne B; Harris, Tamara B; Manini, Todd M; Cummings, Steven R

    2012-09-01

    The decline in activity energy expenditure underlies a range of age-associated pathological conditions, neuromuscular and neurological impairments, disability, and mortality. The majority (90%) of the energy needs of the human body are met by mitochondrial oxidative phosphorylation (OXPHOS). OXPHOS is dependent on the coordinated expression and interaction of genes encoded in the nuclear and mitochondrial genomes. We examined the role of mitochondrial genomic variation in free-living activity energy expenditure (AEE) and physical activity levels (PAL) by sequencing the entire (~16.5 kilobases) mtDNA from 138 Health, Aging, and Body Composition Study participants. Among the common mtDNA variants, the hypervariable region 2 m.185G>A variant was significantly associated with AEE (p=0.001) and PAL (p=0.0005) after adjustment for multiple comparisons. Several unique nonsynonymous variants were identified in the extremes of AEE with some occurring at highly conserved sites predicted to affect protein structure and function. Of interest is the p.T194M, CytB substitution in the lower extreme of AEE occurring at a residue in the Qi site of complex III. Among participants with low activity levels, the burden of singleton variants was 30% higher across the entire mtDNA and OXPHOS complex I when compared to those having moderate to high activity levels. A significant pooled variant association across the hypervariable 2 region was observed for AEE and PAL. These results suggest that mtDNA variation is associated with free-living AEE in older persons and may generate new hypotheses by which specific mtDNA complexes, genes, and variants may contribute to the maintenance of activity levels in late life.

  1. Heritable genome-wide variation of gene expression and promoter methylation between wild and domesticated chickens

    Directory of Open Access Journals (Sweden)

    Nätt Daniel

    2012-02-01

    Background: Variations in gene expression, mediated by epigenetic mechanisms, may cause broad phenotypic effects in animals. However, it has been debated to what extent expression variation and epigenetic modifications, such as patterns of DNA methylation, are transferred across generations, and therefore it is uncertain what role epigenetic variation may play in adaptation. Results: In Red Junglefowl, the ancestor of domestic chickens, gene expression and methylation profiles in thalamus/hypothalamus differed substantially from those of a domesticated egg-laying breed. Expression as well as methylation differences were largely maintained in the offspring, demonstrating reliable inheritance of epigenetic variation. Some of the inherited methylation differences were tissue-specific, and the differential methylation at specific loci was little changed after eight generations of intercrossing between Red Junglefowl and domesticated laying hens. There was an over-representation of differentially expressed and methylated genes in selective sweep regions associated with chicken domestication. Conclusions: Our results show that epigenetic variation is inherited in chickens, and we suggest that selection of favourable epigenomes, either by selection of genotypes affecting epigenetic states, or by selection of methylation states which are inherited independently of sequence differences, may have been an important aspect of chicken domestication.

  2. Temporal stability of epigenetic markers: sequence characteristics and predictors of short-term DNA methylation variations.

    Directory of Open Access Journals (Sweden)

    Hyang-Min Byun

    BACKGROUND: DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. METHODS: We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements (L1 and Alu) in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and Day 4, net of pyrosequencing error and adjusted for multiple covariates. RESULTS: Methylation markers showed different temporal behaviors, ranging from high stability (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and Day 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200 bp) was positively associated with DNA methylation stability. The 3' proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. CONCLUSIONS: The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels.
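
    As a rough illustration of the stability measure used here, the sketch below computes a plain one-way random-effects intraclass correlation coefficient from repeated measurements per individual. It is an assumption-laden simplification: the study's ICCs were additionally corrected for pyrosequencing error and adjusted for covariates, which this code does not attempt, and the example values are invented.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC for n subjects, each measured k times.

    `measurements` is a list of rows, one row per subject, e.g.
    [[day1, day4], [day1, day4], ...].
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(measurements[0])
    if k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("each subject needs the same number (>= 2) of measurements")

    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]

    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(measurements, subject_means) for x in row
    ) / (n * (k - 1))

    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return (msb - msw) / denominator


# Hypothetical percent-methylation values for three individuals (Day 1, Day 4).
stable_marker = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
print(icc_oneway(stable_marker))
```

    Values near 1 mean that most of the variance lies between individuals (a stable marker), while values near 0 or below mean that within-individual fluctuation dominates.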

  3. ITS2-rDNA Sequence Variation of Phlebotomus sergenti s.l. (Dip: Psychodidae) Populations in Iran

    Directory of Open Access Journals (Sweden)

    Vahideh Moin-Vaziri

    2016-10-01

    Background: Phlebotomus sergenti s.l. is considered the most likely vector of Leishmania tropica in Iran. Although two morphotypes, P. sergenti sergenti (A) and P. sergenti similis (B), have been formally described, further morphological analysis and a molecular analysis of the mitochondrial cytochrome oxidase I (mtDNA-COI) gene revealed inconsistencies and suggest that the variation between the morphotypes is intraspecific and that the morphotypes might belong to the same species. Methods: We examined the sequence of the ITS2-rDNA of Iranian specimens of P. sergenti s.l., comprising P. cf sergenti, P. cf similis, and intermediate morphotypes, together with available data in GenBank. Results: Sequence analysis showed 5.2% variation among P. sergenti s.l. morphotypes. Almost half of the variation was due to the number of AT microsatellite repeats in the center of the spacer. Nine haplotypes were found in the species, forming three main lineages corresponding to the origin of the colonies located in southwest (SW), northeast (NE), and northwest-center-southeast (NCS). Lineages NCS and NE included both typical P. cf sergenti and P. cf similis and intermediate morphotypes. Conclusion: Phylogenetic sequence analysis revealed that, except for one Iranian sample, which was close to the European samples, other Iranian haplotypes were associated with the northeastern Mediterranean populations including Turkey, Cyprus, Syria, and Pakistan. Similar to the sequences of the mtDNA COI gene, ITS2 sequences could not resolve P. sergenti from P. similis and did not support the possible existence of sibling species or subspecies within P. sergenti s.l.

  4. ITS2-rDNA Sequence Variation of Phlebotomus sergenti s.l. (Dip: Psychodidae) Populations in Iran

    Science.gov (United States)

    Moin-Vaziri, Vahideh; Oshaghi, Mohammad Ali; Yaghoobi-Ershadi, Mohammad Reza; Derakhshandeh-Peykar, Pupak; Abaei, Mohammad Reza; Mohtarami, Fatemeh; Zahraei-Ramezani, Ali Reza; Nadim, Aboulhassan

    2016-01-01

    Background: Phlebotomus sergenti s.l. is considered the most likely vector of Leishmania tropica in Iran. Although two morphotypes, P. sergenti sergenti (A) and P. sergenti similis (B), have been formally described, further morphological analysis and a molecular analysis of the mitochondrial cytochrome oxidase I (mtDNA-COI) gene revealed inconsistencies and suggest that the variation between the morphotypes is intraspecific and that the morphotypes might belong to the same species. Methods: We examined the sequence of the ITS2-rDNA of Iranian specimens of P. sergenti s.l., comprising P. cf sergenti, P. cf similis, and intermediate morphotypes, together with available data in GenBank. Results: Sequence analysis showed 5.2% variation among P. sergenti s.l. morphotypes. Almost half of the variation was due to the number of AT microsatellite repeats in the center of the spacer. Nine haplotypes were found in the species, forming three main lineages corresponding to the origin of the colonies located in southwest (SW), northeast (NE), and northwest-center-southeast (NCS). Lineages NCS and NE included both typical P. cf sergenti and P. cf similis and intermediate morphotypes. Conclusion: Phylogenetic sequence analysis revealed that, except for one Iranian sample, which was close to the European samples, other Iranian haplotypes were associated with the northeastern Mediterranean populations including Turkey, Cyprus, Syria, and Pakistan. Similar to the sequences of the mtDNA COI gene, ITS2 sequences could not resolve P. sergenti from P. similis and did not support the possible existence of sibling species or subspecies within P. sergenti s.l. PMID:28032098

  5. Imprinted genes show unique patterns of sequence conservation

    Directory of Open Access Journals (Sweden)

    Helms Volkhard

    2010-11-01

    Background: Genomic imprinting is an evolutionarily conserved mechanism of epigenetic gene regulation in placental mammals that results in silencing of one of the parental alleles. In order to decipher interactions between allele-specific DNA methylation of imprinted genes and evolutionary conservation, we performed a genome-wide comparative investigation of genomic sequences and highly conserved elements of imprinted genes in human and mouse. Results: Evolutionarily conserved elements in imprinted regions differ from those associated with autosomal genes in various ways. Whereas for maternally expressed genes strong divergence of protein-encoding sequences is most prominent, paternally expressed genes exhibit substantial conservation of coding and noncoding sequences. Conserved elements in imprinted regions are marked by enrichment of CpG dinucleotides, and low (TpG+CpA)/(2·CpG) ratios indicate reduced CpG deamination. Interestingly, paternally and maternally expressed genes can be distinguished by differences in G+C and CpG contents that might be associated with unusual epigenetic features. Noncoding conserved elements of paternally expressed genes in particular are exceptionally G+C and CpG rich. In addition, we confirmed a frequent occurrence of intronic CpG islands and observed a decelerated degeneration of ancient LINE-1 repeats. We also found a moderate enrichment of YY1 and CTCF binding sites in imprinted regions and identified several short sequence motifs in highly conserved elements that might act as additional regulatory elements. Conclusions: We discovered several novel conserved DNA features that might be related to allele-specific DNA methylation. Our results hint at reduced CpG deamination rates in imprinted regions, which affects mostly noncoding conserved elements of paternally expressed genes. Pronounced differences between maternally and paternally expressed genes imply specific modes of evolution as a result of differences in
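
    The sequence statistics named in this abstract (G+C content, CpG content, and the (TpG+CpA)/(2·CpG) deamination ratio) are simple dinucleotide counts. The sketch below shows one way to compute them for a single strand. The function names and the toy sequence are hypothetical, and the observed/expected CpG value uses the common definition (number of CpG × length) / (number of C × number of G), which is an assumption about, not a statement of, the authors' exact method.

```python
def count_dinucleotide(seq, dinuc):
    """Count occurrences of a dinucleotide, scanning every adjacent pair."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def cpg_stats(seq):
    """Return G+C content, CpG observed/expected, and the deamination ratio.

    A statistic is reported as None when its denominator would be zero.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = count_dinucleotide(s, "CG")
    tpg = count_dinucleotide(s, "TG")
    cpa = count_dinucleotide(s, "CA")
    return {
        "gc_content": (c + g) / n if n else None,
        "cpg_obs_exp": cpg * n / (c * g) if c and g else None,
        # Deaminated methyl-CpG shows up as TpG (or CpA on the other strand),
        # so a low ratio suggests that few CpGs have been lost to deamination.
        "deamination_ratio": (tpg + cpa) / (2 * cpg) if cpg else None,
    }


# Hypothetical toy sequence, not data from the study.
print(cpg_stats("CGCGTGCA"))
```

    In practice these values would be computed per conserved element and then compared between imprinted and non-imprinted regions.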

  6. The Structure of a Bernoulli Process Variation of the Fibonacci Sequence

    CERN Document Server

    Benson, Brian A

    2007-01-01

    We consider the structure of a variation of the Fibonacci sequence which is determined by a Bernoulli process. The associated structure of all Bernoulli variations of the Fibonacci sequence can be represented by a directed binary tree, which we denote X, with vertex labels representing the specific state of the recurrence variation. Since X is a binary tree, we can consider the term of a sequence variation given by a finite traversal of X represented by a binary code t. We then prove that the traversal of X that is the reflection of the digits of t gives exactly the integer term corresponding to t. We consider how to further this result with the statement of an additional conjecture. Finally, we give connections to Fibonacci expansions, the Stern-Brocot tree, and we apply our methods to the Three Hat Problem as seen in "Puzzle Corner" of the "Technology Review" magazine.

  7. Distribution of Genes and Repetitive Elements in the Diabrotica virgifera virgifera Genome Estimated Using BAC Sequencing

    Directory of Open Access Journals (Sweden)

    Brad S. Coates

    2012-01-01

    Feeding damage caused by the western corn rootworm, Diabrotica virgifera virgifera, is destructive to corn plants in North America and Europe where control remains challenging due to evolution of resistance to chemical and transgenic toxins. A BAC library, DvvBAC1, containing 109,486 clones with 104±34.5 kb inserts was created, which has an ~4.56X genome coverage based upon a 2.58 Gb (2.80 pg) flow cytometry-estimated haploid genome size. Paired end sequencing of 1037 BAC inserts produced 1.17 Mb of data (~0.05% genome coverage) and indicated ~9.4 and 16.0% of reads encode, respectively, endogenous genes and transposable elements (TEs). Sequencing genes within BAC full inserts demonstrated that TE densities are high within intergenic and intron regions and contribute to the increased gene size. Comparison of homologous genome regions cloned within different BAC clones indicated that TE movement may cause haplotype variation within the inbred strain. The data presented here indicate that the D. virgifera virgifera genome is large in size and contains a high proportion of repetitive sequence. These BAC sequencing methods, which are applicable to the characterization of genomes prior to sequencing, may be valuable resources for genome annotation as well as scaffolding.

  8. [Genetic variation analysis of canine parvovirus VP2 gene in China].

    Science.gov (United States)

    Yi, Li; Cheng, Shi-Peng; Yan, Xi-Jun; Wang, Jian-Ke; Luo, Bin

    2009-11-01

    To characterize the molecular biology, phylogenetic relationships, and current prevalence of canine parvovirus (CPV), faecal samples from pet dogs with acute enteritis in the cities of Beijing, Wuhan, and Nanjing were collected between 2006 and 2008 and tested for CPV by PCR and other assays. PCR-RFLP analysis showed no CPV to FPV (MEV) variation in any of the samples. The complete ORFs of the VP2 genes were obtained by PCR from 15 clinical CPVs and 2 CPV vaccine strains. All amplicons were cloned and sequenced. Analysis of the VP2 sequences showed that the clinical CPVs all belonged to the CPV-2a subtype and, by amino acid comparison, could be classified into a new cluster that contains the Tyr-->Ile (324) mutation. The 2 CPV vaccine strains belonged to the CPV-2 subtype, and both had scattered variation in the amino acid residues of the VP2 protein. A phylogenetic tree based on the CPV VP2 sequences showed that these 15 clinical CPV strains were more closely related to the Korean strain K001 than to earlier CPV-2a isolates from other countries. This indicates that genetic variation in canine parvovirus is associated to some degree with location and time. The survey of the CPV capsid protein VP2 gene provided useful information for identifying CPV types and understanding their genetic relationships.

  9. Cloning and sequence analysis of 5' fragment of Hoxa-11 gene in Latimeria chalumnae.

    Science.gov (United States)

    Xue, L Y; Qian, K X

    2001-01-01

    The Hoxa-11 gene is essential for the development of fish fins and tetrapod limbs. Based on the published nucleotide sequences of human and mouse Hoxa-11 genes, two degenerate primers were designed. The Latimeria Hoxa-11 gene fragment was amplified by PCR, cloned and sequenced. The acquired Hox gene fragment, which encodes 204 amino acids, comprises 2,065 bp, including most of exon 1, the intron, and part of exon 2. The homology of the Latimeria Hoxa-11 protein is 66.0% to human, 67.6% to mouse, 74.4% to chick, 72.8% to frog, and 59.7% to zebrafish, respectively. The exon 2 region including the homeobox and the splice site are highly conserved. However, the exon 1 region has increased in size by 16% from Latimeria to human. Sequence analysis further revealed that exon 1 of Latimeria Hoxa-11 could be divided into four regions: two highly conserved regions, a moderately conserved region, and a variable region adjacent to the intron. The size variation is primarily caused by the accumulation of alanine repeats and of flanking segments rich in glycine and serine in the variable region. This implies that the variable region might be related to the acquisition of new functions in the fin-limb transition and vertebrate evolution. Besides the homeobox, two highly conserved regions in exon 1 and two phylogenetic footprints in the intron were found. The strong sequence conservation suggests an important functional role of these regions.

  10. Identification of Legionella pneumophila serogroups and other Legionella species by mip gene sequencing.

    Science.gov (United States)

    Haroon, Attiya; Koide, Michio; Higa, Futoshi; Tateyama, Masao; Fujita, Jiro

    2012-04-01

    The virulence factor known as the macrophage infectivity potentiator (mip) is responsible for the intracellular survival of Legionella species. In this study, we investigated the potential of the mip gene sequence to differentiate isolates of different species of Legionella and different serogroups of Legionella pneumophila. We used 35 clinical L. pneumophila isolates and one clinical isolate each of Legionella micdadei, Legionella longbeachae, and Legionella dumoffii (collected from hospitals all over Japan between 1980 and 2007). We also used 19 environmental Legionella anisa isolates (collected in the Okinawa, Nara, Osaka, and Hyogo prefectures between 1987 and 2007) and two Legionella type strains. We extracted bacterial genomic DNA and amplified the mip gene by PCR. PCR products were purified by agarose gel electrophoresis and the mip gene was then sequenced. The L. pneumophila isolates could be divided into two groups: one group was very similar to the type strain and was composed of serogroup (SG) 1 isolates only; the second group had more sequence variations and was composed of SG1 isolates as well as SG2, SG3, SG5, and SG10 isolates. Phylogenetic analysis displayed one cluster for L. anisa isolates, while other Legionella species were present at discrete levels. Our findings show that mip gene sequencing is an effective technique for differentiating L. pneumophila strains from other Legionella species.

  11. Leishmania-specific surface antigens show sub-genus sequence variation and immune recognition.

    Directory of Open Access Journals (Sweden)

    Daniel P Depledge

    BACKGROUND: A family of hydrophilic acylated surface (HASP) proteins, containing extensive and variant amino acid repeats, is expressed at the plasma membrane in infective extracellular (metacyclic) and intracellular (amastigote) stages of Old World Leishmania species. While HASPs are antigenic in the host and can induce protective immune responses, the biological functions of these Leishmania-specific proteins remain unresolved. Previous genome analysis has suggested that parasites of the sub-genus Leishmania (Viannia) have lost HASP genes from their genomes. METHODS/PRINCIPAL FINDINGS: We have used molecular and cellular methods to analyse HASP expression in New World Leishmania mexicana complex species and show that, unlike in L. major, these proteins are expressed predominantly following differentiation into amastigotes within macrophages. Further genome analysis has revealed that the L. (Viannia) species, L. (V.) braziliensis, does express HASP-like proteins of low amino acid similarity but with similar biochemical characteristics, from genes present on a region of chromosome 23 that is syntenic with the HASP/SHERP locus in Old World Leishmania species and the L. (L.) mexicana complex. A related gene is also present in Leptomonas seymouri and this may represent the ancestral copy of these Leishmania-genus specific sequences. The L. braziliensis HASP-like proteins (named the orthologous (o) HASPs) are predominantly expressed on the plasma membrane in amastigotes and are recognised by immune sera taken from 4 out of 6 leishmaniasis patients tested in an endemic region of Brazil. Analysis of the repetitive domains of the oHASPs has shown considerable genetic variation in parasite isolates taken from the same patients, suggesting that antigenic change may play a role in immune recognition of this protein family. CONCLUSIONS/SIGNIFICANCE: These findings confirm that antigenic hydrophilic acylated proteins are expressed from genes in the same chromosomal

  12. Cloning and sequence analysis of Sox genes in a tetraploid cyprinid fish, Tor douronensis

    Institute of Scientific and Technical Information of China (English)

    GUO BaoCheng; LI JunBing; TONG ChaoBo; HE ShunPing

    2008-01-01

    A PCR survey for Sox genes in a young tetraploid fish, Tor douronensis (Teleostei: Cyprinidae), was performed to assess the evolutionary fates of important functional genes after genome duplication caused by a polyploidization event. A total of 13 Sox genes were obtained in Tor douronensis, representing the SoxB, SoxC and SoxE groups. Phylogenetic analysis of Sox genes in Tor douronensis provided evidence for fish-specific genome duplication, and suggested that Sox19 might be a teleost-specific Sox gene member. Sequence analysis revealed that most of the nucleotide substitutions between duplicated copies of Sox genes caused by the tetraploidization event, or between them and their orthologues in other species, are silent substitutions. It would appear that the sequences are under purifying selective pressure, strongly suggesting that they represent functional genes and supporting selection against null alleles at either of the two duplicated loci of Sox4a, Sox9a and Sox9b. Surprising variations in intron length and similarities between the two duplicated copies of Sox9a and Sox9b suggest that Tor douronensis might be an allotetraploid.

  13. Genetic variations of glycinin subunit genes among cultivated and wild type soybean species

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Glycinin is a predominant storage protein in most soybean accessions. It is a hexamer constituted by five major subunits, which can be classified into two groups. Group Ⅰ contains G1, G2 and G3, and Group Ⅱ contains G4 and G5. The genes encoding these subunits have been designated Gy1 to Gy5, respectively. In the present study, Gy1 genomic fragments were cloned from wild accessions of subgenera Glycine glycine, Glycine soja and a cultivar of Glycine max. Their sequences and the deduced amino acid sequences were compared. The residues critical for assembly of G1 subunits from the wild perennial accession were conserved. The Gy4 fragments were cloned from two wild perennial accessions and compared with that from subgenus Soja. Intron 3 of Gy4 had abundant variations between the subgenera G. Soja and G. Glycine as well as within the subgenus G. Glycine. Abundant variations existed in the disordered regions 3 and 4 of G4 subunits from two wild perennial accessions. The genomic organization of glycinin genes was analyzed in 19 accessions from subgenera Soja and Glycine. The hybridization patterns were identical among the accessions of subgenus Soja. In contrast, abundant polymorphisms existed between the accessions from subgenus Glycine. These results indicated that glycinin genes have a high degree of conservation within subgenus Soja but more variation within subgenus Glycine.

  14. IDENTIFICATION OF UTERINE MILK PROTEIN (UTMP) GENE IN BALI CATTLE USING DIRECT SEQUENCING

    Directory of Open Access Journals (Sweden)

    Jakaria

    2016-03-01

    The objective of this research was to identify diversity in the exon 5 UTMP gene fragment in Bali cattle using direct sequencing. A total of 60 blood samples of Bali cattle from BPTU Bali on Bali island (20 head), BPTU Serading on Sumbawa island (20 head), and the Village Breeding Center (VBC) in Barru District, South Sulawesi (20 head) were used to evaluate genetic diversity at exon 5 of the UTMP gene. The forward and reverse sequence data were analyzed using the BioEdit program, and alignment analysis was carried out using the MEGA5 program. Haplotype analysis was performed with the DnaSP v5 program. The results showed that partial sequences of exon 5 of the UTMP gene had 16 haplotypes, with the highest number of haplotypes found in VBC Barru District, South Sulawesi (8 haplotypes). Moreover, the highest average haplotype (h) and nucleotide (π) diversity were also found in VBC Barru District, South Sulawesi, at 0.7949 and 0.0016, respectively. In addition, a minisatellite insertion was found in the exon 5 UTMP gene fragment in Bali cattle, consisting of 5'-CCA GTC ATG AAG AAG GCA GAG GTC GTC GTG CCG GCG AAA-3'. According to our results, haplotype and minisatellite variation in the exon 5 UTMP gene fragment can be used as a candidate genetic marker specific for reproductive traits in Bali cattle and for future breeding programs.
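
    The two diversity statistics reported in this abstract can be sketched directly from a set of aligned sequences. The code below assumes Nei's standard definitions (haplotype diversity h = n/(n-1) × (1 - Σ p_i²), and nucleotide diversity as the mean number of pairwise differences per site); it is an illustration with invented sequences, not a reproduction of the DnaSP analysis.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site across all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b)
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy sample of four aligned sequences.
sample = ["AAAA", "AAAA", "AAAT", "AATT"]
print(round(haplotype_diversity(sample), 4))
print(round(nucleotide_diversity(sample), 4))
```

    Identical sequences count as one haplotype, so a population with many distinct haplotypes that differ at only a few sites will show high h but low nucleotide diversity, the pattern reported here (0.7949 and 0.0016).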

  15. Sequence analysis of 21 genes located in the Kartagener syndrome linkage region on chromosome 15q.

    Science.gov (United States)

    Geremek, Maciej; Schoenmaker, Frederieke; Zietkiewicz, Ewa; Pogorzelski, Andrzej; Diehl, Scott; Wijmenga, Cisca; Witt, Michal

    2008-06-01

    Primary ciliary dyskinesia (PCD) is a rare genetic disorder, which shows extensive genetic heterogeneity and is mostly inherited in an autosomal recessive fashion. There are four genes with a proven pathogenetic role in PCD. DNAH5 and DNAI1 are involved in 28 and 10% of PCD cases, respectively, while two other genes, DNAH11 and TXNDC3, have been identified as causal in one PCD family each. We have previously identified a 3.5 cM (2.82 Mb) region on chromosome 15q linked to Kartagener syndrome (KS), a subtype of PCD characterized by the randomization of body organ positioning. We have now refined the KS candidate region to a 1.8 Mb segment containing 18 known genes. The coding regions of these genes and three neighboring genes were subjected to sequence analysis in seven KS probands, and we were able to identify 60 single nucleotide sequence variants, 35 of which resided in mRNA coding sequences. However, none of the variations alone could explain the occurrence of the disease in these patients.

  16. Informational structure of genetic sequences and nature of gene splicing

    Science.gov (United States)

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  17. Variation in the RAD51 gene and familial breast cancer

    Science.gov (United States)

    Lose, Felicity; Lovelock, Paul; Chenevix-Trench, Georgia; Mann, Graham J; Pupo, Gulietta M; Spurdle, Amanda B

    2006-01-01

    Introduction Human RAD51 is a homologue of the Escherichia coli RecA protein and is known to function in recombinational repair of double-stranded DNA breaks. Mutations in the lower eukaryotic homologues of RAD51 result in a deficiency in the repair of double-stranded DNA breaks. Loss of RAD51 function would therefore be expected to result in an elevated mutation rate, leading to accumulation of DNA damage and, hence, to increased cancer risk. RAD51 interacts directly or indirectly with a number of proteins implicated in breast cancer, such as BRCA1 and BRCA2. Similar to BRCA1 mice, RAD51-/- mice are embryonic lethal. The RAD51 gene region has been shown to exhibit loss of heterozygosity in breast tumours, and deregulated RAD51 expression in breast cancer patients has also been reported. Few studies have investigated the role of coding region variation in the RAD51 gene in familial breast cancer, with only one coding region variant – exon 6 c.449G>A (p.R150Q) – reported to date. Methods All nine coding exons of the RAD51 gene were analysed for variation in 46 well-characterised, BRCA1/2-negative breast cancer families using denaturing high-performance liquid chromatography. Genotyping of the exon 6 p.R150Q variant was performed in an additional 66 families. Additionally, lymphoblastoid cell lines from breast cancer patients were subjected to single nucleotide primer extension analysis to assess RAD51 expression. Results No coding region variation was found, and all intronic variation detected was either found in unaffected controls or was unlikely to have functional consequences. Single nucleotide primer extension analysis did not reveal allele-specific changes in RAD51 expression in any of the lymphoblastoid cell lines tested. Conclusion Our study indicates that RAD51 is not a major familial breast cancer predisposition gene. PMID:16762046

  18. Evolutionary evidence for alternative structure in RNA sequence co-variation.

    Directory of Open Access Journals (Sweden)

    Justin Ritz

    Sequence conservation and co-variation of base pairs are hallmarks of structured RNAs. For certain RNAs (e.g. riboswitches), a single sequence must adopt at least two alternative secondary structures to effectively regulate the message. If alternative secondary structures are important to the function of an RNA, we expect to observe evolutionary co-variation supporting multiple conformations. We set out to characterize the evolutionary co-variation supporting alternative conformations in riboswitches to determine the extent to which alternative secondary structures are conserved. We found strong co-variation support for the terminator, P1, and anti-terminator stems in the purine riboswitch by extending alignments to include terminator sequences. When we performed Boltzmann suboptimal sampling on purine riboswitch sequences with terminators we found that these sequences appear to have evolved to favor specific alternative conformations. We extended our analysis of co-variation to classic alignments of group I/II introns, tRNA, and other classes of riboswitches. In a majority of these RNAs, we found evolutionary evidence for alternative conformations that are compatible with the Boltzmann suboptimal ensemble. Our analyses suggest that alternative conformations are selected for and thus likely play functional roles in even the most structured of RNAs.
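
    Co-variation between two alignment columns, as discussed above, is often quantified with mutual information. The sketch below illustrates that general idea only; it is not the authors' method, the function name is an assumption, and the toy alignment is hypothetical.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean the residue in one column predicts the residue in
    the other, as expected for compensatory changes in a base pair.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    joint = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy RNA alignment: columns 0 and 5 change together so that
# they always form a Watson-Crick pair; columns 1-4 are invariant.
toy_alignment = ["GAAAAC", "CAAAAG", "AAAAAU", "UAAAAA"]
print(mutual_information(toy_alignment, 0, 5))  # co-varying pair
print(mutual_information(toy_alignment, 0, 1))  # invariant partner column
```

    Plain mutual information ignores phylogenetic relatedness among sequences and does not check that the paired residues are actually complementary, so it is only a rough first screen.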

  19. Cloning, sequencing and phylogenic analysis of duck prion gene

    Institute of Scientific and Technical Information of China (English)

    WANG Qigui; ZHANG Lei; HU Xiaoxiang; FAN Baoliang; LI Ning; LI Hui; WU Changxin

    2004-01-01

    The duck prion gene was cloned and sequenced. Similar to mammalian prion protein (PrP), duck prion is encoded by a single exon present as a single copy in the genome, which was confirmed by Southern blot analysis. All of the structural features of mammalian PrP were also identified in the duck PrP. Compared with mammalian PrP, it exhibited an overall similarity of 30%. When compared with chicken PrP, it showed a higher homology of 97%. A phylogenetic tree was constructed to trace the evolution of the prion gene in animals.

  20. Identification of rat genes by TWINSCAN gene prediction, RT-PCR, and direct sequencing

    DEFF Research Database (Denmark)

    Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan

    2004-01-01

    The publication of a draft sequence of a third mammalian genome--that of the rat--suggests a need to rethink genome annotation. New mammalian sequences will not receive the kind of labor-intensive annotation efforts that are currently being devoted to human. In this paper, we demonstrate...... an alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially...

  1. SARS transmission pattern in Singapore reassessed by viral sequence variation analysis.

    Directory of Open Access Journals (Sweden)

    Jianjun Liu

    2005-02-01

    BACKGROUND: Epidemiological investigations of infectious disease are mainly dependent on indirect contact information and only occasionally assisted by characterization of pathogen sequence variation from clinical isolates. Direct sequence analysis of the pathogen, particularly at a population level, is generally thought to be too cumbersome, technically difficult, and expensive. We present here a novel application of mass spectrometry (MS)-based technology in characterizing viral sequence variations that overcomes these problems, and we apply it retrospectively to the severe acute respiratory syndrome (SARS) outbreak in Singapore. METHODS AND FINDINGS: The success rate of the MS-based analysis for detecting SARS coronavirus (SARS-CoV) sequence variations was determined to be 95% with 75 copies of viral RNA per reaction, which is sufficient to directly analyze both clinical and cultured samples. Analysis of 13 SARS-CoV isolates from the different stages of the Singapore outbreak identified nine sequence variations that could define the molecular relationship between them and pointed to a new, previously unidentified, primary route of introduction of SARS-CoV into the Singapore population. Our direct determination of viral sequence variation from a clinical sample also clarified an unresolved epidemiological link regarding the acquisition of SARS in a German patient. We were also able to detect heterogeneous viral sequences in primary lung tissues, suggesting a possible coevolution of quasispecies of virus within a single host. CONCLUSION: This study has further demonstrated the importance of improving clinical and epidemiological studies of pathogen transmission through the use of genetic analysis and has revealed the MS-based analysis to be a sensitive and accurate method for characterizing SARS-CoV genetic variations in clinical samples.
We suggest that this approach should be used routinely during outbreaks of a wide variety of agents, in order

  2. Gene Tree Discordance Causes Apparent Substitution Rate Variation.

    Science.gov (United States)

    Mendes, Fábio K; Hahn, Matthew W

    2016-07-01

    Substitution rates are known to be variable among genes, chromosomes, species, and lineages due to multifarious biological processes. Here, we consider another source of substitution rate variation due to a technical bias associated with gene tree discordance. Discordance has been found to be rampant in genome-wide data sets, often due to incomplete lineage sorting (ILS). This apparent substitution rate variation is caused when substitutions that occur on discordant gene trees are analyzed in the context of a single, fixed species tree. Such substitutions have to be resolved by proposing multiple substitutions on the species tree, and we therefore refer to this phenomenon as Substitutions Produced by ILS (SPILS). We use simulations to demonstrate that SPILS has a larger effect with increasing levels of ILS, and on trees with larger numbers of taxa. Specific branches of the species trees are consistently, but erroneously, inferred to be longer or shorter, and we show that these branches can be predicted based on discordant tree topologies. Moreover, we observe that fixing a species tree topology when performing tests of positive selection increases the false positive rate, particularly for genes whose discordant topologies are most affected by SPILS. Finally, we use data from multiple Drosophila species to show that SPILS can be detected in nature. Although the effects of SPILS are modest per gene, it has the potential to affect substitution rate variation whenever high levels of ILS are present, particularly in rapid radiations. The problems outlined here have implications for character mapping of any type of trait, and for any biological process that causes discordance. We discuss possible solutions to these problems, and areas in which they are likely to have caused faulty inferences of convergence and accelerated evolution.

  3. Cloning and sequencing of a Moraxella bovis pilin gene.

    OpenAIRE

    1985-01-01

    Moraxella bovis pili have been shown to play a major role in both infectivity and protective immunity of bovine infectious keratoconjunctivitis. Sonicated M. bovis DNA from the piliated strain EPP63 was inserted into the vector lambda gt11 with EcoRI linkers. Recombinant phage were screened with an oligonucleotide probe based on the amino-terminal portion of the DNA sequence of a Neisseria gonorrhoeae pilin gene. Two candidate phages produced a protein that comigrated with EPP63 beta pilin in...

  4. Gene expression and variation in social aggression by queens of the harvester ant Pogonomyrmex californicus.

    Science.gov (United States)

    Helmkampf, Martin; Mikheyev, Alexander S; Kang, Yun; Fewell, Jennifer; Gadau, Jürgen

    2016-08-01

    A key requirement for social cooperation is the mitigation and/or social regulation of aggression towards other group members. Populations of the harvester ant Pogonomyrmex californicus show the alternate social phenotypes of queens founding nests alone (haplometrosis) or in groups of unrelated yet cooperative individuals (pleometrosis). Pleometrotic queens display an associated reduction in aggression. To understand the proximate drivers behind this variation, we placed foundresses of the two populations into social environments with queens from the same or the alternate population, and measured their behaviour and head gene expression profiles. A proportion of queens from both populations behaved aggressively, but haplometrotic queens were significantly more likely to perform aggressive acts, and conflict escalated more frequently in pairs of haplometrotic queens. Whole-head RNA sequencing revealed variation in gene expression patterns, with the two populations showing moderate differentiation in overall transcriptional profile, suggesting that genetic differences underlie the two founding strategies. The largest detected difference, however, was associated with aggression, regardless of queen founding type. Several modules of coregulated genes, involved in metabolism, immune system and neuronal function, were found to be upregulated in highly aggressive queens. Conversely, nonaggressive queens exhibited a striking pattern of upregulation in chemosensory genes. Our results highlight that the social phenotypes of cooperative vs. solitary nest founding tap into a set of gene regulatory networks that seem to govern aggression level. We also present a number of highly connected hub genes associated with aggression, providing opportunity to further study the genetic underpinnings of social conflict and tolerance.

  5. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  6. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  7. Sequencing by ligation variation with endonuclease V digestion and deoxyinosine-containing query oligonucleotides

    Directory of Open Access Journals (Sweden)

    Ho Antoine

    2011-12-01

    Background Sequencing-by-ligation (SBL) is one of several next-generation sequencing methods that has been developed for massive sequencing of DNA immobilized on arrayed beads (or other clonal amplicons). SBL has the advantage of being easy to implement and accessible to all because it can be performed with off-the-shelf reagents. However, SBL has the limitation of very short read lengths. Results To overcome the read length limitation, research groups have developed complex library preparation processes, which can be time-consuming, difficult, and result in low complexity libraries. Herein we describe a variation on traditional SBL protocols that extends the number of sequential bases that can be sequenced by using Endonuclease V to nick a query primer, thus leaving a ligatable end extended into the unknown sequence for further SBL cycles. To demonstrate the protocol, we constructed a known DNA sequence and utilized our SBL variation, cyclic SBL (cSBL), to resequence this region. Using our method, we were able to read thirteen contiguous bases in the 3' - 5' direction. Conclusions Combining this read length with sequencing in the 5' - 3' direction would allow a read length of over twenty bases on a single tag. Implementing mate-paired tags and this SBL variation could enable > 95% coverage of the genome.

  8. Full-length minor ampullate spidroin gene sequence.

    Directory of Open Access Journals (Sweden)

    Gefei Chen

    Spider silk includes seven protein-based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of the MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non-regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level.

  9. Sequence analysis of the Epstein-Barr virus (EBV) latent membrane protein-1 gene and promoter region

    DEFF Research Database (Denmark)

    Sandvej, K; Gratama, J W; Munch, M

    1997-01-01

    Sequence variations in the Epstein-Barr virus (EBV) encoded latent membrane protein-1 (LMP-1) gene have been described in a Chinese nasopharyngeal carcinoma-derived isolate (CAO), and in viral isolates from various EBV-associated tumors. It has been suggested that these genetic changes, which inc...

  10. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people

    DEFF Research Database (Denmark)

    Nelson, Matthew R.; Wegmann, Daniel; Ehm, Margaret G.

    2012-01-01

    Rare genetic variants contribute to complex disease risk; however, the abundance of rare variants in human populations remains unknown. We explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. We find rare variants are abundant (1 every 17 bases)...

  11. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    Science.gov (United States)

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  12. Genomic and gene variation in Mycoplasma hominis strains

    DEFF Research Database (Denmark)

    Christiansen, Gunna; Andersen, H; Birkelund, Svend

    1987-01-01

    DNAs from 14 strains of Mycoplasma hominis isolated from various habitats, including strain PG21, were analyzed for genomic heterogeneity. DNA-DNA filter hybridization values were from 51 to 91%. Restriction endonuclease digestion patterns, analyzed by agarose gel electrophoresis, revealed...... no identity or cluster formation between strains. Variation within M. hominis rRNA genes was analyzed by Southern hybridization of EcoRI-cleaved DNA hybridized with a cloned fragment of the rRNA gene from the mycoplasma strain PG50. Five of the M. hominis strains showed identical hybridization patterns....... These hybridization patterns were compared with those of 12 other mycoplasma species, which showed a much more complex band pattern. Cloned nonribosomal RNA gene fragments of M. hominis PG21 DNA were analyzed, and the fragments were used to demonstrate heterogeneity among the strains. A monoclonal antibody against...

  13. Molecular variation and evolution of the tyrosine kinase domains of insulin receptor IRa and IRb genes in Cyprinidae.

    Science.gov (United States)

    Kong, XiangHui; Wang, XuZhen; He, ShunPing

    2011-07-01

    The insulin receptor (IR) gene plays an important role in regulating cell growth, differentiation and development. In the present study, DNA sequences of insulin receptor genes, IRa and IRb, were amplified and sequenced from 37 representative species of the Cyprinidae and from five outgroup species from non-cyprinid Cypriniformes. Based on coding sequences (CDS) of tyrosine kinase regions of IRa and IRb, molecular evolution and phylogenetic relationships were analyzed to better understand the characteristics of IR gene divergence in the family Cyprinidae. IRa and IRb were clustered into one lineage in the gene tree of the IR gene family, reconstructed using the unweighted pair group method with arithmetic mean (UPGMA). IRa and IRb have evolved into distinct genes after IR gene duplication in Cyprinidae. For each gene, molecular evolution analyses showed that there was no significant difference among different groups in the reconstructed maximum parsimony (MP) tree of Cyprinidae; IRa and IRb have been subjected to similar evolutionary pressure among different lineages. Although the amino acid sequences of IRa and IRb tyrosine kinase regions were highly conserved, our analyses showed that there were clear sequence variations between the tyrosine kinase regions of IRa and IRb proteins. This indicates that IRa and IRb proteins might play different roles in the insulin signaling pathway.

  14. AFLP and DNA sequence variation in an Andean domesticate, pepino (Solanum muricatum, Solanaceae): implications for evolution and domestication.

    Science.gov (United States)

    Blanca, José M; Prohens, Jaime; Anderson, Gregory J; Zuriaga, Elena; Cañizares, Joaquín; Nuez, Fernando

    2007-07-01

    The pepino (Solanum muricatum) is a vegetatively propagated, domesticated native of the Andes, where it grows with wild relatives. We used AFLPs and a 1-kb sequence of the 3-methylcrotonyl-CoA carboxylase gene to study variation of 27 accessions of S. muricatum and 35 collections of 10 species of wild relatives (Solanum section Basarthrum). A total of 298 AFLP fragments and 29 DNA sequence haplotypes were detected. Cluster and principal coordinate analyses and other genetic parameters estimated from both types of markers, show that S. muricatum is closely related to the species from one of the series (Caripensia) of section Basarthrum and that >90% of the variation of the cultigen is also represented in that series. Pepino is highly diverse, either because it is not monophyletic or it has been subjected to regular introgression with wild species, or both. Although a continuous distribution of the genetic variation occurred within the cultivated species, three genetic clusters were recognized. Cluster 1 is mostly centered in Ecuador, cluster 2 in Ecuador and Peru, and cluster 3 in Colombia and Ecuador. Cluster 3 also includes all modern cultivars studied. These results and other evidence suggest that northern Ecuador/southern Colombia is the main center of pepino diversity and the center of origin. The high genetic variation of this cultigen indicates that domestication does not always produce a genetic bottleneck.

  15. Massive parallel IGHV gene sequencing reveals a germinal center pathway in origins of human multiple myeloma.

    Science.gov (United States)

    Cowan, Graeme; Weston-Bell, Nicola J; Bryant, Dean; Seckinger, Anja; Hose, Dirk; Zojer, Niklas; Sahota, Surinder S

    2015-05-30

    Human multiple myeloma (MM) is characterized by accumulation of malignant terminally differentiated plasma cells (PCs) in the bone marrow (BM), raising the question when during maturation neoplastic transformation begins. Immunoglobulin IGHV genes carry imprints of clonal tumor history, delineating somatic hypermutation (SHM) events that generally occur in the germinal center (GC). Here, we examine MM-derived IGHV genes using massive parallel deep sequencing, comparing them with profiles in normal BM PCs. In 4/4 presentation IgG MM, monoclonal tumor-derived IGHV sequences revealed significant evidence for intraclonal variation (ICV) in mutation patterns. IGHV sequences of 2/2 normal PC IgG populations revealed dominant oligoclonal expansions, each expansion also displaying mutational ICV. Clonal expansions in MM and in normal BM PCs reveal common IGHV features. In such MM, the data fit a model of tumor origins in which neoplastic transformation is initiated in a GC B-cell committed to terminal differentiation but still targeted by on-going SHM. Strikingly, the data parallel IGHV clonal sequences in some monoclonal gammopathy of undetermined significance (MGUS) known to display on-going SHM imprints. Since MGUS generally precedes MM, these data suggest origins of MGUS and MM with IGHV gene mutational ICV from the same GC B-cell, arising via a distinctive pathway.

  16. Hypertension and genetic variation in endothelial-specific genes.

    Directory of Open Access Journals (Sweden)

    Erik Larsson

    Genome-wide association (GWA) studies usually detect common genetic variants with low-to-medium effect sizes. Many contributing variants are not revealed, since they fail to reach significance after strong correction for multiple comparisons. The WTCCC study for hypertension, for example, failed to identify genome-wide significant associations. We hypothesized that genetic variation in genes expressed specifically in the endothelium may be important for hypertension development. Results from the WTCCC study were combined with previously published gene expression data from mice to specifically investigate SNPs located within endothelial-specific genes, bypassing the requirement for genome-wide significance. Six SNPs from the WTCCC study were selected for independent replication in 5205 hypertensive patients and 5320 population-based controls, and successively in a cohort of 16,537 individuals. A common variant (rs10860812) in the DRAM (damage-regulated autophagy modulator) locus showed association with hypertension (P = 0.008) in the replication study. The minor allele (A) had a protective effect (OR = 0.93; 95% CI 0.88-0.98 per A-allele), which replicates the association in the WTCCC GWA study. However, a second follow-up, in the larger cohort, failed to reveal an association with blood pressure. We further tested the endothelial-specific genes for co-localization with a panel of newly discovered SNPs from large meta-GWAS on hypertension or blood pressure. There was no significant overlap between those genes and hypertension or blood pressure loci. The result does not support the hypothesis that genetic variation in genes expressed in endothelium plays an important role for hypertension development. Moreover, the discordant association of rs10860812 with blood pressure in the case-control study versus the larger Malmö Preventive Project study highlights the importance of rigorous replication in multiple large independent studies.
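
    As a rough illustration of how a per-allele odds ratio and its 95% confidence interval, like the figures quoted in this abstract, are commonly derived, the sketch below computes an allelic odds ratio with a Wald interval from a 2x2 table of allele counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study, which may well have used a different model (for example, logistic regression with covariates).

```python
import math

def allelic_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Return (odds_ratio, ci_lower, ci_upper) for the minor allele.

    Uses the standard Wald interval on the log odds ratio.
    z=1.96 corresponds to a two-sided 95% interval.
    """
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts (NOT data from the study)
or_value, ci_low, ci_high = allelic_odds_ratio(3000, 7410, 3200, 7440)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

    An odds ratio below 1 with an interval that excludes 1 would be read as a protective effect of the minor allele, which is how the abstract interprets its own estimate.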

  17. HIV-1 Tat and Viral Latency: What We Can Learn from Naturally Occurring Sequence Variations

    Science.gov (United States)

    Kamori, Doreen; Ueno, Takamasa

    2017-01-01

    Despite the effective use of antiretroviral therapy, the persistence of a latently HIV-1-infected reservoir, mainly in the resting memory CD4+ T lymphocyte subset, has been a major setback to viral eradication. While host transcriptional silencing machinery is thought to play a dominant role in HIV-1 latency, HIV-1 proteins such as Tat may affect both the establishment and the reversal of latency. Indeed, mutational studies have demonstrated that insufficient Tat transactivation activity can result in impaired transcription of viral genes and the establishment of latency in cell culture experiments. Because Tat is one of the highly variable proteins within the HIV-1 proteome, it is conceivable that naturally occurring Tat mutations may differentially modulate Tat functions, thereby influencing the establishment and/or the reversal of viral latency in vivo. In this mini review, we summarize recent findings on naturally occurring Tat polymorphisms associated with host immune responses, and we highlight the implications of Tat sequence variations for HIV latency.

  18. Sequence diversity and differential expression of major phenylpropanoid-flavonoid biosynthetic genes among three mango varieties.

    Science.gov (United States)

    Hoang, Van L T; Innes, David J; Shaw, P Nicholas; Monteith, Gregory R; Gidley, Michael J; Dietzgen, Ralf G

    2015-07-30

    Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three varieties of mango (Mangifera indica L., family Anacardiaceae), Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM), and to determine associations with gene expression and mango flavonoid profiles. A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera) was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences, with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties. We discovered an association between the extent of sequence variation and

  19. Copy Number Variation of UGT 2B Genes in Indian Families Using Whole Genome Scans

    Directory of Open Access Journals (Sweden)

    Avinash M. Veerappa

    2016-01-01

    Background and Objectives. Uridine diphospho-glucuronosyltransferase 2B (UGT2B) is a family of genes involved in metabolizing steroid hormones and several other xenobiotics. These UGT2B genes are highly polymorphic in nature and have distinct polymorphisms associated with specific regions around the globe. The copy number variation (CNV) status of UGT2B17 in the Indian population is not known, and its disease associations have been inconclusive. It was therefore of interest to investigate the CNV profile of UGT2B genes. Methods. We investigated the presence of CNVs in UGT2B genes in 31 members from eight Indian families using the Affymetrix Genome-Wide Human SNP Array 6.0 chip. Results. Our data revealed that >50% of the study members carried CNVs in UGT2B genes, of which 76% showed deletion polymorphism. CNVs were observed more often in UGT2B17 (76.4%) than in UGT2B15 (17.6%). Molecular network and pathway analysis found enrichment related to steroid metabolic process, carboxylesterase activity, and sequence-specific DNA binding. Interpretation and Conclusion. We report the presence of UGT2B gene deletion and duplication polymorphisms in Indian families. Network analysis indicates a substitutive role for other possible genes in UGT activity. CNVs of UGT2B genes are very common in individuals, suggesting that their effect is neutral with respect to any suspected diseases.

  20. Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

    Directory of Open Access Journals (Sweden)

    B. Bennetts

    2014-09-01

    Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, the Royal College of Pathologists of Australasia (RCPA), in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP), is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

  1. Amplification of complete gag gene sequences from geographically distinct equine infectious anemia virus isolates.

    Science.gov (United States)

    Boldbaatar, Bazartseren; Bazartseren, Tsevel; Koba, Ryota; Murakami, Hironobu; Oguma, Keisuke; Murakami, Kenji; Sentsui, Hiroshi

    2013-04-01

    In the current study, primers described previously and modified versions of these primers were evaluated for amplification of full-length gag genes from different equine infectious anemia virus (EIAV) strains from several countries, including the USA, Germany and Japan. Each strain was inoculated into a primary horse leukocyte culture, and the full-length gag gene was amplified by reverse transcription polymerase chain reaction. Each amplified gag gene was cloned into a plasmid vector for sequencing, and the detectable copy numbers of target DNA were determined. Use of a mixture of two forward primers and one reverse primer in the polymerase chain reaction enabled the amplification of all EIAV strains used in this study. However, further study is required to confirm these primers as universal for all EIAV strains. The nucleotide sequence of gag is considered highly conserved, as evidenced by the use of gag-encoded capsid proteins as a common antigen for the detection of EIAV in serological tests. However, significant sequence variation in the gag genes of different EIAV strains was found in the current study.

  3. Genome-wide associations of gene expression variation in humans.

    Directory of Open Access Journals (Sweden)

    Barbara E Stranger

    2005-12-01

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
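
    Two of the multiple test corrections named in this abstract, Bonferroni and false discovery rate, can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the false discovery rate control shown is the Benjamini-Hochberg step-up procedure (the abstract does not say which FDR method was used), the p-values are invented, and the permutation approach is not shown.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number_of_tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every test whose rank is at most that k
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Invented p-values, for illustration only
p_values = [0.001, 0.012, 0.025, 0.045, 0.2]
print(bonferroni(p_values))          # stricter: controls family-wise error
print(benjamini_hochberg(p_values))  # more permissive: controls FDR
```

    With these made-up values Bonferroni keeps only the smallest p-value while Benjamini-Hochberg keeps the three smallest, which illustrates why the number of reported associations can differ across correction methods.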

  4. Characterization of genetic sequence variation of 58 STR loci in four major population groups.

    Science.gov (United States)

    Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

    2016-11-01

    Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house Excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty-six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy-six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b, had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
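
    The reason more alleles are found by sequence than by length can be shown with a toy example: alleles of the same length collapse into one length-based allele even when their sequences differ. The sketch below is a simplification with invented repeat-region sequences and a hypothetical helper, `count_alleles`; it treats the nominal allele as sequence length divided by the motif length, which ignores flanking regions and partial repeats.

```python
from collections import Counter

def count_alleles(sequences, motif_len=4):
    """Tally alleles two ways: by nominal repeat number and by exact sequence."""
    by_length = Counter(len(seq) // motif_len for seq in sequences)
    by_sequence = Counter(sequences)
    return by_length, by_sequence

# Invented repeat-region sequences for a single tetranucleotide STR locus
observed = [
    "TCTATCTATCTATCTA",      # four TCTA repeats
    "TCTATCTGTCTATCTA",      # same length, one internal substitution
    "TCTATCTATCTATCTATCTA",  # five TCTA repeats
    "TCTATCTATCTATCTA",      # second copy of the first allele
]

by_length, by_sequence = count_alleles(observed)
print("distinct alleles by length:  ", len(by_length))
print("distinct alleles by sequence:", len(by_sequence))
```

    Here capillary electrophoresis style length typing would see two alleles, while sequencing resolves three.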

  5. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  6. Targeted enrichment of the black cottonwood (Populus trichocarpa) gene space using sequence capture

    Directory of Open Access Journals (Sweden)

    Zhou Lecong

    2012-12-01

    Background: High-throughput re-sequencing is rapidly becoming the method of choice for studies of neutral and adaptive processes in natural populations across taxa. As re-sequencing the genome of large numbers of samples is still cost-prohibitive in many cases, methods for genome complexity reduction have been developed in attempts to capture most ecologically-relevant genetic variation. One of these approaches is sequence capture, in which oligonucleotide baits specific to genomic regions of interest are synthesized and used to retrieve and sequence those regions. Results: We used sequence capture to re-sequence most predicted exons, their upstream regulatory regions, as well as numerous random genomic intervals in a panel of 48 genotypes of the angiosperm tree Populus trichocarpa (black cottonwood, or ‘poplar’). A total of 20.76 Mb (5%) of the poplar genome was targeted, corresponding to 173,040 baits. With 12 indexed samples run in each of four lanes on an Illumina HiSeq instrument (2x100 paired-end), 86.8% of the bait regions were on average sequenced at a depth ≥10X. Few off-target regions (>250 bp away from any bait) were present in the data, but on average ~80 bp on either side of the baits were captured and sequenced to an acceptable depth (≥10X) to call heterozygous SNPs. Nucleotide diversity estimates within and adjacent to protein-coding genes were similar to those previously reported in Populus spp., while intergenic regions had higher values consistent with a relaxation of selection. Conclusions: Our results illustrate the efficiency and utility of sequence capture for re-sequencing highly heterozygous tree genomes, and suggest design considerations to optimize the use of baits in future studies.

  7. Sequence diversity within the argF, fbp and recA genes of natural isolates of Neisseria meningitidis: interspecies recombination within the argF gene.

    Science.gov (United States)

    Zhou, J; Spratt, B G

    1992-08-01

    Studies of natural populations of Neisseria meningitidis using multilocus enzyme electrophoresis have shown extensive genetic variation within this species, which, it has been proposed, implies a level of sequence diversity within meningococci that is greater than that normally considered as the criterion for species limits in bacteria. To obtain a direct measure of the sequence diversity among meningococci, we obtained the nucleotide sequences of most of the argF, recA and fbp genes of eight meningococci of widely differing electrophoretic type (from the reference collection of Caugant). Sequence variation between the meningococcal strains ranged from 0-0.6% for fbp, 0-1.3% for argF, and 0-3.3% for recA. These levels of diversity are no greater than those found within Escherichia coli 'housekeeping' genes and suggest that multilocus enzyme electrophoresis may overestimate the extent of nucleotide sequence diversity within meningococci. The average sequence divergence between the Neisseria meningitidis strains and N. gonorrhoeae strain FA19 was 1.0% for fbp and 1.6% for recA. The argF gene, although very uniform among the eight meningococcal isolates, had a striking mosaic structure when compared with the gonococcal argF gene: two regions of the gene differed by greater than 13% in nucleotide sequence between meningococci and gonococci, whereas the rest of the gene differed by less than 1.7%. One of the diverged regions was shown to have been introduced from the argF gene of a commensal Neisseria species that is closely related to Neisseria cinerea. The source of the other region was unclear.
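
    The percentage figures in this abstract are pairwise nucleotide divergences, and the mosaic structure of argF is the kind of pattern that shows up when divergence is computed in windows along an alignment rather than over the whole gene. The sketch below illustrates both calculations under stated assumptions: the sequences are invented, already aligned, and of equal length; gap columns are skipped; and the function names are hypothetical rather than taken from the study.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of aligned, non-gap positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared

def windowed_divergence(seq_a, seq_b, window, step):
    """Divergence in consecutive windows; returns (start, percent) pairs."""
    return [
        (start, percent_divergence(seq_a[start:start + window],
                                   seq_b[start:start + window]))
        for start in range(0, len(seq_a) - window + 1, step)
    ]

# Invented aligned sequences: identical first half, divergent second half
strain_1 = "AAAAAAAAAA" + "CCCCCCCCCC"
strain_2 = "AAAAAAAAAA" + "CCCCCGGGGG"
print(percent_divergence(strain_1, strain_2))
print(windowed_divergence(strain_1, strain_2, window=10, step=10))
```

    A gene-wide average can hide a block of high divergence; the windowed view separates a conserved region from a divergent one, which is the signature a recombinant (mosaic) gene would be expected to leave.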

  8. Angiosperm phylogeny inferred from sequences of four mitochondrial genes

    Institute of Scientific and Technical Information of China (English)

    Yin-Long QIU; Zhi-Duan CHEN; Libo LI; Bin WANG; Jia-Yu XUE; Tory A. HENDRY; Rui-Qi LI; Joseph W. BROWN; Yang LIU; Geordan T. HUDSON

    2010-01-01

    An angiosperm phylogeny was reconstructed in a maximum likelihood analysis of sequences of four mitochondrial genes, atp1, matR, nad5, and rps3, from 380 species that represent 376 genera and 296 families of seed plants. It is largely congruent with the phylogeny of angiosperms reconstructed from chloroplast genes atpB, matK, and rbcL, and nuclear 18S rDNA. The basalmost lineage consists of Amborella and Nymphaeales (including Hydatellaceae). Austrobaileyales follow this clade and are sister to the mesangiosperms, which include Chloranthaceae, Ceratophyllum, magnoliids, monocots, and eudicots. With the exception of Chloranthaceae being sister to Ceratophyllum, relationships among these five lineages are not well supported. In eudicots, Ranunculales, Sabiales, Proteales, Trochodendrales, Buxales, Gunnerales, Saxifragales, Vitales, Berberidopsidales, and Dilleniales form a basal grade of lines that diverged before the diversification of rosids and asterids. Within rosids, the COM (Celastrales-Oxalidales-Malpighiales) clade is sister to malvids (or rosid Ⅱ), instead of to the nitrogen-fixing clade as found in all previous large-scale molecular analyses of angiosperms. Santalales and Caryophyllales are members of an expanded asterid clade. This study shows that the mitochondrial genes are informative markers for resolving relationships among genera, families, or higher rank taxa across angiosperms. The low substitution rates and low homoplasy levels of the mitochondrial genes relative to the chloroplast genes, as found in this study, make them particularly useful for reconstructing ancient phylogenetic relationships. A mitochondrial gene-based angiosperm phylogeny provides an independent and essential reference for comparison with hypotheses of angiosperm phylogeny based on chloroplast genes, nuclear genes, and non-molecular data to reconstruct the underlying organismal phylogeny.

  9. Eugenol synthase genes in floral scent variation in Gymnadenia species.

    Science.gov (United States)

    Gupta, Alok K; Schauvinhold, Ines; Pichersky, Eran; Schiestl, Florian P

    2014-12-01

    Floral signaling, especially through floral scent, is often highly complex, and little is known about the molecular mechanisms and evolutionary causes of this complexity. In this study, we focused on the evolution of "floral scent genes" and the associated changes in their functions in three closely related orchid species of the genus Gymnadenia. We developed a benchmark repertoire of 2,571 expressed sequence tags (ESTs) in Gymnadenia odoratissima. For the functional characterization and evolutionary analysis, we focused on eugenol synthase, as eugenol is a widespread and important scent compound. We obtained complete coding complementary DNAs (cDNAs) of two copies of putative eugenol synthase genes in each of the three species. The proteins encoded by these cDNAs were characterized by expression and testing for activity in Escherichia coli. While G. odoratissima and Gymnadenia conopsea enzymes were found to catalyze the formation of eugenol only, the Gymnadenia densiflora proteins synthesize eugenol, as well as a smaller amount of isoeugenol. Finally, we showed that the eugenol and isoeugenol producing gene copies of G. densiflora are evolutionarily derived from the ancestral genes of the other species producing only eugenol. The evolutionary switch from production of one to two compounds evolved under relaxed purifying selection. In conclusion, our study shows the molecular bases of eugenol and isoeugenol production and suggests that an evolutionary transition in a single gene can lead to an increased complexity in floral scent emitted by plants.

  10. Impacts of Neanderthal-Introgressed Sequences on the Landscape of Human Gene Expression.

    Science.gov (United States)

    McCoy, Rajiv C; Wakefield, Jon; Akey, Joshua M

    2017-02-23

    Regulatory variation influencing gene expression is a key contributor to phenotypic diversity, both within and between species. Unfortunately, RNA degrades too rapidly to be recovered from fossil remains, limiting functional genomic insights about our extinct hominin relatives. Many Neanderthal sequences survive in modern humans due to ancient hybridization, providing an opportunity to assess their contributions to transcriptional variation and to test hypotheses about regulatory evolution. We developed a flexible Bayesian statistical approach to quantify allele-specific expression (ASE) in complex RNA-seq datasets. We identified widespread expression differences between Neanderthal and modern human alleles, indicating pervasive cis-regulatory impacts of introgression. Brain regions and testes exhibited significant downregulation of Neanderthal alleles relative to other tissues, consistent with natural selection influencing the tissue-specific regulatory landscape. Our study demonstrates that Neanderthal-inherited sequences are not silent remnants of ancient interbreeding but have measurable impacts on gene expression that contribute to variation in modern human phenotypes. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    Directory of Open Access Journals (Sweden)

    M. Ananda Chitra

    2015-07-01

    Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze agrA, B, and D of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR), and confirmed it by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, B, and D was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. Fifteen isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) by detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain

  12. Sequence variations of the hypervariable region of hepatitis C virus and their clinical significance

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Objective: To understand the clinical significance of sequence variations in the hypervariable region (HVR) of hepatitis C virus during infection. Methods: Eight patients with acute hepatitis C and 20 patients with chronic hepatitis C were followed up for two years. Blood samples were taken at intervals of six months for analysis of HCV-HVR sequences by reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing methods. Results: HCV-HVR sequences of the 28 patients changed to various degrees. 92% of these nucleotide substitutions led to changes of the corresponding amino acid sequence. Only 8% of the changed nucleotides were synonymous substitutions. Of the 27 amino acids, the number that varied ranged from 1 to 20 (mean 8, 30%). The most common nucleotide substitution (62%) occurred in the first position of the codon, 31% in the second and the rest in the third. The HVR variation rate was 0.89×10⁻¹ per genome site per year in acute hepatitis C, compared with 2.31×10⁻¹ per genome site per year in chronic hepatitis C (P<0.05), but had no relation to HCV subtype. Variation of HVR in the flare-up type (ALT>150 U/L) was much greater than that in the quiescent type (ALT<100 U/L). Conclusion: Our results suggest that sequence variation of HVR during chronic HCV infection seems to be an adaptive response of HCV to evade host immune pressure and might play a major role in the establishment of persistent infection as well as in the flare-up of hepatitis.

  13. Nuclear gene sequences from a late pleistocene sloth coprolite.

    Science.gov (United States)

    Poinar, Hendrik; Kuch, Melanie; McDonald, Gregory; Martin, Paul; Pääbo, Svante

    2003-07-01

    The determination of nuclear DNA sequences from ancient remains would open many novel opportunities such as the resolution of phylogenies, the sexing of hominid and animal remains, and the characterization of genes involved in phenotypic traits. However, to date, single-copy nuclear DNA sequences from fossils have been determined only from bones and teeth of woolly mammoths preserved in the permafrost. Since the best preserved ancient nucleic acids tend to stem from cold environments, this has led to the assumption that nuclear DNA would be retrievable only from frozen remains. We have previously shown that Pleistocene coprolites stemming from the extinct Shasta sloth (Nothrotheriops shastensis, Megatheriidae) contain mitochondrial (mt) DNA from the animal that produced them as well as chloroplast (cp) DNA from the ingested plants. Recent attempts to resolve the phylogeny of two families of extinct sloths by using strictly mitochondrial DNA have been inconclusive. We have prepared DNA extracts from a ground sloth coprolite from Gypsum Cave, Nevada, and quantitated the number of mtDNA copies for three different fragment lengths by using real-time PCR. We amplified one multicopy and three single-copy nuclear gene fragments and used the concatenated sequence to resolve the phylogeny. These results show that ancient single-copy nuclear DNA can be recovered from warm, arid climates. Thus, nuclear DNA preservation is not restricted to cold climates.

  14. Chromosomal Organization and Sequence Diversity of Genes Encoding Lachrymatory Factor Synthase in Allium cepa L.

    Science.gov (United States)

    Masamura, Noriya; McCallum, John; Khrustaleva, Ludmila; Kenel, Fernand; Pither-Joyce, Meegham; Shono, Jinji; Suzuki, Go; Mukai, Yasuhiko; Yamauchi, Naoki; Shigyo, Masayoshi

    2012-06-01

    Lachrymatory factor synthase (LFS) catalyzes the formation of lachrymatory factor, one of the most distinctive traits of bulb onion (Allium cepa L.). Therefore, we used LFS as a model for a functional gene in a huge genome, and we examined the chromosomal organization of LFS in A. cepa by multiple approaches. The first-level analysis completed the chromosomal assignment of LFS gene to chromosome 5 of A. cepa via the use of a complete set of A. fistulosum-shallot (A. cepa L. Aggregatum group) monosomic addition lines. Subsequent use of an F(2) mapping population from the interspecific cross A. cepa × A. roylei confirmed the assignment of an LFS locus to this chromosome. Sequence comparison of two BAC clones bearing LFS genes, LFS amplicons from diverse germplasm, and expressed sequences from a doubled haploid line revealed variation consistent with duplicated LFS genes. Furthermore, the BAC-FISH study using the two BAC clones as a probe showed that LFS genes are localized in the proximal region of the long arm of the chromosome. These results suggested that LFS in A. cepa is transcribed from at least two loci and that they are localized on chromosome 5.

  15. Cloning and sequence analysis of US1 gene in duck enteritis virus

    Institute of Scientific and Technical Information of China (English)

    ZHAO Yan; WANG Jun-wei; MA Bo; ZHAO Xiao-yan

    2011-01-01

    In this paper, a 1,860 bp sequence in the IRs region of duck enteritis virus (DEV) was amplified by single oligonucleotide nested PCR with a single primer designed according to the partial sequence of US1. A pair of primers designed according to the 3' UTR of the US8 gene and the 5' end of the newly obtained sequence was then used to amplify a 2,426 bp sequence toward the TRs region. Sequence analysis revealed that both sequences contained an identical 990 bp open reading frame of the DEV US1 gene. The two ORFs were in opposite transcriptional orientations. Comparison of the nucleotide sequence and the deduced amino acid sequence of the US1 gene showed relatively high identity to Mardivirus. Phylogenetic tree analysis showed that the eleven herpesviruses were classified into three groups, and that duck enteritis virus was most closely related to Mardivirus.

  16. Genetic variation of Taenia pisiformis collected from Sichuan, China, based on the mitochondrial cytochrome B gene.

    Science.gov (United States)

    Yang, Deying; Ren, Yongjun; Fu, Yan; Xie, Yue; Nie, Huaming; Nong, Xiang; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

    2013-08-01

    Taenia pisiformis is one of the most important parasites of canines and rabbits. T. pisiformis cysticercus (the larval stage) causes severe damage to rabbit breeding, which results in huge economic losses. In this study, the genetic variation of T. pisiformis was determined in Sichuan Province, China. Fragments of the mitochondrial cytochrome b (cytb) gene (922 bp) were amplified from 53 T. pisiformis isolates from 8 regions. Overall, 12 haplotypes were found in these 53 cytb sequences. Analysis of molecular genetic variation showed that 98.4% of the genetic variation was derived from within regions. FST and Nm values suggested that the 53 isolates were not genetically differentiated and had low levels of genetic diversity. Neutrality indices of the cytb sequences showed that the evolution of T. pisiformis followed a neutral mode. Phylogenetic analysis revealed no correlation between phylogeny and geographic distribution. These findings indicate that the 53 isolates of T. pisiformis maintain low genetic variation, which provides useful knowledge for monitoring changes in parasite populations for future control strategies.
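
    Counting haplotypes among aligned sequences, as in the 12 haplotypes found among 53 cytb sequences above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `haplotype_summary` and the toy sequences are hypothetical, sequences are assumed to be aligned and of equal length, and haplotype diversity is computed with the common estimator h = n/(n-1) × (1 - Σp_i²), which the abstract itself does not specify.

```python
from collections import Counter


def haplotype_summary(sequences):
    """Return (number of distinct haplotypes, haplotype diversity).

    Identical sequences are treated as the same haplotype. Diversity uses
    h = n/(n-1) * (1 - sum(p_i^2)), where p_i is each haplotype's frequency.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - sum_sq)
    return len(counts), diversity


# Hypothetical toy data: four isolates, two distinct haplotypes.
isolates = ["ACGT", "ACGT", "ACGA", "ACGT"]
n_haplotypes, h = haplotype_summary(isolates)
print(n_haplotypes, round(h, 3))
```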

  17. Diurnal variation of hepatic antioxidant gene expression in mice.

    Directory of Open Access Journals (Sweden)

    Yi-Qiao Xu

    Full Text Available BACKGROUND: This study aimed to examine circadian variations of hepatic antioxidant components, including the Nrf2 pathway, the glutathione (GSH) system, antioxidant enzymes and metallothionein in mouse liver. METHODS AND RESULTS: Adult mice were housed in light- and temperature-controlled facilities for 2 weeks, and livers were collected every 4 h during the 24 h period. Total RNA was isolated, purified, and subjected to real-time RT-PCR analysis. Hepatic mRNA levels of Nrf2, Keap1, Nqo1 and Gclc were higher in the light phase than in the dark phase, and were female-predominant. Hepatic GSH presented marked circadian fluctuations, along with glutathione S-transferases (GST-α1, GST-µ, GST-π) and glutathione peroxidase (GPx1). The expression of GPx1, GST-µ and GST-π mRNA was also higher in females. Antioxidant enzymes Cu/Zn superoxide dismutase (Sod1), catalase (CAT), cyclooxygenase-2 (Cox-2) and heme oxygenase-1 (Ho-1) showed circadian rhythms, with higher expression of Cox-2 and CAT in females. Metallothionein, a small non-enzymatic antioxidant protein, showed dramatic circadian variation in males, but higher expression in females. The circadian variations of the clock genes Brain and Muscle Arnt-like Protein-1 (Bmal1), albumin site D-binding protein (Dbp), nuclear receptor Rev-Erbα (Nr1d1), period proteins (Per1 and Per2) and cryptochrome 1 (Cry1) were in agreement with the literature. Furthermore, acetaminophen hepatotoxicity was more severe when the drug was administered in the afternoon, when hepatic GSH was lowest. CONCLUSIONS: Circadian variations and gender differences in transcript levels of antioxidant genes exist in mouse liver, which could affect body responses to oxidative stress at different times of the day.

  18. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    Directory of Open Access Journals (Sweden)

    Rachel Caldwell

    2015-01-01

    Full Text Available There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length.

  19. Molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) myostatin gene

    Directory of Open Access Journals (Sweden)

    Smith-Keune Carolyn

    2008-02-01

    Full Text Available Abstract Background Myostatin (MSTN) is a member of the transforming growth factor-β superfamily that negatively regulates growth of skeletal muscle tissue. The gene encoding the MSTN peptide is a consolidated candidate for the enhancement of productivity in terrestrial livestock. This gene potentially represents an important target for growth improvement of cultured finfish. Results Here we report molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) MSTN-1 gene. The barramundi MSTN-1 was encoded by three exons of 379, 371 and 381 bp in length and translated into a 376-amino acid peptide. Introns 1 and 2 were 412 and 819 bp in length and presented typical GT...AG splicing sites. The upstream region contained cis-regulatory elements such as a TATA-box and E-boxes. A first assessment of sequence variability suggested that higher mutation rates are found in the 5' flanking region, with several SNPs present in this species. A putative micro RNA target site has also been observed in the 3'UTR (untranslated region) and is highly conserved across teleost fish. The deduced amino acid sequence was conserved across vertebrates and exhibited characteristic conserved putative functional residues including a proteolytic cleavage motif (RXXR), nine cysteines and two glycosylation sites. A qualitative analysis of the barramundi MSTN-1 expression pattern revealed that, in adult fish, transcripts are differentially expressed in various tissues other than skeletal muscles including gill, heart, kidney, intestine, liver, spleen, eye, gonad and brain. Conclusion Our findings provide valuable insights such as sequence variation and genomic information which will aid the further investigation of the barramundi MSTN-1 gene in association with growth. The finding for the first time in finfish MSTN of a miRNA target site in the 3'UTR provides an opportunity for the identification of regulatory mutations on the

  20. Antigen-presenting genes and genomic copy number variations in the Tasmanian devil MHC

    Directory of Open Access Journals (Sweden)

    Cheng Yuanyuan

    2012-03-01

    Full Text Available Abstract Background The Tasmanian devil (Sarcophilus harrisii) is currently under threat of extinction due to an unusual fatal contagious cancer called Devil Facial Tumour Disease (DFTD). DFTD is caused by a clonal tumour cell line that is transmitted between unrelated individuals as an allograft without triggering immune rejection due to low levels of Major Histocompatibility Complex (MHC) diversity in Tasmanian devils. Results Here we report the characterization of the genomic regions encompassing MHC Class I and Class II genes in the Tasmanian devil. Four genomic regions approximately 960 kb in length were assembled and annotated using BAC contigs and physically mapped to devil Chromosome 4q. 34 genes and pseudogenes were identified, including five Class I and four Class II loci. Interestingly, when two haplotypes from two individuals were compared, three genomic copy number variants with sizes ranging from 1.6 to 17 kb were observed within the classical Class I gene region. One deletion is particularly important as it turns a Class Ia gene into a pseudogene in one of the haplotypes. This deletion explains the previously observed variation in the Class I allelic number between individuals. The frequency of this deletion is highest in the northwestern devil population and lowest in southeastern areas. Conclusions The third sequenced marsupial MHC provides insights into the evolution of this dynamic genomic region among the diverse marsupial species. The two sequenced devil MHC haplotypes revealed three copy number variations that are likely to significantly affect immune response and suggest that future work should focus on the role of copy number variations in disease susceptibility in this species.

  1. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    NARCIS (Netherlands)

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D.

    2012-01-01

    Background Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parame

  2. Identification of Nucleotide Variation in Genomes Using Next-Generation Sequencing

    NARCIS (Netherlands)

    Megens, H.J.W.C.; Groenen, M.A.M.

    2012-01-01

    Discovery of genome-wide variation has taken a huge leap forward with the introduction of next-generation sequencing (NGS) technology. Variant discovery requires sampling of a number of haplotypes. This can be either the two haplotypes of a diploid organism or multiple haplotypes in a population. Va

  3. Detailed analysis of sequence changes occurring during vlsE antigenic variation in the mouse model of Borrelia burgdorferi infection.

    Directory of Open Access Journals (Sweden)

    Loïc Coutte

    2009-02-01

    Full Text Available Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.

  4. Mining Association Rules in Dengue Gene Sequence with Latent Periodicity

    Directory of Open Access Journals (Sweden)

    Marimuthu Thangam

    2015-01-01

    Full Text Available The mining of periodic patterns in dengue databases is an interesting research problem that can be used for predicting the future evolution of dengue viruses. In this paper, we propose an algorithm called Recurrence Finder (RECFIN) that uses a suffix tree to detect periodic patterns in dengue gene sequences. RECFIN also finds palindromes, whose presence indicates the possibility of protein formation. Further, this paper computes the periodicity of nucleic acid and amino acid sequences of any length. The periodicity-based association rules are used to diagnose the type of dengue. The time complexity of the proposed algorithm is O(n^2). We demonstrate the effectiveness of the proposed approach by comparing experimental results on a dengue virus serotypes dataset with the NCBI-BLAST algorithm.
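
    To make the notion of a palindrome in a gene sequence concrete, the following Python sketch scans a DNA string for reverse-complement palindromes by brute force. It is not RECFIN and does not use a suffix tree; the function names, the window-length defaults, and the toy sequence are assumptions made purely for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=4, max_len=8):
    """List (start, window) pairs where the window equals its reverse complement.

    Only even window lengths are tried, because an odd-length window would
    need a middle base that is its own complement, which cannot happen.
    """
    hits = []
    first_even = min_len + (min_len % 2)
    for length in range(first_even, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits


# Hypothetical toy sequence containing the palindromic site GAATTC.
for start, site in find_palindromes("TTGAATTCAA"):
    print(start, site)
```

    This scan costs on the order of n × (number of window lengths) string comparisons; a suffix-tree approach such as the one the abstract describes is a different technique and is not reproduced here.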

  5. Variation within the Huntington's disease gene influences normal brain structure.

    Directory of Open Access Journals (Sweden)

    Mark Mühlau

    Full Text Available Genetics of the variability of normal and diseased brain structure largely remains to be elucidated. Expansions of certain trinucleotide repeats cause neurodegenerative disorders of which Huntington's disease constitutes the most common example. Here, we test the hypothesis that variation within the IT15 gene on chromosome 4, whose expansion causes Huntington's disease, influences normal human brain structure. In 278 normal subjects, we determined CAG repeat length within the IT15 gene on chromosome 4 and analyzed high-resolution T1-weighted magnetic resonance images by the use of voxel-based morphometry. We found an increase of GM with increasing long CAG repeat and its interaction with age within the pallidum, which is involved in Huntington's disease. Our study demonstrates that a certain trinucleotide repeat influences normal brain structure in humans. This result may have important implications for the understanding of both the healthy and diseased brain.
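
    As a small illustration of what determining CAG repeat length means at the sequence level, the Python sketch below reports the longest uninterrupted run of CAG triplets in a DNA string. Genotyping in the study would have been done by laboratory assay; the function name `longest_cag_run` and the toy sequence are hypothetical.

```python
import re


def longest_cag_run(seq):
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical toy sequence: one run of three CAGs and one run of two.
print(longest_cag_run("ATGCAGCAGCAGTTCAGCAG"))
```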

  6. Sequence analysis of the msp4 gene of Anaplasma ovis strains

    Science.gov (United States)

    de la Fuente, J.; Atkinson, M.W.; Naranjo, V.; Fernandez de Mera, I. G.; Mangold, A.J.; Keating, K.A.; Kocan, K.M.

    2007-01-01

    Anaplasma ovis (Rickettsiales: Anaplasmataceae) is a tick-borne pathogen of sheep, goats and wild ruminants. The genetic diversity of A. ovis strains has not been well characterized due to the lack of sequence information. In this study, we evaluated bighorn sheep (Ovis canadensis) and mule deer (Odocoileus hemionus) from Montana for infection with A. ovis by serology and sequence analysis of the msp4 gene. Antibodies to Anaplasma spp. were detected in 37% and 39% of bighorn sheep and mule deer analyzed, respectively. Four new msp4 genotypes were identified. The A. ovis msp4 sequences identified herein were analyzed together with sequences reported previously for the characterization of the genetic diversity of A. ovis strains in comparison with other Anaplasma spp. The results of these studies demonstrated that although A. ovis msp4 genotypes may vary among geographic regions and between sheep and deer hosts, the variation observed was less than the variation observed between A. marginale and A. phagocytophilum strains. The results reported herein further confirm that A. ovis infection occurs in natural wild ruminant populations in Western United States and that bighorn sheep and mule deer may serve as wildlife reservoirs of A. ovis. © 2006.

  7. Evidence that Natural Selection is the Primary Cause of the Guanine-cytosine Content Variation in Rice Genes

    Institute of Scientific and Technical Information of China (English)

    Xiaoli Shi; Xiyin Wang; Zhe Li; Qihui Zhu; Ji Yang; Song Ge; Jingchu Luo

    2007-01-01

    Cereal genes are classified into two distinct classes according to the guanine-cytosine (GC) content at the third codon sites (GC3). Natural selection and mutation bias have been proposed to affect the GC content. However, there has been controversy about the cause of GC variation. Here, we characterized the GC content of 1 092 paralogs and other single-copy genes in the duplicated chromosomal regions of the rice genome (ssp. indica) and classified the paralogs into GC3-rich and GC3-poor groups. By referring to out-group sequences from Arabidopsis and maize, we confirmed that the average synonymous substitution rate of the GC3-rich genes is significantly lower than that of the GC3-poor genes. Furthermore, we explored the other possible factors corresponding to the GC variation including the length of coding sequences, the number of exons in each gene, the number of genes in each family, the location of genes on chromosomes and the protein functions. Consequently, we propose that natural selection rather than mutation bias was the primary cause of the GC variation.
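
    GC3, the GC content at third codon sites, is straightforward to compute from an in-frame coding sequence. The Python sketch below shows one way to do it; the function name `gc3_content` and the toy sequence are hypothetical, and any threshold for calling a gene GC3-rich or GC3-poor is left out because the abstract does not state one.

```python
def gc3_content(cds):
    """Return the fraction of G or C bases at third codon positions.

    The input is assumed to be an in-frame coding sequence whose length
    is a non-zero multiple of three.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy CDS: codons ATG GCC AAG TAA have third bases G, C, G, A.
print(gc3_content("ATGGCCAAGTAA"))
```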

  8. Nucleotide Base Variation of Blast Disease Resistance Gene Pi33 in Rice Selected Broad Genetic Background

    Directory of Open Access Journals (Sweden)

    DWINITA WIKAN UTAMI

    2011-09-01

    Full Text Available Rice is one of the most important crops for human beings, so increases in productivity are continually pursued. Blast disease can reduce the productivity of rice cultivation. Therefore, breeding programs for blast-resistant varieties need to be carried out effectively. One broad-spectrum blast resistance gene is Pi33. This study aimed to identify variation in the nucleotide sequence of the Pi33 gene in five interspecific lines derived from a cross between Bio46 (IR64/Oryza rufipogon) and CT13432. DNA of the five rice lines was amplified using the Pi33-specific primer G1010. Amplification products were purified using Exonuclease 1 and Shrimp Alkaline Phosphatase protocols. Labelling with fluorescent dyes was done before nucleotide sequencing on a CEQ8000 instrument. The results showed that line number 28 showed introgression from the three control parent genomes (subspecies Indica, subspecies Japonica, and O. rufipogon), while lines number 79, 136, and 143 were identical to the Indica genome. Line number 195 was identical to the Japonica genome. These lines with a broad genetic background promise durable resistance against the dynamic blast pathogen. Ortholog sequence analysis found a conserved nucleotide sequence (CAGCAGCC) involved in the heterotrimeric G-protein group. This protein acts as a plant receptor that recognizes the pathogen elicitor in the interaction between rice and the blast pathogen.

  9. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    Science.gov (United States)

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  10. High throughput 16S rRNA gene amplicon sequencing

    DEFF Research Database (Denmark)

    Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup

    16S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r...... to the presence of filamentous microorganisms was monitored weekly over 4 months. Microthrix was identified as a causative filament and suitable control measures were introduced. The level of Microthrix was reduced after 1-2 months but a number of other filamentous species were still present, with most of them...

  11. Reprint of "Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray)".

    Science.gov (United States)

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-01-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  12. A sequence-based approach to identify reference genes for gene expression analysis

    Directory of Open Access Journals (Sweden)

    Chari Raj

    2010-08-01

    Full Text Available Abstract Background An important consideration when analyzing both microarray and quantitative PCR expression data is the selection of appropriate genes as endogenous controls or reference genes. This step is especially critical when identifying genes differentially expressed between datasets. Moreover, reference genes suitable in one context (e.g. lung cancer) may not be suitable in another (e.g. breast cancer). Currently, the main approach to identify reference genes involves the mining of expression microarray data for highly expressed and relatively constant transcripts across a sample set. A caveat here is the requirement for transcript normalization prior to analysis, and measurements obtained are relative, not absolute. Alternatively, as sequencing-based technologies provide digital quantitative output, absolute quantification ensues, and reference gene identification becomes more accurate. Methods Serial analysis of gene expression (SAGE) profiles of non-malignant and malignant lung samples were compared using a permutation test to identify the most stably expressed genes across all samples. Subsequently, the specificity of the reference genes was evaluated across multiple tissue types, their constancy of expression was assessed using quantitative RT-PCR (qPCR), and their impact on differential expression analysis of microarray data was evaluated. Results We show that (i) conventional reference genes such as ACTB and GAPDH are highly variable between cancerous and non-cancerous samples, (ii) reference genes identified for lung cancer do not perform well for other cancer types (breast and brain), (iii) reference genes identified through SAGE show low variability using qPCR in a different cohort of samples, and (iv) normalization of a lung cancer gene expression microarray dataset with or without our reference genes yields different results for differential gene expression and subsequent analyses. Specifically, key established pathways in lung

  13. Mitochondrial sequence variation in African-American primary open-angle glaucoma patients.

    Directory of Open Access Journals (Sweden)

    David W Collins

    Full Text Available Primary open-angle glaucoma (POAG) is a major cause of blindness and results from irreversible retinal ganglion cell damage and optic nerve degeneration. In the United States, POAG is most prevalent in African-Americans. Mitochondrial genetics and dysfunction have been implicated in POAG, and potentially pathogenic sequence variations, in particular novel transversional base substitutions, are reportedly common in mitochondrial genomes (mtDNA) from POAG patient blood. The purpose of this study was to ascertain the spectrum of sequence variation in mtDNA from African-American POAG patients and determine whether novel nonsynonymous, transversional or other potentially pathogenic sequence variations are observed more commonly in POAG cases than controls. mtDNA from African-American POAG cases (n = 22) and age-matched controls (n = 22) was analyzed by deep sequencing of a single 16,487 base pair PCR amplicon by Ion Torrent, and candidate novel variants were validated by Sanger sequencing. Sequence variants were classified and interpreted using the MITOMAP compendium of polymorphisms. 99.8% of the observed variations had been previously reported. The ratio of novel variants to POAG cases was 7-fold lower than a prior estimate. Novel mtDNA variants were present in 3 of 22 cases, novel nonsynonymous changes in 1 of 22 cases and novel transversions in 0 of 22 cases; these proportions are significantly lower (p<.0005, p<.0004, p<.0001) than estimated previously for POAG, and did not differ significantly from controls. Although it is possible that mitochondrial genetics play a role in African-Americans' high susceptibility to POAG, it is unlikely that any mitochondrial respiratory dysfunction is due to an abnormally high incidence of novel mutations that can be detected in mtDNA from peripheral blood.
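
    The distinction between transitions and transversions that the abstract relies on can be stated in a few lines of Python: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and a substitution between the two classes is a transversion. The function name `classify_substitution` is hypothetical, and this sketch is not the MITOMAP-based classification the authors used.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    # Both bases in the same chemical class means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


print(classify_substitution("A", "G"))  # purine to purine
print(classify_substitution("A", "C"))  # purine to pyrimidine
```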

  14. Natural variation in the Pto pathogen resistance gene within species of wild tomato (Lycopersicon). I. Functional analysis of Pto alleles.

    Science.gov (United States)

    Rose, Laura E; Langley, Charles H; Bernal, Adriana J; Michelmore, Richard W

    2005-09-01

    Disease resistance to the bacterial pathogen Pseudomonas syringae pv. tomato (Pst) in the cultivated tomato, Lycopersicon esculentum, and the closely related L. pimpinellifolium is triggered by the physical interaction between plant disease resistance protein, Pto, and the pathogen avirulence protein, AvrPto. To investigate the extent to which variation in the Pto gene is responsible for naturally occurring variation in resistance to Pst, we determined the resistance phenotype of 51 accessions from seven species of Lycopersicon to isogenic strains of Pst differing in the presence of avrPto. One-third of the plants displayed resistance specifically when the pathogen expressed AvrPto, consistent with a gene-for-gene interaction. To test whether this resistance in these species was conferred specifically by the Pto gene, alleles of Pto were amplified and sequenced from 49 individuals and a subset (16) of these alleles was tested in planta using Agrobacterium-mediated transient assays. Eleven alleles conferred a hypersensitive resistance response (HR) in the presence of AvrPto, while 5 did not. Ten amino acid substitutions associated with the absence of AvrPto recognition and HR were identified, none of which had been identified in previous structure-function studies. Additionally, 3 alleles encoding putative pseudogenes of Pto were isolated from two species of Lycopersicon. Therefore, a large proportion, but not all, of the natural variation in the reaction to strains of Pst expressing AvrPto can be attributed to sequence variation in the Pto gene.

  15. AGTR1 gene variation: association with depression and frontotemporal morphology.

    Science.gov (United States)

    Taylor, Warren D; Benjamin, Sophiya; McQuoid, Douglas R; Payne, Martha E; Krishnan, Ranga R; MacFall, James R; Ashley-Koch, Allison

    2012-05-31

    The renin-angiotensin system (RAS) is implicated in the response to physiological and psychosocial stressors, but its role in stress-related psychiatric disorders is poorly understood. We examined if variation in AGTR1, the gene coding for the type 1 angiotensin II receptor (AT(1)R), is associated with a diagnosis of depression and differences in white matter hyperintensities and frontotemporal brain volumes. Participants comprised 257 depressed and 116 nondepressed elderly Caucasian subjects who completed clinical assessments and provided blood samples for genotyping. We used a haplotype-tagging single nucleotide polymorphism (htSNP) analysis to test for variation in AGTR1. For measurement of hyperintense lesions, 1.5 Tesla magnetic resonance imaging (MRI) data were available on 33 subjects. For measurements of the hippocampus and dorsolateral prefrontal cortex (dlPFC), 3 Tesla MRI data were available on 70 subjects. Two htSNPs exhibited statistically significant frequency differences between diagnostic cohorts: rs10935724 and rs12721331. Although hyperintense lesion volume did not significantly differ by any htSNP, dlPFC and hippocampus volume differed significantly for several htSNPs. Intriguingly, for those htSNPs differing significantly for both dlPFC and hippocampus volume, the variant associated with smaller dlPFC volume was associated with larger hippocampal volume. This supports the idea that genetic variation in AGTR1 is associated with depression and differences in frontotemporal morphology. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  16. Genetic variation at hair length candidate genes in elephants and the extinct woolly mammoth

    Directory of Open Access Journals (Sweden)

    Tisdale Michele

    2009-09-01

    Background: Like humans, the living elephants are unusual among mammals in being sparsely covered with hair. Relative to extant elephants, the extinct woolly mammoth, Mammuthus primigenius, had a dense hair cover and extremely long hair, which likely were adaptations to its subarctic habitat. The fibroblast growth factor 5 (FGF5) gene affects hair length in a diverse set of mammalian species. Mutations in FGF5 lead to recessive long hair phenotypes in mice, dogs, and cats; and the gene has been implicated in hair length variation in rabbits. Thus, FGF5 represents a leading candidate gene for the phenotypic differences in hair length notable between extant elephants and the woolly mammoth. We therefore sequenced the three exons (except for the 3' UTR) and a portion of the promoter of FGF5 from the living elephantid species (Asian, African savanna and African forest elephants) and, using protocols for ancient DNA, from a woolly mammoth. Results: Between the extant elephants and the mammoth, two single base substitutions were observed in FGF5, neither of which alters the amino acid sequence. Modeling of the protein structure suggests that the elephantid proteins fold similarly to the human FGF5 protein. Bioinformatics analyses and DNA sequencing of another locus that has been implicated in hair cover in humans, type I hair keratin pseudogene (KRTHAP1), also yielded negative results. Interestingly, KRTHAP1 is a pseudogene in elephantids as in humans (although fully functional in non-human primates). Conclusion: The data suggest that the coding sequence of the FGF5 gene is not the critical determinant of hair length differences among elephantids. The results are discussed in the context of hairlessness among mammals and in terms of the potential impact of large body size, subarctic conditions, and an aquatic ancestor on hair cover in the Proboscidea.

  17. A comparison of intraspecific patterns of DNA sequence variation in mitochondrial DNA, alpha-enolase, and MHC class II B loci in auklets (Charadriiformes: Alcidae).

    Science.gov (United States)

    Walsh, Hollie E; Friesen, Vicki L

    2003-12-01

    Patterns of DNA sequence variation can be used to learn about mechanisms of organismal evolution, but only if mechanisms of sequence evolution are well understood. Although theories of molecular evolution are well developed, few empirical studies have addressed patterns and mechanisms of sequence evolution in nuclear genes within species. In the present study, we compared DNA sequences among three loci with different evolutionary constraints to determine the influences of effective population size, balancing selection, and linkage on intraspecific patterns of sequence variation. Specifically, we assessed the degree and nature of polymorphism in a 307-base pair (bp) fragment of the mitochondrial cytochrome b gene, intron VIII of the gene for alpha-enolase (a presumably neutral nuclear gene), and an approximately 600-bp fragment of an MHC class II B gene, including 155 bp of the hypervariable peptide binding region (a nuclear locus thought to be under balancing selection) for least and crested auklets (Aethia pusilla and A. cristatella; Charadriiformes: Alcidae). Transspecies polymorphism was found in both alpha-enolase and the MHC but not cytochrome b and, given estimates of effective population size, probably represents retained ancestral variation. Biases in nucleotide composition suggested that mutational bias, tRNA availability, and the secondary structure of mRNA and/or DNA may influence base usage. Several lines of evidence indicated that balancing selection may be acting on the MHC II B exon 2. However, no evidence of balancing selection was observed in the intron and exon sequences immediately downstream of MHC II B exon 2.

  18. Genes contributing to pain sensitivity in the normal population: an exome sequencing study.

    Science.gov (United States)

    Williams, Frances M K; Scollen, Serena; Cao, Dandan; Memari, Yasin; Hyde, Craig L; Zhang, Baohong; Sidders, Benjamin; Ziemek, Daniel; Shi, Yujian; Harris, Juliette; Harrow, Ian; Dougherty, Brian; Malarstig, Anders; McEwen, Robert; Stephens, Joel C; Patel, Ketan; Menni, Cristina; Shin, So-Youn; Hodgkiss, Dylan; Surdulescu, Gabriela; He, Wen; Jin, Xin; McMahon, Stephen B; Soranzo, Nicole; John, Sally; Wang, Jun; Spector, Tim D

    2012-01-01

    Sensitivity to pain varies considerably between individuals and is known to be heritable. Increased sensitivity to experimental pain is a risk factor for developing chronic pain, a common and debilitating but poorly understood symptom. To understand mechanisms underlying pain sensitivity and to search for rare gene variants (MAF<5%) influencing pain sensitivity, we explored the genetic variation in individuals' responses to experimental pain. Quantitative sensory testing to heat pain was performed in 2,500 volunteers from TwinsUK (TUK); exome sequencing to a depth of 70× was carried out on DNA from singletons at the high and low ends of the heat pain sensitivity distribution in two separate subsamples. Thus in TUK1, 101 pain-sensitive and 102 pain-insensitive individuals were examined, while in TUK2 there were 114 and 96 individuals respectively. A combination of methods was used to test the association between rare variants and pain sensitivity, and the function of the genes identified was explored using network analysis. Using causal reasoning analysis on the genes with different patterns of SNVs by pain sensitivity status, we observed a significant enrichment of variants in genes of the angiotensin pathway (Bonferroni corrected p = 3.8×10^-4). This pathway is already implicated in animal models and human studies of pain, supporting the notion that it may provide fruitful new targets in pain management. The approach of sequencing extreme exome variation in normal individuals has provided important insights into gene networks mediating pain sensitivity in humans and will be applicable to other common complex traits.

  19. Phylogenetic relationships among Linguatula serrata isolates from Iran based on 18S rRNA and mitochondrial cox1 gene sequences.

    Science.gov (United States)

    Ghorashi, Seyed Ali; Tavassoli, Mousa; Peters, Andrew; Shamsi, Shokoofeh; Hajipour, Naser

    2016-01-01

    The phylogenetic relationships among seven Linguatula serrata (L. serrata) isolates collected from cattle, goats, sheep, dogs and camels in different geographical locations of Iran were investigated using partial 18S ribosomal RNA (rRNA) and partial mitochondrial cytochrome c oxidase subunit 1 (cox1) gene sequences. The nucleotide sequences were analysed in order to determine the phylogenetic relationships between the isolates. Higher sequence diversity and intraspecies variation was observed in the cox1 gene compared to 18S rRNA sequences. Phylogenetic analysis of the cox1 gene placed all L. serrata isolates in a sister clade to L. arctica. The Mantel regression analysis revealed no association between genetic variations and host species or geographical location, perhaps due to the small sample size. However, genetic variations between L. serrata isolates in Iran and those isolated in other parts of the world may exist and could reveal possible evolutionary relationships.

  20. Targeted capture and resequencing of 1040 genes reveal environmentally driven functional variation in grey wolves.

    Science.gov (United States)

    Schweizer, Rena M; Robinson, Jacqueline; Harrigan, Ryan; Silva, Pedro; Galverni, Marco; Musiani, Marco; Green, Richard E; Novembre, John; Wayne, Robert K

    2016-01-01

    In an era of ever-increasing amounts of whole-genome sequence data for individuals and populations, the utility of traditional single nucleotide polymorphisms (SNPs) array-based genome scans is uncertain. We previously performed a SNP array-based genome scan to identify candidate genes under selection in six distinct grey wolf (Canis lupus) ecotypes. Using this information, we designed a targeted capture array for 1040 genes, including all exons and flanking regions, as well as 5000 1-kb nongenic neutral regions, and resequenced these regions in 107 wolves. Selection tests revealed striking patterns of variation within candidate genes relative to noncandidate regions and identified potentially functional variants related to local adaptation. We found 27% and 47% of candidate genes from the previous SNP array study had functional changes that were outliers in sweed and bayenv analyses, respectively. This result verifies the use of genomewide SNP surveys to tag genes that contain functional variants between populations. We highlight nonsynonymous variants in APOB, LIPG and USH2A that occur in functional domains of these proteins, and that demonstrate high correlation with precipitation seasonality and vegetation. We find Arctic and High Arctic wolf ecotypes have higher numbers of genes under selection, which highlight their conservation value and heightened threat due to climate change. This study demonstrates that combining genomewide genotyping arrays with large-scale resequencing and environmental data provides a powerful approach to discern candidate functional variants in natural populations. © 2015 John Wiley & Sons Ltd.

  1. Clinal variation at phenology-related genes in spruce: parallel evolution in FTL2 and Gigantea?

    Science.gov (United States)

    Chen, Jun; Tsuda, Yoshiaki; Stocks, Michael; Källman, Thomas; Xu, Nannan; Kärkkäinen, Katri; Huotari, Tea; Semerikov, Vladimir L; Vendramin, Giovanni G; Lascoux, Martin

    2014-07-01

    Parallel clines in different species, or in different geographical regions of the same species, are an important source of information on the genetic basis of local adaptation. We recently detected latitudinal clines in SNP frequencies and gene expression of candidate genes for growth cessation in Scandinavian populations of Norway spruce (Picea abies). Here we test whether the same clines are also present in Siberian spruce (P. obovata), a close relative of Norway spruce with a different Quaternary history. We sequenced nine candidate genes and 27 control loci and genotyped 14 SSR loci in six populations of P. obovata located along the Yenisei river from latitude 56°N to latitude 67°N. In contrast to Scandinavian Norway spruce, which both departs from the standard neutral model (SNM) and shows a clear population structure, Siberian spruce populations along the Yenisei do not depart from the SNM and are genetically unstructured. Nonetheless, as in Norway spruce, growth cessation is significantly clinal. Polymorphisms in photoperiodic (FTL2) and circadian clock (Gigantea, GI, PRR3) genes also show significant clinal variation and/or evidence of local selection. In GI, one of the variants is the same as in Norway spruce. Finally, a strong cline in gene expression is observed for FTL2, but not for GI. These results, together with recent physiological studies, confirm the key role played by FTL2 and circadian clock genes in the control of growth cessation in spruce species and suggest the presence of parallel adaptation in these two species.

  2. Molecular chaperone genes in the sugarcane expressed sequence database (SUCEST)

    Directory of Open Access Journals (Sweden)

    Júlio C. Borges

    2001-12-01

    Some newly synthesized proteins require the assistance of molecular chaperones for their correct folding. Chaperones are also involved in the dissolution of protein aggregates, making their study significant for both biotechnology and medicine, and the identification of chaperones and stress-related protein sequences in different organisms is an important task. We used bioinformatic tools to investigate the information generated by the Sugarcane Expressed Sequence Tag (SUCEST) genome project in order to identify and annotate molecular chaperones. We considered that the SUCEST sequences belonged to this category of proteins when their E-values were lower than 1.0e-05. Our annotation shows that 4,164 of the 5' expressed sequence tag (EST) sequences were homologous to molecular chaperones, nearly 1.8% of all the 5' ESTs sequenced during the SUCEST project. About 43% of the chaperones which we found were Hsp70 chaperones and their co-chaperones, 10% were Hsp90 chaperones and 13% were peptidyl-prolyl cis, trans isomerases. Based on the annotation results we predicted 156 different chaperone gene subclasses in the sugarcane genome. Taken together, our results indicate that genes which encode chaperones were diverse and abundantly expressed in sugarcane cells, which emphasizes their biological importance.
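
    The E-value rule described in this abstract (a sequence is counted as a chaperone homolog only when its E-value is lower than 1.0e-05) can be illustrated with a minimal sketch. The `Hit` record, the EST identifiers and the E-values below are hypothetical and are not taken from SUCEST; only the cutoff comes from the abstract.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1.0e-05  # threshold stated in the abstract


@dataclass
class Hit:
    est_id: str      # hypothetical EST identifier
    family: str      # hypothetical chaperone family label
    e_value: float   # E-value of the best database match


def chaperone_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep only hits whose E-value is strictly lower than the cutoff."""
    return [h for h in hits if h.e_value < cutoff]


# Illustrative data only (assumed, not from the study).
hits = [
    Hit("EST_0001", "Hsp70", 3.0e-42),
    Hit("EST_0002", "Hsp90", 1.0e-05),   # equal to the cutoff, so excluded
    Hit("EST_0003", "PPIase", 2.5e-03),  # too weak a match, excluded
    Hit("EST_0004", "Hsp70", 8.0e-11),
]

kept = chaperone_hits(hits)
print([h.est_id for h in kept])
```

    The strict comparison mirrors the wording "lower than"; whether the original pipeline treated a hit exactly at the cutoff as a match is not stated.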

  3. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  4. AIB1 gene amplification and the instability of polyQ encoding sequence in breast cancer cell lines

    Directory of Open Access Journals (Sweden)

    Clarke Robert

    2006-05-01

    Background: The poly Q polymorphism in the AIB1 (amplified in breast cancer) gene is usually assessed by fragment length analysis, which does not reveal the actual sequence variation. The purpose of this study is to investigate the sequence variation of the poly Q encoding region in breast cancer cell lines at the single molecule level, and to determine if the sequence variation is related to AIB1 gene amplification. Methods: The polymorphic poly Q encoding region of the AIB1 gene was investigated at the single molecule level by PCR cloning/sequencing. The amplification of the AIB1 gene in various breast cancer cell lines was studied by real-time quantitative PCR. Results: Significant amplifications (5–23 fold) of the AIB1 gene were found in 2 out of 9 (22%) ER positive cell lines (in BT-474 and MCF-7 but not in BT-20, ZR-75-1, T47D, BT483, MDA-MB-361, MDA-MB-468 and MDA-MB-330). The AIB1 gene was not amplified in any of the ER negative cell lines. Different passages of MCF-7 cell lines and their derivatives maintained the feature of AIB1 amplification. When the cells were selected for hormone independence (LCC1) and resistance to 4-hydroxy tamoxifen (4-OH TAM) (LCC2 and R27), ICI 182,780 (LCC9) or 4-OH TAM, KEO and LY 117018 (LY-2), AIB1 copy number decreased but still remained highly amplified. Sequencing analysis of the poly Q encoding region of the AIB1 gene did not reveal specific patterns that could be correlated with AIB1 gene amplification. However, about 72% of the breast cancer cell lines had at least one under represented (3CAA(CAG9(CAACAG3(CAACAGCAG2CAA of the original cell line, a number of altered poly Q encoding sequences were found in the derivatives of MCF-7 cell lines. Conclusion: These data suggest that the poly Q encoding region of the AIB1 gene is somatically unstable in breast cancer cell lines. The instability and the sequence characteristics, however, do not appear to be associated with the level of the gene amplification.
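
    The point that fragment length analysis does not reveal the actual sequence variation follows from the genetic code: CAG and CAA both encode glutamine (Q), so two alleles can have the same length and the same poly Q tract while differing in codon order. A minimal sketch, using two hypothetical six-codon alleles that are not taken from the study:

```python
GLN_CODONS = {"CAG", "CAA"}  # both codons encode glutamine (Q)


def codons(seq):
    """Split a DNA string into complete, in-frame codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def encodes_only_glutamine(seq):
    """True if every complete codon in seq is CAG or CAA."""
    return all(c in GLN_CODONS for c in codons(seq))


# Hypothetical alleles for illustration only.
allele_a = "CAG" * 6
allele_b = "CAG" * 3 + "CAA" + "CAG" * 2

same_length = len(allele_a) == len(allele_b)    # what fragment sizing sees
same_sequence = allele_a == allele_b            # what sequencing sees
print(same_length, same_sequence)
print(encodes_only_glutamine(allele_a), encodes_only_glutamine(allele_b))
```

    Both alleles would yield the same fragment size and the same six-glutamine tract, yet cloning and sequencing single molecules distinguishes them.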

  5. High intraindividual variation in internal transcribed spacer sequences in Aeschynanthus (Gesneriaceae): implications for phylogenetics.

    Science.gov (United States)

    Denduangboripant, J; Cronk, Q C

    2000-07-22

    Aeschynanthus (Gesneriaceae) is a large genus of tropical epiphytes that is widely distributed from the Himalayas and China throughout South-East Asia to New Guinea and the Solomon Islands. Polymerase chain reaction (PCR) consensus sequences of the internal transcribed spacers (ITS) of Aeschynanthus nuclear ribosomal DNA showed sequence polymorphism that was difficult to interpret. Cloning individual sequences from the PCR product generated a phylogenetic tree of 23 Aeschynanthus species (two clones per species). The intraindividual clone pairs varied from 0 to 5.01%. We suggest that the high intraindividual sequence variation results from low molecular drive in the ITS of Aeschynanthus. However, this study shows that, despite the variation found within some individuals, it is still possible to use these data to reconstruct phylogenetic relationships of the species, suggesting that clone variation, although persistent, does not pre-date the divergence of Aeschynanthus species. The Aeschynanthus analysis revealed two major clades with different but overlapping geographic distributions and reflected classification based on morphology (particularly seed hair type).
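
    A divergence figure such as the 0 to 5.01% range reported for clone pairs can be read as a percentage of differing sites between two aligned sequences. The abstract does not say which distance measure was used, so the sketch below assumes a simple uncorrected p-distance that skips gapped positions; the two 20-nt sequences are hypothetical.

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance (as a percentage) between two aligned sequences.

    Positions where either sequence has a gap are ignored (an assumption;
    other conventions count gaps as differences).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Hypothetical aligned clone pair: one substitution in 20 sites.
clone_1 = "ACGTACGTACGTACGTACGT"
clone_2 = "ACGTACGTACGAACGTACGT"
print(percent_difference(clone_1, clone_2))
```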

  6. Whole-exome sequencing for the identification of susceptibility genes of Kashin-Beck disease.

    Directory of Open Access Journals (Sweden)

    Zhenxing Yang

    OBJECTIVE: To identify and investigate the susceptibility genes of Kashin-Beck disease (KBD) in the Chinese population. METHODS: Whole-exome capturing and sequencing technology was used for the detection of genetic variations in 19 individuals from six families with high incidence of KBD. A total of 44 polymorphisms from 41 genes were genotyped in 144 cases and 144 controls by using MassARRAY under the standard protocol from Sequenom. Association analysis was performed on the data using PLINK 1.07. RESULTS: In the sequencing stage, each sample showed approximately 70-fold coverage, thus covering more than 99% of the target regions. Among the single nucleotide polymorphisms (SNPs) used in the transmission disequilibrium test, 108 had a p-value of <0.01, whereas 1056 had a p-value of <0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicates that these SNPs are concentrated in three major pathways: regulation of actin cytoskeleton, focal adhesion, and metabolic pathways. In the validation stage, single locus effects revealed that two of these polymorphisms (rs7745040 and rs9275295) in the human leukocyte antigen (HLA)-DRB1 gene and one polymorphism (rs9473132) in the CD2-associated protein (CD2AP) gene have a statistically significant association with KBD. CONCLUSIONS: The HLA-DRB1 and CD2AP genes were identified to be among the susceptibility genes of KBD, thus supporting the role of the autoimmune response in KBD and the possibility of shared etiology between osteoarthritis, rheumatoid arthritis, and KBD.

  7. Variations in ORAI1 Gene Associated with Kawasaki Disease.

    Science.gov (United States)

    Onouchi, Yoshihiro; Fukazawa, Ryuji; Yamamura, Kenichiro; Suzuki, Hiroyuki; Kakimoto, Nobuyuki; Suenaga, Tomohiro; Takeuchi, Takashi; Hamada, Hiromichi; Honda, Takafumi; Yasukawa, Kumi; Terai, Masaru; Ebata, Ryota; Higashi, Kouji; Saji, Tsutomu; Kemmotsu, Yasushi; Takatsuki, Shinichi; Ouchi, Kazunobu; Kishi, Fumio; Yoshikawa, Tetsushi; Nagai, Toshiro; Hamamoto, Kunihiro; Sato, Yoshitake; Honda, Akihito; Kobayashi, Hironobu; Sato, Junichi; Shibuta, Shoichi; Miyawaki, Masakazu; Oishi, Ko; Yamaga, Hironobu; Aoyagi, Noriyuki; Yoshiyama, Megumi; Miyashita, Ritsuko; Murata, Yuji; Fujino, Akihiro; Ozaki, Kouichi; Kawasaki, Tomisaku; Abe, Jun; Seki, Mitsuru; Kobayashi, Tohru; Arakawa, Hirokazu; Ogawa, Shunichi; Hara, Toshiro; Hata, Akira; Tanaka, Toshihiro

    2016-01-01

    Kawasaki disease (KD; MIM#61175) is a systemic vasculitis syndrome with unknown etiology which predominantly affects infants and children. Recent findings of susceptibility genes for KD suggest possible involvement of the Ca(2+)/NFAT pathway in the pathogenesis of KD. ORAI1 is a Ca(2+) release activated Ca(2+) (CRAC) channel mediating store-operated Ca(2+) entry (SOCE) on the plasma membrane. The gene for ORAI1 is located in chromosome 12q24 where a positive linkage signal was observed in our previous affected sib-pair study of KD. A common non-synonymous single nucleotide polymorphism located within exon 2 of ORAI1 (rs3741596) was significantly associated with KD (P = 0.028 in the discovery sample set (729 KD cases and 1,315 controls), P = 0.0056 in the replication sample set (1,813 KD cases vs. 1,097 controls) and P = 0.00041 in a meta-analysis by the Mantel-Haenszel method). Interestingly, frequency of the risk allele of rs3741596 is more than 20 times higher in Japanese compared to Europeans. We also found a rare 6 base-pair in-frame insertion variant associated with KD (rs141919534; 2,544 KD cases vs. 2,414 controls, P = 0.012). These data indicate that ORAI1 gene variations are associated with KD and may suggest the potential importance of the Ca(2+)/NFAT pathway in the pathogenesis of this disorder.

  8. Variations in ORAI1 Gene Associated with Kawasaki Disease.

    Directory of Open Access Journals (Sweden)

    Yoshihiro Onouchi

    Kawasaki disease (KD; MIM#61175) is a systemic vasculitis syndrome with unknown etiology which predominantly affects infants and children. Recent findings of susceptibility genes for KD suggest possible involvement of the Ca(2+)/NFAT pathway in the pathogenesis of KD. ORAI1 is a Ca(2+) release activated Ca(2+) (CRAC) channel mediating store-operated Ca(2+) entry (SOCE) on the plasma membrane. The gene for ORAI1 is located in chromosome 12q24 where a positive linkage signal was observed in our previous affected sib-pair study of KD. A common non-synonymous single nucleotide polymorphism located within exon 2 of ORAI1 (rs3741596) was significantly associated with KD (P = 0.028 in the discovery sample set (729 KD cases and 1,315 controls), P = 0.0056 in the replication sample set (1,813 KD cases vs. 1,097 controls) and P = 0.00041 in a meta-analysis by the Mantel-Haenszel method). Interestingly, frequency of the risk allele of rs3741596 is more than 20 times higher in Japanese compared to Europeans. We also found a rare 6 base-pair in-frame insertion variant associated with KD (rs141919534; 2,544 KD cases vs. 2,414 controls, P = 0.012). These data indicate that ORAI1 gene variations are associated with KD and may suggest the potential importance of the Ca(2+)/NFAT pathway in the pathogenesis of this disorder.

  9. Variations in ORAI1 Gene Associated with Kawasaki Disease

    Science.gov (United States)

    Suzuki, Hiroyuki; Kakimoto, Nobuyuki; Suenaga, Tomohiro; Takeuchi, Takashi; Hamada, Hiromichi; Honda, Takafumi; Yasukawa, Kumi; Terai, Masaru; Ebata, Ryota; Higashi, Kouji; Saji, Tsutomu; Kemmotsu, Yasushi; Takatsuki, Shinichi; Ouchi, Kazunobu; Kishi, Fumio; Yoshikawa, Tetsushi; Nagai, Toshiro; Hamamoto, Kunihiro; Sato, Yoshitake; Honda, Akihito; Kobayashi, Hironobu; Sato, Junichi; Shibuta, Shoichi; Miyawaki, Masakazu; Oishi, Ko; Yamaga, Hironobu; Aoyagi, Noriyuki; Yoshiyama, Megumi; Miyashita, Ritsuko; Murata, Yuji; Fujino, Akihiro; Ozaki, Kouichi; Kawasaki, Tomisaku; Abe, Jun; Seki, Mitsuru; Kobayashi, Tohru; Arakawa, Hirokazu; Ogawa, Shunichi; Hara, Toshiro; Hata, Akira; Tanaka, Toshihiro

    2016-01-01

    Kawasaki disease (KD; MIM#61175) is a systemic vasculitis syndrome with unknown etiology which predominantly affects infants and children. Recent findings of susceptibility genes for KD suggest possible involvement of the Ca2+/NFAT pathway in the pathogenesis of KD. ORAI1 is a Ca2+ release activated Ca2+ (CRAC) channel mediating store-operated Ca2+ entry (SOCE) on the plasma membrane. The gene for ORAI1 is located in chromosome 12q24 where a positive linkage signal was observed in our previous affected sib-pair study of KD. A common non-synonymous single nucleotide polymorphism located within exon 2 of ORAI1 (rs3741596) was significantly associated with KD (P = 0.028 in the discovery sample set (729 KD cases and 1,315 controls), P = 0.0056 in the replication sample set (1,813 KD cases vs. 1,097 controls) and P = 0.00041 in a meta-analysis by the Mantel-Haenszel method). Interestingly, frequency of the risk allele of rs3741596 is more than 20 times higher in Japanese compared to Europeans. We also found a rare 6 base-pair in-frame insertion variant associated with KD (rs141919534; 2,544 KD cases vs. 2,414 controls, P = 0.012). These data indicate that ORAI1 gene variations are associated with KD and may suggest the potential importance of the Ca2+/NFAT pathway in the pathogenesis of this disorder. PMID:26789410

  10. Solving the molecular diagnostic testing conundrum for Mendelian disorders in the era of next-generation sequencing: single-gene, gene panel, or exome/genome sequencing.

    Science.gov (United States)

    Xue, Yuan; Ankala, Arunkanth; Wilcox, William R; Hegde, Madhuri R

    2015-06-01

    Next-generation sequencing is changing the paradigm of clinical genetic testing. Today there are numerous molecular tests available, including single-gene tests, gene panels, and exome sequencing or genome sequencing. As a result, ordering physicians face the conundrum of selecting the best diagnostic tool for their patients with genetic conditions. Single-gene testing is often most appropriate for conditions with distinctive clinical features and minimal locus heterogeneity. Next-generation sequencing-based gene panel testing, which can be complemented with array comparative genomic hybridization and other ancillary methods, provides a comprehensive and feasible approach for heterogeneous disorders. Exome sequencing and genome sequencing have the advantage of being unbiased regarding what set of genes is analyzed, enabling parallel interrogation of most of the genes in the human genome. However, current limitations of next-generation sequencing technology and our variant interpretation capabilities caution us against offering exome sequencing or genome sequencing as either stand-alone or first-choice diagnostic approaches. A growing interest in personalized medicine calls for the application of genome sequencing in clinical diagnostics, but major challenges must be addressed before its full potential can be realized. Here, we propose a testing algorithm to help clinicians opt for the most appropriate molecular diagnostic tool for each scenario.

  11. Phylogenetic analysis of dermatophyte species using DNA sequence polymorphism in calmodulin gene.

    Science.gov (United States)

    Ahmadi, Bahram; Mirhendi, Hossein; Makimura, Koichi; de Hoog, G Sybren; Shidfar, Mohammad Reza; Nouripour-Sisakht, Sadegh; Jalalizand, Niloofar

    2016-07-01

    Use of phylogenetic species concepts based on rDNA internal transcribed spacer (ITS) regions has improved the taxonomy of dermatophyte species; however, confirmation and refinement using other genes are needed. Since the calmodulin gene has not been systematically used in dermatophyte taxonomy, we evaluated its intra- and interspecies sequence variation as well as its application in identification, phylogenetic analysis, and taxonomy of 202 strains of 29 dermatophyte species. A set of primers was designed and optimized to amplify the target followed by bilateral sequencing. Using pairwise nucleotide comparisons, a mean similarity of 81% was observed among 29 dermatophyte species, with inter-species diversity ranging from 0 to 200 nucleotides (nt). Intraspecies nt differences were found within strains of Trichophyton interdigitale, Arthroderma simii, T. rubrum and A. vanbreuseghemii, while T. tonsurans, T. violaceum, Epidermophyton floccosum, Microsporum canis, M. audouinii, M. cookei, M. racemosum, M. gypseum, T. mentagrophytes, T. schoenleinii, and A. benhamiae were conserved. Strains of E. floccosum/M. racemosum/M. cookei, A. obtusum/A. gertleri, T. tonsurans/T. equinum and a genotype of T. interdigitale had identical calmodulin sequences. For the majority of the species, the tree topology obtained for the calmodulin gene was congruent with that of coding and non-coding regions including ITS, BT2, and Tef-1α. Compared with the phylogenetic tree derived from ITS, BT2, and Tef-1α genes, some species such as E. floccosum and A. gertleri took relatively remote positions. Here, the characterization and the dendrogram obtained for the calmodulin gene across a broad range of dermatophyte species provide a basis for further discovery of relationships between species. Studies of other loci are necessary to confirm the results.

  12. Membrane gene ontology bias in sequencing and microarray obtained by housekeeping-gene analysis.

    Science.gov (United States)

    Zhang, Yijuan; Akintola, Oluwafemi S; Liu, Ken J A; Sun, Bingyun

    2016-01-10

    Microarray (MA) and high-throughput sequencing are two commonly used detection systems for global gene expression profiling. Although these two systems are frequently used in parallel, the differences in their final results have not been examined thoroughly. Transcriptomic analysis of housekeeping (HK) genes provides a unique opportunity to reliably examine the technical difference between these two systems. We investigated here the structure, genome location, expression quantity, microarray probe coverage, as well as biological functions of differentially identified human HK genes by 9 MA and 6 sequencing studies. These in-depth analyses allowed us to discover, for the first time, a subset of transcripts encoding membrane, cell surface and nuclear proteins that were prone to differential identification by the two platforms. We hope that the discovery can aid the future development of these technologies for comprehensive transcriptomic studies. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of the gene's large size, the existence of highly homologous pseudogenes located throughout the human genome, the absence of mutational hotspots, and the diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  14. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of the gene's large size, the existence of highly homologous pseudogenes located throughout the human genome, the absence of mutational hotspots, and the diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  15. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of the gene's large size, the existence of highly homologous pseudogenes located throughout the human genome, the absence of mutational hotspots, and the diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260+1604A>G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  16. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach

    Science.gov (United States)

    Hofmann, Hansjörg; Sakti, Sakriani; Hori, Chiori; Kashioka, Hideki; Nakamura, Satoshi; Minker, Wolfgang

    The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech, mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. In this way, the whole word sequence and its effect on the alteration of the phonemes are taken into consideration. Moreover, the system learns not only the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, the phonemes are first recognized with the present recognition system, and afterwards the pronunciation variation model based on the noisy channel approach maps from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: joint-sequence models and statistical machine translation. Both of them are applied and various experiments are conducted using microphone and telephone recordings of spontaneous speech.
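
    The noisy channel idea described in this abstract can be sketched as choosing the word sequence W that maximizes P(W) · P(observed phonemes | W). The toy below is only an illustration under stated assumptions: the candidate word sequences, the phoneme string, and every probability are hypothetical, and the lookup tables stand in for the joint-sequence or machine-translation models that the article actually uses.

```python
import math
from typing import Optional

# Hypothetical language-model probabilities P(W) for candidate word sequences.
LANGUAGE_MODEL = {"going to": 0.6, "gonna": 0.1, "go into": 0.3}

# Hypothetical channel probabilities P(observed phonemes | W).
CHANNEL_MODEL = {
    ("g ah n ah", "going to"): 0.5,
    ("g ah n ah", "gonna"): 0.9,
    ("g ah n ah", "go into"): 0.01,
}


def decode(observed: str) -> Optional[str]:
    """Return the word sequence maximizing log P(W) + log P(observed | W).

    Returns None when no candidate has a non-zero channel probability.
    """
    best_words, best_score = None, -math.inf
    for words, p_words in LANGUAGE_MODEL.items():
        p_observed = CHANNEL_MODEL.get((observed, words), 0.0)
        if p_observed == 0.0:
            continue  # this candidate cannot explain the observation
        score = math.log(p_words) + math.log(p_observed)
        if score > best_score:
            best_words, best_score = words, score
    return best_words


if __name__ == "__main__":
    print(decode("g ah n ah"))
```

    With these made-up numbers the language-model prior outweighs the better channel fit of the reduced form, so the "clean" string recovered for the "noisy" phoneme sequence is the full word sequence.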

  17. Estimating the extent of horizontal gene transfer in metagenomic sequences

    Directory of Open Access Journals (Sweden)

    Moya Andrés

    2008-03-01

    Background: Although the extent of horizontal gene transfer (HGT) in complete genomes has been widely studied, its influence on the evolution of natural communities of prokaryotes remains unknown. The availability of metagenomic sequences allows us to address the study of global patterns of prokaryotic evolution in samples from natural communities. However, the methods that have been commonly used for the study of HGT are not suitable for metagenomic samples. Therefore it is important to develop new methods or to adapt existing ones to be used with metagenomic sequences. Results: We have created two different methods that are suitable for the study of HGT in metagenomic samples. The methods are based on phylogenetic and DNA compositional approaches, and have allowed us to assess the extent of possible HGT events in metagenomes for the first time. The methods are shown to be compatible and quite precise, although they probably underestimate the number of possible events. Our results show that the phylogenetic method detects HGT in between 0.8% and 1.5% of the sequences, while DNA compositional methods identify putative HGT in between 2% and 8% of the sequences. These ranges are very similar to those found in complete genomes by related approaches. The two methods differ in sensitivity, since they probably target HGT events of different ages: the compositional method mostly identifies recent transfers, while the phylogenetic method is more suitable for the detection of older events. Nevertheless, the study of the number of HGT events in metagenomic sequences from different communities shows a consistent trend for both methods: the lowest amount is found for the sequences of the Sargasso Sea metagenome, while the highest quantity is found in the whale fall metagenome from the bottom of the ocean. The significance of these observations is discussed. Conclusion: The computational approaches that are used to find possible HGT events in complete

  18. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    DEFF Research Database (Denmark)

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn

    2011-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental...... present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere....

  19. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.

    Science.gov (United States)

    Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N

    2013-06-03

    Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
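
    The contig and scaffold N50 values quoted in this abstract follow the usual definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch of that calculation is given below; the function name and the toy lengths are hypothetical and are unrelated to the Matina 1-6 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length


# Hypothetical toy assembly: six scaffolds, 290 units in total.
toy_lengths = [80, 70, 50, 40, 30, 20]

if __name__ == "__main__":
    print(n50(toy_lengths))
```

    Because N50 depends only on the length distribution, comparisons such as "10x the scaffold N50" describe contiguity rather than correctness, which is why the abstract also reports gap percentage and anchored sequence.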

  20. Variation in the ACE Gene in Elite Polish Football Players

    Directory of Open Access Journals (Sweden)

    Cięszczyk Paweł

    2016-12-01

    Purpose. A common polymorphism in the angiotensin converting enzyme I gene (the ACE I/D variant) represents one of the first characterized and most widely studied genetic variants in the context of elite athlete status and performance-related traits. The aim of the study was to determine the genotype and allele distribution of the ACE gene in Polish male football players. Methods. A total of 106 Polish male professional football players were recruited. They were divided into groups according to their position on the field: forwards, defenders, midfielders, and goalkeepers. For controls, samples were prepared from 115 unrelated volunteers. DNA was extracted from the buccal cells donated by the subjects, and PCR amplification of the polymorphic region of the ACE gene containing either the insertion (I) or deletion (D) fragment was performed. Results. The genotype distribution and allele frequencies among all football players did not differ significantly from those of sedentary control individuals (p = 0.887 and p = 0.999, respectively). Likewise, the analysis of forwards, defenders, midfielders, and goalkeepers revealed no significant differences in either ACE genotype or allele frequencies. Conclusions. We found no evidence of a difference in the ACE I/D polymorphism between Polish football players and controls, as neither of the analysed alleles (I and D) nor any of the genotypes (DD, ID, and II) occurred at a statistically significantly higher frequency in the studied subgroups. It may be suspected that harbouring I/D allelic variants of the ACE gene neither decreases nor increases the probability of being a professional football player in Poland.
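
    The comparison of genotype distributions and allele frequencies between players and controls can be sketched as follows. The abstract does not state which test was used or report genotype counts, so this is an assumption-laden illustration: it uses a Pearson chi-square statistic, and the genotype counts are invented so that they sum to the reported group sizes (106 and 115).

```python
def allele_frequencies(dd: int, id_: int, ii: int):
    """Return (freq_I, freq_D) from DD, ID and II genotype counts."""
    i_alleles = 2 * ii + id_   # each II carrier has two I alleles, each ID one
    d_alleles = 2 * dd + id_
    total = i_alleles + d_alleles
    return i_alleles / total, d_alleles / total


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical genotype counts (DD, ID, II); NOT taken from the study.
players = (30, 52, 24)    # sums to 106
controls = (33, 56, 26)   # sums to 115

if __name__ == "__main__":
    print("players  I/D frequencies:", allele_frequencies(*players))
    print("controls I/D frequencies:", allele_frequencies(*controls))
    # 2 x 3 genotype table: 2 degrees of freedom, 5% critical value 5.991.
    stat = chi_square([list(players), list(controls)])
    print("genotype chi-square:", round(stat, 4), "significant:", stat > 5.991)
```

    A statistic well below the critical value, as these invented counts give, corresponds to the kind of non-significant result the abstract reports; converting the statistic to an exact p-value would need a chi-square distribution function from a statistics library.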

  1. Caenorhabditis elegans Genes Affecting Interindividual Variation in Life-span Biomarker Gene Expression.

    Science.gov (United States)

    Mendenhall, Alexander; Crane, Matthew M; Tedesco, Patricia M; Johnson, Thomas E; Brent, Roger

    2017-10-01

    Genetically identical organisms grown in homogenous environments differ in quantitative phenotypes. Differences in one such trait, expression of a single biomarker gene, can identify isogenic cells or organisms that later manifest different fates. For example, in isogenic populations of young adult Caenorhabditis elegans, differences in Green Fluorescent Protein (GFP) expressed from the hsp-16.2 promoter predict differences in life span. Thus, it is of interest to determine how interindividual differences in biomarker gene expression arise. Prior reports showed that the thermosensory neurons and insulin signaling systems controlled the magnitude of the heat shock response, including absolute expression of hsp-16.2. Here, we tested whether these regulatory signals might also influence variation in hsp-16.2 reporter expression. Genetic experiments showed that the action of AFD thermosensory neurons increases interindividual variation in biomarker expression. Further genetic experimentation showed the insulin signaling system acts to decrease interindividual variation in life-span biomarker expression; in other words, insulin signaling canalizes expression of the hsp-16.2-driven life-span biomarker. Our results show that specific signaling systems regulate not only expression level, but also the amount of interindividual expression variation for a life-span biomarker gene. They raise the possibility that manipulation of these systems might offer means to reduce heterogeneity in the aging process.

  2. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle.

    Science.gov (United States)

    Bickhart, Derek M; Xu, Lingyang; Hutchison, Jana L; Cole, John B; Null, Daniel J; Schroeder, Steven G; Song, Jiuzhou; Garcia, Jose Fernando; Sonstegard, Tad S; Van Tassell, Curtis P; Schnabel, Robert D; Taylor, Jeremy F; Lewin, Harris A; Liu, George E

    2016-06-01

    The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage, to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the cattle taurine and indicine breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future.

  3. Modelling human regulatory variation in mouse: finding the function in genome-wide association studies and whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Jean-François Schmouth

    An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limit our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant-harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation.

  4. Sequencing of a patient with balanced chromosome abnormalities and neurodevelopmental disease identifies disruption of multiple high risk loci by structural variation.

    Directory of Open Access Journals (Sweden)

    Jonathon Blake

    Balanced chromosome abnormalities (BCAs) occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD) and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14) that gave rise to abnormal mRNAs and could be assumed to be disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to revealing an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated the high accuracy of mate-pair library sequencing in detecting structural variants larger than 10 kb, suggesting that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception.

  5. High-throughput sequencing and copy number variation detection using formalin fixed embedded tissue in metastatic gastric cancer.

    Directory of Open Access Journals (Sweden)

    Seokhwi Kim

    In the era of targeted therapy, mutation profiling of cancer is a crucial aspect of making therapeutic decisions. To characterize cancer at a molecular level, the use of formalin-fixed paraffin-embedded tissue is important. We tested the Ion AmpliSeq Cancer Hotspot Panel v2 and nCounter Copy Number Variation Assay in 89 formalin-fixed paraffin-embedded gastric cancer samples to determine whether they are applicable in archival clinical samples for personalized targeted therapies. We validated the results with Sanger sequencing, real-time quantitative PCR, fluorescence in situ hybridization and immunohistochemistry. Frequently detected somatic mutations included TP53 (28.17%), APC (10.1%), PIK3CA (5.6%), KRAS (4.5%), SMO (3.4%), STK11 (3.4%), CDKN2A (3.4%) and SMAD4 (3.4%). Amplifications of HER2, CCNE1, MYC, KRAS and EGFR genes were observed in 8 (8.9%), 4 (4.5%), 2 (2.2%), 1 (1.1%) and 1 (1.1%) cases, respectively. In the cases with amplification, fluorescence in situ hybridization for HER2 verified gene amplification and immunohistochemistry for HER2, EGFR and CCNE1 verified the overexpression of proteins in tumor cells. In conclusion, we successfully performed semiconductor-based sequencing and nCounter copy number variation analyses in formalin-fixed paraffin-embedded gastric cancer samples. High-throughput screening in archival clinical samples enables faster, more accurate and cost-effective detection of hotspot mutations or amplification in genes.

  6. Natural variation in DNA methylation in ribosomal RNA genes of Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Richards Eric J

    2008-09-01

    Background: DNA methylation is an important biochemical mark that silences repetitive sequences, such as transposons, and reinforces epigenetic gene expression states. An important class of repetitive genes under epigenetic control in eukaryotic genomes encodes ribosomal RNA (rRNA) transcripts. The ribosomal genes coding for the 45S rRNA precursor of the three largest eukaryotic ribosomal RNAs (18S, 5.8S, and 25–28S) are found in nucleolus organizer regions (NORs), comprised of hundreds to thousands of repeats, only some of which are expressed in any given cell. An epigenetic switch, mediated by DNA methylation and histone modification, turns rRNA genes on and off. However, little is known about the mechanisms that specify and maintain the patterns of NOR DNA methylation. Results: Here, we explored the extent of naturally-occurring variation in NOR DNA methylation among accessions of the flowering plant Arabidopsis thaliana. DNA methylation in coding regions of rRNA genes was positively correlated with copy number of 45S rRNA gene and DNA methylation in the intergenic spacer regions. We investigated the inheritance of NOR DNA methylation patterns in natural accessions with hypomethylated NORs in inter-strain crosses and defined three different categories of inheritance in F1 hybrids. In addition, subsequent analysis of F2 segregation for NOR DNA methylation patterns uncovered different patterns of inheritance. We also revealed that NOR DNA methylation in the Arabidopsis accession Bor-4 is influenced by the vim1-1 (variant in methylation 1-1) mutation, but the primary effect is specified by the NORs themselves. Conclusion: Our results indicate that the NORs themselves are the most significant determinants of natural variation in NOR DNA methylation. However, the inheritance of NOR DNA methylation suggests the operation of a diverse set of mechanisms, including inheritance of parental methylation patterns, reconfiguration of parental NOR DNA

  7. Genes controlling mimetic colour pattern variation in butterflies.

    Science.gov (United States)

    Nadeau, Nicola J

    2016-10-01

    Butterfly wing patterns are made up of arrays of coloured scales. There are two genera in which within-species variation in wing patterning is common and has been investigated at the molecular level, Heliconius and Papilio. Species in both of these genera have mimetic relationships with other butterfly species that increase their protection from predators. Heliconius have a 'tool-kit' of five genetic loci that control colour pattern, three of which have been identified at the gene level, and which have been repeatedly used to modify colour pattern by different species in the genus. By contrast, the three Papilio species that have been investigated each have different genetic mechanisms controlling their polymorphic wing patterns.

  8. Diversity of the luciferin binding protein gene in bioluminescent dinoflagellates--insights from a new gene in Noctiluca scintillans and sequences from gonyaulacoid genera.

    Science.gov (United States)

    Valiadi, Martha; Iglesias-Rodriguez, Maria Debora

    2014-01-01

    Dinoflagellate bioluminescence systems operate with or without a luciferin binding protein, representing two distinct modes of light production. However, the distribution, diversity, and evolution of the luciferin binding protein gene within bioluminescent dinoflagellates are not well known. We used PCR to detect and partially sequence this gene from the heterotrophic dinoflagellate Noctiluca scintillans and a group of ecologically important gonyaulacoid species. We report an additional luciferin binding protein gene in N. scintillans which is not attached to luciferase, further to its typical combined bioluminescence gene. This supports the hypothesis that a profound re-organization of the bioluminescence system has taken place in this organism. We also show that the luciferin binding protein gene is present in the genera Ceratocorys, Gonyaulax, and Protoceratium, and is prevalent in bioluminescent species of Alexandrium. Therefore, this gene is an integral component of the standard molecular bioluminescence machinery in dinoflagellates. Nucleotide sequences showed high within-strain variation among gene copies, revealing a highly diverse gene family comprising multiple gene types in some organisms. Phylogenetic analyses showed that, in some species, the evolution of the luciferin binding protein gene was different from the organism's general phylogenies, highlighting the complex evolutionary history of dinoflagellate bioluminescence systems.

  9. Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K

    2016-01-27

    Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we used a whole genome re-sequencing approach to re-sequence 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic stresses (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important traits (protein content). A total of 192.19 Gb of data, comprising 973.13 million reads, was generated for the 35 genotypes, with an average sequencing depth of ~10X for each line. On average, 92.18 % of reads from each genotype were aligned to the chickpea reference genome, with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) were identified, with the highest percentage in coding regions found in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for

  10. Genetic variation in the oxytocin receptor (OXTR) gene is associated with Asperger Syndrome.

    Science.gov (United States)

    Di Napoli, Agnese; Warrier, Varun; Baron-Cohen, Simon; Chakrabarti, Bhismadev

    2014-01-01

    Autism Spectrum Conditions (ASC) are a group of neurodevelopmental conditions characterized by impairments in communication and social interaction, alongside unusually repetitive behaviors and narrow interests. ASC are highly heritable and have complex patterns of inheritance where multiple genes are involved, alongside environmental and epigenetic factors. Asperger Syndrome (AS) is a subgroup of these conditions, where there is no history of language or cognitive delay. Animal models suggest a role for oxytocin (OXT) and oxytocin receptor (OXTR) genes in social-emotional behaviors, and several studies indicate that the oxytocin/oxytocin receptor system is altered in individuals with ASC. Previous studies have reported associations between genetic variations in the OXTR gene and ASC. The present study tested for an association between nine single nucleotide polymorphisms (SNPs) in the OXTR gene and AS in 530 individuals of Caucasian origin, using SNP association test and haplotype analysis. There was a significant association between rs2268493 in OXTR and AS. Multiple haplotypes that include this SNP (rs2268493-rs2254298, rs2268490-rs2268493-rs2254298, rs2268493-rs2254298-rs53576, rs237885-rs2268490-rs2268493-rs2254298, rs2268490-rs2268493-rs2254298-rs53576) were also associated with AS. rs2268493 has been previously associated with ASC and putatively alters several transcription factor-binding sites and regulates chromatin states, either directly or through other variants in linkage disequilibrium (LD). This study reports a significant association of the sequence variant rs2268493 in the OXTR gene and associated haplotypes with AS.

  11. Recombination in pe/ppe genes contributes to genetic variation in Mycobacterium tuberculosis lineages

    KAUST Repository

    Phelan, Jody E.

    2016-02-29

    Background Approximately 10 % of the Mycobacterium tuberculosis genome is made up of two families of genes that are poorly characterized due to their high GC content and highly repetitive nature. The PE and PPE families are typified by their highly conserved N-terminal domains that incorporate proline-glutamate (PE) and proline-proline-glutamate (PPE) signature motifs. They are hypothesised to be important virulence factors involved with host-pathogen interactions, but their high genetic variability and complexity of analysis means they are typically disregarded in genome studies. Results To elucidate the structure of these genes, 518 genomes from a diverse international collection of clinical isolates were de novo assembled. A further 21 reference M. tuberculosis complex genomes and long read sequence data were used to validate the approach. SNP analysis revealed that variation in the majority of the 168 pe/ppe genes studied was consistent with lineage. Several recombination hotspots were identified, notably pe_pgrs3 and pe_pgrs17. Evidence of positive selection was revealed in 65 pe/ppe genes, including epitopes potentially binding to major histocompatibility complex molecules. Conclusions This, the first comprehensive study of the pe and ppe genes, provides important insight into M. tuberculosis diversity and has significant implications for vaccine development.

  12. Rarity of DNA sequence alterations in the promoter region of the human androgen receptor gene

    Directory of Open Access Journals (Sweden)

    D.F. Cabral

    2004-12-01

    Full Text Available The human androgen receptor (AR) gene promoter lies in a GC-rich region containing two principal sites of transcription initiation and a putative Sp1 protein-binding site, without typical "TATA" and "CAAT" boxes. It has been suggested that mutations within the 5' untranslated region (5'UTR) may contribute to the development of prostate cancer by changing the rates of gene transcription and/or translation. In order to investigate this question, the aim of the present study was to search for the presence of mutations or polymorphisms at the AR-5'UTR in 92 prostate cancer patients, where histological diagnosis of adenocarcinoma was established in specimens obtained from transurethral resection or after prostatectomy. The AR-5'UTR was amplified by PCR from genomic DNA samples of the patients and of 100 healthy male blood donors, included as controls. Conformation-sensitive gel electrophoresis was used for DNA sequence alteration screening. Only one band shift was detected in one individual from the blood donor group. Sequencing revealed a new single nucleotide deletion (T) in the most conserved portion of the promoter region at position +36 downstream from the transcription initiation site I. Although the effect of this specific mutation remains unknown, its rarity reveals the high degree of sequence conservation of the human androgen promoter region. Moreover, the absence of detectable variation within the critical 5'UTR in prostate cancer patients indicates a low probability of its involvement in prostate cancer etiology.

  13. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates

    Directory of Open Access Journals (Sweden)

    Cepko Connie L

    2007-06-01

    Full Text Available Abstract Background High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). Results The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Conclusion Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.

  14. Melanopsin Gene Variations Interact With Season to Predict Sleep Onset and Chronotype

    Science.gov (United States)

    Roecklein, Kathryn A.; Wong, Patricia M.; Franzen, Peter L.; Hasler, Brant P.; Wood-Vasey, W. Michael; Nimgaonkar, Vishwajit L.; Miller, Megan A.; Kepreos, Kyle M.; Ferrell, Robert E.; Manuck, Stephen B.

    2013-01-01

    The human melanopsin gene has been reported to mediate risk for seasonal affective disorder (SAD), which is hypothesized to be caused by decreased photic input during winter when light levels fall below threshold, resulting in differences in circadian phase and/or sleep. However, it is unclear if melanopsin increases risk of SAD by causing differences in sleep or circadian phase, or if those differences are symptoms of the mood disorder. To determine if melanopsin sequence variations are associated with differences in sleep-wake behavior among those not suffering from a mood disorder, the authors tested associations between melanopsin gene polymorphisms and self-reported sleep timing (sleep onset and wake time) in a community sample (N = 234) of non-Hispanic Caucasian participants (age 30–54 yrs) with no history of psychological, neurological, or sleep disorders. The authors also tested the effect of melanopsin variations on differences in preferred sleep and activity timing (i.e., chronotype), which may reflect differences in circadian phase, sleep homeostasis, or both. Daylength on the day of assessment was measured and included in analyses. DNA samples were genotyped for melanopsin gene polymorphisms using fluorescence polarization. P10L genotype interacted with daylength to predict self-reported sleep onset (interaction p seasonal patterns of recurrence or exacerbation. PMID:22881342

  15. Genetic variations of the dihydrofolate reductase gene of Plasmodium vivax in Mandalay Division, Myanmar.

    Science.gov (United States)

    Na, Byoung-Kuk; Lee, Hyeong-Woo; Moon, Sung-Ung; In, Tae-Suk; Lin, Khin; Maung, Maung; Chung, Gyung-Tae; Lee, Jong-Koo; Kim, Tong-Soo; Kong, Yoon

    2005-07-01

    Dihydrofolate reductase (DHFR; EC1.5.1.3) is a known target enzyme for antifolate agents, which are used as alternative chemotherapeutics for chloroquine-resistant malaria. Mutations in the dhfr gene of Plasmodium vivax are thought to be associated with resistance to the antifolate drugs. In this study, we have analyzed genetic variations in the dhfr genes of clinical isolates of P. vivax (n=21) in Myanmar, to monitor antifolate resistance in this country. Sequence variations within the entire dhfr gene were highly restricted to codons from 57 to 117, and the GGDN tandem repeat region. Double (S58R and S117N/T) or quadruple mutations (F57L/I, S58R, T61M, and S117N/T), which may be closely related to the drug resistance, were recognized in most of the isolates (20/21 cases). Our results suggest that antifolate-resistant P. vivax is becoming widespread in Myanmar, as it also is in the neighboring countries in Southeast Asia. It appears that the drug resistance situation may be worsening in the country.

  16. Mitochondrial cytochrome b sequence variations and phylogeny of the East Asian bagrid catfishes

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The mitochondrial DNA cytochrome b gene was sequenced from 8 bagrid catfishes in China. These sequences were aligned with cytochrome b sequences from 9 bagrid catfishes in Japan, Korea and Russia retrieved from GenBank, and, with Silurus meridionalis, Liobagrus anguillicauda, Liobagrus reini and Phenacogrammus interruptus selected as outgroups, we constructed a matrix of 21 DNA sequences. Kimura's two-parameter distances were calculated and molecular phylogenetic trees were constructed by using the maximum parsimony (MP) and neighbor-joining (NJ) methods. The results show that (i) there exist 3-bp deletions in the mitochondrial cytochrome b gene compared with cypriniforms and characiforms; (ii) the molecular phylogenetic tree suggests that bagrid catfishes form a monophyletic group, that the genus Mystus is the earliest diverging lineage among the East Asian bagrid catfishes, and that the genus Pseudobagrus is a monophyletic group whereas the relationships of the genera Pelteobagrus and Leiocassis are complicated; and (iii) the evolutionary rate of the East Asian bagrid mitochondrial cytochrome b gene is about 0.18%~0.30% sequence divergence per million years.
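
    The Kimura two-parameter distance mentioned in this record can be illustrated with a short sketch. This is not the authors' pipeline; it is a minimal example assuming two pre-aligned DNA strings of equal length, and the function name `k2p_distance` is hypothetical. It uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions among compared sites.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence has
    a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

    A matrix of such pairwise distances is the usual input to a neighbor-joining tree; the tree-building step itself is not shown here.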

  17. Sequencing, characterization, and gene expression analysis of the histidine decarboxylase gene cluster of Morganella morganii.

    Science.gov (United States)

    Ferrario, Chiara; Borgo, Francesca; de Las Rivas, Blanca; Muñoz, Rosario; Ricci, Giovanni; Fortina, Maria Grazia

    2014-03-01

    The histidine decarboxylase gene cluster of Morganella morganii DSM30146(T) was sequenced, and four open reading frames, named hdcT1, hdc, hdcT2, and hisRS were identified. Two putative histidine/histamine antiporters (hdcT1 and hdcT2) were located upstream and downstream the hdc gene, codifying a pyridoxal-P dependent histidine decarboxylase, and followed by hisRS gene encoding a histidyl-tRNA synthetase. This organization was comparable with the gene cluster of other known Gram negative bacteria, particularly with that of Klebsiella oxytoca. Recombinant Escherichia coli strains harboring plasmids carrying the M. morganii hdc gene were shown to overproduce histidine decarboxylase, after IPTG induction at 37 °C for 4 h. Quantitative RT-PCR experiments revealed the hdc and hisRS genes were highly induced under acidic and histidine-rich conditions. This work represents the first description and identification of the hdc-related genes in M. morganii. Results support the hypothesis that the histidine decarboxylation reaction in this prolific histamine producing species may play a role in acid survival. The knowledge of the role and the regulation of genes involved in histidine decarboxylation should improve the design of rational strategies to avoid toxic histamine production in foods.

  18. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Science.gov (United States)

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurring in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  19. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    Directory of Open Access Journals (Sweden)

    Kei-ichi Morita

    Full Text Available Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurring in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  20. Sequence-length variation of mtDNA HVS-I C-stretch in Chinese ethnic groups

    Institute of Scientific and Technical Information of China (English)

    Feng CHEN; Yong-hui DANG; Chun-xia YAN; Yan-ling LIU; Ya-jun DENG; David J. R. FULTON; Teng CHEN

    2009-01-01

    The purpose of this study was to investigate mitochondrial DNA (mtDNA) hypervariable segment-I (HVS-I) C-stretch variations and explore the significance of these variations in forensic and population genetics studies. The C-stretch sequence variation was studied in 919 unrelated individuals from 8 Chinese ethnic groups using both direct and clone sequencing approaches. Thirty-eight C-stretch haplotypes were identified, and some novel and population-specific haplotypes were also detected. The C-stretch genetic diversity (GD) values were relatively high, and probability (P) values were low. Additionally, C-stretch length heteroplasmy was observed in approximately 9% of individuals studied. There was a significant correlation (r=-0.961, P<0.01) between the expansion of the cytosine sequence length in the C-stretch of HVS-I and a reduction in the number of upstream adenines. These results indicate that the C-stretch could be a useful genetic marker in forensic identification of Chinese populations. The results from the Fst and dA genetic distance matrix, neighbor-joining tree, and principal component map also suggest that C-stretch could be used as a reliable genetic marker in population genetics.

  1. Allelic Diversity and Population Structure in Oenococcus oeni as Determined from Sequence Analysis of Housekeeping Genes

    Science.gov (United States)

    de las Rivas, Blanca; Marcobal, Ángela; Muñoz, Rosario

    2004-01-01

    Oenococcus oeni is the organism of choice for promoting malolactic fermentation in wine. The population biology of O. oeni is poorly understood. For a better understanding of the mode of genetic variation within this species, we used multilocus sequence typing (MLST) with the gyrB, pgm, ddl, recP, and mleA genes to investigate the genetic diversity and genetic relationships among 18 O. oeni strains isolated in various years from wines of the United States, France, Germany, Spain, and Italy. These strains have also been characterized by ribotyping and restriction fragment length polymorphism (RFLP) analysis of the PCR-amplified 16S-23S rRNA gene intergenic spacer region (ISR). Ribotyping grouped the strains into two groups; however, the RFLP analysis of the ISRs showed no differences in the strains analyzed. In contrast, MLST in oenococci had a good discriminatory ability, and we have found a higher genetic diversity than indicated by ribotyping analysis. All sequence types were represented by a single strain, and all the strains could be distinguished from each other because they had unique combinations of alleles. Strains assumed to be identical showed the same sequence type. Phylogenetic analyses indicated a panmictic population structure in O. oeni. Sequences were analyzed for evidence of recombination by split decomposition analysis and analysis of clustered polymorphisms. All results indicated that recombination plays a major role in creating the genetic heterogeneity of O. oeni. A low standardized index of association value indicated that the O. oeni genes analyzed are close to linkage equilibrium. This study constitutes the first step in the development of an MLST method for O. oeni and the first example of the application of MLST to a nonpathogenic food production bacterium. PMID:15574919

  2. The impact of gene expression variation on the robustness and evolvability of a developmental gene regulatory network.

    Directory of Open Access Journals (Sweden)

    David A Garfield

    2013-10-01

    Full Text Available Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear), allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.

  3. Molecular Cloning and Sequencing of Hemoglobin-Beta Gene of Channel Catfish, Ictalurus Punctatus Rafinesque

    Science.gov (United States)

    The hemoglobin-β gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-β gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...

  4. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions.

    Science.gov (United States)

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L

    2015-01-15

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. Copyright © 2014 Elsevier Inc. All rights reserved.

  5. Next-gen sequencing identifies non-coding variation disrupting miRNA-binding sites in neurological disorders.

    Science.gov (United States)

    Devanna, P; Chen, X S; Ho, J; Gajewski, D; Smith, S D; Gialluisi, A; Francks, C; Fisher, S E; Newbury, D F; Vernes, S C

    2017-03-14

    Understanding the genetic factors underlying neurodevelopmental and neuropsychiatric disorders is a major challenge given their prevalence and potential severity for quality of life. While large-scale genomic screens have made major advances in this area, for many disorders the genetic underpinnings are complex and poorly understood. To date the field has focused predominantly on protein coding variation, but given the importance of tightly controlled gene expression for normal brain development and disorder, variation that affects non-coding regulatory regions of the genome is likely to play an important role in these phenotypes. Herein we show the importance of 3 prime untranslated region (3'UTR) non-coding regulatory variants across neurodevelopmental and neuropsychiatric disorders. We devised a pipeline for identifying and functionally validating putatively pathogenic variants from next generation sequencing (NGS) data. We applied this pipeline to a cohort of children with severe specific language impairment (SLI) and identified a functional, SLI-associated variant affecting gene regulation in cells and post-mortem human brain. This variant and the affected gene (ARHGEF39) represent new putative risk factors for SLI. Furthermore, we identified 3'UTR regulatory variants across autism, schizophrenia and bipolar disorder NGS cohorts demonstrating their impact on neurodevelopmental and neuropsychiatric disorders. Our findings show the importance of investigating non-coding regulatory variants when determining risk factors contributing to neurodevelopmental and neuropsychiatric disorders. In the future, integration of such regulatory variation with protein coding changes will be essential for uncovering the genetic causes of complex neurological disorders and the fundamental mechanisms underlying health and disease. Molecular Psychiatry advance online publication, 14 March 2017; doi:10.1038/mp.2017.30.

  6. Mutational analysis of DBD*--a unique antileukemic gene sequence.

    Science.gov (United States)

    Ji, Yan-shan; Johnson, Betty H; Webb, M Scott; Thompson, E Brad

    2002-01-01

    DBD* is a novel gene encoding an 89 amino acid peptide that is constitutively lethal to leukemic cells. DBD* was derived from the DNA binding domain of the human glucocorticoid receptor by a frameshift that replaces the final 21 C-terminal amino acids of the domain. Previous studies suggested that DBD* no longer acted as the natural DNA binding domain. To confirm and extend these results, we mutated DBD* in 29 single amino acid positions, critical for the function in the native domain or of possible functional significance in the novel 21 amino acid C-terminal sequence. Steroid-resistant leukemic ICR-27-4 cells were transiently transfected by electroporation with each of the 29 mutants. Cell kill was evaluated by trypan blue dye exclusion, a WST-1 tetrazolium-based assay for cell respiration, propidium iodide exclusion, and Hoechst 33258 staining of chromatin. Eleven of the 29 point mutants increased, whereas four decreased antileukemic activity. The remainder had no effect on activity. The nonconcordances between these effects and native DNA binding domain function strongly suggest that the lethality of DBD* is distinct from that of the glucocorticoid receptor. Transfections of fragments of DBD* showed that optimal activity localized to the sequence for its C-terminal 32 amino acids.

  7. Sequence analysis and prokaryotic expression of Giardia lamblia α-18 giardin gene.

    Science.gov (United States)

    Wu, Sheng; Yu, Xingang; Abdullahi, Auwalu Yusuf; Hu, Wei; Pan, Weida; Shi, Xianli; Tan, Liping; Song, Meiran; Li, Guoqing

    2016-03-01

    To study the genetic variation and prokaryotic expression of α18 giardin gene of Giardia lamblia zoonotic assemblage A and host-specific assemblage F, the α18 genes were amplified from G. lamblia assemblages A and F by PCR and sequenced. The PCR product was cloned into the prokaryotic expression vector pET-28a(+) and the positive recombinant plasmid was transformed into Escherichia coli Rosetta (DE3) strain for the expression. The expressed α18 giardin fusion protein was validated by SDS-PAGE and Western blot analysis, and purified by Ni-Agarose resin. The putative sequence of α18 giardin amino acid was analyzed by bioinformatics software. Results showed that the α18 giardin gene was 861 bp in length, encoding 286 amino acids; it was 100% homologous between human-derived and dog-derived G. lamblia assemblage A, but it was 86.8% homologous with G. lamblia assemblage F (cat-derived). Giardin α18 was about 36 kDa in molecular weight, with good reactivity. Prediction based on in silico analyses: it had hydrophobicity, without signal peptide and transmembrane domain, and contained 11 alpha regions, 13 beta sheets, 1 beta turn and 7 random coils in secondary structure. The above information would lay the foundation for research about the subcellular localization and biological function of α18 giardin in G. lamblia.

  8. EcoGene: a genome sequence database for Escherichia coli K-12.

    Science.gov (United States)

    Rudd, K E

    2000-01-01

    The EcoGene database provides a set of gene and protein sequences derived from the genome sequence of Escherichia coli K-12. EcoGene is a source of re-annotated sequences for the SWISS-PROT and Colibri databases. EcoGene is used for genetic and physical map compilations in collaboration with the Coli Genetic Stock Center. The EcoGene12 release includes 4293 genes. EcoGene12 differs from the GenBank annotation of the complete genome sequence in several ways, including (i) the revision of 706 predicted or confirmed gene start sites, (ii) the correction or hypothetical reconstruction of 61 frame-shifts caused by either sequence error or mutation, (iii) the reconstruction of 14 protein sequences interrupted by the insertion of IS elements, and (iv) predictions that 92 genes are partially deleted gene fragments. A literature survey identified 717 proteins whose N-terminal amino acids have been verified by sequencing. 12 446 cross-references to 6835 literature citations and s are provided. EcoGene is accessible at a new website: http://bmb.med.miami.edu/EcoGene/EcoWeb. Users can search and retrieve individual EcoGene GenePages or they can download large datasets for incorporation into database management systems, facilitating various genome-scale computational and functional analyses.

  9. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.

  10. High Sequence Variations in Mitochondrial DNA Control Region among Worldwide Populations of Flathead Mullet Mugil cephalus

    Directory of Open Access Journals (Sweden)

    Brian Wade Jamandre

    2014-01-01

    Full Text Available The sequence and structure of the complete mtDNA control region (CR) of M. cephalus from African, Pacific, and Atlantic populations are presented in this study to assess its usefulness in phylogeographic studies of this species. The mtDNA CR sequence variations among M. cephalus populations largely exceeded intraspecific polymorphisms that are generally observed in other vertebrates. The length of CR sequence varied among M. cephalus populations due to the presence of indels and variable number of tandem repeats at the 3′ hypervariable domain. The high evolutionary rate of the CR in this species probably originated from these mutations. However, no excessive homoplasic mutations were noticed. Finally, the star shaped tree inferred from the CR polymorphism stresses a rapid radiation worldwide, in this species. The CR still appears as a good marker for phylogeographic investigations and additional worldwide samples are warranted to further investigate the genetic structure and evolution in M. cephalus.

  11. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    Science.gov (United States)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor to pinpoint areas where large megathrust earthquakes may nucleate and consequently to improve the seismic hazard assessment.
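
    The cc ≥ 95% selection criterion in this record can be illustrated with a small sketch. This is not the authors' procedure: it assumes two already-aligned, equal-length waveform windows, computes only the zero-lag correlation coefficient, and omits the spectral coherency test. The function names `correlation_coefficient` and `is_repeating_pair` are hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation between two equal-length waveforms."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("waveforms must be non-empty and of equal length")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant waveform has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def is_repeating_pair(x, y, threshold=0.95):
    """Apply the cc >= 0.95 criterion quoted in the abstract (cc only)."""
    return correlation_coefficient(x, y) >= threshold
```

    In practice the two windows would first be aligned, for example by searching over time lags for the maximum correlation; that step is left out of this sketch.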

  12. Bm86 midgut protein sequence variation in South Texas cattle fever ticks

    Directory of Open Access Journals (Sweden)

    Kammlah Diane M

    2010-11-01

    Full Text Available Background: Cattle fever ticks, Rhipicephalus (Boophilus) microplus and R. (B.) annulatus, vector bovine and equine babesiosis, and have significantly expanded beyond the permanent quarantine zone established in South Texas. Currently, there are no vaccines approved for use within the United States for controlling these vectors. Vaccines developed in Australia and Cuba based on the midgut antigen Bm86 have variable efficacy against cattle fever ticks. A possible explanation for this variation in vaccine efficacy is amino acid sequence divergence between the recombinant Bm86 vaccine component and native Bm86 expressed in ticks from different geographical regions of the world. Results: There was 91.8% amino acid sequence identity in Bm86 among R. microplus and R. annulatus sequenced from South Texas infestations. When South Texas isolates were compared to the Australian Yeerongpilly and Cuban Camcord vaccine strains, there was 89.8% and 90.0% identity, respectively. Most of the sequence divergence was focused in one region of the protein, amino acids 206-298. Hydrophilicity profiles revealed that two short regions of Bm86 (amino acids 206-210 and 560-570) appear to be more hydrophilic in South Texas isolates compared to vaccine strains. Only one amino acid difference was found between South Texas and vaccine strains within two previously described B-cell epitopes. A total of 4 amino acid differences were observed within three peptides previously shown to induce protective immune responses in cattle. Conclusions: Sequence differences between South Texas isolates and Yeerongpilly and Camcord strains are spread throughout the entire Bm86 sequence, suggesting that geographic variation does exist. Differences within previously described B-cell epitopes between South Texas isolates and vaccine strains are minimal; however, short regions of hydrophilic amino acids found unique to South Texas isolates suggest that additional unique surface exposed
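
Percent-identity figures such as those reported in this record are typically computed over a pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" as the gap character; the function name and the toy peptides are hypothetical and are not Bm86 data. Counting a residue opposite a gap as a mismatch is one of several conventions and is an assumption here.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a residue
    opposite a gap counts as a mismatch (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Toy aligned peptides (illustrative only): one substitution in ten residues.
print(percent_identity("MKTAYIAKQR", "MKTAHIAKQR"))
```

Different tools normalize by alignment length, shorter-sequence length, or ungapped columns, so identity values are only comparable when the same convention is used throughout.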

  13. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  14. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    Directory of Open Access Journals (Sweden)

    Patrick D. Schloss

    2016-03-01

    Full Text Available Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  15. Quantitative gene-gene and gene-environment mapping for leaf shape variation using tree-based models

    Science.gov (United States)

    Leaf shape traits have long been a focus of many disciplines, but searching for complex genetic and environmental interactive mechanisms regulating leaf shape variation has not yet been well developed. The question of the respective roles of gene and environment and how they interplay to modulate l...

  16. The Clinical Significance of Unknown Sequence Variants in BRCA Genes

    Energy Technology Data Exchange (ETDEWEB)

    Calò, Valentina; Bruno, Loredana; Paglia, Laura La; Perez, Marco; Margarese, Naomi [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy); Gaudio, Francesca Di [Department of Medical Biotechnologies and Legal Medicine, University of Palermo, Palermo (Italy); Russo, Antonio, E-mail: lab-oncobiologia@usa.net [Department of Surgery and Oncology, Regional Reference Center for the Biomolecular Characterization and Genetic Screening of Hereditary Tumors, University of Palermo, Via del Vespro 127, 90127 Palermo (Italy)

    2010-09-10

    Germline mutations in BRCA1/2 genes are responsible for a large proportion of hereditary breast and/or ovarian cancers. Many highly penetrant predisposition alleles have been identified and include frameshift or nonsense mutations that lead to the translation of a truncated protein. Other alleles contain missense mutations, which result in amino acid substitution and intronic variants with splicing effect. The discovery of variants of uncertain/unclassified significance (VUS) is a result that can complicate rather than improve the risk assessment process. VUSs are mainly missense mutations, but also include a number of intronic variants and in-frame deletions and insertions. Over 2,000 unique BRCA1 and BRCA2 missense variants have been identified, located throughout the whole gene (Breast Cancer Information Core Database (BIC database)). Up to 10–20% of the BRCA tests report the identification of a variant of uncertain significance. There are many methods to discriminate deleterious/high-risk from neutral/low-risk unclassified variants (i.e., analysis of the cosegregation in families of the VUS, measure of the influence of the VUSs on the wild-type protein activity, comparison of sequence conservation across multiple species), but only an integrated analysis of these methods can contribute to a real interpretation of the functional and clinical role of the discussed variants. The aim of our manuscript is to review the studies on BRCA VUS in order to clarify their clinical relevance.

  17. Cloning, sequencing and expression of a xylanase gene from the maize pathogen Helminthosporium turcicum

    DEFF Research Database (Denmark)

    Degefu, Y.; Paulin, L.; Lübeck, Peter Stephensen

    2001-01-01

    A gene encoding an endoxylanase from the phytopathogenic fungus Helminthosporium turcicum Pass. was cloned and sequenced. The entire nucleotide sequence of a 1991 bp genomic fragment containing an endoxylanase gene was determined. The xylanase gene of 795 bp, interrupted by two introns of 52 and ...

  18. Poly purine.pyrimidine sequences upstream of the beta-galactosidase gene affect gene expression in Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Brahmachari Samir K

    2001-10-01

    Full Text Available Background: Poly purine.pyrimidine sequences have the potential to adopt intramolecular triplex structures and are overrepresented upstream of genes in eukaryotes. These sequences may regulate gene expression by modulating the interaction of transcription factors with DNA sequences upstream of genes. Results: A poly purine.pyrimidine sequence with the potential to adopt an intramolecular triplex DNA structure was designed. The sequence was inserted within a nucleosome positioned upstream of the β-galactosidase gene in yeast, Saccharomyces cerevisiae, between the cyc1 promoter and gal 10 Upstream Activating Sequences (UASg). Upon derepression with galactose, β-galactosidase gene expression is reduced 12-fold in cells carrying single copy poly purine.pyrimidine sequences. This reduction in expression is correlated with reduced transcription. Furthermore, we show that plasmids carrying a poly purine.pyrimidine sequence are not specifically lost from yeast cells. Conclusion: We propose that a poly purine.pyrimidine sequence upstream of a gene affects transcription. Plasmids carrying this sequence are not specifically lost from cells and thus no additional effort is needed for the replication of these sequences in eukaryotic cells.

  19. Stability of glycoprotein gene sequences of herpes simplex virus type 2 from primary to recurrent human infection, and diversity of the sequences among patients attending an STD clinic

    Science.gov (United States)

    2014-01-01

    Background: Herpes simplex virus type 2 (HSV-2) is sexually transmitted, leading to blisters and ulcers in the genito-anal region. After primary infection the virus is present in a latent state in neurons in sensory ganglia. Reactivation and production of new viral particles can cause asymptomatic viral shedding or new lesions. Establishment of latency, maintenance and reactivation involve silencing of genes, continuous suppression of gene activities and finally gene activation and synthesis of viral DNA. The purpose of the present work was to study the genetic stability of the virus during these events. Methods: HSV-2 was collected from 5 patients with true primary and recurrent infections, and the genes encoding glycoproteins B, G, E and I were sequenced. Results: No nucleotide substitution was observed in any patient, indicating genetic stability. However, since the total number of nucleotides in these genes is only a small part of the total genome, we cannot rule out variation in other regions. Conclusions: Although infections of cell cultures and animal models are useful for studies of herpes simplex virus, it is important to know how the virus behaves in the natural host. We observed that several glycoprotein gene sequences are stable from primary to recurrent infection. However, the virus isolates from the different patients were genetically different. PMID:24502528

  20. Candidate gene analysis and exome sequencing confirm LBX1 as a susceptibility gene for idiopathic scoliosis.

    Science.gov (United States)

    Grauers, Anna; Wang, Jingwen; Einarsdottir, Elisabet; Simony, Ane; Danielsson, Aina; Åkesson, Kristina; Ohlin, Acke; Halldin, Klas; Grabowski, Pawel; Tenne, Max; Laivuori, Hannele; Dahlman, Ingrid; Andersen, Mikkel; Christensen, Steen Bach; Karlsson, Magnus K; Jiao, Hong; Kere, Juha; Gerdhem, Paul

    2015-10-01

    Idiopathic scoliosis is a spinal deformity affecting approximately 3% of otherwise healthy children or adolescents. The etiology is still largely unknown but has an important genetic component. Genome-wide association studies have identified a number of common genetic variants that are significantly associated with idiopathic scoliosis in Asian and Caucasian populations, rs11190870 close to the LBX1 gene being the most replicated finding. The aim of the present study was to investigate the genetics of idiopathic scoliosis in a Scandinavian cohort by performing a candidate gene study of four variants previously shown to be associated with idiopathic scoliosis and exome sequencing of idiopathic scoliosis patients with a severe phenotype to identify possible novel scoliosis risk variants. This was a case control study. A total of 1,739 patients with idiopathic scoliosis and 1,812 controls were included. The outcome measure was idiopathic scoliosis. The variants rs10510181, rs11190870, rs12946942, and rs6570507 were genotyped in 1,739 patients with idiopathic scoliosis and 1,812 controls. Exome sequencing was performed on pooled samples from 100 surgically treated idiopathic scoliosis patients. Novel or rare missense, nonsense, or splice site variants were selected for individual genotyping in the 1,739 cases and 1,812 controls. In addition, the 5'UTR, noncoding exon and promoter regions of LBX1, not covered by exome sequencing, were Sanger sequenced in the 100 pooled samples. Of the four candidate genes, an intergenic variant, rs11190870, downstream of the LBX1 gene, showed a highly significant association with idiopathic scoliosis in 1,739 cases and 1,812 controls (p=7.0×10^-18). We identified 20 novel variants by exome sequencing after filtration and an initial genotyping validation. However, we could not verify any association with idiopathic scoliosis in the large cohort of 1,739 cases and 1,812 controls. We did not find any variants in the 5'UTR, noncoding exon and

  1. Genetic variation in exon 5 of troponin - I gene in hypertrophic cardiomyopathy cases

    Directory of Open Access Journals (Sweden)

    Annapurna S

    2007-01-01

    Full Text Available Background: Cardiomyopathies are a heterogeneous group of heart muscle disorders and are classified, as per the WHO classification, as (1) hypertrophic cardiomyopathy (HCM), (2) dilated cardiomyopathy (DCM), (3) restrictive cardiomyopathy (RCM) and (4) arrhythmogenic right ventricular dysplasia (ARVD), of which HCM and DCM are common. HCM is a complex but relatively common form of inherited heart muscle disease with a prevalence of 1 in 500 individuals and is commonly associated with sarcomeric gene mutations. Cardiac muscle troponin I (TNNI-3) is one such sarcomeric protein and is a subunit of the thin filament-associated troponin-tropomyosin complex involved in calcium regulation of skeletal and cardiac muscle contraction. Mutations in this gene were found to be associated with a history of sudden cardiac death in HCM patients. Aim: The present study therefore aims to identify mutations in the troponin I gene in a set of HCM patients from the Indian population. Materials and Methods: Mutational analyses of 92 HCM cases were carried out by PCR-based SSCP analysis. Results: The study revealed band pattern variation in 3 cases from a group of 92 HCM patients. On sequencing, this band pattern variation revealed base changes: one at nt 2560, a G>T transversion in the exon 5 region with a wobble, and others at nt 2479 and nt 2478, G>C and C>G transversions in the intronic region upstream of exon 5. Further analysis showed that one of the probands had the apical form of hypertrophy and the two others had asymmetric septal hypertrophy. Two of these probands had a family history of the condition. Conclusions: Hence, the study supports earlier reports of involvement of TNNI-3 in the causation of apical and asymmetrical forms of hypertrophy.

  2. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    NARCIS (Netherlands)

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F

    2005-01-01

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences la

  3. Genome-wide gene-gene interaction analysis for next-generation sequencing.

    Science.gov (United States)

    Zhao, Jinying; Zhu, Yun; Xiong, Momiao

    2016-03-01

    The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of its prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the need for a paradigm change in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacting genes in the FHS were replicated in the independent WTCCC study, and we found 24 pairs of significantly interacting genes after applying Bonferroni correction in the EOMI study.
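
The multiple-testing burden mentioned in this record can be made concrete with a short sketch of the Bonferroni correction applied to all gene pairs. The gene count, the p-values, and the function names below are hypothetical; the functional logistic regression statistic itself is not reproduced here.

```python
def n_gene_pairs(n_genes):
    """Number of unordered gene pairs, n * (n - 1) / 2."""
    return n_genes * (n_genes - 1) // 2


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests whose p-value is at or below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Testing every pair among a hypothetical 1,000 genes gives 499,500 tests,
# so the per-test threshold at a family-wise alpha of 0.05 is about 1e-7.
pairs = n_gene_pairs(1000)
print(pairs, 0.05 / pairs)

# Three hypothetical p-values: the per-test threshold is 0.05 / 3.
print(bonferroni_significant([0.001, 0.02, 0.04]))
```

Taking the gene rather than the SNP as the unit of analysis reduces the number of tests, and therefore the severity of this correction, by orders of magnitude.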

  4. Targeted exon sequencing successfully discovers rare causative genes and clarifies the molecular epidemiology of Japanese deafness patients.

    Science.gov (United States)

    Miyagawa, Maiko; Naito, Takehiko; Nishio, Shin-ya; Kamatani, Naoyuki; Usami, Shin-ichi

    2013-01-01

    Target exon resequencing using Massively Parallel DNA Sequencing (MPS) is a new powerful strategy to discover causative genes in rare Mendelian disorders such as deafness. We attempted to identify genomic variations responsible for deafness by massive sequencing of the exons of 112 target candidate genes. By the analysis of 216 randomly selected Japanese deafness patients (120 early-onset and 96 late-detected), who had already been evaluated for common genes/mutations by Invader assay and of which 48 had already been diagnosed, we efficiently identified causative mutations and/or mutation candidates in 57 genes. Approximately 86.6% (187/216) of the patients had at least one mutation. Of the 187 patients, in 69 the etiology of the hearing loss was completely explained. To determine which genes have the greatest impact on deafness etiology, the number of mutations was counted, showing that the number in GJB2 was exceptionally high, followed by mutations in SLC26A4, USH2A, GPR98, MYO15A, COL4A5 and CDH23. The present data suggest that targeted exon sequencing of selected genes using the MPS technology followed by the appropriate filtering algorithm will be able to identify rare responsible genes including new candidate genes for individual patients with deafness, and improve molecular diagnosis. In addition, using a large number of patients, the present study clarified the molecular epidemiology of deafness in Japanese. GJB2 is the most prevalent causative gene, and the major (commonly found) gene mutations cause 30-40% of deafness while the remainder of hearing loss is the result of various rare genes/mutations that have been difficult to diagnose by the conventional one-by-one approach. In conclusion, target exon resequencing using MPS technology is a suitable method to discover common and rare causative genes for a highly heterogeneous monogenic disease like hearing loss.

  5. Analysis of Heredity and Variation of the Arginine Deiminase Gene Sequence of Streptococcus suis Serotype 2 Isolates

    Institute of Scientific and Technical Information of China (English)

    尹国友; 孙婕; 陈兰英; 谢朝晖; 王福梅

    2011-01-01

    The aim was to analyze the causes and epidemic trends of Streptococcus suis serotype 2 (SS2) and to provide a theoretical basis for its prevention and control. Based on the arginine deiminase (ADS) gene sequence published in GenBank, primers were designed and synthesized, and RT-PCR was used to amplify the ADS gene from positive clinical samples from Henan. The amplified products were recovered, cloned into the pMD18-T vector, and sequenced. Nucleotide sequence homology was analyzed with DNAStar and a phylogenetic tree was drawn. Nucleotide homology among isolates HN1-HN8 was 94.9% to 100%; isolates HN5 and HN6, from the same region, were 100% homologous; and homology between the Henan isolates and SX332, published in GenBank, was 95.9% to 99.2%. Thus, ADS gene sequences of isolates from the same region showed higher homology and a closer genetic relationship, whereas those of isolates from different regions were more distant.

  6. Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion.

    Science.gov (United States)

    Xi, Ruibin; Hadjipanayis, Angela G; Luquette, Lovelace J; Kim, Tae-Min; Lee, Eunjung; Zhang, Jianhua; Johnson, Mark D; Muzny, Donna M; Wheeler, David A; Gibbs, Richard A; Kucherlapati, Raju; Park, Peter J

    2011-11-15

    DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but the next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs via minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10× coverage, whereas we could only detect large CNVs (> 15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples as well as deriving error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.
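
The idea of choosing a segmentation by minimizing the Bayesian information criterion can be sketched generically as BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations, and L the maximized likelihood. The sketch below is not BIC-seq: it assumes a piecewise-constant Gaussian model of read depth with a shared variance, and the function name, parameter count, and toy depth profile are all hypothetical.

```python
import math


def gaussian_bic(values, breakpoints):
    """BIC of a piecewise-constant mean model with one shared variance.

    `breakpoints` are sorted indices at which a new segment starts.
    Parameters counted: one mean per segment, one position per breakpoint,
    and the shared variance (an assumed convention for this sketch).
    """
    n = len(values)
    edges = [0] + list(breakpoints) + [n]
    rss = 0.0
    for start, end in zip(edges[:-1], edges[1:]):
        if end <= start:
            raise ValueError("breakpoints must be increasing and inside the data")
        segment = values[start:end]
        mean = sum(segment) / len(segment)
        rss += sum((v - mean) ** 2 for v in segment)
    sigma2 = max(rss / n, 1e-12)  # maximum-likelihood variance, floored
    log_likelihood = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    n_segments = len(edges) - 1
    k = n_segments + (n_segments - 1) + 1
    return k * math.log(n) - 2 * log_likelihood


# Toy read-depth profile (illustrative only): depth doubles at index 20.
depth = [10 + (i % 3 - 1) * 0.5 for i in range(20)] + \
        [20 + (i % 3 - 1) * 0.5 for i in range(20)]

print(gaussian_bic(depth, []))      # single segment
print(gaussian_bic(depth, [20]))    # breakpoint at the true change
print(gaussian_bic(depth, [10]))    # misplaced breakpoint
```

The penalty term k ln(n) is what keeps the criterion from always preferring more segments: an extra breakpoint is accepted only when the gain in likelihood outweighs the cost of the added parameters.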

  7. Copy number variations in alternative splicing gene networks impact lifespan.

    Directory of Open Access Journals (Sweden)

    Joseph T Glessner

    Full Text Available Longevity has a strong genetic component evidenced by family-based studies. Lipoprotein metabolism, FOXO proteins, and insulin/IGF-1 signaling pathways in model systems have shown polygenic variations predisposing to shorter lifespan. To test the hypothesis that rare variants could influence lifespan, we compared the rates of CNVs in healthy children (0-18 years of age) with individuals 67 years or older. CNVs at a significantly higher frequency in the pediatric cohort were considered risk variants impacting lifespan, while those enriched in the geriatric cohort were considered longevity protective variants. We performed a whole-genome CNV analysis on 7,313 children and 2,701 adults of European ancestry genotyped with 302,108 SNP probes. Positive findings were evaluated in an independent cohort of 2,079 pediatric and 4,692 geriatric subjects. We detected 8 deletions and 10 duplications that were enriched in the pediatric group (P=3.33×10^-8 to 1.6×10^-2, unadjusted), while only one duplication was enriched in the geriatric cohort (P=6.3×10^-4). Population stratification correction resulted in 5 deletions and 3 duplications remaining significant (P=5.16×10^-5 to 4.26×10^-2) in the replication cohort. Three deletions and four duplications were significant combined (combined P=3.7×10^-4 to 3.9×10^-2). All associated loci were experimentally validated using qPCR. Evaluation of these genes for pathway enrichment demonstrated ~50% are involved in alternative splicing (P=0.0077, Benjamini and Hochberg corrected). We conclude that genetic variations disrupting RNA splicing could have long-term biological effects impacting lifespan.
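
The Benjamini and Hochberg correction named in this record controls the false discovery rate rather than the family-wise error rate. A minimal sketch of the standard step-up adjustment follows; the function name and the input p-values are hypothetical and are not the study's data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Four hypothetical enrichment p-values.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by m divided by its rank, so the smallest p-values are penalized most heavily and the largest is left unchanged.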

  8. A general scenario of Hox gene inventory variation among major sarcopterygian lineages

    Directory of Open Access Journals (Sweden)

    Wang Chaolin

    2011-01-01

    Full Text Available Background: Hox genes are known to play a key role in shaping the body plan of metazoans. Evolutionary dynamics of these genes is therefore essential in explaining patterns of evolutionary diversity. Among extant sarcopterygians comprising both lobe-finned fishes and tetrapods, our knowledge of the Hox genes and clusters has largely been restricted to several model organisms such as frogs, birds and mammals. Some evolutionary gaps still exist, especially for those groups with derived body morphology or occupying key positions on the tree of life, hindering our understanding of how Hox gene inventory varied along the sarcopterygian lineage. Results: We determined the Hox gene inventory for six sarcopterygian groups: lungfishes, caecilians, salamanders, snakes, turtles and crocodiles by comprehensive PCR survey and genome walking. Variable Hox genes in each of the six sarcopterygian group representatives, compared to the human Hox gene inventory, were further validated for their presence/absence by PCR survey in a number of related species representing a broad evolutionary coverage of the group. Turtles, crocodiles, birds and placental mammals possess the same 39 Hox genes. HoxD12 is absent in snakes, amphibians and probably lungfishes. HoxB13 is lost in frogs and caecilians. Lobe-finned fishes, amphibians and squamate reptiles possess HoxC3. HoxC1 is only present in caecilians and lobe-finned fishes. Similar to coelacanths, lungfishes also possess HoxA14, which is only found in lobe-finned fishes to date. Our Hox gene variation data favor the lungfish-tetrapod, turtle-archosaur and frog-salamander relationships and imply that the loss of HoxD12 is not directly related to digit reduction. Conclusions: Our newly determined Hox inventory data provide a more complete scenario for evolutionary dynamics of Hox genes along the sarcopterygian lineage. Limbless, worm-like caecilians and snakes possess similar Hox gene inventories to animals with

  9. Evolution of variation in presence and absence of genes in bacterial pathways

    Directory of Open Access Journals (Sweden)

    Francis Andrew R

    2012-04-01

    Full Text Available Background: Bacterial genomes exhibit a remarkable degree of variation in the presence and absence of genes, which probably extends to the level of individual pathways. This variation may be a consequence of the significant evolutionary role played by horizontal gene transfer, but might also be explained by the loss of genes through mutation. A challenge is to understand why there would be variation in gene presence within pathways if they confer a benefit only when complete. Results: Here, we develop a mathematical model to study how variation in pathway content is produced by horizontal transfer, gene loss and partial exposure of a population to a novel environment. Conclusions: We discuss the possibility that variation in gene presence acts as cryptic genetic variation on which selection acts when the appropriate environment occurs. We find that a high level of variation in gene presence can be readily explained by decay of the pathway through mutation when there is no longer exposure to the selective environment, or when selection becomes too weak to maintain the genes. In the context of pathway variation the role of horizontal gene transfer is probably the initial introduction of a complete novel pathway rather than in building up the variation in a genome without the pathway.

  10. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  11. Molecular cloning and daily variations of the Period gene in a reef fish Siganus guttatus.

    Science.gov (United States)

    Park, Ji-Gweon; Park, Yong-Ju; Sugama, Nozomi; Kim, Se-Jae; Takemura, Akihiro

    2007-04-01

    As the first step in understanding the molecular oscillation of the circa rhythms in the golden rabbitfish Siganus guttatus--a reef fish with a definite lunar-related rhythmicity--we cloned and sequenced a Period gene (rfPer). The rfPer gene contained an open reading frame that encodes a protein consisting of 1,452 amino acids; this protein is highly homologous to PER proteins of vertebrates including zebrafish. Phylogenetic analyses indicated that the rfPER protein is related to the zebrafish PER1 and PER4. The expression of rfPer mRNA in the whole brain, retina, and liver under light/dark (LD) conditions increased at 06:00 h and decreased at 18:00 h, suggesting that its robust circadian rhythm occurs in neural and peripheral tissues. When daily variation in the expression in rfPer mRNA in the whole brain and cultured pineal gland were examined under LD conditions, similar expression patterns of the gene were observed with an increase around dawn. Under constant light condition, the increased expression of rfPer mRNA in the whole brain disappeared around dawn. The present results demonstrate that rfPer is related to zPer4 and possibly zPer1. The present study is the first report on the Period gene from a marine fish.

  12. Molecular variation at a candidate gene implicated in the regulation of fire ant social behavior.

    Directory of Open Access Journals (Sweden)

    Dietrich Gotzek

    The fire ant Solenopsis invicta and its close relatives display an important social polymorphism involving differences in colony queen number. Colonies are headed by either a single reproductive queen (monogyne form) or multiple queens (polygyne form). This variation in social organization is associated with variation at the gene Gp-9, with monogyne colonies harboring only B-like allelic variants and polygyne colonies always containing b-like variants as well. We describe naturally occurring variation at Gp-9 in fire ants based on 185 full-length sequences, 136 of which were obtained from S. invicta collected over much of its native range. While there is little overall differentiation between most of the numerous alleles observed, a surprising amount is found in the coding regions of the gene, with such substitutions usually causing amino acid replacements. This elevated coding-region variation may result from a lack of negative selection acting to constrain amino acid replacements over much of the protein, different mutation rates or biases in coding and non-coding sequences, negative selection acting with greater strength on non-coding than coding regions, and/or positive selection acting on the protein. Formal selection analyses provide evidence that the latter force played an important role in the basal b-like lineages coincident with the emergence of polygyny. While our data set reveals considerable paraphyly and polyphyly of S. invicta sequences with respect to those of other fire ant species, the b-like alleles of the socially polymorphic species are monophyletic. An expanded analysis of colonies containing alleles of this clade confirmed the invariant link between their presence and expression of polygyny. Finally, our discovery of several unique alleles bearing various combinations of b-like and B-like codons allows us to conclude that no single b-like residue is completely predictive of polygyne behavior and, thus, potentially causally

  13. Nucleotide sequence of the structural gene for tryptophanase of Escherichia coli K-12.

    OpenAIRE

    Deeley, M C; Yanofsky, C

    1981-01-01

    The tryptophanase structural gene, tnaA, of Escherichia coli K-12 was cloned and sequenced. The size, amino acid composition, and sequence of the protein predicted from the nucleotide sequence agree with protein structure data previously acquired by others for the tryptophanase of E. coli B. Physiological data indicated that the region controlling expression of tnaA was present in the cloned segment. Sequence data suggested that a second structural gene of unknown function was located distal ...

  14. Variations and classification of toxic epitopes related to celiac disease among α-gliadin genes from four Aegilops genomes.

    Science.gov (United States)

    Li, Jie; Wang, Shunli; Li, Shanshan; Ge, Pei; Li, Xiaohui; Ma, Wujun; Zeller, F J; Hsam, Sai L K; Yan, Yueming

    2012-07-01

    The α-gliadins are associated with human celiac disease. A total of 23 noninterrupted full open reading frame α-gliadin genes and 19 pseudogenes were cloned and sequenced from the C, M, N, and U genomes of four diploid Aegilops species. Sequence comparison of α-gliadin genes from Aegilops and Triticum species demonstrated extensive allelic variation at the Gli-2 loci of the four Aegilops genomes. Specific structural features were found, including the compositions and variations of two polyglutamine domains (QI and QII) and four T cell stimulatory toxic epitopes. The mean numbers of glutamine residues in the QI domain in the C and N genomes and in the QII domain in the C, N, and U genomes were much higher than those in Triticum genomes, and the QI domain in the C and N genomes and the QII domain in the C, M, N, and U genomes displayed greater length variations. Interestingly, the types and numbers of the four T cell stimulatory toxic epitopes in α-gliadins from the four Aegilops genomes were significantly fewer than those from the Triticum A, B, and D genomes and their progenitor genomes. Relationships were found between the structural variations of the two polyglutamine domains and the distributions of the four T cell stimulatory toxic epitopes, allowing the α-gliadin genes from the Aegilops and Triticum genomes to be classified into three groups.

  15. Post-polyploidisation morphotype diversification associates with gene copy number variation

    Science.gov (United St