Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P
Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (totally 1152 bp; a single gene without introns). More than 90% nucleotide identities were detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 FAD2 selected clones all encoded the FAD2 enzyme (delta-12 desaturase) with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain between 47 and 329 amino acids was identified. The four species clustered separately based on differences in the sequences that were identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2.
Fishman, G A; Stone, E M; Grover, S; Derlacki, D J; Haines, H L; Hockey, R R
To report the spectrum of ophthalmic findings in patients with Stargardt dystrophy or fundus flavimaculatus who have a specific sequence variation in the ABCR gene. Twenty-nine patients with Stargardt dystrophy or fundus flavimaculatus from different pedigrees were identified with possible disease-causing sequence variations in the ABCR gene from a group of 66 patients who were screened for sequence variations in this gene. Patients underwent a routine ocular examination, including slitlamp biomicroscopy and a dilated fundus examination. Fluorescein angiography was performed on 22 patients, and electroretinographic measurements were obtained on 24 of 29 patients. Kinetic visual fields were measured with a Goldmann perimeter in 26 patients. Single-strand conformation polymorphism analysis and DNA sequencing were used to identify variations in coding sequences of the ABCR gene. Three clinical phenotypes were observed among these 29 patients. In phenotype I, 9 of 12 patients had a sequence change in exon 42 of the ABCR gene in which the amino acid glutamic acid was substituted for glycine (Gly1961Glu). In only 4 of these 9 patients was a second possible disease-causing mutation found on the other ABCR allele. In addition to an atrophic-appearing macular lesion, phenotype I was characterized by localized perifoveal yellowish white flecks, the absence of a dark choroid, and normal electroretinographic amplitudes. Phenotype II consisted of 10 patients who showed a dark choroid and more diffuse yellowish white flecks in the fundus. None exhibited the Gly1961Glu change. Phenotype III consisted of 7 patients who showed extensive atrophic-appearing changes of the retinal pigment epithelium. Electroretinographic cone and rod amplitudes were reduced. One patient showed the Gly1961Glu change. A wide variation in clinical phenotype can occur in patients with sequence changes in the ABCR gene. In individual patients, a certain phenotype seems to be associated with the presence of
Edberg Jeffrey C
Full Text Available Abstract Background Copy number variations (CNVs of the gene CC chemokine ligand 3-like1 (CCL3L1 have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and, CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70-100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American, respectively. The concordance between the assays ranged from 0.44-0.83 suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogenous results. Sequence information to determine CNV of the three genes separately would allow to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
Full Text Available The adiponectin gene (ADIPOQ plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5 of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2 were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3 and three SNPs were observed. Two patterns (A4-B4, A5-B5 and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg. In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits.
Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David
Background Escherichia coli exists in commensal and pathogenic forms. By measuring the variation of individual genes across more than a hundred sequenced genomes, gene variation can be studied in detail, including the number of mutations found for any given gene. This knowledge will be useful...... for creating better phylogenies, for determination of molecular clocks and for improved typing techniques. Results We find 3,051 gene clusters/families present in at least 95% of the genomes and 1,702 gene clusters present in 100% of the genomes. The former 'soft core' of about 3,000 gene families is perhaps...... more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...
Full Text Available Genetic diversity of T. gondii is a concern of many studies, due to the biological and epidemiological diversity of this parasite. The present study examined sequence variation in rhoptry protein 17 (ROP17 gene among T. gondii isolates from different hosts and geographical regions. The rop17 gene was amplified and sequenced from 10 T. gondii strains, and phylogenetic relationship among these T. gondii strains was reconstructed using maximum parsimony (MP, neighbor-joining (NJ, and maximum likelihood (ML analyses. The partial rop17 gene sequences were 1375 bp in length and A+T contents varied from 49.45% to 50.11% among all examined T. gondii strains. Sequence analysis identified 33 variable nucleotide positions (2.1%, 16 of which were identified as transitions. Phylogeny reconstruction based on rop17 gene data revealed two major clusters which could readily distinguish Type I and Type II strains. Analyses of sequence variations in nucleotides and amino acids among these strains revealed high ratio of nonsynonymous to synonymous polymorphisms (>1, indicating that rop17 shows signs of positive selection. This study demonstrated the existence of slightly high sequence variability in the rop17 gene sequences among T. gondii strains from different hosts and geographical regions, suggesting that rop17 gene may represent a new genetic marker for population genetic studies of T. gondii isolates.
Chakraborty, R.; Reiniš, Milan; Rostron, T.; Philpott, S.; Dong, T.; D'Agostino, A.; Musoke, R.; de Silva, E.; Stumpf, M.; Weiser, B.; Burger, H.; Rowland-Jones, S.L.
Roč. 7, č. 2 (2006), s. 75-84 ISSN 1464-2662 Grant - others:Fogarty International Center, NIH(US) 3D43TW00915; NIH(US) RO1 AI 42555 Institutional research plan: CEZ:AV0Z50520514 Keywords : HIV-1 nef gene * non-clade B * Kenya Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.674, year: 2006
Mar 27, 2012 ... Toxoplasma gondii can infect a wide range of hosts including mammals and birds, causing toxoplasmosis which is one of the most common parasitic zoonoses worldwide. The present study examined sequence variation in rhoptry 7 (ROP7) gene among different T. gondii isolates from different hosts and ...
Full Text Available DNA sequence polymorphism in a regulatory protein can have a widespread transcriptional effect. Here we present a computational approach for analyzing modules of genes with a common regulation that are affected by specific DNA polymorphisms. We identify such regulatory-linkage modules by integrating genotypic and expression data for individuals in a segregating population with complementary expression data of strains mutated in a variety of regulatory proteins. Our procedure searches simultaneously for groups of co-expressed genes, for their common underlying linkage interval, and for their shared regulatory proteins. We applied the method to a cross between laboratory and wild strains of S. cerevisiae, demonstrating its ability to correctly suggest modules and to outperform extant approaches. Our results suggest that middle sporulation genes are under the control of polymorphism in the sporulation-specific tertiary complex Sum1p/Rfm1p/Hst1p. In another example, our analysis reveals novel inter-relations between Swi3 and two mitochondrial inner membrane proteins underlying variation in a module of aerobic cellular respiration genes. Overall, our findings demonstrate that this approach provides a useful framework for the systematic mapping of quantitative trait loci and their role in gene expression variation.
Barber, Lisa M; McGrath, Helen E N; Meyer, Stefan; Will, Andrew M; Birch, Jillian M; Eden, Osborn B; Taylor, G Malcolm
The extent to which genetic susceptibility contributes to the causation of childhood acute myeloid leukaemia (AML) is not known. The inherited bone marrow failure disorder Fanconi anaemia (FA) carries a substantially increased risk of AML, raising the possibility that constitutional variation in the FA (FANC) genes is involved in the aetiology of childhood AML. We have screened genomic DNA extracted from remission blood samples of 97 children with sporadic AML and 91 children with sporadic acute lymphoblastic leukaemia (ALL), together with 104 cord blood DNA samples from newborn children, for variations in the Fanconi anaemia group C (FANCC) gene. We found no evidence of known FANCC pathogenic mutations in children with AML, ALL or in the cord blood samples. However, we detected 12 different FANCC sequence variants, of which five were novel to this study. Among six FANCC variants leading to amino-acid substitutions, one (S26F) was present at a fourfold greater frequency in children with AML than in the cord blood samples (odds ratio: 4.09, P = 0.047; 95% confidence interval 1.08-15.54). Our results thus do not exclude the possibility that this polymorphic variant contributes to the risk of a small proportion of childhood AML.
Full Text Available In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed and followed by bioinformatics analyses using ESEfinder as well as MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085 +12delA and SGCG (c.*102A/C genes. These variations were present in 210 tested healthy controls from Tunisian, Moroccan, Algerian, Lebanese and French populations suggesting that they represent novel polymorphisms within LAMA2 and SGCG genes sequences. ESEfinder showed that the c.*102A/C substitution created a new exon splicing enhancer in the 3'UTR of SGCG genes, whereas the c.6085 +12delA deletion was situated in the base pairing region between LAMA2 mRNA and the U1snRNA spliceosomal components. The RNA structure analyses showed that both variations modulated RNA secondary structure. Our results are suggestive of correlations between mRNA folding and the recruitment of spliceosomal components mediating splicing, including SR proteins. The contribution of common sequence variations to mRNA structural and functional diversity will contribute to a better study of gene expression.
Zhao, Yu; Zhou, Donghui; Chen, Jia; Sun, Xiaolin
Toxoplasma gondii, as a eukaryotic parasite of the phylum Apicomplexa, can infect almost all the warm-blooded animals and humans, causing toxoplasmosis. Rhoptry neck proteins (RONs) play a key role in the invasion process of T. gondii and are potential vaccine candidate molecules against toxoplasmosis. The present study examined sequence variation in the rhoptry neck protein 10 (TgRON10) gene among 10 T. gondii isolates from different hosts and geographical locations from Lanzhou province during 2014, and compared with the corresponding sequences of strains ME49 and VEG obtained from the ToxoDB database, using polymerase chain reaction (PCR) amplification, sequence analysis, and phylogenetic reconstruction by Bayesian inference (BI) and maximum parsimony (MP). Analysis of all the 12 TgRON10 genomic and cDNA sequences revealed 7 exons and 6 introns in the TgRON10 gDNA. The complete genomic sequence of the TgRON10 gene ranged from 4759 bp to 4763 bp, and sequence variation was 0-0.6% among the 12 T. gondii isolates, indicating a low sequence variation in TgRON10 gene. Phylogenetic analysis of TgRON10 sequences showed that the cluster of the 12 T. gondii isolates was not completely consistent with their respective genotypes. TgRON10 gene is not a suitable genetic marker for the differentiation of T. gondii isolates from different hosts and geographical locations, but may represent a potential vaccine candidate against toxoplasmosis, worth further studies.
Levran, Orna; Diotti, Raffaella; Pujara, Kanan; Batish, Sat D; Hanenberg, Helmut; Auerbach, Arleen D
Fanconi anemia (FA) is an autosomal recessive disorder that is defined by cellular hypersensitivity to DNA cross-linking agents, and is characterized clinically by developmental abnormalities, progressive bone-marrow failure, and predisposition to leukemia and solid tumors. There is extensive genetic heterogeneity, with at least 11 different FA complementation groups. FA-A is the most common group, accounting for approximately 65% of all affected individuals. The mutation spectrum of the FANCA gene, located on chromosome 16q24.3, is highly heterogeneous. Here we summarize all sequence variations (mutations and polymorphisms) in FANCA described in the literature and listed in the Fanconi Anemia Mutation Database as of March 2004, and report 61 novel FANCA mutations identified in FA patients registered in the International Fanconi Anemia Registry (IFAR). Thirty-eight novel SNPs, previously unreported in the literature or in dbSNP, were also identified. We studied the segregation of common FANCA SNPs in FA families to generate haplotypes. We found that FANCA SNP data are highly useful for carrier testing, prenatal diagnosis, and preimplantation genetic diagnosis, particularly when the disease-causing mutations are unknown. Twenty-two large genomic deletions were identified by detection of apparent homozygosity for rare SNPs. In addition, a conserved SNP haplotype block spanning at least 60 kb of the FANCA gene was identified in individuals from various ethnic groups. (c) 2005 Wiley-Liss, Inc.
Mikaeili, F; Mirhendi, H; Mohebali, M; Hosseini, M; Sharbatkhori, M; Zarei, Z; Kia, E B
The study was conducted to determine the sequence variation in two mitochondrial genes, namely cytochrome c oxidase 1 (pcox1) and NADH dehydrogenase 1 (pnad1) within and among isolates of Toxocara cati, Toxocara canis and Toxascaris leonina. Genomic DNA was extracted from 32 isolates of T. cati, 9 isolates of T. canis and 19 isolates of T. leonina collected from cats and dogs in different geographical areas of Iran. Mitochondrial genes were amplified by polymerase chain reaction (PCR) and sequenced. Sequence data were aligned using the BioEdit software and compared with published sequences in GenBank. Phylogenetic analysis was performed using Bayesian inference and maximum likelihood methods. Based on pairwise comparison, intra-species genetic diversity within Iranian isolates of T. cati, T. canis and T. leonina amounted to 0-2.3%, 0-1.3% and 0-1.0% for pcox1 and 0-2.0%, 0-1.7% and 0-2.6% for pnad1, respectively. Inter-species sequence variation among the three ascaridoid nematodes was significantly higher, being 9.5-16.6% for pcox1 and 11.9-26.7% for pnad1. Sequence and phylogenetic analysis of the pcox1 and pnad1 genes indicated that there is significant genetic diversity within and among isolates of T. cati, T. canis and T. leonina from different areas of Iran, and these genes can be used for studying genetic variation of ascaridoid nematodes.
Wyszyńska-Koko, J; Kurył, J
MYF6 gene codes for the bHLH transcription factor belonging to MyoD family. Its expression accompanies the processes of differentiation and maturation of myotubes during embriogenesis and continues on a relatively high level after birth, affecting the muscle phenotype. The porcine MYF6 gene was amplified and sequenced and compared with MYF6 gene sequences of other species. The amino acid sequence was deduced and an interspecies homology analysis was performed. Myf-6 protein shows a high conservation among species of 99 and 97% identity when comparing pig with cow and human, respectively, and of 93% when comparing pig with mouse and rat. The single nucleotide polymorphism (SNP) was revealed within the promoter region, which appeared to be T --> C transition recognized by a MspI restriction enzyme.
Full Text Available Background: Toxoplasma gondii, as a eukaryotic parasite of the phylum Apicomplexa, can infect almost all the warm-blooded animals and humans, causing toxoplasmosis. Rhoptry neck proteins (RONs play a key role in the invasion process of T. gondii and are potential vaccine candidate molecules against toxoplasmosis.Methods: The present study examined sequence variation in the rhoptry neck protein 10 (TgRON10 gene among 10 T. gondii isolates from different hosts and geographical locations from Lanzhou province during 2014, and compared with the corresponding sequences of strains ME49 and VEG obtained from the ToxoDB database, using polymerase chain reaction (PCR amplification, sequence analysis, and phylogenetic reconstruction by Bayesian inference (BI and maximum parsimony (MP. Results: Analysis of all the 12 TgRON10 genomic and cDNA sequences revealed 7 exons and 6 introns in the TgRON10 gDNA. The complete genomic sequence of the TgRON10 gene ranged from 4759 bp to 4763 bp, and sequence variation was 0-0.6% among the 12 T. gondii isolates, indicating a low sequence variation in TgRON10 gene. Phylogenetic analysis of TgRON10 sequences showed that the cluster of the 12 T. gondii isolates was not completely consistent with their respective genotypes.Conclusion: TgRON10 gene is not a suitable genetic marker for the differentiation of T. gondii isolates from different hosts and geographical locations, but may represent a potential vaccine candidate against toxoplasmosis, worth further studies.
Abildgaard, L; Engberg, RM; Pedersen, Karl
The aim of the present study was to analyse the genetic diversity of the alpha-toxin encoding plc gene and the variation in a-toxin production of Clostridium perfringens type A strains isolated from presumably healthy chickens and chickens suffering from either necrotic enteritis (NE) or cholangio......-hepatitis. The a-toxin encoding plc genes from 60 different pulsed-field gel electrophoresis (PFGE) types (strains) of C perfringens were sequenced and translated in silico to amino acid sequences and the a-toxin production was investigated in batch cultures of 45 of the strains using an enzyme...
Full Text Available The major histocompatibility complex (MHC is a large multigene coding for glycoproteins that play a key role in the initiation of immune responses in vertebrates. For a better understanding of the immunologic diversity in thriving marine mammal species, the sequence variation of the exon 2 region of MHC DQB locus was analyzed in 42 bottlenose dolphins (Tursiops truncatus collected from strandings and fishery bycatch in Taiwanese waters. The 172 bp sequences amplified showed no more than two alleles in each individual. The high proportion of non-synonymous nucleotide substitutions and the moderate amount of variation suggest positive selection pressure on this locus, arguing against a reduction in the marine environment selection pressure. The phylogenetic relationship among DQB exon 2 sequences of T. truncatus and other cetaceans did not coincide with taxonomic relationship, indicating a trans-species evolutionary pattern.
Shen, P; Wang, F; Underhill, P A; Franco, C; Yang, W H; Roxas, A; Sung, R; Lin, A A; Hyman, R W; Vollrath, D; Davis, R W; Cavalli-Sforza, L L; Oefner, P J
Some insight into human evolution has been gained from the sequencing of four Y chromosome genes. Primary genomic sequencing determined gene SMCY to be composed of 27 exons that comprise 4,620 bp of coding sequence. The unfinished sequencing of the 5' portion of gene UTY1 was completed by primer walking, and a total of 20 exons were found. By using denaturing HPLC, these two genes, as well as DBY and DFFRY, were screened for polymorphic sites in 53-72 representatives of the five continents. A total of 98 variants were found, yielding nucleotide diversity estimates of 2.45 x 10(-5), 5. 07 x 10(-5), and 8.54 x 10(-5) for the coding regions of SMCY, DFFRY, and UTY1, respectively, with no variant having been observed in DBY. In agreement with most autosomal genes, diversity estimates for the noncoding regions were about 2- to 3-fold higher and ranged from 9. 16 x 10(-5) to 14.2 x 10(-5) for the four genes. Analysis of the frequencies of derived alleles for all four genes showed that they more closely fit the expectation of a Luria-Delbrück distribution than a distribution expected under a constant population size model, providing evidence for exponential population growth. Pairwise nucleotide mismatch distributions date the occurrence of population expansion to approximately 28,000 years ago. This estimate is in accord with the spread of Aurignacian technology and the disappearance of the Neanderthals.
Full Text Available Abstract Background Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s or mutation(s targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti insecticidal toxins. Results The genome of a Bti-resistant and a Bti-susceptible strains was surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations, as well as a significant under-expression compared to the susceptible strain. Conclusion Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full
Erena A. Edae
Full Text Available Functional markers are needed for key genes involved in drought tolerance to improve selection for crop yield under moisture stress conditions. The objectives of this study were to (i characterize five drought tolerance candidate genes, namely dehydration responsive element binding 1A (, enhanced response to abscisic acid ( and , and fructan 1-exohydrolase ( and , in wheat ( L. for nucleotide and haplotype diversity, Tajima’s D value, and linkage disequilibrium (LD and (ii associate within-gene single nucleotide polymorphisms (SNPs with phenotypic traits in a spring wheat association mapping panel ( = 126. Field trials were grown under contrasting moisture regimes in Greeley, CO, and Melkassa, Ethiopia, in 2010 and 2011. Genome-specific amplification and DNA sequence analysis of the genes identified SNPs and revealed differences in nucleotide and haplotype diversity, Tajima’s D, and patterns of LD. showed associations (false discovery rate adjusted probability value = 0.1 with normalized difference vegetation index, heading date, biomass, and spikelet number. Both and were associated with harvest index, flag leaf width, and leaf senescence. was associated with grain yield, and was associated with thousand kernel weight and test weight. If validated in relevant genetic backgrounds, the identified marker–trait associations may be applied to functional marker-assisted selection.
Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J
Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value <0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10 -5 ; q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10 -5 ; q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on
Iacocca, Michael A; Wang, Jian; Dron, Jacqueline S; Robinson, John F; McIntyre, Adam D; Cao, Henian; Hegele, Robert A
Familial hypercholesterolemia (FH) is a heritable condition of severely elevated LDL cholesterol, caused predominantly by autosomal codominant mutations in the LDL receptor gene ( LDLR ). In providing a molecular diagnosis for FH, the current procedure often includes targeted next-generation sequencing (NGS) panels for the detection of small-scale DNA variants, followed by multiplex ligation-dependent probe amplification (MLPA) in LDLR for the detection of whole-exon copy number variants (CNVs). The latter is essential because ∼10% of FH cases are attributed to CNVs in LDLR ; accounting for them decreases false negative findings. Here, we determined the potential of replacing MLPA with bioinformatic analysis applied to NGS data, which uses depth-of-coverage analysis as its principal method to identify whole-exon CNV events. In analysis of 388 FH patient samples, there was 100% concordance in LDLR CNV detection between these two methods: 38 reported CNVs identified by MLPA were also successfully detected by our NGS method, while 350 samples negative for CNVs by MLPA were also negative by NGS. This result suggests that MLPA can be removed from the routine diagnostic screening for FH, significantly reducing associated costs, resources, and analysis time, while promoting more widespread assessment of this important class of mutations across diagnostic laboratories. Copyright © 2017 by the American Society for Biochemistry and Molecular Biology, Inc.
Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....
Proctor, R.H.; Hove, van F.; Susca, A.; Stea, A.; Busman, M.; Lee, van der T.A.J.; Waalwijk, C.; Moretti, A.
In Fusarium, the ability to produce fumonisins is governed by a 17-gene fumonisin biosynthetic gene (FUM) cluster. Here, we examined the cluster in F. oxysporum strain O-1890 and nine other species selected to represent a wide range of the genetic diversity within the GFSC.
Ginther, C; Corach, D; Penacino, G A; Rey, J A; Carnese, F R; Hutz, M H; Anderson, A; Just, J; Salzano, F M; King, M C
DNA samples from 60 Mapuche Indians, representing 39 maternal lineages, were genetically characterized for (1) nucleotide sequences of the mtDNA control region; (2) presence or absence of a nine base duplication in mtDNA region V; (3) HLA loci DRB1 and DQA1; (4) variation at three nuclear genes with short tandem repeats; and (5) variation at the polymorphic marker D2S44. The genetic profile of the Mapuche population was compared to other Amerinds and to worldwide populations. Two highly polymorphic portions of the mtDNA control region, comprising 650 nucleotides, were amplified by the polymerase chain reaction (PCR) and directly sequenced. The 39 maternal lineages were defined by two or three generation families identified by the Mapuches. These 39 lineages included 19 different mtDNA sequences that could be grouped into four classes. The same classes of sequences appear in other Amerinds from North, Central, and South American populations separated by thousands of miles, suggesting that the origin of the mtDNA patterns predates the migration to the Americas. The mtDNA sequence similarity between Amerind populations suggests that the migration throughout the Americas occurred rapidly relative to the mtDNA mutation rate. HLA DRB1 alleles 1602 and 1402 were frequent among the Mapuches. These alleles also occur at high frequency among other Amerinds in North and South America, but not among Spanish, Chinese or African-American populations. The high frequency of these alleles throughout the Americas, and their specificity to the Americas, supports the hypothesis that Mapuches and other Amerind groups are closely related.(ABSTRACT TRUNCATED AT 250 WORDS)
Sakai, Takahiro; Kikkawa, Yoshiaki; Tsuchiya, Kimiyuki; Harada, Masashi; Kanoe, Masamitsu; Yoshiyuki, Mizuko; Yonekawa, Hiromichi
Microchiroptera have diversified into many species whose size and the shapes of the complicated ear and nose have been adapted to their echolocation abilities. Their speciation processes, and intra- and interspecies relationships are still under discussion. Here we report on the geographical variation of Japanese Rhinolophus ferrumequinum and R. cornutus using the complete sequence of the mitochondrial cytochrome b gene to clarify the phylogenetic positions of the 2 species as well as that of Rhinolophidae within the Microchiroptera. We have found that sequence divergence values within each of the 2 species are unexpectedly low (0.07%-0.94%). We have also found that there is no local specificity of their mtCytb alleles. On the other hand, the divergence values for Japanese Microchiroptera (12.7%-16.6%) are much higher than those for other mammalian genera. Similarly, the values among five genera of Vespertilionidae were 20.5%-27.3%. Phylogenetic analysis shows that the 2 species of family Rhinolophidae in the suborder Microchiroptera belong to the Megachiroptera cluster in the constructed maximum parsimony tree. These results suggest that the speciation of Rhinolophidae involved its divergence as an independent lineage from other Microchiroptera, and other microbats might be paraphyletic. In addition, the tree also shows that the order Chiroptera is monophylitic, and the closest group to Chiroptera is the ungulates.
Tyfield, L.A. [Southmead Hospital, Bristol (United Kingdom)]|[Univ. of Bristol (United Kingdom); Stephenson, A. [Southmead Hospital, Bristol (United Kingdom); Cockburn, F. [Royal Hospital for Sick Children, Glasgow (United Kingdom)] [and others
Using mutation and haplotype analysis, we have examined the phenylalanine hydroxylase gene in the phenylketonuria populations of four geographical areas of the British Isles: the west of Scotland, southern Wales, and southwestern and southeastern England. The enormous genetic diversity of this locus within the British Isles is demonstrated in the large number of different mutations characterized and in the variety of genetic backgrounds on which individual mutations are found. Allele frequencies of the more common mutations exhibited significant nonrandom distribution in a north/south differentiation. Differences between the west of Scotland and southwestern England may be related to different events in the recent and past histories of their respective populations. Similarities between southern Wales and southeastern England are likely to reflect the heterogeneity that is seen in and around two large capital cities. Finally, comparison with more recently colonized areas of the world corroborates the genealogical origin by range expansion of several mutations. 38 refs., 2 tabs.
Full Text Available Abstract Background Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.
Sato, Mitsuhiko P; Makino, Takashi; Kawata, Masakado
Understanding the evolutionary forces that influence variation in gene regulatory regions in natural populations is an important challenge for evolutionary biology because natural selection for such variations could promote adaptive phenotypic evolution. Recently, whole-genome sequence analyses have identified regulatory regions subject to natural selection. However, these studies could not identify the relationship between sequence variation in the detected regions and change in gene expression levels. We analyzed sequence variations in core promoter regions, which are critical regions for gene regulation in higher eukaryotes, in a natural population of Drosophila melanogaster, and identified core promoter sequence variations associated with differences in gene expression levels subjected to natural selection. Among the core promoter regions whose sequence variation could change transcription factor binding sites and explain differences in expression levels, three core promoter regions were detected as candidates associated with purifying selection or selective sweep and seven as candidates associated with balancing selection, excluding the possibility of linkage between these regions and core promoter regions. CHKov1, which confers resistance to the sigma virus and related insecticides, was identified as core promoter regions that has been subject to selective sweep, although it could not be denied that selection for variation in core promoter regions was due to linked single nucleotide polymorphisms in the regulatory region outside core promoter regions. Nucleotide changes in core promoter regions of CHKov1 caused the loss of two basal transcription factor binding sites and acquisition of one transcription factor binding site, resulting in decreased gene expression levels. Of nine core promoter regions regions associated with balancing selection, brat, and CG9044 are associated with neuromuscular junction development, and Nmda1 are associated with learning
Rivera, Andrea; White, Karen; Stöhr, Heidi; Steiner, Klaus; Hemmrich, Nadine; Grimm, Timo; Jurklies, Bernhard; Lorenz, Birgit; Scholl, Hendrik P. N.; Apfelstedt-Sylla, Eckhart; Weber, Bernhard H. F.
Stargardt disease (STGD) is a common autosomal recessive maculopathy of early and young-adult onset and is caused by alterations in the gene encoding the photoreceptor-specific ATP-binding cassette (ABC) transporter (ABCA4). We have studied 144 patients with STGD and 220 unaffected individuals ascertained from the German population, to complete a comprehensive, population-specific survey of the sequence variation in the ABCA4 gene. In addition, we have assessed the proposed role for ABCA4 in ...
Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitution but none of them were associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680
Masters, N; Christie, M; Katouli, M; Stratton, H
We investigated the usefulness of the β-d-glucuronidase gene variance in Escherichia coli as a microbial source tracking tool using a novel algorithm for comparison of sequences from a prescreened set of host-specific isolates using a high-resolution PhP typing method. A total of 65 common biochemical phenotypes belonging to 318 E. coli strains isolated from humans and domestic and wild animals were analysed for nucleotide variations at 10 loci along a 518 bp fragment of the 1812 bp β-d-glucuronidase gene. Neighbour-joining analysis of loci variations revealed 86 (76.8%) human isolates and 91.2% of animal isolates were correctly identified. Pairwise hierarchical clustering improved assignment; where 92 (82.1%) human and 204 (99%) animal strains were assigned to their respective cluster. Our data show that initial typing of isolates and selection of common types from different hosts prior to analysis of the β-d-glucuronidase gene sequence improves source identification. We also concluded that numerical profiling of the nucleotide variations can be used as a valuable approach to differentiate human from animal E. coli. This study signifies the usefulness of the β-d-glucuronidase gene as a marker for differentiating human faecal pollution from animal sources.
McRobie, Helen R; King, Linda M; Fanutti, Cristina; Coussons, Peter J; Moncrief, Nancy D; Thomas, Alison P M
Sequence variations in the melanocortin 1 receptor (MC1R) gene are associated with melanism in many different species of mammals, birds, and reptiles. The gray squirrel (Sciurus carolinensis), found in the British Isles, was introduced from North America in the late 19th century. Melanism in the British gray squirrel is associated with a 24-bp deletion in the MC1R. To investigate the origin of this mutation, we sequenced the MC1R of 95 individuals including 44 melanic gray squirrels from both the British Isles and North America. Melanic gray squirrels of both populations had the same 24-bp deletion associated with melanism. Given the significant deletion associated with melanism in the gray squirrel, we sequenced the MC1R of both wild-type and melanic fox squirrels (Sciurus niger) (9 individuals) and red squirrels (Sciurus vulgaris) (39 individuals). Unlike the gray squirrel, no association between sequence variation in the MC1R and melanism was found in these 2 species. We conclude that the melanic gray squirrel found in the British Isles originated from one or more introductions of melanic gray squirrels from North America. We also conclude that variations in the MC1R are not associated with melanism in the fox and red squirrels.
Lee, Chao-Hung; Helweg-Larsen, Jannik; Tang, Xing; Jin, Shaoling; Li, Baozheng; Bartlett, Marilyn S.; Lu, Jang-Jih; Lundgren, Bettina; Lundgren, Jens D.; Olsson, Mats; Lucas, Sebastian B.; Roux, Patricia; Cargnel, Antonietta; Atzori, Chiara; Matos, Olga; Smith, James W.
Pneumocystis carinii f. sp. hominis isolates from 207 clinical specimens from nine countries were typed based on nucleotide sequence variations in the internal transcribed spacer regions I and II (ITS1 and ITS2, respectively) of rRNA genes. The number of ITS1 nucleotides has been revised from the previously reported 157 bp to 161 bp. Likewise, the number of ITS2 nucleotides has been changed from 177 to 192 bp. The number of ITS1 sequence types has increased from 2 to 15, and that of ITS2 has increased from 3 to 14. The 15 ITS1 sequence types are designated types A through O, and the 14 ITS2 types are named types a through n. A total of 59 types of P. carinii f. sp. hominis were found in this study. PMID:9508304
Full Text Available Salih Coşkun,1 Sefer Varol,2 Hasan H Özdemir,2 Sercan Bulut Çelik,3 Metin Balduz,4 Mehmet Akif Camkurt,5 Abdullah Çim,1 Demet Arslan,2 Mehmet Uğur Çevik2 1Department of Medical Genetics, 2Department of Neurology, Faculty of Medicine, Dicle University, Diyarbakir, 3Family Health Center, Batman, 4Department of Neurology, Şanlıurfa Education and Research Hospital, Şanlıurfa, 5Department of Psychiatry, Afsin State Hospital, Kahramanmaraş, Turkey Abstract: Migraine pathogenesis involves a complex interaction between hormones, neurotransmitters, and inflammatory pathways, which also influence the migraine phenotype. The Mediterranean fever gene (MEFV encodes the pyrin protein. The major role of pyrin appears to be in the regulation of inflammation activity and the processing of the cytokine pro-interleukin-1β, and this cytokine plays a part in migraine pathogenesis. This study included 220 migraine patients and 228 healthy controls. Eight common missense mutations of the MEFV gene, known as M694V, M694I, M680I, V726A, R761H, K695R, P369S, and E148Q, were genotyped using real-time polymerase chain reaction with 5' nuclease assays, which include sequence specific primers, and probes with a reporter dye. When mutations were evaluated separately among the patient and control groups, only the heterozygote E148Q carrier was found to be significantly higher in the control group than in the patient group (P=0.029, odds ratio [95% confidence interval] =0.45 [0.21–0.94]. In addition, the frequency of the homozygote and the compound heterozygote genotype carrier was found to be significantly higher in patients (n=8, 3.6% than in the control group (n=1, 0.4% (P=0.016, odds ratio [95% confidence interval] =8.57 [1.06–69.07]. However, there was no statistically significant difference in the allele frequencies of MEFV mutations between the patients and the healthy control group (P=0.964. In conclusion, the results of the present study suggest that
Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi
With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. GSVML was developed as
Ito, Katsura; Nishikawa, Hiroshi; Shimada, Takuji; Ogawa, Kohei; Minamiya, Yukio; Tomoda, Masafumi; Nakahira, Kengo; Kodama, Rika; Fukuda, Tatsuya; Arakawa, Ryo
Pilophorus typicus (Distant) (Heteroptera: Miridae) is a predatory bug occurring in East, Southeast, and South Asia. Because the active stages of P. typicus prey on various agricultural pest insects and mites, this species is a candidate insect as an indigenous natural enemy for use in biological control programs. However, the mass releasing of introduced natural enemies into agricultural fields may incur the risk of affecting the genetic integrity of species through hybridization with a local population. To clarify the genetic characteristics of the Japanese populations of P. typicus two portions of the mitochondrial DNA, the cytochrome oxidase subunit I (COI) (534 bp) and the cytochrome B (cytB) (217 bp) genes, were sequenced for 64 individuals collected from 55 localities in a wide range of Japan. Totals of 18 and 10 haplotypes were identified for the COI and cytB sequences, respectively (25 haplotypes over regions). Phylogenetic analysis using the maximum likelihood method revealed the existence of two genetically distinct groups in P. typicus in Japan. These groups were distributed in different geographic ranges: one occurred mainly from the Pacific coastal areas of the Kii Peninsula, the Shikoku Island, and the Ryukyu Islands; whereas the other occurred from the northern Kyushu district to the Kanto and Hokuriku districts of mainland Japan. However, both haplotypes were found in a single locality of the southern coast of the Shikoku Island. COI phylogeny incorporating other Pilophorus species revealed that these groups were only recently differentiated. Therefore, use of a certain population of P. typicus across its distribution range should be done with caution because genetic hybridization may occur.
Reamon-Buettner, Stella Marie; Borlak, Jürgen
'Epigenetics' is a heritable phenomenon without change in primary DNA sequence. In recent years, this field has attracted much attention as more epigenetic controls of gene activities are being discovered. Such epigenetic controls ensue from an interplay of DNA methylation, histone modifications, and RNA-mediated pathways from non-coding RNAs, notably silencing RNA (siRNA) and microRNA (miRNA). Although epigenetic regulation is inherent to normal development and differentiation, this can be misdirected leading to a number of diseases including cancer. All the same, many of the processes can be reversed offering a hope for epigenetic therapies such as inhibitors of enzymes controlling epigenetic modifications, specifically DNA methyltransferases, histone deacetylases, and RNAi therapeutics. 'In utero' or early life exposures to dietary and environmental exposures can have a profound effect on our epigenetic code, the so-called 'epigenome', resulting in birth defects and diseases developed later in life. Indeed, examples are accumulating in which environmental exposures can be attributed to epigenetic causes, an encouraging edge towards greater understanding of the contribution of epigenetic influences of environmental exposures. Routine analysis of epigenetic modifications as part of the mechanisms of action of environmental contaminants is in order. There is, however, an explosion of research in the field of epigenetics and to keep abreast of these developments could be a challenge. In this paper, we provide an overview of epigenetic mechanisms focusing on recent reviews and studies to serve as an entry point into the realm of 'environmental epigenetics'.
Iacocca, Michael A.; Wang, Jian; Dron, Jacqueline S.; Robinson, John F.; McIntyre, Adam D.; Cao, Henian
Familial hypercholesterolemia (FH) is a heritable condition of severely elevated LDL cholesterol, caused predominantly by autosomal codominant mutations in the LDL receptor gene (LDLR). In providing a molecular diagnosis for FH, the current procedure often includes targeted next-generation sequencing (NGS) panels for the detection of small-scale DNA variants, followed by multiplex ligation-dependent probe amplification (MLPA) in LDLR for the detection of whole-exon copy number variants (CNVs). The latter is essential because ∼10% of FH cases are attributed to CNVs in LDLR; accounting for them decreases false negative findings. Here, we determined the potential of replacing MLPA with bioinformatic analysis applied to NGS data, which uses depth-of-coverage analysis as its principal method to identify whole-exon CNV events. In analysis of 388 FH patient samples, there was 100% concordance in LDLR CNV detection between these two methods: 38 reported CNVs identified by MLPA were also successfully detected by our NGS method, while 350 samples negative for CNVs by MLPA were also negative by NGS. This result suggests that MLPA can be removed from the routine diagnostic screening for FH, significantly reducing associated costs, resources, and analysis time, while promoting more widespread assessment of this important class of mutations across diagnostic laboratories. PMID:28874442
Dawn B. Goldsmith
Full Text Available Deep sequencing of the viral phoH gene, a host-derived auxiliary metabolic gene, was used to track viral diversity throughout the water column at the Bermuda Atlantic Time-series Study (BATS site in the summer (September and winter (March of three years. Viral phoH sequences reveal differences in the viral communities throughout a depth profile and between seasons in the same year. Variation was also detected between the same seasons in subsequent years, though these differences were not as great as the summer/winter distinctions. Over 3,600 phoH operational taxonomic units (OTUs; 97% sequence identity were identified. Despite high richness, most phoH sequences belong to a few large, common OTUs whereas the majority of the OTUs are small and rare. While many OTUs make sporadic appearances at just a few times or depths, a small number of OTUs dominate the community throughout the seasons, depths, and years.
Full Text Available Exertional rhabdomyolysis (ER and stress-induced malignant hyperthermia (MH events are syndromes that primarily afflict military recruits in basic training and athletes. Events similar to those occurring in ER and in stress-induced MH events are triggered after exposure to anesthetic agents in MH-susceptible (MHS patients. MH is an autosomal dominant hypermetabolic condition that occurs in genetically predisposed subjects during general anesthesia, induced by commonly used volatile anesthetics and/or the neuromuscular blocking agent succinylcholine. Triggering agents cause an altered intracellular calcium regulation. Mutations in RYR1 gene have been found in about 70% of MH families. The RYR1 gene encodes the skeletal muscle calcium release channel of the sarcoplasmic reticulum, commonly known as ryanodine receptor type 1 (RYR1. The present work reviews the documented cases of ER or of stress-induced MH events in which RYR1 sequence variations, associated or possibly associated to MHS status, have been identified.
Full Text Available Panton-Valentine leucocidin (PVL, encoded by lukSF-PV genes, a bi-component and pore-forming toxin, is carried by different staphylococcal bacteriophages. The prevalence of PVL in Staphylococcus aureus (S. aureus have been reported around the globe. However, the data on PVL-encoding phage types, lukSF-PV gene variation and chromosomal phage insertion sites for PVL-positive S. aureus are limited, especially in China. In order to obtain a more complete understanding of the molecular epidemiology of PVL-positive S. aureus, an integrated and modified PCR-based scheme was applied to detect the PVL-encoding phage types. Phage insertion locus and the lukSF-PV variant were determined by PCR and sequencing. Meanwhile, the genetic background was characterized by staphylococcal cassette chromosome mec (SCCmec typing, staphylococcal protein A (spa gene polymorphisms typing, pulsed-field gel electrophoresis (PFGE typing, accessory gene regulator (agr locus typing and multilocus sequence typing (MLST. Seventy eight (78/1175, 6.6% isolates possessed the lukSF-PV genes and 59.0% (46/78 of PVL-positive strains belonged to CC59 lineage. Eight known different PVL-encoding phage types were detected, and Φ7247PVL/ΦST5967PVL (n=13 and ΦPVL (n=12 were the most prevalent among them. While 25 (25/78, 32.1% isolates, belonging to ST30 and ST59 clones, were unable to be typed by the modified PCR-based scheme. Single nucleotide polymorphisms (SNPs were identified at five locations in the lukSF-PV genes, two of which were non-synonymous. Maximum-likelihood tree analysis of attachment sites sequences detected six SNP profiles for attR and eight for attL, respectively. In conclusion, the PVL-positive S. aureus mainly harbored Φ7247PVL/ΦST5967PVL and ΦPVL in the regions studied. lukSF-PV gene sequences, PVL-encoding phages and phage insertion locus generally varied with lineages. Moreover, PVL-positive clones that have emerged worldwide likely carry distinct phages.
Zhao, Huanqiang; Hu, Fupin; Jin, Shu; Xu, Xiaogang; Zou, Yuhan; Ding, Baixing; He, Chunyan; Gong, Fang; Liu, Qingzhong
Panton-Valentine leukocidin (PVL, encoded by lukSF-PV genes), a bi-component and pore-forming toxin, is carried by different staphylococcal bacteriophages. The prevalence of PVL in Staphylococcus aureus has been reported around the globe. However, the data on PVL-encoding phage types, lukSF-PV gene variation and chromosomal phage insertion sites for PVL-positive S. aureus are limited, especially in China. In order to obtain a more complete understanding of the molecular epidemiology of PVL-positive S. aureus, an integrated and modified PCR-based scheme was applied to detect the PVL-encoding phage types. Phage insertion locus and the lukSF-PV variant were determined by PCR and sequencing. Meanwhile, the genetic background was characterized by staphylococcal cassette chromosome mec (SCCmec) typing, staphylococcal protein A (spa) gene polymorphisms typing, pulsed-field gel electrophoresis (PFGE) typing, accessory gene regulator (agr) locus typing and multilocus sequence typing (MLST). Seventy eight (78/1175, 6.6%) isolates possessed the lukSF-PV genes and 59.0% (46/78) of PVL-positive strains belonged to CC59 lineage. Eight known different PVL-encoding phage types were detected, and Φ7247PVL/ΦST5967PVL (n = 13) and ΦPVL (n = 12) were the most prevalent among them. While 25 (25/78, 32.1%) isolates, belonging to ST30, and ST59 clones, were unable to be typed by the modified PCR-based scheme. Single nucleotide polymorphisms (SNPs) were identified at five locations in the lukSF-PV genes, two of which were non-synonymous. Maximum-likelihood tree analysis of attachment sites sequences detected six SNP profiles for attR and eight for attL, respectively. In conclusion, the PVL-positive S. aureus mainly harbored Φ7247PVL/ΦST5967PVL and ΦPVL in the regions studied. lukSF-PV gene sequences, PVL-encoding phages, and phage insertion locus generally varied with lineages. Moreover, PVL-positive clones that have emerged worldwide likely carry distinct phages.
Qin, Xin-Min; Li, Hui-Min; Zeng, Zhen-Hua; Zeng, De-Long; Guan, Qing-Xin
Black-spotted and red-spotted tokay geckos are distributed in different regions and have significant differences in morphological appearance, but have been regarded as the same species, Gekko gecko, in taxonomy. To determine whether black-spotted and red-spotted tokay geckos are genetically differentiated, we sequenced the entire mitochondrial cytochrome b gene (1147 bp) from 110 individuals of Gekko gecko collected in 11 areas including Guangxi China, Yunnan China, Vietnam, and Laos. In addition, we performed karyotypic analyses of black-spotted tokay geckos from Guangxi China and red-spotted tokay geckos from Laos. These phylogenetic analyses showed that black-spotted and red-spotted tokay geckos are divided into two branches in molecular phylogenetic trees. The average genetic distances are as follows: 0.12-0.47% among six haplotypes in the black-spotted tokay gecko group, 0.12-1.66% among five haplotypes in the red-spotted tokay gecko group, and 8.76-9.18% between the black-spotted and red-spotted tokay geckos, respectively. The karyotypic analyses showed that the karyotype formula is 2n = 38 = 8m + 2sm + 2st + 26t in red-spotted tokay geckos from Laos compared with 2n = 38 = 8m + 2sm + 28t in black-spotted tokay geckos from Guangxi China. The differences in these two kinds of karyotypes were detected on the 15th chromosome. The clear differences in genetic levels between black-spotted and red-spotted tokay geckos suggest a significant level of genetic differentiation between the two.
Full Text Available In reptiles, dorsal body darkness often varies with substrate color or temperature environment, and is generally presumed to be an adaptation for crypsis or thermoregulation. However, the genetic basis of pigmentation is poorly known in this group. In this study we analyzed the coding region of the melanocortin-1-receptor (MC1R gene, and therefore its role underlying the dorsal color variation in two sympatric species of sand lizards (Liolaemus that inhabit the southeastern coast of South America: L. occipitalis and L. arambarensis. The first is light-colored and occupies aeolic pale sand dunes, while the second is brownish and lives in a darker sandy habitat. We sequenced 630 base pairs of MC1R in both species. In total, 12 nucleotide polymorphisms were observed, and four amino acid replacement sites, but none of them could be associated with a color pattern. Comparative analysis indicated that these taxa are monomorphic for amino acid sites that were previously identified as functionally important in other reptiles. Thus, our results indicate that MC1R is not involved in the pigmentation pattern observed in Liolaemus lizards. Therefore, structural differences in other genes, such as ASIP, or variation in regulatory regions of MC1R may be responsible for this variation. Alternatively, the phenotypic differences observed might be a consequence of non-genetic factors, such as thermoregulatory mechanisms.
Mardulyn, Patrick; Milinkovitch, Michel C
We have studied mitochondrial DNA variation in a local population of the leaf beetle species Gonioctena olivacea, to check whether its apparent low dispersal behaviour affects its pattern of genetic variation at a small geographical scale. We have sampled 10 populations of G. olivacea within a rectangle of 5 x 2 km in the Belgian Ardennes, as well as five populations located approximately along a straight line of 30 km and separated by distances of 3-12 km. For each sampled individual (8-19 per population), a fragment of the mtDNA control region was polymerase chain reaction-amplified and sequenced. Sequence data were analysed to test whether significant genetic differentiation could be detected among populations separated by such relatively short distances. The reconstructed genealogy of the mitochondrial haplotypes was also used to investigate the demographic history of these populations. Computer simulations of the evolution of populations were conducted to assess the minimum amount of gene flow that is necessary to explain the observed pattern of variation in the samples. Results show that migration among populations included in the rectangle of 5 x 2 km is substantial, and probably involves the occurrence of dispersal flights. This appears difficult to reconcile with the results of a previous ecological field study that concluded that most of this species dispersal occurs by walking. While sufficient migration to homogenize genetic diversity occurs among populations separated by distances of a few hundred metres to a few kilometres, distances greater than 5 km results in contrast in strong differentiation among populations, suggesting that migration is drastically reduced on such distances. Finally, the results of coalescent simulations suggest that the star-like genealogy inferred from the mtDNA sequence data is fully compatible with a past demographic expansion. However, a metapopulation structure alone (without the need to invoke a population expansion
Full Text Available This study was to investigate the single nucleotide polymorphism (SNP in the interferon regulatory factor 6 (IRF6 gene in healthy residents of Guangdong Province, China, for further analysis of their associations with the development of cleft lip with or without palate (CL/P.
Sloan, D.B.; Müller, Karel; McCauley, D.; Taylor, D.R.; Štorchová, Helena
Roč. 196, č. 4 (2012), s. 1228-1239 ISSN 0028-646X R&D Projects: GA ČR GA521/09/0261; GA MŠk(CZ) LC06004; GA MŠk ME09035 Institutional research plan: CEZ:AV0Z50380511 Keywords : cytoplasmic male sterility (CMS) * gynodioecy * intracellular gene transfer Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 6.736, year: 2012
Full Text Available Abstract Background Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their
Jacobsson, Susanne; Hedberg, Sara Thulin; Mölling, Paula; Unemo, Magnus; Comanducci, Maurizio; Rappuoli, Rino; Olcén, Per
During the recent years, projects are in progress for designing broad-range non-capsular-based meningococcal vaccines, covering also serogroup B isolates. We have examined three genes encoding antigens (NadA, GNA1030 and GNA2091) included in a novel vaccine, i.e. the 5 Component Vaccine against Meningococcus B (5CVMB), in terms of gene prevalence and sequence variations. These data were combined with the results from a similar study, examining the two additional antigens included in the 5CVMB (fHbp and GNA2132). nadA and fHbp v. 1 were present in 38% (n=36), respectively 71% (n=67) of the isolates, whereas gna2132, gna1030 and gna2091 were present in all the Neisseria meningitidis isolates tested (n=95). The level of amino acid conservation was relatively high in GNA1030 (93%), GNA2091 (92%), and within the main variants of NadA and fHbp. GNA2132 (54% of the amino acids conserved) appeared to be the most diversified antigen. Consequently, the theoretical coverage of the 5CVMB antigens and the feasibility to use these in a broad-range meningococcal vaccine is appealing.
Malin C Andersen
Full Text Available Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers. The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation.
Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob
Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319
Kidd, K K; Pakstis, A J; Speed, W C; Kidd, J R
Over the past century researchers have identified normal genetic variation and studied that variation in diverse human populations to determine the amounts and distributions of that variation. That information is being used to develop an understanding of the demographic histories of the different populations and the species as a whole, among other studies. With the advent of DNA-based markers in the last quarter century, these studies have accelerated. One of the challenges for the next century is to understand that variation. One component of that understanding will be population genetics. We present here examples of many of the ways these new data can be analyzed from a population perspective using results from our laboratory on multiple individual DNA-based polymorphisms, many clustered in haplotypes, studied in multiple populations representing all major geographic regions of the world. These data support an "out of Africa" hypothesis for human dispersal around the world and begin to refine the understanding of population structures and genetic relationships. We are also developing baseline information against which we can compare findings at different loci to aid in the identification of loci subject, now and in the past, to selection (directional or balancing). We do not yet have a comprehensive understanding of the extensive variation in the human genome, but some of that understanding is coming from population genetics.
Rivera, A; White, K; Stöhr, H; Steiner, K; Hemmrich, N; Grimm, T; Jurklies, B; Lorenz, B; Scholl, H P; Apfelstedt-Sylla, E; Weber, B H
Stargardt disease (STGD) is a common autosomal recessive maculopathy of early and young-adult onset and is caused by alterations in the gene encoding the photoreceptor-specific ATP-binding cassette (ABC) transporter (ABCA4). We have studied 144 patients with STGD and 220 unaffected individuals ascertained from the German population, to complete a comprehensive, population-specific survey of the sequence variation in the ABCA4 gene. In addition, we have assessed the proposed role for ABCA4 in age-related macular degeneration (AMD), a common cause of late-onset blindness, by studying 200 affected individuals with late-stage disease. Using a screening strategy based primarily on denaturing gradient gel electrophoresis, we have identified in the three study groups a total of 127 unique alterations, of which 90 have not been previously reported, and have classified 72 as probable pathogenic mutations. Of the 288 STGD chromosomes studied, mutations were identified in 166, resulting in a detection rate of approximately 58%. Eight different alleles account for 61% of the identified disease alleles, and at least one of these, the L541P-A1038V complex allele, appears to be a founder mutation in the German population. When the group with AMD and the control group were analyzed with the same methodology, 18 patients with AMD and 12 controls were found to harbor possible disease-associated alterations. This represents no significant difference between the two groups; however, for detection of modest effects of rare alleles in complex diseases, the analysis of larger cohorts of patients may be required.
Li, Jun; Guo, Xian-Guang; Wang, Yue-Zhao
A 838 bp fragment of mtDNA ND4-tRNALeu gene was sequenced for 66 individuals from five populations (DB: Dabancheng, TU: Turpan, SS: Shanshan, HL: Liushuquan, HD: East district of Hami) of Phrynocephalus axillaris distributed in east of Xinjiang Uygur Autonomous Region. Seventeen haplotypes were identified from 29 nucleotide polymorphic sites in the aligned 838 bp sequence. Excluding DB, there were relatively high haplotype diversity [(0.600+/-0.113)oscillation since Pleistocene and genetic drift.
Lindahl, Susanne; Söderlund, Robert; Frosth, Sara; Pringle, John; Båverud, Viveca; Aspán, Anna
Strangles is a serious respiratory disease in horses caused by Streptococcus equi subspecies equi (S. equi). Transmission of the disease occurs by direct contact with an infected horse or contaminated equipment. Genetically, S. equi strains are highly homogenous and differentiation of strains has proven difficult. However, the S. equi M-protein SeM contains a variable N-terminal region and has been proposed as a target gene to distinguish between different strains of S. equi and determine the source of an outbreak. In this study, strains of S. equi (n=60) from 32 strangles outbreaks in Sweden during 1998-2003 and 2008-2009 were genetically characterized by sequencing the SeM protein gene (seM), and by pulsed-field gel electrophoresis (PFGE). Swedish strains belonged to 10 different seM types, of which five have not previously been described. Most were identical or highly similar to allele types from strangles outbreaks in the UK. Outbreaks in 2008/2009 sharing the same seM type were associated by geographic location and/or type of usage of the horses (racing stables). Sequencing of the seM gene generally agreed with pulsed-field gel electrophoresis profiles. Our data suggest that seM sequencing as a epidemiological tool is supported by the agreement between seM and PFGE and that sequencing of the SeM protein gene is more sensitive than PFGE in discriminating strains of S. equi. Copyright © 2011 Elsevier B.V. All rights reserved.
Iverson-Cabral, Stefanie L; Astete, Sabina G; Cohen, Craig R; Rocha, Eduardo P C; Totten, Patricia A
Mycoplasma genitalium is associated with reproductive tract disease in women and may persist in the lower genital tract for months, potentially increasing the risk of upper tract infection and transmission to uninfected partners. Despite its exceptionally small genome (580 kb), approximately 4% is composed of repeated elements known as MgPar sequences (MgPa repeats) based on their homology to the mgpB gene that encodes the immunodominant MgPa adhesin protein. The presence of these MgPar sequences, as well as mgpB variability between M. genitalium strains, suggests that mgpB and MgPar sequences recombine to produce variant MgPa proteins. To examine the extent and generation of diversity within single strains of the organism, we examined mgpB variation within M. genitalium strain G-37 and observed sequence heterogeneity that could be explained by recombination between the mgpB expression site and putative donor MgPar sequences. Similarly, we analyzed mgpB sequences from cervical specimens from a persistently infected woman (21 months) and identified 17 different mgpB variants within a single infecting M. genitalium strain, confirming that mgpB heterogeneity occurs over the course of a natural infection. These observations support the hypothesis that recombination occurs between the mgpB gene and MgPar sequences and that the resulting antigenically distinct MgPa variants may contribute to immune evasion and persistence of infection.
Hoy, Marshal S.; Rodriguez, Rusty J.
Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.
Benmansour, A.; Bascuro, B.; Monnier, A.F.; Vende, P.; Winton, J.R.; de Kinkelin, P.
To evaluate the genetic diversity of viral haemorrhagic septicaemia virus (VHSV), the sequence of the glycoprotein genes (G) of 11 North American and European isolates were determined. Comparison with the G protein of representative members of the family Rhabdoviridae suggested that VHSV was a different virus species from infectious haemorrhagic necrosis virus (IHNV) and Hirame rhabdovirus (HIRRV). At a higher taxonomic level, VHSV, IHNV and HIRRV formed a group which was genetically closest to the genus Lyssavirus. Compared with each other, the G genes of VHSV displayed a dissimilar overall genetic diversity which correlated with differences in geographical origin. The multiple sequence alignment of the complete G protein, showed that the divergent positions were not uniformly distributed along the sequence. A central region (amino acid position 245-300) accumulated substitutions and appeared to be highly variable. The genetic heterogeneity within a single isolate was high, with an apparent internal mutation frequency of 1.2 x 10(-3) per nucleotide site, attesting the quasispecies nature of the viral population. The phylogeny separated VHSV strains according to the major geographical area of isolation: genotype I for continental Europe, genotype II for the British Isles, and genotype III for North America. Isolates from continental Europe exhibited the highest genetic variability, with sub-groups correlated partially with the serological classification. Neither neutralizing polyclonal sera, nor monoclonal antibodies, were able to discriminate between the genotypes. The overall structure of the phylogenetic tree suggests that VHSV genetic diversity and evolution fit within the model of random change and positive selection operating on quasispecies.
Sequence variation in the cytochrome oxidase subunit I and II genes of two commonly found blow fly species, Chrysomya megacephala (Fabricius) and Chrysomya rufifacies (Macquart) (Diptera: Calliphoridae) in Malaysia.
Tan, Siew Hwa; Aris, Edah Mohd; Surin, Johari; Omar, Baharudin; Kurahashi, Hiromu; Mohamed, Zulqarnain
The mitochondiral DNA region encompassing the cytochrome oxidase subunit I (COI) and cytochrome oxidase subunit II (COII) genes of two Malaysian blow fly species, Chrysomya megacephala (Fabricius) and Chrysomya rufifacies (Macquart) were studied. This region, which spans 2303bp and includes the COI, tRNA leucine and partial COII was sequenced from adult fly and larval specimens, and compared. Intraspecific variations were observed at 0.26% for Ch. megacephala and 0.17% for Ch. rufifacies, while sequence divergence between the two species was recorded at a minimum of 141 out of 2303 sites (6.12%). Results obtained in this study are comparable to published data, and thus support the use of DNA sequence to facilitate and complement morphology-based species identification.
Winchester, Catherine L; Ohzeki, Hiromitsu; Vouyiouklis, Demetrius A; Thompson, Rhiannon; Penninger, Josef M; Yamagami, Keiji; Norrie, John D; Hunter, Robert; Pratt, Judith A; Morris, Brian J
Schizophrenia is a debilitating psychiatric disease with a strong genetic contribution, potentially linked to altered glutamatergic function in brain regions such as the prefrontal cortex (PFC). Here, we report converging evidence to support a functional candidate gene for schizophrenia. In post-mortem PFC from patients with schizophrenia, we detected decreased expression of MKK7/MAP2K7-a kinase activated by glutamatergic activity. While mice lacking one copy of the Map2k7 gene were overtly normal in a variety of behavioural tests, these mice showed a schizophrenia-like cognitive phenotype of impaired working memory. Additional support for MAP2K7 as a candidate gene came from a genetic association study. A substantial effect size (odds ratios: ~1.9) was observed for a common variant in a cohort of case and control samples collected in the Glasgow area and also in a replication cohort of samples of Northern European descent (most significant P-value: 3 × 10(-4)). While some caution is warranted until these association data are further replicated, these results are the first to implicate the candidate gene MAP2K7 in genetic risk for schizophrenia. Complete sequencing of all MAP2K7 exons did not reveal any non-synonymous mutations. However, the MAP2K7 haplotype appeared to have functional effects, in that it influenced the level of expression of MAP2K7 mRNA in human PFC. Taken together, the results imply that reduced function of the MAP2K7-c-Jun N-terminal kinase (JNK) signalling cascade may underlie some of the neurochemical changes and core symptoms in schizophrenia.
Bai, Yun; Ma, Hongxia; Wang, Ying; Liu, Hongliang; Chen, Weihong; Yang, Lin; Jing, Guangfu; Huo, Xiang; Chen, Feng; Liu, Yanhong; Jin, Li; Xu, Liang; Wei, Qingyi; Huang, Wei; Shen, Hongbing; Lu, Daru; Wu, Tangchun; Yang, Xiaobo; Hu, Zhibin; Yuan, Jing; Wang, Feng; Shao, Minhua; Yuan, Wentao; Qian, Ji
The nucleotide excision repair (NER) protein, xeroderma pigmentosum C (XPC), participates in recognizing DNA lesions and initiating DNA repair in response to DNA damage. Because mutations in XPC cause a high risk of cancer in XP patients, we hypothesized that inherited sequence variations in XPC may alter DNA repair and thus susceptibility to cancer. In this hospital-based case-control study, we investigated five XPC tagging, common single nucleotide polymorphisms (tagging SNPs) in 1,010 patients with newly diagnosed lung cancer and 1,011 matched cancer free controls in a Chinese population. In individual tagging SNP analysis, we found that rs3731055AG+AA variant genotypes were associated with a significantly decreased risk of lung adenocarcinoma [adjusted odds ratio (OR), 0.71; 95% confidence interval (CI), 0.56–0.90] but an increased risk of small cell carcinomas [adjusted OR, 1.79; 95% CI, 1.05–3.07]. Furthermore, we found that haplotype ACCCA was associated with a decreased risk of lung adenocarcinoma [OR, 0.78; 95% CI, 0.62–0.97] but an increased risk of small cell carcinomas [OR, 1.68; 95% CI, 1.04–2.71], which reflected the presence of rs3731055A allele in this haplotype. Further stratified analysis revealed that the protective effect of rs3731055AG+AA on risk of lung adenocarcinoma was more evident among young subjects (age ≤ 60) and never smokers. These results suggest that inherited sequence variations in XPC may modulate risk of lung cancer, especially lung adenocarcinoma, in Chinese populations. However, these findings need to be verified in larger confirmatory studies with more comprehensively selected tagging SNPs
Phylogenetic analysis, based on EPIYA repeats in the cagA gene of Indian Helicobacter pylori, and the implications of sequence variation in tyrosine phosphorylation motifs on determining the clinical outcome
Santosh K. Tiwari
Full Text Available The population of India harbors one of the world's most highly diverse gene pools, owing to the influx of successive waves of immigrants over regular periods in time. Several phylogenetic studies involving mitochondrial DNA and Y chromosomal variation have demonstrated Europeans to have been the first settlers in India. Nevertheless, certain controversy exists, due to the support given to the thesis that colonization was by the Austro-Asiatic group, prior to the Europeans. Thus, the aim was to investigate pre-historic colonization of India by anatomically modern humans, using conserved stretches of five amino acid (EPIYA sequences in the cagA gene of Helicobacter pylori. Simultaneously, the existence of a pathogenic relationship of tyrosine phosphorylation motifs (TPMs, in 32 H. pylori strains isolated from subjects with several forms of gastric diseases, was also explored. High resolution sequence analysis of the above described genes was performed. The nucleotide sequences obtained were translated into amino acids using MEGA (version 4.0 software for EPIYA. An MJ-Network was constructed for obtaining TPM haplotypes by using NETWORK (version 4.5 software. The findings of the study suggest that Indian H. pylori strains share a common ancestry with Europeans. No specific association of haplotypes with the outcome of disease was revealed through additional network analysis of TPMs.
Full Text Available Abstract Background GABA transporter-1 (GAT-1; genetic locus SLC6A1 is emerging as a novel target for treatment of neuropsychiatric disorders. To understand how population differences might influence strategies for pharmacogenetic studies, we identified patterns of genetic variation and linkage disequilibrium (LD in SLC6A1 in five populations representing three continental groups. Results We resequenced 12.4 kb of SLC6A1, including the promoters, exons and flanking intronic regions in African-American, Thai, Hmong, Finnish, and European-American subjects (total n = 40. LD in SLC6A1 was examined by genotyping 16 SNPs in larger samples. Sixty-three variants were identified through resequencing. Common population-specific variants were found in African-Americans, including a novel 21-bp promoter region variable number tandem repeat (VNTR, but no such variants were found in any of the other populations studied. Low levels of LD and the absence of major LD blocks were characteristic of all five populations. African-Americans had the highest genetic diversity. European-Americans and Finns did not differ in genetic diversity or LD patterns. Although the Hmong had the highest level of LD, our results suggest that a strategy based on the use of tag SNPs would not translate to a major improvement in genotyping efficiency. Conclusion Owing to the low level of LD and presence of recombination hotspots, SLC6A1 may be an example of a problematic gene for association and haplotype tagging-based genetic studies. The 21-bp promoter region VNTR polymorphism is a putatively functional candidate allele for studies focusing on variation in GAT-1 function in the African-American population.
Leekitcharoenphon, Pimlapas; Lukjancenko, Oksana; Rundsten, Carsten Friis
Background: Technological advances in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identification of relevant genes and of variation are needed to enable comparison between studies and over...... genomes and evaluate their value as typing targets, comparing whole genome typing and traditional methods such as 16S and MLST. A consensus tree based on variation of core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher...... that there is a positive selection towards mutations leading to amino acid changes. Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and providing candidate genes for bacterial genome typing. Identification of genes with different degrees of variation is important...
Chang, Perng-Kuang; Scharfenstein, Leslie L; Solorzano, Cesar D; Abbas, Hamed K; Hua, Sui-Sheng T; Jones, Walker A; Zablotowicz, Robert M
Aspergillus oryzae and Aspergillus flavus are closely related fungal species. The A. flavus morphotype that produces numerous small sclerotia (S strain) and aflatoxin has a unique 1.5 kb deletion in the norB-cypA region of the aflatoxin gene cluster (i.e. the S genotype). Phylogenetic studies have indicated that an isolate of the nonaflatoxigenic A. flavus with the S genotype is the ancestor of A. oryzae. Genome sequence comparison between A. flavus NRRL3357, which produces large sclerotia (L strain), and S-strain A. flavus 70S identified a region (samA-rosA) that was highly variable in the two morphotypes. A third type of samA-rosA region was found in A. oryzae RIB40. The three samA-rosA types were later revealed to be commonly present in A. flavus L-strain populations. Of the 182 L-strain A. flavus field isolates examined, 46%, 15% and 39% had the samA-rosA type of NRRL3357, 70S and RIB40, respectively. The three types also were found in 18 S-strain A. flavus isolates with different proportions. For A. oryzae, however, the majority (80%) of the 16 strains examined had the RIB40 type and none had the NRRL3357 type. The results suggested that A. oryzae strains in the current culture collections were mostly derived from the samA-rosA/RIB40 lineage of the nonaflatoxigenic A. flavus with the S genotype. Published by Elsevier B.V.
Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima\\'s D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. © The Author 2014.
Ferrin Thomas E
Full Text Available Abstract Background Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences were used to evaluate localization results. Results In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.
Andalib, Sasan; Vafaee, Manouchehr Seyedi; Gjedde, Albert
Parkinson's disease (PD) is a common disorder of the central nervous system in the elderly. The pathogenesis of PD is a complex process, with genetics as an important contributing factor. This factor may stem from mitochondrial gene variations and mutations as well as from nuclear gene variations...
Kráľová-Hromadová, I.; Špakulová, M.; Horáčková, Eva; Turčeková, Ĺ.; Novobilský, A.; Beck, R.; Koudela, Břetislav; Marinculić, A.; Rajský, D.; Pybus, M.
Roč. 94, č. 1 (2008), s. 58-67 ISSN 0022-3395 R&D Projects: GA ČR GD524/03/H133; GA AV ČR IAA6022404 Grant - others:Slovak Research and Development Agency(SK) APVV-51-062205 Institutional research plan: CEZ:AV0Z60220518 Keywords : Fascioloides magna * Fasciola hepatica * ribosomal genes * mitochondrial genes Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.165, year: 2008
Full Text Available To measure the strength of natural selection that acts upon single nucleotide variants (SNVs in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs per nonsynonymous site and synonymous SNVs (sSNVs per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs, nervous system genes (NSGs, randomly sampled genes (RSGs, and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
Aspergillus oryzae and Aspergillus flavus are closely related fungal species. The A. flavus population that produces numerous small sclerotia (S strain) and aflatoxin has a unique 1.5 kb deletion in the norB-cypA region of the aflatoxin gene cluster (the S genotype). Phylogenetic studies have indica...
Phylogenetic relationships of Acheilognathidae (Cypriniformes: Cyprinoidea) as revealed from evidence of both nuclear and mitochondrial gene sequence variation: evidence for necessary taxonomic revision in the family and the identification of cryptic species.
Chang, Chia-Hao; Li, Fan; Shao, Kwang-Tsao; Lin, Yeong-Shin; Morosawa, Takahiro; Kim, Sungmin; Koo, Hyeyoung; Kim, Won; Lee, Jae-Seong; He, Shunping; Smith, Carl; Reichard, Martin; Miya, Masaki; Sado, Tetsuya; Uehara, Kazuhiko; Lavoué, Sébastien; Chen, Wei-Jen; Mayden, Richard L
Bitterlings are relatively small cypriniform species and extremely interesting evolutionarily due to their unusual reproductive behaviors and their coevolutionary relationships with freshwater mussels. As a group, they have attracted a great deal of attention in biological studies. Understanding the origin and evolution of their mating system demands a well-corroborated hypothesis of their evolutionary relationships. In this study, we provide the most comprehensive phylogenetic reconstruction of species relationships of the group based on partitioned maximum likelihood and Bayesian methods using DNA sequence variation of nuclear and mitochondrial genes on 41 species, several subspecies and three undescribed species. Our findings support the monophyly of the Acheilognathidae. Two of the three currently recognized genera are not monophyletic and the family can be subdivided into six clades. These clades are further regarded as genera based on both their phylogenetic relationships and a reappraisal of morphological characters. We present a revised classification for the Acheilognathidae with five genera/lineages: Rhodeus, Acheilognathus (new constitution), Tanakia (new constitution), Paratanakia gen. nov., and Pseudorhodeus gen. nov. and an unnamed clade containing five species currently referred to as "Acheilognathus". Gene trees of several bitterling species indicate that the taxa are not monophyletic. This result highlights a potentially dramatic underestimation of species diversity in this family. Using our new phylogenetic framework, we discuss the evolution of the Acheilognathidae relative to classification, taxonomy and biogeography. Copyright © 2014 Elsevier Inc. All rights reserved.
Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A
Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.
Molecular phylogenetics of the family Cyprinidae (Actinopterygii: Cypriniformes) as evidenced by sequence variation in the first intron of S7 ribosomal protein-coding gene: further evidence from a nuclear gene of the systematic chaos in the family.
He, Shunping; Mayden, Richard L; Wang, Xuzheng; Wang, Wei; Tang, Kevin L; Chen, Wei-Jen; Chen, Yiyu
The family Cyprinidae is the largest freshwater fish group in the world, including over 200 genera and 2100 species. The phylogenetic relationships of major clades within this family are simply poorly understood, largely because of the overwhelming diversity of the group; however, several investigators have advanced different hypotheses of relationships that pre- and post-date the use of shared-derived characters as advocated through phylogenetic systematics. As expected, most previous investigations used morphological characters. Recently, mitochondrial DNA (mtDNA) sequences and combined morphological and mtDNA investigations have been used to explore and advance our understanding of species relationships and test monophyletic groupings. Limitations of these studies include limited taxon sampling and a strict reliance upon maternally inherited mtDNA variation. The present study is the first endeavor to recover the phylogenetic relationships of the 12 previously recognized monophyletic subfamilies within the Cyprinidae using newly sequenced nuclear DNA (nDNA) for over 50 species representing members of the different previously hypothesized subfamily and family groupings within the Cyprinidae and from other cypriniform families as outgroup taxa. Hypothesized phylogenetic relationships are constructed using maximum parsimony and Basyesian analyses of 1042 sites, of which 971 sites were variable and 790 were phylogenetically informative. Using other appropriate cypriniform taxa of the families Catostomidae (Myxocyprinus asiaticus), Gyrinocheilidae (Gyrinocheilus aymonieri), and Balitoridae (Nemacheilus sp. and Beaufortia kweichowensis) as outgroups, the Cyprinidae is resolved as a monophyletic group. Within the family the genera Raiamas, Barilius, Danio, and Rasbora, representing many of the tropical cyprinids, represent basal members of the family. All other species can be classified into variably supported and resolved monophyletic lineages, depending upon analysis
Wiborg, O; Hyldig-Nielsen, J J; Jensen, E O
We present the complete nucleotide sequences of two leghemoglobin genes isolated from soybean DNA. Both genes contain three intervening sequences in identical positions. Comparison of the coding sequences with known amino-acid sequences of soybean leghemoglobins suggest that the two genes...
Vlad, Daniela; Rappaport, Fabrice; Simon, Matthieu; Loudet, Olivier
A major challenge in biology is to identify molecular polymorphisms responsible for variation in complex traits of evolutionary and agricultural interest. Using the advantages of Arabidopsis thaliana as a model species, we sought to identify new genes and genetic mechanisms underlying natural variation for shoot growth using quantitative genetic strategies. More quantitative trait loci (QTL) still need be resolved to draw a general picture as to how and where in the pathways adaptation is shaping natural variation and the type of molecular variation involved. Phenotypic variation for shoot growth in the Bur-0 x Col-0 recombinant inbred line set was decomposed into several QTLs. Nearly-isogenic lines generated from the residual heterozygosity segregating among lines revealed an even more complex picture, with major variation controlled by opposite linked loci and masked by the segregation bias due to the defective phenotype of SG3 (Shoot Growth-3), as well as epistasis with SG3i (SG3-interactor). Using principally a fine-mapping strategy, we have identified the underlying gene causing phenotypic variation at SG3: At4g30720 codes for a new chloroplast-located protein essential to ensure a correct electron flow through the photosynthetic chain and, hence, photosynthesis efficiency and normal growth. The SG3/SG3i interaction is the result of a structural polymorphism originating from the duplication of the gene followed by divergent paralogue's loss between parental accessions. Species-wide, our results illustrate the very dynamic rate of duplication/transposition, even over short periods of time, resulting in several divergent--but still functional-combinations of alleles fixed in different backgrounds. In predominantly selfing species like Arabidopsis, this variation remains hidden in wild populations but is potentially revealed when divergent individuals outcross. This work highlights the need for improved tools and algorithms to resolve structural variation
Deep Sequence Analysis of Non-Small Cell Lung Cancer: Integrated Analysis of Gene Expression, Alternative Splicing, and Single Nucleotide Variations in Lung Adenocarcinomas with and without Oncogenic KRAS Mutations
Kalari, Krishna R.; Rossell, David; Necela, Brian M.; Asmann, Yan W.; Nair, Asha
KRAS mutations are highly prevalent in non-small cell lung cancer (NSCLC), and tumors harboring these mutations tend to be aggressive and resistant to chemotherapy. We used next-generation sequencing technology to identify pathways that are specifically altered in lung tumors harboring a KRAS mutation. Paired-end RNA-sequencing of 15 primary lung adenocarcinoma tumors (8 harboring mutant KRAS and 7 with wild-type KRAS) were performed. Sequences were mapped to the human genome, and genomic features, including differentially expressed genes, alternate splicing isoforms and single nucleotide variants, were determined for tumors with and without KRAS mutation using a variety of computational methods. Network analysis was carried out on genes showing differential expression (374 genes), alternate splicing (259 genes), and SNV-related changes (65 genes) in NSCLC tumors harboring a KRAS mutation. Genes exhibiting two or more connections from the lung adenocarcinoma network were used to carry out integrated pathway analysis. The most significant signaling pathways identified through this analysis were the NFκB, ERK1/2, and AKT pathways. A 27 gene mutant KRAS-specific sub network was extracted based on gene–gene connections from the integrated network, and interrogated for druggable targets. Our results confirm previous evidence that mutant KRAS tumors exhibit activated NFκB, ERK1/2, and AKT pathways and may be preferentially sensitive to target therapeutics toward these pathways. In addition, our analysis indicates novel, previously unappreciated links between mutant KRAS and the TNFR and PPARγ signaling pathways, suggesting that targeted PPARγ antagonists and TNFR inhibitors may be useful therapeutic strategies for treatment of mutant KRAS lung tumors. Our study is the first to integrate genomic features from RNA-Seq data from NSCLC and to define a first draft genomic landscape model that is unique to tumors with oncogenic KRAS mutations.
Phelan, Jody E.; Coll, Francesc; Bergval, Indra; Anthony, Richard M.; Warren, Rob; Sampson, Samantha L.; Gey van Pittius, Nicolaas C.; Glynn, Judith R.; Crampin, Amelia C.; Alves, Adriana; Bessa, Theolis Barbosa; Campino, Susana; Dheda, Keertan; Grandjean, Louis; Hasan, Rumina; Hasan, Zahra; Miranda, Anabela; Moore, David; Panaiotov, Stefan; Perdigao, Joao; Portugal, Isabel; Sheen, Patricia; de Oliveira Sousa, Erivelton; Streicher, Elizabeth M.; van Helden, Paul D.; Viveiros, Miguel; Hibberd, Martin L.; Pain, Arnab; McNerney, Ruth; Clark, Taane G.
. tuberculosis complex genomes and long read sequence data were used to validate the approach. SNP analysis revealed that variation in the majority of the 168 pe/ppe genes studied was consistent with lineage. Several recombination hotspots were identified
Full Text Available In order to leverage novel sequencing techniques for cloning genes in eukaryotic organisms with complex genomes, the false positive rate of variant discovery must be controlled for by experimental design and informatics. We sequenced five lines from three pedigrees of ethyl methanesulfonate (EMS-mutagenized Sorghum bicolor, including a pedigree segregating a recessive dwarf mutant. Comparing the sequences of the lines, we were able to identify and eliminate error-prone positions. One genomic region contained EMS mutant alleles in dwarfs that were homozygous reference sequences in wild-type siblings and heterozygous in segregating families. This region contained a single nonsynonymous change that cosegregated with dwarfism in a validation population and caused a premature stop codon in the Sorghum ortholog encoding the gibberellic acid (GA biosynthetic enzyme ent-kaurene oxidase. Application of exogenous GA rescued the mutant phenotype. Our method for mapping did not require outcrossing and introduced no segregation variance. This enables work when line crossing is complicated by life history, permitting gene discovery outside of genetic models. This inverts the historical approach of first using recombination to define a locus and then sequencing genes. Our formally identical approach first sequences all the genes and then seeks cosegregation with the trait. Mutagenized lines lacking obvious phenotypic alterations are available for an extension of this approach: mapping with a known marker set in a line that is phenotypically identical to starting material for EMS mutant generation.
Ramírez-Manzanares, Alonso; Rivera, Mariano; Kornprobst, Pierre
Motion estimation in sequences with transparencies is an important problem in robotics and medical imaging applications. In this work we propose a variational approach for estimating multi-valued velocity fields in transparent sequences. Starting from existing local motion estimators, we derive...... a variational model for integrating in space and time such a local information in order to obtain a robust estimation of the multi-valued velocity field. With this approach, we can indeed estimate multi-valued velocity fields which are not necessarily piecewise constant on a layer –each layer can evolve...
Full Text Available Solid pseudopapillary tumor of the pancreas (SPT is a rare pancreatic disease with a unique clinical manifestation. Although CTNNB1 gene mutations had been universally reported, genetic variation profiles of SPT are largely unidentified. We conducted whole exome sequencing in nine SPT patients to probe the SPT-specific insertions and deletions (indels and single nucleotide polymorphisms (SNPs. In total, 54 SNPs and 41 indels of prominent variations were demonstrated through parallel exome sequencing. We detected that CTNNB1 mutations presented throughout all patients studied (100%, and a higher count of SNPs was particularly detected in patients with older age, larger tumor, and metastatic disease. By aggregating 95 detected variation events and viewing the interconnections among each of the genes with variations, CTNNB1 was identified as the core portion in the network, which might collaborate with other events such as variations of USP9X, EP400, HTT, MED12, and PKD1 to regulate tumorigenesis. Pathway analysis showed that the events involved in other cancers had the potential to influence the progression of the SNPs count. Our study revealed an insight into the variation of the gene encoding region underlying solid-pseudopapillary neoplasm tumorigenesis. The detection of these variations might partly reflect the potential molecular mechanism.
Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...
Mygind, Tina; Jacobsen, Iben Søgaard; Melkova, Renata
The gap gene encodes the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The gene was cloned and sequenced from the Mycoplasma hominis type strain PG21(T). The intraspecies variability was investigated by inspection of restriction fragment length polymorphism (RFLP) patterns...... after polymerase chain reaction (PCR) amplification of the gap gene from 15 strains and furthermore by sequencing of part of the gene in eight strains. The M. hominis gap gene was found to vary more than the Escherichia coli counterpart, but the variation at nucleotide level gave rise to only a few...
In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics.
Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sor- ghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI Genbank database.
The authors reported the identification of genes associated with embryonic development and microsatellite sequences. The future direction will entail characterization of these genes using gene over-expression and mutant assays. Key words: Namibia, simple sequence repeats (SSR), data mining, homology searches, ...
Full Text Available Abstract Background Mutations in the transcription factor gene PAX6 have been shown to be the cause of the aniridia phenotype. The purpose of this study was to analyze patients with aniridia to uncover PAX6 gene mutations in south Indian population. Methods Total genomic DNA was isolated from peripheral blood of twenty-eight members of six clinically diagnosed aniridia families and 60 normal healthy controls. The coding exons of the human PAX6 gene were amplified by PCR and allele specific variations were detected by single strand conformation polymorphism (SSCP followed by automated sequencing. Results The sequencing results revealed novel PAX6 mutations in three patients with sporadic aniridia: c.715ins5, [c.1201delA; c.1239A>G] and c.901delA. Two previously reported nonsense mutations were also found: c.482C>A, c.830G>A. A neutral polymorphism was detected (IVS9-12C>T at the boundary of intron 9 and exon 10. The two nonsense mutations found in the coding region of human PAX6 gene are reported for the first time in the south Indian population. Conclusion The genetic analysis confirms that haploinsuffiency of the PAX6 gene causes the classic aniridia phenotype. Most of the point mutations detected in our study results in stop codons. Here we add three novel PAX6 gene mutations in south Indian population to the existing spectrum of mutations, which is not a well-studied ethnic group. Our study supports the hypothesis that a mutation in the PAX6 gene correlates with expression of aniridia.
Sequence Variations in the Flagellar Antigen Genes fliCH25 and fliCH28 of Escherichia coli and Their Use in Identification and Characterization of Enterohemorrhagic E. coli (EHEC O145:H25 and O145:H28.
Full Text Available Enterohemorrhagic E. coli (EHEC serogroup O145 is regarded as one of the major EHEC serogroups involved in severe infections in humans. EHEC O145 encompasses motile and non-motile strains of serotypes O145:H25 and O145:H28. Sequencing the fliC-genes associated with the flagellar antigens H25 and H28 revealed the genetic diversity of the fliCH25 and fliCH28 gene sequences in E. coli. Based on allele discrimination of these fliC-genes real-time PCR tests were designed for identification of EHEC O145:H25 and O145:H28. The fliCH25 genes present in O145:H25 were found to be very similar to those present in E. coli serogroups O2, O100, O165, O172 and O177 pointing to their common evolution but were different from fliCH25 genes of a multiple number of other E. coli serotypes. In a similar way, EHEC O145:H28 harbor a characteristic fliCH28 allele which, apart from EHEC O145:H28, was only found in enteropathogenic (EPEC O28:H28 strains that shared some common traits with EHEC O145:H28. The real time PCR-assays targeting these fliCH25[O145] and fliCH28[O145] alleles allow better characterization of EHEC O145:H25 and EHEC O145:H28. Evaluation of these PCR assays in spiked ready-to eat salad samples resulted in specific detection of both types of EHEC O145 strains even when low spiking levels of 1-10 cfu/g were used. Furthermore these PCR assays allowed identification of non-motile E. coli strains which are serologically not typable for their H-antigens. The combined use of O-antigen genotyping (O145wzy and detection of the respective fliCH25[O145] and fliCH28[O145] allele types contributes to improve identification and molecular serotyping of E. coli O145 isolates.
Have, Christian Theil; Mørk, Søren
We introduce a new type of probabilistic sequence model, that model the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation...... as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested...... and are evaluated by the effect on prediction performance. Since bacterial gene finding to a large extent is a solved problem it forms an ideal proving ground for evaluating the explicit modeling of larger scale gene sequence composition of genomes. We conclude that the sequential composition of gene reading frames...
Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R
Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.
Full Text Available Abstract Background Recent advances in sequencing technologies set the stage for large, population based studies, in which the ANA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, recently a few multiplexing schemes have been suggested, in which a small number of ANA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results In this paper we provide a new algorithm for the deconvolution of DNA pools multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions Particularly, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.
Full Text Available Abstract Background The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS, which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%. This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate
John K Davies
Full Text Available Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3' end of the silent loci (copy 1 as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11 are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species.
Full Text Available Aim: To analyze the promoter sequence of toll-like receptor (TLR genes in Vechur cattle, an indigenous breed of Kerala with the sequence of Bos taurus and access the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from Jugular vein of Vechur cattle, maintained at Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified product of TLR2, 4, and 9 promoter regions was sequenced by Sanger enzymatic DNA sequencing technique. Results: The sequence of promoter region of TLR2 of Vechur cattle with the B. taurus sequence present in GenBank showed 98% similarity and revealed variants for four sequence motifs. The sequence of the promoter region of TLR4 of Vechur cattle revealed 99% similarity with that of B. taurus sequence but not reveals significant variant in motifregions. However, two heterozygous loci were observed from the chromatogram. Promoter sequence of TLR9 gene also showed 99% similarity to B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate that significant variation in the promoter of TLR2 and 9 genes in Vechur cattle breed and may potentially link the influence the innate immunity response against mastitis diseases.
Full Text Available Mycobacterium avium subspecies paratuberculosis (M. ap, the causative agent of Johne’s disease (JD, infects many farmed ruminants, wildlife animals and humans. To better understand the molecular pathogenesis of these infections, we analyzed the whole genome sequences of several M. ap and M. avium subspecies avium (M. avium strains isolated from various hosts and environments. Using Next-generation sequencing technology, all 6 M. ap isolates showed a high percentage of homology (98% to the reference genome sequence of M. ap K-10 isolated from cattle. However, 2 M. avium isolates (DT 78 and Env 77 showed significant sequence diversity from the reference strain M. avium 104. The genomes of M. avium isolates DT 78 and Env 77 exhibited only 87% and 40% homology, respectively, to the M. avium 104 reference genome. Within the M. ap isolates, genomic rearrangements (insertions/deletions, Indels were not detected, and only unique single nucleotide polymorphisms (SNPs were observed among the 6 M. ap strains. While most of the SNPs (~100 in M. ap genomes were non-synonymous, a total of ~ 6000 SNPs were detected among M. avium genomes, most of them were synonymous suggesting a differential selective pressure between M. ap and M. avium isolates. In addition, SNPs-based phylo-genomic analysis showed that isolates from goat and Oryx are closely related to the cattle (K-10 strain while the human isolate (M. ap 4B is closely related to the environmental strains, indicating environmental source to human infections. Overall, SNPs were the most common variations among M. ap isolates while SNPs in addition to Indels were prevalent among M. avium isolates. Genomic variations will be useful in designing host-specific markers for the analysis of mycobacterial evolution and for developing novel diagnostics directed against Johne’s disease in animals.
Mills, Ryan E.; Walter, Klaudia; Stewart, Chip
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is......, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications...
A recent large-scale phylogenomic study has shown the great degree of topological variation that can be found among eukaryotic phylogenetic trees constructed from single genes, highlighting the problems that can be associated with gene sampling in phylogenetic studies.
Old, S.E.; Mohrenweiser, H.W. (Univ. of Michigan, Ann Arbor (USA))
The triosephosphate isomerase gene from a rhesus monkey, Macaca mulatta, charon 34 library was sequenced. The human and chimpanzee enzymes differ from the rhesus enzyme at ASN 20 and GLU 198. The nucleotide sequence identity between rhesus and human is 97% in the coding region and >94% in the flanking regions. Comparison of the rhesus and chimp genes, including the intron and flanking sequences, does not suggest a mechanism for generating the two TPI peptides of proliferating cells from hominoids and a single peptide from the rhesus gene.
Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10(-8) per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Walker, Joseph F; Zanis, Michael J; Emery, Nancy C
Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.
Full Text Available Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies.Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model. However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used
Steven N Hart
Full Text Available BACKGROUND: Structural variation (SV represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1 not requiring secondary (or exhaustive primary alignment, 2 portability into established sequencing workflows, and 3 is applicable to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.. SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.
Mygind, Tina; Jacobsen, Iben Søgaard; Melkova, Renata
The gap gene encodes the glycolytic enzyme glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The gene was cloned and sequenced from the Mycoplasma hominis type strain PG21(T). The intraspecies variability was investigated by inspection of restriction fragment length polymorphism (RFLP) patterns...... after polymerase chain reaction (PCR) amplification of the gap gene from 15 strains and furthermore by sequencing of part of the gene in eight strains. The M. hominis gap gene was found to vary more than the Escherichia coli counterpart, but the variation at nucleotide level gave rise to only a few...... amino acid substitutions. To verify that the gene was expressed in M. hominis, a polyclonal antibody was produced and tested against whole cell protein from 15 strains. The enzyme was expressed in all strains investigated as a 36-kDa protein. All strains except type strain PG21(T) showed reaction...
U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...
Stephen B Montgomery
Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 genomes project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs discovered from 1000 genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.
Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción
In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequence were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a
Full Text Available Abstract Background In highly copy number variable (CNV regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are practically restricted to tiny fractions, and next-generation sequencing (NGS approaches of whole individual genomes e.g. by the 1000 Genomes Project is confined by an affordable sequence depth. Combining target enrichment with NGS may represent a feasible approach. Results As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. 6,651 differences to the human reference genome were found. Comparison to HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations. Using error probabilities for rigorous filtering revealed 2,886 unique single nucleotide variations (SNVs including 358 putative novel ones. DEFB CN determinations by haplotype ratios were in agreement with alternative methods. Conclusion Although currently labor extensive and having high costs, target enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals considerable amounts of putative novel variations and simultaneously allows CN estimation.
First page Back Continue Last page Overview Graphics. Sequencing results of pncA gene at JALMA. Red colour indicates novel mutations, Blue colour indicates the novel mutations reported at the same codon earlier also.
Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K
A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.
Hood, Elizabeth; Teoh, Thomas
This invention is in the field of plant biology and agriculture and relates to novel seed specific promoter regions. The present invention further provide methods of producing proteins and other products of interest and methods of controlling expression of nucleic acid sequences of interest using the seed specific promoter regions.
dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn
To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...
He, Fengping; Zhai, Jiancheng; Zhang, Le; Liu, Dan; Ma, Yue; Rong, Ke; Xu, Yanchun; Ma, Jianzhang
The Amur tiger is one of the most endangered species in the world, and the healthy population of captive Amur tigers assists the recovery of the wild population. Gut microbes have been shown to be important for human disease and health, but little research exists regarding the microbiome of Amur tigers in captivity. In this study, we used an integrated approach of 16S rRNA gene sequencing combined with ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS)-based metabolomics to analyze the effects of Fenbendazole and Ivermectin Tablets on the gut microbiota and fecal metabolic phenotype of the Amur tiger. The relative abundances of the bacterial genera Collinsella, Clostridium XI and Megamonas were decreased, whereas those of Escherichia and Clostridium sensu stricto were increased in experimental Amur tigers compared with those in normal controls. Meanwhile, distinct changes in the fecal metabolic phenotype of the experimental Amur tigers were also found, including lower levels of acrylic acid, acetoacetate and catechol and higher amounts of 5,6-dihydrouracil, adenine hydrochloride hydrate and galactitol. Moreover, the differentially abundant gut microbes were substantially associated with the altered fecal metabolites, especially the bacteria in the Firmicutes and Actinomycetes, which were involved in the metabolism of 5,6-dihydrouracil, 6-phospho-d-gluconate and 1-methylnicotinamide. Our results indicate for the first time that Fenbendazole and Ivermectin Tablets not only disturb the gut microbiota at the abundance level but also alter the metabolic homeostasis of the Amur tiger. Copyright © 2018 Elsevier Inc. All rights reserved.
Rasmussen, M; Sunde, L; Nielsen, M L
Identification of fetal kidney anomalies invites questions about underlying causes and recurrence risk in future pregnancies. We therefore investigated the diagnostic yield of next-generation sequencing in fetuses with bilateral kidney anomalies and the correlation between disrupted genes and fetal...... phenotypes. Fetuses with bilateral kidney anomalies were screened using an in-house-designed kidney-gene panel. In families where candidate variants were not identified, whole-exome sequencing was performed. Genes uncovered by this analysis were added to our kidney-panel. We identified likely deleterious...... of nephronophthisis. Exome sequencing identified ROBO1 variants in one family and a GREB1L variant in another family. GREB1L and ROBO1 were added to our kidney-gene panel and additional variants were identified. Next-generation sequencing substantially contributes to identifying causes of fetal kidney anomalies...
Soini Heidi K
Full Text Available Abstract Background The genetic background of type 2 diabetes is complex involving contribution by both nuclear and mitochondrial genes. There is an excess of maternal inheritance in patients with type 2 diabetes and, furthermore, diabetes is a common symptom in patients with mutations in mitochondrial DNA (mtDNA. Polymorphisms in mtDNA have been reported to act as risk factors in several complex diseases. Findings We examined the nucleotide variation in complete mtDNA sequences of 64 Finnish patients with matrilineal diabetes. We used conformation sensitive gel electrophoresis and sequencing to detect sequence variation. We analysed the pathogenic potential of nonsynonymous variants detected in the sequences and examined the role of the m.16189 T>C variant. Controls consisted of non-diabetic subjects ascertained in the same population. The frequency of mtDNA haplogroup V was 3-fold higher in patients with diabetes. Patients harboured many nonsynonymous mtDNA substitutions that were predicted to be possibly or probably damaging. Furthermore, a novel m.13762 T>G in MTND5 leading to p.Ser476Ala and several rare mtDNA variants were found. Haplogroup H1b harbouring m.16189 T > C and m.3010 G > A was found to be more frequent in patients with diabetes than in controls. Conclusions Mildly deleterious nonsynonymous mtDNA variants and rare population-specific haplotypes constitute genetic risk factors for maternally inherited diabetes.
Methods: DNA extraction, purification, amplification and sequencing of Internal Transcribed Spacer (ITS) genes were per- formed using ... Keywords: Internal transcribed spacer genes, phylogenetic, genetic relationship, clinical and environmental fungi, HIV-TB. ... Nigeria. An Ethical clearance was obtained from the Eth-.
Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.
Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'- 32 P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking region of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues
Zhang, J R; Norris, S J
The Lyme disease spirochete Borrelia burgdorferi possesses 15 silent vls cassettes and a vls expression site (vlsE) encoding a surface-exposed lipoprotein. Segments of the silent vls cassettes have been shown to recombine with the vlsE cassette region in the mammalian host, resulting in combinatorial antigenic variation. Despite promiscuous recombination within the vlsE cassette region, the 5' and 3' coding sequences of vlsE that flank the cassette region are not subject to sequence variation during these recombination events. The segments of the silent vls cassettes recombine in the vlsE cassette region through a unidirectional process such that the sequence and organization of the silent vls loci are not affected. As a result of recombination, the previously expressed segments are replaced by incoming segments and apparently degraded. These results provide evidence for a gene conversion mechanism in VlsE antigenic variation.
The study of gene mutation is one of the hot topics in the field of life science nowadays, and the related detection methods and diagnostic technology have been developed rapidly. Sequencing technology plays an indispensable role in the definite diagnosis and classification of genetic diseases. In this review, we summarize the research progress in sequencing technology, evaluate the advantages and disadvantages of 1(st) ~3(rd) generation of sequencing technology, and describe its application in gene diagnosis. Also we made forecasts and prospects on its development trend.
Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.
Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary
Bryony A. Thompson
Full Text Available Inherited mutations in the DNA mismatch repair genes (MMR can cause MMR deficiency and increased susceptibility to colorectal and endometrial cancer. Microsatellite instability (MSI is the defining molecular signature of MMR deficiency. The clinical classification of identified MMR gene sequence variants has a direct impact on the management of patients and their families. For a significant proportion of cases sequence variants of uncertain clinical significance (also known as unclassified variants are identified, constituting a challenge for genetic counselling and clinical management of families. The effect on protein function of these variants is difficult to interpret. The presence or absence of MSI in tumours can aid in determining the pathogenicity of associated unclassified MMR gene variants. However, there are some considerations that need to be taken into account when using MSI for variant interpretation. The use of MSI and other tumour characteristics in MMR gene sequence variant classification will be explored in this review.
Full Text Available Species-rich genus Primula L. is a typical plant group with which to understand genetic variance between species in different levels of relationships. Chloroplast genome sequences are used to be the information resource for quantifying this difference and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with other related species. This genome of chloroplast showed a typical circular quadripartite structure with 150,859 bp in sequence length consisting of 37.2% GC base. Two inverted repeated regions (25,535 bp were separated by a large single-copy region (82,064 bp and a small single-copy region (17,725 bp. The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes lacking intact open reading frames (ORF were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, comparing with another available plastome of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences showed the identical result that was consistent with previous studies. A transcript found from P. sinensis transcriptome showed a high similarity to plastid accD functional region and was identified as a putative plastid transit peptide at the N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis.
showed Gst value as 0.013 while gene flow estimates using sequence data information revealed, Delta St ... information that will help in better understanding the relatedness in the functions of the .... differentiated enough across these species.
Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar
Maturity-onset diabetes of the youth (MODY), is a genetically and clinically heterogeneous group of diseasesand is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next generation sequencing. A panel of 11 MODY genes were screened in 43 children with MODY diagnosed by clinical criterias. Studies of index cases was done with MISEQ-ILLUMINA, and family screenings and confirmation studies of mutations was done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygote mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. Very high frequency of novel mutations (42%) in our study population, supports that in heterogenous disorders like MODY sequence analysis provides rapid, cost effective and accurate genetic diagnosis.
Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.
Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and human. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10−5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979
Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P
The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 x 10(-4) and polymorphism parameter of 11.5 x 10(-4) are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk
Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, making the initiation of large scale projects aiming to sequence almost 2000 genomes . Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing . We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.
For the vast majority of species – including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach denovo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundred millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
Full Text Available For the vast majority of species - including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach denovo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundred millions of short reads and assign them to the correct BAC clones (deconvolution so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J
For the vast majority of species - including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach denovo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundred millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
Kusunoki, Kazutaka; Nakano, Yuki; Tanaka, Keisuke; Sakata, Yoichi; Koyama, Hiroyuki; Kobayashi, Yuriko
Differences in the expression levels of aluminium (Al) tolerance genes are a known determinant of Al tolerance among plant varieties. We combined transcriptomic analysis of six Arabidopsis thaliana accessions with contrasting Al tolerance and a reverse genetic approach to identify Al-tolerance genes responsible for differences in Al tolerance between accession groups. Gene expression variation increased in the signal transduction process under Al stress and in growth-related processes in the absence of stress. Co-expression analysis and promoter single nucleotide polymorphism searching suggested that both trans-acting polymorphisms of Al signal transduction pathway and cis-acting polymorphisms in the promoter sequences caused the variations in gene expression associated with Al tolerance. Compared with the wild type, Al sensitivity increased in T-DNA knockout (KO) lines for five genes, including TARGET OF AVRB OPERATION1 (TAO1) and an unannotated gene (At5g22530). These were identified from 53 Al-inducible genes showing significantly higher expression in tolerant accessions than in sensitive accessions. These results indicate that the difference in transcriptional signalling is partly associated with the natural variation in Al tolerance in Arabidopsis. Our study also demonstrates the feasibility of comparative transcriptome analysis by using natural genetic variation for the identification of genes responsible for Al stress tolerance. © 2016 John Wiley & Sons Ltd.
Petrovski, Slavé; Todd, Jamie L; Durheim, Michael T; Wang, Quanli; Chien, Jason W; Kelly, Fran L; Frankel, Courtney; Mebane, Caroline M; Ren, Zhong; Bridgers, Joshua; Urban, Thomas J; Malone, Colin D; Finlen Copeland, Ashley; Brinkley, Christie; Allen, Andrew S; O'Riordan, Thomas; McHutchison, John G; Palmer, Scott M; Goldstein, David B
Idiopathic pulmonary fibrosis (IPF) is an increasingly recognized, often fatal lung disease of unknown etiology. The aim of this study was to use whole-exome sequencing to improve understanding of the genetic architecture of pulmonary fibrosis. We performed a case-control exome-wide collapsing analysis including 262 unrelated individuals with pulmonary fibrosis clinically classified as IPF according to American Thoracic Society/European Respiratory Society/Japanese Respiratory Society/Latin American Thoracic Association guidelines (81.3%), usual interstitial pneumonia secondary to autoimmune conditions (11.5%), or fibrosing nonspecific interstitial pneumonia (7.2%). The majority (87%) of case subjects reported no family history of pulmonary fibrosis. We searched 18,668 protein-coding genes for an excess of rare deleterious genetic variation using whole-exome sequence data from 262 case subjects with pulmonary fibrosis and 4,141 control subjects drawn from among a set of individuals of European ancestry. Comparing genetic variation across 18,668 protein-coding genes, we found a study-wide significant (P RTEL1, and PARN. A model qualifying ultrarare, deleterious, nonsynonymous variants implicated TERT and RTEL1, and a model specifically qualifying loss-of-function variants implicated RTEL1 and PARN. A subanalysis of 186 case subjects with sporadic IPF confirmed TERT, RTEL1, and PARN as study-wide significant contributors to sporadic IPF. Collectively, 11.3% of case subjects with sporadic IPF carried a qualifying variant in one of these three genes compared with the 0.3% carrier rate observed among control subjects (odds ratio, 47.7; 95% confidence interval, 21.5-111.6; P = 5.5 × 10 -22 ). We identified TERT, RTEL1, and PARN-three telomere-related genes previously implicated in familial pulmonary fibrosis-as significant contributors to sporadic IPF. These results support the idea that telomere dysfunction is involved in IPF pathogenesis.
Nierychlo, Marta; Larsen, Poul; Jørgensen, Mads Koustrup
S rRNA gene amplicon sequencing has been developed over the past few years and is now ready to use for more comprehensive studies related to plant operation and optimization thanks to short analysis time, low cost, high throughput, and high taxonomic resolution. In this study we show how 16S r......RNA gene amplicon sequencing can be used to reveal factors of importance for the operation of full-scale nutrient removal plants related to settling problems and floc properties. Using optimized DNA extraction protocols, indexed primers and our in-house Illumina platform, we prepared multiple samples...... be correlated to the presence of the species that are regarded as “strong” and “weak” floc formers. In conclusion, 16S rRNA gene amplicon sequencing provides a high throughput approach for a rapid and cheap community profiling of activated sludge that in combination with multivariate statistics can be used...
Porteous David J
Full Text Available Abstract Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
Tercilio Calsa Jr.
Full Text Available Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark or in reproductive organs (flowers and fruits. Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC.
Richards, Emilie J; Brown, Jeremy M; Barley, Anthony J; Chong, Rebecca A; Thomson, Robert C
The use of large genomic datasets in phylogenetics has highlighted extensive topological variation across genes. Much of this discordance is assumed to result from biological processes. However, variation among gene trees can also be a consequence of systematic error driven by poor model fit, and the relative importance of biological versus methodological factors in explaining gene tree variation is a major unresolved question. Using mitochondrial genomes to control for biological causes of gene tree variation, we estimate the extent of gene tree discordance driven by systematic error and employ posterior prediction to highlight the role of model fit in producing this discordance. We find that the amount of discordance among mitochondrial gene trees is similar to the amount of discordance found in other studies that assume only biological causes of variation. This similarity suggests that the role of systematic error in generating gene tree variation is underappreciated and critical evaluation of fit between assumed models and the data used for inference is important for the resolution of unresolved phylogenetic questions.
Daniel J Kliebenstein
Full Text Available BACKGROUND: Most eukaryotic genomes have undergone whole genome duplications during their evolutionary history. Recent studies have shown that the function of these duplicated genes can diverge from the ancestral gene via neo- or sub-functionalization within single genotypes. An additional possibility is that gene duplicates may also undergo partitioning of function among different genotypes of a species leading to genetic differentiation. Finally, the ability of gene duplicates to diverge may be limited by their biological function. METHODOLOGY/PRINCIPAL FINDINGS: To test these hypotheses, I estimated the impact of gene duplication and metabolic function upon intraspecific gene expression variation of segmental and tandem duplicated genes within Arabidopsis thaliana. In all instances, the younger tandem duplicated genes showed higher intraspecific gene expression variation than the average Arabidopsis gene. Surprisingly, the older segmental duplicates also showed evidence of elevated intraspecific gene expression variation albeit typically lower than for the tandem duplicates. The specific biological function of the gene as defined by metabolic pathway also modulated the level of intraspecific gene expression variation. The major energy metabolism and biosynthetic pathways showed decreased variation, suggesting that they are constrained in their ability to accumulate gene expression variation. In contrast, a major herbivory defense pathway showed significantly elevated intraspecific variation suggesting that it may be under pressure to maintain and/or generate diversity in response to fluctuating insect herbivory pressures. CONCLUSION: These data show that intraspecific variation in gene expression is facilitated by an interaction of gene duplication and biological activity. Further, this plays a role in controlling diversity of plant metabolism.
Fay, Justin C.; McCullough, Heather L.; Sniegowski, Paul D.; Eisen, Michael B.
The relationship between genetic variation in gene expression and phenotypic variation observable in nature is not well understood. Identifying how many phenotypes are associated with differences in gene expression and how many gene-expression differences are associated with a phenotype is important to understanding the molecular basis and evolution of complex traits. Results: We compared levels of gene expression among nine natural isolates of Saccharomyces cerevisiae grown either in the presence or absence of copper sulfate. Of the nine strains, two show a reduced growth rate and two others are rust colored in the presence of copper sulfate. We identified 633 genes that show significant differences in expression among strains. Of these genes,20 were correlated with resistance to copper sulfate and 24 were correlated with rust coloration. The function of these genes in combination with their expression pattern suggests the presence of both correlative and causative expression differences. But the majority of differentially expressed genes were not correlated with either phenotype and showed the same expression pattern both in the presence and absence of copper sulfate. To determine whether these expression differences may contribute to phenotypic variation under other environmental conditions, we examined one phenotype, freeze tolerance, predicted by the differential expression of the aquaporin gene AQY2. We found freeze tolerance is associated with the expression of AQY2. Conclusions: Gene expression differences provide substantial insight into the molecular basis of naturally occurring traits and can be used to predict environment dependent phenotypic variation.
Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander
We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Scheuermann, Markus O.; Tajbakhsh, Jian; Kurz, Anette; Saracoglu, Kaan; Eils, Roland; Lichter, Peter
Knowledge about the functional impact of the topological organization of DNA sequences within interphase chromosome territories is still sparse. Of the few analyzed single copy genomic DNA sequences, the majority had been found to localize preferentially at the chromosome periphery or to loop out from chromosome territories. By means of dual-color fluorescence in situ hybridization (FISH), immunolabeling, confocal microscopy, and three-dimensional (3D) image analysis, we analyzed the intraterritorial and nuclear localization of 10 genomic fragments of different sequence classes in four different human cell types. The localization of three muscle-specific genes FLNA, NEB, and TTN, the oncogene BCL2, the tumor suppressor gene MADH4, and five putatively nontranscribed genomic sequences was predominantly in the periphery of the respective chromosome territories, independent from transcriptional status and from GC content. In interphase nuclei, the noncoding sequences were only rarely found associated with heterochromatic sites marked by the satellite III DNA D1Z1 or clusters of mammalian heterochromatin proteins (HP1α, HP1β, HP1γ). However, the nontranscribed sequences were found predominantly at the nuclear periphery or at the nucleoli, whereas genes tended to localize on chromosome surfaces exposed to the nuclear interior
Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Harnyuttanakorn, Pongchai
Antigen-detecting rapid diagnostic tests (RDTs) have been recommended by the World Health Organization for use in remote areas to improve malaria case management. Lactate dehydrogenase (LDH) of Plasmodium falciparum is one of the main parasite antigens employed by various commercial RDTs. It has been hypothesized that the poor detection of LDH-based RDTs is attributed in part to the sequence diversity of the gene. To test this, the present study aimed to investigate the genetic diversity of the P. falciparum ldh gene in Thailand and to construct the map of LDH sequence diversity in P. falciparum populations worldwide. The ldh gene was sequenced for 50 P. falciparum isolates in Thailand and compared with hundreds of sequences from P. falciparum populations worldwide. Several indices of molecular variation were calculated, including the proportion of polymorphic sites, the average nucleotide diversity index (π), and the haplotype diversity index (H). Tests of positive selection and neutrality tests were performed to determine signatures of natural selection on the gene. Mean genetic distance within and between species of Plasmodium ldh was analysed to infer evolutionary relationships. Nucleotide sequences of P. falciparum ldh could be classified into 9 alleles, encoding 5 isoforms of LDH. L1a was the most common allelic type and was distributed in P. falciparum populations worldwide. Plasmodium falciparum ldh sequences were highly conserved, with haplotype and nucleotide diversity values of 0.203 and 0.0004, respectively. The extremely low genetic diversity was maintained by purifying selection, likely due to functional constraints. Phylogenetic analysis inferred the close genetic relationship of P. falciparum to malaria parasites of great apes, rather than to other human malaria parasites. This study revealed the global genetic variation of the ldh gene in P. falciparum, providing knowledge for improving detection of LDH-based RDTs and supporting the candidacy of
Sciarra, Alessandra; Cantucci, Barbara; Galli, Gianfranco; Cinti, Daniele; Pizzino, Luca
Several geochemical surveys (soil gas and shallow water) were performed in the Modena province (Massa Finalese, Finale Emilia, Medolla and S. Felice sul Panaro), during 2006-2014 period. In May-June 2012, a seismic sequence (main shocks of ML 5.9 and 5.8) was occurred closely to the investigated area. In this area 300 CO2 and CH4 fluxes measurements, 150 soil gas concentrations (He, H2, CO2, CH4 and C2H6), 30 shallow waters and their isotopic analyses (δ13C- CH4, δD- CH4 and δ13C- CO2) were performed in April-May 2006, October and December 2008, repeated in May and September 2012, June 2013 and July 2014 afterwards the 2012 Emilia seismic sequences. Chemical composition of soil gas are dominated by CH4 in the southern part by CO2 in the northern part. Very anomalous fluxes and concentrations are recorded in spot areas; elsewhere CO2 and CH4 values are very low, within the typical range of vegetative and of organic exhalation of the cultivated soil. After the seismic sequence the CH4 and CO2 fluxes are increased of one order of magnitude in the spotty areas, whereas in the surrounding area the values are within the background. On the contrary, CH4 concentration decrease (40%v/v in the 2012 surveys) and CO2 concentration increase until to 12.7%v/v (2013 survey). Isotopic gas analysis were carried out only on samples with anomalous values. Pre-seismic data hint a thermogenic origin of CH4 probably linked to leakage from a deep source in the Medolla area. Conversely, 2012/2013 isotopic data indicate a typical biogenic origin (i.e. microbial hydrocarbon production) of the CH4, as recognized elsewhere in the Po Plain and surroundings. The δ13C-CO2 value suggests a prevalent shallow origin of CO2 (i.e. organic and/or soil-derived) probably related to anaerobic oxidation of heavy hydrocarbons. Water samples, collected from domestic, industrial and hydrocarbons exploration wells, allowed us to recognize different families of waters. Waters are meteoric in origin and
Full Text Available Rhinovirus (RV is the most prevalent human respiratory virus and is responsible for at least half of all common colds. RV infections may result in a broad spectrum of effects that range from asymptomatic infections to severe lower respiratory illnesses. The basis for inter-individual variation in the response to RV infection is not well understood. In this study, we explored whether host genetic variation is associated with variation in gene expression response to RV infections between individuals. To do so, we obtained genome-wide genotype and gene expression data in uninfected and RV-infected peripheral blood mononuclear cells (PBMCs from 98 individuals. We mapped local and distant genetic variation that is associated with inter-individual differences in gene expression levels (eQTLs in both uninfected and RV-infected cells. We focused specifically on response eQTLs (reQTLs, namely, genetic associations with inter-individual variation in gene expression response to RV infection. We identified local reQTLs for 38 genes, including genes with known functions in viral response (UBA7, OAS1, IRF5 and genes that have been associated with immune and RV-related diseases (e.g., ITGA2, MSR1, GSTM3. The putative regulatory regions of genes with reQTLs were enriched for binding sites of virus-activated STAT2, highlighting the role of condition-specific transcription factors in genotype-by-environment interactions. Overall, we suggest that the 38 loci associated with inter-individual variation in gene expression response to RV-infection represent promising candidates for affecting immune and RV-related respiratory diseases.
Çalışkan, Minal; Baker, Samuel W; Gilad, Yoav; Ober, Carole
Rhinovirus (RV) is the most prevalent human respiratory virus and is responsible for at least half of all common colds. RV infections may result in a broad spectrum of effects that range from asymptomatic infections to severe lower respiratory illnesses. The basis for inter-individual variation in the response to RV infection is not well understood. In this study, we explored whether host genetic variation is associated with variation in gene expression response to RV infections between individuals. To do so, we obtained genome-wide genotype and gene expression data in uninfected and RV-infected peripheral blood mononuclear cells (PBMCs) from 98 individuals. We mapped local and distant genetic variation that is associated with inter-individual differences in gene expression levels (eQTLs) in both uninfected and RV-infected cells. We focused specifically on response eQTLs (reQTLs), namely, genetic associations with inter-individual variation in gene expression response to RV infection. We identified local reQTLs for 38 genes, including genes with known functions in viral response (UBA7, OAS1, IRF5) and genes that have been associated with immune and RV-related diseases (e.g., ITGA2, MSR1, GSTM3). The putative regulatory regions of genes with reQTLs were enriched for binding sites of virus-activated STAT2, highlighting the role of condition-specific transcription factors in genotype-by-environment interactions. Overall, we suggest that the 38 loci associated with inter-individual variation in gene expression response to RV-infection represent promising candidates for affecting immune and RV-related respiratory diseases.
Full Text Available High throughput sequencing (HTS yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner, a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion – Deletion Identifier, a program for identifying legitimate and artifact insertions and/or deletions (indels. Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.
Duffy, Michael F; Tang, Jingyi; Sumardy, Fransisca; Nguyen, Hanh H T; Selvarajah, Shamista A; Josling, Gabrielle A; Day, Karen P; Petter, Michaela; Brown, Graham V
The Plasmodium falciparum var multigene family encodes the cytoadhesive, variant antigen PfEMP1. P. falciparum antigenic variation and cytoadhesion specificity are controlled by epigenetic switching between the single, or few, simultaneously expressed var genes. Most var genes are maintained in perinuclear clusters of heterochromatic telomeres. The active var gene(s) occupy a single, perinuclear var expression site. It is unresolved whether the var expression site forms in situ at a telomeric cluster or whether it is an extant compartment to which single chromosomes travel, thus controlling var switching. Here we show that transcription of a var gene did not require decreased colocalisation with clusters of telomeres, supporting var expression site formation in situ. However following recombination within adjacent subtelomeric sequences, the same var gene was persistently activated and did colocalise less with telomeric clusters. Thus, participation in stable, heterochromatic, telomere clusters and var switching are independent but are both affected by subtelomeric sequences. The var expression site colocalised with the euchromatic mark H3K27ac to a greater extent than it did with heterochromatic H3K9me3. H3K27ac was enriched within the active var gene promoter even when the var gene was transiently repressed in mature parasites and thus H3K27ac may contribute to var gene epigenetic memory. © 2016 Federation of European Biochemical Societies.
ArulJothi, K N; Suruthi Abirami, B; Devi, Arikketh
Low density lipoprotein receptor (LDLR) is a membrane bound receptor maintaining cholesterol homeostasis along with Apolipoprotein B (APOB), Proprotein Convertase Subtilisin/Kexin type 9 (PCSK9) and other genes of lipid metabolism. Any pathogenic variation in these genes alters the function of the receptor and leads to Familial Hypercholesterolemia (FH) and other cardiovascular diseases. This study was aimed at screening the LDLR, APOB and PCSK9 genes in Hypercholesterolemic patients to define the genetic spectrum of FH in Indian population. Familial Hypercholesterolemia patients (n=78) of South Indian Tamil population with LDL cholesterol and Total cholesterol levels above 4.9mmol/l and 7.5mmol/l with family history of Myocardial infarction were involved. DNA was isolated by organic extraction method from blood samples and LDLR, APOB and PCSK9 gene exons were amplified using primers that cover exon-intron boundaries. The amplicons were screened using High Resolution Melt (HRM) Analysis and the screened samples were sequenced after purification. This study reports 20 variations in South Indian population for the first time. In this set of variations 9 are novel variations which are reported for the first time, 11 were reported in other studies also. The in silico analysis for all the variations detected in this study were done to predict the probabilistic effect in pathogenicity of FH. This study adds 9 novel variations and 11 recurrent variations to the spectrum of LDLR gene mutations in Indian population. All these variations are reported for the first time in Indian population. This spectrum of variations was different from the variations of previous Indian reports. Copyright © 2017 Elsevier B.V. All rights reserved.
Sin, Yung Wa; Dugdale, Hannah L.; Newman, Chris; Macdonald, David W.; Burke, Terry
The major histocompatibility complex (MHC) comprises many genes, some of which are polymorphic with numerous alleles. Sequence variation among alleles is most pronounced in exon 2 of the class II genes, which encodes the alpha 1 and beta 1 domains that form the antigen-binding site (ABS) for the
Ashley David M
Full Text Available Abstract Background Rhabdoid tumors are rare cancers of early childhood arising in the kidney, central nervous system and other organs. The majority are caused by somatic inactivating mutations or deletions affecting the tumor suppressor locus SMARCB1 [OMIM 601607]. Germ-line SMARCB1 inactivation has been reported in association with rhabdoid tumor, epitheloid sarcoma and familial schwannomatosis, underscoring the importance of accurate mutation screening to ascertain recurrence and transmission risks. We describe a rapid and sensitive diagnostic screening method, using high resolution melting (HRM, for detecting sequence variations in SMARCB1. Methods Amplicons, encompassing the nine coding exons of SMARCB1, flanking splice site sequences and the 5' and 3' UTR, were screened by both HRM and direct DNA sequencing to establish the reliability of HRM as a primary mutation screening tool. Reaction conditions were optimized with commercially available HRM mixes. Results The false negative rate for detecting sequence variants by HRM in our sample series was zero. Nine amplicons out of a total of 140 (6.4% showed variant melt profiles that were subsequently shown to be false positive. Overall nine distinct pathogenic SMARCB1 mutations were identified in a total of 19 possible rhabdoid tumors. Two tumors had two distinct mutations and two harbored SMARCB1 deletion. Other mutations were nonsense or frame-shifts. The detection sensitivity of the HRM screening method was influenced by both sequence context and specific nucleotide change and varied from 1: 4 to 1:1000 (variant to wild-type DNA. A novel method involving digital HRM, followed by re-sequencing, was used to confirm mutations in tumor specimens containing associated normal tissue. Conclusions This is the first report describing SMARCB1 mutation screening using HRM. HRM is a rapid, sensitive and inexpensive screening technology that is likely to be widely adopted in diagnostic laboratories to
Dagar, Vinod; Chow, Chung-Wo; Ashley, David M; Algar, Elizabeth M
Rhabdoid tumors are rare cancers of early childhood arising in the kidney, central nervous system and other organs. The majority are caused by somatic inactivating mutations or deletions affecting the tumor suppressor locus SMARCB1 [OMIM 601607]. Germ-line SMARCB1 inactivation has been reported in association with rhabdoid tumor, epitheloid sarcoma and familial schwannomatosis, underscoring the importance of accurate mutation screening to ascertain recurrence and transmission risks. We describe a rapid and sensitive diagnostic screening method, using high resolution melting (HRM), for detecting sequence variations in SMARCB1. Amplicons, encompassing the nine coding exons of SMARCB1, flanking splice site sequences and the 5' and 3' UTR, were screened by both HRM and direct DNA sequencing to establish the reliability of HRM as a primary mutation screening tool. Reaction conditions were optimized with commercially available HRM mixes. The false negative rate for detecting sequence variants by HRM in our sample series was zero. Nine amplicons out of a total of 140 (6.4%) showed variant melt profiles that were subsequently shown to be false positive. Overall nine distinct pathogenic SMARCB1 mutations were identified in a total of 19 possible rhabdoid tumors. Two tumors had two distinct mutations and two harbored SMARCB1 deletion. Other mutations were nonsense or frame-shifts. The detection sensitivity of the HRM screening method was influenced by both sequence context and specific nucleotide change and varied from 1: 4 to 1:1000 (variant to wild-type DNA). A novel method involving digital HRM, followed by re-sequencing, was used to confirm mutations in tumor specimens containing associated normal tissue. This is the first report describing SMARCB1 mutation screening using HRM. HRM is a rapid, sensitive and inexpensive screening technology that is likely to be widely adopted in diagnostic laboratories to facilitate whole gene mutation screening
Y. J. HAN
[Han Y. J., Chen Y., Liu Y. and Liu X. L. 2017 Sequence variants of the LCORL gene and its association with growth and carcass traits in. Qinchuan cattle in China. J. Genet. 96, xx–xx]. Introduction. Genetically selecting is a better way to satisfy the growing customer requirement with the development of beef cattle industry ...
Stanton, L.W.; Schwab, M.; Bishop, J.M.
Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions
Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing
Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity which is derived based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Comparing with the previous models, the proposed model can reveal more biological sense.
Dor, Roi; Cooper, Caren B; Lovette, Irby J; Massoni, Viviana; Bulit, Flor; Liljesthrom, Marcela; Winkler, David W
Many animals use photoperiod cues to synchronize reproduction with environmental conditions and thereby improve their reproductive success. The circadian clock, which creates endogenous behavioral and physiological rhythms typically entrained to photoperiod, is well characterized at the molecular level. Recent work provided evidence for an association between Clock poly-Q length polymorphism and latitude and, within a population, an association with the date of laying and the length of the incubation period. Despite relatively high overall breeding synchrony, the timing of clutch initiation has a large impact on the fitness of swallows in the genus Tachycineta. We compared length polymorphism in the Clock poly-Q region among five populations from five different Tachycineta species that breed across a hemisphere-wide latitudinal gradient (Fig. 1). Clock poly-Q variation was not associated with latitude; however, there was an association between Clock poly-Q allele diversity and the degree of clutch size decline within breeding seasons. We did not find evidence for an association between Clock poly-Q variation and date of clutch initiation in for any of the five Tachycineta species, nor did we found a relationship between incubation duration and Clock genotype. Thus, there is no general association between latitude, breeding phenology, and Clock polymorphism in this clade of closely related birds.Figure 1Photos of Tachycineta swallows that were used in this study: A) T. bicolor from Ithaca, New York, B) T. leucorrhoa from Chascomús, Argentina, C) T. albilinea from Hill Bank, Belize, D) T. meyeni from Puerto Varas, Chile, and E) T. thalassina from Mono Lake, California, Photographers: B: Valentina Ferretti; A, C-E: David Winkler.
Silverman, M.; Zieg, J.; Simon, M.
In Salmonella, expression of flagellar antigen alternates between two serotypes (phases) encoded by two genes, H1 and H2. The mechanism which controls the alternative expression of the H1 and H2 genes was examined by cloning these genes and the genetic elements which control their activity on hybrid vehicles in Escherichia coli. H2 gene activity was shown to be controlled by a recombinational switch located adjacent to the H2 gene. Activity of the H1 gene is thought to be repressed, when the H2 gene is expressed, by the product of another gene, rhl (repressor of H1), which is controlled coordinately with the H2 gene. In this report, we describe the construction of hybrid lambda vehicles which contain, in addition to the H2 gene, a genetic activity corresponding to rhl. Variation of flagellar antigens analogous to that observed in Salmonella was observed when E. coli strains were transduced with the hybrid lambda. By using the lambda H2rhl hybrid to program protein syntheis in uv-irradiated cells, the synthesis of a polypeptide was correlated with rhl gene product activity. We conclude that the H2 region consists of two cotranscribed genes, H2 and rhl. The expression of both gene products is regulated by the same recombinational event
Stewart Lindsay B
Full Text Available Abstract Background Gene copy number variation (CNV is responsible for several important phenotypes of the malaria parasite Plasmodium falciparum, including drug resistance, loss of infected erythrocyte cytoadherence and alteration of receptor usage for erythrocyte invasion. Despite the known effects of CNV, little is known about its extent throughout the genome. Results We performed a whole-genome survey of CNV genes in P. falciparum using comparative genome hybridisation of a diverse set of 16 laboratory culture-adapted isolates to a custom designed high density Affymetrix GeneChip array. Overall, 186 genes showed hybridisation signals consistent with deletion or amplification in one or more isolate. There is a strong association of CNV with gene length, genomic location, and low orthology to genes in other Plasmodium species. Sub-telomeric regions of all chromosomes are strongly associated with CNV genes independent from members of previously described multigene families. However, ~40% of CNV genes were located in more central regions of the chromosomes. Among the previously undescribed CNV genes, several that are of potential phenotypic relevance are identified. Conclusion CNV represents a major form of genetic variation within the P. falciparum genome; the distribution of gene features indicates the involvement of highly non-random mutational and selective processes. Additional studies should be directed at examining CNV in natural parasite populations to extend conclusions to clinical settings.
Li, Xia; Bachmanov, Alexander A; Maehashi, Kenji; Li, Weihua; Lim, Raymond; Brand, Joseph G; Beauchamp, Gary K; Reed, Danielle R; Thai, Chloe; Floriano, Wely B
Aspartame is a sweetener added to foods and beverages as a low-calorie sugar replacement. Unlike sugars, which are apparently perceived as sweet and desirable by a range of mammals, the ability to taste aspartame varies, with humans, apes, and Old World monkeys perceiving aspartame as sweet but not other primate species. To investigate whether the ability to perceive the sweetness of aspartame correlates with variations in the DNA sequence of the genes encoding sweet taste receptor proteins, T1R2 and T1R3, we sequenced these genes in 9 aspartame taster and nontaster primate species. We then compared these sequences with sequences of their orthologs in 4 other nontasters species. We identified 9 variant sites in the gene encoding T1R2 and 32 variant sites in the gene encoding T1R3 that distinguish aspartame tasters and nontasters. Molecular docking of aspartame to computer-generated models of the T1R2 + T1R3 receptor dimer suggests that species variation at a secondary, allosteric binding site in the T1R2 protein is the most likely origin of differences in perception of the sweetness of aspartame. These results identified a previously unknown site of aspartame interaction with the sweet receptor and suggest that the ability to taste aspartame might have developed during evolution to exploit a specialized food niche.
Grell, Morten Nedergaard; Jensen, Annette Bruun; Lange, Lene
Fungi within the order Entomophthorales (subphylum Entomophthoromycotina) are obligate biotrophic pathogens of arthropods with a remarkable narrow host range. Infection takes place through the cuticle when conidia hit a susceptible host, facilitated by enzymatic and mechanical mechanisms. In the ...... pathogenicity genes within genera Entomophthora and Pandora, using fungal genomic DNA originating from field-collected, infected insect host species of dipteran (flies, mosquitoes) or hemipteran (aphid) origin.......Fungi within the order Entomophthorales (subphylum Entomophthoromycotina) are obligate biotrophic pathogens of arthropods with a remarkable narrow host range. Infection takes place through the cuticle when conidia hit a susceptible host, facilitated by enzymatic and mechanical mechanisms......, conidia are produced and discharged when humidity gets high—usually during night. In an earlier secretome study of field-collected grain aphids (Sitobion avenae) infected with entomophthoralean fungi, a number of pathogenesis-related, secreted enzymes were discovered (Fungal Genetics and Biology 2011, vol...
Wu, Jia Qian; Shteynberg, David; Arumugam, Manimozhiyan
an alternative approach: reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on dual-genome de novo predictions from TWINSCAN. We tested 444 TWINSCAN-predicted rat genes that showed significant homology to known human genes implicated in disease but that were partially...... in the single-intron experiment. Spliced sequences were amplified in 46 cases (34%). We conclude that this procedure for elucidating gene structures with native cDNA sequences is cost-effective and will become even more so as it is further optimized.......The publication of a draft sequence of a third mammalian genome--that of the rat--suggests a need to rethink genome annotation. New mammalian sequences will not receive the kind of labor-intensive annotation efforts that are currently being devoted to human. In this paper, we demonstrate...
Ning, Hong-Rui; Huang, Si-Yang; Wang, Jin-Lei; Xu, Qian-Ming; Zhu, Xing-Quan
Toxoplasma gondii is a eukaryotic parasite of the phylum Apicomplexa, which infects all warm-blood animals, including humans. In the present study, we examined sequence variation in dense granule 20 (GRA20) genes among T. gondii isolates collected from different hosts and geographical regions worldwide. The complete GRA20 genes were amplified from 16 T. gondii isolates using PCR, sequence were analyzed, and phylogenetic reconstruction was analyzed by maximum parsimony (MP) and maximum likelihood (ML) methods. The results showed that the complete GRA20 gene sequence was 1,586 bp in length among all the isolates used in this study, and the sequence variations in nucleotides were 0-7.9% among all strains. However, removing the type III strains (CTG, VEG), the sequence variations became very low, only 0-0.7%. These results indicated that the GRA20 sequence in type III was more divergence. Phylogenetic analysis of GRA20 sequences using MP and ML methods can differentiate 2 major clonal lineage types (type I and type III) into their respective clusters, indicating the GRA20 gene may represent a novel genetic marker for intraspecific phylogenetic analyses of T. gondii.
Jin, Jingjing; Lee, May; Bai, Bin; Sun, Yanwei; Qu, Jing; Rahmadsyah; Alfiko, Yuzer; Lim, Chin Huat; Suwanto, Antonius; Sugiharti, Maria; Wong, Limsoon; Ye, Jian; Chua, Nam-Hai; Yue, Gen Hua
Oil palm is the world's leading source of vegetable oil and fat. Dura, Pisifera and Tenera are three forms of oil palm. The genome sequence of Pisifera is available whereas the Dura form has not been sequenced yet. We sequenced the genome of one elite Dura palm, and re-sequenced 17 palm genomes. The assemble genome sequence of the elite Dura tree contained 10,971 scaffolds and was 1.701 Gb in length, covering 94.49% of the oil palm genome. 36,105 genes were predicted. Re-sequencing of 17 additional palm trees identified 18.1 million SNPs. We found high genetic variation among palms from different geographical regions, but lower variation among Southeast Asian Dura and Pisifera palms. We mapped 10,000 SNPs on the linkage map of oil palm. In addition, high linkage disequilibrium (LD) was detected in the oil palms used in breeding populations of Southeast Asia, suggesting that LD mapping is likely to be practical in this important oil crop. Our data provide a valuable resource for accelerating genetic improvement and studying the mechanism underlying phenotypic variations of important oil palm traits. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe; Probst, Aline V
Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization.
Parker, Jennifer K; Havird, Justin C; De La Fuente, Leonardo
Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops.
Lin, Dong; Shi, Y.; Miller, W.L.
Adrenodoxin reductase is a flavoprotein mediating electron transport to all mitochondrial forms of cytochrome P450. The authors cloned the human adrenodoxin reductase gene and characterized it by restriction endonuclease mapping and DNA sequencing. The entire gene is approximately 12 kilobases long and consists of 12 exons. The first exon encodes the first 26 of the 32 amino acids of the signal peptide, and the second exon encodes the remainder of signal peptide and the apparent FAD binding site. The remaining 10 exons are clustered in a region of only 4.3 kilobases, separated from the first two exons by a large intron of about 5.6 kilobases. Two forms of human adrenodoxin reductase mRNA, differing by the presence or absence of 18 bases in the middle of the sequence, arise from alternate splicing at the 5' end of exon 7. This alternately spliced region is directly adjacent to the NADPH binding site, which is entirely contained in exon 6. The immediate 5' flanking region lacks TATA and CAAT boxes; however, this region is rich in G+C and contains six copies of the sequence GGGCGGG, resembling promoter sequences of housekeeping genes. RNase protection experiments show that transcription is initiated from multiple sites in the 5' flanking region, located about 21-91 base pairs upstream from the AUG translational initiation codon
Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: email@example.com Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
da Silva, M N; Patton, J L
Patterns of evolutionary relationships among haplotype clades of sequences of the mitochondrial cytochrome b DNA gene are examined for five genera of arboreal rodents of the Caviomorph family Echimyidae from the Amazon Basin. Data are available for 798 bp of sequence from a total of 24 separate localities in Peru, Venezuela, Bolivia, and Brazil for Mesomys, Isothrix, Makalata, Dactylomys, and Echimys. Sequence divergence, corrected for multiple hits, is extensive, ranging from less than 1% for comparisons within populations of over 20% among geographic units within genera. Both the degree of differentiation and the geographic patterning of the variation suggest that more than one species composes the Amazonian distribution of the currently recognized Mesomys hispidus, Isothrix bistriata, Makalata didelphoides, and Dactylomys dactylinus. There is general concordance in the geographic range of haplotype clades for each of these taxa, and the overall level of differentiation within them is largely equivalent. These observations suggest that a common vicariant history underlies the respective diversification of each genus. However, estimated times of divergence based on the rate of third position transversion substitutions for the major clades within each genus typically range above 1 million years. Thus, allopatric isolation precipitating divergence must have been considerably earlier than the late Pleistocene forest fragmentation events commonly invoked for Amazonian biota.
Garcia-Fernàndez, J; Baguñà, J; Saló, E
Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to data. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appear slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) or regeneration. Images PMID:1714599
Probst, Mario C. O.; Thumann, Harald; Aslanidis, Charalampos; Langmann, Thomas; Buechler, Christa; Patsch, Wolfgang; Baralle, Francisco E.; Dallinga-Thie, Geesje M.; Geisel, Jürgen; Keller, Christiane; Menys, Valentine C.; Schmitz, Gerd
Mutations in the ATP-binding cassette 1 transporter gene (ABCA1) are responsible for the genetic HDL-deficiency syndromes, which are characterized by severely diminished plasma HDL-C levels and a predisposition to cardiovascular disease and splenomegaly. The ABCA1 gene contains 50 exons and codes
Full Text Available Abstract The genetic variability of the mitochondrial D-loop DNA sequence in seven horse breeds bred in Italy (Giara, Haflinger, Italian trotter, Lipizzan, Maremmano, Thoroughbred and Sarcidano was analysed. Five unrelated horses were chosen in each breed and twenty-two haplotypes were identified. The sequences obtained were aligned and compared with a reference sequence and with 27 mtDNA D-loop sequences selected in the GenBank database, representing Spanish, Portuguese, North African, wild horses and an Equus asinus sequence as the outgroup. Kimura two-parameter distances were calculated and a cluster analysis using the Neighbour-joining method was performed to obtain phylogenetic trees among breeds bred in Italy and among Italian and foreign breeds. The cluster analysis indicates that all the breeds but Giara are divided in the two trees, and no clear relationships were revealed between Italian populations and the other breeds. These results could be interpreted as showing the mixed origin of breeds bred in Italy and probably indicate the presence of many ancient maternal lineages with high diversity in mtDNA sequences.
Palmer, Guy H; Futse, James E; Knowles, Donald P; Brayton, Kelly A
Persistence of Anaplasma spp. in the animal reservoir host is required for efficient tick-borne transmission of these pathogens to animals and humans. Using A. marginale infection of its natural reservoir host as a model, persistent infection has been shown to reflect sequential cycles in which antigenic variants emerge, replicate, and are controlled by the immune system. Variation in the immunodominant outer-membrane protein MSP2 is generated by a process of gene conversion, in which unique hypervariable region sequences (HVRs) located in pseudogenes are recombined into a single operon-linked msp2 expression site. Although organisms expressing whole HVRs derived from pseudogenes emerge early in infection, long-term persistent infection is dependent on the generation of complex mosaics in which segments from different HVRs recombine into the expression site. The resulting combinatorial diversity generates the number of variants both predicted and shown to emerge during persistence.
Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed an in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L. Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.
Roecklein, Kathryn A.; Wong, Patricia M.; Franzen, Peter L.; Hasler, Brant P.; Wood-Vasey, W. Michael; Nimgaonkar, Vishwajit L.; Miller, Megan A.; Kepreos, Kyle M.; Ferrell, Robert E.; Manuck, Stephen B.
The human melanopsin gene has been reported to mediate risk for seasonal affective disorder (SAD), which is hypothesized to be caused by decreased photic input during winter when light levels fall below threshold, resulting in differences in circadian phase and/or sleep. However, it is unclear if melanopsin increases risk of SAD by causing differences in sleep or circadian phase, or if those differences are symptoms of the mood disorder. To determine if melanopsin sequence variations are asso...
Davidsen, T.; Rodland, E.A.; Lagesen, K.
Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within...... in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions....
Amila, A; Acosta, A; Sarmiento, M E; Suraiya, Siti; Zafarina, Z; Panneerchelvam, S; Norazmi, M N
MicroRNAs (miRNAs) play an important role in diseases development. Therefore, human miRNAs may be able to inhibit the survival of Mycobacterium tuberculosis (Mtb) in the human host by targeting critical genes of the pathogen. Mutations within miRNAs can alter their target selection, thereby preventing them from inhibiting Mtb genes, thus increasing host susceptibility to the disease. This study was undertaken to investigate the genetic association of pulmonary tuberculosis (TB) with six human miRNAs genes, namely, hsa-miR-370, hsa-miR-520d, hsa-miR-154, hsa-miR-497, hsa-miR-758, and hsa-miR-593, which have been predicted to interact with Mtb genes. The objective of the study was to determine the possible sequence variation of selected miRNA genes that are potentially associated with the inhibition of critical Mtb genes in TB patients. The study did not show differences in the sequences compared with healthy individuals without antecedents of TB. This result could have been influenced by the sample size and the selection of miRNA genes, which need to be addressed in future studies. Copyright © 2015 Asian African Society for Mycobacteriology. Published by Elsevier Ltd. All rights reserved.
Debora S Marks
Full Text Available The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing.In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy.We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å C(α-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org. This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of
Pritchard, Victoria L; Viitaniemi, Heidi M; McCairns, R J Scott; Merilä, Juha; Nikinmaa, Mikko; Primmer, Craig R; Leder, Erica H
Much adaptive evolutionary change is underlain by mutational variation in regions of the genome that regulate gene expression rather than in the coding regions of the genes themselves. An understanding of the role of gene expression variation in facilitating local adaptation will be aided by an understanding of underlying regulatory networks. Here, we characterize the genetic architecture of gene expression variation in the threespine stickleback (Gasterosteus aculeatus), an important model in the study of adaptive evolution. We collected transcriptomic and genomic data from 60 half-sib families using an expression microarray and genotyping-by-sequencing, and located expression quantitative trait loci (eQTL) underlying the variation in gene expression in liver tissue using an interval mapping approach. We identified eQTL for several thousand expression traits. Expression was influenced by polymorphism in both cis- and trans-regulatory regions. Trans-eQTL clustered into hotspots. We did not identify master transcriptional regulators in hotspot locations: rather, the presence of hotspots may be driven by complex interactions between multiple transcription factors. One observed hotspot colocated with a QTL recently found to underlie salinity tolerance in the threespine stickleback. However, most other observed hotspots did not colocate with regions of the genome known to be involved in adaptive divergence between marine and freshwater habitats. Copyright © 2017 Pritchard et al.
Victoria L. Pritchard
Full Text Available Much adaptive evolutionary change is underlain by mutational variation in regions of the genome that regulate gene expression rather than in the coding regions of the genes themselves. An understanding of the role of gene expression variation in facilitating local adaptation will be aided by an understanding of underlying regulatory networks. Here, we characterize the genetic architecture of gene expression variation in the threespine stickleback (Gasterosteus aculeatus, an important model in the study of adaptive evolution. We collected transcriptomic and genomic data from 60 half-sib families using an expression microarray and genotyping-by-sequencing, and located expression quantitative trait loci (eQTL underlying the variation in gene expression in liver tissue using an interval mapping approach. We identified eQTL for several thousand expression traits. Expression was influenced by polymorphism in both cis- and trans-regulatory regions. Trans-eQTL clustered into hotspots. We did not identify master transcriptional regulators in hotspot locations: rather, the presence of hotspots may be driven by complex interactions between multiple transcription factors. One observed hotspot colocated with a QTL recently found to underlie salinity tolerance in the threespine stickleback. However, most other observed hotspots did not colocate with regions of the genome known to be involved in adaptive divergence between marine and freshwater habitats.
dePamphilis Claude W
Full Text Available Abstract Background The analysis of synonymous and nonsynonymous rates of DNA change can help in the choice among competing explanations for rate variation, such as differences in constraint, mutation rate, or the strength of genetic drift. Nonphotosynthetic plants of the Orobanchaceae have increased rates of DNA change. In this study 38 taxa of Orobanchaceae and relatives were used and 3 plastid genes were sequenced for each taxon. Results Phylogenetic reconstructions of relative rates of sequence evolution for three plastid genes (rbcL, matK and rps2 show significant rate heterogeneity among lineages and among genes. Many of the non-photosynthetic plants have increases in both synonymous and nonsynonymous rates, indicating that both (1 selection is relaxed, and (2 there has been a change in the rate at which mutations are entering the population in these species. However, rate increases are not always immediate upon loss of photosynthesis. Overall there is a poor correlation of synonymous and nonsynonymous rates. There is, however, a strong correlation of synonymous rates across the 3 genes studied and the lineage-speccific pattern for each gene is strikingly similar. This indicates that the causes of synonymous rate variation are affecting the whole plastid genome in a similar way. There is a weaker correlation across genes for nonsynonymous rates. Here the picture is more complex, as could be expected if there are many causes of variation, differing from taxon to taxon and gene to gene. Conclusions The distinctive pattern of rate increases in Orobanchaceae has at least two causes. It is clear that there is a relaxation of constraint in many (though not all non-photosynthetic lineages. However, there is also some force affecting synonymous sites as well. At this point, it is not possible to tell whether it is generation time, speciation rate, mutation rate, DNA repair efficiency or some combination of these factors.
necting the Middle East, Europe and Central Asia, and, thus, has been subject to major population movements. The ... from different parts of Anatolia by direct sequencing. Analysis of the two ... the country, samples were obtained from individuals com- ing from ..... Arlequin: a software environment for the analysis of popula-.
Jang, Yikweon; Hahm, Eun Hye; Lee, Hyun-Jung; Park, Soyeon; Won, Yong-Jin; Choe, Jae C.
Background In a species with a large distribution relative to its dispersal capacity, geographic variation in traits may be explained by gene flow, selection, or the combined effects of both. Studies of genetic diversity using neutral molecular markers show that patterns of isolation by distance (IBD) or barrier effect may be evident for geographic variation at the molecular level in amphibian species. However, selective factors such as habitat, predator, or interspecific interactions may be critical for geographic variation in sexual traits. We studied geographic variation in advertisement calls in the tree frog Hyla japonica to understand patterns of variation in these traits across Korea and provide clues about the underlying forces for variation. Methodology We recorded calls of H. japonica in three breeding seasons from 17 localities including localities in remote Jeju Island. Call characters analyzed were note repetition rate (NRR), note duration (ND), and dominant frequency (DF), along with snout-to-vent length. Results The findings of a barrier effect on DF and a longitudinal variation in NRR seemed to suggest that an open sea between the mainland and Jeju Island and mountain ranges dominated by the north-south Taebaek Mountains were related to geographic variation in call characters. Furthermore, there was a pattern of IBD in mitochondrial DNA sequences. However, no comparable pattern of IBD was found between geographic distance and call characters. We also failed to detect any effects of habitat or interspecific interaction on call characters. Conclusions Geographic variations in call characters as well as mitochondrial DNA sequences were largely stratified by geographic factors such as distance and barriers in Korean populations of H. japoinca. Although we did not detect effects of habitat or interspecific interaction, some other selective factors such as sexual selection might still be operating on call characters in conjunction with restricted gene
Stat, Michael; Bird, Christopher E; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J; Concepcion, Gregory T; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J; Gates, Ruth D
Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping.
Gardiner, Sarah L.; van Belzen, Martine J.; Boogaard, Merel W.
Huntington disease (HD) is a severe neuropsychiatric disorder caused by a cytosine-adenine-guanine (CAG) repeat expansion in the HTT gene. Although HD is frequently complicated by depression, it is still unknown to what extent common HTT CAG repeat size variations in the normal range could affect...
Sepú lveda, Nuno; Campino, Susana G; Assefa, Samuel A; Sutherland, Colin J; Pain, Arnab; Clark, Taane G
Background: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model.Results: Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates.Conclusions: In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data. 2013 Seplveda et al.; licensee BioMed Central Ltd.
Sepúlveda, Nuno; Campino, Susana G; Assefa, Samuel A; Sutherland, Colin J; Pain, Arnab; Clark, Taane G
The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model. Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates. In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data.
Background: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model.Results: Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates.Conclusions: In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data. 2013 Seplveda et al.; licensee BioMed Central Ltd.
Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M
X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.
Ayukov, S. V.; Baturin, V. A.; Gorshkov, A. B.; Oreshina, A. V.
Our Sun became Main Sequence star 4.6 Gyr ago according Standard Solar Model. At that time solar luminosity was 30% lower than current value. This conclusion is based on assumption that Sun is fueled by thermonuclear reactions. If Earth's albedo and emissivity in infrared are unchanged during Earth history, 2.3 Gyr ago oceans had to be frozen. This contradicts to geological data: there was liquid water 3.6-3.8 Gyr ago on Earth. This problem is known as Faint Young Sun Paradox. We analyze luminosity change in standard solar evolution theory. Increase of mean molecular weight in the central part of the Sun due to conversion of hydrogen to helium leads to gradual increase of luminosity with time on the Main Sequence. We also consider several exotic models: fully mixed Sun; drastic change of pp reaction rate; Sun consisting of hydrogen and helium only. Solar neutrino observations however exclude most non-standard solar models.
Windig, J.J.; Hoving, R.A.H.; Priem, J.; Bossers, A.; Keulen, van L.J.M.; Langeveld, J.P.M.
Scrapie is a neurodegenerative disease occurring in goats and sheep. Several haplotypes of the prion protein increase resistance to scrapie infection and may be used in selective breeding to help eradicate scrapie. In this study, frequencies of the allelic variants of the PrP gene are determined
Rane, Hallie S; Smith, Jessica M; Bergthorsson, Ulfar; Katju, Vaishali
Gene conversion, a form of concerted evolution, bears enormous potential to shape the trajectory of sequence and functional divergence of gene paralogs subsequent to duplication events. fog-2, a sex-determination gene unique to Caenorhabditis elegans and implicated in the origin of hermaphroditism in this species, resulted from the duplication of ftr-1, an upstream gene of unknown function. Synonymous sequence divergence in regions of fog-2 and ftr-1 (excluding recent gene conversion tracts) suggests that the duplication occurred 46 million generations ago. Gene conversion between fog-2 and ftr-1 was previously discovered in experimental fog-2 knockout lines of C. elegans, whereby hermaphroditism was restored in mutant obligately outcrossing male-female populations. We analyzed DNA-sequence variation in fog-2 and ftr-1 within 40 isolates of C. elegans from diverse geographic locations in order to evaluate the contribution of gene conversion to genetic variation in the two gene paralogs. The analysis shows that gene conversion contributes significantly to DNA-sequence diversity in fog-2 and ftr-1 (22% and 34%, respectively) and may have the potential to alter sexual phenotypes in natural populations. A radical amino acid change in a conserved region of the F-box domain of fog-2 was found in natural isolates of C. elegans with significantly lower fecundity. We hypothesize that the lowered fecundity is due to reduced masculinization and less sperm production and that amino acid replacement substitutions and gene conversion in fog-2 may contribute significantly to variation in the degree of inbreeding and outcrossing in natural populations.
Hoang, Van L T; Innes, David J; Shaw, P Nicholas; Monteith, Gregory R; Gidley, Michael J; Dietzgen, Ralf G
Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three different mango varieties Mangifera indica L., a member of the family Anacardiaceae: Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM) and to determine associations with gene expression and mango flavonoid profiles. A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera), was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties. We discovered an association between the extent of sequence variation and
M. Ananda Chitra
Full Text Available Background: Staphylococcus pseudintermedius (SP is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence analyzing the AgrA, B, and D of SP isolated from canine skin infections. Materials and Methods: In this study, we have isolated and identified SP from canine pyoderma and otitis cases by polymerase chain reaction (PCR and confirmed by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for its specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR and sequencing of agrA, B, and D were carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59% SP isolates were obtained from 90 samples. 15 isolates (28% were confirmed to be methicillinresistant SP (MRSP with the detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP. Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF based on the putative AIP produced by each strain
Marcelo Bento Soares
In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.
Nelson, Matthew R.; Wegmann, Daniel; Ehm, Margaret G.
Rare genetic variants contribute to complex disease risk; however, the abundance of rare variants in human populations remains unknown. We explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. We find rare variants are abundant (1 every 17 bases)...
Full Text Available Bali cattle is very popular Indonesian local beef related to their status in community living process of farmers in Indonesia, especially as providers of meat and exotic animal. Bali cattle were able to adapt the limited environment and becoming local livestock that existed until recently. In our early study by microsatellites showed that Bali cattle have specific allele. In this study we analyzed the variance of partly sex related Y (SRY gene sequence in Bali cattle bull as a source of cement for Artificial Insemination (AI. Blood from 17 two location of AI center, Singosari, Malang and Baturiti, Bali was collected and then extracted to get the DNA genome. PCR reaction was done to amplify partially of SRY gene segment and followed by sequencing PCR products to get the DNA sequence of SRY gene. The SRY gene sequence was used to determine the genetic variation and phylogenetic relationship. We found that Bali cattle bull from Singosari has relatively closed genetic relationship with Baturiti. It is also supported that in early data some Bali bulls of Singosari were came from Baturiti. It has been known that Baturiti is the one source of Bali cattle bull with promising genetic potential. While, in general that Bali bull where came from two areas were not different on reproductive performances. It is important to understand about the genetic variation of Bali cattle in molecular level related to conservation effort and maintaining the genetic characters of the local cattle. So, it will not become extinct or even decreased the genetic quality of Indonesian indigenous cattle. Key Words : Bali cattle, SRY gene, artificial insemination, phylogenetic, allele Animal Production 13(3:150-155 (2011
Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning
Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
Rogers, Kai J; Fu, Wenqing; Akey, Joshua M; Monnat, Raymond J
Fanconi anemia (FA) is a human recessive genetic disease resulting from inactivating mutations in any of 16 FANC (Fanconi) genes. Individuals with FA are at high risk of developmental abnormalities, early bone marrow failure and leukemia. These are followed in the second and subsequent decades by a very high risk of carcinomas of the head and neck and anogenital region, and a small continuing risk of leukemia. In order to characterize base pair-level disease-associated (DA) and population genetic variation in FANC genes and the segregation of this variation in the human population, we identified 2948 unique FANC gene variants including 493 FA DA variants across 57,240 potential base pair variation sites in the 16 FANC genes. We then analyzed the segregation of this variation in the 7578 subjects included in the Exome Sequencing Project (ESP) and the 1000 Genomes Project (1KGP). There was a remarkably high frequency of FA DA variants in ESP/1KGP subjects: at least 1 FA DA variant was identified in 78.5% (5950 of 7578) individuals included in these two studies. Six widely used functional prediction algorithms correctly identified only a third of the known, DA FANC missense variants. We also identified FA DA variants that may be good candidates for different types of mutation-specific therapies. Our results demonstrate the power of direct DNA sequencing to detect, estimate the frequency of and follow the segregation of deleterious genetic variation in human populations. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org.
Yi, Li; Cheng, Shi-Peng; Yan, Xi-Jun; Wang, Jian-Ke; Luo, Bin
To recognize the molecular biology character, phylogenetic relationship and the state quo prevalent of Canine parvovirus (CPV), Faecal samnples from pet dogs with acute enteritis in the cities of Beijing, Wuhan, and Nanjing were collected and tested for CPV by PCR and other assay between 2006 and 2008. There was no CPV to FPV (MEV) variation by PCR-RFLP analysis in all samples. The complete ORFs of VP2 genes were obtained by PCR from 15 clinical CPVs and 2 CPV vaccine strains. All amplicons were cloned and sequenced. Analysis of the VP2 sequences showed that clinical CPVs both belong to CPV-2a subtype, and could be classified into a new cluster by amino acids contrasting which contains Tyr-->Ile (324) mutation. Besides the 2 CPV vaccine strains belong to CPV-2 subtype, and both of them have scattered variation in amino acids residues of VP2 protein. Construction of the phylogenetic tree based on CPV VP2 sequence showed these 15 CPV clinical strains were in close relationship with Korea strain K001 than CPV-2a isolates in other countries at early time, It is indicated that the canine parvovirus genetic variation was associated with location and time in some degree. The survey of CPV capsid protein VP2 gene provided the useful information for the identification of CPV types and understanding of their genetic relationship.
Wang, J M; Wang, Y Q; Gao, Z L; Wu, J T; Shi, B K; Yu, C C
Bladder cancer is a common cancer worldwide and its incidence continues to increase. There are approximately 261,000 cases of bladder cancer resulting in 115,000 deaths annually. This study aimed to integrate bladder cancer genome copy number variation information and bladder cancer gene transcription level expression data to construct a causal-target module network of the range of bladder cancer-related genomes. Here, we explored the control mechanism underlying bladder cancer phenotype expression regulation by the major bladder cancer genes. We selected 22 modules as the initial module network to expand the search to screen more networks. After bootstrapping 100 times, we obtained 16 key regulators. These 16 key candidate regulatory genes were further expanded to identify the expression changes of 11,676 genes in 275 modules, which may all have the same regulation. In conclusion, a series of modules associated with the terms 'cancer' or 'bladder' were considered to constitute a potential network.
Campopiano, Rosa; Ryskalin, Larisa; Giardina, Emiliano; Zampatti, Stefania; Busceti, Carla L; Biagioni, Francesca; Ferese, Rosangela; Storto, Marianna; Gambardella, Stefano; Fornai, Francesco
Amyotrophic lateral sclerosis (ALS) is fatal neurodegenerative disease clinically characterized by upper and lower motor neuron dysfunction resulting in rapidly progressive paralysis and death from respiratory failure. Most cases appear to be sporadic, but 5-10 % of cases have a family history of the disease, and over the last decade, identification of mutations in about 20 genes predisposing to these disorders has provided the means to better understand their pathogenesis. Next Generation sequencing (NGS) is an advanced high-throughput DNA sequencing technology which have rapidly contributed to an acceleration in the discovery of genetic risk factors for both familial and sporadic neurological and neurodegenerative diseases. These strategies allowed to rapidly identify disease-associated variants and genetic risk factors for both familial (fALS) and sporadic ALS (sALS), strongly contributing to the knowledge of the genetic architecture of ALS. Moreover, as the number of ALS genes grows, many of the proteins they encode are in intracellular processes shared with other known diseases, suggesting an overlapping of clinical and phatological features between different diseases. To emphasize this concept, the review focuses on genes coding for Valosin-containing protein (VPC) and two Heterogeneous nuclear RNA-binding proteins (HNRNPA1 and hnRNPA2B1), recently idefied through NGS, where different mutations have been associated in both ALS and other neurological and neurodegenerative diseases.
Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn
Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environment...
Francioli, Laurent C.; Menelaou, Andronild; Pulit, Sara L.; Van Dijk, Freerk; Palamara, Pier Francesco; Elbers, Clara C.; Neerincx, Pieter B. T.; Ye, Kai; Guryev, Victor; Kloosterman, Wigard P.; Deelen, Patrick; Abdellaoui, Abdel; Van Leeuwen, Elisabeth M.; Van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F. J.; Karssen, Lennart C.; Kanterakis, Alexandros; Amin, Najaf; Hottenga, Jouke Jan; Lameijer, Eric-Wubbo; Kattenberg, Mathijs; Dijkstra, Martijn; Byelas, Heorhiy; Van Settenl, Jessica; Van Schaik, Barbera D. C.; Bot, Jan; Nijman, Isaac J.; Renkens, Ivo; Marscha, Tobias; Schonhuth, Alexander; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Polak, Paz; Sohail, Mashaal; Vuzman, Dana; Hormozdiari, Fereydoun; Van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Moed, Ma-Tthijs H.; Van der Velde, K. Joeri; Rivadeneira, Fernando; Estrada, Karol; Medina-Gomez, Carolina; Isaacs, Aaron; Platteel, Mathieu; Swertz, Morris A.; Wijmenga, Cisca
Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring
The Genome of the Netherlands Consortium; T. Marschall (Tobias); A. Schönhuth (Alexander)
htmlabstractWhole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch
Aflitos, S.A.; Schijlen, E.G.W.M.; Jong, de J.H.S.G.M.; Ridder, de D.; Smit, S.; Finkers, H.J.; Bakker, F.T.; Geest, van de H.C.; Lintel Hekkert, te B.; Haarst, van J.C.; Smits, L.W.M.; Koops, A.J.; Sanchez-Perez, M.J.; Heusden, van A.W.; Visser, R.G.F.; Schranz, M.E.; Peters, S.A.
We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative for the Lycopersicon, Arcanum, Eriopersicon, and Neolycopersicon groups which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new
Aflitos, S.; Schijlen, E.; de Jong, H.; de Ridder, D.; Smit, S.; Finkers, R.; Wang, J.; Zhang, G.; Li, N.; Mao, L.; Bakker, F.; Dirks, R.; Breit, T.; Gravendeel, B.; Huits, H.; Struss, D.; Swanson-Wagner, R.; van Leeuwen, H.; van Ham, R.C.H.J.; Fito, L.; Guignier, L.; Sevilla, M.; Ellul, P.; Ganko, E.; Kapur, A.; Reclus, E.; de Geus, B.; van de Geest, H.; te Lintel Hekkert, B.; van Haarst, J.; Smits, L.; Koops, A.; Sanchez-Perez, G.; van Heusden, A.W.; Visser, R.; Quan, Z.; Min, J.; Liao, L.; Wang, X.; Wang, G.; Yue, Z.; Yang, X.; Xu, N.; Schranz, E.; Smets, E.; Vos, R.; Rauwerda, J.; Ursem, R.; Schuit, C.; Kerns, M.; van den Berg, J.; Vriezen, W.; Janssen, A.; Datema, E.; Jahrman, T.; Moquet, F.; Bonnet, J.; Peters, S.
We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new
Chiba, A.; Suzutani, T.; Koyano, S.; Azuma, M.; Saijo, M.
To analyze the difference in the degree of divergence between genes from identical herpes virus species, we examined the nucleotide sequence of genes from the herpes simplex virus type 1 (HSV-l ) strains VR-3 and 17 encoding thymidine kinase (TK), deoxyribonuclease (DNase), protein kinase (PK; UL13) and virion-associated host shut off (vhs) protein (UL41). The frequency of nucleotide substitutions per 1 kb in TK gene was 2.5 to 4.3 times higher than those in the other three genes. To prove that the polymorphism of HSV-1 TK gene is common characteristic of herpes virus TK genes, we compared the diversity of TK genes among eight HSV-l , six herpes simplex virus type 2 (HSV-2) and seven varicella-zoster virus (VZV) strains. The average frequency of nucleotide substitutions per 1 kb in the TK gene of HSV-l strains was 4-fold higher than that in the TK gene of HSV-2 strains. The VZV TK gene was highly conserved and only two nucleotide changes were evident in VZV strains. However, the rate of non-synonymous substitutions in total nucleotide substitutions was similar among the TK genes of the three viruses. This result indicated that the mutational rates differed, but there were no significant differences in selective pressure. We conclude that HSV-l TK gene is highly diverged and analysis of variations in the gene is a useful approach for understanding the molecular evolution of HSV-l in a short period. (authors)
Full Text Available There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length.
Full Text Available Abstract Background Myostatin (MSTN is a member of the transforming growth factor-β superfamily that negatively regulates growth of skeletal muscle tissue. The gene encoding for the MSTN peptide is a consolidate candidate for the enhancement of productivity in terrestrial livestock. This gene potentially represents an important target for growth improvement of cultured finfish. Results Here we report molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer MSTN-1 gene. The barramundi MSTN-1 was encoded by three exons 379, 371 and 381 bp in length and translated into a 376-amino acid peptide. Intron 1 and 2 were 412 and 819 bp in length and presented typical GT...AG splicing sites. The upstream region contained cis-regulatory elements such as TATA-box and E-boxes. A first assessment of sequence variability suggested that higher mutation rates are found in the 5' flanking region with several SNP's present in this species. A putative micro RNA target site has also been observed in the 3'UTR (untranslated region and is highly conserved across teleost fish. The deduced amino acid sequence was conserved across vertebrates and exhibited characteristic conserved putative functional residues including a cleavage motif of proteolysis (RXXR, nine cysteines and two glycosilation sites. A qualitative analysis of the barramundi MSTN-1 expression pattern revealed that, in adult fish, transcripts are differentially expressed in various tissues other than skeletal muscles including gill, heart, kidney, intestine, liver, spleen, eye, gonad and brain. Conclusion Our findings provide valuable insights such as sequence variation and genomic information which will aid the further investigation of the barramundi MSTN-1 gene in association with growth. The finding for the first time in finfish MSTN of a miRNA target site in the 3'UTR provides an opportunity for the identification of regulatory mutations on the
Aagesen, Lone; Petersen, Gitte; Seberg, Ole
The behavior of two topological and four character-based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which...... the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously...... preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...
McCoy, Rajiv C; Wakefield, Jon; Akey, Joshua M
Regulatory variation influencing gene expression is a key contributor to phenotypic diversity, both within and between species. Unfortunately, RNA degrades too rapidly to be recovered from fossil remains, limiting functional genomic insights about our extinct hominin relatives. Many Neanderthal sequences survive in modern humans due to ancient hybridization, providing an opportunity to assess their contributions to transcriptional variation and to test hypotheses about regulatory evolution. We developed a flexible Bayesian statistical approach to quantify allele-specific expression (ASE) in complex RNA-seq datasets. We identified widespread expression differences between Neanderthal and modern human alleles, indicating pervasive cis-regulatory impacts of introgression. Brain regions and testes exhibited significant downregulation of Neanderthal alleles relative to other tissues, consistent with natural selection influencing the tissue-specific regulatory landscape. Our study demonstrates that Neanderthal-inherited sequences are not silent remnants of ancient interbreeding but have measurable impacts on gene expression that contribute to variation in modern human phenotypes. Copyright © 2017 Elsevier Inc. All rights reserved.
Daniel P Depledge
Full Text Available A family of hydrophilic acylated surface (HASP proteins, containing extensive and variant amino acid repeats, is expressed at the plasma membrane in infective extracellular (metacyclic and intracellular (amastigote stages of Old World Leishmania species. While HASPs are antigenic in the host and can induce protective immune responses, the biological functions of these Leishmania-specific proteins remain unresolved. Previous genome analysis has suggested that parasites of the sub-genus Leishmania (Viannia have lost HASP genes from their genomes.We have used molecular and cellular methods to analyse HASP expression in New World Leishmania mexicana complex species and show that, unlike in L. major, these proteins are expressed predominantly following differentiation into amastigotes within macrophages. Further genome analysis has revealed that the L. (Viannia species, L. (V. braziliensis, does express HASP-like proteins of low amino acid similarity but with similar biochemical characteristics, from genes present on a region of chromosome 23 that is syntenic with the HASP/SHERP locus in Old World Leishmania species and the L. (L. mexicana complex. A related gene is also present in Leptomonas seymouri and this may represent the ancestral copy of these Leishmania-genus specific sequences. The L. braziliensis HASP-like proteins (named the orthologous (o HASPs are predominantly expressed on the plasma membrane in amastigotes and are recognised by immune sera taken from 4 out of 6 leishmaniasis patients tested in an endemic region of Brazil. Analysis of the repetitive domains of the oHASPs has shown considerable genetic variation in parasite isolates taken from the same patients, suggesting that antigenic change may play a role in immune recognition of this protein family.These findings confirm that antigenic hydrophilic acylated proteins are expressed from genes in the same chromosomal region in species across the genus Leishmania. These proteins are
Bickhart, Derek M.; Xu, Lingyang; Hutchison, Jana L.; Cole, John B.; Null, Daniel J.; Schroeder, Steven G.; Song, Jiuzhou; Garcia, Jose Fernando; Sonstegard, Tad S.; Van Tassell, Curtis P.; Schnabel, Robert D.; Taylor, Jeremy F.; Lewin, Harris A.; Liu, George E.
The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the cattle taurine and indicine breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future. PMID:27085184
M. G. SHAH, A. S. QURESHI1, M. REISSMANN2 AND H. J. SCHWARTZ3
Full Text Available Myostatin, also called growth differentiation factor-8 (GDF-8, is a member of the mammalian growth transforming family (TGF-beta superfamily, which is expressed specifically in developing an adult skeletal muscle. Muscular hypertrophy allele (mh allele in the double muscle breeds involved mutation within the myostatin gene. Genomic DNA was isolated from the camel hair using NucleoSpin Tissue kit. Two animals of each of the six breeds namely, Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homolog regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed separate cluster from the pig in spite of having high homology (98% and showed 94% homology with cattle and sheep as reported in literature. Sequence analysis of the PCR amplified part of exon 1 (256 bp of the camel myostatin was identical among six camel breeds.
Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.
This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
Nathan D. Olson
Full Text Available This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1 identity of biologically conserved position, (2 ratio of 16S rRNA gene copies featuring identified variants, and (3 the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.
Christiansen, Gunna; Andersen, H; Birkelund, Svend
DNAs from 14 strains of Mycoplasma hominis isolated from various habitats, including strain PG21, were analyzed for genomic heterogeneity. DNA-DNA filter hybridization values were from 51 to 91%. Restriction endonuclease digestion patterns, analyzed by agarose gel electrophoresis, revealed...... no identity or cluster formation between strains. Variation within M. hominis rRNA genes was analyzed by Southern hybridization of EcoRI-cleaved DNA hybridized with a cloned fragment of the rRNA gene from the mycoplasma strain PG50. Five of the M. hominis strains showed identical hybridization patterns....... These hybridization patterns were compared with those of 12 other mycoplasma species, which showed a much more complex band pattern. Cloned nonribosomal RNA gene fragments of M. hominis PG21 DNA were analyzed, and the fragments were used to demonstrate heterogeneity among the strains. A monoclonal antibody against...
Sandvej, Kristian; Gratama, J W; Munch, M
Sequence variations in the Epstein-Barr virus (EBV) encoded latent membrane protein-1 (LMP-1) gene have been described in a Chinese nasopharyngeal carcinoma-derived isolate (CAO), and in viral isolates from various EBV-associated tumors. It has been suggested that these genetic changes, which...... include loss of a Xho I restriction site (position 169425) and a C-terminal 30-base pair (bp) deletion (position 168287-168256), define EBV genotypes associated with increased tumorigenicity or with disease among particular geographic populations. To determine the frequency of LMP-1 variations in European...... wild-type virus isolates, we sequenced the LMP-1 promoter and gene in EBV from lymphoblastoid cell lines from healthy carriers and patients without EBV-associated disease. Sequence changes were often present, and defined at least four main groups of viral isolates, which we designate Groups A through D...
Windig, J J; Hoving, R A H; Priem, J; Bossers, A; van Keulen, L J M; Langeveld, J P M
Scrapie is a neurodegenerative disease occurring in goats and sheep. Several haplotypes of the prion protein increase resistance to scrapie infection and may be used in selective breeding to help eradicate scrapie. In this study, frequencies of the allelic variants of the PrP gene are determined for six goat breeds in the Netherlands. Overall frequencies in Dutch goats were determined from 768 brain tissue samples in 2005, 766 in 2008 and 300 in 2012, derived from random sampling for the national scrapie surveillance without knowledge of the breed. Breed specific frequencies were determined in the winter 2013/2014 by sampling 300 breeding animals from the main breeders of the different breeds. Detailed analysis of the scrapie-resistant K222 haplotype was carried out in 2014 for 220 Dutch Toggenburger goats and in 2015 for 942 goats from the Saanen derived White Goat breed. Nine haplotypes were identified in the Dutch breeds. Frequencies for non-wild type haplotypes were generally low. Exception was the K222 haplotype in the Dutch Toggenburger (29%) and the S146 haplotype in the Nubian and Boer breeds (respectively 7 and 31%). The frequency of the K222 haplotype in the Toggenburger was higher than for any other breed reported in literature, while for the White Goat breed it was with 3.1% similar to frequencies of other Saanen or Saanen derived breeds. Further evidence was found for the existence of two M142 haplotypes, M142 /S240 and M142 /P240 . Breeds vary in haplotype frequencies but frequencies of resistant genotypes are generally low and consequently selective breeding for scrapie resistance can only be slow but will benefit from animals identified in this study. The unexpectedly high frequency of the K222 haplotype in the Dutch Toggenburger underlines the need for conservation of rare breeds in order to conserve genetic diversity rare or absent in other breeds. © 2016 Blackwell Verlag GmbH.
Barbara E Stranger
Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
Full Text Available The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis- to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
Full Text Available Intellectual disability (ID is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K-specific methyltransferase 2B (KMT2B, zinc finger protein 589 (ZNF589, as well as hedgehog acyltransferase (HHAT with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID.
Full Text Available The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG and posterior silk gland (PSG. Three sericin genes (sericin 1, sericin 2, and sericin 3 were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25 were expressed specifically in the PSG. In addition, 32,297 Single-nucleotide polymorphisms (SNPs and 361 insertion-deletions (INDELs were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or to be experiencing positive selection by Ka/Ks analysis. These data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.
Full Text Available Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L. and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs, 1.9 million InDels, and 182,398 putative structural variations (SVs. Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.
Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup
Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.
Full Text Available Abstract Background The poly Q polymorphism in AIB1 (amplified in breast cancer gene is usually assessed by fragment length analysis which does not reveal the actual sequence variation. The purpose of this study is to investigate the sequence variation of poly Q encoding region in breast cancer cell lines at single molecule level, and to determine if the sequence variation is related to AIB1 gene amplification. Methods The polymorphic poly Q encoding region of AIB1 gene was investigated at the single molecule level by PCR cloning/sequencing. The amplification of AIB1 gene in various breast cancer cell lines were studied by real-time quantitative PCR. Results Significant amplifications (5–23 folds of AIB1 gene were found in 2 out of 9 (22% ER positive cell lines (in BT-474 and MCF-7 but not in BT-20, ZR-75-1, T47D, BT483, MDA-MB-361, MDA-MB-468 and MDA-MB-330. The AIB1 gene was not amplified in any of the ER negative cell lines. Different passages of MCF-7 cell lines and their derivatives maintained the feature of AIB1 amplification. When the cells were selected for hormone independence (LCC1 and resistance to 4-hydroxy tamoxifen (4-OH TAM (LCC2 and R27, ICI 182,780 (LCC9 or 4-OH TAM, KEO and LY 117018 (LY-2, AIB1 copy number decreased but still remained highly amplified. Sequencing analysis of poly Q encoding region of AIB1 gene did not reveal specific patterns that could be correlated with AIB1 gene amplification. However, about 72% of the breast cancer cell lines had at least one under represented (3CAA(CAG9(CAACAG3(CAACAGCAG2CAA of the original cell line, a number of altered poly Q encoding sequences were found in the derivatives of MCF-7 cell lines. Conclusion These data suggest that poly Q encoding region of AIB1 gene is somatic unstable in breast cancer cell lines. The instability and the sequence characteristics, however, do not appear to be associated with the level of the gene amplification.
Qurainy, F.A.; Gaafar, A.R.Z.
Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants is very important for effective conservation. Intergenic spacer sequences variation of psbA-trnH locus of chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered and endemic plant species to South western part of Kingdom of Saudi Arabia. The obtained sequence data from 19 individuals in three populations revealed nine haplotypes. The aligned sequences obtained from the overall Saudi accessions extended to 355 bp, revealing nine haplotypes. A high level of haplotype diversity (Hd = 0.842) and low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and constructed neighbor-joining tree indicated null genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to effects of the abundance of ancestral haplotype sharing and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity in comparison with Yemeni accessions, in which Saudi accessions were sharing three haplotypes from the four haplotypes found in Yemeni accessions. (author)
Yang, Jian-Rong; Maclean, Calum J; Park, Chungoo; Zhao, Huabin; Zhang, Jianzhi
It is commonly, although not universally, accepted that most intra and interspecific genome sequence variations are more or less neutral, whereas a large fraction of organism-level phenotypic variations are adaptive. Gene expression levels are molecular phenotypes that bridge the gap between genotypes and corresponding organism-level phenotypes. Yet, it is unknown whether natural variations in gene expression levels are mostly neutral or adaptive. Here we address this fundamental question by genome-wide profiling and comparison of gene expression levels in nine yeast strains belonging to three closely related Saccharomyces species and originating from five different ecological environments. We find that the transcriptome-based clustering of the nine strains approximates the genome sequence-based phylogeny irrespective of their ecological environments. Remarkably, only ∼0.5% of genes exhibit similar expression levels among strains from a common ecological environment, no greater than that among strains with comparable phylogenetic relationships but different environments. These and other observations strongly suggest that most intra and interspecific variations in yeast gene expression levels result from the accumulation of random mutations rather than environmental adaptations. This finding has profound implications for understanding the driving force of gene expression evolution, genetic basis of phenotypic adaptation, and general role of stochasticity in evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Full Text Available Abstract Background Sequencing-by-ligation (SBL is one of several next-generation sequencing methods that has been developed for massive sequencing of DNA immobilized on arrayed beads (or other clonal amplicons. SBL has the advantage of being easy to implement and accessible to all because it can be performed with off-the-shelf reagents. However, SBL has the limitation of very short read lengths. Results To overcome the read length limitation, research groups have developed complex library preparation processes, which can be time-consuming, difficult, and result in low complexity libraries. Herein we describe a variation on traditional SBL protocols that extends the number of sequential bases that can be sequenced by using Endonuclease V to nick a query primer, thus leaving a ligatable end extended into the unknown sequence for further SBL cycles. To demonstrate the protocol, we constructed a known DNA sequence and utilized our SBL variation, cyclic SBL (cSBL, to resequence this region. Using our method, we were able to read thirteen contiguous bases in the 3' - 5' direction. Conclusions Combining this read length with sequencing in the 5' - 3' direction would allow a read length of over twenty bases on a single tage. Implementing mate-paired tags and this SBL variation could enable > 95% coverage of the genome.
Xiao, Yuan; Yuan, Wentao; Yu, Bo; Guo, Yan; Xu, Xu; Wang, Xinqiong; Yu, Yi; Yu, Yi; Gong, Biao; Xu, Chundi
To identify causal mutations in certain genes in children with acute recurrent pancreatitis (ARP) or chronic pancreatitis (CP). After patients were enrolled (CP, 55; ARP, 14) and their clinical characteristics were investigated, we performed next-generation sequencing to detect nucleotide variations among the following 10 genes: cationic trypsinogen protease serine 1 (PRSS1), serine protease inhibitor, Kazal type 1 (SPINK1), cystic fibrosis transmembrane conductance regulator gene (CFTR), chymotrypsin C (CTRC), calcium-sensing receptor (CASR), cathepsin B (CTSB), keratin 8 (KRT8), CLAUDIN 2 (CLDN2), carboxypeptidase A1 (CPA1), and ATPase type 8B member 1 (ATP8B1). Mutations were searched against online databases to obtain information on the cause of the diseases. Certain novel mutations were analyzed using the SIFT2 and Polyphen-2 to predict the effect on protein function. There were 45 patients with CP and 10 patients with ARP who harbored 1 or more mutations in these genes; 45 patients had at least 1 mutation related to pancreatitis. Mutations were observed in the PRSS1, SPINK1, and CFTR genes in 17 patients, the CASR gene in 5 patients, and the CTSB, CTRC, and KRT8 genes in 1 patient. Mutations were not found in the CLDN, CPA1, or ATP8B1 genes. We found that mutations in SPINK1 may increase the risk of pancreatic duct stones (OR, 11.07; P = .003). The patients with CFTR mutations had a higher level of serum amylase (316.0 U/L vs 92.5 U/L; P = .026). Mutations, especially those in PRSS1, SPINK1, and CFTR, accounted for the major etiologies in Chinese children with CP or ARP. Children presenting mutations in the SPINK1 gene may have a higher risk of developing pancreatic duct stones. Copyright © 2017 Elsevier Inc. All rights reserved.
Correia, Samantha; Palser, Anne; Elgueta Karstegl, Claudio; Middeldorp, Jaap M; Ramayanti, Octavia; Cohen, Jeffrey I; Hildesheim, Allan; Fellner, Maria Dolores; Wiels, Joelle; White, Robert E; Kellam, Paul; Farrell, Paul J
Viral gene sequences from an enlarged set of about 200 Epstein-Barr virus (EBV) strains, including many primary isolates, have been used to investigate variation in key viral genetic regions, particularly LMP1, Zp, gp350, EBNA1, and the BART microRNA (miRNA) cluster 2. Determination of type 1 and type 2 EBV in saliva samples from people from a wide range of geographic and ethnic backgrounds demonstrates a small percentage of healthy white Caucasian British people carrying predominantly type 2 EBV. Linkage of Zp and gp350 variants to type 2 EBV is likely to be due to their genes being adjacent to the EBNA3 locus, which is one of the major determinants of the type 1/type 2 distinction. A novel classification of EBNA1 DNA binding domains, named QCIGP, results from phylogeny analysis of their protein sequences but is not linked to the type 1/type 2 classification. The BART cluster 2 miRNA region is classified into three major variants through single-nucleotide polymorphisms (SNPs) in the primary miRNA outside the mature miRNA sequences. These SNPs can result in altered levels of expression of some miRNAs from the BART variant frequently present in Chinese and Indonesian nasopharyngeal carcinoma (NPC) samples. The EBV genetic variants identified here provide a basis for future, more directed analysis of association of specific EBV variations with EBV biology and EBV-associated diseases. IMPORTANCE Incidence of diseases associated with EBV varies greatly in different parts of the world. Thus, relationships between EBV genome sequence variation and health, disease, geography, and ethnicity of the host may be important for understanding the role of EBV in diseases and for development of an effective EBV vaccine. This paper provides the most comprehensive analysis so far of variation in specific EBV genes relevant to these diseases and proposed EBV vaccines. By focusing on variation in LMP1, Zp, gp350, EBNA1, and the BART miRNA cluster 2, new relationships with the known
Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N
Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
Background Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509
: Hemoglobin-y gene of channel catfish , lctalurus punctatus, was cloned and sequenced . Total RNA from head kidneys was isolated, reverse transcribed and amplified . The sequence of the channel catfish hemoglobin-y gene consists of 600 nucleotides . Analysis of the nucleotide sequence reveals one o...
This textbook provides an authoritative introduction to both classical and coalescent approaches to population genetics. Written for graduate students and advanced undergraduates by one of the world’s leading authorities in the field, the book focuses on the theoretical background of population...... genetics, while emphasizing the close interplay between theory and empiricism. Traditional topics such as genetic and phenotypic variation, mutation, migration, and linkage are covered and advanced by contemporary coalescent theory, which describes the genealogy of genes in a population, ultimately...... connecting them to a single common ancestor. Effects of selection, particularly genomic effects, are discussed with reference to molecular genetic variation. The book is designed for students of population genetics, bioinformatics, evolutionary biology, molecular evolution, and theoretical biology—as well...
Wang, Yan; Liu, Guo-Hua; Li, Jia-Yuan; Xu, Min-Jun; Ye, Yong-Gang; Zhou, Dong-Hui; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
This study examined sequence variation in three mitochondrial DNA (mtDNA) regions, namely cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 5 (nad5) and cytochrome b (cytb), among Trichuris ovis isolates from different hosts in Guangdong Province, China. A portion of the cox1 (pcox1), nad5 (pnad5) and cytb (pcytb) genes was amplified separately from individual whipworms by PCR, and was subjected to sequencing from both directions. The size of the sequences of pcox1, pnad5 and pcytb was 618, 240 and 464 bp, respectively. Although the intra-specific sequence variations within T. ovis were 0-0.8% for pcox1, 0-0.8% for pnad5 and 0-1.9% for pcytb, the inter-specific sequence differences among members of the genus Trichuris were significantly higher, being 24.3-26.5% for pcox1, 33.7-56.4% for pnad5 and 24.8-26.1% for pcytb, respectively. Phylogenetic analyses using combined sequences of pcox1, pnad5 and pcytb, with three different computational algorithms (maximum likelihood, maximum parsimony and Bayesian inference), indicated that all of the T. ovis isolates grouped together with high statistical support. These findings demonstrated the existence of intra-specific variation in mtDNA sequences among T. ovis isolates from different hosts, and have implications for studying molecular epidemiology and population genetics of T. ovis.
Yang, Deying; Ren, Yongjun; Fu, Yan; Xie, Yue; Nie, Huaming; Nong, Xiang; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou
Taenia pisiformis is one of the most important parasites of canines and rabbits. T. pisiformis cysticercus (the larval stage) causes severe damage to rabbit breeding, which results in huge economic losses. In this study, the genetic variation of T. pisiformis was determined in Sichuan Province, China. Fragments of the mitochondrial cytochrome b (cytb) (922 bp) gene were amplified in 53 isolates from 8 regions of T. pisiformis. Overall, 12 haplotypes were found in these 53 cytb sequences. Molecular genetic variations showed 98.4% genetic variation derived from intra-region. FST and Nm values suggested that 53 isolates were not genetically differentiated and had low levels of genetic diversity. Neutrality indices of the cytb sequences showed the evolution of T. pisiformis followed a neutral mode. Phylogenetic analysis revealed no correlation between phylogeny and geographic distribution. These findings indicate that 53 isolates of T. pisiformis keep a low genetic variation, which provide useful knowledge for monitoring changes in parasite populations for future control strategies.
Sep?lveda, Nuno; Campino, Susana G; Assefa, Samuel A; Sutherland, Colin J; Pain5, Arnab; Clark, Taane G
BACKGROUND: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poi...
Dubnov , Shlomo; Assayag , Gérard
cote interne IRCAM: Assayag05a; National audience; We describe a model for improvisation design based on Factor Oracle automation, which is extended to perform learning and analysis of incoming sequences in terms of sequence variation parameters, namely replication, recombination and innovation. These parameters describe the improvisation plan and allow the design of new improvisations or analysis and modification of plans of existing improvisations. We further introduce an idea of flow exper...
Full Text Available Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA in collaboration with the Human Genetics Society of Australasia (HGSA, and the Human Variome Project (HVP is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.
Patrick Guilfoile; Ron Burns; Zu-Yi Gu; Matt Amundson; Fu-Hsian Chang
A cbbl cellobiohydrolase gene was cloned and sequenced from the fungus Trichoderrna harzianum FP108. The cloning was performed by PCR amplification of T. harzianum genomic DNA, using PCR primers whose sequence was based on the cbbl gene from Tricboderma reesei. The 3' end of the gene was isolated by inverse...
Degefu, Y.; Paulin, L.; Lübeck, Peter Stephensen
A gene encoding an endoxylanase from the phytopathogenic fungus Helminthosporium turcicum Pass. was cloned and sequenced. The entire nucleotide sequence of a 1991 bp genomic fragment containing an endoxylanase gene was determined. The xylanase gene of 795 bp, interrupted by two introns of 52 and ...
Grauers, Anna; Wang, Jingwen; Einarsdottir, Elisabet
samples from 100 surgically treated idiopathic scoliosis patients. Novel or rare missense, nonsense, or splice site variants were selected for individual genotyping in the 1,739 cases and 1,812 controls. In addition, the 5'UTR, noncoding exon and promoter regions of LBX1, not covered by exome sequencing...... by exome sequencing after filtration and an initial genotyping validation. However, we could not verify any association to idiopathic scoliosis in the large cohort of 1,739 cases and 1,812 controls. We did not find any variants in the 5'UTR, noncoding exon and promoter regions of LBX1. CONCLUSIONS: Here...... that are significantly associated with idiopathic scoliosis in Asian and Caucasian populations, rs11190870 close to the LBX1 gene being the most replicated finding. PURPOSE: The aim of the present study was to investigate the genetics of idiopathic scoliosis in a Scandinavian cohort by performing a candidate gene study...
Full Text Available Next-generation sequencing now for the first time allows researchers to gauge the depth and variation of entire transcriptomes. However, now as rare transcripts can be detected that are present in cells at single copies, more advanced computational tools are needed to accurately annotate and profile them. miRNAs are 22 nucleotide small RNAs (sRNAs that post-transcriptionally reduce the output of protein coding genes. They have established roles in numerous biological processes, including cancers and other diseases. During miRNA biogenesis, the sRNAs are sequentially cleaved from precursor molecules that have a characteristic hairpin RNA structure. The vast majority of new miRNA genes that are discovered are mined from small RNA sequencing (sRNA-seq, which can detect more than a billion RNAs in a single run. However, given that many of the detected RNAs are degradation products from all types of transcripts, the accurate identification of miRNAs remain a non-trivial computational problem. Here we review the tools available to predict animal miRNAs from sRNA sequencing data. We present tools for generalist and specialist use cases, including prediction from massively pooled data or in species without reference genome. We also present wet-lab methods used to validate predicted miRNAs, and approaches to computationally benchmark prediction accuracy. For each tool, we reference validation experiments and benchmarking efforts. Last, we discuss the future of the field.
Bauer, Bianca S; Forsyth, George W; Sandmeyer, Lynne S; Grahn, Bruce H
Mitochondrial transcription factor A (Tfam) has been implicated in the pathogenesis of retinal dysplasia in miniature schnauzer dogs and it has been proposed that affected dogs have altered mitochondrial numbers, size, and morphology. To test these hypotheses the Tfam gene of affected and normal miniature schnauzer dogs with retinal dysplasia was sequenced and lymphocyte mitochondria were quantified, measured, and the morphology was compared in normal and affected dogs using transmission electron microscopy. For Tfam sequencing, retina, retinal pigment epithelium (RPE), and whole blood samples were collected. Total RNA was isolated from the retina and RPE and reverse transcribed to make cDNA. Genomic DNA was extracted from white blood cell pellets obtained from the whole blood samples. The Tfam coding sequence, 5' promoter region, intron1 and the 3' non-coding sequence of normal and affected dogs were amplified using polymerase chain reaction (PCR), cloned and sequenced. For electron microscopy, lymphocytes from affected and normal dogs were photographed and the mitochondria within each cross-section were identified, quantified, and the mitochondrial area (μm²) per lymphocyte cross-section was calculated. Lastly, using a masked technique, mitochondrial morphology was compared between the 2 groups. Sequencing of the miniature schnauzer Tfam gene revealed no functional sequence variation between affected and normal dogs. Lymphocyte and mitochondrial area, mitochondrial quantification, and morphology assessment also revealed no significant difference between the 2 groups. Further investigation into other candidate genes or factors causing retinal dysplasia in the miniature schnauzer is warranted.
Le Quesne Stabej, Polona; Saihan, Zubin; Rangesh, Nell; Steele-Stallard, Heather B; Ambrose, John; Coffey, Alison; Emmerson, Jenny; Haralambous, Elene; Hughes, Yasmin; Steel, Karen P; Luxon, Linda M; Webster, Andrew R; Bitner-Glindzicz, Maria
Usher syndrome (USH) is an autosomal recessive disorder comprising retinitis pigmentosa, hearing loss and, in some cases, vestibular dysfunction. It is clinically and genetically heterogeneous with three distinctive clinical types (I-III) and nine Usher genes identified. This study is a comprehensive clinical and genetic analysis of 172 Usher patients and evaluates the contribution of digenic inheritance. The genes MYO7A, USH1C, CDH23, PCDH15, USH1G, USH2A, GPR98, WHRN, CLRN1 and the candidate gene SLC4A7 were sequenced in 172 UK Usher patients, regardless of clinical type. No subject had definite mutations (nonsense, frameshift or consensus splice site mutations) in two different USH genes. Novel missense variants were classified UV1-4 (unclassified variant): UV4 is 'probably pathogenic', based on control frequency A being the most common USH1 mutation in the cohort). USH2A was responsible for 79.3% of USH2 families and GPR98 for only 6.6%. No mutations were found in USH1G, WHRN or SLC4A7. One or two pathogenic/likely pathogenic variants were identified in 86% of cases. No convincing cases of digenic inheritance were found. It is concluded that digenic inheritance does not make a significant contribution to Usher syndrome; the observation of multiple variants in different genes is likely to reflect polymorphic variation, rather than digenic effects.
Full Text Available Abstract Background Like humans, the living elephants are unusual among mammals in being sparsely covered with hair. Relative to extant elephants, the extinct woolly mammoth, Mammuthus primigenius, had a dense hair cover and extremely long hair, which likely were adaptations to its subarctic habitat. The fibroblast growth factor 5 (FGF5 gene affects hair length in a diverse set of mammalian species. Mutations in FGF5 lead to recessive long hair phenotypes in mice, dogs, and cats; and the gene has been implicated in hair length variation in rabbits. Thus, FGF5 represents a leading candidate gene for the phenotypic differences in hair length notable between extant elephants and the woolly mammoth. We therefore sequenced the three exons (except for the 3' UTR and a portion of the promoter of FGF5 from the living elephantid species (Asian, African savanna and African forest elephants and, using protocols for ancient DNA, from a woolly mammoth. Results Between the extant elephants and the mammoth, two single base substitutions were observed in FGF5, neither of which alters the amino acid sequence. Modeling of the protein structure suggests that the elephantid proteins fold similarly to the human FGF5 protein. Bioinformatics analyses and DNA sequencing of another locus that has been implicated in hair cover in humans, type I hair keratin pseudogene (KRTHAP1, also yielded negative results. Interestingly, KRTHAP1 is a pseudogene in elephantids as in humans (although fully functional in non-human primates. Conclusion The data suggest that the coding sequence of the FGF5 gene is not the critical determinant of hair length differences among elephantids. The results are discussed in the context of hairlessness among mammals and in terms of the potential impact of large body size, subarctic conditions, and an aquatic ancestor on hair cover in the Proboscidea.
Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I
Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.
Khera, Amit V.; Won, Hong-Hee; Peloso, Gina M.; Lawson, Kim S.; Bartz, Traci M.; Deng, Xuan; van Leeuwen, Elisabeth M.; Natarajan, Pradeep; Emdin, Connor A.; Bick, Alexander G.; Morrison, Alanna C.; Brody, Jennifer A.; Gupta, Namrata; Nomura, Akihiro; Kessler, Thorsten; Duga, Stefano; Bis, Joshua C.; van Duijn, Cornelia M.; Cupples, L. Adrienne; Psaty, Bruce; Rader, Daniel J.; Danesh, John; Schunkert, Heribert; McPherson, Ruth; Farrall, Martin; Watkins, Hugh; Lander, Eric; Wilson, James G.; Correa, Adolfo; Boerwinkle, Eric; Merlini, Piera Angelica; Ardissino, Diego; Saleheen, Danish; Gabriel, Stacey; Kathiresan, Sekar
Background About 7% of US adults have severe hypercholesterolemia (untreated LDL cholesterol ≥190 mg/dl). Such high LDL levels may be due to familial hypercholesterolemia (FH), a condition caused by a single mutation in any of three genes. Lifelong elevations in LDL cholesterol in FH mutation carriers may confer CAD risk beyond that captured by a single LDL cholesterol measurement. Objectives Assess the prevalence of a FH mutation among those with severe hypercholesterolemia and determine whether CAD risk varies according to mutation status beyond the observed LDL cholesterol. Methods Three genes causative for FH (LDLR, APOB, PCSK9) were sequenced in 26,025 participants from 7 case-control studies (5,540 CAD cases, 8,577 CAD-free controls) and 5 prospective cohort studies (11,908 participants). FH mutations included loss-of-function variants in LDLR, missense mutations in LDLR predicted to be damaging, and variants linked to FH in ClinVar, a clinical genetics database. Results Among 8,577 CAD-free control participants, 430 had LDL cholesterol ≥190 mg/dl; of these, only eight (1.9%) carried a FH mutation. Similarly, among 11,908 participants from 5 prospective cohorts, 956 had LDL cholesterol ≥190 mg/dl and of these, only 16 (1.7%) carried a FH mutation. Within any stratum of observed LDL cholesterol, risk of CAD was higher among FH mutation carriers when compared with non-carriers. When compared to a reference group with LDL cholesterol <130 mg/dl and no mutation, participants with LDL cholesterol ≥190 mg/dl and no FH mutation had six-fold higher risk for CAD (OR 6.0; 95%CI 5.2–6.9) whereas those with LDL cholesterol ≥190 mg/dl as well as a FH mutation demonstrated twenty-two fold increased risk (OR 22.3; 95%CI 10.7–53.2). Conclusions Among individuals with LDL cholesterol ≥190 mg/dl, gene sequencing identified a FH mutation in <2%. However, for any given observed LDL cholesterol, FH mutation carriers are at substantially increased risk for CAD
Jul 19, 2010 ... and antisense primers, a single band of 573 base pairs .... Amino acid sequence alignment of Cluster I and Cluster II of phylogenetic tree. First ten sequences ... sequence weighting, postion-spiecific gap penalties and weight.
Full Text Available Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.
Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza, [No Value; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F
Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences
Bershad, Anya K; Weafer, Jessica J; Kirkpatrick, Matthew G; Wardle, Margaret C; Miller, Melissa A; de Wit, Harriet
3,4-Methylenedioxymethamphetamine (MDMA, "ecstasy") enhances desire to socialize and feelings of empathy, which are thought to be related to increased oxytocin levels. Thus, variation in the oxytocin receptor gene (OXTR) may influence responses to the drug. Here, we examined the influence of a single OXTR nucleotide polymorphism (SNP) on responses to MDMA in humans. Based on findings that carriers of the A allele at rs53576 exhibit reduced sensitivity to oxytocin-induced social behavior, we hypothesized that these individuals would show reduced subjective responses to MDMA, including sociability. In this three-session, double blind, within-subjects study, healthy volunteers with past MDMA experience (N = 68) received a MDMA (0, 0.75 mg/kg, and 1.5 mg/kg) and provided self-report ratings of sociability, anxiety, and drug effects. These responses were examined in relation to rs53576. MDMA (1.5 mg/kg) did not increase sociability in individuals with the A/A genotype as it did in G allele carriers. The genotypic groups did not differ in responses at the lower MDMA dose, or in cardiovascular or other subjective responses. These findings are consistent with the idea that MDMA-induced sociability is mediated by oxytocin, and that variation in the oxytocin receptor gene may influence responses to the drug.
Chen, Ye; Lv, Cheng; Li, Fangting; Li, Tiejun
Stochastic genetic switching driven by intrinsic noise is an important process in gene expression. When the rates of gene activation/inactivation are relatively slow, fast, or medium compared with the synthesis/degradation rates of mRNAs and proteins, the variability of protein and mRNA levels may exhibit very different dynamical patterns. It is desirable to provide a systematic approach to identify their key dynamical features in different regimes, aiming at distinguishing which regime a considered gene regulatory network is in from their phenotypic variations. We studied a gene expression model with positive feedbacks when genetic switching rates vary over a wide range. With the goal of providing a method to distinguish the regime of the switching rates, we first focus on understanding the essential dynamics of gene expression system in different cases. In the regime of slow switching rates, we found that the effective dynamics can be reduced to independent evolutions on two separate layers corresponding to gene activation and inactivation states, and the transitions between two layers are rare events, after which the system goes mainly along deterministic ODE trajectories on a particular layer to reach new steady states. The energy landscape in this regime can be well approximated by using Gaussian mixture model. In the regime of intermediate switching rates, we analyzed the mean switching time to investigate the stability of the system in different parameter ranges. We also discussed the case of fast switching rates from the viewpoint of transition state theory. Based on the obtained results, we made a proposal to distinguish these three regimes in a simulation experiment. We identified the intermediate regime from the fact that the strength of cellular memory is lower than the other two cases, and the fast and slow regimes can be distinguished by their different perturbation-response behavior with respect to the switching rates perturbations. We proposed a
Kudo, Shinichi; Fukuda, Minoru
Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region ∼ 1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggest that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication
Donaldson, Michael E; Rico, Yessica; Hueffer, Karsten; Rando, Halie M; Kukekova, Anna V; Kyle, Christopher J
Pathogens are recognized as major drivers of local adaptation in wildlife systems. By determining which gene variants are favored in local interactions among populations with and without disease, spatially explicit adaptive responses to pathogens can be elucidated. Much of our current understanding of host responses to disease comes from a small number of genes associated with an immune response. High-throughput sequencing (HTS) technologies, such as genotype-by-sequencing (GBS), facilitate expanded explorations of genomic variation among populations. Hybridization-based GBS techniques can be leveraged in systems not well characterized for specific variants associated with disease outcome to "capture" specific genes and regulatory regions known to influence expression and disease outcome. We developed a multiplexed, sequence capture assay for red foxes to simultaneously assess ~300-kbp of genomic sequence from 116 adaptive, intrinsic, and innate immunity genes of predicted adaptive significance and their putative upstream regulatory regions along with 23 neutral microsatellite regions to control for demographic effects. The assay was applied to 45 fox DNA samples from Alaska, where three arctic rabies strains are geographically restricted and endemic to coastal tundra regions, yet absent from the boreal interior. The assay provided 61.5% on-target enrichment with relatively even sequence coverage across all targeted loci and samples (mean = 50×), which allowed us to elucidate genetic variation across introns, exons, and potential regulatory regions (4,819 SNPs). Challenges remained in accurately describing microsatellite variation using this technique; however, longer-read HTS technologies should overcome these issues. We used these data to conduct preliminary analyses and detected genetic structure in a subset of red fox immune-related genes between regions with and without endemic arctic rabies. This assay provides a template to assess immunogenetic variation
Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera. We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.
Mierau, Igor; Tan, Paris S.T.; Haandrikman, Alfred J.; Kok, Jan; Leenhouts, Kees J.; Konings, Wil N.; Venema, Gerard
The gene specifying an endopeptidase of Lactococcus lactis, named pepO, was cloned from a genomic library of L. lactis subsp. cremoris P8-247 in lambdaEMBL3 and was subsequently sequenced. pepO is probably the last gene of an operon encoding the binding-protein-dependent oligopeptide transport
Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D.
Background Whole genome sequencing enables a high resolution view ofthe human genome and provides unique insights into genome structureat an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools while validatedalso include a number of
Self, James E; Shawkat, Fatima; Malpas, Crispin T; Thomas, N Simon; Harris, Christopher M; Hodgkins, Peter R; Chen, Xiaoli; Trump, Dorothy; Lotery, Andrew J
To perform a genotype-phenotype correlation study in an X-linked congenital idiopathic nystagmus pedigree (pedigree 1) and to assess the allelic variance of the FRMD7 gene in congenital idiopathic nystagmus. Subjects from pedigree 1 underwent detailed clinical examination including nystagmology. Screening of FRMD7 was undertaken in pedigree 1 and in 37 other congenital idiopathic nystagmus probands and controls. Direct sequencing confirmed sequence changes. X-inactivation studies were performed in pedigree 1. The nystagmus phenotype was extremely variable in pedigree 1. We identified 2 FRMD7 mutations. However, 80% of X-linked families and 96% of simplex cases showed no mutations. X-inactivation studies demonstrated no clear causal link between skewing and variable penetrance. We confirm profound phenotypic variation in X-linked congenital idiopathic nystagmus pedigrees. We demonstrate that other congenital nystagmus genes exist besides FRMD7. We show that the role of X inactivation in variable penetrance is unclear in congenital idiopathic nystagmus. Clinical Relevance We demonstrate that phenotypic variation of nystagmus occurs in families with FRMD7 mutations. While FRMD7 mutations may be found in some cases of X-linked congenital idiopathic nystagmus, the diagnostic yield is low. X-inactivation assays are unhelpful as a test for carrier status for this disease.
Hoffmann, Robert D; Palmgren, Michael
Whole-genome duplications in the ancestors of many diverse species provided the genetic material for evolutionary novelty. Several models explain the retention of paralogous genes. However, how these models are reflected in the evolution of coding and non-coding sequences of paralogous genes is unknown. Here, we analyzed the coding and non-coding sequences of paralogous genes in Arabidopsis thaliana and compared these sequences with those of orthologous genes in Arabidopsis lyrata. Paralogs with lower expression than their duplicate had more nonsynonymous substitutions, were more likely to fractionate, and exhibited less similar expression patterns with their orthologs in the other species. Also, lower-expressed genes had greater tissue specificity. Orthologous conserved non-coding sequences in the promoters, introns, and 3' untranslated regions were less abundant at lower-expressed genes compared to their higher-expressed paralogs. A gene ontology (GO) term enrichment analysis showed that paralogs with similar expression levels were enriched in GO terms related to ribosomes, whereas paralogs with different expression levels were enriched in terms associated with stress responses. Loss of conserved non-coding sequences in one gene of a paralogous gene pair correlates with reduced expression levels that are more tissue specific. Together with increased mutation rates in the coding sequences, this suggests that similar forces of purifying selection act on coding and non-coding sequences. We propose that coding and non-coding sequences evolve concurrently following gene duplication.
Jul 26, 2017 ... Clinical utility of a 377 gene custom next-generation sequencing epilepsy panel ... number of genes, making it a very attractive option for a condition as .... clinical value of various test offerings to guide decision making.
Full Text Available The molecular mechanisms that drive the development of the endangered fossil fish species Acipenser baeri are difficult to study due to the lack of genomic data. Recent advances in sequencing technologies and the reducing cost of sequencing offer exclusive opportunities for exploring important molecular mechanisms underlying specific biological processes. This manuscript describes the large scale sequencing and analyses of mRNA from Acipenser baeri collected at five development time points using the Illumina Hiseq2000 platform. The sequencing reads were de novo assembled and clustered into 278167 unigenes, of which 57346 (20.62% had 45837 known homologues proteins in Uniprot protein databases while 11509 proteins matched with at least one sequence of assembled unigenes. The remaining 79.38% of unigenes could stand for non-coding unigenes or unigenes specific to A. baeri. A number of 43062 unigenes were annotated into functional categories via Gene Ontology (GO annotation whereas 29526 unigenes were associated with 329 pathways by mapping to KEGG database. Subsequently, 3479 differentially expressed genes were scanned within developmental stages and clustered into 50 gene expression profiles. Genes preferentially expressed at each stage were also identified. Through GO and KEGG pathway enrichment analysis, relevant physiological variations during the early development of A. baeri could be better cognized. Accordingly, the present study gives insights into the transcriptome profile of the early development of A. baeri, and the information contained in this large scale transcriptome will provide substantial references for A. baeri developmental biology and promote its aquaculture research.
Rout, P. K.; Thangraj, K.; Mandal, A.; Roy, R.
Jamunapari, a dairy goat breed of India, has been gradually declining in numbers in its home tract over the years. We have analysed genetic variation and population history in Jamunapari goats based on 17 microsatellite loci, 2 milk protein loci, mitochondrial hypervariable region I (HVRI) sequencing, and three Y-chromosomal gene sequencing. We used the mitochondrial DNA (mtDNA) mismatch distribution, microsatellite data, and bottleneck tests to infer the population history and demography. The mean number of alleles per locus was 9.0 indicating that the allelic variation was high in all the loci and the mean heterozygosity was 0.769 at nuclear loci. Although the population size is smaller than 8,000 individuals, the amount of variability both in terms of allelic richness and gene diversity was high in all the microsatellite loci except ILST 005. The gene diversity and effective number of alleles at milk protein loci were higher than the 10 other Indian goat breeds that they were compared to. Mismatch analysis was carried out and the analysis revealed that the population curve was unimodal indicating the expansion of population. The genetic diversity of Y-chromosome genes was low in the present study. The observed mean M ratio in the population was above the critical significance value (Mc) and close to one indicating that it has maintained a slowly changing population size. The mode-shift test did not detect any distortion of allele frequency and the heterozygosity excess method showed that there was no significant departure from mutation-drift equilibrium detected in the population. However, the effects of genetic bottlenecks were observed in some loci due to decreased heterozygosity and lower level of M ratio. There were two observed genetic subdivisions in the population supporting the observations of farmers in different areas. This base line information on genetic diversity, bottleneck analysis, and mismatch analysis was obtained to assist the conservation
Onsongo, Getiria; Baughn, Linda B; Bower, Matthew; Henzler, Christine; Schomaker, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat
Simultaneous detection of small copy number variations (CNVs) (<0.5 kb) and single-nucleotide variants in clinically significant genes is of great interest for clinical laboratories. The analytical variability in next-generation sequencing (NGS) and artifacts in coverage data because of issues with mappability along with lack of robust bioinformatics tools for CNV detection have limited the utility of targeted NGS data to identify CNVs. We describe the development and implementation of a bioinformatics algorithm, copy number variation-random forest (CNV-RF), that incorporates a machine learning component to identify CNVs from targeted NGS data. Using CNV-RF, we identified 12 of 13 deletions in samples with known CNVs, two cases with duplications, and identified novel deletions in 22 additional cases. Furthermore, no CNVs were identified among 60 genes in 14 cases with normal copy number and no CNVs were identified in another 104 patients with clinical suspicion of CNVs. All positive deletions and duplications were confirmed using a quantitative PCR method. CNV-RF also detected heterozygous deletions and duplications with a specificity of 50% across 4813 genes. The ability of CNV-RF to detect clinically relevant CNVs with a high degree of sensitivity along with confirmation using a low-cost quantitative PCR method provides a framework for providing comprehensive NGS-based CNV/single-nucleotide variant detection in a clinical molecular diagnostics laboratory. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Patil, Gunvant; Valliyodan, Babu; Deshmukh, Rupesh; Prince, Silvas; Nicander, Bjorn; Zhao, Mingzhe; Sonah, Humira; Song, Li; Lin, Li; Chaudhary, Juhi; Liu, Yang; Joshi, Trupti; Xu, Dong; Nguyen, Henry T
SWEET (MtN3_saliva) domain proteins, a recently identified group of efflux transporters, play an indispensable role in sugar efflux, phloem loading, plant-pathogen interaction and reproductive tissue development. The SWEET gene family is predominantly studied in Arabidopsis and members of the family are being investigated in rice. To date, no transcriptome or genomics analysis of soybean SWEET genes has been reported. In the present investigation, we explored the evolutionary aspect of the SWEET gene family in diverse plant species including primitive single cell algae to angiosperms with a major emphasis on Glycine max. Evolutionary features showed expansion and duplication of the SWEET gene family in land plants. Homology searches with BLAST tools and Hidden Markov Model-directed sequence alignments identified 52 SWEET genes that were mapped to 15 chromosomes in the soybean genome as tandem duplication events. Soybean SWEET (GmSWEET) genes showed a wide range of expression profiles in different tissues and developmental stages. Analysis of public transcriptome data and expression profiling using quantitative real time PCR (qRT-PCR) showed that a majority of the GmSWEET genes were confined to reproductive tissue development. Several natural genetic variants (non-synonymous SNPs, premature stop codons and haplotype) were identified in the GmSWEET genes using whole genome re-sequencing data analysis of 106 soybean genotypes. A significant association was observed between SNP-haplogroup and seed sucrose content in three gene clusters on chromosome 6. Present investigation utilized comparative genomics, transcriptome profiling and whole genome re-sequencing approaches and provided a systematic description of soybean SWEET genes and identified putative candidates with probable roles in the reproductive tissue development. Gene expression profiling at different developmental stages and genomic variation data will aid as an important resource for the soybean research
Phelan, Jody E.
Background Approximately 10 % of the Mycobacterium tuberculosis genome is made up of two families of genes that are poorly characterized due to their high GC content and highly repetitive nature. The PE and PPE families are typified by their highly conserved N-terminal domains that incorporate proline-glutamate (PE) and proline-proline-glutamate (PPE) signature motifs. They are hypothesised to be important virulence factors involved with host-pathogen interactions, but their high genetic variability and complexity of analysis means they are typically disregarded in genome studies. Results To elucidate the structure of these genes, 518 genomes from a diverse international collection of clinical isolates were de novo assembled. A further 21 reference M. tuberculosis complex genomes and long read sequence data were used to validate the approach. SNP analysis revealed that variation in the majority of the 168 pe/ppe genes studied was consistent with lineage. Several recombination hotspots were identified, notably pe_pgrs3 and pe_pgrs17. Evidence of positive selection was revealed in 65 pe/ppe genes, including epitopes potentially binding to major histocompatibility complex molecules. Conclusions This, the first comprehensive study of the pe and ppe genes, provides important insight into M. tuberculosis diversity and has significant implications for vaccine development.
Full Text Available BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52% corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
Taciana de Carvalho Coutinho
Full Text Available ABSTRACT Cotton fiber are tubular cells which develop from the differentiation of ovule epidermis. In addition to being one of the most important natural fiber of the textile group, cotton fiber afford an excellent experimental system for studying the cell wall. The aim of this work was to isolate and characterise the genes expressed in cotton fiber (Gossypium hirsutum L. to be used in future work in cotton breeding. Fiber of the cotton cultivar CNPA ITA 90 II were used to extract RNA for the subsequent generation of a cDNA library. Seventeen sequences were obtained, of which 14 were already described in the NCBI database (National Centre for Biotechnology Information, such as those encoding the lipid transfer proteins (LTPs and arabinogalactans (AGP. However, other cDNAs such as the B05 clone, which displays homology with the glycosyltransferases, have still not been described for this crop. Nevertheless, results showed that several clones obtained in this study are associated with cell wall proteins, wall-modifying enzymes and lipid transfer proteins directly involved in fiber development.
Granados-Cifuentes, Camila; Bellantuono, Anthony J; Ridgway, Tyrone; Hoegh-Guldberg, Ove; Rodriguez-Lanetty, Mauricio
Ecosystems worldwide are suffering the consequences of anthropogenic impact. The diverse ecosystem of coral reefs, for example, are globally threatened by increases in sea surface temperatures due to global warming. Studies to date have focused on determining genetic diversity, the sequence variability of genes in a species, as a proxy to estimate and predict the potential adaptive response of coral populations to environmental changes linked to climate changes. However, the examination of natural gene expression variation has received less attention. This variation has been implicated as an important factor in evolutionary processes, upon which natural selection can act. We acclimatized coral nubbins from six colonies of the reef-building coral Acropora millepora to a common garden in Heron Island (Great Barrier Reef, GBR) for a period of four weeks to remove any site-specific environmental effects on the physiology of the coral nubbins. By using a cDNA microarray platform, we detected a high level of gene expression variation, with 17% (488) of the unigenes differentially expressed across coral nubbins of the six colonies (jsFDR-corrected, p natural variation between reef corals when assessing experimental gene expression differences. The high transcriptional variation detected in this study is interpreted and discussed within the context of adaptive potential and phenotypic plasticity of reef corals. Whether this variation will allow coral reefs to survive to current challenges remains unknown.
E. Gus Cothran
Full Text Available Genetic variation in Zemaitukai horses was investigated using mitochondrial DNA (mtDNA sequencing. The study was performed on 421 bp of the mitochondrial DNA control region, which is known to be more variable than other sections of the mitochondrial genome. Samples from each of the remaining maternal family lines of Zemaitukai horses and three random samples for other Lithuanian (Lithuanian Heavy Draught, Zemaitukai large type and ten European horse breeds were sequenced. Five distinct haplotypes were obtained for the five Zemaitukai maternal families supporting the pedigree data. The minimal difference between two different sequence haplotypes was 6 and the maximal 11 nucleotides in Zemaitukai horse breed. A total of 20 nucleotide differences compared to the reference sequence were found in Lithuanian horse breeds. Genetic cluster analysis did not shown any clear pattern of relationship among breeds of different type.
David A Garfield
Full Text Available Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear, allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.
Chen, Wenjing; Cheng, Jianding; Ou, Xueling; Chen, Yong; Tong, Dayue; Sun, Hongyu
DNA sequence variation including base(s) changes and insertion or deletion in the primer binding region may cause a null allele and, if this changes the length of the amplified fragment out of the allelic ladder, off-ladder (OL) alleles may be detected. In order to provide accurate and reliable DNA evidence for forensic DNA analysis, it is essential to clarify sequence variations in prevalently used STR loci. Suspected null alleles and OL alleles of PlowerPlex16® System from 21,934 unrelated Chinese individuals were verified by alternative systems and sequenced. A total of 17 cases with null alleles were identified, including 12 kinds of point mutations in 16 cases and a 19-base deletion in one case. The total frequency of null alleles was 7.751 × 10(-4). Eight hundred and forty-four OL alleles classified as being of 97 different kinds were observed at 15 STR loci of the PowerPlex®16 system except vWA. All the frequencies of OL alleles were under 0.01. Null alleles should be confirmed by alternative primers and OL alleles should be named appropriately. Particular attention should be paid to sequence variation, since incorrect designation could lead to false conclusions.
Full Text Available Understanding the relationship between genetic and phenotypic variation is one of the great outstanding challenges in biology. To meet this challenge, comprehensive genomic variation maps of human as well as of model organism populations are required. Here, we present a nucleotide resolution catalog of single-nucleotide, multi-nucleotide, and structural variants in 39 Drosophila melanogaster Genetic Reference Panel inbred lines. Using an integrative, local assembly-based approach for variant discovery, we identify more than 3.6 million distinct variants, among which were more than 800,000 unique insertions, deletions (indels, and complex variants (1 to 6,000 bp. While the SNP density is higher near other variants, we find that variants themselves are not mutagenic, nor are regions with high variant density particularly mutation-prone. Rather, our data suggest that the elevated SNP density around variants is mainly due to population-level processes. We also provide insights into the regulatory architecture of gene expression variation in adult flies by mapping cis-expression quantitative trait loci (cis-eQTLs for more than 2,000 genes. Indels comprise around 10% of all cis-eQTLs and show larger effects than SNP cis-eQTLs. In addition, we identified two-fold more gene associations in males as compared to females and found that most cis-eQTLs are sex-specific, revealing a partial decoupling of the genomic architecture between the sexes as well as the importance of genetic factors in mediating sex-biased gene expression. Finally, we performed RNA-seq-based allelic expression imbalance analyses in the offspring of crosses between sequenced lines, which revealed that the majority of strong cis-eQTLs can be validated in heterozygous individuals.
Full Text Available Balanced chromosome abnormalities (BCAs occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14 that gave rise to abnormal mRNAs and could be assumed as disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to reveal an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated a high accuracy of mate-pair library sequencing to detect structural variants larger than 10 kB, proposing that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception.
Blake, Jonathon; Riddell, Andrew; Theiss, Susanne; Gonzalez, Alexis Perez; Haase, Bettina; Jauch, Anna; Janssen, Johannes W. G.; Ibberson, David; Pavlinic, Dinko; Moog, Ute; Benes, Vladimir; Runz, Heiko
Balanced chromosome abnormalities (BCAs) occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD) and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14) that gave rise to abnormal mRNAs and could be assumed as disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to reveal an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated a high accuracy of mate-pair library sequencing to detect structural variants larger than 10 kB, proposing that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception. PMID:24625750
Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.
Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and human. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10−5) to sequences deposit...
Full Text Available In the era of targeted therapy, mutation profiling of cancer is a crucial aspect of making therapeutic decisions. To characterize cancer at a molecular level, the use of formalin-fixed paraffin-embedded tissue is important. We tested the Ion AmpliSeq Cancer Hotspot Panel v2 and nCounter Copy Number Variation Assay in 89 formalin-fixed paraffin-embedded gastric cancer samples to determine whether they are applicable in archival clinical samples for personalized targeted therapies. We validated the results with Sanger sequencing, real-time quantitative PCR, fluorescence in situ hybridization and immunohistochemistry. Frequently detected somatic mutations included TP53 (28.17%, APC (10.1%, PIK3CA (5.6%, KRAS (4.5%, SMO (3.4%, STK11 (3.4%, CDKN2A (3.4% and SMAD4 (3.4%. Amplifications of HER2, CCNE1, MYC, KRAS and EGFR genes were observed in 8 (8.9%, 4 (4.5%, 2 (2.2%, 1 (1.1% and 1 (1.1% cases, respectively. In the cases with amplification, fluorescence in situ hybridization for HER2 verified gene amplification and immunohistochemistry for HER2, EGFR and CCNE1 verified the overexpression of proteins in tumor cells. In conclusion, we successfully performed semiconductor-based sequencing and nCounter copy number variation analyses in formalin-fixed paraffin-embedded gastric cancer samples. High-throughput screening in archival clinical samples enables faster, more accurate and cost-effective detection of hotspot mutations or amplification in genes.
Yuen, Ryan KC; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D’Abate, Lia; Chan, Ada JS; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson WL; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W
We are performing whole genome sequencing (WGS) of families with Autism Spectrum Disorder (ASD) to build a resource, named MSSNG, to enable the sub-categorization of phenotypes and underlying genetic factors involved. Here, we report WGS of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible in a cloud platform, and through an internet portal with controlled access. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertion/deletions (indels) or copy number variations (CNVs) per ASD subject. We identified 18 new candidate ASD-risk genes such as MED13 and PHF3, and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (p=6×10−4). In 294/2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried CNV/chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD. PMID:28263302
Lund, T; Rehfeld, J F; Olsen, Jørgen
In order to deduce the primary structure of bovine preprogastrin we therefore sequenced a gastrin DNA clone isolated from a bovine liver cosmid library. Bovine preprogastrin comprises 104 amino acids and consists of a signal peptide, a 37 amino acid spacer-sequence, the gastrin-34 sequence followed...
Wang, Jing; He, Ximiao; Ruan, Jue
Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DN...... on quantitative trait loci using data from collaborating institutions and public resources. Our data can be queried by search engine and homology-based BLAST searches. ChickVD is publicly accessible at http://chicken.genomics.org.cn. Udgivelsesdato: 2005-Jan-1...
The gene (pox1) encoding a phenol oxidase 1 from Pleurotus ostreatus was sequenced and the corresponding pox1-cDNA was also synthesized, cloned and sequenced. The isolated gene is flanked by an upstream region called the promoter (399 bp) prior to the start codon (ATG). The putative metalresponsive elements ...
Full Text Available Gorlin syndrome (GS is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurred in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs. In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals, whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.
Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki
Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are often unidentifiable in some patients; the failure to detect mutations is presumably because of mutations occurred in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.
Full Text Available Abstract Background Hox genes are known to play a key role in shaping the body plan of metazoans. Evolutionary dynamics of these genes is therefore essential in explaining patterns of evolutionary diversity. Among extant sarcopterygians comprising both lobe-finned fishes and tetrapods, our knowledge of the Hox genes and clusters has largely been restricted in several model organisms such as frogs, birds and mammals. Some evolutionary gaps still exist, especially for those groups with derived body morphology or occupying key positions on the tree of life, hindering our understanding of how Hox gene inventory varied along the sarcopterygian lineage. Results We determined the Hox gene inventory for six sarcopterygian groups: lungfishes, caecilians, salamanders, snakes, turtles and crocodiles by comprehensive PCR survey and genome walking. Variable Hox genes in each of the six sarcopterygian group representatives, compared to the human Hox gene inventory, were further validated for their presence/absence by PCR survey in a number of related species representing a broad evolutionary coverage of the group. Turtles, crocodiles, birds and placental mammals possess the same 39 Hox genes. HoxD12 is absent in snakes, amphibians and probably lungfishes. HoxB13 is lost in frogs and caecilians. Lobe-finned fishes, amphibians and squamate reptiles possess HoxC3. HoxC1 is only present in caecilians and lobe-finned fishes. Similar to coelacanths, lungfishes also possess HoxA14, which is only found in lobe-finned fishes to date. Our Hox gene variation data favor the lungfish-tetrapod, turtle-archosaur and frog-salamander relationships and imply that the loss of HoxD12 is not directly related to digit reduction. Conclusions Our newly determined Hox inventory data provide a more complete scenario for evolutionary dynamics of Hox genes along the sarcopterygian lineage. Limbless, worm-like caecilians and snakes possess similar Hox gene inventories to animals with
Thudi, Mahendar; Khan, Aamir W; Kumar, Vinay; Gaur, Pooran M; Katta, Krishnamohan; Garg, Vanika; Roorkiwal, Manish; Samineni, Srinivasan; Varshney, Rajeev K
Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource poor farmers in South Asia and Sub-Saharan Africa. In order to harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important (protein content) traits using whole genome re-sequencing approach. A total of 192.19 Gb data, generated on 35 genotypes of chickpea, comprising 973.13 million reads, with an average sequencing depth of ~10 X for each line. On an average 92.18 % reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected while comparing with the reference chickpea genome. Highest number of SNPs were identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line specific variations (144,888 SNPs and 19,968 Indels) with the highest percentage were identified in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV) respectively. Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource in genetic research and helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high density SNP arrays for
Giardina, Emiliano; Oddone, Francesco; Lepre, Tiziana; Centofanti, Marco; Peconi, Cristina; Tanga, Lucia; Quaranta, Luciano; Frezzotti, Paolo; Novelli, Giuseppe; Manni, Gianluca
Single nucleotide polymorphisms (SNPs) within the LOXL1 gene are associated with pseudoesfoliation syndrome and pseudoesfoliation glaucoma. The aim of our study is to investigate a potential involvement of LOXL1 gene in the pathogenesis of pigment dispersion syndrome (PDS) and pigmentary glaucoma (PG). A cohort of Caucasian origin of 84 unrelated and clinically well-characterised patients with PDS/PG and 200 control subjects were included in the study. Genomic DNA from whole blood was extracted and the coding and regulatory regions of LOXL1 gene were risequenced in both patients and controls to identify unknown sequence variations. Genotype and haplotype analysis were performed with UNPHASED software. The expression levels of LOXL1 were determined on c-DNA from peripheral blood lymphocytes by quantitative real-time RT-PCR. A significant allele association was detected for SNP rs2304722 within the fifth intron of LOXL1 (Odds ratio (OR = 2.43, p-value = 3,05e-2). Haplotype analysis revealed the existence of risk and protective haplotypes associated with PG-PDS (OR = 3.35; p-value = 1.00e-5 and OR = 3.35; p-value = 1.00e-4, respectively). Expression analysis suggests that associated haplotypes can regulate the expression level LOXL1. Haplotypes of LOXL1 are associated with PG-PDS independently from rs1048661, leading to a differential expression of the transcript.
Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M.; Greenwood, Alex D.; Roca, Alfred L.
The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. PMID:25462343
3Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, People's Republic of China ... of these rearrangements involve tRNA genes, ND5 gene and ... ncbi.nlm.nih.gov/projects/Sequin/download/seq_win_download.
Full Text Available The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2 of five species: horse (89%, human (90%, chimpanzee (89%, rhesus monkey (89% and mouse (85%; thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.
Huson, Daniel H; Tappu, Rewati; Bazinet, Adam L; Xie, Chao; Cummings, Michael P; Nieselt, Kay; Williams, Rohan
Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However, for some questions, only specific genes of interest need to be assembled. This is then a gene-centric assembly where the goal is to assemble reads into contigs for a family of orthologous genes. We present a new method for performing gene-centric assembly, called protein-alignment-guided assembly, and provide an implementation in our metagenome analysis tool MEGAN. Genes are assembled on the fly, based on the alignment of all reads against a protein reference database such as NCBI-nr. Specifically, the user selects a gene family based on a classification such as KEGG and all reads binned to that gene family are assembled. Using published synthetic community metagenome sequencing reads and a set of 41 gene families, we show that the performance of this approach compares favorably with that of full-featured assemblers and that of a recently published HMM-based gene-centric assembler, both in terms of the number of reference genes detected and of the percentage of reference sequence covered. Protein-alignment-guided assembly of orthologous gene families complements whole-metagenome assembly in a new and very useful way.
Stickney, A L; Dunowska, M; Cave, N J
The ongoing evolution of feline immunodeficiency virus (FIV) has resulted in the existence of a diverse continuum of viruses. FIV isolates differ with regards to their mutation and replication rates, plasma viral loads, cell tropism and the ability to induce apoptosis. Clinical disease in FIV-infected cats is also inconsistent. Genomic sequence variation of FIV is likely to be responsible for some of the variation in viral behaviour. The specific genetic sequences that influence these key viral properties remain to be determined. With knowledge of the specific key determinants of pathogenicity, there is the potential for veterinarians in the future to apply this information for prognostic purposes. Genomic sequence variation of FIV also presents an obstacle to effective vaccine development. Most challenge studies demonstrate acceptable efficacy of a dual-subtype FIV vaccine (Fel-O-Vax FIV) against FIV infection under experimental settings; however, vaccine efficacy in the field still remains to be proven. It is important that we discover the key determinants of immunity induced by this vaccine; such data would compliment vaccine field efficacy studies and provide the basis to make informed recommendations on its use.
Full Text Available Abstract Background The mycobacterial genome is inclined to polymerase slippage and a high mutation rate in microsatellite regions due to high GC content and absence of a mismatch repair system. However, the exact molecular mechanisms underlying microsatellite variation have not been fully elucidated. Here, we investigated mutation events in the hyper-variable trinucleotide microsatellite locus MML0050 located in the Rv0050 gene of W-Beijing and non-W-Beijing Mycobacterium tuberculosis strains in order to gain insight into the genomic structure and activity of repeated regions. Results Size analysis indicated the presence of five alleles that differed in length by three base pairs. Moreover, nucleotide gains occurred more frequently than loses in this trinucleotide microsatellite. Mutation frequency was not completely related with the total length, though the relative frequency in the longest allele was remarkably higher than that in the shortest. Sequence analysis was able to detect seven alleles and revealed that point mutations enhanced the level of locus variation. Introduction of an interruptive motif correlated with the total allele length and genetic lineage, rather than the length of the longest stretch of perfect repeats. Finally, the level of locus variation was drastically different between the two genetic lineages. Conclusion The Rv0050 locus encodes the bifunctional penicillin-binding protein ponA1 and is essential to mycobacterial survival. Our investigations of this particularly dynamic genomic region provide insights into the overall mode of microsatellite evolution. Specifically, replication slippage was implicated in the mutational process of this microsatellite and a sequence-based genetic analysis was necessary to determine that point mutation events acted to maintain microsatellite size integrity while providing genomic diversity.
Brian Wade Jamandre
Full Text Available The sequence and structure of the complete mtDNA control region (CR of M. cephalus from African, Pacific, and Atlantic populations are presented in this study to assess its usefulness in phylogeographic studies of this species. The mtDNA CR sequence variations among M. cephalus populations largely exceeded intraspecific polymorphisms that are generally observed in other vertebrates. The length of CR sequence varied among M. cephalus populations due to the presence of indels and variable number of tandem repeats at the 3′ hypervariable domain. The high evolutionary rate of the CR in this species probably originated from these mutations. However, no excessive homoplasic mutations were noticed. Finally, the star shaped tree inferred from the CR polymorphism stresses a rapid radiation worldwide, in this species. The CR still appears as a good marker for phylogeographic investigations and additional worldwide samples are warranted to further investigate the genetic structure and evolution in M. cephalus.
Full Text Available The fire ant Solenopsis invicta and its close relatives display an important social polymorphism involving differences in colony queen number. Colonies are headed by either a single reproductive queen (monogyne form or multiple queens (polygyne form. This variation in social organization is associated with variation at the gene Gp-9, with monogyne colonies harboring only B-like allelic variants and polygyne colonies always containing b-like variants as well. We describe naturally occurring variation at Gp-9 in fire ants based on 185 full-length sequences, 136 of which were obtained from S. invicta collected over much of its native range. While there is little overall differentiation between most of the numerous alleles observed, a surprising amount is found in the coding regions of the gene, with such substitutions usually causing amino acid replacements. This elevated coding-region variation may result from a lack of negative selection acting to constrain amino acid replacements over much of the protein, different mutation rates or biases in coding and non-coding sequences, negative selection acting with greater strength on non-coding than coding regions, and/or positive selection acting on the protein. Formal selection analyses provide evidence that the latter force played an important role in the basal b-like lineages coincident with the emergence of polygyny. While our data set reveals considerable paraphyly and polyphyly of S. invicta sequences with respect to those of other fire ant species, the b-like alleles of the socially polymorphic species are monophyletic. An expanded analysis of colonies containing alleles of this clade confirmed the invariant link between their presence and expression of polygyny. Finally, our discovery of several unique alleles bearing various combinations of b-like and B-like codons allows us to conclude that no single b-like residue is completely predictive of polygyne behavior and, thus, potentially causally
Li, Jie; Wang, Shunli; Li, Shanshan; Ge, Pei; Li, Xiaohui; Ma, Wujun; Zeller, F J; Hsam, Sai L K; Yan, Yueming
The α-gliadins are associated with human celiac disease. A total of 23 noninterrupted full open reading frame α-gliadin genes and 19 pseudogenes were cloned and sequenced from C, M, N, and U genomes of four diploid Aegilops species. Sequence comparison of α-gliadin genes from Aegilops and Triticum species demonstrated an existence of extensive allelic variations in Gli-2 loci of the four Aegilops genomes. Specific structural features were found including the compositions and variations of two polyglutamine domains (QI and QII) and four T cell stimulatory toxic epitopes. The mean numbers of glutamine residues in the QI domain in C and N genomes and the QII domain in C, N, and U genomes were much higher than those in Triticum genomes, and the QI domain in C and N genomes and the QII domain in C, M, N, and U genomes displayed greater length variations. Interestingly, the types and numbers of four T cell stimulatory toxic epitopes in α-gliadins from the four Aegilops genomes were significantly less than those from Triticum A, B, D, and their progenitor genomes. Relationships between the structural variations of the two polyglutamine domains and the distributions of four T cell stimulatory toxic epitopes were found, resulting in the α-gliadin genes from the Aegilops and Triticum genomes to be classified into three groups.
Kammlah Diane M
Full Text Available Abstract Background Cattle fever ticks, Rhipicephalus (Boophilus microplus and R. (B. annulatus, vector bovine and equine babesiosis, and have significantly expanded beyond the permanent quarantine zone established in South Texas. Currently, there are no vaccines approved for use within the United States for controlling these vectors. Vaccines developed in Australia and Cuba based on the midgut antigen Bm86 have variable efficacy against cattle fever ticks. A possible explanation for this variation in vaccine efficacy is amino acid sequence divergence between the recombinant Bm86 vaccine component and native Bm86 expressed in ticks from different geographical regions of the world. Results There was 91.8% amino acid sequence identity in Bm86 among R. microplus and R. annulatus sequenced from South Texas infestations. When South Texas isolates were compared to the Australian Yeerongpilly and Cuban Camcord vaccine strains, there was 89.8% and 90.0% identity, respectively. Most of the sequence divergence was focused in one region of the protein, amino acids 206-298. Hydrophilicity profiles revealed that two short regions of Bm86 (amino acids 206-210 and 560-570 appear to be more hydrophilic in South Texas isolates compared to vaccine strains. Only one amino acid difference was found between South Texas and vaccine strains within two previously described B-cell epitopes. A total of 4 amino acid differences were observed within three peptides previously shown to induce protective immune responses in cattle. Conclusions Sequence differences between South Texas isolates and Yeerongpilly and Camcord strains are spread throughout the entire Bm86 sequence, suggesting that geographic variation does exist. Differences within previously described B-cell epitopes between South Texas isolates and vaccine strains are minimal; however, short regions of hydrophilic amino acids found unique to South Texas isolates suggest that additional unique surface exposed
Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M
High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%. K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs. Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.
Ladefoged, Søren; Christiansen, Gunna
of which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene.......The homolog of the gyrB gene, which has been reported to be present in the vicinity of the initiation site of replication in bacteria, was mapped on the Mycoplasma hominis genome, and the region was subsequently sequenced. Five open reading frames were identified flanking the gyrB gene, one...
Gericke, Niklas M.; Hagberg, Mariana; dos Santos, Vanessa Carvalho; Joaquim, Leyla Mariane; El-Hani, Charbel N.
The aim of this paper is to investigate in a systematic and comparative way previous results of independent studies on the treatment of genes and gene function in high school textbooks from six different countries. We analyze how the conceptual variation within the scientific domain of Genetics regarding gene function models and gene concepts is…
Chaillou, S.; Lokman, B.C.; Leer, R.J.; Posthuma, C.; Postma, P.W.; Pouwels, P.H.
Two genes, xylP and xylQ, from the xylose regulon of Lactobacillus pentosus were cloned and sequenced. Together with the repressor gene of the regulon, xylR, the xylPQ genes form an operon which is inducible by xylose and which is transcribed from a promoter located 145 bp upstream of xylP. A
Full Text Available Genome sequencing has been useful to gain an understanding of bacterial evolution. It has been used for studying the phylogeography and/or the impact of mutation and recombination on bacterial populations. However, it has rarely been used to study gene turnover at microevolutionary scales. Here, we sequenced Mexican strains of the human pathogen Acinetobacter baumannii sampled from the same locale over a 3 year period to obtain insights into the microevolutionary dynamics of gene content variability. We found that the Mexican A. baumannii population was recently founded and has been emerging due to a rapid clonal expansion. Furthermore, we noticed that on average the Mexican strains differed from each other by over 300 genes and, notably, this gene content variation has accrued more frequently and faster than the accumulation of mutations. Moreover, due to its rapid pace, gene content variation reflects the phylogeny only at very short periods of time. Additionally, we found that the external branches of the phylogeny had almost 100 more genes than the internal branches. All in all, these results show that rapid gene turnover has been of paramount importance in producing genetic variation within this population and demonstrate the utility of genome sequencing to study alternative forms of genetic variation.
Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.
Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins. Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...
Tatarenkov, Andrey; Ayala, Francisco J
We studied nucleotide sequence variation at the gene coding for dopa decarboxylase (Ddc) in seven populations of Drosophila melanogaster. Strength and pattern of linkage disequilibrium are somewhat distinct in the extensively sampled Spanish and Raleigh populations. In the Spanish population, a few sites are in strong positive association, whereas a large number of sites in the Raleigh population are associated nonrandomly but the association is not strong. Linkage disequilibrium analysis shows presence of two groups of haplotypes in the populations, each of which is fairly diverged, suggesting epistasis or inversion polymorphism. There is evidence of two forms of natural selection acting on Ddc. The McDonald-Kreitman test indicates a deficit of fixed amino acid differences between D. melanogaster and D. simulans, which may be due to negative selection. An excess of derived alleles at high frequency, significant according to the H-test, is consistent with the effect of hitchhiking. The hitchhiking may have been caused by directional selection downstream of the locus studied, as suggested by a gradual decrease of the polymorphism-to-divergence ratio. Altogether, the Ddc locus exhibits a complicated pattern of variation apparently due to several evolutionary forces. Such a complex pattern may be a result of an unusually high density of functionally important genes.
Full Text Available Abstract Background Previous studies have reported on the presence of Murine Mammary Tumor Virus (MMTV-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung diseases including carcinomas specimens from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results The MMTV-like env gene sequences have been detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were assigned as INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, as compared to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion In summary, we identified the presence of MMTV-like gene sequences in 2 out of 11 (18% of the lung carcinomas and 1 out of 7 (14% of acute inflamatory lung infiltrate specimens studied of a Mexican Population.
Saroj K. Dangi
Full Text Available Aim: Blackleg disease is caused by Clostridium chauvoei in ruminants. Although virulence factors such as C. chauvoei toxin A, sialidase, and flagellin are well characterized, hyaluronidases of C. chauvoei are not characterized. The present study was aimed at cloning and sequence analysis of hyaluronoglucosaminidase (nagH gene of C. chauvoei. Materials and Methods: C. chauvoei strain ATCC 10092 was grown in ATCC 2107 media and confirmed by polymerase chain reaction (PCR using the primers specific for 16-23S rDNA spacer region. nagH gene of C. chauvoei was amplified and cloned into pRham-SUMO vector and transformed into Escherichia cloni 10G cells. The construct was then transformed into E. cloni cells. Colony PCR was carried out to screen the colonies followed by sequencing of nagH gene in the construct. Results: PCR amplification yielded nagH gene of 1143 bp product, which was cloned in prokaryotic expression system. Colony PCR, as well as sequencing of nagH gene, confirmed the presence of insert. Sequence was then subjected to BLAST analysis of NCBI, which confirmed that the sequence was indeed of nagH gene of C. chauvoei. Phylogenetic analysis of the sequence showed that it is closely related to Clostridium perfringens and Clostridium paraputrificum. Conclusion: The gene for virulence factor nagH was cloned into a prokaryotic expression vector and confirmed by sequencing.
Full Text Available Rice is one of the main food crops and several studies have examined the molecular mechanism of the exposure of the rice plant stigma. The improvement in the exposure of the stigma in female parent hybrid combinations can enhance the efficiency of hybrid breeding. In the present study, a mutant plant with low exposed stigma (lesr was discovered among the descendants of the indica thermo-sensitive sterile line 115S. The ES% rate of the mutant decreased by 70.64% compared with the wild type variety. The F2 population was established by genetic analysis considering the mutant as the female parent and the restorer line 93S as the male parent. The results indicated a normal F1 population, while a clear division was noted for the high and low exposed stigma groups, respectively. This process was possible only by a ES of 25% in the F2 population. This was in agreement with the ratio of 3:1, which indicated that the mutant was controlled by a recessive main-effect QTL locus, temporarily named as LESR. Genome-wide comparison of the SNP profiles between the early, high and low production bulks were constructed from F2 plants using bulked segregant analysis in combination with high-throughput sequencing technology. The results demonstrated that the candidate loci was located on the chromosome 10 of the rice. Following screening of the recombinant rice plants with newly developed molecular markers, the genetic region was narrowed down to 0.25 Mb. This region was flanked by InDel-2 and InDel-2 at the physical location from 13.69 to 13.94 Mb. Within this region, 7 genes indicated base differences between parents. A total of 2 genes exhibited differences at the coding region and upstream of the coding region, respectively. The present study aimed to further clone the LESR gene, verify its function and identify the stigma variation.
Argyropoulos, Christos; Etheridge, Alton; Sakhanenko, Nikita; Galas, David
The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.
years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...
Wang Qin; Yue Jingyin; Li Jin; Song Li; Liu Qiang; Mu Chuanjie; Wu Hongying
Objective: To explore radioresistance correlative genes in IRM-2 inbred mouse. Methods: The total RNA was extracted from spleen cells of IRM-2 and their parent 615 and ICR/JCL mouse. The mRNA differential display technique was used to analyze gene expression differences. Each differential bands were amplified by PCR, cloned and sequenced. Results: There were 75 differential expression bands appearing in IRM-2 mouse but not in 615 and ICR/JCL mouse. Fifty-two pieces of cDNA sequences were got by sequencing. Twenty-one expressed sequence tags (EST) that were not the same as known mice genes were found and registered by comparing with GenBank database. Conclusion: Twenty-one EST denote that radioresistance correlative genes may be in IRM-2 mouse, which have laid a foundation for isolating and identifying radioresistance correlative genes in further study. (authors)
Full Text Available Gene expression is highly plastic, which can help organisms to both acclimate and adapt to changing environments. Possible variation in gene expression among individuals with the same genotype (among clones is not widely considered, even though it could impact the results of studies that focus on gene expression phenotypes, for example studies using clonal lines. We examined the extent of within and between clone variation in gene expression in the earthworm Dendrobaena octaedra, which reproduces through apomictic parthenogenesis. Five microsatellite markers were developed and used to confirm that offspring are genetic clones of their parent. After that, expression of 12 genes was measured from five individuals each from six clonal lines after exposure to copper contaminated soil. Variation in gene expression was higher over all genotypes than within genotypes, as initially assumed. A subset of the genes was also examined in the offspring of exposed individuals in two of the clonal lines. In this case, variation in gene expression within genotypes was as high as that observed over all genotypes. One gene in particular (chymotrypsin inhibitor also showed significant differences in the expression levels among genetically identical individuals. Gene expression can vary considerably, and the extent of variation may depend on the genotypes and genes studied. Ensuring a large sample, with many different genotypes, is critical in studies comparing gene expression phenotypes. Researchers should be especially cautious inferring gene expression phenotypes when using only a single clonal or inbred line, since the results might be specific to only certain genotypes.
Kristoffer T Bæk
Full Text Available Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47 required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNP arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75% steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3, a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.
Bardak, H; Gunay, M; Ercalik, Y; Bardak, Y; Ozbas, H; Bagci, O
Age-related macular degeneration (AMD) is the leading cause of blindness in developed countries. It is a complex disease with both genetic and environmental risk factors. To improve clinical management of this condition, it is important to develop risk assessment and prevention strategies for environmental influences, and establish a more effective treatment approach. The aim of the present study was to investigate age-related maculopathy susceptibility protein 2 (ARMS2) gene sequences among Turkish patients with exudative AMD. In addition to 39 advanced exudative AMD patients, 250 healthy individuals for whom exome sequencing data were available were included as a control group. Patients with a history of known environmental and systemic AMD risk factors were excluded. Genomic DNA was isolated from peripheral blood and analyzed using next-generation sequencing. All coding exons of the ARMS2 gene were assessed. Three different ARMS2 sequence variations (rs10490923, rs2736911, and rs10490924) were identified in both the patient and control group. Within the control group, two further ARMS2 gene variants (rs7088128 and rs36213074) were also detected. Logistic regression analysis revealed a relationship between the rs10490924 polymorphism and AMD in the Turkish population.
Sequence Variations in the Flagellar Antigen Genes fliC H25 and fliC H28 of Escherichia coli and Their Use in Identification and Characterization of Enterohemorrhagic E. coli (EHEC) O145:H25 and O145:H28
Beutin, Lothar; Delannoy, Sabine; Fach, Patrick
Enterohemorrhagic E. coli (EHEC) serogroup O145 is regarded as one of the major EHEC serogroups involved in severe infections in humans. EHEC O145 encompasses motile and non-motile strains of serotypes O145:H25 and O145:H28. Sequencing the fliC-genes associated with the flagellar antigens H25 and H28 revealed the genetic diversity of the fliC H25 and fliC H28 gene sequences in E. coli. Based on allele discrimination of these fliC-genes real-time PCR tests were designed for identification of EHEC O145:H25 and O145:H28. The fliC H25 genes present in O145:H25 were found to be very similar to those present in E. coli serogroups O2, O100, O165, O172 and O177 pointing to their common evolution but were different from fliC H25 genes of a multiple number of other E. coli serotypes. In a similar way, EHEC O145:H28 harbor a characteristic fliC H28 allele which, apart from EHEC O145:H28, was only found in enteropathogenic (EPEC) O28:H28 strains that shared some common traits with EHEC O145:H28. The real time PCR-assays targeting these fliC H25[O145] and fliC H28[O145] alleles allow better characterization of EHEC O145:H25 and EHEC O145:H28. Evaluation of these PCR assays in spiked ready-to eat salad samples resulted in specific detection of both types of EHEC O145 strains even when low spiking levels of 1–10 cfu/g were used. Furthermore these PCR assays allowed identification of non-motile E. coli strains which are serologically not typable for their H-antigens. The combined use of O-antigen genotyping (O145wzy) and detection of the respective fliC H25[O145] and fliC H28[O145] allele types contributes to improve identification and molecular serotyping of E. coli O145 isolates. PMID:26000885
Gitt, M.A.; Barondes, S.H.
The authors have isolated and sequenced the genomic DNA encoding a human dimeric soluble lactose-binding lectin. The gene has four exons, and its upstream region contains sequences that suggest control by glucocorticoids, heat (environmental) shock, metals, and other factors. They have also isolated and sequenced three exons of the gene encoding another human putative lectin, the existence of which was first indicated by isolation of its cDNA. Comparisons suggest a general pattern of genomic organization of members of this lectin gene family
Full Text Available BACKGROUND: The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. METHODOLOGY/PRINCIPAL FINDINGS: Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3% being the most abundant. More than four thousand simple sequence repeat (SSR sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. CONCLUSIONS: The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye.
Full Text Available Plasmids are important antibiotic resistance determinant carriers that can disseminate various drug resistance genes among species or genera. By using a high throughput sequencing approach, two groups of plasmids of Escherichia coli (named E1 and E2, each consisting of 160 clinical E. coli strains isolated from different periods of time were sequenced and analyzed. A total of 20 million reads were obtained and mapped onto the known resistance gene sequences. As a result, a total of 9 classes, including 36 types of antibiotic resistant genes, were identified. Among these genes, 25 and 27 single nucleotide polymorphisms (SNPs appeared, of which 9 and 12 SNPs are nonsynonymous substitutions in the E1 and E2 samples. It is interesting to find that a novel genotype of bla(KLUC, whose close relatives, bla(KLUC-1 and bla(KLUC-2, have been previously reported as carried on the Kluyvera cryocrescens chromosome and Enterobacter cloacae plasmid, was identified. It shares 99% and 98% amino acid identities with Kluc-1 and Kluc-2, respectively. Further PCR screening of 608 Enterobacteriaceae family isolates yielded a second variant (named bla(KLUC-4. It was interesting to find that Kluc-3 showed resistance to several cephalosporins including cefotaxime, whereas bla(KLUC-4 did not show any resistance to the antibiotics tested. This may be due to a positively charged residue, Arg, replaced by a neutral residue, Leu, at position 167, which is located within an omega-loop. This work represents large-scale studies on resistance gene distribution, diversification and genetic variation in pooled multi-drug resistance plasmids, and provides insight into the use of high throughput sequencing technology for microbial resistance gene detection.
For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. ... been widely used for phylogenetic studies and sequence differences in ... In order to fill up the internal gap, a new set.
Casals, Ferran; Anglada, Roger; Bonet, Núria; Rasal, Raquel; van der Gaag, Kristiaan J; Hoogenboom, Jerry; Solé-Morata, Neus; Comas, David; Calafell, Francesc
We have genotyped the 58 STRs (27 autosomal, 24 Y-STRs and 7 X-STRs) and 94 autosomal SNPs in Illumina ForenSeq™ Primer Mix A in 88 Spanish Roma (Gypsy) samples and 143 Catalans. Since this platform is based in massive parallel sequencing, we have used simple R scripts to uncover the sequence variation in the repeat region. Thus, we have found, across 58 STRs, 541 length-based alleles, which, after considering repeat-sequence variation, became 804 different alleles. All loci in both populations were in Hardy-Weinberg equilibrium. F ST between both populations was 0.0178 for autosomal SNPs, 0.0146 for autosomal STRs, 0.0101 for X-STRs and 0.1866 for Y-STRs. Combined a priori statistics showed quite large; for instance, pooling all the autosomal loci, the a priori probabilities of discriminating a suspect become 1-(2.3×10 -70 ) and 1-(5.9×10 -73 ), for Roma and Catalans respectively, and the chances of excluding a false father in a trio are 1-(2.6×10 -20 ) and 1-(2.0×10 -21 ). Copyright © 2017 Elsevier B.V. All rights reserved.
Wendy Anne Gold
Full Text Available Rett syndrome (RTT is a rare, severe disorder of neuronal plasticity that predominantly affects girls. Girls with RTT usually appear asymptomatic in the first 6-18 months of life, but gradually develop severe motor, cognitive and behavioural abnormalities that persist for life. A predominance of neuronal and synaptic dysfunction, with altered excitatory-inhibitory neuronal synaptic transmission and synaptic plasticity are overarching features of RTT in children and in mouse models. Approximately 95% of patients with classical RTT have mutations in the X-linked methyl-CpG-binding (MECP2 gene, whilst other genes, including cyclin-dependent kinase-like 5 (CDKL5, Forkhead box protein G1 (FOXG1, Myocyte-specific enhancer factor 2C (MEF2C and Transcription factor 4 (TCF4, have been associated with phenotypes overlapping with RTT. However, there remain a proportion of patients who carry a clinical diagnosis of RTT, but who are mutation negative. In recent years, next-generation sequencing (NGS technologies have revolutionized approaches to genetic studies, making whole-exome and even whole-genome sequencing possible strategies for the detection of rare and de novo mutations, aiding the discovery of novel disease genes. Here, we review the recent progress that is emerging in identifying pathogenic variations, specifically from exome sequencing in RTT patients, and emphasize the need for the use of this technology to identify known and new disease genes in RTT patients.
ZHAO Baimei; CAO Haipeng; DUAN Junjie; YANG Wencai
Three crosses,Hawaii7981×PI128216,Hawaii7981×LA1589,and PI128216×LA1589,were made to develop F2 populations for testing allelism among three genes Xv3,Rx4,and RxLA1589 conferring resistance to bacterial spot caused by Xanthomonas perforans race T3 in tomato. Each population consisted of 535–1 655 individuals. An infiltration method was used to inoculate the leaves of the parental and F2 plants as well as the susceptible control OH88119 for detecting hypersensitive resistance（HR）. The results showed that all the tomato plants except OH88119 had HR to race T3,indicating that Xv3,Rx4,and RxLA1589 were allelic genes. Genomic DNA fragments of the Rx4 alleles from Hawaii7981,PI128216,and LA1589 were amplified using gene-specific primers and sequenced. No sequence variation was observed in the coding region of Rx4 in the three resistant lines. Based on the published map positions of these loci as well as the allelic tests and sequence data obtained in this study,we speculated that Xv3,Rx4,and RxLA1589 were the same gene. The results will provide useful information for understanding the mechanism of resistance to race T3 and developing resistant tomato varieties.
Knaus, Brian J; Grünwald, Niklaus J
Inference of copy number variation presents a technical challenge because variant callers typically require the copy number of a genome or genomic region to be known a priori . Here we present a method to infer copy number that uses variant call format (VCF) data as input and is implemented in the R package vcfR . This method is based on the relative frequency of each allele (in both genic and non-genic regions) sequenced at heterozygous positions throughout a genome. These heterozygous positions are summarized by using arbitrarily sized windows of heterozygous positions, binning the allele frequencies, and selecting the bin with the greatest abundance of positions. This provides a non-parametric summary of the frequency that alleles were sequenced at. The method is applicable to organisms that have reference genomes that consist of full chromosomes or sub-chromosomal contigs. In contrast to other software designed to detect copy number variation, our method does not rely on an assumption of base ploidy, but instead infers it. We validated these approaches with the model system of Saccharomyces cerevisiae and applied it to the oomycete Phytophthora infestans , both known to vary in copy number. This functionality has been incorporated into the current release of the R package vcfR to provide modular and flexible methods to investigate copy number variation in genomic projects.
Full Text Available Abstract Background Two high-risk genes have been implicated in the development of CM (cutaneous melanoma. Germline mutations of the CDKN2A gene are found in CDK4 gene reported to date. Beside those high penetrance genes, certain allelic variants of the MC1R gene modify the risk of developing the disease. The aims of our study were: to determine the prevalence of germline CDKN2A mutations and variants in members of families with familial CM and in patients with multiple primary CM; to search for possible CDK4 mutations, and to determine the frequency of variations in the MC1R gene. Methods From January 2001 until January 2007, 64 individuals were included in the study. The group included 28 patients and 7 healthy relatives belonging to 25 families, 26 patients with multiple primary tumors and 3 children with CM. Additionally 54 healthy individuals were included as a control group. Mutations and variants of the melanoma susceptibility genes were identified by direct sequencing. Results Seven families with CDKN2A mutations were discovered (7/25 or 28.0%. The L94Q mutation found in one family had not been previously reported in other populations. The D84N variant, with possible biological impact, was discovered in the case of patient without family history but with multiple primary CM. Only one mutation carrier was found in the control group. Further analysis revealed that c.540C>T heterozygous carriers were more common in the group of CM patients and their healthy relatives (11/64 vs. 2/54. One p14ARF variant was discovered in the control group and no mutations of the CDK4 gene were found. Most frequently found variants of the MC1R gene were T314T, V60L, V92M, R151C, R160W and R163Q with frequencies slightly higher in the group of patients and their relatives than in the group of controls, but the difference was statistically insignificant. Conclusion The present study has shown high prevalence of p16INK4A mutations in Slovenian population of
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.
Full Text Available We previously reported the microarray-based selection of three ovulation-related genes in zebrafish. We used a different selection method in this study, RNA sequencing analysis. An additional eight up-regulated candidates were found as specifically up-regulated genes in ovulation-induced samples. Changes in gene expression were confirmed by qPCR analysis. Furthermore, up-regulation prior to ovulation during natural spawning was verified in samples from natural pairing. Gene knock-out zebrafish strains of one of the candidates, the starmaker gene (stm, were established by CRISPR genome editing techniques. Unexpectedly, homozygous mutants were fertile and could spawn eggs. However, a high percentage of unfertilized eggs and abnormal embryos were produced from these homozygous females. The results suggest that the stm gene is necessary for fertilization. In this study, we selected additional ovulation-inducing candidate genes, and a novel function of the stm gene was investigated.
Gardiner, Sarah L; van Belzen, Martine J; Boogaard, Merel W; van Roon-Mom, Willeke M C; Rozing, Maarten P; van Hemert, Albert M; Smit, Johannes H; Beekman, Aartjan T F; van Grootheest, Gerard; Schoevers, Robert A; Oude Voshaar, Richard C; Roos, Raymund A C; Comijs, Hannie C; Penninx, Brenda W J H; van der Mast, Roos C; Aziz, N Ahmad
Huntington disease (HD) is a severe neuropsychiatric disorder caused by a cytosine-adenine-guanine (CAG) repeat expansion in the HTT gene. Although HD is frequently complicated by depression, it is still unknown to what extent common HTT CAG repeat size variations in the normal range could affect depression risk in the general population. Using binary logistic regression, we assessed the association between HTT CAG repeat size and depression risk in two well-characterized Dutch cohorts─the Netherlands Study of Depression and Anxiety and the Netherlands Study of Depression in Older Persons─including 2165 depressed and 1058 non-depressed persons. In both cohorts, separately as well as combined, there was a significant non-linear association between the risk of lifetime depression and HTT CAG repeat size in which both relatively short and relatively large alleles were associated with an increased risk of depression (β = -0.292 and β = 0.006 for the linear and the quadratic term, respectively; both P < 0.01 after adjustment for the effects of sex, age, and education level). The odds of lifetime depression were lowest in persons with a HTT CAG repeat size of 21 (odds ratio: 0.71, 95% confidence interval: 0.52 to 0.98) compared to the average odds in the total cohort. In conclusion, lifetime depression risk was higher with both relatively short and relatively large HTT CAG repeat sizes in the normal range. Our study provides important proof-of-principle that repeat polymorphisms can act as hitherto unappreciated but complex genetic modifiers of depression.
Amirhaeri, S; Wohlrab, F; Wells, R D
The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Gray, Rebecca R.; Tanaka, Yasuhito; Takebe, Yutaka; Magiorkinis, Gkikas; Buskell, Zelma; Seeff, Leonard; Alter, Harvey J.; Pybus, Oliver G.
Reconstructing the transmission history of infectious diseases in the absence of medical or epidemiological records often relies on the evolutionary analysis of pathogen genetic sequences. The precision of evolutionary estimates of epidemic history can be increased by the inclusion of sequences derived from ‘archived’ samples that are genetically distinct from contemporary strains. Historical sequences are especially valuable for viral pathogens that circulated for many years before being formally identified, including HIV and the hepatitis C virus (HCV). However, surprisingly few HCV isolates sampled before discovery of the virus in 1989 are currently available. Here, we report and analyse two HCV subgenomic sequences obtained from infected individuals in 1953, which represent the oldest genetic evidence of HCV infection. The pairwise genetic diversity between the two sequences indicates a substantial period of HCV transmission prior to the 1950s, and their inclusion in evolutionary analyses provides new estimates of the common ancestor of HCV in the USA. To explore and validate the evolutionary information provided by these sequences, we used a new phylogenetic molecular clock method to estimate the date of sampling of the archived strains, plus the dates of four more contemporary reference genomes. Despite the short fragments available, we conclude that the archived sequences are consistent with a proposed sampling date of 1953, although statistical uncertainty is large. Our cross-validation analyses suggest that the bias and low statistical power observed here likely arise from a combination of high evolutionary rate heterogeneity and an unstructured, star-like phylogeny. We expect that attempts to date other historical viruses under similar circumstances will meet similar problems. PMID:23938759
Kitamoto, N; Yoshino-Yasuda, S; Ohmiya, K; Tsukagoshi, N
A gene (pel1) encoding pectin lyase (Pel1) was isolated from a shoyu koji mold, Aspergillus oryzae KBN616, and characterized. The structural gene comprised 1,196 bp with a single intron. The ORF encoded 381 amino acids with a signal peptide of 20 amino acids. The deduced amino acid sequence showed high similarity to those of Aspergillus niger pectin lyases and Glomerella cingulata PnlA. The pel1 gene was successfully overexpressed under the promoter of the A. oryzae TEF1 gene. The molecular mass of the recombinant pectin lyase substantially coincided with that calculated based on nucleotide sequence.
Satoh, Chiyoko; Takahashi, Norio; Asakawa, Junichi; Hiyama, Keiko; Kodaira, Meiko (Radiation Effects Research Foundation, Hiroshima (Japan))
In the course of feasibility studies to examine the efficiencies and practicalities of various techniques for screening for genetic variations, the human coagulation factor IX (F9) genes of 63 Japanese families were examined by PCR-denaturing gradient gel electrophoresis (PCR-DGGE). Four target sequences with lengths of 983-2,891 bp from the F9 genes of 126 unrelated individuals from Hiroshima and their 100 children were amplified by PCR, digested with restriction enzymes to approximately 500-bp fragments, and examined by DGGE - a total of 6,724 bp being examined per individual. GC-rich sequences (GC-clamps) of 40 bp were attached to both ends of the target sequences, as far as was feasible. Eleven types of new nucleotide substitutions were detected in the population, none of which produced RFLPs or caused hemophilia B. By examining two target sequences in a single lane, approximately 8,000 bp in a diploid individual could be examined. This approach is very effective for the detection of variations in DNA and is applicable to large-scale population studies. 46 refs., 3 figs., 1 tab.
Genetically select is a better way to satisfy the growing customer requirement ... a ranscriptional repressor has an important effect to the gene expression and cell ... In this study, a total of 450 animals with no genetic relationship were used to.
Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred
Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach compared to Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.
Yazun Bashir Jarrar
Nov 26, 2017 ... Sequence analysis of the N-acetyltransferase 2 gene (NAT2) among Jordanian volunteers, Libyan. Journal of Medicine .... For molecular modeling of NAT2 protein, visualized ..... cal clustering. .... cular dynamics simulation.
[Solc R., Hirschfeldova K., Kebrdlova V. and Baxova A. 2014 Analysis of common SHOX gene sequence variants ... based on a Gibbs sampling strategy were done using .... SHOX (short stature homeobox) are an important cause of growth.
Ramaiah, N.; Chun, J.; Ravel, J.; Straube, W.L.; Hill, R.T.; Colwell, R.R.
in all cases were confirmed by PCR of DNA extracts and Southern hybridization analyses, using an internal probe for confirmation of luxA amplification products. Sequence analysis of luxA genes from three nonluminescent bacteria isolated from...
Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi
Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analysis of complete gene sequences of PLE family by screening from a Rongchang pig BAC library and third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome used as a template for Polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found that, not only did the PLE exon sequences of the four genes show a high degree of homology, but also that the intron sequences were highly similar. Additionally, the regulatory region of the genes contained two 720bps reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for
Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration does not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
Baloch, M.J.; Jatoi, W.A.
Physiological and yield traits such as stomatal conductance (mmol m-/sup 2/s/sup -1/), Leaf relative water content (RWC %) and grain yield per plant were studied in a separate experiment. Results revealed that five out of sixteen cultivars viz. Anmol, Moomal, Sarsabz, Bhitai and Pavan, appeared to be relatively more drought tolerant. Based on morphophysiological results, studies were continued to look at these cultivars for drought tolerance at molecular level. Initially, four well recognized primers for dehydrin genes (DHNs) responsible for drought induction in T. durum L., T. aestivum L. and O. sativa L. were used for profiling gene sequence of sixteen wheat cultivars. The primers amplified the DHN genes variably like Primer WDHN13 (T. aestivum L.) amplified the DHN gene in only seven cultivars whereas primer TdDHN15 ( T. durum L.) amplified all the sixteen cultivars with even different DNA banding patterns some showing second weaker DNA bands. Third primer TdDHN16 (T. durum L.) has shown entirely different PCR amplification prototype, specially showing two strong DNA bands while fourth primer RAB16C (O. sativa L.) failed to amplify DHN gene in any of the cultivars. Examination of DNA sequences revealed several interesting features. First, it identified the two exon/one intron structure of this gene (complete sequences were not shown), a feature not previously described in the two database cDNA sequences available from T. aestivum L. (gi|21850). Secondly, the analysis identified several single nucleotide polymorphisms (SNPs), positions in gene sequence. Although complete gene sequence was not obtained for all the cultivars, yet there were a total of 38 variable positions in exonic (coding region) sequence, from a total gene length of 453 nucleotides. Matrix of SNP shows these 37 positions with individual sequence at positions given for each of the 14 cultivars (sequence of two cultivars was not obtained) included in this analysis. It demonstrated a considerab le
Hasebe, M; Omori, T; Nakazawa, M; Sano, T; Kato, M; Iwatsuki, K
Pteriodophytes have a longer evolutionary history than any other vascular land plant and, therefore, have endured greater loss of phylogenetically informative information. This factor has resulted in substantial disagreements in evaluating characters and, thus, controversy in establishing a stable classification. To compare competing classifications, we obtained DNA sequences of a chloroplast gene. The sequence of 1206 nt of the large subunit of the ribulose-bisphosphate carboxylase gene (rbc...
Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying
To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was succe...
Kwon, Minseok; Leem, Sangseob; Yoon, Joon; Park, Taesung
With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants. Here, we propose GxGrare which is a new gene-gene interaction method for the rare variants in the framework of the multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps; 1) collapsing the rare variants, 2) MDR analysis for the collapsed rare variants, and 3) detect top candidate interaction pairs. GxGrare can be used for the detection of not only gene-gene interactions, but also interactions within a single gene. The proposed method is illustrated with 1080 whole exome sequencing data of the Korean population in order to identify causal gene-gene interaction for rare variants for type 2 diabetes. The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.
Piton, Amélie; Redin, Claire; Mandel, Jean-Louis
Because of the unbalanced sex ratio (1.3-1.4 to 1) observed in intellectual disability (ID) and the identification of large ID-affected families showing X-linked segregation, much attention has been focused on the genetics of X-linked ID (XLID). Mutations causing monogenic XLID have now been reported in over 100 genes, most of which are commonly included in XLID diagnostic gene panels. Nonetheless, the boundary between true mutations and rare non-disease-causing variants often remains elusive. The sequencing of a large number of control X chromosomes, required for avoiding false-positive results, was not systematically possible in the past. Such information is now available thanks to large-scale sequencing projects such as the National Heart, Lung, and Blood (NHLBI) Exome Sequencing Project, which provides variation information on 10,563 X chromosomes from the general population. We used this NHLBI cohort to systematically reassess the implication of 106 genes proposed to be involved in monogenic forms of XLID. We particularly question the implication in XLID of ten of them (AGTR2, MAGT1, ZNF674, SRPX2, ATP6AP2, ARHGEF6, NXF5, ZCCHC12, ZNF41, and ZNF81), in which truncating variants or previously published mutations are observed at a relatively high frequency within this cohort. We also highlight 15 other genes (CCDC22, CLIC2, CNKSR2, FRMPD4, HCFC1, IGBP1, KIAA2022, KLF8, MAOA, NAA10, NLGN3, RPL10, SHROOM4, ZDHHC15, and ZNF261) for which replication studies are warranted. We propose that similar reassessment of reported mutations (and genes) with the use of data from large-scale human exome sequencing would be relevant for a wide range of other genetic diseases. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Lionel Anath C
Full Text Available Abstract Background Copy number variations (CNVs can contribute to variable degrees of fitness and/or disease predisposition. Recent studies show that at least 1% of any given genome is copy number variable when compared to the human reference sequence assembly. Homozygous deletions (or CNV nulls that are found in the normal population are of particular interest because they may serve to define non-essential genes in human biology. Results In a genomic screen investigating CNV in Autism Spectrum Disorders (ASDs we detected a heterozygous deletion on chromosome 10p12.1, spanning the Patched-domain containing 3 (PTCHD3 gene, at a frequency of ~1.4% (6/427. This finding seemed interesting, given recent discoveries on the role of another Patched-domain containing gene (PTCHD1 in ASD. Screening of another 177 ASD probands yielded two additional heterozygous deletions bringing the frequency to 1.3% (8/604. The deletion was found at a frequency of ~0.73% (27/3,695 in combined control population from North America and Northern Europe predominately of European ancestry. Screening of the human genome diversity panel (HGDP-CEPH covering worldwide populations yielded deletions in 7/1,043 unrelated individuals and those detected were confined to individuals of European/Mediterranean/Middle Eastern ancestry. Breakpoint mapping yielded an identical 102,624 bp deletion in all cases and controls tested, suggesting a common ancestral event. Interestingly, this CNV occurs at a break of synteny between humans and mouse. Considering all data, however, no significant association of these rare PTCHD3 deletions with ASD was observed. Notwithstanding, our RNA expression studies detected PTCHD3 in several tissues, and a novel shorter isoform for PTCHD3 was characterized. Expression in transfected COS-7 cells showed PTCHD3 isoforms colocalize with calnexin in the endoplasmic reticulum. The presence of a patched (Ptc domain suggested a role for PTCHD3 in various biological
Wypijewski, K.; Musial, W.; Augustyniak, J.; Malinowski, T.
The coat protein (CP) gene of the Skierniewice isolate of plum pox virus (PPV-S) has been amplified using the reverse transcription - polymerase chain reaction (RT-PCR), cloned and sequenced. The nucleotide sequence of the gene and the deduced amino-acid sequences of PPV-S CP were compared with those of other PPV strains. The nucleotide sequence showed very high homology to most of the published sequences. The motif: Asp-Ala-Gly (DAG), important for the aphid transmissibility, was present in the amino-acid sequence. Our isolate did not react in ELISA with monoclonal antibodies MAb06 supposed to be specific for PPV-D. (author). 32 refs, 1 fig., 2 tabs
Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying
To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi’an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%–99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites. PMID:23024043
Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying
To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.
Mosquitoes are vectors for the transmission of many human pathogens that include viruses, nematodes and protozoa. For the understanding of their vectorial capacity, identification of disease carrying and refractory strains is essential. Recently, molecular taxonomic techniques have been utilized for this purpose. Sequence ...
Wu, Qinggang; Zhang, Jingping; Zhao, Chuncheng; Zhu, Jianguo
Cloning and sequencing of the papA gene from uropathogenic Escherichia coli 4030 strain to investigate the differences of the sequences of the papA of UPEC4030 strain and the ones of related genes, in order to make whether or not it was a new genotype. Cloning and sequencing methods were used to analyze the sequence of the papA of UPEC4030 strain in comparison with related sequences. The sequence analysis of papA revealed a 722 bp gene and encode 192 amino acid polypeptide. The overall homology of the papA genes between UPEC4030 and the standard strains of ten F types were 36.11%-77.95% and 22.20%-78.34% at nucleotide and deduced amino acid levels. The homology between the sequence of the reverse primers and the corresponding sequence of UPEC4030 papA was 10%-66.67%. The results confirmed that UPEC4030 strain contained a novel papA variant. UPEC4030 strain could contain an unknown papA variant or the novel genotype. The pathogenic mechanism and epidemiology related need to be further studied.
Jan 15, 2014 ... as the type strains of a species of genus Trichoderma based on phylogenetic tree analysis together with the 18S rRNA gene sequence search in Ribosomal Database Project, small subunit rRNA and large subunit rRNA databases. The sequence was deposited in GenBank with the accession numbers.
Cirulli, Elizabeth T.; Lasseigne, Brittany N.; Petrovski, Slavé; Sapp, Peter C.; Dion, Patrick A.; Leblond, Claire S.; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J.; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E.; Boone, Braden E.; Wimbish, Jack R.; Waite, Lindsay L.; Jones, Angela L.; Carulli, John P.; Day-Williams, Aaron G.; Staropoli, John F.; Xin, Winnie W.; Chesi, Alessandra; Raphael, Alya R.; McKenna-Yasek, Diane; Cady, Janet; de Jong, J. M. B. Vianney; Kenna, Kevin P.; Smith, Bradley N.; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E.; Baloh, Robert H.; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M.; Gibson, Summer; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Baas, Frank; ten Asbroek, Anneloor L. M. A.
Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS
This study was conducted to analyze the polymorphisms of chicken Toll-like receptors 4(TLR4) gene and aimed to provide a theoretical foundation for a further research on correlation between chicken TLR4 gene and disease resistance. Genetic variations at exon 2 of TLR4 gene in 14 chicken breeds and the red jungle ...
Gillet-Markowska, Alexandre; Richard, Hugues; Fischer, Gilles; Lafontaine, Ingrid
The detection of structural variations (SVs) in short-range Paired-End (PE) libraries remains challenging because SV breakpoints can involve large dispersed repeated sequences, or carry inherent complexity, hardly resolvable with classical PE sequencing data. In contrast, large insert-size sequencing libraries (Mate-Pair libraries) provide higher physical coverage of the genome and give access to repeat-containing regions. They can thus theoretically overcome previous limitations as they are becoming routinely accessible. Nevertheless, broad insert size distributions and high rates of chimerical sequences are usually associated to this type of libraries, which makes the accurate annotation of SV challenging. Here, we present Ulysses, a tool that achieves drastically higher detection accuracy than existing tools, both on simulated and real mate-pair sequencing datasets from the 1000 Human Genome project. Ulysses achieves high specificity over the complete spectrum of variants by assessing, in a principled manner, the statistical significance of each possible variant (duplications, deletions, translocations, insertions and inversions) against an explicit model for the generation of experimental noise. This statistical model proves particularly useful for the detection of low frequency variants. SV detection performed on a large insert Mate-Pair library from a breast cancer sample revealed a high level of somatic duplications in the tumor and, to a lesser extent, in the blood sample as well. Altogether, these results show that Ulysses is a valuable tool for the characterization of somatic mosaicism in human tissues and in cancer genomes. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: email@example.com.
Andersen, A.D.H.; Lange, Marianne; Lillevang, S.T.
The human chromosome region 2q33 including the three costimulatory molecules CD28, CTLA-4 and ICOS, has been subject to much attention due to its linkage to a number of autoimmune diseases. The search for the causal relationship of this linkage has revealed several polymorphisms, but no variations...... in the amino acid sequences except for one polymorphism in, the leader sequence of CTLA-4. In the present study, we examined the ICOS gene of an unrelated group of healthy donors from the Danish population. We were able to report 16 intronic SNP, one intronic G-insert and two repeat regions in intron 4......, consistent with the [T](n) and the [GT](n) regions reported in a Japanese study. Putative haplotypes for the established SNP and repeat polymorphisms have been estimated by computational analysis. Sequencing of similar to3500 by of the upstream region of ICOS revealed an additional eight SNP of which two...
Yan, Zhi-tao; Li, Nan-fang; Guo, Yan-ying; Yao, Xiao-guang; Wang, Hong-mei; Hu, Jun-li
To investigate the association between the genetic variations of the functional region in bone morphogenetic protein gene (BMP7) with type 2 diabetes mellitus in Chinese Uygur individuals. A case-control study was conducted based on epidemiological investigation. A total of 717 Uygur subjects (276 males and 441 females) were selected and divided into two groups: diabetes mellitus group (n = 502, 191 males and 311 females) and control group (n = 215, 85 males and 130 females). All exons, flanking introns and the promoter regions of (BMP7) gene were sequenced in 48 Uygur diabetics. Representative variations were selected according to the minor allele frequency (MAF) and linkage disequilibrium and genotyped using the TaqMan polymerase chain reaction method in 717 Uygur individuals, a relatively isolated general population in a relatively homogeneous environment and a case-control study was conducted to test the association between the genetic variations of (BMP7) gene and type 2 diabetes mellitus. Five novel and 8 known variations in the (BMP7) gene were identified. All genotype distributions were tested for deviations from Hardy-Weinberg equilibrium (P> 0.05). There was significant difference of genotype distribution of rs6025422 between type 2 diabetes mellitus and control groups in the male population (P 0.05), but there was no difference in total and female population (P> 0.05). And the means of fasting blood glucose (FBG), fasting insulin and HOMA-index significantly decreased in individuals with AA, AG and GG genotypes of rs6025422 in male population (Ppopulation (P> 0.05). The logistic regression analysis showed that GG genotype of rs6025422 variation might be a protective factor for diabetes in male (OR= 0.637, 95% confidence interval 0.439-0.923, P< 0.05). The present study suggests that the rs6025422 polymorphism in (BMP7) gene may be associated with diabetes mellitus and insulin resistance in Uygur men.
Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria
The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researches studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used four-fold cross validation test and fitting to random dataset to validate the model and proof its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
Full Text Available Objective The vertebral number is associated with body length and carcass traits, which represents an economically important trait in farm animals. The variation of vertebral number has been observed in a few mammalian species. However, the variation of vertebral number and quantitative trait loci in sheep breeds have not been well addressed. Methods In our investigation, the information including gender, age, carcass weight, carcass length and the number of thoracic and lumbar vertebrae from 624 China Kazakh sheep was collected. The effect of vertebral number variation on carcass weight and carcass length was estimated by general linear model. Further, the polymorphic sites of Vertnin (VRTN gene were identified by sequencing, and the association of the genotype and vertebral number variation was analyzed by the one-way analysis of variance model. Results The variation of thoracolumbar vertebrae number in Kazakh sheep (18 to 20 was smaller than that in Texel sheep (17 to 21. The individuals with 19 thoracolumbar vertebrae (T13L6 were dominant in Kazakh sheep (79.2%. The association study showed that the numbers of thoracolumbar vertebrae were positively correlated with the carcass length and carcass weight, statistically significant with carcass length. To investigate the association of thoracolumbar vertebrae number with VRTN gene, we genotyped the VRTN gene. A total of 9 polymorphic sites were detected and only a single nucleotide polymorphism (SNP (rs426367238 was suggested to associate with thoracic vertebral number statistically. Conclusion The variation of thoracolumbar vertebrae number positively associated with the carcass length and carcass weight, especially with the carcass length. VRTN gene polymorphism of the SNP (rs426367238 with significant effect on thoracic vertebral number could be as a candidate marker to further evaluate its role in influence of thoracolumbar vertebral number.
Zhang, Zhifeng; Sun, Yawei; Du, Wei; He, Sangang; Liu, Mingjun; Tian, Changyan
The vertebral number is associated with body length and carcass traits, which represents an economically important trait in farm animals. The variation of vertebral number has been observed in a few mammalian species. However, the variation of vertebral number and quantitative trait loci in sheep breeds have not been well addressed. In our investigation, the information including gender, age, carcass weight, carcass length and the number of thoracic and lumbar vertebrae from 624 China Kazakh sheep was collected. The effect of vertebral number variation on carcass weight and carcass length was estimated by general linear model. Further, the polymorphic sites of Vertnin ( VRTN ) gene were identified by sequencing, and the association of the genotype and vertebral number variation was analyzed by the one-way analysis of variance model. The variation of thoracolumbar vertebrae number in Kazakh sheep (18 to 20) was smaller than that in Texel sheep (17 to 21). The individuals with 19 thoracolumbar vertebrae (T13L6) were dominant in Kazakh sheep (79.2%). The association study showed that the numbers of thoracolumbar vertebrae were positively correlated with the carcass length and carcass weight, statistically significant with carcass length. To investigate the association of thoracolumbar vertebrae number with VRTN gene, we genotyped the VRTN gene. A total of 9 polymorphic sites were detected and only a single nucleotide polymorphism (SNP) (rs426367238) was suggested to associate with thoracic vertebral number statistically. The variation of thoracolumbar vertebrae number positively associated with the carcass length and carcass weight, especially with the carcass length. VRTN gene polymorphism of the SNP (rs426367238) with significant effect on thoracic vertebral number could be as a candidate marker to further evaluate its role in influence of thoracolumbar vertebral number.
Yao, Yining; Yang, Qinrui; Shao, Chengchen; Liu, Baonian; Zhou, Yuxiang; Xu, Hongmei; Zhou, Yueqin; Tang, Qiqun; Xie, Jianhui
Rare variants are widely observed in human genome and sequence variations at primer binding sites might impair the process of PCR amplification resulting in dropouts of alleles, named as null alleles. In this study, 5 cases from routine paternity testing using PowerPlex ® 21 System for STR genotyping were considered to harbor null alleles at TH01, FGA, D5S818, D8S1179, and D16S539, respectively. The dropout of alleles was confirmed by using alternative commercial kits AGCU Expressmarker 22 PCR amplification kit and AmpFℓSTR ® . Identifiler ® Plus Kit, and sequencing results revealed a single base variation at the primer binding site of each STR locus. Results from the collection of previous reports show that null alleles at D5S818 were frequently observed in population detected by two PowerPlex ® typing systems and null alleles at D19S433 were mostly observed in Japanese population detected by two AmpFℓSTR™ typing systems. Furthermore, the most popular mutation type appeared the transition from C to T with G to A, which might have a potential relationship with DNA methylation. Altogether, these results can provide helpful information in forensic practice to the elimination of genotyping discrepancy and the development of primer sets. Copyright © 2017 Elsevier B.V. All rights reserved.
Full Text Available The growth and development of plants are sensitive to their surroundings. Although numerous studies have analyzed plant transcriptomic variation, few have quantified the effect of combinations of factors or identified factor-specific effects. In this study, we performed RNA sequencing (RNA-seq analysis on tobacco leaves derived from 10 treatment combinations of three groups of ecological factors, i.e., climate factors (CFs, soil factors (SFs, and tillage factors (TFs. We detected 4980, 2916, and 1605 differentially expressed genes (DEGs that were affected by CFs, SFs, and TFs, which included 2703, 768, and 507 specific and 703 common DEGs (simultaneously regulated by CFs, SFs, and TFs, respectively. GO and KEGG enrichment analyses showed that genes involved in abiotic stress responses and secondary metabolic pathways were overrepresented in the common and CF-specific DEGs. In addition, we noted enrichment in CF-specific DEGs related to the circadian rhythm, SF-specific DEGs involved in mineral nutrient absorption and transport, and SF- and TF-specific DEGs associated with photosynthesis. Based on these results, we propose a model that explains how plants adapt to various ecological factors at the transcriptomic level. Additionally, the identified DEGs lay the foundation for future investigations of stress resistance, circadian rhythm and photosynthesis in tobacco.
The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryoyic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distruted highly variable repeat family; Information contents and dinucleotide compostions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
Brito, E C; Vimaleswaran, K S; Brage, S
.005; rs13117172, p = 0.008) and fasting glucose concentrations (rs7657071, p = 0.002). None remained significant after correcting for the number of statistical comparisons. We proceeded by testing for gene x physical activity interactions for the polymorphisms that showed nominal evidence of association...... in the main effect models. None of these tests was statistically significant. CONCLUSIONS/INTERPRETATION: Variants at PPARGC1A may influence several metabolic traits in this European paediatric cohort. However, variation at PPARGC1A is unlikely to have a major impact on cardiovascular or metabolic health...
Prohormone convertase subtilisin/kexin type 1 gene. (PCSK1) plays a role in body mass control. Recent associa- tion studies have shown that three common nonsynonymous. SNPs are linked to increase risk of obesity and therefore it has been the focus of this study. Hence, in this study, polymorphisms of the bovine ...
The aim of this study was to investigate the nucleotide sequence of COX1 gene in mitochondrial genome of Khorasan native chicken and detect the possible mutations in the genome. For this purpose, after sampling and extracting DNA from the whole blood samples, the COX1 gene was amplified using specific primers and ...
Bruinenberg, P. G.; Evers, M.; Waterham, H. R.; Kuipers, J.; Arnberg, A. C.; AB, G.
We have cloned the AMO gene, encoding the microbody matrix enzyme amine oxidase (EC 188.8.131.52) from the yeast Hansenula polymorpha. The gene was isolated by differential screening of a cDNA library, immunoselection, and subsequent screening of a H. polymorpha genomic library. The nucleotide sequence
Heidekamp, F.; Dirkse, W.G.; Hille, J.; Ormondt, H. van
The nucleotide sequence of the tmr gene, encoded by the octopine Ti plasmid from Agrobacterium tumefaciens (pTiAch5), was determined. The T-DNA, which encompasses this gene, is involved in tumor formation and maintenance, and probably mediates the cytokinin-independent growth of transformed plant
Oct 24, 2011 ... G), and the major structural protein of inner capsid particles (ICP), and also specific antigen of mucosa immunization that mediate specific immunological reaction. In this report, sequence analysis of VP6 gene of giant panda rotavirus was carried out. Full-length VP6 gene encoding for ICP of giant panda.
Fromont-Racine, M; Bucchini, D; Madsen, O
Expression of the human insulin gene was examined in transgenic mouse lines carrying the gene with various lengths of DNA sequences 5' to the transcription start site (+1). Expression of the transgene was demonstrated by 1) the presence of human C-peptide in urine, 2) the presence of specific...... of the transgene was observed in cell types other than beta-islet cells....
Jul 17, 2012 ... These nucleotide and protein sequence analysis of the putative swrW gene provides vital information on the versatility .... chain reaction (PCR) products were stored at 4°C. Presence of ... identical to the same gene with an E-value of 0.0. .... The Prokaryotes-A Handbook on the Biol. of Bacteria:Ecophysiol.
Horn, Fabian; Habel, Andreas; Scharf, Daniel H.; Dworschak, Jan; Brakhage, Axel A.; Guthke, Reinhard; Hertweck, Christian; Linde, J?rg
Verticillium hemipterigenum (anamorph Torrubiella hemipterigena) is an entomopathogenic fungus and produces a broad range of secondary metabolites. Here, we present the draft genome sequence of the fungus, including gene structure and functional annotation. Genes were predicted incorporating RNA-Seq data and functionally annotated to provide the basis for further genome studies.
Jespersen, Jakob S.; Petersen, Bent; Seguin-Orlando, Andaine
at identifying PfEMP1 features associated with high virulence. Here we present the first effective method for sequence analysis of var genes expressed in field samples: a sequential PCR and next generation sequencing based technique applied on expressed var sequence tags and subsequently on long range PCR......, encoded by ~60 highly variable 'var' genes per haploid genome. PfEMP1 is exported to the surface of infected erythrocytes and is thought to be fundamental to immune evasion by adhesion to host and parasite factors. The highly variable nature has constituted a roadblock in var expression studies aimed...
Pietrowski, D; Förster, M
The complete cDNA sequence of a ribonuclease k6 gene of Bos Taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acids exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).
Massaia, Andrea; Xue, Yali
The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.
Full Text Available Abstract Background Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL. In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.
My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.
Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M
The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar unicellular opisthokont genomes than to other animal genomes.
Valdecy Oliveira Pontes
Full Text Available In the context of the approach of the linguistic variation of Spanish and the use of Functionalist Translation in Foreign Language classes, this article aims to report the results of the application of a Didactic Sequence (SD, in the style of the Geneva School, Hispanic plays for the teaching of linguistic variation in the pronominal treatment forms of the Spanish-Portuguese Brazilian language pair. SD was applied in the subject "Introduction to Translation Studies in Spanish Language" (2nd semester, offered by the course in Letters - Spanish Language and its Literatures, of the Federal University of Ceará. This article was based on the theoretical foundations of Functionalist Translation (NORD, 1994, 1996, 2009, 2012, Translation and Sociolinguistics (BOLAÑOS-CUELLAR, 2000; MAYORAL, 1998, elaboration of SD (DOLZ; NOVERRAZ; SCHNEUWLY, 2004; CRISTÓVÃO, 2010; BARROS, 2012 and research on the variation in the forms of treatment of Spanish and Portuguese (FONTANELLA DE WEINBER, 1999; SCHERRE et al, 2015.
Full Text Available Abstract Background Broad-spectrum fluoroquinolone antibiotics are central in modern health care and are used to treat and prevent a wide range of bacterial infections. The recently discovered qnr genes provide a mechanism of resistance with the potential to rapidly spread between bacteria using horizontal gene transfer. As for many antibiotic resistance genes present in pathogens today, qnr genes are hypothesized to originate from environmental bacteria. The vast amount of data generated by shotgun metagenomics can therefore be used to explore the diversity of qnr genes in more detail. Results In this paper we describe a new method to identify qnr genes in nucleotide sequence data. We show, using cross-validation, that the method has a high statistical power of correctly classifying sequences from novel classes of qnr genes, even for fragments as short as 100 nucleotides. Based on sequences from public repositories, the method was able to identify all previously reported plasmid-mediated qnr genes. In addition, several fragments from novel putative qnr genes were identified in metagenomes. The method was also able to annotate 39 chromosomal variants of which 11 have previously not been reported in literature. Conclusions The method described in this paper significantly improves the sensitivity and specificity of identification and annotation of qnr genes in nucleotide sequence data. The predicted novel putative qnr genes in the metagenomic data support the hypothesis of a large and uncharacterized diversity within this family of resistance genes in environmental bacterial communities. An implementation of the method is freely available at http://bioinformatics.math.chalmers.se/qnr/.
Höps, Wolfram; Jeffryes, Matt; Bateman, Alex
We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O.
Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on βcasein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in gt 11 was screened by plaque hy-hybridization using a 42-mer synthetic 32 p-labelled oligo-nucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoR1. Following subcloning (PUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids and the amino acid sequence deducted from the nucleotide sequence is to 91% identical to the published sequence for human β-casein show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression
Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming
The prion protein gene of African lion (Panthera Leo) was first cloned and polymorphisms screened. The results suggest that the prion protein gene of eight African lions is highly homogenous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) in the prion protein gene (Prnp) of African lion were found, but no amino acid substitutions. Sequence analysis showed that the higher homology is observed to felis catus AF003087 (96.7%) and to sheep number M31313.1 (96.2%) Genbank accessed. With respect to all the mammalian prion protein sequences compared, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross species transmission of prion disease.
Full Text Available Abstract Background In humans, copies of the Long Interspersed Nuclear Element 1 (LINE-1 retrotransposon comprise 21% of the reference genome, and have been shown to modulate expression and produce novel splice isoforms of transcripts from genes that span or neighbor the LINE-1 insertion site. Results In this work, newly released pilot data from the 1000 Genomes Project is analyzed to detect previously unreported full length insertions of the retrotransposon LINE-1. By direct analysis of the sequence data, we have identified 22 previously unreported LINE-1 insertion sites within the sequence data reported for a mother/father/daughter trio. Conclusions It is demonstrated here that next generation sequencing data, as well as emerging high quality datasets from individual genome projects allow us to assess the amount of heterogeneity with respect to the LINE-1 retrotransposon amongst humans, and provide us with a wealth of testable hypotheses as to the impact that this diversity may have on the health of individuals and populations.
Campos, W N; Massaro, J D; Martinelli, A L C; Halliwell, J A; Marsh, S G E; Mendes-Junior, C T; Donadi, E A
The HFE molecule controls iron uptake from gut, and defects in the molecule have been associated with iron overload, particularly in hereditary hemochromatosis. The HFE gene including both coding and boundary intronic regions were sequenced in 304 Brazilian individuals, encompassing healthy individuals and patients exhibiting hereditary or acquired iron overload. Six sites of variation were detected: (1) H63D C>G in exon 2, (2) IVS2 (+4) T>C in intron 2, (3) a C>G transversion in intron 3, (4) C282Y G>A in exon 4, (5) IVS4 (-44) T>C in intron 4, and (6) a new guanine deletion (G>del) in intron 5, which were used for haplotype inference. Nine HFE alleles were detected and six of these were officially named on the basis of the HLA Nomenclature, defined by the World Health Organization (WHO) Nomenclature Committee for Factors of the HLA System, and published via the IPD-IMGT/HLA website. Four alleles, HFE*001, *002, *003, and *004 exhibited variation within their exon sequences. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The potential for two overlapping fragments of DNA from a clone of newly isolated alkanes degrading bacterium Pseudomonas frederiksbergensis encoding sequences with similar homology to two parts of functional proteins is described. One strand contains a sequence with high homology to alkanes monooxygenase (alkB), a member of the alkanes hydroxylase family, and the other strand contains a sequence with some homology to alcohol dehydrogenase gene (alkJ). Overlapping of the genes on opposite strands has been reported in eukaryotic species, and is now reported in a bacterial species. The sequence comparisons and ORFS results revealed that the regulation and the genes organization involved in alkane oxidation represented in Pseudomonas frederiksberghensis varies among the different known alkane degrading bacteria. The alk gene cluster containing homologues to the known alkane monooxygenase (alkB), and rubredoxin (alkG) are oriented in the same direction, whereas alcohol dehydrogenase (alkJ) is oriented in the opposite direction. Such genomes encode messages on both strands of the DNA, or in an overlapping but different reading frames, of the same strand of DNA. The possibility of creating novel genes from pre-existing sequences, known as overprinting, which is a widespread phenomenon in small viruses. Here, the origin and evolution of the gene overlap to bacteriophages belonging to the family Microviridae have been investigated. Such a phenomenon is most widely described in extremely small genomes such as those of viruses or small plasmids, yet here is a unique phenomenon. (author)
Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli
RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.
Full Text Available RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60% of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.
Full Text Available The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp. Based on the sequence similarities, searching with known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 of transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 transcripts were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR. The results showed that these genes were significantly more expressed in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians.
Boone, R. D.; Rogers, S. L.
We report on work to assess the functional gene sequences for soil microbiota that control nitrogen cycle pathways along the successional sequence (willow, alder, poplar, white spruce, black spruce) on the Tanana River floodplain, Interior Alaska. Microbial DNA and mRNA were extracted from soils (0-10 cm depth) for amoA (ammonium monooxygenase), nifH (nitrogenase reductase), napA (nitrate reductase), and nirS and nirK (nitrite reductase) genes. Gene presence was determined by amplification of a conserved sequence of each gene employing sequence specific oligonucleotide primers and Polymerase Chain Reaction (PCR). Expression of the genes was measured via nested reverse transcriptase PCR amplification of the extracted mRNA. Amplified PCR products were visualized on agarose electrophoresis gels. All five successional stages show evidence for the presence and expression of microbial genes that regulate N fixation (free-living), nitrification, and nitrate reduction. We detected (1) nifH, napA, and nirK presence and amoA expression (mRNA production) for all five successional stages and (2) nirS and amoA presence and nifH, nirK, and napA expression for early successional stages (willow, alder, poplar). The results highlight that the existing body of previous process-level work has not sufficiently considered the microbial potential for a nitrate economy and free-living N fixation along the complete floodplain successional sequence.
Suzuki, Hideaki; Yu, Jiwen; Wang, Fei; Zhang, Jinfa
Cytoplasmic male sterility (CMS), which is a maternally inherited trait and controlled by novel chimeric genes in the mitochondrial genome, plays a pivotal role in the production of hybrid seed. In cotton, no PCR-based marker has been developed to discriminate CMS-D8 (from Gossypium trilobum) from its normal Upland cotton (AD1, Gossypium hirsutum) cytoplasm. The objective of the current study was to develop PCR-based single nucleotide polymorphic (SNP) markers from mitochondrial genes for the CMS-D8 cytoplasm. DNA sequence variation in mitochondrial genes involved in the oxidative phosphorylation chain including ATP synthase subunit 1, 4, 6, 8 and 9, and cytochrome c oxidase 1, 2 and 3 subunits were identified by comparing CMS-D8, its isogenic maintainer and restorer lines on the same nuclear genetic background. An allelic specific PCR (AS-PCR) was utilized for SNP typing by incorporating artificial mismatched nucleotides into the third or fourth base from the 3' terminus in both the specific and nonspecific primers. The result indicated that the method modifying allele-specific primers was successful in obtaining eight SNP markers out of eight SNPs using eight primer pairs to discriminate two alleles between AD1 and CMS-D8 cytoplasms. Two of the SNPs for atp1 and cox1 could also be used in combination to discriminate between CMS-D8 and CMS-D2 cytoplasms. Additionally, a PCR-based marker from a nine nucleotide insertion-deletion (InDel) sequence (AATTGTTTT) at the 59-67 bp positions from the start codon of atp6, which is present in the CMS and restorer lines with the D8 cytoplasm but absent in the maintainer line with the AD1 cytoplasm, was also developed. A SNP marker for two nucleotide substitutions (AA in AD1 cytoplasm to CT in CMS-D8 cytoplasm) in the intron (1,506 bp) of cox2 gene was also developed. These PCR-based SNP markers should be useful in discriminating CMS-D8 and AD1 cytoplasms, or those with CMS-D2 cytoplasm as a rapid, simple, inexpensive, and
Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu
The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned by new primers which were designed based on the conservative region of known sequences of orchid LEAFY gene. Partial LFY homologous gene was cloned by common PCR, then we got the complete LFY homologous gene Den LFY by Tail-PCR. The complete sequence of DenLFY gene was 3 575 bp which contained three exons and two introns. Using BLAST method, comparison analysis among the exon of LFY homologous gene indicted that the DenLFY gene had high identity with orchids LFY homologous, including the related fragment of PhalLFY (84%) in Phalaenopsis hybrid cultivar, LFY homologous gene in Oncidium (90%) and in other orchid (over 80%). Using MP analysis, Dendrobium is found to be the sister to Oncidium and Phalaenopsis. Homologous analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were separately considered, exons and the sequence of amino acid were good markers for the function research of DenLFY gene. The second intron can be used in authentication research of Dendrobium based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.
Full Text Available Abstract Background Plasmodium falciparum histidine-rich protein PFHRP2 measurement is used widely for diagnosis, and more recently for severity assessment in falciparum malaria. The Pfhrp2 gene is highly polymorphic, with deletion of the entire gene reported in both laboratory and field isolates. These issues potentially confound the interpretation of PFHRP2 measurements. Methods Studies designed to detect deletion of Pfhrp2 and its paralog Pfhrp3 were undertaken with samples from patients in seven countries contributing to the largest hospital-based severe malaria trial (AQUAMAT. The quantitative relationship between sequence polymorphism and PFHRP2 plasma concentration was examined in samples from selected sites in Mozambique and Tanzania. Results There was no evidence for deletion of either Pfhrp2 or Pfhrp3 in the 77 samples with lowest PFHRP2 plasma concentrations across the seven countries. Pfhrp2 sequence diversity was very high with no haplotypes shared among 66 samples sequenced. There was no correlation between Pfhrp2 sequence length or repeat type and PFHRP2 plasma concentration. Conclusions These findings indicate that sequence polymorphism is not a significant cause of variation in PFHRP2 concentration in plasma samples from African children. This justifies the further development of plasma PFHRP2 concentration as a method for assessing African children who may have severe falciparum malaria. The data also add to the existing evidence base supporting the use of rapid diagnostic tests based on PFHRP2 detection.
Inês C Conceição
Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes. CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1 the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2 the high
Full Text Available BACKGROUND: Horizontal gene transfer (HGT is recognized as one of the major forces for bacterial genome evolution. Many clinically important bacteria may acquire virulence factors and antibiotic resistance through HGT. The comparative genomic analysis has become an important tool for identifying HGT in emerging pathogens. In this study, the Serine-Aspartate Repeat (Sdr family has been compared among different sources of Staphylococcus aureus (S. aureus to discover sequence diversities within their genomes. METHODOLOGY/PRINCIPAL FINDINGS: Four sdr genes were analyzed for 21 different S. aureus strains and 218 mastitis-associated S. aureus isolates from Canada. Comparative genomic analyses revealed that S. aureus strains from bovine mastitis (RF122 and mastitis isolates in this study, ovine mastitis (ED133, pig (ST398, chicken (ED98, and human methicillin-resistant S. aureus (MRSA (TCH130, MRSA252, Mu3, Mu50, N315, 04-02981, JH1 and JH9 were highly associated with one another, presumably due to HGT. In addition, several types of insertion and deletion were found in sdr genes of many isolates. A new insertion sequence was found in mastitis isolates, which was presumably responsible for the HGT of sdrC gene among different strains. Moreover, the sdr genes could be used to type S. aureus. Regional difference of sdr genes distribution was also indicated among the tested S. aureus isolates. Finally, certain associations were found between sdr genes and subclinical or clinical mastitis isolates. CONCLUSIONS: Certain sdr gene sequences were shared in S. aureus strains and isolates from different species presumably due to HGT. Our results also suggest that the distributional assay of virulence factors should detect the full sequences or full functional regions of these factors. The traditional assay using short conserved regions may not be accurate or credible. These findings have important implications with regard to animal husbandry practices that may
Full Text Available Dyslexia is a heritable neurodevelopmental disorder characterized by difficulties in reading and writing. In this study, we describe the identification of a set of 17 polymorphisms located across 1.9 Mb region on chromosome 5q31.3, encompassing genes of the PCDHG cluster, TAF7, PCDH1 and ARHGAP26, dominantly inherited with dyslexia in a multi-incident family. Strikingly, the non-risk form of seven variations of the PCDHG cluster, are preponderant in the human lineage, while risk alleles are ancestral and conserved across Neanderthals to non-human primates. Four of these seven ancestral variations (c.460A > C [p.Ile154Leu], c.541G > A [p.Ala181Thr], c.2036G > C [p.Arg679Pro] and c.2059A > G [p.Lys687Glu] result in amino acid alterations. p.Ile154Leu and p.Ala181Thr are present at EC2: EC3 interacting interface of γA3-PCDH and γA4-PCDH respectively might affect trans-homophilic interaction and hence neuronal connectivity. p.Arg679Pro and p.Lys687Glu are present within the linker region connecting trans-membrane to extracellular domain. Sequence analysis indicated the importance of p.Ile154, p.Arg679 and p.Lys687 in maintaining class specificity. Thus the observed association of PCDHG genes encoding neural adhesion proteins reinforces the hypothesis of aberrant neuronal connectivity in the pathophysiology of dyslexia. Additionally, the striking conservation of the identified variants indicates a role of PCDHG in the evolution of highly specialized cognitive skills critical to reading.
Saville Barry J
Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521 and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs; among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database, while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping. Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1 substantial sequence information has been provided for U. maydis genome annotation; 2 new genes were identified through the discovery of 619
Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay
Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519
Hunter Gary R
Full Text Available Abstract Background The objective of the present study was to map candidate loci influencing naturally occurring variation in triacylglycerol (TAG storage using quantitative complementation procedures in Drosophila melanogaster. Based on our results from Drosophila, we performed a human population-based association study to investigate the effect of natural variation in LAMA5 gene on body composition in humans. Results We identified four candidate genes that contributed to differences in TAG storage between two strains of D. melanogaster, including Laminin A (LanA, which is a member of the α subfamily of laminin chains. We confirmed the effects of this gene using a viable LanA mutant and showed that female flies homozygous for the mutation had significantly lower TAG storage, body weight, and total protein content than control flies. Drosophila LanA is closely related to human LAMA5 gene, which maps to the well-replicated obesity-linkage region on chromosome 20q13.2-q13.3. We tested for association between three common single nucleotide polymorphisms (SNPs in the human LAMA5 gene and variation in body composition and lipid profile traits in a cohort of unrelated women of European American (EA and African American (AA descent. In both ethnic groups, we found that SNP rs659822 was associated with weight (EA: P = 0.008; AA: P = 0.05 and lean mass (EA: P= 0.003; AA: P = 0.03. We also found this SNP to be associated with height (P = 0.01, total fat mass (P = 0.01, and HDL-cholesterol (P = 0.003 but only in EA women. Finally, significant associations of SNP rs944895 with serum TAG levels (P = 0.02 and HDL-cholesterol (P = 0.03 were observed in AA women. Conclusion Our results suggest an evolutionarily conserved role of a member of the laminin gene family in contributing to variation in weight and body composition.
single D allele in the genotype enhanced the activity up to 37⋅56 ± 3⋅13%. The results suggested ethnic .... as hypertension, cardiovascular disease, diabetes and neph- ritis. ... Clarkson P 1998 Human gene for physical performance;.
Chen, Ye; Lv, Cheng; Li, Fangting; Li, Tiejun
Background Stochastic genetic switching driven by intrinsic noise is an important process in gene expression. When the rates of gene activation/inactivation are relatively slow, fast, or medium compared with the synthesis/degradation rates of mRNAs and proteins, the variability of protein and mRNA levels may exhibit very different dynamical patterns. It is desirable to provide a systematic approach to identify their key dynamical features in different regimes, aiming at distinguishing which r...
Hines Heather M
Full Text Available Abstract Background Heliconius butterfly wing pattern diversity offers a unique opportunity to investigate how natural genetic variation can drive the evolution of complex adaptive phenotypes. Positional cloning and candidate gene studies have identified a handful of regulatory and pigmentation genes implicated in Heliconius wing pattern variation, but little is known about the greater developmental networks within which these genes interact to pattern a wing. Here we took a large-scale transcriptomic approach to identify the network of genes involved in Heliconius wing pattern development and variation. This included applying over 140 transcriptome microarrays to assay gene expression in dissected wing pattern elements across a range of developmental stages and wing pattern morphs of Heliconius erato. Results We identified a number of putative early prepattern genes with color-pattern related expression domains. We also identified 51 genes differentially expressed in association with natural color pattern variation. Of these, the previously identified color pattern “switch gene” optix was recovered as the first transcript to show color-specific differential expression. Most differentially expressed genes were transcribed late in pupal development and have roles in cuticle formation or pigment synthesis. These include previously undescribed transporter genes associated with ommochrome pigmentation. Furthermore, we observed upregulation of melanin-repressing genes such as ebony and Dat1 in non-melanic patterns. Conclusions This study identifies many new genes implicated in butterfly wing pattern development and provides a glimpse into the number and types of genes affected by variation in genes that drive color pattern evolution.
Marie-Claude N. Laffitte
Full Text Available Leishmania has a plastic genome, and drug pressure can select for gene copy number variation (CNV. CNVs can apply either to whole chromosomes, leading to aneuploidy, or to specific genomic regions. For the latter, the amplification of chromosomal regions occurs at the level of homologous direct or inverted repeated sequences leading to extrachromosomal circular or linear amplified DNAs. This ability of Leishmania to respond to drug pressure by CNVs has led to the development of genomic screens such as Cos-Seq, which has the potential of expediting the discovery of drug targets for novel promising drug candidates.
Casartelli, Nicoletta; Di Matteo, Gigliola; Argentini, Claudio; Cancrini, Caterina; Bernardi, Stefania; Castelli, Guido; Scarlatti, Gabriella; Plebani, Anna; Rossi, Paolo; Doria, Margherita
Evaluation of sequence evolution as well as structural defects and mutations of the human immunodeficiency virus-type 1 (HIV-1) nef gene in relation to disease progression in infected children. We examined a large number of nef alleles sequentially derived from perinatally HIV-1-infected children with different rates of disease progression: six non-progressors (NPs), four rapid progressors (RPs), and three slow progressors (SPs). Nef alleles (182 total) were isolated from patients' peripheral blood mononuclear cells (PBMCs), sequenced and analysed for their evolutionary pattern, frequency of mutations and occurrence of amino acid variations associated with different stages of disease. The evolution rate of the nef gene apparently correlated with CD4+ decline in all progression groups. Evidence for rapid viral turnover and positive selection for changes were found only in two SPs and two RPs respectively. In NPs, a higher proportion of disrupted sequences and mutations at various functional motifs were observed. Furthermore, NP-derived Nef proteins were often changed at residues localized in the folded core domain at cytotoxic T lymphocytes (CTL) epitopes (E(105), K(106), E(110), Y(132), K(164), and R(200)), while other residues outside the core domain are more often changed in RPs (A(43)) and SPs (N(173) and Y(214)). Our results suggest a link between nef gene functions and the progression rate in HIV-1-infected children. Moreover, non-progressor-associated variations in the core domain of Nef, together with the genetic analysis, suggest that nef gene evolution is shaped by an effective immune system in these patients.
Jabbi, M; Chen, Q; Turner, N; Kohn, P; White, M; Kippenhan, J S; Dickinson, D; Kolachana, B; Mattay, V; Weinberger, D R; Berman, K F
Characterizing the molecular mechanisms underlying the heritability of complex behavioral traits such as human anxiety remains a challenging endeavor for behavioral neuroscience. Copy-number variation (CNV) in the general transcription factor gene, GTF2I, located in the 7q11.23 chromosomal region that is hemideleted in Williams syndrome and duplicated in the 7q11.23 duplication syndrome (Dup7), is associated with gene-dose-dependent anxiety in mouse models and in both Williams syndrome and Dup7. Because of this recent preclinical and clinical identification of a genetic influence on anxiety, we examined whether sequence variation in GTF2I, specifically the single-nucleotide polymorphism rs2527367, interacts with trait and state anxiety to collectively impact neural response to anxiety-laden social stimuli. Two hundred and sixty healthy adults completed the Tridimensional Personality Questionnaire Harm Avoidance (HA) subscale, a trait measure of anxiety proneness, and underwent functional magnetic resonance imaging (fMRI) while matching aversive (fearful or angry) facial identity. We found an interaction between GTF2I allelic variations and HA that affects brain response: in individuals homozygous for the major allele, there was no correlation between HA and whole-brain response to aversive cues, whereas in heterozygotes and individuals homozygous for the minor allele, there was a positive correlation between HA sub-scores and a selective dorsolateral prefrontal cortex (DLPFC) responsivity during the processing of aversive stimuli. These results demonstrate that sequence variation in the GTF2I gene influences the relationship between trait anxiety and brain response to aversive social cues in healthy individuals, supporting a role for this neurogenetic mechanism in anxiety.
Wimmer, Katharina; Wernstedt, Annekatrin
The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.
Narooie-Nejad, Mehrnaz; Moossavi, Maryam; Torkamanzehi, Adam; Moghtaderi, Ali
Among the factors postulated to play a role in MS susceptibility, the role of vitamin D is outstanding. Since the function of vitamin D receptor (VDR) represents the effect of vitamin D on the body and genetic variations in VDR gene may affect its function, we aim to highlight the association of two VDR gene polymorphisms with MS susceptibility. In current study, we recruited 113 MS patients and 122 healthy controls. TaqI (rs731236) and ApaI (rs7975232) genetic variations in these two groups were evaluated using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) technique. All genotype and allele frequencies in both variations showed association with the disease status. However, to find the definite connection between genetic variations in VDR gene and MS disease in a population of South East of Iran, more researches on gene structure and its function with regard to patients' conditions are required.
Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying
A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16 S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
Upton, C.; McFadden, G.
The thymidine kinase (TK) gene of Shope fibroma virus (SFV), a tumorigenic leporipoxvirus, was localized within the viral genome with degenerate oligonucleotide probes. These probes were constructed to two regions of high sequence conservation between the vaccinia virus TK gene and those of several known eucaryotic cellular TK genes, including human, mouse, hamster, and chicken TK genes. The oligonucleotide probes initially localized the SFV TK gene 50 kilobases (kb) from the right terminus of the 160-kb SFV genome within the 9.5-kb BamHI-HindIII fragment E. Fine-mapping analysis indicated that the TK Gene was within a 1.2-kb AvaI-HaeIII fragment, and DNA sequencing of this region revealed an open reading frame capable of encoding a polypeptide of 187 amino acids possessing considerable homology to the TK genes of the vaccinia, variola, and monkeypox orthopoxviruses and also to a variety of cellular TK genes. Homology matrix analysis and homology scores suggest that the SFV TK gene has diverged significantly from its counterpart members in the orthopoxvirus genus. Nevertheless, the presence of conserved upstream open reading frames on the 5' side of all of the poxvirus TK genes indicates a similarity of functional organization between the orthopoxviruses and leporipoxviruses. These data suggest a common ancestral origin for at least some of the unique internal regions of the leporipoxviruses and orthopoxviruses as exemplified by SFV and vaccinia virus, respectively
Siew, Ging Yang; Ng, Wei Lun; Tan, Sheau Wei; Alitheen, Noorjahan Banu; Tan, Soon Guan; Yeap, Swee Keong
Durian ( Durio zibethinus ) is one of the most popular tropical fruits in Asia. To date, 126 durian types have been registered with the Department of Agriculture in Malaysia based on phenotypic characteristics. Classification based on morphology is convenient, easy, and fast but it suffers from phenotypic plasticity as a direct result of environmental factors and age. To overcome the limitation of morphological classification, there is a need to carry out genetic characterization of the various durian types. Such data is important for the evaluation and management of durian genetic resources in producing countries. In this study, simple sequence repeat (SSR) markers were used to study the genetic variation in 27 durian types from the germplasm collection of Universiti Putra Malaysia. Based on DNA sequences deposited in Genbank, seven pairs of primers were successfully designed to amplify SSR regions in the durian DNA samples. High levels of variation among the 27 durian types were observed (expected heterozygosity, H E = 0.35). The DNA fingerprinting power of SSR markers revealed by the combined probability of identity (PI) of all loci was 2.3×10 -3 . Unique DNA fingerprints were generated for 21 out of 27 durian types using five polymorphic SSR markers (the other two SSR markers were monomorphic). We further tested the utility of these markers by evaluating the clonal status of shared durian types from different germplasm collection sites, and found that some were not clones. The findings in this preliminary study not only shows the feasibility of using SSR markers for DNA fingerprinting of durian types, but also challenges the current classification of durian types, e.g., on whether the different types should be called "clones", "varieties", or "cultivars". Such matters have a direct impact on the regulation and management of durian genetic resources in the region.
Schulz, Tizian; Stoye, Jens; Doerr, Daniel
Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particular suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit
Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards-the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics
Andrey B. Shcherban
Full Text Available The TaHOX-1 gene of common wheat Triticum aestivum L. (BAD-genome encodes transcription factor (HD-Zip I which is characterized by the presence of a DNA-binding homeodomain (HD with an adjacent Leucine zipper (LZ motif. This gene can play a role in adapting plant to a variety of abiotic stresses, such as drought, cold, salinity etc., which strongly affect wheat production. However, it's both functional role in stress resistance and divergence during wheat evolution has not yet been elucidated. This data in brief article is associated with the research paper “Structural and functional divergence of homoeologous copies of the TaHOX-1 gene in polyploid wheats and their diploid ancestors”. The data set represents a recent survey of the primary HOX-1 gene sequences isolated from the first wheat allotetraploids (BA-genome and their corresponding Triticum and Aegilops diploid relatives. Specifically, we provide detailed information about the HOX-1 nucleotide sequences of the promoter region and both nucleotide and amino acid sequences of the gene. The sequencing data used here is available at DDBJ/EMBL/GenBank under the accession numbers MG000630-MG000698. Keywords: Wheat, Polyploid, HOX-1 gene, Homeodomain, Transcription factor, Promoter, Triticum, Aegilops
Diehn, Till A.; Pommerrenig, Benjamin; Bernhardt, Nadine; Hartmann, Anja; Bienert, Gerd P.
Aquaporins (AQPs) are essential channel proteins that regulate plant water homeostasis and the uptake and distribution of uncharged solutes such as metalloids, urea, ammonia, and carbon dioxide. Despite their importance as crop plants, little is known about AQP gene and protein function in cabbage (Brassica oleracea) and other Brassica species. The recent releases of the genome sequences of B. oleracea and Brassica rapa allow comparative genomic studies in these species to investigate the evolution and features of Brassica genes and proteins. In this study, we identified all AQP genes in B. oleracea by a genome-wide survey. In total, 67 genes of four plant AQP subfamilies were identified. Their full-length gene sequences and locations on chromosomes and scaffolds were manually curated. The identification of six additional full-length AQP sequences in the B. rapa genome added to the recently published AQP protein family of this species. A phylogenetic analysis of AQPs of Arabidopsis thaliana, B. oleracea, B. rapa allowed us to follow AQP evolution in closely related species and to systematically classify and (re-) name these isoforms. Thirty-three groups of AQP-orthologous genes were identified between B. oleracea and Arabidopsis and their expression was analyzed in different organs. The two selectivity filters, gene structure and coding sequences were highly conserved within each AQP subfamily while sequence variations in some introns and untranslated regions were frequent. These data suggest a similar substrate selectivity and function of Brassica AQPs compared to Arabidopsis orthologs. The comparative analyses of all AQP subfamilies in three Brassicaceae species give initial insights into AQP evolution in these taxa. Based on the genome-wide AQP identification in B. oleracea and the sequence analysis and reprocessing of Brassica AQP information, our dataset provides a sequence resource for further investigations of the physiological and molecular functions of
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
Richaud, F; Richaud, C; Ratet, P; Patte, J C
In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons.
Chen, Wenhai; Chen, Dandan; Zhao, Ming; Zou, Yangyun; Zeng, Yanwu; Gu, Xun
Though pleiotropy, which refers to the phenomenon of a gene affecting multiple traits, has long played a central role in genetics, development, and evolution, estimation of the number of pleiotropy components remains a hard mission to accomplish. In this paper, we report a newly developed software package, Genepleio, to estimate the effective gene pleiotropy from phylogenetic analysis of protein sequences. Since this estimate can be interpreted as the minimum pleiotropy of a gene, it is used to play a role of reference for many empirical pleiotropy measures. This work would facilitate our understanding of how gene pleiotropy affects the pattern of genotype-phenotype map and the consequence of organismal evolution.
Blomstrøm, Monica Marie
several growth modulators and invasion modulators were identified and independently validated. These candidates revealed a group of genes with metastasis-related functions in vitro that are involved in RNA-related processes, such as RNA-processing. Moreover, a general feature was that proliferation......) and non-CSCs. The main goal of this project was to functionally characterize a set of candidate genes recovered from next-generation sequencing analysis for their role in breast cancer metastasis formation. The starting gene set comprised 104 gene variants; i.e. 57 wildtype and 47 mutated variants. During...
Richaud, F; Richaud, C; Ratet, P; Patte, J C
In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amin...
Richaud, F; Richaud, C; Ratet, P; Patte, J C
In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons. Images PMID:3514578
Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.
The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3??? N-U-P-M-F-HN-L 5???, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.
Full Text Available Proteins secreted to the extracellular environment or to the periphery of the cell envelope, the secretome, play essential roles in foraging, antagonistic and mutualistic interactions. We hypothesize that arms races, genetic conflicts and varying selective pressures should lead to the rapid change of sequences and gene repertoires of the secretome. The analysis of 42 bacterial pan-genomes shows that secreted, and especially extracellular proteins, are predominantly encoded in the accessory genome, i.e. among genes not ubiquitous within the clade. Genes encoding outer membrane proteins might engage more frequently in intra-chromosomal gene conversion because they are more often in multi-genic families. The gene sequences encoding the secretome evolve faster than the rest of the genome and in particular at non-synonymous positions. Cell wall proteins in Firmicutes evolve particularly fast when compared with outer membrane proteins of Proteobacteria. Virulence factors are over-represented in the secretome, notably in outer membrane proteins, but cell localization explains more of the variance in substitution rates and gene repertoires than sequence homology to known virulence factors. Accordingly, the repertoires and sequences of the genes encoding the secretome change fast in the clades of obligatory and facultative pathogens and also in the clades of mutualists and free-living bacteria. Our study shows that cell localization shapes genome evolution. In agreement with our hypothesis, the repertoires and the sequences of genes encoding secreted proteins evolve fast. The particularly rapid change of extracellular proteins suggests that these public goods are key players in bacterial adaptation.
Chang, Lixian; Yuan, Weiping; Zeng, Huimin; Zhou, Quanquan; Wei, Wei; Zhou, Jianfeng; Li, Miaomiao; Wang, Xiaomin; Xu, Mingjiang; Yang, Fengchun; Yang, Yungui; Cheng, Tao; Zhu, Xiaofan
Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients' clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology.
Ou, Langbo; Varian-Ramos, Claire W; Cristol, Daniel A
Bird eggs are used widely as noninvasive bioindicators for environmental mercury availability. Previous studies, however, have found varying relationships between laying sequence and egg mercury concentrations. Some studies have reported that the mercury concentration was higher in first-laid eggs or declined across the laying sequence, whereas in other studies mercury concentration was not related to egg order. Approximately 300 eggs (61 clutches) were collected from captive zebra finches dosed throughout their reproductive lives with methylmercury (0.3 μg/g, 0.6 μg/g, 1.2 μg/g, or 2.4 μg/g wet wt in diet); the total mercury concentration (mean ± standard deviation [SD] dry wt basis) of their eggs was 7.03 ± 1.38 μg/g, 14.15 ± 2.52 μg/g, 26.85 ± 5.85 μg/g, and 49.76 ± 10.37 μg/g, respectively (equivalent to fresh wt egg mercury concentrations of 1.24 μg/g, 2.50 μg/g, 4.74 μg/g, and 8.79 μg/g). The authors observed a significant decrease in the mercury concentration of successive eggs when compared with the first egg and notable variation between clutches within treatments. The mercury level of individual females within and among treatments did not alter this relationship. Based on the results, sampling of a single egg in each clutch from any position in the laying sequence is sufficient for purposes of population risk assessment, but it is not recommended as a proxy for individual female exposure or as an estimate of average mercury level within the clutch. © 2015 SETAC.
Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V
Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Deen, K.C.; Sweet, R.W.
Sequences in the human genome with homology to the murine mammary tumor virus (MMTV) pol gene were isolated from a human phage library. Ten clones with extensive pol homology were shown to define five separate loci. These loci share common sequences immediately adjacent to the pol-like segments and, in addition, contain a related repeat element which bounds this region. This organization is suggestive of a proviral structure. The authors estimate that the human genome contains 30 to 40 copies of these pol-related sequences. The pol region of one of the cloned segments (HM16) and the complete MMTV pol gene were sequenced and compared. The nucleotide homology between these pol sequences is 52% and is concentrated in the terminal regions. The MMTV pol gene contains a single long open reading frame encoding 899 amino acids and is demarcated from the partially overlapping putative gag gene by termination codons and a shift in translational reading frame. The pol sequence of HM16 is multiply terminated but does contain open reading frames which encode 370, 105, and 112 amino acids residues in separate reading frames. The authors deduced a composite pol protein sequence for HM16 by aligning it to the MMTV pol gene and then compared these sequences with other retroviral pol protein sequences. Conserved sequences occur in both the amino and carboxyl regions which lie within the polymerase and endonuclease domains of pol, respectively
The 2011 Tohoku earthquake sequence consists of foreshocks, mainshock, aftershocks, and repeating earthquakes. To quantify spatial and temporal stress drop variations is important for understanding M9-class megathrust earthquakes. Variability and spatial and temporal pattern of stress drop is a basic information for rupture dynamics as well as useful to source modeling. As pointed in the ground motion prediction equations by Campbell and Bozorgnia [2008, Earthquake Spectra], mainshock-aftershock pairs often provide significant decrease of stress drop. We here focus strong motion records before and after the Tohoku earthquake, and analyze source spectral ratios considering azimuth- and distance dependency [Miyake et al., 2001, GRL]. Due to the limitation of station locations on land, spatial and temporal stress drop variations are estimated by adjusting shifts from the omega-squared source spectral model. The adjustment is based on the stochastic Green's function simulations of source spectra considering azimuth- and distance dependency. We assumed the same Green's functions for event pairs for each station, both the propagation path and site amplification effects are cancelled out. Precise studies of spatial and temporal stress drop variations have been performed [e.g., Allmann and Shearer, 2007, JGR], this study targets the relations between stress drop vs. progression of slow slip prior to the Tohoku earthquake by Kato et al. [2012, Science] and plate structures. Acknowledgement: This study is partly supported by ERI Joint Research (2013-B-05). We used the JMA unified earthquake catalogue and K-NET, KiK-net, and F-net data provided by NIED.
Lean, Choo Bee; Lee, Edmund Jon Deoon
MCT1(SLC16A1) is the first member of the monocarboxylate transporter (MCT) and its family is involved in the transportation of metabolically important monocarboxylates such as lactate, pyruvate, acetate and ketone bodies. This study identifies genetic variations in SLC16A1 in the ethnic Chinese group of the Singaporean population (n=95). The promoter, coding region and exon-intron junctions of the SLC16A1 gene encoding the MCT1 transporter were screened for genetic variation in the study population by DNA sequencing. Seven genetic variations of SLC16A1, including 4 novel ones, were found: 2 in the promoter region, 2 in the coding exons (both nonsynonymous variations), 2 in the 3' untranslated region (3'UTR) and 1 in the intron. Of the two mutations detected in the promoter region, the -363-855T>C is a novel mutation. The 1282G>A (Val(428)Ile) is a novel SNP and was found as heterozygotic in 4 subjects. The 1470T>A (Asp(490)Glu) was found to be a common polymorphism in this study. Lastly, IVS3-17A>C in intron 3 and 2258 (755)A>G in 3'UTR are novel mutations found to be common polymorphisms in the local Chinese population. To our knowledge, this is the first report of a comprehensive analysis on the MCT1 gene in any population.
D`Souza, T.M.; Boominathan, K.; Reddy, C.A. [Michigan State Univ., East Lansing, MI (United States)
Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequences of each of the PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.
Homberg, J.R.; Lesch, K.P.
Converging evidence indicates an association of the short (s), low-expressing variant of the repeat length polymorphism, serotonin transporter-linked polymorphic region (5-HTTLPR), in the human serotonin transporter gene (5-HTT, SERT, SLC6A4) with anxiety-related traits and increased risk for
K. Hui; L. Minamide; N. Prandoni; H. Festenstein; F.G. Grosveld (Frank)
textabstractK36.16 is an AKR H-2k thymoma which expresses an aberrant H-2Dd-like allospecificity, does not have a detectable amount of the H-2Kk syngeneic antigen and grows very easily in syngeneic mice. By DNA-mediated gene transfer experiments, we were able to obtain transformed clones which do
characteristics and possible biological function of porcine PRDM16 gene have been less reported. ... included the heart, liver, spleen, lung, kidney, brain, longissimus dorsi muscle, interior fat, stomach, small ... al., 2008), and the copy number of PRDM16 molecules of patients with osteosarcoma was .... BMC Cancer 4, 45.
Shiramizu, B; Hu, N; Frisque, R J; Nerurkar, V R
The oncogenic potential of human polyomavirus JC (JCV), a ubiquitous virus that establishes infection during early childhood in approximately 70% of the human population, is unclear. As a neurotropic virus, JCV has been implicated in pediatric central nervous system tumors and has been suggested to be a pathogenic agent in pediatric acute lymphoblastic leukemia. Recent studies have demonstrated JCV gene sequences in pediatric medulloblastomas and among patients with colorectal cancer. JCV early protein T-antigen (TAg) can form complexes with cellular regulatory proteins and thus may play a role in tumorigenesis. Since JCV is detected in B-lymphocytes, a retrospective analysis of pediatric B-cell and non-B-cell malignancies as well as other HIV-associated pediatric malignancies was conducted for the presence of JCV gene sequences. DNA was extracted from 49 pediatric malignancies, including Hodgkin disease, non-Hodgkin lymphoma, large cell lymphoma and sarcoma. Polymerase chain reaction (PCR) was conducted using JCV specific nested primer sets for the transcriptional control region (TCR), TAg, and viral capsid protein 1 (VP1) genes. Southern blot analysis and DNA sequencing were used to confirm specificity of the amplicons. A 215-bp region of the JCV VP1 gene was amplified from 26 (53%) pediatric tumor tissues. The JCV TCR and two JCV gene regions were amplified from a leiomyosarcoma specimen from an HIV-infected patient. The leiomyosarcoma specimen from the cecum harbored the archetype strain of JCV. Including the leiomyosarcoma specimen, three of five specimens sequenced were typed as JCV genotype 2. The failure to amplify JCV TCR, and TAg gene sequences in the presence of JCV VP1 gene sequence is surprising. Even though JCV TAg gene, which is similar to the SV40 TAg gene, is oncogenic in animal models, the presence of JCV gene sequences in pediatric malignancies does not prove causality. In light of the available data on the presence of JCV in normal and cancerous
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophillic archeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical Archaea promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W
Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.
The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
Zhang, Lei; Xue, Feng; Yan, Jie; Mao, Ya-fei; Li, Li-wei
To determine the frequency of mce gene in Leptospira interrogans, and to investigate the gene transcription levels of L. interrogans before and after infecting cells. The segments of entire mce genes from 13 L.interrogans strains and 1 L.biflexa strain were amplified by PCR and then sequenced after T-A cloning. A prokaryotic expression system of mce gene was constructed; the expression and output of the target recombinant protein rMce were examined by SDS-PAGE and Western Blot assay. Rabbits were intradermally immunized with rMce to prepare the antiserum, the titer of antiserum was measured by immunodiffusion test. The transcription levels of mce gene in L.interrogans serogroup Icterohaemorrhagiae serovar lai strain 56601 before and after infecting J774A.1 cells were monitored by real-time fluorescence quantitative RT-PCR. mce gene was carried in all tested L.interrogans strains, but not in L.biflexa serogroup Semaranga serovar patoc strain Patoc I. The similarities of nucleotide and putative amino acid sequences of the cloned mce genes to the reported sequences (GenBank accession No: NP712236) were 99.02%-100% and 97.91%-100%, respectively. The constructed prokaryotic expression system of mce gene expressed rMce and the output of rMce was about 5% of the total bacterial proteins. The antiserum against whole cell of L.interrogans strain 56601 efficiently recognized rMce. After infecting J774A.1 cells, transcription levels of the mce gene in L.interrogans strain 56601 were remarkably up-regulated. The constructed prokaryotic expression system of mce gene and the prepared antiserum against rMce provide useful tools for further study of the gene function.
Lee, Chao-Zong; Liou, Guey-Yuh; Yuan, Gwo-Fang
Aflatoxins are polyketide-derived secondary metabolites produced by Aspergillus parasiticus, Aspergillus flavus, Aspergillus nomius and a few other species. The toxic effects of aflatoxins have adverse consequences for human health and agricultural economics. The aflR gene, a regulatory gene for aflatoxin biosynthesis, encodes a protein containing a zinc-finger DNA-binding motif. Although Aspergillus oryzae and Aspergillus sojae, which are used in fermented foods and in ingredient manufacture, have no record of producing aflatoxin, they have been shown to possess an aflR gene. This study examined 34 strains of Aspergillus section Flavi. The aflR gene of 23 of these strains was successfully amplified and sequenced. No aflR PCR products were found in five A. sojae strains or six strains of A. oryzae. These PCR results suggested that the aflR gene is absent or significantly different in some A. sojae and A. oryzae strains. The sequenced aflR genes from the 23 positive strains had greater than 96.6 % similarity, which was particularly conserved in the zinc-finger DNA-binding domain. The aflR gene of A. sojae has two obvious characteristics: an extra CTCATG sequence fragment and a C to T transition that causes premature termination of AFLR protein synthesis. Differences between A. parasiticus/A. sojae and A. flavus/A. oryzae aflR genes were also identified. Some strains of A. flavus as well as A. flavus var. viridis, A. oryzae var. viridis and A. oryzae var. effuses have an A. oryzae-type aflR gene. For all strains with the A. oryzae-type aflR gene, there was no evidence of aflatoxin production. It is suggested that for safety reasons, the aflR gene could be examined to assess possible aflatoxin production by Aspergillus section Flavi strains.
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Full Text Available "nBackground: Acanthamoeba keratitis develops by pathogenic Acanthamoeba such as A. palestinensis. Indeed this species is one of the known causative agents of amoebic keratitis in Iran. Mannose Binding Protein (MBP is the main pathogenicity factors for developing this sight threatening disease. We aimed to characterize MBP gene in pathogenic Acanthamoeba isolates such as A. palestinensis."nMethods: This experimental research was performed in the School of Public Health, Tehran University of Medical Sciences, Tehran, Iran during 2007-2008. A. palestinensis was grown on 2% non-nutrient agar overlaid with Escherichia coli. DNA extraction was performed using phenol-chloroform method. PCR reaction and amplification were done using specific primer pairs of MBP. The amplified fragment were purified and sequenced. Finally, the obtained fragment was deposited in the gene data bank."nResults: A 900 bp PCR-product was recovered after PCR reaction. Sequence analysis of the purified PCR product revealed a gene with 943 nucleotides. Homology analysis of the obtained sequence showed 81% similarity with the available MBP gene in the gene data bank. The fragment was deposited in the gene data bank under accession number EU678895"nConclusion: MBP is known as the most important factor in Acanthamoeba pathogenesis cascade. Therefore, characterization of this gene can aid in developing better therapeutic agents and even immunization of high-risk people.
Merlin G. Butler
Full Text Available Classical autism or autistic disorder belongs to a group of genetically heterogeneous conditions known as Autism Spectrum Disorders (ASD. Heritability is estimated as high as 90% for ASD with a recently reported compilation of 629 clinically relevant candidate and known genes. We chose to undertake a descriptive next generation whole exome sequencing case study of 30 well-characterized Caucasian females with autism (average age, 7.7 ± 2.6 years; age range, 5 to 16 years from multiplex families. Genomic DNA was used for whole exome sequencing via paired-end next generation sequencing approach and X chromosome inactivation status. The list of putative disease causing genes was developed from primary selection criteria using machine learning-derived classification score and other predictive parameters (GERP2, PolyPhen2, and SIFT. We narrowed the variant list to 10 to 20 genes and screened for biological significance including neural development, function and known neurological disorders. Seventy-eight genes identified met selection criteria ranging from 1 to 9 filtered variants per female. Five females presented with functional variants of X-linked genes (IL1RAPL1, PIR, GABRQ, GPRASP2, SYTL4 with cadherin, protocadherin and ankyrin repeat gene families most commonly altered (e.g., CDH6, FAT2, PCDH8, CTNNA3, ANKRD11. Other genes related to neurogenesis and neuronal migration (e.g., SEMA3F, MIDN, were also identified.
Bonnet, Crystel; Grati, M'hamed; Marlin, Sandrine; Levilliers, Jacqueline; Hardelin, Jean-Pierre; Parodi, Marine; Niasme-Grare, Magali; Zelenika, Diana; Délépine, Marc; Feldmann, Delphine; Jonard, Laurence; El-Amraoui, Aziz; Weil, Dominique; Delobel, Bruno; Vincent, Christophe; Dollfus, Hélène; Eliot, Marie-Madeleine; David, Albert; Calais, Catherine; Vigneron, Jacqueline; Montaut-Verient, Bettina; Bonneau, Dominique; Dubin, Jacques; Thauvin, Christel; Duvillard, Alain; Francannet, Christine; Mom, Thierry; Lacombe, Didier; Duriez, Françoise; Drouin-Garraud, Valérie; Thuillier-Obstoy, Marie-Françoise; Sigaudy, Sabine; Frances, Anne-Marie; Collignon, Patrick; Challe, Georges; Couderc, Rémy; Lathrop, Mark; Sahel, José-Alain; Weissenbach, Jean; Petit, Christine; Denoyelle, Françoise
Usher syndrome (USH) combines sensorineural deafness with blindness. It is inherited in an autosomal recessive mode. Early diagnosis is critical for adapted educational and patient management choices, and for genetic counseling. To date, nine causative genes have been identified for the three clinical subtypes (USH1, USH2 and USH3). Current diagnostic strategies make use of a genotyping microarray that is based on the previously reported mutations. The purpose of this study was to design a more accurate molecular diagnosis tool. We sequenced the 366 coding exons and flanking regions of the nine known USH genes, in 54 USH patients (27 USH1, 21 USH2 and 6 USH3). Biallelic mutations were detected in 39 patients (72%) and monoallelic mutations in an additional 10 patients (18.5%). In addition to biallelic mutations in one of the USH genes, presumably pathogenic mutations in another USH gene were detected in seven patients (13%), and another patient carried monoallelic mutations in three different USH genes. Notably, none of the USH3 patients carried detectable mutations in the only known USH3 gene, whereas they all carried mutations in USH2 genes. Most importantly, the currently used microarray would have detected only 30 of the 81 different mutations that we found, of which 39 (48%) were novel. Based on these results, complete exon sequencing of the currently known USH genes stands as a definite improvement for molecular diagnosis of this disease, which is of utmost importance in the perspective of gene therapy.
Full Text Available Interrelationships among dinoflagellates in molecular phylogenies are largely unresolved, especially in the deepest branches. Ribosomal DNA (rDNA sequences provide phylogenetic signals only at the tips of the dinoflagellate tree. Two reasons for the poor resolution of deep dinoflagellate relationships using rDNA sequences are (1 most sites are relatively conserved and (2 there are different evolutionary rates among sites in different lineages. Therefore, alternative molecular markers are required to address the deeper phylogenetic relationships among dinoflagellates. Preliminary evidence indicates that the heat shock protein 90 gene (Hsp90 will provide an informative marker, mainly because this gene is relatively long and appears to have relatively uniform rates of evolution in different lineages.We more than doubled the previous dataset of Hsp90 sequences from dinoflagellates by generating additional sequences from 17 different species, representing seven different orders. In order to concatenate the Hsp90 data with rDNA sequences, we supplemented the Hsp90 sequences with three new SSU rDNA sequences and five new LSU rDNA sequences. The new Hsp90 sequences were generated, in part, from four additional heterotrophic dinoflagellates and the type species for six different genera. Molecular phylogenetic analyses resulted in a paraphyletic assemblage near the base of the dinoflagellate tree consisting of only athecate species. However, Noctiluca was never part of this assemblage and branched in a position that was nested within other lineages of dinokaryotes. The phylogenetic trees inferred from Hsp90 sequences were consistent with trees inferred from rDNA sequences in that the backbone of the dinoflagellate clade was largely unresolved.The sequence conservation in both Hsp90 and rDNA sequences and the poor resolution of the deepest nodes suggests that dinoflagellates reflect an explosive radiation in morphological diversity in their recent
Bøe Wolff, A S; Oftedal, B; Johansson, S; Bruland, O; Løvås, K; Meager, A; Pedersen, C; Husebye, E S; Knappskog, P M
Autoimmune Addison's disease (AAD) is often associated with other components in autoimmune polyendocrine syndromes (APS). Whereas APS I is caused by mutations in the AIRE gene, the susceptibility genes for AAD and APS II are unclear. In the present study, we investigated whether polymorphisms or copy number variations in the AIRE gene were associated with AAD and APS II. First, nine SNPs in the AIRE gene were analyzed in 311 patients with AAD and APS II and 521 healthy controls, identifying no associated risk. Second, in a subgroup of 25 of these patients, AIRE sequencing revealed three novel polymorphisms. Finally, the AIRE copy number was determined by duplex quantitative PCR in 14 patients with APS I, 161 patients with AAD and APS II and in 39 healthy subjects. In two Scandinavian APS I patients previously reported to be homozygous for common AIRE mutations, we identified large deletions of the AIRE gene covering at least exon 2 to exon 8. We conclude that polymorphisms in the AIRE gene are not associated with AAD and APS II. We further suggest that DNA analysis of the parents of patients found to be homozygous for mutations in AIRE, always should be performed.
Basyuni, M.; Sagami, H.; Baba, S.; Oku, H.
It has been previously reported that dolichols but not polyprenols were predominated in mangrove leaves and roots. Therefore, the occurrence of larger amounts of dolichol in leaves of mangrove plants implies that polyprenol reductase is responsible for the conversion of polyprenol to dolichol may be active in mangrove leaves. Here we report the early assessment of probably polyprenol reductase gene from genome sequence of mangrove plant Kandelia obovata. The functional assignment of the gene was based on a homology search of the sequences against the non-redundant (nr) peptide database of NCBI using Blastx. The degree of sequence identity between DNA sequence and known polyprenol reductase was confirmed using the Blastx probability E-value, total score, and identity. The genome sequence data resulted in three partial sequences, termed c23157 (700 bp), c23901 (960 bp), and c24171 (531 bp). The c23157 gene showed the highest similarity (61%) to predicted polyprenol reductase 2- like from Gossypium raimondii with E-value 2e-100. The second gene was c23901 to exhibit high similarity (78%) to the steroid 5-alpha-reductase Det2 from J. curcas with E-value 2e-140. Furthermore, the c24171 gene depicted highest similarity (79%) to the polyprenol reductase 2 isoform X1 from Jatropha curcas with E- value 7e-21.The present study suggested that the c23157, c23901, and c24171, genes may encode predicted polyprenol reductase. The c23157, c23901, c24171 are therefore the new type of predicted polyprenol reductase from K. obovata.
Studying natural variation of rice resistance (R) genes in cultivated and wild rice relatives can predict resistance stability to rice blast fungus. In the present study, the protein coding regions of rice R gene Pi-d2 in 35 rice accessions of subgroups, aus (AUS), indica (IND), temperate japonica (...
Jim, Heather S L; Lin, Hui-Yi; Tyrer, Jonathan P
Disruption in circadian gene expression, whether due to genetic variation or environmental factors (e.g., light at night, shiftwork), is associated with increased incidence of breast, prostate, gastrointestinal and hematologic cancers and gliomas. Circadian genes are highly expressed in the ovari...
Full Text Available Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce the number of frame-shifts (FS. Genes encoding for alpha subunits of biphenyl (bphA and benzoate (benA dioxygenases were used as model sequences. FrameBot, a FS correction tool, was able to reduce the number of detected FS to zero. However, up to 43.1% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis as common chimera identifying platforms and is not dependent on reference sequences. By nature of FrameBot de novo design, it is crucial to provide it with data as error free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of Maximum Expected Error (MEE filtering and single linkage pre-clustering (SLP proved the most efficient read procession. Applying FrameBot de novo on the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script which performs FrameBot de novo is presented in the supplementary material to the study and the tool was implemented into FunGene Pipeline available at http://fungene.cme.msu.edu/FunGenePipeline/ and https://github.com/rdpstaff/Framebot.
Taira, M.; Yoshida, T.; Miyagawa, K.; Sakamoto, H.; Terada, M.; Sugimura, T.
The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A) + RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity
He, Y L; Wu, Y H; Quan, F S; Liu, Y G; Zhang, Y
To better understand the function of the myostatin gene and its promoter region in bovine, we amplified and sequenced the myostatin gene and promoter from the blood of Qinchuan and Red Angus cattle by using polymerase chain reaction. The sequences of Qinchuan and Red Angus cattle were compared with those of other cattle breeds available in GenBank. Exon splice sites were confirmed by mRNA sequencing. Compared to the published sequence (GenBank accession No. AF320998), 69 single nucleotide polymorphisms (SNPs) were identified in the Qinchuan myostatin gene, only one of which was an insertion mutation in Qinchuan cattle. There was a 16-bp insertion in the first 705-bp intron in 3 Qinchuan cattle. A total of 7 SNPs were identified in exon 3, in which the mutation occurred in the third base of the codon and was synonymous. On comparing the Qinchuan myostatin gene sequence to that of Red Angus cattle, a total of 50 SNPs were identified in the first and third exons. In addition, there were 18 SNPs identified in the Qinchuan cattle promoter region compared with those of other cattle compared to the Red Angus cattle myostatin promoter region. breeds (GenBank accession No. AF348479), but only 14 SNPs when compared to the Red Angus cattle myostatin promoter region.
Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.
To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C/sub eta1/) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of ..cap alpha../sub 1/-antitrypsin and ..beta..- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10/sup -9/ substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
Filges, Isabel; Friedman, Jan M
Massively parallel sequencing has revolutionized our understanding of Mendelian disorders, and many novel genes have been discovered to cause disease phenotypes when mutant. At the same time, next-generation sequencing approaches have enabled non-invasive prenatal testing of free fetal DNA in maternal blood. However, little attention has been paid to using whole exome and genome sequencing strategies for gene identification in fetal disorders that are lethal in utero, because they can appear to be sporadic and Mendelian inheritance may be missed. We present challenges and advantages of applying next-generation sequencing approaches to gene discovery in fetal malformation phenotypes and review recent successful discovery approaches. We discuss the implication and significance of recessive inheritance and cross-species phenotyping in fetal lethal conditions. Whole exome sequencing can be used in individual families with undiagnosed lethal congenital anomaly syndromes to discover causal mutations, provided that prior to data analysis, the fetal phenotype can be correlated to a particular developmental pathway in embryogenesis. Cross-species phenotyping allows providing further evidence for causality of discovered variants in genes involved in those extremely rare phenotypes and will increase our knowledge about normal and abnormal human developmental processes. Ultimately, families will benefit from the option of early prenatal diagnosis. © 2014 John Wiley & Sons, Ltd.
Full Text Available It is known that the fragrant trait in rice (Oryza sativa L. is largely controlled by fgr gene on chromosome 8 and it has been specified that the existence of an 8 bp deletion and three single nucleotide polymorphism (SN