WorldWideScience

Sample records for protein-encoding gene based

  1. The presence of two S-layer-protein-encoding genes is conserved among species related to Lactobacillus acidophilus

    NARCIS (Netherlands)

    Boot, H.J.; Kolen, C.P.A.M.; Pot, B.; Kersters, K.; Pouwels, P.H.

    1996-01-01

    Previously we have shown that the type strain of Lactobacillus acidophilus possesses two S-protein-encoding genes, one of which is silent, on a chromosomal segment of 6 kb. The S-protein-encoding gene in the expression site can be exchanged for the silent S-protein-encoding gene by inversion of this

  2. Putative recombination events and evolutionary history of five economically important viruses of fruit trees based on coat protein-encoding gene sequence analysis.

    Science.gov (United States)

    Boulila, Moncef

    2010-06-01

    To enhance the knowledge of recombination as an evolutionary process, 267 accessions retrieved from GenBank were investigated, all belonging to five economically important viruses infecting fruit crops (Plum pox, Apple chlorotic leaf spot, Apple mosaic, Prune dwarf, and Prunus necrotic ringspot viruses). Putative recombinational events were detected in the coat protein (CP)-encoding gene using RECCO and RDP version 3.31beta algorithms. Based on RECCO results, all five viruses were shown to contain potential recombination signals in the CP gene. Reconstructed trees with modified topologies were proposed. Furthermore, RECCO performed better than the RDP package in detecting recombination events and exhibiting their evolution rate along the sequences of the five viruses. RDP, however, provided the possible major and minor parents of the recombinants. Thus, the two methods should be considered complementary.

  3. RNA-Seq-based analysis of cold shock response in Thermoanaerobacter tengcongensis, a bacterium harboring a single cold shock protein encoding gene.

    Directory of Open Access Journals (Sweden)

    Bo Liu

    BACKGROUND: Although cold shock responses and the roles of cold shock proteins in microorganisms containing multiple cold shock protein genes have been well characterized, related studies on bacteria possessing a single cold shock protein gene have not been reported. Thermoanaerobacter tengcongensis MB4, a thermophile harboring only one known cold shock protein gene (TtecspC), can survive from 50 °C to 80 °C, but has poor natural competence under cold shock at 50 °C. We therefore examined cold shock responses and their effect on natural competence in this bacterium. RESULTS: The transcriptomes of T. tengcongensis before and after cold shock were analyzed by RNA-seq and over 1200 differentially expressed genes were successfully identified. These genes were involved in a wide range of biological processes, including modulation of DNA replication, recombination, and repair; energy metabolism; production of cold shock protein; synthesis of branched amino acids and branched-chain fatty acids; and sporulation. RNA-seq analysis also suggested that T. tengcongensis initiates cell wall and membrane remodeling processes, flagellar assembly, and sporulation in response to low temperature. Expression profiles of TtecspC and failed attempts to produce a TtecspC knockout strain confirmed the essential role of TteCspC in the cold shock response, and also suggested a role of this protein in survival at optimum growth temperature. Repression of genes encoding ComEA and ComEC and low energy metabolism levels in cold-shocked cells are the likely basis of poor natural competence at low temperature. CONCLUSION: Our study demonstrated changes in global gene expression under cold shock and identified several candidate genes related to cold shock in T. tengcongensis. At the same time, the relationship between cold shock response and poor natural competence at low temperature was preliminarily elucidated. These findings provide a foundation for future studies on genetic

  4. Molecular evolution of the Paramyxoviridae and Rhabdoviridae multiple-protein-encoding P gene.

    Science.gov (United States)

    Jordan, I K; Sutter, B A; McClure, M A

    2000-01-01

    Presented here is an analysis of the molecular evolutionary dynamics of the P gene among 76 representative sequences of the Paramyxoviridae and Rhabdoviridae RNA virus families. In a number of Paramyxoviridae taxa, as well as in vesicular stomatitis viruses of the Rhabdoviridae, the P gene encodes multiple proteins from a single genomic RNA sequence. These products include the phosphoprotein (P), as well as the C and V proteins. The complexity of the P gene makes it an intriguing locus to study from an evolutionary perspective. Amino acid sequence alignments of the proteins encoded at the P and N loci were used in independent phylogenetic reconstructions of the Paramyxoviridae and Rhabdoviridae families. P-gene-coding capacities were mapped onto the Paramyxoviridae phylogeny, and the most parsimonious path of multiple-coding-capacity evolution was determined. Levels of amino acid variation for Paramyxoviridae and Rhabdoviridae P-gene-encoded products were also analyzed. Proteins encoded in overlapping reading frames from the same nucleotides have different levels of amino acid variation. The nucleotide architecture that underlies the amino acid variation was determined in order to evaluate the role of selection in the evolution of the P gene overlapping reading frames. In every case, the evolution of one of the proteins encoded in the overlapping reading frames has been constrained by negative selection while the other has evolved more rapidly. The integrity of the overlapping reading frame that represents a derived state is generally maintained at the expense of the ancestral reading frame encoded by the same nucleotides. The evolution of such multicoding sequences is likely a response by RNA viruses to selective pressure to maximize genomic information content while maintaining small genome size. The ability to evolve such a complex genomic strategy is intimately related to the dynamics of the viral quasispecies, which allow enhanced exploration of the adaptive

  5. Cellulolytic (cel) genes of Clostridium thermocellum F7 and the proteins encoded by them

    International Nuclear Information System (INIS)

    Piruzyan, E.S.; Mogutov, M.A.; Velikodvorskaya, G.A.; Pushkarskaya, T.A.

    1988-01-01

    This study is concerned with genes cel1, cel2, and cel3 encoding the endoglucanase of the cellulolytic complex of the anaerobic thermophilic bacterium Clostridium thermocellum F7, these genes having been cloned by the authors earlier. The authors present the characteristics of proteins synthesized by the cel genes in the minicell system of the strain Escherichia coli K-12 X925. The molecular weights of the proteins encoded by genes cel1, cel2, and cel3 are 30,000, 45,000, and 50,000 daltons, respectively. The study of the homology of the cloned section of the C. thermocellum DNA containing the endoglucanase genes, using Southern's blot-hybridization method, did not reveal their physical linkage in the genome. The authors detected a plasmid with a size of about 30 kb in the cells of the C. thermocellum F7 strain investigated

  6. Relationships between protein-encoding gene abundance and corresponding process are commonly assumed yet rarely observed

    Science.gov (United States)

    Rocca, Jennifer D.; Hall, Edward K.; Lennon, Jay T.; Evans, Sarah E.; Waldrop, Mark P.; Cotner, James B.; Nemergut, Diana R.; Graham, Emily B.; Wallenstein, Matthew D.

    2015-01-01

    For any enzyme-catalyzed reaction to occur, the corresponding protein-encoding genes and transcripts are necessary prerequisites. Thus, a positive relationship between the abundance of gene or transcripts and corresponding process rates is often assumed. To test this assumption, we conducted a meta-analysis of the relationships between gene and/or transcript abundances and corresponding process rates. We identified 415 studies that quantified the abundance of genes or transcripts for enzymes involved in carbon or nitrogen cycling. However, in only 59 of these manuscripts did the authors report both gene or transcript abundance and rates of the appropriate process. We found that within studies there was a significant but weak positive relationship between gene abundance and the corresponding process. Correlations were not strengthened by accounting for habitat type, differences among genes or reaction products versus reactants, suggesting that other ecological and methodological factors may affect the strength of this relationship. Our findings highlight the need for fundamental research on the factors that control transcription, translation and enzyme function in natural systems to better link genomic and transcriptomic data to ecosystem processes.
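    The kind of gene-abundance versus process-rate relationship tested in the record above can be illustrated with a single within-study correlation. The following is only a minimal sketch: the abstract does not state which effect-size statistic the meta-analysis used, so Pearson's r is an assumption here, and the function name `pearson_r` and all numeric values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical values for one study: gene copies per gram of soil
    # and the measured rate of the corresponding process.
    gene_abundance = [1.2e6, 3.4e6, 2.1e6, 5.0e6, 4.2e6]
    process_rate = [0.8, 1.1, 1.4, 1.3, 0.9]
    print(f"r = {pearson_r(gene_abundance, process_rate):.3f}")
```

    A meta-analysis would combine many such per-study coefficients; how they were weighted and pooled is not described in the abstract and is not modeled here.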

  7. [Cloning, mutagenesis and symbiotic phenotype of three lipid transfer protein encoding genes from Mesorhizobium huakuii 7653R].

    Science.gov (United States)

    Li, Yanan; Zeng, Xiaobo; Zhou, Xuejuan; Li, Youguo

    2016-12-04

    The lipid transfer protein superfamily is involved in lipid transport and metabolism. This study aimed to construct mutants of three lipid transfer protein encoding genes in Mesorhizobium huakuii 7653R, and to study the phenotypes and function of mutations during symbiosis with Astragalus sinicus. We used bioinformatics to predict structure characteristics and biological functions of lipid transfer proteins, and conducted semi-quantitative and fluorescent quantitative real-time PCR to analyze the expression levels of target genes in free-living and symbiotic conditions. Using pK19mob insertion mutagenesis to construct mutants, we carried out pot plant experiments to observe symbiotic phenotypes. MCHK-5577, MCHK-2172 and MCHK-2779 genes encoding proteins belonged to START/RHO alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) superfamily, involved in lipid transport or metabolism, and were identical to M. loti at 95% level. The relative transcription levels of all three genes increased compared with the free-living condition. We obtained three mutants. Compared with wild-type 7653R, above-ground biomass of plants and nodule nitrogenase activity induced by the three mutants significantly decreased. Results indicated that lipid transfer protein encoding genes of Mesorhizobium huakuii 7653R may play important roles in symbiotic nitrogen fixation, and the mutations significantly affected the symbiotic phenotypes. The present work provided a basis to study further symbiotic function mechanism associated with lipid transfer proteins from rhizobia.

  8. Identification of human microRNA-like sequences embedded within the protein-encoding genes of the human immunodeficiency virus.

    Directory of Open Access Journals (Sweden)

    Bryan Holland

    BACKGROUND: MicroRNAs (miRNAs) are highly conserved, short (18-22 nts), non-coding RNA molecules that regulate gene expression by binding to the 3' untranslated regions (3'UTRs) of mRNAs. While numerous cellular microRNAs have been associated with the progression of various diseases including cancer, miRNAs associated with retroviruses have not been well characterized. Herein we report identification of microRNA-like sequences in coding regions of several HIV-1 genomes. RESULTS: Based on our earlier proteomics and bioinformatics studies, we have identified 8 cellular miRNAs that are predicted to bind to the mRNAs of multiple proteins that are dysregulated during HIV-infection of CD4+ T-cells in vitro. In silico analysis of the full length and mature sequences of these 8 miRNAs and comparisons with all the genomic and subgenomic sequences of HIV-1 strains in global databases revealed that the first 18/18 sequences of the mature hsa-miR-195 sequence (including the short seed sequence), matched perfectly (100%), or with one nucleotide mismatch, within the envelope (env) genes of five HIV-1 genomes from Africa. In addition, we have identified 4 other miRNA-like sequences (hsa-miR-30d, hsa-miR-30e, hsa-miR-374a and hsa-miR-424) within the env and the gag-pol encoding regions of several HIV-1 strains, albeit with reduced homology. Mapping of the miRNA-homologues of env within HIV-1 genomes localized these sequences to the functionally significant variable regions of the env glycoprotein gp120 designated V1, V2, V4 and V5. CONCLUSIONS: We conclude that microRNA-like sequences are embedded within the protein-encoding regions of several HIV-1 genomes. Given that the V1 to V5 regions of HIV-1 envelopes contain specific, well-characterized domains that are critical for immune responses, virus neutralization and disease progression, we propose that the newly discovered miRNA-like sequences within the HIV-1 genomes may have evolved to self-regulate survival of the

  9. Molecular comparison of the structural proteins encoding gene clusters of two related Lactobacillus delbrueckii bacteriophages.

    Science.gov (United States)

    Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T

    1993-01-01

    Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043

  10. Molecular adaptation within the coat protein-encoding gene of Tunisian almond isolates of Prunus necrotic ringspot virus.

    Science.gov (United States)

    Boulila, Moncef; Ben Tiba, Sawssen; Jilani, Saoussen

    2013-04-01

    The sequence alignments of five Tunisian isolates of Prunus necrotic ringspot virus (PNRSV) were searched for evidence of recombination and diversifying selection. Since failing to account for recombination can elevate the false positive error rate in positive selection inference, a genetic algorithm (GARD) was used first and led to the detection of potential recombination events in the coat protein-encoding gene of that virus. The Recco algorithm confirmed these results by identifying, additionally, the potential recombinants. For neutrality testing and evaluation of nucleotide polymorphism in the PNRSV CP gene, Tajima's D, and Fu and Li's D and F statistical tests were used. For selection inference, eight algorithms (SLAC, FEL, IFEL, REL, FUBAR, MEME, PARRIS, and GA branch) incorporated in the HyPhy package were utilized to assess the selection pressure exerted on the expression of PNRSV capsid. Inferred phylogenies pointed out, in addition to the three classical groups (PE-5, PV-32, and PV-96), the delineation of a fourth cluster having the new proposed designation SW6, and a fifth clade comprising four Tunisian PNRSV isolates which underwent recombination and selective pressure and to which the name Tunisian outgroup was allocated.

  11. Brain transcriptional stability upon prion protein-encoding gene invalidation in zygotic or adult mouse

    Directory of Open Access Journals (Sweden)

    Béringue Vincent

    2010-07-01

    Background: The physiological function of the prion protein remains largely elusive while its key role in prion infection has been expansively documented. To potentially assess this conundrum, we performed a comparative transcriptomic analysis of the brain of wild-type mice with that of transgenic mice invalidated at this locus either at the zygotic or at the adult stages. Results: Only subtle transcriptomic differences resulting from the Prnp knockout could be evidenced, beside Prnp itself, in the analyzed adult brains following microarray analysis of 24 109 mouse genes and QPCR assessment of some of the putatively marginally modulated loci. When performed at the adult stage, neuronal Prnp disruption appeared to sequentially induce a response to an oxidative stress and a remodeling of the nervous system. However, these events involved only a limited number of genes, expression levels of which were only slightly modified and not always confirmed by RT-qPCR. If not, the qPCR obtained data suggested even less pronounced differences. Conclusions: These results suggest that the physiological function of PrP is redundant at the adult stage or important for only a small subset of the brain cell population under classical breeding conditions. Following its early reported embryonic developmental regulation, this lack of response could also imply that PrP has a more detrimental role during mouse embryogenesis and that potential transient compensatory mechanisms have to be searched for at the time this locus becomes transcriptionally activated.

  12. Translational Control of Host Gene Expression by a Cys-Motif Protein Encoded in a Bracovirus.

    Directory of Open Access Journals (Sweden)

    Eunseong Kim

    Translational control is a strategy that various viruses use to manipulate their hosts to suppress acute antiviral response. Polydnaviruses, a group of insect double-stranded DNA viruses symbiotic to some endoparasitoid wasps, are divided into two genera: ichnovirus (IV) and bracovirus (BV). In IV, some Cys-motif genes are known as host translation-inhibitory factors (HTIF). The genome of endoparasitoid wasp Cotesia plutellae contains a Cys-motif gene (Cp-TSP13) homologous to an HTIF known as teratocyte-secretory protein 14 (TSP14) of Microplitis croceipes. Cp-TSP13 consists of 129 amino acid residues with a predicted molecular weight of 13.987 kDa and pI value of 7.928. Genomic DNA region encoding its open reading frame has three introns. Cp-TSP13 possesses six conserved cysteine residues, like other Cys-motif genes functioning as HTIF. Cp-TSP13 was expressed in Plutella xylostella larvae parasitized by C. plutellae. C. plutellae bracovirus (CpBV) was purified and injected into non-parasitized P. xylostella that expressed Cp-TSP13. Cp-TSP13 was cloned into a eukaryotic expression vector and used to infect Sf9 cells to transiently express Cp-TSP13. The synthesized Cp-TSP13 protein was detected in culture broth. An overlaying experiment showed that the purified Cp-TSP13 entered hemocytes. It was localized in the cytosol. Recombinant Cp-TSP13 significantly inhibited protein synthesis of secretory proteins when it was added to in vitro cultured fat body. In addition, the recombinant Cp-TSP13 directly inhibited the translation of fat body mRNAs in in vitro translation assay using rabbit reticulocyte lysate. Moreover, the recombinant Cp-TSP13 significantly suppressed cellular immune responses by inhibiting hemocyte-spreading behavior. It also exhibited significant insecticidal activities by both injection and feeding routes. These results indicate that Cp-TSP13 is a viral HTIF.

  13. Domestication process of the goat revealed by an analysis of the nearly complete mitochondrial protein-encoding genes.

    Directory of Open Access Journals (Sweden)

    Koh Nomura

    Goats (Capra hircus) are one of the oldest domesticated species, and they are kept all over the world as an essential resource for meat, milk, and fiber. Although recent archeological and molecular biological studies suggested that they originated in West Asia, their domestication processes such as the timing of population expansion and the dynamics of their selection pressures are little known. With the aim of addressing these issues, the nearly complete mitochondrial protein-encoding genes were determined from East, Southeast, and South Asian populations. Our coalescent time estimations suggest that the timing of their major population expansions was in the Late Pleistocene and significantly predates the beginning of their domestication in the Neolithic era (≈10,000 years ago). The ω (ratio of non-synonymous to synonymous substitution rates) for each lineage was also estimated. We found that the ω of the globally distributed haplogroup A, which is inherited by more than 90% of goats examined, turned out to be extremely low, suggesting that they are under severe selection pressure probably due to their large population size. Conversely, the ω of the Asian-specific haplogroup B inherited by about 5% of goats was relatively high. Although recent molecular studies suggest that domestication of animals may tend to relax selective constraints, the opposite pattern observed in our goat mitochondrial genome data indicates the process of domestication is more complex than may be presently appreciated and cannot be explained only by a simple relaxation model.
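    The ω statistic mentioned in the record above is the ratio of the non-synonymous substitution rate (dN) to the synonymous substitution rate (dS). As a minimal sketch of how such a ratio is formed and conventionally read, assuming dN and dS have already been estimated by some other tool: the function names, the tolerance, and the example rates below are hypothetical and are not taken from the study.

```python
def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio for one lineage."""
    if dn < 0:
        raise ValueError("dN must be non-negative")
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    return dn / ds


def interpret(w: float, tol: float = 0.05) -> str:
    """Conventional reading of omega; 'tol' is an arbitrary illustrative band."""
    if w < 1.0 - tol:
        return "purifying (negative) selection"
    if w > 1.0 + tol:
        return "diversifying (positive) selection"
    return "approximately neutral"


if __name__ == "__main__":
    # Hypothetical per-lineage rate estimates, for illustration only.
    lineages = {"lineage_1": (0.002, 0.040), "lineage_2": (0.010, 0.035)}
    for name, (dn, ds) in lineages.items():
        w = omega(dn, ds)
        print(f"{name}: omega = {w:.3f} -> {interpret(w)}")
```

    A low ω, as reported for haplogroup A, corresponds to the first branch of `interpret`; the threshold band is a simplification, since real analyses test ω against 1 statistically.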

  14. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yongyan [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Ai, Zhiying [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Yao, Kezhen [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Cao, Lixia; Du, Juan; Shi, Xiaoyan [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Guo, Zekun, E-mail: gzk@nwsuaf.edu.cn [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Zhang, Yong, E-mail: zhylab@hotmail.com [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China)

    2013-10-15

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR.

  15. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    International Nuclear Information System (INIS)

    Wu, Yongyan; Ai, Zhiying; Yao, Kezhen; Cao, Lixia; Du, Juan; Shi, Xiaoyan; Guo, Zekun; Zhang, Yong

    2013-01-01

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR

  16. Thermal response of rat fibroblasts stably transfected with the human 70-kDa heat shock protein-encoding gene

    International Nuclear Information System (INIS)

    Li, G.C.; Li, Ligeng; Liu, Yunkang; Mak, J.Y.; Chen, Lili; Lee, W.M.F.

    1991-01-01

    The major heat shock protein hsp70 is synthesized by cells of a wide variety of organisms in response to heat shock or other environmental stresses and is assumed to play an important role in protecting cells from thermal stress. The authors have tested this hypothesis directly by transfecting a constitutively expressed recombinant human hsp70-encoding gene into rat fibroblasts and examining the relationship between the levels of human hsp70 expressed and thermal resistance of the stably transfected rat cells. Successful transfection and expression of the gene for human hsp70 were characterized by RNA hybridization analysis, two-dimensional gel electrophoresis, and immunoblot analysis. When individual cloned cell lines were exposed to 45 °C and their thermal survivals were determined by colony-formation assay, they found that the expression of human hsp70 conferred heat resistance to the rat cells. These results reinforce the hypothesis that hsp70 has a protective function against thermal stress

  17. Phylogenetic analysis of fungal heterotrimeric G protein-encoding genes and their expression during dimorphism in Mucor circinelloides.

    Science.gov (United States)

    Valle-Maldonado, Marco Iván; Jácome-Galarza, Irvin Eduardo; Díaz-Pérez, Alma Laura; Martínez-Cadena, Guadalupe; Campos-García, Jesús; Ramírez-Díaz, Martha Isela; Reyes-De la Cruz, Homero; Riveros-Rosas, Héctor; Díaz-Pérez, César; Meza-Carmen, Víctor

    2015-12-01

    In fungi, heterotrimeric G proteins are key regulators of biological processes such as mating, virulence, morphology, among others. Mucor circinelloides is a model organism for many biological processes, and its genome contains the largest known repertoire of genes that encode putative heterotrimeric G protein subunits in the fungal kingdom: twelve Gα (McGpa1-12), three Gβ (McGpb1-3), and three Gγ (McGpg1-3). Phylogenetic analysis of fungal Gα showed that they are divided into four distinct groups as reported previously. Fungal Gβ and Gγ are also divided into four phylogenetic groups, and to our understanding this is the first report of a phylogenetic classification for fungal Gβ and Gγ subunits. Almost all genes that encode putative heterotrimeric G subunits in M. circinelloides are differentially expressed during dimorphic growth, except for McGpg1 (Gγ) that showed very low mRNA levels at all developmental stages. Moreover, several of the subunits are expressed in a similar pattern and at the same level, suggesting that they constitute discrete complexes. For example, McGpb3 (Gβ), and McGpg2 (Gγ), are co-expressed during mycelium growth, and McGpa1, McGpb2, and McGpg2, are co-expressed during yeast development. These findings provide the conceptual framework to study the biological role of these genes during M. circinelloides morphogenesis. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  18. The mechanosensory structure of the hair cell requires clarin-1, a protein encoded by Usher syndrome III causative gene.

    Science.gov (United States)

    Geng, Ruishuang; Melki, Sami; Chen, Daniel H-C; Tian, Guilian; Furness, David N; Oshima-Takago, Tomoko; Neef, Jakob; Moser, Tobias; Askew, Charles; Horwitz, Geoff; Holt, Jeffrey R; Imanishi, Yoshikazu; Alagramam, Kumar N

    2012-07-11

    Mutation in the clarin-1 gene (Clrn1) results in loss of hearing and vision in humans (Usher syndrome III), but the role of clarin-1 in the sensory hair cells is unknown. Clarin-1 is predicted to be a four transmembrane domain protein similar to members of the tetraspanin family. Mice carrying null mutation in the clarin-1 gene (Clrn1(-/-)) show loss of hair cell function and a possible defect in ribbon synapse. We investigated the role of clarin-1 using various in vitro and in vivo approaches. We show by immunohistochemistry and patch-clamp recordings of Ca(2+) currents and membrane capacitance from inner hair cells that clarin-1 is not essential for formation or function of ribbon synapse. However, reduced cochlear microphonic potentials, FM1-43 [N-(3-triethylammoniumpropyl)-4-(4-(dibutylamino)styryl) pyridinium dibromide] loading, and transduction currents pointed to diminished cochlear hair bundle function in Clrn1(-/-) mice. Electron microscopy of cochlear hair cells revealed loss of some tall stereocilia and gaps in the v-shaped bundle, although tip links and staircase arrangement of stereocilia were not primarily affected by Clrn1(-/-) mutation. Human clarin-1 protein expressed in transfected mouse cochlear hair cells localized to the bundle; however, the pathogenic variant p.N48K failed to localize to the bundle. The mouse model generated to study the in vivo consequence of p.N48K in clarin-1 (Clrn1(N48K)) supports our in vitro and Clrn1(-/-) mouse data and the conclusion that CLRN1 is an essential hair bundle protein. Furthermore, the ear phenotype in the Clrn1(N48K) mouse suggests that it is a valuable model for ear disease in CLRN1(N48K), the most prevalent Usher syndrome III mutation in North America.

  19. The Mechanosensory Structure of the Hair Cell Requires Clarin-1, a Protein Encoded by Usher Syndrome III Causative Gene

    Science.gov (United States)

    Geng, Ruishuang; Melki, Sami; Chen, Daniel H.-C.; Tian, Guilian; Furness, David; Oshima-Takago, Tomoko; Neef, Jakob; Moser, Tobias; Askew, Charles; Horwitz, Geoff; Holt, Jeffrey; Imanishi, Yoshikazu; Alagramam, Kumar N.

    2012-01-01

    Mutation in the clarin-1 gene results in loss of hearing and vision in humans (Usher syndrome III), but the role of clarin-1 in the sensory hair cells is unknown. Clarin-1 is predicted to be a four transmembrane domain protein similar to members of the tetraspanin family. Mice carrying null mutation in the clarin-1 (Clrn1−/−) gene show loss of hair cell function and a possible defect in ribbon synapse. We investigated the role of clarin-1 using various in vitro and in vivo approaches. We show by immunohistochemistry and patch-clamp recordings of Ca2+ currents and membrane capacitance from inner hair cells (IHCs) that clarin-1 is not essential for formation or function of ribbon synapse. However, reduced cochlear microphonic potentials, FM1-43 loading and transduction currents pointed to diminished cochlear hair bundle function in Clrn1−/− mice. Electron microscopy of cochlear hair cells revealed loss of some tall stereocilia and gaps in the v-shaped bundle, although tip-links and staircase arrangement of stereocilia were not primarily affected by Clrn1−/− mutation. Human clarin-1 protein expressed in transfected mouse cochlear hair cells localized to the bundle; however, the pathogenic variant, p.N48K, failed to localize to the bundle. The mouse model generated to study the in vivo consequence of p.N48K in clarin-1 (Clrn1(N48K)) supports our in vitro and Clrn1−/− mouse data and the conclusion that CLRN1 is an essential hair bundle protein. Further, the ear phenotype in the Clrn1(N48K) mouse suggests that it is a valuable model for ear disease in CLRN1(N48K), the most prevalent Usher III mutation in North America. PMID:22787034

  20. Combinational deletion of three membrane protein-encoding genes highly attenuates Yersinia pestis while retaining immunogenicity in a mouse model of pneumonic plague.

    Science.gov (United States)

    Tiner, Bethany L; Sha, Jian; Kirtley, Michelle L; Erova, Tatiana E; Popov, Vsevolod L; Baze, Wallace B; van Lier, Christina J; Ponnusamy, Duraisamy; Andersson, Jourdan A; Motin, Vladimir L; Chauhan, Sadhana; Chopra, Ashok K

    2015-04-01

    Previously, we showed that deletion of genes encoding Braun lipoprotein (Lpp) and MsbB attenuated Yersinia pestis CO92 in mouse and rat models of bubonic and pneumonic plague. While Lpp activates Toll-like receptor 2, the MsbB acyltransferase modifies lipopolysaccharide. Here, we deleted the ail gene (encoding the attachment-invasion locus) from wild-type (WT) strain CO92 or its Δlpp single and Δlpp ΔmsbB double mutants. While the Δail single mutant was minimally attenuated compared to the WT bacterium in a mouse model of pneumonic plague, the Δlpp Δail double mutant and the Δlpp ΔmsbB Δail triple mutant were increasingly attenuated, with the latter being unable to kill mice at a 50% lethal dose (LD50) equivalent to 6,800 LD50s of WT CO92. The mutant-infected animals developed balanced TH1- and TH2-based immune responses based on antibody isotyping. The triple mutant was cleared from mouse organs rapidly, with concurrent decreases in the production of various cytokines and histopathological lesions. When surviving animals infected with increasing doses of the triple mutant were subsequently challenged on day 24 with the bioluminescent WT CO92 strain (20 to 28 LD50s), 40 to 70% of the mice survived, with efficient clearing of the invading pathogen, as visualized in real time by in vivo imaging. The rapid clearance of the triple mutant, compared to that of WT CO92, from animals was related to the decreased adherence and invasion of human-derived HeLa and A549 alveolar epithelial cells and to its inability to survive intracellularly in these cells as well as in MH-S murine alveolar and primary human macrophages. An early burst of cytokine production in macrophages elicited by the triple mutant compared to WT CO92 and the mutant's sensitivity to the bactericidal effect of human serum would further augment bacterial clearance. Together, deletion of the ail gene from the Δlpp ΔmsbB double mutant severely attenuated Y. pestis CO92 to evoke pneumonic plague in a

  1. Many Saccharomyces cerevisiae Cell Wall Protein Encoding Genes Are Coregulated by Mss11, but Cellular Adhesion Phenotypes Appear Only Flo Protein Dependent.

    Science.gov (United States)

    Bester, Michael C; Jacobson, Dan; Bauer, Florian F

    2012-01-01

    The outer cell wall of the yeast Saccharomyces cerevisiae serves as the interface with the surrounding environment and directly affects cell-cell and cell-surface interactions. Many of these interactions are facilitated by specific adhesins that belong to the Flo protein family. Flo mannoproteins have been implicated in phenotypes such as flocculation, substrate adhesion, biofilm formation, and pseudohyphal growth. Genetic data strongly suggest that individual Flo proteins are responsible for many specific cellular adhesion phenotypes. However, it remains unclear whether such phenotypes are determined solely by the nature of the expressed FLO genes or rather as the result of a combination of FLO gene expression and other cell wall properties and cell wall proteins. Mss11 has been shown to be a central element of FLO1 and FLO11 gene regulation and acts together with the cAMP-PKA-dependent transcription factor Flo8. Here we use genome-wide transcription analysis to identify genes that are directly or indirectly regulated by Mss11. Interestingly, many of these genes encode cell wall mannoproteins, in particular, members of the TIR and DAN families. To examine whether these genes play a role in the adhesion properties associated with Mss11 expression, we assessed deletion mutants of these genes in wild-type and flo11Δ genetic backgrounds. This analysis shows that only FLO genes, in particular FLO1/10/11, appear to significantly impact on such phenotypes. Thus adhesion-related phenotypes are primarily dependent on the balance of FLO gene expression.

  2. A Major Facilitator Superfamily protein encoded by TcMucK gene is not required for cuticle pigmentation, growth and development in Tribolium castaneum.

    Science.gov (United States)

    Mun, Seulgi; Noh, Mi Young; Osanai-Futahashi, Mizuko; Muthukrishnan, Subbaratnam; Kramer, Karl J; Arakane, Yasuyuki

    2014-06-01

    Insect cuticle pigmentation and sclerotization (tanning) are vital physiological processes for insect growth, development and survival. We have previously identified several colorless precursor molecules as well as enzymes involved in their biosynthesis and processing to yield the mature intensely colored body cuticle pigments. A recent study indicated that the Bombyx mori (silkmoth) gene, BmMucK, which encodes a protein orthologous to a Culex pipiens quinquefasciatus (Southern house mosquito) cis,cis-muconate transporter, is a member of the "Major Facilitator Superfamily" (MFS) of transporter proteins and is associated with the appearance of pigmented body segments of naturally occurring body color mutants of B. mori. While RNA interference of the BmMucK gene failed to result in any observable phenotype, RNAi using a dsRNA for an orthologous gene from the red flour beetle, Tribolium castaneum, was reported to result in molting defects and darkening of the cuticle and some body parts, leading to the suggestion that orthologs of MucK genes may differ in their functions among insects. To verify the role and essentiality of the ortholog of this gene in development and body pigmentation function in T. castaneum, we obtained cDNAs for the orthologous gene (TcMucK) from RNA isolated from the GA-1 wild-type strain of T. castaneum. The sequence of a 1524-nucleotide-long cDNA for TcMucK, which encodes the putatively full-length protein, was assembled from two overlapping RT-PCR fragments, and the expression profile of this gene during development was analyzed by real-time PCR. This cDNA encodes a 55.8 kDa protein consisting of 507 amino acid residues and includes 11 putative transmembrane segments. Transcripts of TcMucK were detected throughout all of the developmental stages analyzed. The function of this gene was explored by injection of two different double-stranded RNAs targeting different regions of the TcMucK gene (dsTcMucKs) into young larvae to down

  3. Isolation and characterization of BetaM protein encoded by ATP1B4 - a unique member of the Na,K-ATPase β-subunit gene family

    International Nuclear Information System (INIS)

    Pestov, Nikolay B.; Zhao, Hao; Basrur, Venkatesha; Modyanov, Nikolai N.

    2011-01-01

    Highlights: → Structural properties of BetaM and Na,K-ATPase β-subunits are sharply different. → BetaM protein is concentrated in nuclear membrane of skeletal myocytes. → BetaM does not associate with a Na,K-ATPase α-subunit in skeletal muscle. → Polypeptide chain of the native BetaM is highly sensitive to endogenous proteases. → BetaM in neonatal muscle is a product of alternative splice mRNA variant B. -- Abstract: ATP1B4 genes represent a rare instance of the orthologous gene co-option that radically changed functions of encoded BetaM proteins during vertebrate evolution. In lower vertebrates, this protein is a β-subunit of Na,K-ATPase located in the cell membrane. In placental mammals, BetaM completely lost its ancestral role and through acquisition of two extended Glu-rich clusters into the N-terminal domain gained entirely new properties as a muscle-specific protein of the inner nuclear membrane possessing the ability to regulate gene expression. Strict temporal regulation of BetaM expression, which is the highest in late fetal and early postnatal myocytes, indicates that it plays an essential role in perinatal development. Here we report the first structural characterization of the native eutherian BetaM protein. It should be noted that, in contrast to structurally related Na,K-ATPase β-subunits, the polypeptide chain of BetaM is highly sensitive to endogenous proteases that greatly complicated its isolation. Nevertheless, using a complex of protease inhibitors, a sample of authentic BetaM was isolated from pig neonatal skeletal muscle by a combination of ion-exchange and lectin-affinity chromatography followed by SDS-PAGE. Results of the analysis of the BetaM tryptic digest using MALDI-TOF and ESI-MS/MS mass spectrometry have demonstrated that native BetaM in neonatal skeletal muscle is a product of alternative splice mRNA variant B and comprised of 351 amino acid residues. Isolated BetaM protein was also characterized by SELDI-TOF mass

  4. Isolation and characterization of BetaM protein encoded by ATP1B4 - a unique member of the Na,K-ATPase β-subunit gene family

    Energy Technology Data Exchange (ETDEWEB)

    Pestov, Nikolay B. [Department of Physiology and Pharmacology, University of Toledo College of Medicine, 3000 Arlington Ave., Toledo, OH 43614 (United States); Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow 117997 (Russian Federation); Zhao, Hao [Department of Physiology and Pharmacology, University of Toledo College of Medicine, 3000 Arlington Ave., Toledo, OH 43614 (United States); Basrur, Venkatesha [Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109 (United States); Modyanov, Nikolai N., E-mail: nikolai.modyanov@utoledo.edu [Department of Physiology and Pharmacology, University of Toledo College of Medicine, 3000 Arlington Ave., Toledo, OH 43614 (United States)

    2011-09-09

    Highlights: → Structural properties of BetaM and Na,K-ATPase β-subunits are sharply different. → BetaM protein is concentrated in nuclear membrane of skeletal myocytes. → BetaM does not associate with a Na,K-ATPase α-subunit in skeletal muscle. → Polypeptide chain of the native BetaM is highly sensitive to endogenous proteases. → BetaM in neonatal muscle is a product of alternative splice mRNA variant B. -- Abstract: ATP1B4 genes represent a rare instance of the orthologous gene co-option that radically changed functions of encoded BetaM proteins during vertebrate evolution. In lower vertebrates, this protein is a β-subunit of Na,K-ATPase located in the cell membrane. In placental mammals, BetaM completely lost its ancestral role and through acquisition of two extended Glu-rich clusters into the N-terminal domain gained entirely new properties as a muscle-specific protein of the inner nuclear membrane possessing the ability to regulate gene expression. Strict temporal regulation of BetaM expression, which is the highest in late fetal and early postnatal myocytes, indicates that it plays an essential role in perinatal development. Here we report the first structural characterization of the native eutherian BetaM protein. It should be noted that, in contrast to structurally related Na,K-ATPase β-subunits, the polypeptide chain of BetaM is highly sensitive to endogenous proteases that greatly complicated its isolation. Nevertheless, using a complex of protease inhibitors, a sample of authentic BetaM was isolated from pig neonatal skeletal muscle by a combination of ion-exchange and lectin-affinity chromatography followed by SDS-PAGE. Results of the analysis of the BetaM tryptic digest using MALDI-TOF and ESI-MS/MS mass spectrometry have demonstrated that native BetaM in neonatal skeletal muscle is a product of alternative splice mRNA variant B and comprised of 351 amino acid residues. Isolated BetaM protein was

  5. [Cloning and expression analysis of a zinc-regulated transporters (ZRT), iron-regulated transporter (IRT)-like protein encoding gene in Dendrobium officinale].

    Science.gov (United States)

    Zhang, Gang; Li, Yi-Min; Li, Biao; Zhang, Da-Wei; Guo, Shun-Xing

    2015-01-01

    The zinc-regulated transporter (ZRT), iron-regulated transporter (IRT)-like protein (ZIP) plays an important role in the growth and development of plants. In this study, a full-length cDNA of a ZIP-encoding gene, designated DoZIP1 (GenBank accession KJ946203), was identified from Dendrobium officinale using RT-PCR and RACE. Bioinformatics analysis showed that DoZIP1 contained a 1,056 bp open reading frame (ORF) encoding a 351-aa protein with a molecular weight of 37.57 kDa and an isoelectric point (pI) of 6.09. The deduced DoZIP1 protein contained the conserved ZIP domain, and its secondary structure was composed of 50.71% alpha helix, 11.11% extended strand, 36.18% random coil, and 1.99% beta turn. The DoZIP1 protein exhibited a signal peptide and eight transmembrane domains and is presumably located in the cell membrane. The amino acid sequence had high homology with ZIP proteins from Arabidopsis, alfalfa and rice. A phylogenetic tree analysis demonstrated that DoZIP1 was closely related to AtZIP10 and OsZIP3, and they were clustered into one clade. Real-time quantitative PCR analysis demonstrated that the transcription level of DoZIP1 in D. officinale roots was the highest (4.19-fold higher than that of stems), followed by that of leaves (1.12-fold). The molecular characteristics of DoZIP1 will be useful for further functional determination of the gene's involvement in the growth and development of D. officinale.
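
    The ORF length and protein length reported in this record are linked by simple codon arithmetic. The following is a minimal Python sketch of that check, assuming the reported ORF length includes one terminal stop codon; the function name is illustrative and does not come from the study.

```python
def protein_length_from_orf(orf_bp: int) -> int:
    """Number of residues encoded by an ORF whose length includes one stop codon."""
    if orf_bp < 6 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3 with at least one sense codon")
    return orf_bp // 3 - 1  # the last codon is the stop codon and encodes no residue


# Values taken from the abstract: 1,056 bp ORF, 37.57 kDa protein.
residues = protein_length_from_orf(1056)
mean_residue_mass_da = 37.57 * 1000 / residues  # average mass per residue in Da
```

    Under that assumption the ORF yields 351 residues, matching the abstract, and the average residue mass comes out at roughly 107 Da, close to the rule-of-thumb value of about 110 Da per amino acid.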

  6. Molecular adaptation within the coat protein-encoding gene of ...

    Indian Academy of Sciences (India)

    Since failing to account for recombination can elevate the false positive error rate in positive ... †These authors contributed equally to this work. ucts, as .... The plant samples (leaves) were ground (1:5, w/v) in a PBS- ..... Sweet cherry. USA.

  7. Properties of virion transactivator proteins encoded by primate cytomegaloviruses

    Directory of Open Access Journals (Sweden)

    Barry Peter A

    2009-05-01

    Full Text Available Abstract Background: Human cytomegalovirus (HCMV) is a betaherpesvirus that causes severe disease in situations where the immune system is immature or compromised. HCMV immediate early (IE) gene expression is stimulated by the virion phosphoprotein pp71, encoded by open reading frame (ORF) UL82, and this transactivation activity is important for the efficient initiation of viral replication. It is currently recognized that pp71 acts to overcome cellular intrinsic defences that otherwise block viral IE gene expression, and that interactions of pp71 with the cell proteins Daxx and ATRX are important for this function. A further property of pp71 is the ability to enable prolonged gene expression from quiescent herpes simplex virus type 1 (HSV-1) genomes. Non-human primate cytomegaloviruses encode homologs of pp71, but there is currently no published information that addresses their effects on gene expression and modes of action. Results: The UL82 homolog encoded by simian cytomegalovirus (SCMV), strain Colburn, was identified and cloned. This ORF, named S82, was cloned into an HSV-1 vector, as were those from baboon, rhesus monkey and chimpanzee cytomegaloviruses. The use of an HSV-1 vector enabled expression of the UL82 homologs in a range of cell types, and permitted investigation of their abilities to direct prolonged gene expression from quiescent genomes. The results show that all UL82 homologs activate gene expression, and that neither host cell type nor promoter target sequence has major effects on these activities. Surprisingly, the UL82 proteins specified by non-human primate cytomegaloviruses, unlike pp71, did not direct long term expression from quiescent HSV-1 genomes. In addition, significant differences were observed in the intranuclear localization of the UL82 homologs, and in their effects on Daxx. Strikingly, S82 mediated the release of Daxx from nuclear domain 10 substructures much more rapidly than pp71 or the other proteins tested. All

  8. Identification of physicochemical selective pressure on protein encoding nucleotide sequences

    Directory of Open Access Journals (Sweden)

    Sainudiin Raazesh

    2006-03-01

    Full Text Available Abstract Background: Statistical methods for identifying positively selected sites in protein coding regions are among the most commonly used tools in evolutionary bioinformatics. However, they have been limited by not taking the physicochemical properties of amino acids into account. Results: We develop a new codon-based likelihood model for detecting site-specific selection pressures acting on specific physicochemical properties. Nonsynonymous substitutions are divided into substitutions that differ with respect to the physicochemical properties of interest, and those that do not. The substitution rates of these two types of changes, relative to the synonymous substitution rate, are then described by two parameters, γ and ω, respectively. The new model allows us to perform likelihood ratio tests for positive selection acting on specific physicochemical properties of interest. The new method is first used to analyze simulated data and is shown to have good power and accuracy in detecting physicochemical selective pressure. We then re-analyze data from the class-I alleles of the human Major Histocompatibility Complex (MHC) and from the abalone sperm lysin. Conclusion: Our new method allows a more flexible framework to identify selection pressure on particular physicochemical properties.
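
    The partition of substitutions described in this abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the polar/nonpolar grouping, the function names, the zero rate for changes involving stop codons, and the plain chi-square reference distribution with one degree of freedom are all assumptions made for illustration, and transition/transversion bias and codon frequencies are ignored.

```python
import math

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order at each position.
GENETIC_CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed property of interest: polarity (one common polar/nonpolar grouping).
POLAR = set("RNDCQEGHKSTY")


def classify(codon_from: str, codon_to: str) -> str:
    """Label a codon change as synonymous, property-altering, or property-conserving."""
    aa_from, aa_to = GENETIC_CODE[codon_from], GENETIC_CODE[codon_to]
    if "*" in (aa_from, aa_to):
        return "stop"
    if aa_from == aa_to:
        return "synonymous"
    if (aa_from in POLAR) != (aa_to in POLAR):
        return "property_altering"
    return "property_conserving"


def relative_rate(codon_from: str, codon_to: str, gamma: float, omega: float) -> float:
    """Rate relative to the synonymous rate: gamma if the property changes, omega if not."""
    rates = {
        "synonymous": 1.0,
        "property_altering": gamma,
        "property_conserving": omega,
        "stop": 0.0,  # assumption: changes to or from stop codons are disallowed
    }
    return rates[classify(codon_from, codon_to)]


def lrt_pvalue_1df(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test against chi-square with 1 degree of freedom."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(statistic / 2.0))
```

    For example, `classify("TTT", "TCT")` labels the Phe-to-Ser change as property-altering under the assumed polarity grouping, so it would take the rate γ, whereas the Leu-to-Ile change `classify("CTT", "ATT")` is property-conserving and would take ω.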

  9. The effects of clobazam treatment in rats on the expression of genes and proteins encoding glucronosyltransferase 1A/2B (UGT1A/2B) and multidrug resistance‐associated protein-2 (MRP2), and development of thyroid follicular cell hypertrophy

    Energy Technology Data Exchange (ETDEWEB)

    Miyawaki, Izuru, E-mail: izuru-miyawaki@ds-pharma.co.jp; Tamura, Akitoshi; Matsumoto, Izumi; Inada, Hiroshi; Kunimatsu, Takeshi; Kimura, Juki; Funabashi, Hitoshi

    2012-12-15

    Clobazam (CLB) is known to increase hepatobiliary thyroxine (T4) clearance in Sprague–Dawley (SD) rats, which results in hypothyroidism followed by thyroid follicular cell hypertrophy. However, the mechanism of the acceleration of T4-clearance has not been fully investigated. In the present study, we tried to clarify the roles of hepatic UDP-glucronosyltransferase (UGT) isoenzymes (UGT1A and UGT2B) and efflux transporter (multidrug resistance–associated protein-2; MRP2) in the CLB-induced acceleration of T4-clearance using two mutant rat strains, UGT1A-deficient mutant (Gunn) and MRP2-deficient mutant (EHBR) rats, especially focusing on thyroid morphology, levels of circulating hormones (T4 and triiodothyronine (T3)) and thyroid-stimulating hormone (TSH), and mRNA or protein expressions of UGTs (Ugt1a1, Ugt1a6, and Ugt2b1/2) and MRP2 (Mrp). CLB induced thyroid morphological changes with increases in TSH in SD and Gunn rats, but not in EHBR rats. T4 was slightly decreased in SD and Gunn rats, and T3 was decreased in Gunn rats, whereas these hormones were maintained in EHBR rats. Hepatic Ugt1a1, Ugt1a6, Ugt2b1/2, and Mrp2 mRNAs were upregulated in SD rats. In Gunn rats, UGT1A mRNAs (Ugt1a1/6) and protein levels were quite low, but UGT2B mRNAs (Ugt2b1/2) and protein were prominently upregulated. In SD and Gunn rats, MRP2 mRNA and protein were upregulated to the same degree. These results suggest that MRP2 is an important contributor in development of the thyroid cellular hypertrophy in CLB-treated rats, and that UGT1A and UGT2B work in concert with MRP2 in the presence of MRP2 function to enable the effective elimination of thyroid hormones. -- Highlights: ► Role of UGT and MRP2 in thyroid pathology was investigated in clobazam-treated rats. ► Clobazam induced thyroid cellular hypertrophy in SD and Gunn rats, but not EHBR rats. ► Hepatic Mrp2 gene and protein were upregulated in SD and Gunn rats, but not EHBR rats. ► Neither serum thyroid hormones (T3/T4

  10. The effects of clobazam treatment in rats on the expression of genes and proteins encoding glucronosyltransferase 1A/2B (UGT1A/2B) and multidrug resistance‐associated protein-2 (MRP2), and development of thyroid follicular cell hypertrophy

    International Nuclear Information System (INIS)

    Miyawaki, Izuru; Tamura, Akitoshi; Matsumoto, Izumi; Inada, Hiroshi; Kunimatsu, Takeshi; Kimura, Juki; Funabashi, Hitoshi

    2012-01-01

    Clobazam (CLB) is known to increase hepatobiliary thyroxine (T4) clearance in Sprague–Dawley (SD) rats, which results in hypothyroidism followed by thyroid follicular cell hypertrophy. However, the mechanism of the acceleration of T4-clearance has not been fully investigated. In the present study, we tried to clarify the roles of hepatic UDP-glucronosyltransferase (UGT) isoenzymes (UGT1A and UGT2B) and efflux transporter (multidrug resistance–associated protein-2; MRP2) in the CLB-induced acceleration of T4-clearance using two mutant rat strains, UGT1A-deficient mutant (Gunn) and MRP2-deficient mutant (EHBR) rats, especially focusing on thyroid morphology, levels of circulating hormones (T4 and triiodothyronine (T3)) and thyroid-stimulating hormone (TSH), and mRNA or protein expressions of UGTs (Ugt1a1, Ugt1a6, and Ugt2b1/2) and MRP2 (Mrp). CLB induced thyroid morphological changes with increases in TSH in SD and Gunn rats, but not in EHBR rats. T4 was slightly decreased in SD and Gunn rats, and T3 was decreased in Gunn rats, whereas these hormones were maintained in EHBR rats. Hepatic Ugt1a1, Ugt1a6, Ugt2b1/2, and Mrp2 mRNAs were upregulated in SD rats. In Gunn rats, UGT1A mRNAs (Ugt1a1/6) and protein levels were quite low, but UGT2B mRNAs (Ugt2b1/2) and protein were prominently upregulated. In SD and Gunn rats, MRP2 mRNA and protein were upregulated to the same degree. These results suggest that MRP2 is an important contributor in development of the thyroid cellular hypertrophy in CLB-treated rats, and that UGT1A and UGT2B work in concert with MRP2 in the presence of MRP2 function to enable the effective elimination of thyroid hormones. -- Highlights: ► Role of UGT and MRP2 in thyroid pathology was investigated in clobazam-treated rats. ► Clobazam induced thyroid cellular hypertrophy in SD and Gunn rats, but not EHBR rats. ► Hepatic Mrp2 gene and protein were upregulated in SD and Gunn rats, but not EHBR rats. ► Neither serum thyroid hormones (T3/T4

  11. Gene therapy and its implications in Periodontics

    Science.gov (United States)

    Mahale, Swapna; Dani, Nitin; Ansari, Shumaila S.; Kale, Triveni

    2009-01-01

    Gene therapy is a field of biomedicine. With the advent of gene therapy in dentistry, significant progress has been made in the control of periodontal diseases and reconstruction of the dento-alveolar apparatus. Implementations in periodontics include: (1) tissue engineering, with three approaches (cell-based, protein-based, and gene delivery); and (2) a genetic approach to biofilm antibiotic resistance. Future strategies of gene therapy for preventing periodontal diseases include: (1) enhancing host defense mechanisms against infection by transfecting host cells with an antimicrobial peptide/protein-encoding gene; and (2) periodontal vaccination. Gene therapy is one of the recent entrants, and its applications in the field of periodontics are reviewed in general here. PMID:20376232

  12. Partial Least Squares Based Gene Expression Analysis in EBV- Positive and EBV-Negative Posttransplant Lymphoproliferative Disorders.

    Science.gov (United States)

    Wu, Sa; Zhang, Xin; Li, Zhi-Ming; Shi, Yan-Xia; Huang, Jia-Jia; Xia, Yi; Yang, Hang; Jiang, Wen-Qi

    2013-01-01

    Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profiling facilitates the identification of biological differences between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array-specific factors. The aim of this study is to investigate the gene expression differences between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. With a microarray data set from the Gene Expression Omnibus database, we performed PLS based analysis. We acquired 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significant over-representation of dysregulated genes in immune response and cancer related biological processes. Network analysis identified three hub genes with degrees higher than 15, including CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have previously been reported to interact with EBV. Our findings shed light on the expression differences between EBV positive and negative PTLDs, with the hope of offering theoretical support for future therapeutic study.
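
    The hub-gene criterion mentioned in this abstract (node degree above a threshold) can be sketched in a few lines of Python. The edge list and gene names below are placeholders, not data from the study, and the abstract does not say how the network itself was built.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def hub_genes(edges, degree_above):
    """Return genes whose degree is strictly greater than the threshold."""
    degrees = node_degrees(edges)
    return sorted(gene for gene, degree in degrees.items() if degree > degree_above)


# Toy network with placeholder names: one gene linked to four partners.
toy_edges = [("hub", f"partner_{i}") for i in range(4)] + [("partner_0", "partner_1")]
toy_hubs = hub_genes(toy_edges, degree_above=3)
```

    With the threshold of 15 quoted in the abstract, the call would be `hub_genes(edges, degree_above=15)` on the study's own edge list.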

  13. Chimeras taking shape: Potential functions of proteins encoded by chimeric RNA transcripts

    Science.gov (United States)

    Frenkel-Morgenstern, Milana; Lacroix, Vincent; Ezkurdia, Iakes; Levin, Yishai; Gabashvili, Alexandra; Prilusky, Jaime; del Pozo, Angela; Tress, Michael; Johnson, Rory; Guigo, Roderic; Valencia, Alfonso

    2012-01-01

    Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans. PMID:22588898
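
    The RPKM unit used in this abstract (Reads Per Kilobase per Million mapped reads) normalizes a read count by transcript length and sequencing depth. A minimal Python sketch of the standard formula follows; the function name and the example numbers are illustrative and are not taken from the study.

```python
def rpkm(mapped_reads: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("transcript length and total mapped reads must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return mapped_reads / (kilobases * millions)


# Illustrative numbers only: 500 reads on a 2 kb transcript in a 10-million-read library.
example_rpkm = rpkm(500, 2_000, 10_000_000)
```

    In this example the value is 25 RPKM, which would sit just above the 0.06 to 17 RPKM range reported for the confirmed chimeric RNAs.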

  14. Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.

    Science.gov (United States)

    Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian

    2018-02-23

    Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. From the experimental results, HPHash again significantly improves the prediction performance. The codes of HPHash are available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.
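
    The last step described in this abstract, projecting a gene-term association matrix through compact binary term codes and then comparing genes in the compact space, can be illustrated with a toy Python sketch. This is not the HPHash algorithm: the binary codes below are hand-picked rather than learned by hierarchy preserving hashing, the gene and term names are placeholders, and nearest-neighbor transfer by cosine similarity is an assumed stand-in for the paper's prediction rule.

```python
import math


def project_genes(gene_terms, term_codes):
    """Represent each gene by the sum of the binary codes of its annotated terms."""
    n_bits = len(next(iter(term_codes.values())))
    projected = {}
    for gene, terms in gene_terms.items():
        vector = [0] * n_bits
        for term in terms:
            for i, bit in enumerate(term_codes[term]):
                vector[i] += bit
        projected[gene] = vector
    return projected


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_terms(query, gene_terms, term_codes):
    """Suggest terms held by the most similar gene that the query gene lacks."""
    low_dim = project_genes(gene_terms, term_codes)
    nearest = max(
        (gene for gene in gene_terms if gene != query),
        key=lambda gene: cosine(low_dim[query], low_dim[gene]),
    )
    return sorted(gene_terms[nearest] - gene_terms[query])


# Hypothetical 3-bit codes in {-1, +1}; a real method would learn these from the GO hierarchy.
TERM_CODES = {
    "term_A": (1, 1, 1),
    "term_B": (1, 1, -1),
    "term_C": (1, -1, 1),
    "term_D": (-1, -1, -1),
}
GENE_TERMS = {
    "gene_1": {"term_A", "term_B"},
    "gene_2": {"term_A", "term_B", "term_C"},
    "gene_3": {"term_D"},
}
suggested = predict_terms("gene_1", GENE_TERMS, TERM_CODES)
```

    Here gene_1 is closest to gene_2 in the 3-dimensional space, so the sketch suggests term_C for gene_1. The point of the compact codes is that similarity is computed over a handful of bits instead of the full vocabulary of GO terms.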

  15. Mimic Phosphorylation of a βC1 Protein Encoded by TYLCCNB Impairs Its Functions as a Viral Suppressor of RNA Silencing and a Symptom Determinant.

    Science.gov (United States)

    Zhong, Xueting; Wang, Zhan Qi; Xiao, Ruyuan; Cao, Linge; Wang, Yaqin; Xie, Yan; Zhou, Xueping

    2017-08-15

    Phosphorylation of the βC1 protein encoded by the betasatellite of tomato yellow leaf curl China virus (TYLCCNB-βC1) by SNF1-related protein kinase 1 (SnRK1) plays a critical role in defense of host plants against geminivirus infection in Nicotiana benthamiana. However, how phosphorylation of TYLCCNB-βC1 impacts its pathogenic functions during viral infection remains elusive. In this study, we identified two additional tyrosine residues in TYLCCNB-βC1 that are phosphorylated by SnRK1. The effects of TYLCCNB-βC1 phosphorylation on its functions as a viral suppressor of RNA silencing (VSR) and a symptom determinant were investigated via phosphorylation mimic mutants in N. benthamiana plants. Mutations that mimic phosphorylation of TYLCCNB-βC1 at tyrosine 5 and tyrosine 110 attenuated disease symptoms during viral infection. The phosphorylation mimics weakened the ability of TYLCCNB-βC1 to reverse transcriptional gene silencing and to suppress posttranscriptional gene silencing and abolished its interaction with N. benthamiana ASYMMETRIC LEAVES 1 in N. benthamiana leaves. The mimic phosphorylation of TYLCCNB-βC1 had no impact on its protein stability, subcellular localization, or self-association. Our data establish an inhibitory effect of phosphorylation of TYLCCNB-βC1 on its pathogenic functions as a VSR and a symptom determinant and provide a mechanistic explanation of how SnRK1 functions as a host defense factor. IMPORTANCE Tomato yellow leaf curl China virus (TYLCCNV), which causes a severe yellow leaf curl disease in China, is a monopartite geminivirus associated with the betasatellite (TYLCCNB). TYLCCNB encodes a single pathogenicity protein, βC1 (TYLCCNB-βC1), which functions as both a viral suppressor of RNA silencing (VSR) and a symptom determinant. Here, we show that mimicking phosphorylation of TYLCCNB-βC1 weakens its ability to reverse transcriptional gene silencing, to suppress posttranscriptional gene silencing, and to interact with N

  16. The protein encoded by the proto-oncogene DEK changes the topology of chromatin and reduces the efficiency of DNA replication in a chromatin-specific manner

    DEFF Research Database (Denmark)

    Alexiadis, V; Waldmann, T; Andersen, Jens S.

    2000-01-01

    The structure of chromatin regulates the genetic activity of the underlying DNA sequence. We report here that the protein encoded by the proto-oncogene DEK, which is involved in acute myelogenous leukemia, induces alterations of the superhelical density of DNA in chromatin. The change in topology...

  17. Diverse replication-associated protein encoding circular DNA viruses in guano samples of Central-Eastern European bats.

    Science.gov (United States)

    Kemenesi, Gábor; Kurucz, Kornélia; Zana, Brigitta; Földes, Fanni; Urbán, Péter; Vlaschenko, Anton; Kravchenko, Kseniia; Budinski, Ivana; Szodoray-Parádi, Farkas; Bücs, Szilárd; Jére, Csaba; Csősz, István; Szodoray-Parádi, Abigél; Estók, Péter; Görföl, Tamás; Boldogh, Sándor; Jakab, Ferenc

    2018-03-01

    Circular replication-associated protein encoding single-stranded DNA (CRESS DNA) viruses are increasingly recognized worldwide in a variety of samples. Representative members include well-described veterinary pathogens with worldwide distribution, such as porcine circoviruses or beak and feather disease virus. In addition, numerous novel viruses belonging to the family Circoviridae with unverified pathogenic roles have been discovered in different human samples. Viruses of the family Genomoviridae have also been described as being highly abundant in different faecal and environmental samples, with case reports showing them to be suspected pathogens in human infections. In order to investigate the genetic diversity of these viruses in European bat populations, we tested guano samples from Georgia, Hungary, Romania, Serbia and Ukraine. This resulted in the detection of six novel members of the family Circoviridae and two novel members of the family Genomoviridae. Interestingly, a gemini-like virus, namely niminivirus, which was originally found in raw sewage samples in Nigeria, was also detected in our samples. We analyzed the nucleotide composition of members of the family Circoviridae to determine the possible host origins of these viruses. This study provides the first dataset on CRESS DNA viruses of European bats, and members of several novel viral species were discovered.

  18. The P0 protein encoded by cotton leafroll dwarf virus (CLRDV) inhibits local but not systemic RNA silencing.

    Science.gov (United States)

    Delfosse, Verónica C; Agrofoglio, Yamila C; Casse, María F; Kresic, Iván Bonacic; Hopp, H Esteban; Ziegler-Graff, Véronique; Distéfano, Ana J

    2014-02-13

    Plants employ RNA silencing as a natural defense mechanism against viruses. As a counter-defense, viruses encode silencing suppressor proteins (SSPs) that suppress RNA silencing. Most, but not all, of the P0 proteins encoded by poleroviruses have been identified as SSPs. In this study, we demonstrated that cotton leafroll dwarf virus (CLRDV, genus Polerovirus) P0 protein suppressed local silencing that was induced by sense or inverted repeat transgenes in an Agrobacterium co-infiltration assay in Nicotiana benthamiana plants. A CLRDV full-length infectious cDNA clone that is able to infect N. benthamiana through Agrobacterium-mediated inoculation also inhibited local silencing in co-infiltration assays, suggesting that the P0 protein exhibits similar RNA silencing suppression activity when expressed from the full-length viral genome. On the other hand, the P0 protein did not efficiently inhibit the spread of systemic silencing signals. Moreover, Northern blotting indicated that the P0 protein inhibits the generation of secondary but not primary small interfering RNAs. The study of CLRDV P0 suppression activity may contribute to understanding the molecular mechanisms involved in the induction of cotton blue disease by CLRDV infection. Copyright © 2013 Elsevier B.V. All rights reserved.

  19. Scuba: scalable kernel-based gene prioritization.

    Science.gov (United States)

    Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

    2018-01-25

    The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .

  20. Optimized Mitochondrial Targeting of Proteins Encoded by Modified mRNAs Rescues Cells Harboring Mutations in mtATP6

    Directory of Open Access Journals (Sweden)

    Randall Marcelo Chin

    2018-03-01

    Summary: Mitochondrial disease may be caused by mutations in the protein-coding genes of the mitochondrial genome. A promising strategy for treating such diseases is allotopic expression—the translation of wild-type copies of these proteins in the cytosol, with subsequent translocation into the mitochondria, resulting in rescue of mitochondrial function. In this paper, we develop an automated, quantitative, and unbiased screening platform to evaluate protein localization and mitochondrial morphology. This platform was used to compare 31 mitochondrial targeting sequences and 15 3′ UTRs in their ability to localize up to 9 allotopically expressed proteins to the mitochondria and their subsequent impact on mitochondrial morphology. Taking these two factors together, we synthesized chemically modified mRNAs that encode for an optimized allotopic expression construct for mtATP6. These mRNAs were able to functionally rescue a cell line harboring the 8993T > G point mutation in the mtATP6 gene. Allotopic expression of proteins normally encoded by mtDNA is a promising therapy for mitochondrial disease. Chin et al. use an unbiased and high-content imaging-based screening platform to optimize allotopic expression. Modified mRNAs encoding for the optimized allotopic expression constructs rescued the respiration and growth of mtATP6-deficient cells. Keywords: mitochondria, mitochondrial disease, mRNA, modified mRNA, ATP6, allotopic expression, rare disease, gene therapy, screening, high content imaging

  1. Prioritization of candidate disease genes by combining topological similarity and semantic similarity.

    Science.gov (United States)

    Liu, Bin; Jin, Min; Zeng, Pan

    2015-10-01

    The identification of gene-phenotype relationships is very important for the treatment of human diseases. Studies have shown that genes causing the same or similar phenotypes tend to interact with each other in a protein-protein interaction (PPI) network. Thus, many identification methods based on the PPI network model have achieved good results. However, in the PPI network, some interactions between the proteins encoded by candidate genes and the proteins encoded by known disease genes are very weak. Therefore, some studies have combined the PPI network with other genomic information and reported good predictive performances. However, we believe that the results could be further improved. In this paper, we propose a new method that uses the semantic similarity between the candidate gene and known disease genes to set the initial probability vector of a random walk with restart algorithm in a human PPI network. The effectiveness of our method was demonstrated by leave-one-out cross-validation, and the experimental results indicated that our method outperformed other methods. Additionally, our method can predict new causative genes of multifactor diseases, including Parkinson's disease, breast cancer and obesity. The top predictions were good and consistent with the findings in the literature, which further illustrates the effectiveness of our method. Copyright © 2015 Elsevier Inc. All rights reserved.
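
The random walk with restart mentioned in this abstract can be sketched as follows. This is a minimal, generic illustration and not the authors' implementation: the function name `random_walk_with_restart`, the toy network, the restart probability of 0.3 and the similarity scores are all assumptions made for the example. The initial probability vector `p0` stands in for the semantic-similarity scores described in the abstract.

```python
def random_walk_with_restart(adjacency, p0, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping node -> dict of neighbour -> edge weight (symmetric).
    p0: dict mapping node -> non-negative initial score (hypothetical
        semantic-similarity values); it is normalised to sum to 1.
    """
    nodes = sorted(adjacency)
    total = sum(p0.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("p0 must give positive weight to at least one node")
    start = {n: p0.get(n, 0.0) / total for n in nodes}
    out_weight = {n: sum(adjacency[n].values()) for n in nodes}

    p = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the start distribution.
        nxt = {n: restart * start[n] for n in nodes}
        for j in nodes:
            if out_weight[j] == 0:
                # Isolated node: send its remaining mass back to the start distribution.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[j] * start[n]
                continue
            # Otherwise the walker moves to a neighbour in proportion to edge weight.
            for i, w in adjacency[j].items():
                nxt[i] += (1 - restart) * p[j] * w / out_weight[j]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy chain network A - B - C - D with unit weights (illustrative only).
toy_ppi = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy_ppi, {"A": 0.9, "B": 0.1})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes would then be ranked by their steady-state probability; in this toy chain, nodes closer to the heavily weighted seed receive higher scores.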

  2. Rational design of gene-based vaccines.

    Science.gov (United States)

    Barouch, Dan H

    2006-01-01

    Vaccine development has traditionally been an empirical discipline. Classical vaccine strategies include the development of attenuated organisms, whole killed organisms, and protein subunits, followed by empirical optimization and iterative improvements. While these strategies have been remarkably successful for a wide variety of viruses and bacteria, these approaches have proven more limited for pathogens that require cellular immune responses for their control. In this review, current strategies to develop and optimize gene-based vaccines are described, with an emphasis on novel approaches to improve plasmid DNA vaccines and recombinant adenovirus vector-based vaccines. Copyright 2006 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  3. Paper-based synthetic gene networks.

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

    2014-11-06

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.

  4. Paper-based Synthetic Gene Networks

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  5. DNA Array-Based Gene Profiling

    Science.gov (United States)

    Mocellin, Simone; Provenzano, Maurizio; Rossi, Carlo Riccardo; Pilati, Pierluigi; Nitti, Donato; Lise, Mario

    2005-01-01

    Cancer is a heterogeneous disease in most respects, including its cellularity, different genetic alterations, and diverse clinical behaviors. Traditional molecular analyses are reductionist, assessing only 1 or a few genes at a time, thus working with a biologic model too specific and limited to confront a process whose clinical outcome is likely to be governed by the combined influence of many genes. The potential of functional genomics is enormous, because for each experiment, thousands of relevant observations can be made simultaneously. Accordingly, DNA array, like other high-throughput technologies, might catalyze and ultimately accelerate the development of knowledge in tumor cell biology. Although in its infancy, the implementation of DNA array technology in cancer research has already provided investigators with novel data and intriguing new hypotheses on the molecular cascade leading to carcinogenesis, tumor aggressiveness, and sensitivity to antiblastic agents. Given the revolutionary implications that the use of this technology might have in the clinical management of patients with cancer, principles of DNA array-based tumor gene profiling need to be clearly understood for the data to be correctly interpreted and appreciated. In the present work, we discuss the technical features characterizing this powerful laboratory tool and review the applications so far described in the field of oncology. PMID:15621987

  6. [Smart therapeutics based on synthetic gene circuits].

    Science.gov (United States)

    Peng, Shuguang; Xie, Zhen

    2017-03-25

    Synthetic biology has had an important impact on biological research since its inception. Applying ideas and methods borrowed from electrical engineering, synthetic biology has uncovered many regulatory mechanisms of living systems and has transformed and expanded a range of biological components. It has therefore brought a wide range of biomedical applications, including new ideas for disease diagnosis and treatment. This review describes the latest advances in the field of disease diagnosis and therapy based on mammalian cell or bacterial synthetic gene circuits, and provides new ideas for future smart therapy design.

  7. A powerful score-based test statistic for detecting gene-gene co-association.

    Science.gov (United States)

    Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

    2016-01-29

    The genetic variants identified by genome-wide association studies (GWAS) can account for only a small proportion of the total heritability of complex diseases. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one possible explanation for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

  8. Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology

    Science.gov (United States)

    Rossin, Elizabeth J.; Lage, Kasper; Raychaudhuri, Soumya; Xavier, Ramnik J.; Tatar, Diana; Benita, Yair

    2011-01-01

    Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein–protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in

  9. Systems-level analysis of risk genes reveals the modular nature of schizophrenia.

    Science.gov (United States)

    Liu, Jiewei; Li, Ming; Luo, Xiong-Jian; Su, Bing

    2018-05-19

    Schizophrenia (SCZ) is a complex mental disorder with high heritability. Genetic studies (especially recent genome-wide association studies) have identified many risk genes for schizophrenia. However, the physical interactions among the proteins encoded by schizophrenia risk genes remain elusive and it is not known whether the identified risk genes converge on common molecular networks or pathways. Here we systematically investigated the network characteristics of schizophrenia risk genes using the high-confidence protein-protein interactions (PPI) from the human interactome. We found that schizophrenia risk genes encode a densely interconnected PPI network (P = 4.15 × 10⁻³¹). Compared with the background genes, the schizophrenia risk genes in the interactome have significantly higher degree (P = 5.39 × 10⁻¹¹), closeness centrality (P = 7.56 × 10⁻¹¹), betweenness centrality (P = 1.29 × 10⁻¹¹), clustering coefficient (P = 2.22 × 10⁻²), and shorter average shortest path length (P = 7.56 × 10⁻¹¹). Based on the densely interconnected PPI network, we identified 48 hub genes and 4 modules formed by highly interconnected schizophrenia genes. We showed that the proteins encoded by schizophrenia hub genes have significantly more direct physical interactions. Gene ontology (GO) analysis revealed that cell adhesion, cell cycle, immune system response, and GABR-receptor complex categories were enriched in the modules formed by highly interconnected schizophrenia risk genes. Our study reveals that schizophrenia risk genes encode a densely interconnected molecular network and demonstrates the modular nature of schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
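
Some of the network measures named in this abstract (degree, clustering coefficient, shortest path length) can be illustrated with a short sketch. It uses the standard textbook definitions on an unweighted, undirected graph; the helper names and the toy edge list are assumptions for the example and are not taken from the study.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected graph (node -> set of neighbours), ignoring self-loops."""
    graph = {}
    for u, v in edges:
        if u == v:
            continue
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of direct interaction partners of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbours that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


# Toy interaction network: a triangle A-B-C with a pendant node D on C.
toy = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
```

In a comparison such as the one described, values like these would be computed for risk genes and for background genes and then compared statistically; that testing step is not shown here.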

  10. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R-based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram, genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  11. Variant Exported Blood-Stage Proteins Encoded by Plasmodium Multigene Families Are Expressed in Liver Stages Where They Are Exported into the Parasitophorous Vacuole.

    Directory of Open Access Journals (Sweden)

    Aurélie Fougère

    2016-11-01

    Many variant proteins encoded by Plasmodium-specific multigene families are exported into red blood cells (RBC). P. falciparum-specific variant proteins encoded by the var, stevor and rifin multigene families are exported onto the surface of infected red blood cells (iRBC) and mediate interactions between iRBC and host cells resulting in tissue sequestration and rosetting. However, the precise function of most other Plasmodium multigene families encoding exported proteins is unknown. To understand the role of RBC-exported proteins of rodent malaria parasites (RMP) we analysed the expression and cellular location by fluorescent-tagging of members of the pir, fam-a and fam-b multigene families. Furthermore, we performed phylogenetic analyses of the fam-a and fam-b multigene families, which indicate that both families have a history of functional differentiation unique to RMP. We demonstrate for all three families that expression of family members in iRBC is not mutually exclusive. Most tagged proteins were transported into the iRBC cytoplasm but not onto the iRBC plasma membrane, indicating that they are unlikely to play a direct role in iRBC-host cell interactions. Unexpectedly, most family members are also expressed during the liver stage, where they are transported into the parasitophorous vacuole. This suggests that these protein families promote parasite development in both the liver and blood, either by supporting parasite development within hepatocytes and erythrocytes and/or by manipulating the host immune response. Indeed, in the case of Fam-A, which have a steroidogenic acute regulatory-related lipid transfer (START) domain, we found that several family members can transfer phosphatidylcholine in vitro. These observations indicate that these proteins may transport (host) phosphatidylcholine for membrane synthesis. This is the first demonstration of a biological function of any exported variant protein family of rodent malaria parasites.

  12. Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology

    DEFF Research Database (Denmark)

    Rossin, Elizabeth J.; Hansen, Kasper Lage; Raychaudhuri, Soumya

    2011-01-01

    Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed...... in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more...... that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non...

  13. A Legionella pneumophila effector protein encoded in a region of genomic plasticity binds to Dot/Icm-modified vacuoles.

    Directory of Open Access Journals (Sweden)

    Shira Ninio

    2009-01-01

    Legionella pneumophila is an opportunistic pathogen that can cause a severe pneumonia called Legionnaires' disease. In the environment, L. pneumophila is found in fresh water reservoirs in a large spectrum of environmental conditions, where the bacteria are able to replicate within a variety of protozoan hosts. To survive within eukaryotic cells, L. pneumophila requires a type IV secretion system, designated Dot/Icm, that delivers bacterial effector proteins into the host cell cytoplasm. In recent years, a number of Dot/Icm substrate proteins have been identified; however, the function of most of these proteins remains unknown, and it is unclear why the bacterium maintains such a large repertoire of effectors to promote its survival. Here we investigate a region of the L. pneumophila chromosome that displays a high degree of plasticity among four sequenced L. pneumophila strains. Analysis of GC content suggests that several genes encoded in this region were acquired through horizontal gene transfer. Protein translocation studies establish that this region of genomic plasticity encodes multiple Dot/Icm effectors. Ectopic expression studies in mammalian cells indicate that one of these substrates, a protein called PieA, has unique effector activities. PieA is an effector that can alter lysosome morphology and associates specifically with vacuoles that support L. pneumophila replication. It was determined that the association of PieA with vacuoles containing L. pneumophila requires modifications to the vacuole mediated by other Dot/Icm effectors. Thus, the localization properties of PieA reveal that the Dot/Icm system has the ability to spatially and temporally control the association of an effector with vacuoles containing L. pneumophila through activities mediated by other effector proteins.

  14. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

    Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  15. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection...... is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...

  16. Ranking candidate disease genes from gene expression and protein interaction: a Katz-centrality based approach.

    Directory of Open Access Journals (Sweden)

    Jing Zhao

    Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data on gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions: that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, cooccurrence and shared pathophysiology between different disorders.
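
The Katz-centrality idea named in this record's title can be sketched with the generic fixed-point iteration x = β + α·A·x. This is the textbook form only and is not claimed to match the authors' exact formulation. The function name, the toy network, and the idea of supplying a per-gene `beta` (for instance a differential-expression score) are assumptions for illustration.

```python
def katz_centrality(adjacency, alpha=0.1, beta=None, tol=1e-10, max_iter=1000):
    """Iterate x = beta + alpha * A x until the update is smaller than tol.

    adjacency: dict mapping node -> dict of neighbour -> weight (symmetric).
    alpha: attenuation factor; must be below 1 / (largest eigenvalue of A)
           for the iteration to converge.
    beta: optional dict of per-node baseline scores (hypothetical example:
          a differential-expression score); defaults to 1.0 for every node.
    """
    nodes = list(adjacency)
    if beta is None:
        beta = {n: 1.0 for n in nodes}
    x = {n: 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {
            n: beta.get(n, 0.0)
            + alpha * sum(w * x[m] for m, w in adjacency[n].items())
            for n in nodes
        }
        diff = sum(abs(nxt[n] - x[n]) for n in nodes)
        x = nxt
        if diff < tol:
            return x
    raise RuntimeError("Katz iteration did not converge; try a smaller alpha")


# Toy star network: hub H interacts with A, B and C (illustrative only).
star = {
    "H": {"A": 1.0, "B": 1.0, "C": 1.0},
    "A": {"H": 1.0},
    "B": {"H": 1.0},
    "C": {"H": 1.0},
}
katz_scores = katz_centrality(star, alpha=0.1)
```

With a uniform baseline the hub scores highest because it collects attenuated contributions from all three neighbours; a non-uniform `beta` would let expression evidence shift the ranking.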

  17. Isolation and characterization of a copalyl diphosphate synthase gene promoter from Salvia miltiorrhiza

    Directory of Open Access Journals (Sweden)

    Piotr Szymczyk

    2016-09-01

    The promoter, 5' UTR, and 34-nt 5' fragments of the protein-encoding region of the Salvia miltiorrhiza copalyl diphosphate synthase gene were cloned and characterized. No tandem repeats, miRNA binding sites, or CpNpG islands were observed in the promoter, 5' UTR, or protein-encoding fragments. The entire isolated promoter and 5' UTR is 2235 bp long and contains repetitions of many cis-active elements, recognized by homologous transcription factors, found in Arabidopsis thaliana and other plant species. A pyrimidine-rich fragment with only 6 non-pyrimidine bases was localized in the 33-nt stretch from nt 2185 to 2217 in the 5' UTR. The observed cis-active sequences are potential binding sites for trans-factors that could regulate spatio-temporal CPS gene expression in response to biotic and abiotic stress conditions. The results obtained were initially verified by in silico and co-expression studies based on A. thaliana microarray data. The quantitative RT-PCR analysis confirmed that the entire 2269-bp copalyl diphosphate synthase gene fragment has promoter activity. Quantitative RT-PCR analysis was used to study changes in CPS promoter activity occurring in response to the application of four selected biotic and abiotic regulatory factors: auxin, gibberellin, salicylic acid, and high-salt concentration.

  18. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    2017-09-01

    Full Text Available Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
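
    The abstract does not specify which diffusion scheme was used, so the following is only a generic illustration of network diffusion: a random walk with restart that spreads scores from seed (risk) genes over an undirected network. The function name `diffuse`, the `restart` probability, and the toy network are assumptions.

```python
def diffuse(adjacency, seeds, restart=0.3, iterations=50):
    """Random walk with restart on an undirected graph.

    adjacency: dict node -> list of neighbor nodes (every neighbor must
               also appear as a key)
    seeds:     set of nodes where the walk restarts (e.g. known risk genes)
    Returns a dict node -> diffusion score; scores sum to 1.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(iterations):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # Isolated node: its mass has nowhere to go, so it stays put.
                nxt[n] += (1 - restart) * score[n]
                continue
            share = (1 - restart) * score[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        score = nxt
    return score


# Toy path network S - A - B - C with a single seed gene S.
toy = {"S": ["A"], "A": ["S", "B"], "B": ["A", "C"], "C": ["B"]}
result = diffuse(toy, {"S"})
```

    Genes closer to the seed receive higher scores, which is the property that diffusion-based prioritization exploits when several risk gene lists are used as seeds.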

  19. EasyClone: method for iterative chromosomal integration of multiple genes in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Jensen, Niels Bjerg; Strucko, Tomas; Kildegaard, Kanchana Rueksomtawin

    2014-01-01

    of multiple genes with an option of recycling selection markers. The vectors combine the advantage of efficient uracil excision reaction-based cloning and Cre-LoxP-mediated marker recycling system. The episomal and integrative vector sets were tested by inserting genes encoding cyan, yellow, and red...... fluorescent proteins into separate vectors and analyzing for co-expression of proteins by flow cytometry. Cells expressing genes encoding for the three fluorescent proteins from three integrations exhibited a much higher level of simultaneous expression than cells producing fluorescent proteins encoded...... on episomal plasmids, where correspondingly 95% and 6% of the cells were within a fluorescence interval of Log10 mean ± 15% for all three colors. We demonstrate that selective markers can be simultaneously removed using Cre-mediated recombination and all the integrated heterologous genes remain...

  20. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
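
    The idea that two genes' rates of evolution "covary" can be made concrete as a correlation between their per-branch rates across a shared species tree. The sketch below shows only that correlation step; how the rates are estimated and normalized is not described in the abstract, and the function name `rate_covariation` and the input vectors are hypothetical.

```python
from math import sqrt


def rate_covariation(rates_a, rates_b):
    """Pearson correlation between two genes' per-branch evolutionary rates.

    rates_a, rates_b: equal-length lists, one (already normalized) rate per
    branch of the same species tree. Neither list may be constant.
    """
    n = len(rates_a)
    mean_a = sum(rates_a) / n
    mean_b = sum(rates_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rates_a, rates_b))
    sd_a = sqrt(sum((a - mean_a) ** 2 for a in rates_a))
    sd_b = sqrt(sum((b - mean_b) ** 2 for b in rates_b))
    return cov / (sd_a * sd_b)


# Hypothetical rates on four branches: genes 1 and 2 speed up together,
# gene 3 slows down where gene 1 speeds up.
gene1 = [1.0, 2.0, 3.0, 4.0]
gene2 = [2.1, 3.9, 6.2, 8.0]
gene3 = [4.0, 3.0, 2.0, 1.0]
```

    A value near +1 for a gene pair would count as a strong covariation signal; candidate genes could then be ranked by their average correlation with the known disease genes.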

  1. Heterogeneous nuclear ribonucleoproteins H, H', and F are members of a ubiquitously expressed subfamily of related but distinct proteins encoded by genes mapping to different chromosomes

    DEFF Research Database (Denmark)

    Honoré, B; Rasmussen, H H; Vorum, H

    1995-01-01

    Molecular cDNA cloning, two-dimensional gel immunoblotting, and amino acid microsequencing identified three sequence-unique and distinct proteins that constitute a subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins corresponding to hnRNPs H, H', and F. These proteins share...... epitopes and sequence identity with two other proteins, isoelectric focusing sample spot numbers 2222 (37.6 kDa; pI 6.5) and 2326 (39.5 kDa; pI 6.6), indicating that the subfamily may contain additional members. The identity between hnRNPs H and H' is 96%, between H and F 78%, and between H' and F 75......%, respectively. The three proteins contain three repeats, which we denote quasi-RRMs (qRRMs) since they have a remote similarity to the RNA recognition motif (RRM). The three qRRMs of hnRNP H, with a few additional NH2-terminal amino acids, were constructed by polymerase chain reaction amplification and used...

  2. Identification of the major structural and nonstructural proteins encoded by human parvovirus B19 and mapping of their genes by procaryotic expression of isolated genomic fragments

    Energy Technology Data Exchange (ETDEWEB)

    Cotmore, S.F.; McKie, V.C.; Anderson, L.J.; Astell, C.R.; Tattersall, P.

    1986-11-01

    Plasma from a child with homozygous sickle-cell disease, sampled during the early phase of an aplastic crisis, contained human parvovirus B19 virions. Plasma taken 10 days later (during the convalescent phase) contained both immunoglobulin M and immunoglobulin G antibodies directed against two viral polypeptides with apparent molecular weights of 83,000 and 58,000 which were present exclusively in the particulate fraction of the plasma taken during the acute phase. These two protein species comigrated at 110S on neutral sucrose velocity gradients with the B19 viral DNA and thus appear to constitute the viral capsid polypeptides. The B19 genome was molecularly cloned into a bacterial plasmid vector. Two expression constructs containing B19 sequences from different halves of the viral genome were obtained, which directed the synthesis, in bacteria, of segments of virally encoded protein. These polypeptide fragments were then purified and used to immunize rabbits. Antibodies against a protein sequence specified between nucleotides 2897 and 3749 recognized both the 83- and 58-kilodalton capsid polypeptides in aplastic plasma taken during the acute phase and detected similar proteins in the tissues of a stillborn fetus which had been infected transplacentally with B19. Antibodies against a protein sequence encoded in the other half of the B19 genome (nucleotides 1072 through 2044) did not react specifically with any protein in plasma taken during the acute phase but recognized three nonstructural polypeptides of 71, 63, and 52 kilodaltons present in the liver and, at lower levels, in some other tissues of the transplacentally infected fetus.

  3. DNA binding sites recognised in vitro by a knotted class 1 homeodomain protein encoded by the hooded gene, k, in barley (Hordeum vulgare)

    DEFF Research Database (Denmark)

    Krusell, L; Rasmussen, I; Gausing, K

    1997-01-01

    of knotted1 from maize was isolated from barley seedlings and expressed as a maltose binding protein fusion in E. coli. The purified HvH21-fusion protein selected DNA fragments with 1-3 copies of the sequence TGAC. Gel shift experiments showed that the TGAC element was required for binding and the results...

  4. Identification of the major structural and nonstructural proteins encoded by human parvovirus B19 and mapping of their genes by procaryotic expression of isolated genomic fragments

    International Nuclear Information System (INIS)

    Cotmore, S.F.; McKie, V.C.; Anderson, L.J.; Astell, C.R.; Tattersall, P.

    1986-01-01

    Plasma from a child with homozygous sickle-cell disease, sampled during the early phase of an aplastic crisis, contained human parvovirus B19 virions. Plasma taken 10 days later (during the convalescent phase) contained both immunoglobulin M and immunoglobulin G antibodies directed against two viral polypeptides with apparent molecular weights of 83,000 and 58,000 which were present exclusively in the particulate fraction of the plasma taken during the acute phase. These two protein species comigrated at 110S on neutral sucrose velocity gradients with the B19 viral DNA and thus appear to constitute the viral capsid polypeptides. The B19 genome was molecularly cloned into a bacterial plasmid vector. Two expression constructs containing B19 sequences from different halves of the viral genome were obtained, which directed the synthesis, in bacteria, of segments of virally encoded protein. These polypeptide fragments were then purified and used to immunize rabbits. Antibodies against a protein sequence specified between nucleotides 2897 and 3749 recognized both the 83- and 58-kilodalton capsid polypeptides in aplastic plasma taken during the acute phase and detected similar proteins in the tissues of a stillborn fetus which had been infected transplacentally with B19. Antibodies against a protein sequence encoded in the other half of the B19 genome (nucleotides 1072 through 2044) did not react specifically with any protein in plasma taken during the acute phase but recognized three nonstructural polypeptides of 71, 63, and 52 kilodaltons present in the liver and, at lower levels, in some other tissues of the transplacentally infected fetus.

  5. Transposases are the most abundant, most ubiquitous genes in nature.

    Science.gov (United States)

    Aziz, Ramy K; Breitbart, Mya; Edwards, Robert A

    2010-07-01

    Genes, like organisms, struggle for existence, and the most successful genes persist and widely disseminate in nature. The unbiased determination of the most successful genes requires access to sequence data from a wide range of phylogenetic taxa and ecosystems, which has finally become achievable thanks to the deluge of genomic and metagenomic sequences. Here, we analyzed 10 million protein-encoding genes and gene tags in sequenced bacterial, archaeal, eukaryotic and viral genomes and metagenomes, and our analysis demonstrates that genes encoding transposases are the most prevalent genes in nature. The finding that these genes, classically considered as selfish genes, outnumber essential or housekeeping genes suggests that they offer selective advantage to the genomes and ecosystems they inhabit, a hypothesis in agreement with an emerging body of literature. Their mobile nature not only promotes dissemination of transposable elements within and between genomes but also leads to mutations and rearrangements that can accelerate biological diversification and--consequently--evolution. By securing their own replication and dissemination, transposases guarantee to thrive so long as nucleic acid-based life forms exist.

  6. RNAi-based silencing of genes encoding the vacuolar-ATPase ...

    African Journals Online (AJOL)

    RNAi-based silencing of genes encoding the vacuolar-ATPase subunits a and c in pink bollworm (Pectinophora gossypiella). Ahmed M. A. Mohammed. Abstract. RNA interference is a post-transcriptional gene regulation mechanism that is predominantly found in eukaryotic organisms. RNAi demonstrated a successful ...

  7. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Full Text Available Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion: Our R-based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  8. Hepatitis B virus DNA polymerase gene polymorphism based ...

    African Journals Online (AJOL)

    Hepatitis B virus DNA polymerase gene polymorphism based prediction of genotypes in chronic HBV patients from Western India. Yashwant G. Chavan, Sharad R. Pawar, Minal Wani, Amol D. Raut, Rabindra N. Misra ...

  9. Evaluation of Gene-Based Family-Based Methods to Detect Novel Genes Associated With Familial Late Onset Alzheimer Disease

    Directory of Open Access Journals (Sweden)

    Maria V. Fernández

    2018-04-01

    Full Text Available Gene-based tests to study the combined effect of rare variants on a particular phenotype have been widely developed for case-control studies, but their evolution and adaptation for family-based studies, especially studies of complex incomplete families, has been slower. In this study, we have performed a practical examination of all the latest gene-based methods available for family-based study designs using both simulated and real datasets. We examined the performance of several collapsing, variance-component, and transmission disequilibrium tests across eight different software packages and 22 models utilizing a cohort of 285 families (N = 1,235) with late-onset Alzheimer disease (LOAD). After a thorough examination of each of these tests, we propose a methodological approach to identify, with high confidence, genes associated with the tested phenotype and we provide recommendations to select the best software and model for family-based gene-based analyses. Additionally, in our dataset, we identified PTK2B, a GWAS candidate gene for sporadic AD, along with six novel genes (CHRD, CLCN2, HDLBP, CPAMD8, NLRP9, and MAS1L) as candidate genes for familial LOAD.

  10. A Nonlinear Model for Gene-Based Gene-Environment Interaction

    Directory of Open Access Journals (Sweden)

    Jian Sa

    2016-06-01

    Full Text Available A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over the single-variant-based approach, in this work, we proposed a sparse principal component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene, then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene in which their effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model is readily interpretable, since the sparsity of the principal component loadings indicates the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset in a Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a system approach to evaluate gene-based G×E interaction.

  11. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Full Text Available Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
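
    The "two-fold", "five-fold" and "twenty-fold" figures refer to how strongly a ranked list concentrates disease genes near its top. The abstract does not give the exact definition used for PROSPECTR, so the sketch below assumes one common reading: the disease-gene fraction among the top-ranked genes divided by the fraction in the whole list. The function name `fold_enrichment` and the `top_k` cutoff are hypothetical.

```python
def fold_enrichment(ranked_genes, disease_genes, top_k):
    """Fraction of disease genes among the top_k ranked genes, divided by
    the fraction of disease genes in the full ranked list."""
    top = ranked_genes[:top_k]
    hits_top = sum(1 for g in top if g in disease_genes)
    hits_all = sum(1 for g in ranked_genes if g in disease_genes)
    if hits_all == 0:
        raise ValueError("no disease genes in the ranked list")
    return (hits_top / len(top)) / (hits_all / len(ranked_genes))


# Toy interval of 20 candidate genes, two of which are true disease genes.
ranked = [f"g{i}" for i in range(20)]
known = {"g0", "g10"}
enrichment = fold_enrichment(ranked, known, top_k=2)
```

    Here one of the two top-ranked genes is a disease gene (50%) against a background rate of 2 in 20 (10%), which corresponds to five-fold enrichment.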

  12. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Full Text Available Marbling is an important trait in characterizing beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate gene associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  13. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    AlShahrani, Mona; Hoehndorf, Robert

    2018-01-01

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  14. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    Alshahrani, Mona

    2018-04-30

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  15. Link-based quantitative methods to identify differentially coexpressed genes and gene Pairs

    Directory of Open Access Journals (Sweden)

    Ye Zhi-Qiang

    2011-08-01

    Full Text Available Background: Differential coexpression analysis (DCEA) is increasingly used for investigating the global transcriptional mechanisms underlying phenotypic changes. Current DCEA methods mostly adopt a gene connectivity-based strategy to estimate differential coexpression, which is characterized by comparing the numbers of gene neighbors in different coexpression networks. Although it simplifies the calculation, this strategy mixes up the identities of different coexpression neighbors of a gene, and fails to differentiate significant differential coexpression changes from trivial ones. In particular, correlation reversal is easily missed although it probably indicates remarkable biological significance. Results: We developed two link-based quantitative methods, DCp and DCe, to identify differentially coexpressed genes and gene pairs (links). Because they uniquely exploit the quantitative coexpression change of each gene pair in the coexpression networks, both methods proved to be superior to currently popular methods in simulation studies. Re-mining of a publicly available type 2 diabetes (T2D) expression dataset from the perspective of differential coexpression analysis led to discoveries beyond those from differential expression analysis. Conclusions: This work pointed out the critical weakness of current popular DCEA methods, and proposed two link-based DCEA algorithms that will contribute to the development of DCEA and help extend it to a broader spectrum.
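
    The contrast between counting neighbors and tracking each link can be illustrated by computing, for every gene pair, how its correlation changes between two conditions and whether its sign flips. This is a simplified illustration of a link-based view, not the DCp or DCe statistics themselves, whose definitions are not given in the abstract; the function names and the toy expression values are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def link_changes(expr_a, expr_b):
    """For each gene pair, report the absolute change in correlation between
    condition A and condition B, and whether the correlation changed sign.

    expr_a, expr_b: dict gene -> list of expression values per sample.
    """
    genes = sorted(set(expr_a) & set(expr_b))
    changes = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r_a = pearson(expr_a[g], expr_a[h])
            r_b = pearson(expr_b[g], expr_b[h])
            changes[(g, h)] = {
                "delta": abs(r_a - r_b),
                "reversed": r_a * r_b < 0,
            }
    return changes


# Toy data: the g1-g2 link flips from positive to negative correlation,
# while the g1-g3 link is identical in both conditions.
cond_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
cond_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [1, 3, 2, 4]}
links = link_changes(cond_a, cond_b)
```

    A pure neighbor count could rate g1 the same in both conditions if it keeps the same number of strongly correlated partners, whereas the per-link view flags the g1-g2 reversal directly.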

  16. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases, the computational costs to the general family-free case are the same, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expression powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.
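
    For the simplest model mentioned in this abstract, where both genomes contain the same genes in exactly one copy each, a similarity measure based on conserved adjacencies can be sketched in a few lines. This covers only that restricted case on linear, unsigned gene orders; it is not the gene connections model introduced in the article, and the function names are hypothetical.

```python
def adjacencies(genome):
    """Set of unordered neighboring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def conserved_adjacencies(genome1, genome2):
    """Number of gene adjacencies present in both genomes."""
    return len(adjacencies(genome1) & adjacencies(genome2))


# Toy genomes over the same five genes; genes c and d are swapped in the second.
first = ["a", "b", "c", "d", "e"]
second = ["a", "b", "d", "c", "e"]
shared = conserved_adjacencies(first, second)
```

    The two toy genomes share the adjacencies a-b and c-d (orientation is ignored), so the similarity is 2 out of a possible 4.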

  17. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Full Text Available Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: 1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, 2) identification of co-expressed genes for creation of potential metagenes, and 3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  18. PCR-based detection of gene transfer vectors: application to gene doping surveillance.

    Science.gov (United States)

    Perez, Irene C; Le Guiner, Caroline; Ni, Weiyi; Lyles, Jennifer; Moullier, Philippe; Snyder, Richard O

    2013-12-01

    Athletes who illicitly use drugs to enhance their athletic performance are at risk of being banned from sports competitions. Consequently, some athletes may seek new doping methods that they expect to be capable of circumventing detection. With advances in gene transfer vector design and therapeutic gene transfer, and demonstrations of safety and therapeutic benefit in humans, there is an increased probability of the pursuit of gene doping by athletes. In anticipation of the potential for gene doping, assays have been established to directly detect complementary DNA of genes that are top candidates for use in doping, as well as vector control elements. The development of molecular assays that are capable of exposing gene doping in sports can serve as a deterrent and may also identify athletes who have illicitly used gene transfer for performance enhancement. PCR-based methods to detect foreign DNA with high reliability, sensitivity, and specificity include TaqMan real-time PCR, nested PCR, and internal threshold control PCR.

  19. Translational control and differential RNA decay are key elements regulating postsegregational expression of the killer protein encoded by the parB locus of plasmid R1

    DEFF Research Database (Denmark)

    Gerdes, K; Helin, K; Christensen, O W

    1988-01-01

    The parB locus of plasmid R1, which mediates plasmid stability via postsegregational killing of plasmid-free cells, encodes two genes, hok and sok. The hok gene product is a potent cell-killing protein. The hok gene is regulated at the translational level by the sok gene-encoded repressor, a small...

  20. Safety evaluation of the phosphinothricin acetyltransferase proteins encoded by the pat and bar sequences that confer tolerance to glufosinate-ammonium herbicide in transgenic plants.

    Science.gov (United States)

    Hérouet, Corinne; Esdaile, David J; Mallyon, Bryan A; Debruyne, Eric; Schulz, Arno; Currier, Thomas; Hendrickx, Koen; van der Klis, Robert-Jan; Rouan, Dominique

    2005-03-01

    Transgenic plant varieties, which are tolerant to glufosinate-ammonium, were developed. The herbicide tolerance is based upon the presence of either the bar or the pat gene, which encode for two homologous phosphinothricin acetyltransferases (PAT), in the plant genome. Based on both a review of published literature and experimental studies, the safety assessment reviews the first step of a two-step-approach for the evaluation of the safety of the proteins expressed in plants. It can be used to support the safety of food or feed products derived from any crop that contains and expresses these PAT proteins. The safety evaluation supports the conclusion that the genes and the donor microorganisms (Streptomyces) are innocuous. The PAT enzymes are highly specific and do not possess the characteristics associated with food toxins or allergens, i.e., they have no sequence homology with any known allergens or toxins, they have no N-glycosylation sites, they are rapidly degraded in gastric and intestinal fluids, and they are devoid of adverse effects in mice after intravenous administration at a high dose level. In conclusion, there is a reasonable certainty of no harm resulting from the inclusion of the PAT proteins in human food or in animal feed.

  1. Finding gene regulatory network candidates using the gene expression knowledge base.

    Science.gov (United States)

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  2. Model-based gene set analysis for Bioconductor.

    Science.gov (United States)

    Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

    2011-07-01

    Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA) that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. Contact: peter.robinson@charite.de; julien.gagneur@embl.de.
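
    The single-category baseline mentioned in this abstract, Fisher's exact test for over-representation, can be sketched as a hypergeometric tail probability. This is only an illustration of the baseline test, not of the MGSA algorithm or the mgsa package API; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, category, study, overlap):
    """One-sided Fisher's exact test for over-representation.

    population: number of genes in the background
    category:   number of background genes annotated to the category
    study:      number of genes in the study set
    overlap:    number of study genes annotated to the category
    Returns P(X >= overlap) under the hypergeometric null.
    """
    upper = min(category, study)
    total = comb(population, study)
    tail = sum(
        comb(category, k) * comb(population - category, study - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p = enrichment_p_value(population=20000, category=200, study=500, overlap=15)
print(f"p = {p:.3g}")
```

    Because each category is tested on its own, overlapping categories that share the same genes tend to pass the threshold together, which is the redundancy the abstract describes.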

  3. Identification and cloning of two insecticidal protein genes from ...

    African Journals Online (AJOL)

    Bacillus thuringiensis (Bt) is the most widely applied type of microbial pesticide due to its high specificity and environmental safety. The activity of Bt is largely attributed to the insecticidal crystal protein encoded by the cry genes. Different insecticidal crystal proteins of Bt have different bioactivity against distinct agricultural ...

  4. A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes.

    Directory of Open Access Journals (Sweden)

    Samuel Sunghwan Cho

    Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method simultaneously identifies genes that are both DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes that are both DEGs and PAGs than other methods. Simulation studies showed that they have more power than other methods

  5. A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes

    Science.gov (United States)

    Seo, Minseok; Shin, Su-kyung; Kwon, Eun-Young; Kim, Sung-Eun; Bae, Yun-Jung; Lee, Seungyeoun; Sung, Mi-Kyung; Choi, Myung-Sook; Park, Taesung

    2016-01-01

    Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method simultaneously identifies genes that are both DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes that are both DEGs and PAGs than other methods. Simulation studies showed that they have more power than other methods. Through analysis of
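
    The two baseline strategies described in this abstract (the naïve intersection and the hierarchical two-step selection) can be contrasted with a small sketch. It assumes that per-gene p-values are already available and that a Bonferroni correction is applied at each step; the correction method, the function names, and the toy p-values are assumptions made for illustration and are not taken from the paper. The proposed model-based method itself is not shown.

```python
def significant(p_values, alpha):
    """Genes whose p-value passes a Bonferroni-corrected threshold (assumed correction)."""
    if not p_values:
        return set()
    cutoff = alpha / len(p_values)
    return {gene for gene, p in p_values.items() if p < cutoff}


def naive_joint(p_de, p_pa, alpha=0.05):
    """Detect DEGs and PAGs separately over all genes, then intersect the two lists."""
    return significant(p_de, alpha) & significant(p_pa, alpha)


def hierarchical_joint(p_de, p_pa, alpha=0.05):
    """Detect DEGs first, then test phenotype association only among the DEGs."""
    degs = significant(p_de, alpha)
    return significant({gene: p_pa[gene] for gene in degs}, alpha)


# Toy p-values for four hypothetical genes.
p_de = {"g1": 0.001, "g2": 0.002, "g3": 0.5, "g4": 0.8}
p_pa = {"g1": 0.001, "g2": 0.02, "g3": 0.001, "g4": 0.9}

print(sorted(naive_joint(p_de, p_pa)))         # ['g1']
print(sorted(hierarchical_joint(p_de, p_pa)))  # ['g1', 'g2']
```

    In this toy case the hierarchical approach recovers one more gene because the second test is corrected over two DEGs rather than over all four genes.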

  6. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Background: The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods: In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results: We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions: Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable
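
    The AUC values quoted in this abstract can be read as the probability that a randomly chosen true drug-gene pair receives a higher score than a randomly chosen false one. A minimal sketch of that pairwise definition follows; the function name, scores, and labels are hypothetical, and the abstract does not say how the authors computed their AUC.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Ties between a positive and a negative score count as half a win.
    Requires at least one positive and one negative label.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical prioritization scores; 1 marks a curated (true) pair.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

    An AUC of 0.5 corresponds to random ranking, which gives a reference point for the 65.2% and 55.2% figures above.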

  7. Detection of Gene Interactions Based on Syntactic Relations

    Directory of Open Access Journals (Sweden)

    Mi-Young Kim

    2008-01-01

    Interactions between proteins and genes are considered essential in the description of biomolecular phenomena, and networks of interactions are applied in a systems biology approach. Recently, many studies have sought to extract information from biomolecular text using natural language processing technology. Previous studies have asserted that linguistic information is useful for improving the detection of gene interactions. In particular, syntactic relations among linguistic information are good for detecting gene interactions. However, previous systems give reasonably good precision but poor recall. To improve recall without sacrificing precision, this paper proposes a three-phase method for detecting gene interactions based on syntactic relations. In the first phase, we retrieve syntactic encapsulation categories for each candidate agent and target. In the second phase, we construct a verb list that indicates the nature of the interaction between pairs of genes. In the last phase, we determine direction rules to detect which of two genes is the agent or target. Even without biomolecular knowledge, our method performs reasonably well using a small training dataset. While the first phase contributes to improving recall, the second and third phases contribute to improving precision. In the experimental results using ICML 05 Workshop on Learning Language in Logic (LLL05) data, our proposed method gave an F-measure of 67.2% for the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance.
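
    The trade-off discussed in this abstract is summarized by the F-measure, the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant; the abstract does not state which variant was used, and the input values are hypothetical rather than the paper's results.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: high precision cannot compensate for poor recall.
print(round(f_measure(0.90, 0.30), 3))
print(round(f_measure(0.70, 0.65), 3))
```

    Because the harmonic mean is dominated by the smaller of the two values, raising recall without losing precision is what moves the score.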

  8. Construction of coffee transcriptome networks based on gene annotation semantics

    Directory of Open Access Journals (Sweden)

    Castillo Luis F.

    2012-12-01

    Gene annotation is a process that encompasses multiple approaches to the analysis of nucleic acid or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism's genome, the construction and visualization of gene networks pose novel challenges for understanding complex expression patterns and generating new knowledge in genomics research. In order to take advantage of the text data accumulated after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by the Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained from the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created, in which detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool opens the possibility of visualizing complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathway analysis.

  9. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on Model-View-Control architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  10. Analysis of regulatory networks constructed based on gene ...

    Indian Academy of Sciences (India)

    2013-12-09

    Dec 9, 2013 ... early diagnosis of complex diseases or cancer without obvious symptoms. [Gong J., Diao B., Yao G. J., ... expression levels of thousands of genes in a specific cell or tissue. Previous ..... base of the brain. It mainly controls the ...

  11. Identifying and Analyzing Novel Epilepsy-Related Genes Using Random Walk with Restart Algorithm

    Directory of Open Access Journals (Sweden)

    Wei Guo

    2017-01-01

    As a pathological condition, epilepsy is caused by abnormal neuronal discharge in the brain, which temporarily disrupts cerebral functions. Epilepsy is a chronic disease that occurs at all ages and can seriously affect patients’ personal lives. Thus, there is a strong need to develop effective medicines or instruments to treat the disease. Identifying epilepsy-related genes is essential in order to understand and treat the disease because the proteins encoded by the epilepsy-related genes are candidate drug targets. In this study, a computational workflow was proposed to predict novel epilepsy-related genes using the random walk with restart (RWR) algorithm. As reported in the literature, the RWR algorithm often produces a number of false positive genes; in this study, a permutation test and functional association tests were therefore implemented to filter the genes identified by the RWR algorithm, which greatly reduced the number of suspected genes and resulted in only thirty-three novel epilepsy genes. Finally, these novel genes were analyzed in light of recently published literature. Our findings indicate that all novel genes were closely related to epilepsy. It is believed that the proposed workflow can also be applied to identify genes related to other diseases and deepen our understanding of the mechanisms of these diseases.
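
    The random walk with restart step named in this abstract can be sketched on a small undirected network. The sketch assumes the common formulation in which, at each step, the walker returns to the seed genes with a fixed restart probability and otherwise moves to a uniformly chosen neighbor; the restart value, the toy network, and all names are hypothetical, and the permutation and functional association filters from the abstract are not shown.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a random walk that restarts at the seeds.

    adjacency: dict mapping each node to a list of its neighbors
               (every neighbor must also appear as a key)
    seeds:     set of seed nodes (for example, known disease genes)
    restart:   probability of jumping back to a seed at each step (assumed value)
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbors = adjacency[u]
            if not neighbors:
                # An isolated node keeps its walking mass in place.
                nxt[u] += (1 - restart) * p[u]
                continue
            share = (1 - restart) * p[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path network a - b - c with a single seed gene "a".
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = random_walk_with_restart(network, {"a"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

    Non-seed genes with high steady-state probability are the candidates that a workflow of this kind would pass on to further filtering.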

  12. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.

  13. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Science.gov (United States)

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas earlier studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed the gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
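
    Conditional mutual information, the quantity this abstract relies on, can be estimated from discretized data by counting. The sketch below computes I(X; Y | Z) in bits with a plug-in estimator; reading X as the expression state of gene A, Y as the phenotype, and Z as the expression state of gene B is an assumption about how the method might be set up, and the abstract does not give the authors' exact estimator or thresholds.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits from three aligned discrete sequences."""
    n = len(xs)
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    return sum(
        (count / n) * log2(count * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
        for (x, y, z), count in c_xyz.items()
    )


# Toy binarized data: the phenotype equals gene A XOR gene B, so gene A is
# informative about the phenotype only once gene B is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [a ^ b for a, b in zip(gene_a, gene_b)]
print(conditional_mutual_information(gene_a, phenotype, gene_b))  # 1.0
```

    The XOR example is the extreme case in which gene A carries no information about the phenotype on its own but one full bit once gene B is taken into account.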

  14. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Science.gov (United States)

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.

  15. Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

    Science.gov (United States)

    Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2004-02-01

    To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.

  16. Minimal gene selection for classification and diagnosis prediction based on gene expression profile

    Directory of Open Access Journals (Sweden)

    Alireza Mehridehnavi

    2013-01-01

    Conclusion: We have shown that using the two most significant genes, ranked by their S/N ratios, together with a suitable selection of training samples, can classify DLBCL patients with fairly good results. With the aid of these methods we could compensate for the small number of patients, improve classification accuracy, and reduce computational complexity and therefore running time.

  17. Cloning of human basic A1, a distinct 59-kDa dystrophin-associated protein encoded on chromosome 8q23-24

    Energy Technology Data Exchange (ETDEWEB)

    Ahn, A.H. [Harvard Medical School, Boston, MA (United States)]; Yoshida, Mikiharu; Hagiwara, Yasuko; Ozawa, Eijiro [National Institute of Neuroscience, Ogawa Higashi, Kodaira (Japan)]; Anderson, M.S.; Feener, C.A.; Selig, S. [Howard Hughes Medical Institute at Children's Hospital, Boston, MA (United States)]; Kunkel, L.M. [Harvard Medical School, Boston, MA (United States); Howard Hughes Medical Institute at Children's Hospital, Boston, MA (United States)]

    1994-05-10

    Duchenne and Becker muscular dystrophies are caused by defects of dystrophin, which forms a part of the membrane cytoskeleton of specialized cells such as muscle. It has been previously shown that the dystrophin-associated protein A1 (59-kDa DAP) is actually a heterogeneous group of phosphorylated proteins consisting of an acidic (α-A1) and a distinct basic (β-A1) component. Partial peptide sequence of the A1 complex purified from rabbit muscle permitted the design of oligonucleotide probes that were used to isolate a cDNA for one human isoform of A1. This cDNA encodes a basic A1 isoform that is distinct from the recently described syntrophins in Torpedo and mouse and is expressed in many tissues with at least five distinct mRNA species of 5.9, 4.8, 4.3, 3.1, and 1.5 kb. A comparison of the human cDNA sequence with the GenBank expressed sequence tag (EST) database has identified a relative from human skeletal muscle, EST25263, which is probably a human homologue of the published mouse syntrophin 2. The authors have mapped the human basic component of A1 and EST25263 genes to chromosomes 8q23-24 and 16, respectively.

  18. Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1999-01-01

    Base sequence analysis of the CCKAR gene (the gene for the A-type cholecystokinin receptor) from the OLETF rat, a model rat for insulin-independent diabetes, was carried out based on the base sequence of the wild-type CCKAR gene, which had been clarified in the previous year. DNA was extracted from the pancreas of the OLETF rat, fragmented, and transduced into λ phage to construct an OLETF gene library. Then, a λ phage DNA clone that bound labelled cDNA of the CCKAR gene was analyzed and its gene structure was compared with that of the wild-type gene. It was demonstrated that the CCKAR gene of OLETF had a deletion (6800 b.p.) ranging from the promoter region to exon 2, suggesting that the CCKAR gene is not functional in the OLETF rat. The whole sequence of this mutant gene was registered in the Japan DNA Bank (D 50610). Then, F2 offspring rats were obtained by crossing OLETF (female) and F344 (male), and the time-course changes in the blood glucose level after glucose loading were compared among them. The blood glucose level after glucose loading was significantly higher in the homo-mutant F2 (CCKAR,-/-) and in the parent OLETF rat than in the hetero-mutant F2 (CCKAR,-/+) or the wild-type rat (CCKAR,+/+). This suggests that the CCKAR gene might be involved in the control of blood glucose level and that an alteration of the expression level or the functions of the CCKAR gene might affect the blood glucose level. (M.N.)

  19. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
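
    The mutual nearest neighbor idea in this abstract can be sketched as follows: each gene ranks all other genes by similarity, and an edge is kept only when two genes appear in each other's top-k lists. The sketch uses Pearson correlation as the similarity measure and a small k, both of which are assumptions; the clique-finding and cluster-merging steps of NNN are not shown, and all names and profiles are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b) if sd_a and sd_b else 0.0


def mutual_nearest_neighbors(profiles, k=1):
    """Edges between genes that are in each other's k most similar genes."""
    genes = list(profiles)
    top = {}
    for g in genes:
        ranked = sorted(
            (h for h in genes if h != g),
            key=lambda h: pearson(profiles[g], profiles[h]),
            reverse=True,
        )
        top[g] = set(ranked[:k])
    return {frozenset((g, h)) for g in genes for h in top[g] if g in top[h]}


# Toy profiles: two co-expressed pairs and one gene (g5) with no close partner.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 2.5],
    "g5": [1, 3, 1, 3],
}
for edge in sorted(sorted(e) for e in mutual_nearest_neighbors(profiles)):
    print(edge)
```

    Gene g5 has a nearest neighbor of its own, but no gene ranks g5 first in return, so it stays without edges, which mirrors the abstract's point that genes without sufficiently similar partners can remain unclustered.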

  20. Comparison of lists of genes based on functional profiles

    Directory of Open Access Journals (Sweden)

    Salicrú Miquel

    2011-10-01

    Background: How to compare studies on the basis of their biological significance is a problem of central importance in high-throughput genomics. Many methods for performing such comparisons are based on the information in databases of functional annotation, such as those that form the Gene Ontology (GO). Typically, they consist of analyzing gene annotation frequencies in some pre-specified GO classes, in a class-by-class way, followed by p-value adjustment for multiple testing. Enrichment analysis, where a list of genes is compared against a wider universe of genes, is the most common example. Results: A new global testing procedure and a method incorporating it are presented. Instead of testing separately for each GO class, a single global test for all classes under consideration is performed. The test is based on the distance between the functional profiles, defined as the joint frequencies of annotation in a given set of GO classes. These classes may be chosen at one or more GO levels. The new global test is more powerful and accurate with respect to type I errors than the usual class-by-class approach. When applied to some real datasets, the results suggest that the method may also provide useful information that complements the tests performed using a class-by-class approach if gene counts are sparse in some classes. An R library, goProfiles, implements these methods and is available from Bioconductor, http://bioconductor.org/packages/release/bioc/html/goProfiles.html. Conclusions: The method provides an inferential basis for deciding whether two lists are functionally different. For global comparisons it is preferable to the global chi-square test of homogeneity. Furthermore, it may provide additional information if used in conjunction with class-by-class methods.
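
    The notion of a distance between functional profiles can be illustrated with a simplified sketch. It represents each gene list by the fraction of its genes annotated to each class and compares two lists by squared Euclidean distance. This uses per-class (marginal) frequencies rather than the joint frequencies mentioned in the abstract, omits the statistical test entirely, and does not use the goProfiles API; the class labels, gene names, and function names are hypothetical.

```python
def functional_profile(gene_annotations, classes):
    """Fraction of genes in the list annotated to each class, in the order given."""
    n = len(gene_annotations)
    return [
        sum(1 for terms in gene_annotations.values() if c in terms) / n
        for c in classes
    ]


def squared_euclidean(p, q):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical annotation classes and two small gene lists.
classes = ["classA", "classB"]
list_1 = {"g1": {"classA"}, "g2": {"classA", "classB"}}
list_2 = {"g3": {"classB"}, "g4": {"classB"}}

profile_1 = functional_profile(list_1, classes)
profile_2 = functional_profile(list_2, classes)
print(profile_1, profile_2, squared_euclidean(profile_1, profile_2))
# [1.0, 0.5] [0.0, 1.0] 1.25
```

    A single distance over all classes is what allows one global test in place of one test per class.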

  1. Global Regulatory Differences for Gene- and Cell-Based Therapies

    DEFF Research Database (Denmark)

    Coppens, Delphi G M; De Bruin, Marie L; Leufkens, Hubert G M

    2017-01-01

    Gene- and cell-based therapies (GCTs) offer potential new treatment options for unmet medical needs. However, the use of conventional regulatory requirements for medicinal products to approve GCTs may impede patient access and therapeutic innovation. Furthermore, requirements differ between...... jurisdictions, complicating the global regulatory landscape. We provide a comparative overview of regulatory requirements for GCT approval in five jurisdictions and hypothesize on the consequences of the observed global differences on patient access and therapeutic innovation....

  2. Canonical correlation analysis for gene-based pleiotropy discovery.

    Directory of Open Access Journals (Sweden)

    Jose A Seoane

    2014-10-01

    Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.

  3. Sequence-based model of gap gene regulatory network.

    Science.gov (United States)

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
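
    The site regulatory weight described above can be sketched as follows. The abstract does not say how the difference is normalized, so dividing by the full-model residual sum of squares is an assumption made here for illustration; the function names and the toy numbers are likewise hypothetical.

```python
def rss(observed, predicted):
    """Residual sum of squares between observed and predicted expression values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def regulatory_weight(observed, pred_all_sites, pred_without_site):
    """Relative increase in RSS when one binding site is excluded from the model.

    Assumption: the difference is normalized by the RSS of the full model,
    which must therefore be non-zero.
    """
    rss_all = rss(observed, pred_all_sites)
    rss_excluded = rss(observed, pred_without_site)
    return (rss_excluded - rss_all) / rss_all


# Toy numbers: the model fits well with all sites and worse without the site.
observed = [1.0, 2.0, 3.0]
weight = regulatory_weight(observed, [1.1, 1.9, 3.0], [1.5, 2.5, 3.5])
```

    Under this normalization a weight near zero means that removing the site barely changes the fit, while a large positive weight marks a site the model depends on.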

  4. A Fisheye Viewer for microarray-based gene expression data.

    Science.gov (United States)

    Wu, Min; Thao, Cheng; Mu, Xiangming; Munson, Ethan V

    2006-10-13

    Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface--an electronic table (E-table) that uses fisheye distortion technology. The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.

  5. A fisheye viewer for microarray-based gene expression data

    Directory of Open Access Journals (Sweden)

    Munson Ethan V

    2006-10-01

    Background: Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface, an electronic table (E-table) that uses fisheye distortion technology. Results: The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. Conclusion: This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.

  6. A modular positive feedback-based gene amplifier

    Directory of Open Access Journals (Sweden)

    Bhalerao Kaustubh D

    2010-02-01

    Background: Positive feedback is a common mechanism used in the regulation of many gene circuits as it can amplify the response to inducers and also generate binary outputs and hysteresis. In the context of electrical circuit design, positive feedback is often considered in the design of amplifiers. Similar approaches, therefore, may be used for the design of amplifiers in synthetic gene circuits with applications, for example, in cell-based sensors. Results: We developed a modular positive feedback circuit that can function as a genetic signal amplifier, heightening the sensitivity to inducer signals as well as increasing maximum expression levels without the need for an external cofactor. The design utilizes a constitutively active, autoinducer-independent variant of the quorum-sensing regulator LuxR. We experimentally tested the ability of the positive feedback module to separately amplify the output of a one-component tetracycline sensor and a two-component aspartate sensor. In each case, the positive feedback module amplified the response to the respective inducers, both with regard to the dynamic range and sensitivity. Conclusions: The advantage of our design is that the actual feedback mechanism depends only on a single gene and does not require any other modulation. Furthermore, this circuit can amplify any transcriptional signal, not just one encoded within the circuit or tuned by an external inducer. As our design is modular, it can potentially be used as a component in the design of more complex synthetic gene circuits.

  7. Molecular typing of Staphylococcus aureus based on coagulase gene.

    Science.gov (United States)

    Javid, Faizan; Taku, Anil; Bhat, Mohd Altaf; Badroo, Gulzar Ahmad; Mudasir, Mir; Sofi, Tanveer Ahmad

    2018-04-01

    This study examined the coagulase gene-based genetic diversity of Staphylococcus aureus isolated from different samples of cattle, using restriction fragment length polymorphism (RFLP) and sequence-based phylogenetic analysis. A total of 192 different samples from mastitic milk, nasal cavity, and pus from skin wounds of cattle from Military Dairy Farm, Jammu, India, were screened for the presence of S. aureus. The presumptive isolates were confirmed by nuc gene-based polymerase chain reaction (PCR). The confirmed S. aureus isolates were subjected to coagulase (coa) gene PCR. The different coa genotypes observed were subjected to RFLP using the restriction enzymes HaeIII and AluI to obtain the different restriction patterns. One isolate from each restriction pattern was sequenced. These sequences were aligned for maximum homology using the BioEdit software, and similarity in the sequences was inferred with the help of a sequence identity matrix. Of 192 different samples, 39 (20.31%) isolates of S. aureus were confirmed by targeting the nuc gene using PCR. Of 39 S. aureus isolates, 25 (64.10%) carried the coa gene. Four different genotypes of the coa gene, i.e., 514 bp, 595 bp, 757 bp, and 802 bp, were obtained. Two coa genotypes, 595 bp (15 isolates) and 802 bp (4 isolates), were observed in mastitic milk. The 514 bp (2 isolates) and 757 bp (4 isolates) coa genotypes were observed from nasal cavity and pus from skin wounds, respectively. On RFLP using both restriction enzymes, four different restriction patterns P1, P2, P3, and P4 were observed. On sequencing, four different sequences having unique restriction patterns were obtained. The most similar sequences, with an identity value of 0.810, were those of isolates S. aureus 514 (nasal cavity) and S. aureus 595 (mastitic milk), which are therefore the most closely related, whereas the most distant sequences, with a value of 0.483, were those of the S. aureus 514 and S. aureus 802 isolates. The study, being localized
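
    The sequence identity matrix mentioned in the record above can be illustrated with a short sketch. The definition used here (identical positions divided by aligned columns, skipping columns where both sequences have a gap) is an assumption and may differ from what BioEdit computes; the sequences are invented placeholders, not data from the study.

```python
def pairwise_identity(a, b):
    """Fraction of aligned columns in which two equal-length sequences agree.

    Columns where both sequences carry a gap ('-') are ignored; this
    convention is an assumption, not necessarily the one used by BioEdit.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    return sum(1 for x, y in columns if x == y) / len(columns)


def identity_matrix(aligned):
    """Symmetric identity matrix as a nested dict keyed by sequence name."""
    return {p: {q: pairwise_identity(aligned[p], aligned[q]) for q in aligned}
            for p in aligned}


# Hypothetical aligned fragments, for illustration only.
example = {"iso1": "ACGT", "iso2": "ACGA", "iso3": "AC-T"}
matrix = identity_matrix(example)
```

    The pair with the highest off-diagonal value would be reported as the most closely related, and the pair with the lowest value as the most distant.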

  8. Information dimension analysis of bacterial essential and nonessential genes based on chaos game representation

    International Nuclear Information System (INIS)

    Zhou, Qian; Yu, Yong-ming

    2014-01-01

    Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of the essential genes. Selecting features associated with gene essentiality is fundamental to predicting essential genes with computational techniques. We use fractal theory to make a comparative analysis of essential and nonessential genes in bacteria. The information dimensions of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations (CGRs). It is found that a weak positive linear correlation exists between information dimension and gene length. Moreover, for genes of similar length, the average information dimension of essential genes is larger than that of nonessential genes. This indicates that essential genes show less regularity and higher complexity than nonessential genes. Our results show that for bacteria with a similar number of essential genes and nonessential genes, the CGR information dimension is helpful for the classification of essential genes and nonessential genes. Therefore, the gene CGR information dimension is very probably a useful gene feature for a genetic algorithm predicting essential genes. (paper)
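
    As a rough illustration of the approach above, the sketch below builds a chaos game representation of a DNA sequence and estimates an information dimension from box-occupancy entropies. The corner assignment is one common convention, and the grid scales and least-squares slope estimate are assumptions; the abstract does not specify the authors' exact procedure.

```python
import math
from collections import Counter

# One common corner convention for CGR on the unit square (an assumption here).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Start at the centre and move halfway towards each base's corner in turn."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def box_entropy(points, k):
    """Shannon entropy (in nats) of point occupancy on a 2^k x 2^k grid."""
    n = 2 ** k
    counts = Counter((min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points)
    total = len(points)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def information_dimension(seq, ks=(1, 2, 3, 4)):
    """Least-squares slope of box entropy against log(1/eps), with eps = 2^-k."""
    points = cgr_points(seq)
    xs = [k * math.log(2) for k in ks]
    ys = [box_entropy(points, k) for k in ks]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
```

    A sequence that visits all short subwords about equally often fills the square evenly and gives a value close to 2, while strongly repetitive sequences give lower values, which matches the abstract's reading of the dimension as a measure of complexity.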

  9. Development of gene diagnosis for diabetes and cholecystis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1998-01-01

    The gene structures of the CCK type A receptor (CCKAR) in human, rat and mouse were investigated to clarify whether aberrations of the gene are involved in the incidence of diabetes and cholecystitis. In fiscal year 1997, the normal structure of the gene and its accurate base sequence were analyzed using DNA fragments that bound to 32P-labelled human CCKAR cDNA, obtained from a leucocyte gene library. This gene contained about 2.2 x 10^5 base pairs, and the base sequence was completely determined and registered in the DNA Data Bank of Japan (D85606). In addition, the genome structures and base sequences of mouse and rat CCKAR were analyzed and registered (D85605 and D50608, respectively). The differences in the base sequence of CCKAR among the species were found in the promoter region and the intron regions, suggesting that there might be differences in splicing among species. (M.N.)

  10. Design of Knowledge Bases for Plant Gene Regulatory Networks.

    Science.gov (United States)

    Mukundi, Eric; Gomez-Cano, Fabio; Ouma, Wilberforce Zachary; Grotewold, Erich

    2017-01-01

    Developing a knowledge base that contains all the information necessary for the researcher studying gene regulation in a particular organism can be accomplished in four stages. This begins with defining the data scope. We describe here the necessary information and resources, and outline the methods for obtaining data. The second stage consists of designing the schema, which involves defining the entire arrangement of the database in a systematic plan. The third stage is the implementation, defined by actualization of the database by using software according to a predefined schema. The final stage is development, where the database is made available to users in a web-accessible system. The result is a knowledgebase that integrates all the information pertaining to gene regulation, and which is easily expandable and transferable.

  11. Gene ontology based transfer learning for protein subcellular localization

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-02-01

    Background: Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology. Results: In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-infinite linear programming. The five kernels are then linearly merged into one single kernel for

  12. Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

    Directory of Open Access Journals (Sweden)

    Xiaobo Guo

    Nonlinear dependence is common in the regulatory mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally efficient in detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms, CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated data sets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference.
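
    For readers unfamiliar with the distance correlation used above, here is a minimal pure-Python sketch of the standard sample statistic for two one-dimensional variables (for example, the expression profiles of two genes). The function names are hypothetical, and the CLR-DC, MRNET-DC and REL-DC network inference steps built on top of it are not shown.

```python
import math


def _centered_distances(v):
    """Double-centred matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a * a for row in A for a in row) / n ** 2
    dvar_y = sum(b * b for row in B for b in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # at least one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


# A symmetric quadratic relationship: Pearson correlation is zero here,
# but the distance correlation is clearly positive.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
dc_quadratic = distance_correlation(x, [v * v for v in x])
```

    The quadratic example shows why the measure suits nonlinear regulatory relationships: a purely linear correlation would miss this dependence entirely.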

  13. A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer

    Directory of Open Access Journals (Sweden)

    Mary Qu Yang

    Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatically mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes, and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatically mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways. Keywords: ccRCC, Causative mutation, Pathways, Protein-protein interaction, Gene module, eQTL

  14. A comprehensive family-based replication study of schizophrenia genes

    DEFF Research Database (Denmark)

    Aberg, Karolina A; Liu, Youfang; Bukszár, Jozsef

    2013-01-01

     768 control subjects from 6 databases and, after quality control 6298 individuals (including 3286 cases) from 1811 nuclear families. MAIN OUTCOMES AND MEASURES Case-control status for SCZ. RESULTS Replication results showed a highly significant enrichment of SNPs with small P values. Of the SNPs...... in an independent family-based replication study that, after quality control, consisted of 8107 SNPs. SETTING Linkage meta-analysis, brain transcriptome meta-analysis, candidate gene database, OMIM, relevant mouse studies, and expression quantitative trait locus databases. PATIENTS We included 11 185 cases and 10...

  15. Genetic and molecular analyses of Escherichia coli N-acetylneuraminate lyase gene.

    OpenAIRE

    Kawakami, B; Kudo, T; Narahashi, Y; Horikoshi, K

    1986-01-01

    Two plasmids containing the N-acetylneuraminate lyase (NALase) gene (nanA) of Escherichia coli, pNL1 and pNL4, were constructed. Immunoprecipitation analysis indicated that the 35,000-dalton protein encoded in pNL4 was NALase. The synthesis of NALase in E. coli carrying these plasmids was constitutive.

  16. Similarity and functional analyses of expressed parasitism genes in Heterodera schachtii and Heterodera glycines

    Science.gov (United States)

    The secreted proteins encoded by “parasitism genes” expressed within the esophageal gland cells of cyst nematodes play important roles in plant parasitism. Homologous transcripts and encoded proteins of the Heterodera glycines pioneer parasitism genes Hgsyv46, Hg4e02 and Hg5d08 were identified and ...

  17. MvaT Family Proteins Encoded on IncP-7 Plasmid pCAR1 and the Host Chromosome Regulate the Host Transcriptome Cooperatively but Differently.

    Science.gov (United States)

    Yun, Choong-Soo; Takahashi, Yurika; Shintani, Masaki; Takeda, Toshiharu; Suzuki-Minakuchi, Chiho; Okada, Kazunori; Yamane, Hisakazu; Nojiri, Hideaki

    2016-02-01

    MvaT proteins are members of the H-NS family of proteins in pseudomonads. The IncP-7 conjugative plasmid pCAR1 carries an mvaT-homologous gene, pmr. In Pseudomonas putida KT2440 bearing pCAR1, pmr and the chromosomally carried homologous genes, turA and turB, are transcribed at high levels, and Pmr interacts with TurA and TurB in vitro. In the present study, we clarified how the three MvaT proteins regulate the transcriptome of P. putida KT2440(pCAR1). Analyses performed by a modified chromatin immunoprecipitation assay with microarray technology (ChIP-chip) suggested that the binding regions of Pmr, TurA, and TurB in the P. putida KT2440(pCAR1) genome are almost identical; nevertheless, transcriptomic analyses using mutants with deletions of the genes encoding the MvaT proteins during the log and early stationary growth phases clearly suggested that their regulons were different. Indeed, significant regulon dissimilarity was found between Pmr and the other two proteins. Transcription of a larger number of genes was affected by Pmr deletion during early stationary phase than during log phase, suggesting that Pmr ameliorates the effects of pCAR1 on host fitness more effectively during the early stationary phase. Alternatively, the similarity of the TurA and TurB regulons implied that they might play complementary roles as global transcriptional regulators in response to plasmid carriage. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  18. Gene-based Association Approach Identify Genes Across Stress Traits in Fruit Flies

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Edwards, Stefan McKinnon; Sarup, Pernille Merete

    Identification of genes explaining variation in quantitative traits or genetic risk factors of human diseases requires not only good phenotypic and genotypic data but also efficient statistical methods. Genome-wide association studies may reveal association between phenotypic variation and variation...... approach grouping variants according to gene position, thus lowering the number of statistical tests performed and increasing the probability of identifying genes with small to moderate effects. Using this approach we identify numerous genes associated with different types of stresses in Drosophila...... melanogaster, but also identify common genes that affect the stress traits....

  19. Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis.

    Directory of Open Access Journals (Sweden)

    Hong Zhu

    Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interaction analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected 'highly verified' genes were measured by ELISA among our in-house RA cases and controls. A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 'overlapped', 76 'European-specific' and 74 'Asian-specific' genes. Among them, 105 genes had significant differential expression between RA patients and healthy controls in at least one dataset, especially 20 genes, including 11 'overlapped' (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 'European-specific' (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 'Asian-specific' (RNASET2, HFE, BTN2A2, MAPK13) genes, whose differential expression was significant in at least three datasets. The protein expression of two selected genes, FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02), in plasma was significantly different in our in-house samples. Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes on RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and detected a long list of overlapped or ethnic specific RA

  20. Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

    Science.gov (United States)

    Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

    2016-06-01

    Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a singular framework to enhance interpretation and hypothesis generation. We propose a workflow that merges network construction with gene expression data mining, focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in the analysis of actual expression datasets related to lung, breast and prostate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. The Nance-Horan syndrome protein encodes a functional WAVE homology domain (WHD) and is important for co-ordinating actin remodelling and maintaining cell morphology.

    Science.gov (United States)

    Brooks, Simon P; Coccia, Margherita; Tang, Hao R; Kanuga, Naheed; Machesky, Laura M; Bailly, Maryse; Cheetham, Michael E; Hardcastle, Alison J

    2010-06-15

    Nance-Horan syndrome (NHS) is an X-linked developmental disorder, characterized by bilateral congenital cataracts, dental anomalies, facial dysmorphism and mental retardation. Null mutations in a novel gene, NHS, cause the syndrome. The NHS gene appears to have multiple isoforms as a result of alternative transcription, but a cellular function for the NHS protein has yet to be defined. We describe NHS as a founder member of a new protein family (NHS, NHSL1 and NHSL2). Here, we demonstrate that NHS is a novel regulator of actin remodelling and cell morphology. NHS localizes to sites of cell-cell contact, the leading edge of lamellipodia and focal adhesions. The N-terminus of isoforms NHS-A and NHS-1A, implicated in the pathogenesis of NHS, have a functional WAVE homology domain that interacts with the Abi protein family, haematopoietic stem/progenitor cell protein 300 (HSPC300), Nap1 and Sra1. NHS knockdown resulted in the disruption of the actin cytoskeleton. We show that NHS controls cell morphology by maintaining the integrity of the circumferential actin ring and controlling lamellipod formation. NHS knockdown led to a striking increase in cell spreading. Conversely, ectopic overexpression of NHS inhibited lamellipod formation. Remodelling of the actin cytoskeleton and localized actin polymerization into branched actin filaments at the plasma membrane are essential for mediating changes in cell shape, migration and cell contact. Our data identify NHS as a new regulator of actin remodelling. We suggest that NHS orchestrates actin regulatory protein function in response to signalling events during development.

  2. The Nance–Horan syndrome protein encodes a functional WAVE homology domain (WHD) and is important for co-ordinating actin remodelling and maintaining cell morphology

    Science.gov (United States)

    Brooks, Simon P.; Coccia, Margherita; Tang, Hao R.; Kanuga, Naheed; Machesky, Laura M.; Bailly, Maryse; Cheetham, Michael E.; Hardcastle, Alison J.

    2010-01-01

    Nance–Horan syndrome (NHS) is an X-linked developmental disorder, characterized by bilateral congenital cataracts, dental anomalies, facial dysmorphism and mental retardation. Null mutations in a novel gene, NHS, cause the syndrome. The NHS gene appears to have multiple isoforms as a result of alternative transcription, but a cellular function for the NHS protein has yet to be defined. We describe NHS as a founder member of a new protein family (NHS, NHSL1 and NHSL2). Here, we demonstrate that NHS is a novel regulator of actin remodelling and cell morphology. NHS localizes to sites of cell–cell contact, the leading edge of lamellipodia and focal adhesions. The N-terminus of isoforms NHS-A and NHS-1A, implicated in the pathogenesis of NHS, have a functional WAVE homology domain that interacts with the Abi protein family, haematopoietic stem/progenitor cell protein 300 (HSPC300), Nap1 and Sra1. NHS knockdown resulted in the disruption of the actin cytoskeleton. We show that NHS controls cell morphology by maintaining the integrity of the circumferential actin ring and controlling lamellipod formation. NHS knockdown led to a striking increase in cell spreading. Conversely, ectopic overexpression of NHS inhibited lamellipod formation. Remodelling of the actin cytoskeleton and localized actin polymerization into branched actin filaments at the plasma membrane are essential for mediating changes in cell shape, migration and cell contact. Our data identify NHS as a new regulator of actin remodelling. We suggest that NHS orchestrates actin regulatory protein function in response to signalling events during development. PMID:20332100

  3. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
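
    Of the prioritization algorithms named above, random walk with restart is the simplest to sketch. The version below is a generic textbook formulation rather than the authors' code: the adjacency format, the function name and the restart probability of 0.3 are illustrative assumptions, and every neighbour is assumed to appear as a key of the adjacency dictionary.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by their steady-state visiting probability from a seed set.

    adj   -- dict mapping node -> dict of neighbour -> edge weight
             (list both directions for an undirected network)
    seeds -- set of nodes already annotated to the disease
    """
    nodes = list(adj)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    out_weight = {v: sum(adj[v].values()) for v in nodes}
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                continue  # isolated node: its probability mass is not propagated
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network a - b - c with a single seed gene "a".
toy = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"b": 1.0}}
scores = random_walk_with_restart(toy, {"a"})
```

    Genes are then ranked by score, so candidates that are well connected to known disease genes rise to the top; in a weighted integration of several networks, the edge weights would presumably be combined before the walk is run.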

  4. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo

    2014-01-01

    Objective: In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods: We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results: The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions: Network integration is necessary to boost the performances of gene prioritization methods. Moreover, the methods based on kernelized score functions can further
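
    The integration and random-walk-with-restart steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restart probability of 0.3 and the plain (weighted) averaging of edge weights are assumptions made for illustration, and the kernelized score functions are not covered. Networks are assumed to be symmetric and to list every node as a key.

```python
def integrate(networks, weights=None):
    """Combine networks edge by edge as a (weighted) average of edge weights.

    networks : list of dicts, node -> {neighbour: edge weight}
    weights  : one weight per network; equal weights give unweighted integration
    """
    if weights is None:
        weights = [1.0] * len(networks)
    total = float(sum(weights))
    combined = {}
    for net, w in zip(networks, weights):
        for u, nbrs in net.items():
            row = combined.setdefault(u, {})
            for v, x in nbrs.items():
                row[v] = row.get(v, 0.0) + w * x / total
    return combined


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score each node by its steady-state visiting probability.

    adj   : symmetric network, node -> {neighbour: edge weight}
    seeds : genes already annotated to the disease (the restart set)
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    # Weighted degree of each node, used to normalise outgoing transitions.
    degree = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Probability mass flowing into n from its neighbours.
            inflow = sum(p[m] * w / degree[m]
                         for m, w in adj[n].items() if degree[m] > 0)
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p
```

    Candidate genes would then be ranked by descending score among the genes not in the seed set. On a four-gene path A-B-C-D seeded at A, the scores decrease with distance from A.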

  5. Statistics on gene-based laser speckles with a small number of scatterers: implications for the detection of polymorphism in the Chlamydia trachomatis omp1 gene

    Science.gov (United States)

    Ulyanov, Sergey S.; Ulianova, Onega V.; Zaytsev, Sergey S.; Saltykov, Yury V.; Feodorova, Valentina A.

    2018-04-01

    The transformation mechanism for a nucleotide sequence of the Chlamydia trachomatis gene into a speckle pattern has been considered. The first and second-order statistics of gene-based speckles have been analyzed. It has been demonstrated that gene-based speckles do not obey Gaussian statistics and belong to the class of speckles with a small number of scatterers. It has been shown that gene polymorphism can be easily detected through analysis of the statistical characteristics of gene-based speckles.

  6. Optimization of conditions for gene delivery system based on PEI

    Directory of Open Access Journals (Sweden)

    Roya Cheraghi

    2017-01-01

    Full Text Available Objective(s): PEI-based nanoparticles (NPs), owing to their dual capabilities of proton-sponge buffering and DNA binding, are known as a powerful tool for nucleic acid delivery to cells. However, serious cytotoxicity and the complicated conditions that govern NP properties and their interactions with cells have in practice hindered the achievement of high transfection efficiency. Here, we have tried to optimize the properties of PEI/firefly luciferase plasmid complexes and the cellular conditions to improve transfection efficiency. Materials and Methods: For this purpose, firefly luciferase, as a robust reporter gene, was complexed with PEI to prepare NPs of different size and charge. The physicochemical properties of the nanoparticles were evaluated using agarose gel retardation and dynamic light scattering. MCF7 and BT474 cells at different confluency were also transfected with the prepared nanoparticles at various concentrations for short and long times. Results: Branched PEI can instantaneously bind to DNA and form cationic NPs. The results demonstrated the production of nanoparticles with sizes of about 100-500 nm, depending on the N/P ratio. Moreover, increasing the nanoparticle concentration on the cell surface drastically improved the transfection rate; the highest transfection efficiency was achieved at a concentration of 30 ng/µl. In addition, the maximum efficiency was obtained at a confluency of 40-60%. The results demonstrated that an N/P ratio of 12 establishes an optimized balance between transfection efficiency and cytotoxicity of PEI/plasmid nanoparticles. Increasing the N/P ratio of the NPs led to significant cytotoxicity. Conclusion: The results obtained verified the optimum conditions for PEI-based gene delivery in different cell lines.

  7. The NB-LRR gene Pm60 confers powdery mildew resistance in wheat.

    Science.gov (United States)

    Zou, Shenghao; Wang, Huan; Li, Yiwen; Kong, Zhaosheng; Tang, Dingzhong

    2018-04-01

    Powdery mildew is one of the most devastating diseases of wheat. To date, few powdery mildew resistance genes have been cloned from wheat due to the size and complexity of the wheat genome. Triticum urartu is the progenitor of the A genome of wheat and is an important source for powdery mildew resistance genes. Using molecular markers designed from scaffolds of the sequenced T. urartu accession and standard map-based cloning, a powdery mildew resistance locus was mapped to a 356-kb region, which contains two nucleotide-binding and leucine-rich repeat domain (NB-LRR) protein-encoding genes. Virus-induced gene silencing, single-cell transient expression, and stable transformation assays demonstrated that one of these two genes, designated Pm60, confers resistance to powdery mildew. Overexpression of full-length Pm60 and two allelic variants in Nicotiana benthamiana leaves induced hypersensitive cell death response, but expression of the coiled-coil domain alone was insufficient to induce hypersensitive response. Yeast two-hybrid, bimolecular fluorescence complementation and luciferase complementation imaging assays showed that Pm60 protein interacts with its neighboring NB-containing protein, suggesting that they might be functionally related. The identification and cloning of this novel wheat powdery mildew resistance gene will facilitate breeding for disease resistance in wheat. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  8. WebScipio: An online tool for the determination of gene structures using protein sequences

    Directory of Open Access Journals (Sweden)

    Waack Stephan

    2008-09-01

    Full Text Available Abstract Background: Obtaining the gene structure for a given protein-encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results: WebScipio, a web interface to the Scipio software, allows a user to obtain the corresponding coding sequence structure for a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human-readable formats, such as a schematic representation and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion: WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at http://www.webscipio.org.

  9. Density based pruning for identification of differentially expressed genes from microarray data

    Directory of Open Access Journals (Sweden)

    Xu Jia

    2010-11-01

    Full Text Available Abstract Motivation: Identification of differentially expressed genes from microarray datasets is one of the most important analyses for microarray data mining. Popular algorithms such as the statistical t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by considering other features of differentially expressed genes. Results: We proposed a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of the average difference of gene expression and the average expression level. A density-based pruning algorithm (DB Pruning) is developed to pick out potential differentially expressed genes, which are usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus (GEO) database with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as t-test, rank product, and fold change. Conclusions: Density-based pruning of non-differentially expressed genes is an effective method for enhancing statistical-testing-based algorithms for identifying differentially expressed genes. It improves t-test, rank product, and fold change by 11% to 50% in the numbers of identified true differentially expressed genes. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
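
    The abstract does not spell out the DB Pruning algorithm, so the following is only a generic density filter in the same spirit: each gene is mapped to the two features named above, and genes lying in crowded regions of that plane are pruned. The function names, the fixed radius and the neighbour threshold are hypothetical, and the two features are assumed to be on comparable scales.

```python
from math import hypot
from statistics import mean


def gene_features(case, control):
    """Return (average difference, average expression level) for one gene.

    case, control : lists of expression values in the two conditions
    """
    diff = mean(case) - mean(control)
    level = mean(case + control)
    return diff, level


def density_prune(points, radius, max_neighbours):
    """Keep genes whose neighbourhood in feature space is sparse.

    points : dict gene -> (average difference, average expression level)
    A gene is kept when at most max_neighbours other genes lie within radius.
    """
    kept = []
    for gene, (x, y) in points.items():
        neighbours = sum(
            1 for other, (u, v) in points.items()
            if other != gene and hypot(x - u, y - v) <= radius
        )
        if neighbours <= max_neighbours:
            kept.append(gene)
    return kept
```

    The surviving genes could then be passed to a t-test, rank product or fold-change ranking, which is how the abstract describes the pruning being combined with existing methods.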

  10. Identification of Constrained Cancer Driver Genes Based on Mutation Timing

    Science.gov (United States)

    Sakoparnig, Thomas; Fried, Patrick; Beerenwinkel, Niko

    2015-01-01

    Cancer drivers are genomic alterations that provide cells containing them with a selective advantage over their local competitors, whereas neutral passengers do not change the somatic fitness of cells. Cancer-driving mutations are usually discriminated from passenger mutations by their higher degree of recurrence in tumor samples. However, there is increasing evidence that many additional driver mutations may exist that occur at very low frequencies among tumors. This observation has prompted alternative methods for driver detection, including finding groups of mutually exclusive mutations and incorporating prior biological knowledge about gene function or network structure. Dependencies among drivers due to epistatic interactions can also result in low mutation frequencies, but this effect has been ignored in driver detection so far. Here, we present a new computational approach for identifying genomic alterations that occur at low frequencies because they depend on other events. Unlike passengers, these constrained mutations display punctuated patterns of occurrence in time. We test this driver–passenger discrimination approach based on mutation timing in extensive simulation studies, and we apply it to cross-sectional copy number alteration (CNA) data from ovarian cancer, CNA and single-nucleotide variant (SNV) data from breast tumors and SNV data from colorectal cancer. Among the top ranked predicted drivers, we find low-frequency genes that have already been shown to be involved in carcinogenesis, as well as many new candidate drivers. The mutation timing approach is orthogonal and complementary to existing driver prediction methods. It will help identifying from cancer genome data the alterations that drive tumor progression. PMID:25569148

  11. Gene-based interaction analysis shows GABAergic genes interacting with parenting in adolescent depressive symptoms

    NARCIS (Netherlands)

    Van Assche, Evelien; Moons, Tim; Cinar, Ozan; Viechtbauer, Wolfgang; Oldehinkel, Albertine J.; Van Leeuwen, Karla; Verschueren, Karine; Colpin, Hilde; Lambrechts, Diether; Van den Noortgate, Wim; Goossens, Luc; Claes, Stephan; van Winkel, Ruud

    2017-01-01

    BACKGROUND: Most gene-environment interaction studies (G × E) have focused on single candidate genes. This approach is criticized for its expectations of large effect sizes and occurrence of spurious results. We describe an approach that accounts for the polygenic nature of most psychiatric

  12. Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test.

    Science.gov (United States)

    Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph

    2013-11-07

    The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the
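
    One simple way to scale marker Z-statistics by their correlation matrix is sketched below. It is an assumption-laden illustration and not necessarily the summary-statistic test of the paper: it relies only on the fact that if the Z-statistics are jointly normal with zero mean and correlation matrix R under the null, their sum has variance equal to the sum of all entries of R. The function name is hypothetical.

```python
from math import erfc, sqrt


def ld_scaled_sum_test(z, corr):
    """Combine marker Z-statistics into one region-level statistic.

    z    : list of per-marker Z-statistics from univariate regressions
    corr : correlation (LD) matrix of the markers, as a list of lists
    Returns the scaled statistic and its two-sided normal p-value.
    """
    m = len(z)
    # Var(sum of Z) = 1' R 1, the sum of all entries of the correlation matrix.
    variance = sum(corr[i][j] for i in range(m) for j in range(m))
    t = sum(z) / sqrt(variance)
    p_value = erfc(abs(t) / sqrt(2.0))
    return t, p_value
```

    With two markers in perfect LD, both with Z = 2, the statistic stays at 2, so the duplicated signal is not counted twice; with two independent markers it rises to about 2.83. Only the Z-statistics and an estimate of the correlation structure are needed, not the original genotype data.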

  13. Screening key genes for abdominal aortic aneurysm based on gene expression omnibus dataset.

    Science.gov (United States)

    Wan, Li; Huang, Jingyong; Ni, Haizhen; Yu, Guanfeng

    2018-02-13

    Abdominal aortic aneurysm (AAA) is a common cardiovascular system disease with high mortality. The aim of this study was to identify potential genes for diagnosis and therapy in AAA. We searched and downloaded mRNA expression data from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) between AAA and normal individuals. Then, Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, a transcription factor (TF) network and a protein-protein interaction (PPI) network were used to explore the function of genes. Additionally, immunohistochemical (IHC) staining was used to validate the expression of identified genes. Finally, the diagnostic value of identified genes was assessed by receiver operating characteristic (ROC) analysis in the GEO database. A total of 1199 DEGs (188 up-regulated and 1011 down-regulated) were identified between AAA and normal individuals. KEGG pathway analysis showed that vascular smooth muscle contraction and pathways in cancer were significantly enriched signaling pathways. The top 10 up-regulated and top 10 down-regulated DEGs were used to construct TF and PPI networks. Some genes with high degrees such as NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16 and FOXO1 were identified to be related to AAA. The results of IHC staining showed that CCR7 and PDGFA were up-regulated in tissue samples of AAA. ROC analysis showed that NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA had potential diagnostic value for AAA. The identified genes including NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA might be involved in the pathology of AAA.

  14. The Integrative Method Based on the Module-Network for Identifying Driver Genes in Cancer Subtypes

    Directory of Open Access Journals (Sweden)

    Xinguo Lu

    2018-01-01

    Full Text Available With advances in next-generation sequencing (NGS) technologies, a large number of multiple types of high-throughput genomics data are available. A great challenge in exploring cancer progression is to identify the driver genes from the variant genes by analyzing and integrating multiple types of genomics data. Breast cancer is known as a heterogeneous disease. The identification of subtype-specific driver genes is critical to guide the diagnosis, assessment of prognosis and treatment of breast cancer. We developed an integrated framework based on gene expression profiles and copy number variation (CNV) data to identify breast cancer subtype-specific driver genes. In this framework, we employed a statistical machine-learning method to select gene subsets and utilized a module-network analysis method to identify potential candidate driver genes. The final subtype-specific driver genes were acquired by pairwise comparison between subtypes. To validate the specificity of the driver genes, the gene expression data of these genes were used to classify the patient samples with 10-fold cross-validation, and enrichment analysis was also conducted on the identified driver genes. The experimental results show that the proposed integrative method can identify the potential driver genes and that the classifier built with these genes achieved better performance than classifiers built with genes identified by other methods.

  15. GeneRecon Users' Manual — A coalescent based tool for fine-scale association mapping

    DEFF Research Database (Denmark)

    Mailund, T

    2006-01-01

    GeneRecon is a software package for linkage disequilibrium mapping using coalescent theory. It is based on Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. GeneRecon explicitly models the genealogy of a sample of th...

  16. A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design

    Science.gov (United States)

    Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

    2013-01-01

    For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods. PMID:23620809

  17. A hybrid network-based method for the detection of disease-related genes

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

    2018-02-01

    Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.

  18. RNAi-Based Identification of Gene-Specific Nuclear Cofactor Networks Regulating Interleukin-1 Target Genes

    Directory of Open Access Journals (Sweden)

    Johanna Meier-Soelch

    2018-04-01

    Full Text Available The potent proinflammatory cytokine interleukin (IL)-1 triggers gene expression through the NF-κB signaling pathway. Here, we investigated the cofactor requirements of strongly regulated IL-1 target genes whose expression is impaired in p65 NF-κB-deficient murine embryonic fibroblasts. By two independent small-hairpin (sh)RNA screens, we examined 170 genes annotated to encode nuclear cofactors for their role in Cxcl2 mRNA expression and identified 22 factors that modulated basal or IL-1-inducible Cxcl2 levels. The functions of 16 of these factors were validated for Cxcl2 and further analyzed for their role in regulation of 10 additional IL-1 target genes by RT-qPCR. These data reveal that each inducible gene has its own (quantitative) requirement of cofactors to maintain basal levels and to respond to IL-1. Twelve factors (Epc1, H2afz, Kdm2b, Kdm6a, Mbd3, Mta2, Phf21a, Ruvbl1, Sin3b, Suv420h1, Taf1, and Ube3a) have not been previously implicated in inflammatory cytokine functions. Bioinformatics analysis indicates that they are components of complex nuclear protein networks that regulate chromatin functions and gene transcription. Collectively, these data suggest that downstream from the essential NF-κB signal each cytokine-inducible target gene has further subtle requirements for individual sets of nuclear cofactors that shape its transcriptional activation profile.

  19. A new measure for functional similarity of gene products based on Gene Ontology

    Directory of Open Access Journals (Sweden)

    Lengauer Thomas

    2006-06-01

    Full Text Available Abstract Background: Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results: We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion: The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

  20. TXTGate: profiling gene groups with text-based information

    DEFF Research Database (Denmark)

    Glenisson, P.; Coessens, B.; Van Vooren, S.

    2004-01-01

    We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed towards the analysis of groups of genes. By means of tailored vocabularies, term-as well as gene-centric views are offered on selected textual...

  1. Targeted delivery of genes to endothelial cells and cell- and gene-based therapy in pulmonary vascular diseases.

    Science.gov (United States)

    Suen, Colin M; Mei, Shirley H J; Kugathasan, Lakshmi; Stewart, Duncan J

    2013-10-01

    Pulmonary arterial hypertension (PAH) is a devastating disease that, despite significant advances in medical therapies over the last several decades, continues to have an extremely poor prognosis. Gene therapy is a method to deliver therapeutic genes to replace defective or mutant genes or supplement existing cellular processes to modify disease. Over the last few decades, several viral and nonviral methods of gene therapy have been developed for preclinical PAH studies with varying degrees of efficacy. However, these gene delivery methods face challenges of immunogenicity, low transduction rates, and nonspecific targeting which have limited their translation to clinical studies. More recently, the emergence of regenerative approaches using stem and progenitor cells such as endothelial progenitor cells (EPCs) and mesenchymal stem cells (MSCs) have offered a new approach to gene therapy. Cell-based gene therapy is an approach that augments the therapeutic potential of EPCs and MSCs and may deliver on the promise of reversal of established PAH. These new regenerative approaches have shown tremendous potential in preclinical studies; however, large, rigorously designed clinical studies will be necessary to evaluate clinical efficacy and safety. © 2013 American Physiological Society. Compr Physiol 3:1749-1779, 2013.

  2. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

    Science.gov (United States)

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-03-01

    comprehensive gene data set of sex pheromone biosynthesis and degradation enzyme related genes in DBM created by genome- and transcriptome-wide identification, characterization and expression profiling. Our findings provide a basis to better understand the function of genes with tissue enriched expression. The results also provide information on the genes involved in sex pheromone biosynthesis and degradation, and may be useful to identify potential gene targets for pest control strategies by disrupting the insect-insect communication using pheromone-based behavioral antagonists.

  3. GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies.

    Science.gov (United States)

    Yung, Ling Sing; Yang, Can; Wan, Xiang; Yu, Weichuan

    2011-05-01

    Collecting millions of genetic variations is feasible with the advanced genotyping technology. With a huge amount of genetic variations data in hand, developing efficient algorithms to carry out the gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphic processing units (GPUs) are highly parallel hardware and provide massive computing resources. We are, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions. We implement the BOOST method based on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with Nvidia GeForce GTX 285 display card. GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.
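
    The Boolean-operation idea behind BOOST and GBOOST can be sketched on the CPU as follows, assuming it means storing each SNP as one bit vector per genotype and filling the joint genotype table with bitwise AND followed by a population count. This is an illustration only: the function names are hypothetical, the interaction test applied to the tables is omitted, and nothing here reflects the actual GPU kernels.

```python
def encode(genotypes):
    """Turn one SNP's genotype vector (values 0, 1, 2) into three bit masks.

    Bit i of masks[g] is set when sample i carries genotype g.
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def contingency(snp_a, snp_b, group_mask):
    """3x3 table of joint genotype counts among the samples in group_mask.

    group_mask has bit i set when sample i belongs to the group
    (for example cases or controls).
    """
    a, b = encode(snp_a), encode(snp_b)
    return [[popcount(a[i] & b[j] & group_mask) for j in range(3)]
            for i in range(3)]
```

    Because each table cell reduces to an AND and a bit count over machine words, the work for different SNP pairs is independent, which is the kind of workload that maps naturally onto the highly parallel hardware the abstract describes.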

  4. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  5. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available Abstract Background: It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplay and regulation of the molecular determinants that propel and characterize the progression of diverse life phenomena such as cell cycling, development, aging, and the progressive and recurrent pathogenesis of complex diseases. Large-scale, genome-wide time-resolved data are becoming increasingly available, which provides a golden opportunity to tackle the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results: In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox provides a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics of the cell cycles. Conclusion: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
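
    A delayed correlation of the kind described above can be sketched as the Pearson correlation between one gene's profile and another gene's profile shifted by a number of time points. The sketch assumes equally spaced time points; the function names are hypothetical, and the decision analysis that TdGRN applies to its time-delayed expression matrix is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def delayed_correlation(regulator, target, lag):
    """Correlate regulator at time t with target at time t + lag."""
    if lag == 0:
        return pearson(regulator, target)
    # Drop the last `lag` regulator points and the first `lag` target points
    # so that the two series overlap.
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Lag in 0..max_lag with the largest absolute delayed correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(delayed_correlation(regulator, target, k)))
```

    For a target profile that repeats the regulator's pulse two time points later, the delayed correlation at lag 2 is 1 and `best_lag` returns 2. Longer lags leave fewer overlapping time points, so correlations at large lags on short series should be read with caution.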

  6. Candidate genes and pathogenesis investigation for sepsis-related acute respiratory distress syndrome based on gene expression profile.

    Science.gov (United States)

    Wang, Min; Yan, Jingjun; He, Xingxing; Zhong, Qiang; Zhan, Chengye; Li, Shusheng

    2016-04-18

    Acute respiratory distress syndrome (ARDS) is a potentially devastating form of acute inflammatory lung injury as well as a major cause of acute respiratory failure. Although researchers have made significant progress in elucidating the pathophysiology of this complex syndrome over the years, the absence of a universal, detailed disease mechanism has so far led to a series of practical problems for definitive treatment. This study aimed to predict genes or pathways associated with sepsis-related ARDS based on a public microarray dataset and to further explore the molecular mechanism of ARDS. A total of 122 up-regulated and 91 down-regulated differentially expressed genes (DEGs) were obtained. The up- and down-regulated DEGs were mainly involved in functions like mitotic cell cycle and pathways like cell cycle. Analysis of the protein-protein interaction network of ARDS revealed 20 hub genes including cyclin B1 (CCNB1), cyclin B2 (CCNB2) and topoisomerase II alpha (TOP2A). A total of seven transcription factors including forkhead box protein M1 (FOXM1) and 30 target genes were revealed in the transcription factor-target gene regulation network. Furthermore, co-cited genes including CCNB2-CCNB1 were revealed by literature mining for relations among ARDS-related genes. Pathways like mitotic cell cycle were closely related to the development of ARDS. Genes including CCNB1, CCNB2 and TOP2A, as well as transcription factors like FOXM1, might be used as novel gene therapy targets for sepsis-related ARDS.

  7. Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile.

    Science.gov (United States)

    Lin, Zhe; Lin, Yongsheng

    2017-09-05

    The aim of this study was to explore potential crucial genes associated with steroid-induced necrosis of the femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. The gene expression profile GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples, was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using the LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and Cytoscape. In total, 59 up-regulated DEGs and 156 down-regulated DEGs were identified. The up-regulated DEGs were mainly involved in immunity-related functions (e.g. Fcer1A and Il7R), and the down-regulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 down-regulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions like muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R) and the DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progress of SINFH, which still needs to be confirmed by experiments. Copyright © 2017 Elsevier B.V. All rights reserved.
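
    As a rough illustration of the degree-centrality ranking mentioned in this abstract, the following Python sketch scores nodes of an undirected interaction network by normalized degree. The edge list uses placeholder node names (`g1` to `g5`) and is hypothetical toy data, not the STRING network from the study; subgraph centrality is not covered, and the helper names are assumptions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree (degree / (n - 1)) for each node of an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_hubs(edges, k):
    """Return the k nodes with the highest degree centrality (ties broken by name)."""
    centrality = degree_centrality(edges)
    return sorted(centrality, key=lambda g: (-centrality[g], g))[:k]


# Hypothetical toy network with placeholder node names.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g1", "g4"), ("g4", "g5")]
print(top_hubs(toy_edges, 3))
```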

  8. Gene expression profiling in cells with enhanced gamma-secretase activity.

    Directory of Open Access Journals (Sweden)

    Alexandra I Magold

    2009-09-01

    Full Text Available Processing by gamma-secretase of many type-I membrane protein substrates triggers signaling cascades by releasing intracellular domains (ICDs) that, following nuclear translocation, modulate the transcription of different genes regulating a diverse array of cellular and biological processes. Because the list of gamma-secretase substrates is growing quickly and this enzyme is a cancer and Alzheimer's disease therapeutic target, the mapping of gene transcription susceptible to gamma-secretase activity is important for sharpening our view of specific affected genes, molecular functions and biological pathways. To identify genes and molecular functions transcriptionally affected by gamma-secretase activity, the cellular transcriptomes of Chinese hamster ovary (CHO) cells with enhanced and inhibited gamma-secretase activity were analyzed and compared by cDNA microarray. The functional clustering by FatiGO of the 1,981 identified genes revealed over- and under-represented groups with multiple activities and functions. Single genes with the most pronounced transcriptional susceptibility to gamma-secretase activity were evaluated by real-time PCR. Among the 21 validated genes, the strikingly decreased transcription of PTPRG and AMN1 and increased transcription of UPP1 potentially support data on cell cycle disturbances relevant to cancer, stem cell and neurodegenerative disease research. The mapping of interactions of proteins encoded by the validated genes relied exclusively on evidence-based data and revealed broad effects on Wnt pathway members, including WNT3A and DVL3. Intriguingly, the transcription of TERA, a gene of unknown function, is affected by gamma-secretase activity and was significantly altered in the analyzed human Alzheimer's disease brain cortices. Investigating the effects of gamma-secretase activity on gene transcription has revealed several affected clusters of molecular functions and, more specifically, 21 genes that hold significant

  9. The genome of Paenibacillus sabinae T27 provides insight into evolution, organization and functional elucidation of nif and nif-like genes

    OpenAIRE

    Li, Xinxin; Deng, Zhiping; Liu, Zhanzhi; Yan, Yongliang; Wang, Tianshu; Xie, Jianbo; Lin, Min; Cheng, Qi; Chen, Sanfeng

    2014-01-01

    Background Most biological nitrogen fixation is catalyzed by the molybdenum nitrogenase. This enzyme is a complex which contains the MoFe protein encoded by nifDK and the Fe protein encoded by nifH. In addition to nifHDK, nifHDK-like genes were found in some Archaea and Firmicutes, but their function is unclear. Results We sequenced the genome of Paenibacillus sabinae T27. A total of 4,793 open reading frames were predicted from its 5.27 Mb genome. The genome of P. sabinae T27 contains fiftee...

  10. Stress tolerances of nullmutants of function-unknown genes encoding menadione stress-responsive proteins in Aspergillus nidulans.

    Science.gov (United States)

    Leiter, Éva; Bálint, Mihály; Miskei, Márton; Orosz, Erzsébet; Szabó, Zsuzsa; Pócsi, István

    2016-07-01

    A group of menadione stress-responsive function-unknown genes of Aspergillus nidulans (Locus IDs ANID_03987.1, ANID_06058.1, ANID_10219.1, and ANID_10260.1) was deleted and phenotypically characterized. Importantly, comparative and phylogenetic analyses of the tested A. nidulans genes and their orthologs shed light only on the presence of a TANGO2 domain with NRDE protein motif in the translated ANID_06058.1 gene but did not reveal any recognizable protein-encoding domains in other protein sequences. The gene deletion strains were subjected to oxidative, osmotic, and metal ion stress and, surprisingly, only the ΔANID_10219.1 mutant showed an increased sensitivity to 0.12 mmol l⁻¹ menadione sodium bisulfite. The gene deletions affected the stress sensitivities (tolerances) irregularly; for example, some strains grew more slowly when exposed to various oxidants and/or osmotic stress generating agents, whereas the ΔANID_10260.1 mutant possessed a wild-type tolerance to all stressors tested. Our results are in line with earlier studies demonstrating that the deletions of stress-responsive genes do not necessarily confer any stress-sensitivity phenotypes, which can be attributed to compensatory mechanisms based on other elements of the stress response system with overlapping functions. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Josephine Erhiakporeh

    2016-07-06

    Jul 6, 2016 ... candidate genes for drought tolerance in sesame. (Sesamum ... Our results provided genomic resources for further functional analysis and genetic engineering .... reverse transcribed using the Reverse Transcription System.

  12. The progress of PET based reporter gene imaging

    International Nuclear Information System (INIS)

    Zhao Wei; Zhang Xiuli

    2005-01-01

    More than two decades of intense research have allowed gene therapy to move from the laboratory to the clinical setting, where its use for the treatment of human pathologies has been considerably increased in the last years. However, many crucial questions remain to be solved in this challenging field. In vivo imaging with positron emission tomography (PET) by combination of the appropriate PET reporter gene and PET reporter probe could provide invaluable qualitative and quantitative information to answer multiple unsolved questions about gene therapy. PET imaging could be used to define parameters not available by other techniques that are of substantial interest not only for the proper understanding of the gene therapy process, but also for its future development and clinical application in humans. (authors)

  13. Embryo quality predictive models based on cumulus cells gene expression

    Directory of Open Access Journals (Sweden)

    Devjak R

    2016-06-01

    Full Text Available Since the introduction of in vitro fertilization (IVF) in clinical practice of infertility treatment, indicators of high quality embryos have been investigated. Cumulus cells (CC) have a specific gene expression profile according to the developmental potential of the oocyte they are surrounding, and therefore, specific gene expression could be used as a biomarker. The aim of our study was to combine more than one biomarker to observe improvement in the predictive value for embryo development. In this study, 58 CC samples from 17 IVF patients were analyzed. This study was approved by the Republic of Slovenia National Medical Ethics Committee. Gene expression analysis [quantitative real time polymerase chain reaction (qPCR)] for five genes, analyzed according to embryo quality level, was performed. Two prediction models were tested for embryo quality prediction: a binary logistic and a decision tree model. As the main outcome, gene expression levels for five genes were taken and the area under the curve (AUC) for two prediction models was calculated. Among tested genes, AMHR2 and LIF showed significant expression difference between high quality and low quality embryos. These two genes were used for the construction of two prediction models: the binary logistic model yielded an AUC of 0.72 ± 0.08 and the decision tree model yielded an AUC of 0.73 ± 0.03. Two different prediction models yielded similar predictive power to differentiate high and low quality embryos. In terms of eventual clinical decision making, the decision tree model resulted in easy-to-interpret rules that are highly applicable in clinical practice.
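
    The AUC values reported in this abstract summarize how well a model's scores separate the two embryo quality classes. A minimal Python sketch of the rank-based (Mann-Whitney) AUC follows; the function name `auc` and the scores are hypothetical illustrations, not the study's models or data.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities for high quality (positive)
# and low quality (negative) embryos.
high_quality = [0.9, 0.4]
low_quality = [0.5, 0.1]
print(auc(high_quality, low_quality))
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.72 and 0.73 should be read.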

  14. A sight on the current nanoparticle-based gene delivery vectors

    Science.gov (United States)

    Dizaj, Solmaz Maleki; Jafari, Samira; Khosroushahi, Ahmad Yari

    2014-05-01

    Gene delivery for therapeutic purposes is now considered one of the most promising strategies to cure both genetic and acquired human diseases. The design of efficient gene delivery vectors with high transfection efficiency and low cytotoxicity is considered the major challenge in delivering a target gene to specific tissues or cells. On this basis, investigations of non-viral gene vectors able to overcome physiological barriers are increasing. Among the non-viral vectors, nanoparticles have shown remarkable properties for gene delivery, such as the ability to target specific tissues or cells, protect the target gene against nuclease degradation, improve DNA stability, and increase the transformation efficiency or safety. This review presents current nanoparticle vectors according to their lipid, polymer, hybrid, and inorganic nature. Among them, hybrids, as efficient vectors, are discussed in terms of materials (synthetic or natural), design, and in vitro/in vivo transformation efficiency.

  15. Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

    Science.gov (United States)

    Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

    2017-10-01

    Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 ( www.comparativego.com ). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.

  16. A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

    Directory of Open Access Journals (Sweden)

    Qiang Su

    2017-01-01

    Full Text Available Background. To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm first selects distinguished genes using the K-S test and then uses CFS to select genes from those selected by the K-S test. Results. We adopted support vector machines (SVM) as the classification tool and used accuracy as the criterion to evaluate the performance of the classifiers on the selected gene subsets. This approach compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms for 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than those of the K-S test, CFS, mRMR, and ReliefF algorithms. Conclusions. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
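
    The first stage described in this abstract, filtering genes with the two-sample K-S test, can be sketched in plain Python as below. This covers only a K-S filter and not the CFS stage. Thresholding on the raw statistic D rather than on a p-value, the names `ks_statistic` and `ks_filter`, and the expression values are all assumptions for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_filter(expression, threshold):
    """Keep genes whose class-wise expression distributions differ by at least `threshold`."""
    return sorted(gene for gene, (class_0, class_1) in expression.items()
                  if ks_statistic(class_0, class_1) >= threshold)


# Hypothetical expression values: gene -> (values in class 0, values in class 1).
expression = {
    "geneA": ([1, 2, 3, 4], [10, 11, 12, 13]),  # clearly separated classes
    "geneB": ([1, 2, 3, 4], [1, 2, 3, 5]),      # nearly identical classes
}
print(ks_filter(expression, threshold=0.5))
```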

  17. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, a sub-network consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of

  18. Clinical and pathological associations with p53 tumour-suppressor gene mutations and expression of p21WAF1/Cip1 in colorectal carcinoma

    NARCIS (Netherlands)

    Slebos, R. J.; Baas, I. O.; Clement, M.; Polak, M.; Mulder, J. W.; van den Berg, F. M.; Hamilton, S. R.; Offerhaus, G. J.

    1996-01-01

    Inactivation of the p53 tumour-suppressor gene is common in a wide variety of human neoplasms. In the majority of cases, single point mutations in the protein-encoding sequence of p53 lead to positive immunohistochemistry (IHC) for the p53 protein, and are accompanied by loss of the wild-type

  19. Molecular mechanisms for protein-encoded inheritance

    Science.gov (United States)

    Wiltzius, Jed J. W.; Landau, Meytal; Nelson, Rebecca; Sawaya, Michael R.; Apostol, Marcin I.; Goldschmidt, Lukasz; Soriaga, Angela B.; Cascio, Duilio; Rajashankar, Kanagalaghatta; Eisenberg, David

    2013-01-01

    Strains are phenotypic variants, encoded by nucleic acid sequences in chromosomal inheritance and by protein “conformations” in prion inheritance and transmission. But how is a protein “conformation” stable enough to endure transmission between cells or organisms? Here new polymorphic crystal structures of segments of prion and other amyloid proteins offer structural mechanisms for prion strains. In packing polymorphism, prion strains are encoded by alternative packings (polymorphs) of β-sheets formed by the same segment of a protein; in a second mechanism, segmental polymorphism, prion strains are encoded by distinct β-sheets built from different segments of a protein. Both forms of polymorphism can produce enduring “conformations,” capable of encoding strains. These molecular mechanisms for transfer of information into prion strains share features with the familiar mechanism for transfer of information by nucleic acid inheritance, including sequence specificity and recognition by non-covalent bonds. PMID:19684598

  20. Gene mutation-based and specific therapies in precision medicine.

    Science.gov (United States)

    Wang, Xiangdong

    2016-04-01

    Precision medicine has been initiated and gains more and more attention from preclinical and clinical scientists. A number of key elements or critical parts in precision medicine have been described and emphasized to establish a systems understanding of precision medicine. The principle of precision medicine is to treat patients on the basis of genetic alterations after gene mutations are identified, although questions and challenges still remain before clinical application. Therapeutic strategies of precision medicine should be considered according to gene mutation, after biological and functional mechanisms of mutated gene expression or epigenetics, or the correspondent protein, are clearly validated. It is time to explore and develop a strategy to target and correct mutated genes by direct elimination, restoration, correction or repair of mutated sequences/genes. Nevertheless, there are still numerous challenges to integrating widespread genomic testing into individual cancer therapies and into decision making for one or another treatment. There are wide-ranging and complex issues to be solved before precision medicine becomes clinical reality. Thus, the precision medicine can be considered as an extension and part of clinical and translational medicine, a new alternative of clinical therapies and strategies, and have an important impact on disease cures and patient prognoses. © 2015 The Author. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  1. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  2. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available Abstract Background Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
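
    The relevance and redundancy measure named in this abstract, normalized mutual information, can be sketched for discretized expression values as follows. The abstract does not state which normalization the authors used; dividing by the geometric mean of the two entropies is an assumption here, as are the function names and the toy vectors.

```python
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for paired discrete observations."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """Mutual information scaled to [0, 1] by sqrt(H(X) * H(Y)) (assumed normalization)."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    return mutual_information(xs, ys) / sqrt(hx * hy)


# Hypothetical discretized data: class labels and two genes (0 = low, 1 = high).
labels = [0, 0, 1, 1]
gene_relevant = [0, 0, 1, 1]    # tracks the class labels exactly
gene_irrelevant = [0, 1, 0, 1]  # independent of the class labels
print(normalized_mi(gene_relevant, labels), normalized_mi(gene_irrelevant, labels))
```

    Under this reading, relevance is the score between a gene and the class labels, and redundancy is the same score computed between two genes.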

  3. LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

    Science.gov (United States)

    Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

    2012-01-01

    Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and offers an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene

  4. Comparative study on gene set and pathway topology-based enrichment methods.

    Science.gov (United States)

    Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

    2015-10-22

    Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
Both
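
    For orientation, the simplest kind of gene set method contrasted with topology-based methods in this abstract treats a pathway as a plain gene list. The Python sketch below shows one common test of that kind, a one-sided hypergeometric (over-representation) test. The abstract does not say that this particular test was among the three gene set methods evaluated, so its use here, the function name, and the toy gene identifiers are assumptions.

```python
from math import comb


def enrichment_p_value(universe, pathway, degs):
    """P(overlap >= observed) when drawing len(degs) genes at random from the universe."""
    universe = set(universe)
    pathway = set(pathway) & universe
    degs = set(degs) & universe
    n_universe, n_pathway, n_degs = len(universe), len(pathway), len(degs)
    observed = len(pathway & degs)
    total = comb(n_universe, n_degs)
    tail = sum(comb(n_pathway, i) * comb(n_universe - n_pathway, n_degs - i)
               for i in range(observed, min(n_pathway, n_degs) + 1))
    return tail / total


# Hypothetical toy example: 10 measured genes, a 3-gene pathway, 3 DEGs.
universe = range(10)
pathway = {0, 1, 2}
print(enrichment_p_value(universe, pathway, degs={0, 1, 2}))  # all DEGs in pathway
print(enrichment_p_value(universe, pathway, degs={7, 8, 9}))  # no DEG in pathway
```

    This test ignores every interaction inside the pathway, which is exactly the information that the topology-based methods in the comparison try to exploit.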

  5. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  6. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  7. Towards gene therapy based on femtosecond optical transfection

    Science.gov (United States)

    Antkowiak, M.; Torres-Mapa, M. L.; McGinty, J.; Chahine, M.; Bugeon, L.; Rose, A.; Finn, A.; Moleirinho, S.; Okuse, K.; Dallman, M.; French, P.; Harding, S. E.; Reynolds, P.; Gunn-Moore, F.; Dholakia, K.

    2012-06-01

    Gene therapy holds great promise in the treatment and prevention of a variety of diseases. However, crucial to the study and development of this therapeutic approach is a reliable and efficient technique of gene and drug delivery into primary cell types. These cells, freshly derived from an organ or tissue, mimic more closely the in vivo state and present more physiologically relevant information compared to cultured cell lines. However, primary cells are known to be difficult to transfect and are typically transfected using viral methods, which are not only questionable in the context of an in vivo application but rely on time-consuming vector construction and may also result in cell de-differentiation and loss of functionality. At the same time, well-established non-viral methods do not guarantee satisfactory efficiency and viability. Recently, optical laser-mediated poration of the cell membrane has received interest as a viable gene and drug delivery technique. It has been shown to deliver a variety of biomolecules and genes into cultured mammalian cells; however, its applicability to primary cells remains to be proven. We demonstrate how optical transfection can be an enabling technique in research areas such as neuropathic pain, neurodegenerative diseases, heart failure and immune or inflammatory-related diseases. Several primary cell types are used in this study, namely cardiomyocytes, dendritic cells, and neurons. We present our recent progress in optimizing this technique's efficiency and post-treatment cell viability for these types of cells and discuss future directions towards in vivo applications.

  8. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Sesame (Sesamum indicum L.) is one of the most important oilseed crops. It is mainly grown in arid and semi-arid regions with occurrence of unpredictable drought which is one of the major constraints of its production. However, the lack of gene resources associated with drought tolerance hinders sesame genetic ...

  9. Microarray-Based Identification of Transcription Factor Target Genes

    NARCIS (Netherlands)

    Gorte, M.; Horstman, A.; Page, R.B.; Heidstra, R.; Stromberg, A.; Boutilier, K.A.

    2011-01-01

    Microarray analysis is widely used to identify transcriptional changes associated with genetic perturbation or signaling events. Here we describe its application in the identification of plant transcription factor target genes with emphasis on the design of suitable DNA constructs for controlling TF

  10. RNAi-based silencing of genes encoding the vacuolar- ATPase ...

    African Journals Online (AJOL)

    2016-11-09

    Nov 9, 2016 ... Spodoptera exigua larval development by silencing chitin synthase gene with RNA interference. Bull. Entomol. Res. 98:613-619. Dow JAT (1999). The Multifunctional Drosophila melanogaster V-. ATPase is encoded by a multigene family. J. Bioenerg. Biomembr. 31:75-83. Fire A, Xu SQ, Montgomery MK, ...

  11. GO(vis), a gene ontology visualization tool based on multi-dimensional values.

    Science.gov (United States)

    Ning, Zi; Jiang, Zhenran

    2010-05-01

    Most gene product similarity measurements concentrate on the information content of Gene Ontology (GO) terms or use a path-based similarity between GO terms, which may ignore other important information contained in the structure of the ontology. In our study, we integrate different GO similarity measure approaches to analyze the functional relationship of genes and gene products with a new triangle-based visualization tool called GO(Vis). The purpose of this tool is to demonstrate the effect of three important information factors when measuring the similarity between gene products. One advantage of this tool is that its important ratio can be adjusted to meet different measuring requirements according to the biological knowledge of each factor. The experimental results demonstrate that GO(Vis) can display diagrams of the functional relationship for gene products effectively.

  12. Gene-based testing of interactions in association studies of quantitative traits.

    Directory of Open Access Journals (Sweden)

    Li Ma

    Various methods have been developed for identifying gene-gene interactions in genome-wide association studies (GWAS). However, most methods focus on individual markers as the testing unit, and the large number of such tests drastically erodes statistical power. In this study, we propose novel interaction tests of quantitative traits that are gene-based and that confer advantages in both statistical power and biological interpretation. The framework of gene-based gene-gene interaction (GGG) tests combines marker-based interaction tests between all pairs of markers in two genes to produce a gene-level test for interaction between the two. The tests are based on an analytical formula we derive for the correlation between marker-based interaction tests due to linkage disequilibrium. We propose four GGG tests that extend the following P value combining methods: minimum P value, extended Simes procedure, truncated tail strength, and truncated P value product. Extensive simulations point to correct type I error rates of all tests and show that the two truncated tests are more powerful than the other tests in cases of markers involved in the underlying interaction not being directly genotyped and in cases of multiple underlying interactions. We applied our tests to pairs of genes that exhibit a protein-protein interaction to test for gene-level interactions underlying lipid levels using genotype data from the Atherosclerosis Risk in Communities study. We identified five novel interactions that are not evident from marker-based interaction testing and successfully replicated one of these interactions, between SMAD3 and NEDD9, in an independent sample from the Multi-Ethnic Study of Atherosclerosis. We conclude that our GGG tests show improved power to identify gene-level interactions in existing, as well as emerging, association studies.
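    As a rough illustration of two of the P value combining ideas named above, the sketch below shows the textbook minimum P value (Sidak-style) and plain Simes combinations. It is not the authors' GGG tests: those correct for correlation between marker-pair tests caused by linkage disequilibrium, whereas this sketch assumes independent tests. Function names are hypothetical.

```python
def min_p_combined(pvalues):
    """Sidak-style gene-level P value from the smallest marker-pair P value.

    Assumption: the m marker-pair tests are independent.
    """
    m = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** m


def simes_combined(pvalues):
    """Plain Simes combination: the minimum over ranks i of m * p_(i) / i."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))
```

    Both functions take the list of marker-pair interaction P values for one gene pair and return a single gene-level P value.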

  13. Two-Way Gene Interaction From Microarray Data Based on Correlation Methods.

    Science.gov (United States)

    Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh

    2016-06-01

    Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. The purpose of this study is to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a Gene Co-expression Network is to score all pairs of gene vectors. The second step is to select a score threshold and connect all gene pairs whose scores exceed this value. In the foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman's rank correlation coefficient and Blomqvist's measure, and compared them with Pearson's correlation coefficient. We surveyed six genes of venous thrombosis disease, made a matrix entry representing the score for the corresponding gene pair, and obtained two-way interactions using Pearson's correlation, Spearman's rank correlation, and Blomqvist's coefficient. Finally, these methods were compared with Cytoscape, based on BIND, and Gene Ontology, based on molecular function visual methods; R software version 3.2 and Bioconductor were used to perform these methods. Based on the Pearson and Spearman correlations, the results were the same and were confirmed by Cytoscape and GO visual methods; however, Blomqvist's coefficient was not confirmed by visual methods. Some correlation-coefficient results did not agree with the visualization, possibly because of the small amount of data.
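    The two steps described above (score every gene pair, then connect pairs whose score exceeds a threshold) can be sketched as follows. The study itself used R and Bioconductor; this Python version is only an illustration, all function names are hypothetical, and thresholding on the absolute value of the score is an assumption.

```python
from itertools import combinations
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def blomqvist(x, y):
    """Blomqvist's medial correlation: agreement of signs around the medians.

    Points lying exactly on a median are ignored; assumes at least one
    point lies off both medians.
    """
    mx, my = median(x), median(y)
    signs = [(a - mx) * (b - my) for a, b in zip(x, y)]
    concordant = sum(s > 0 for s in signs)
    discordant = sum(s < 0 for s in signs)
    return (concordant - discordant) / (concordant + discordant)


def coexpression_edges(expression, score, threshold):
    """Score all gene pairs and keep those with |score| above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        s = score(expression[g1], expression[g2])
        if abs(s) > threshold:  # assumption: strong negative scores also count
            edges.append((g1, g2, s))
    return edges
```

    Passing `pearson`, `spearman` or `blomqvist` as `score` yields one network per coefficient, which can then be compared edge by edge.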

  14. Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Jing; Ma, Zihao; Carr, Steven A.; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W.; Ellis, Matthew J. C.; Townsend, R. Reid; Smith, Richard D.; McDermott, Jason E.; Chen, Xian; Paulovich, Amanda G.; Boja, Emily S.; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Rodland, Karin D.; Liebler, Daniel C.; Zhang, Bing

    2016-11-11

    Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies

  15. Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

    OpenAIRE

    Podsakoff, G; Wong, K K; Chatterjee, S

    1994-01-01

    Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthe...

  16. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E […] Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
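    The comparison of neighboring-gene correlations against randomly selected gene pairs under an FDR threshold can be sketched as below. The abstract does not specify the statistical method, so this sketch swaps in two standard pieces as assumptions: an empirical P value against the random-pair null, and the Benjamini-Hochberg step-up procedure. Function names are hypothetical.

```python
def empirical_p_value(observed, null_scores):
    """Share of random-pair correlations at least as large as the observed one.

    The +1 terms keep the estimate away from an impossible P value of zero.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def benjamini_hochberg(pvalues, fdr):
    """Flag the P values that are significant at the given FDR level."""
    m = len(pvalues)
    order = sorted(range(m), key=pvalues.__getitem__)
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose P value passes.
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant
```

    Here each neighboring gene pair would contribute one correlation, converted to a P value with `empirical_p_value` and then filtered with `benjamini_hochberg(pvalues, 0.1)` to mirror the FDR threshold of 0.1 quoted in the abstract.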

  17. Network-Based Method for Identifying Co- Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    Science.gov (United States)

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
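    The searching procedure described above can be sketched with a breadth-first search over an unweighted interaction graph. This is only an illustration under stated assumptions: the graph is a plain adjacency dictionary with hypothetical gene names, edge weights are ignored, only one shortest path per seed pair is followed, and the testing and screening procedures are not shown.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return one shortest path from source to target, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def candidate_genes(graph, seeds_a, seeds_b):
    """Collect interior nodes of shortest paths linking two tissues' seed genes."""
    known = set(seeds_a) | set(seeds_b)
    candidates = set()
    for s in seeds_a:
        for t in seeds_b:
            path = shortest_path(graph, s, t)
            if path:
                candidates.update(g for g in path[1:-1] if g not in known)
    return candidates
```

    Genes returned by `candidate_genes` correspond to the "possible co-regeneration genes" of the first procedure; in the described method they would still have to pass the permutation test and the two screening rules.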

  18. Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

    Directory of Open Access Journals (Sweden)

    Zhai Chengxiang

    2010-05-01

    Background: Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular, Gene Ontology (GO). However, the annotation of genes is a labor-intensive process, and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered. Results: We propose a statistical method that uses the primary literature, i.e. free text, as the source to perform overrepresentation analysis. The method is based on a statistical framework of mixture model and addresses the methodological flaws in several existing programs. We implemented this method within a literature mining system, BeeSpace, taking advantage of its analysis environment and added features that facilitate the interactive analysis of gene sets. Through experimentation with several datasets, we showed that our program can effectively summarize the important conceptual themes of large gene sets, even when traditional GO-based analysis does not yield informative results. Conclusions: We conclude that the current work will provide biologists with a tool that effectively complements the existing ones for overrepresentation analysis from genomic experiments. Our program, Genelist Analyzer, is freely available at: http://workerbee.igb.uiuc.edu:8080/BeeSpace/Search.jsp

  19. Adaptive evolution of mitochondrial energy metabolism genes associated with increased energy demand in flying insects.

    Science.gov (United States)

    Yang, Yunxia; Xu, Shixia; Xu, Junxiao; Guo, Yan; Yang, Guang

    2014-01-01

    Insects are unique among invertebrates for their ability to fly, which raises intriguing questions about how energy metabolism in insects evolved and changed along with flight. Although physiological studies indicated that energy consumption differs between flying and non-flying insects, the evolution of molecular energy metabolism mechanisms in insects remains largely unexplored. Considering that about 95% of adenosine triphosphate (ATP) is supplied by mitochondria via oxidative phosphorylation, we examined 13 mitochondrial protein-encoding genes to test whether adaptive evolution of energy metabolism-related genes occurred in insects. The analyses demonstrated that mitochondrial DNA protein-encoding genes are subject to positive selection from the last common ancestor of Pterygota, which evolved primitive flight ability. Positive selection was also found in insects with flight ability, whereas no significant sign of selection was found in flightless insects where the wings had degenerated. In addition, significant positive selection was also identified in the last common ancestor of Neoptera, which changed its flight mode from direct to indirect. Interestingly, detection of more positively selected genes in indirect flight rather than direct flight insects suggested a stronger selective pressure in insects having higher energy consumption. In conclusion, mitochondrial protein-encoding genes involved in energy metabolism were targets of adaptive evolution in response to increased energy demands that arose during the evolution of flight ability in insects.

  20. Queueing-Based Synchronization and Entrainment for Synthetic Gene Oscillators

    Science.gov (United States)

    Mather, William; Butzin, Nicholas; Hochendoner, Philip; Ogle, Curtis

    Synthetic gene oscillators have been a major focus of synthetic biology research since the beginning of the field 15 years ago. They have proven to be useful both for biotechnological applications as well as a testing ground to significantly develop our understanding of the design principles behind synthetic and native gene oscillators. In particular, the principles governing synchronization and entrainment of biological oscillators have been explored using a synthetic biology approach. Our work combines experimental and theoretical approaches to specifically investigate how a bottleneck for protein degradation, which is present in most if not all existing synthetic oscillators, can be leveraged to robustly synchronize and entrain biological oscillators. We use both the terminology and mathematical tools of queueing theory to intuitively explain the role of this bottleneck in both synchronization and entrainment, which extends prior work demonstrating the usefulness of queueing theory in synthetic and native gene circuits. We conclude with an investigation of how synchronization and entrainment may be sensitive to the presence of multiple proteolytic pathways in a cell that couple weakly through crosstalk. This work was supported by NSF Grant #1330180.

  1. Evaluation of gene importance in microarray data based upon probability of selection

    Directory of Open Access Journals (Sweden)

    Fu Li M

    2005-03-01

    Background: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linking to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, it has posed a challenging issue to recognize disease-related gene expression patterns embedded in the microarray data. In selecting a small set of biologically significant genes for classifier design, the high data dimensionality inherent in this problem creates a substantial amount of uncertainty. Results: Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value in that a smaller value implies higher information content from information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. Conclusion: In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.

  2. Design-Based Learning for Biology: Genetic Engineering Experience Improves Understanding of Gene Expression

    Science.gov (United States)

    Ellefson, Michelle R.; Brinker, Rebecca A.; Vernacchio, Vincent J.; Schunn, Christian D.

    2008-01-01

    Gene expression is a difficult topic for students to learn and comprehend, at least partially because it involves various biochemical structures and processes occurring at the microscopic level. Designer Bacteria, a design-based learning (DBL) unit for high-school students, applies principles of DBL to the teaching of gene expression. Throughout…

  3. A phylogenetic analysis of the genus Psathyrostachys (Poaceae) based on one nuclear gene, three plastid genes, and morphology

    DEFF Research Database (Denmark)

    Petersen, Gitte; Seberg, Ole; Baden, Claus

    2004-01-01

    A phylogenetic analysis of the small, Central Asian genus Psathyrostachys Nevski is presented. The analysis is based on morphological characters and nucleotide sequence data from one nuclear gene, DMC1, and three plastid genes, rbcL, rpoA, and rpoC2. Separate analyses of the three data partitions (morphology, nuclear sequences, and plastid sequences) result in mostly congruent trees. The plastid and nuclear sequences produce completely congruent trees, and only the trees based on plastid sequences and morphological characters are incongruent. Combined analysis of all data results in a fairly well-resolved strict consensus tree: Ps. rupestris is the sister to the remaining species, which are divided into two clades: one including Ps. fragilis and Ps. caduca, the other including Ps. juncea, Ps. huashanica, Ps. lanuginosa, Ps. stoloniformis, and Ps. kronenburgii. Pubescent culms and more than 20 mm long...

  4. Research on the Bionics Design of Automobile Styling Based on the Form Gene

    Science.gov (United States)

    Aili, Zhao; Long, Jiang

    2017-09-01

    From the perspective of form-gene heritage, this thesis analyzes the gene make-up, cultural inheritance and aesthetic features in the evolution and development of the forms of brand automobiles, and proposes a bionic design concept and methods for automobile styling design. This innovative method must be based on the form gene, and the consistency and combination of form elements must be maintained during the design. Taking the design of Maserati as an example, the thesis illustrates the design method and philosophy in terms of form gene expression and bionic design innovation for future automobile styling.

  5. Ortholog-based screening and identification of genes related to intracellular survival.

    Science.gov (United States)

    Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

    2018-04-20

    Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.

  6. A Region-Based GeneSIS Segmentation Algorithm for the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    Stelios K. Mylonas

    2015-03-01

    This paper proposes an object-based segmentation/classification scheme for remotely sensed images, based on a novel variant of the recently proposed Genetic Sequential Image Segmentation (GeneSIS) algorithm. GeneSIS segments the image in an iterative manner, whereby at each iteration a single object is extracted via a genetic-based object extraction algorithm. Contrary to the previous pixel-based GeneSIS, where the candidate objects to be extracted were evaluated through the fuzzy content of their included pixels, in the newly developed region-based GeneSIS algorithm a watershed-driven fine segmentation map is initially obtained from the original image, which serves as the basis for the forthcoming GeneSIS segmentation. Furthermore, in order to enhance the spatial search capabilities, we introduce a more descriptive encoding scheme in the object extraction algorithm, where the structural search modules are represented by polygonal shapes. Our objectives in the new framework are posed as follows: enhance the flexibility of the algorithm in extracting more flexible object shapes, assure high-level classification accuracies, and reduce the execution time of the segmentation, while at the same time preserving all the inherent attributes of the GeneSIS approach. Finally, exploiting the inherent attribute of GeneSIS to produce multiple segmentations, we also propose two segmentation fusion schemes that operate on the ensemble of segmentations generated by GeneSIS. Our approaches are tested on one urban and two agricultural images. The results show that region-based GeneSIS has considerably lower computational demands compared to the pixel-based one. Furthermore, the suggested methods achieve higher classification accuracies and good segmentation maps compared to a series of existing algorithms.

  7. FiGS: a filter-based gene selection workbench for microarray data

    Directory of Open Access Journals (Sweden)

    Yun Taegyun

    2010-01-01

    Background: The selection of genes that discriminate disease classes from microarray data is widely used for the identification of diagnostic biomarkers. Although various gene selection methods are currently available and some of them have shown excellent performance, no single method can retain the best performance for all types of microarray datasets. It is desirable to use a comparative approach to find the best gene selection result after rigorous testing of different methodological strategies for a given microarray dataset. Results: FiGS is a web-based workbench that automatically compares various gene selection procedures and provides the optimal gene selection result for an input microarray dataset. FiGS builds up diverse gene selection procedures by aligning different feature selection techniques and classifiers. In addition to the highly reputed techniques, FiGS diversifies the gene selection procedures by incorporating gene clustering options in the feature selection step and different data pre-processing options in the classifier training step. All candidate gene selection procedures are evaluated by the .632+ bootstrap errors and listed with their classification accuracies and selected gene sets. FiGS runs on parallelized computing nodes that can handle heavy computations. FiGS is freely accessible at http://gexp.kaist.ac.kr/figs. Conclusion: FiGS is a web-based application that automates an extensive search for the optimized gene selection analysis for a microarray dataset in a parallel computing environment. FiGS will provide both an efficient and comprehensive means of acquiring optimal gene sets that discriminate disease states from microarray datasets.
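    The .632+ bootstrap error mentioned above can be illustrated with a minimal sketch of the standard estimator. It assumes the three input error rates (resubstitution error, out-of-bag bootstrap error, and the no-information error rate) have already been computed; the function and parameter names are hypothetical, and nothing here reflects how FiGS itself implements the calculation.

```python
def bootstrap_632_plus(resub_error, oob_error, no_info_rate):
    """Combine resubstitution and out-of-bag errors into a .632+ estimate.

    resub_error:  error on the training data itself (optimistic)
    oob_error:    average error on samples left out of each bootstrap draw
    no_info_rate: error expected if features and labels were unrelated
    """
    oob = min(oob_error, no_info_rate)  # cap at the no-information rate
    if oob > resub_error and no_info_rate > resub_error:
        # Relative overfitting rate, between 0 (none) and 1 (complete).
        overfit = (oob - resub_error) / (no_info_rate - resub_error)
    else:
        overfit = 0.0
    weight = 0.632 / (1.0 - 0.368 * overfit)
    return (1.0 - weight) * resub_error + weight * oob
```

    With no overfitting the weight stays at 0.632 (the plain .632 estimator); as overfitting grows the weight moves towards 1, so the estimate leans more on the out-of-bag error.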

  8. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

    Science.gov (United States)

    Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

    2016-03-01

    Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a workflow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when RNA-Seq was used as the sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/. Contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  9. Partial least squares based gene expression analysis in estrogen receptor positive and negative breast tumors.

    Science.gov (United States)

    Ma, W; Zhang, T-F; Lu, P; Lu, S H

    2014-01-01

    Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-). A previous study proposed that under trastuzumab-based neoadjuvant chemotherapy, tumor initiating cell (TIC) featured ER- tumors respond better than ER+ tumors. Exploring the molecular differences between these two groups may help in developing new therapeutic strategies, especially for ER- patients. With gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed partial least squares (PLS) based analysis, which is more sensitive than common variance/regression analysis. We acquired 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving the immune system, metabolism and genetic information processing. Network analysis identified five hub genes with degrees higher than 10, including APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide new understanding of the molecular differences between TIC featured ER- and ER+ breast tumors, in the hope of supporting therapeutic studies.
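
    The hub-gene step described above can be illustrated with a minimal sketch: count the degree of every node in an undirected interaction network and keep the nodes whose degree exceeds a threshold. The threshold of 10 comes from the abstract; the edge list, the placeholder partner names (G0, G1, ...) and the function name hub_genes are assumptions made only for illustration and are not the authors' data or code.

```python
from collections import defaultdict

def hub_genes(edges, min_degree=10):
    """Return {gene: degree} for genes whose degree is higher than min_degree."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {g: len(n) for g, n in neighbors.items() if len(n) > min_degree}

# Hypothetical toy network: ESR1 has 11 placeholder partners, APP has 3.
edges = [("ESR1", f"G{i}") for i in range(11)] + [("APP", f"G{i}") for i in range(3)]
hubs = hub_genes(edges)
```

    Using sets of neighbors rather than raw edge counts means that duplicated edges do not inflate a gene's degree.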

  10. Efficient CRISPR/Cas9-based gene knockout in watermelon.

    Science.gov (United States)

    Tian, Shouwei; Jiang, Linjian; Gao, Qiang; Zhang, Jie; Zong, Mei; Zhang, Haiying; Ren, Yi; Guo, Shaogui; Gong, Guoyi; Liu, Fan; Xu, Yong

    2017-03-01

    The CRISPR/Cas9 system can precisely edit genomic sequences and effectively create knockout mutations in T0-generation watermelon plants. Genome editing offers great advantages for revealing gene function and generating agronomically important mutations in crops. Recently, an RNA-guided genome editing system using the type II clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) has been applied to several plant species, achieving successful targeted mutagenesis. Here, we report that the genome of watermelon, an important fruit crop, can also be precisely edited by the CRISPR/Cas9 system. ClPDS, phytoene desaturase in watermelon, was selected as the target gene because its mutant shows an evident albino phenotype. The CRISPR/Cas9 system performed genome editing, such as insertions or deletions at the expected position, in transfected watermelon protoplast cells. More importantly, all transgenic watermelon plants harbored ClPDS mutations and showed a clear or mosaic albino phenotype, indicating that the CRISPR/Cas9 system has technically 100% genome editing efficiency in transgenic watermelon lines. Furthermore, there were very likely no off-target mutations, as indicated by examining regions that were highly homologous to the sgRNA sequences. Our results show that the CRISPR/Cas9 system is a powerful tool to effectively create knockout mutations in watermelon.

  11. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community.
Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  12. Exploring the role of peptides in polymer-based gene delivery.

    Science.gov (United States)

    Sun, Yanping; Yang, Zhen; Wang, Chunxi; Yang, Tianzhi; Cai, Cuifang; Zhao, Xiaoyun; Yang, Li; Ding, Pingtian

    2017-09-15

    Polymers are widely studied as non-viral gene vectors because of their strong DNA binding ability, capacity to carry large payload, flexibility of chemical modifications, low immunogenicity, and facile processes for manufacturing. However, high cytotoxicity and low transfection efficiency substantially restrict their application in clinical trials. Incorporating functional peptides is a promising approach to address these issues. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we systematically summarize the role of peptides in polymer-based gene delivery, and elaborate how to rationally design polymer-peptide based gene delivery vectors. Polymers are widely studied as non-viral gene vectors, but suffer from high cytotoxicity and low transfection efficiency. Incorporating short, bioactive peptides into polymer-based gene delivery systems can address this issue. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we highlight the peptides' roles in polymer-based gene delivery, and elaborate how to utilize various functional peptides to enhance the transfection efficiency of polymers. The optimized peptide-polymer vectors should be able to alter their structures and functions according to biological microenvironments and utilize inherent intracellular pathways of cells, and consequently overcome the barriers during gene delivery to enhance transfection efficiency. Copyright © 2017 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

  13. A relative variation-based method to unraveling gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Yali Wang

    Full Text Available Gene regulatory network (GRN) reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that compared with complicated methods like those based on nonlinear differential equations, etc., methods based on a simple statistic, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, a relative expression level variation (RELV) based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, the probability for the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate on the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
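
    For orientation, the Z-score baseline that this abstract compares against can be sketched as follows. This is one common formulation and an assumption here, not the RELV algorithm itself: for each measured gene j, the expression after knocking out gene i is standardized against the expression of gene j across the knockouts of all other genes, and candidate edges i -> j are ranked by the absolute Z-score. The function name zscore_ranking and the toy matrix are hypothetical.

```python
from statistics import mean, pstdev

def zscore_ranking(knockout):
    """Rank candidate edges i -> j by |z|, where knockout[i][j] is the
    expression of gene j after knocking out gene i."""
    n = len(knockout)
    ranked = []
    for j in range(n):
        # Expression of gene j across knockouts of the *other* genes.
        column = {i: knockout[i][j] for i in range(n) if i != j}
        values = list(column.values())
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # gene j never changes: no evidence for any regulator
        for i, x in column.items():
            ranked.append((abs(x - mu) / sigma, i, j))
    return sorted(ranked, reverse=True)

# Hypothetical toy data: knocking out gene 0 lowers gene 1 from 1.0 to 0.2.
knockout = [
    [0.0, 0.2, 1.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
ranking = zscore_ranking(knockout)
top_score, top_regulator, top_target = ranking[0]
```

    A score of this kind reacts to any downstream change, which is why it cannot by itself separate direct from indirect regulation, the drawback the abstract sets out to address.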

  14. A network-based gene expression signature informs prognosis and treatment for colorectal cancer patients.

    Directory of Open Access Journals (Sweden)

    Mingguang Shi

    Full Text Available Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC) patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes. We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002). Similar results were obtained with reversed training and testing datasets (p = 0.007). Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006), but not beneficial to the low-risk patients (p = 0.491). The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network-based

  15. Gene-based meta-analysis of genome-wide association studies implicates new loci involved in obesity

    DEFF Research Database (Denmark)

    Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W

    2015-01-01

    ) approach to assign variants to genes and to calculate gene-based P-values based on simulations. The VEGAS method was applied to each cohort separately before a gene-based meta-analysis was performed. In Stage 1, two known (FTO and TMEM18) and six novel (PEX2, MTFR2, SSFA2, IARS2, CEP295 and TXNDC12) loci...

  16. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. 
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
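
    To make the model class concrete, the following minimal sketch simulates a Boolean polynomial dynamical system: each node is updated by a polynomial over the two-element field, where addition is XOR and multiplication is AND. It shows only the forward dynamics, not the evolutionary inference algorithm of the paper, and the three-node system, the monomial encoding, and all function names are hypothetical choices made for illustration.

```python
def evaluate(poly, state):
    """Evaluate a polynomial over F2. A polynomial is a list of monomials;
    a monomial is a tuple of variable indices (the empty tuple is the constant 1)."""
    total = 0
    for monomial in poly:
        term = 1
        for idx in monomial:
            term &= state[idx]  # multiplication in F2 is AND
        total ^= term           # addition in F2 is XOR
    return total

def step(system, state):
    """Synchronously update every node with its own polynomial."""
    return tuple(evaluate(poly, state) for poly in system)

def trajectory(system, state, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(system, state)
        states.append(state)
    return states

# Hypothetical 3-node system:
#   f0 = x2 + 1   (x0 is on when x2 is off)
#   f1 = x0       (x1 copies x0)
#   f2 = x0 * x1  (x2 needs both x0 and x1)
system = [[(2,), ()], [(0,)], [(0, 1)]]
states = trajectory(system, (0, 0, 0), 5)
```

    In this toy system the all-zero state returns to itself after five steps, so the trajectory is a limit cycle; time series of this kind are the sort of input the abstract describes.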

  17. Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

    Science.gov (United States)

    Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

    2005-07-15

    A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in the model plant Arabidopsis thaliana that are inducible by a phytohormone, abscisic acid (ABA), and by abiotic stresses such as drought, cold and salinity. We first construct and analyze two ABA-specific cis-elements, the ABA-responsive element (ABRE) and its coupling element (CE), in A. thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A. thaliana. Based on RT-PCR verification and results from the literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
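
    The idea of selecting genes whose promoters contain a pair of cis-elements can be sketched as below. This is a simplified illustration under stated assumptions: the two motif patterns are placeholder regular expressions rather than the ABRE and CE models constructed by the authors, the 50 bp spacing limit is arbitrary, and the promoter sequences and gene names are invented.

```python
import re

def find_sites(seq, pattern):
    """Start positions of all (possibly overlapping) matches of a regex motif."""
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]

def has_module(promoter, motif_a, motif_b, max_gap=50):
    """True if both motifs occur with start positions at most max_gap bp apart."""
    sites_a = find_sites(promoter, motif_a)
    sites_b = find_sites(promoter, motif_b)
    return any(abs(a - b) <= max_gap for a in sites_a for b in sites_b)

# Placeholder motif patterns (assumptions, not the published element models).
ABRE_LIKE = "ACGTG[GT]C"
CE_LIKE = "CACGCG"

# Invented promoter sequences for three hypothetical genes.
promoters = {
    "geneA": "TTGACACGTGGCATTTCACGCGTAA",  # both motifs present
    "geneB": "TTGACACGTGGCATTTTTTTTTTAA",  # only the first motif
    "geneC": "AAAAAAAAAAAAAAAAAAAAAAAAA",  # neither motif
}
candidates = [g for g, seq in promoters.items() if has_module(seq, ABRE_LIKE, CE_LIKE)]
```

    A real analysis would also scan the reverse strand and score partial matches; the exact-match scan here is only meant to show the selection logic.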

  18. Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Keita Mori

    2013-01-01

    Full Text Available Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.

  19. Outreach Science Education: Evidence-Based Studies in a Gene Technology Lab

    Science.gov (United States)

    Scharfenberg, Franz-Josef; Bogner, Franz X.

    2014-01-01

    Nowadays, outreach labs are important informal learning environments in science education. After summarizing research on the goals that outreach labs focus on, we describe our evidence-based gene technology lab as a model of a research-driven outreach program. Evaluation-based optimizations of hands-on teaching based on cognitive load theory (additional…

  20. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

    Science.gov (United States)

    Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

    2016-01-01

    Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.

  1. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Full Text Available Abstract Background Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on the distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources accumulate, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the gene expressions of individual genes are considered as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets.
Conclusions The results demonstrate that our WDCM produces clusters

  2. Multigenic lentiviral vectors for combined and tissue-specific expression of miRNA- and protein-based antiangiogenic factors

    Directory of Open Access Journals (Sweden)

    Anne Louise Askou

    Full Text Available Lentivirus-based gene delivery vectors carrying multiple gene cassettes are powerful tools in gene transfer studies and gene therapy, allowing coexpression of multiple therapeutic factors and, if desired, fluorescent reporters. Current strategies to express transgenes and microRNA (miRNA) clusters from a single vector have certain limitations that affect transgene expression levels and/or vector titers. In this study, we describe a novel vector design that facilitates combined expression of therapeutic RNA- and protein-based antiangiogenic factors as well as a fluorescent reporter from back-to-back RNApolII-driven expression cassettes. This configuration allows effective production of intron-embedded miRNAs that are released upon transduction of target cells. Exploiting such multigenic lentiviral vectors, we demonstrate robust miRNA-directed downregulation of vascular endothelial growth factor (VEGF) expression, leading to reduced angiogenesis, and parallel impairment of angiogenic pathways by codelivering the gene encoding pigment epithelium-derived factor (PEDF). Notably, subretinal injections of lentiviral vectors reveal efficient retinal pigment epithelium-specific gene expression driven by the VMD2 promoter, verifying that multigenic lentiviral vectors can be produced with high titers sufficient for in vivo applications. Altogether, our results suggest the potential applicability of combined miRNA- and protein-encoding lentiviral vectors in antiangiogenic gene therapy, including new combination therapies for amelioration of age-related macular degeneration.

  3. Nucleotide sequence of the human N-myc gene

    International Nuclear Information System (INIS)

    Stanton, L.W.; Schwab, M.; Bishop, J.M.

    1986-01-01

    Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions

  4. Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

    Science.gov (United States)

    Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

    2016-01-01

    Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genomes data and high-throughput experimental data, many computational methods for identifying essential proteins are proposed, which are useful complements to experimental methods. In this review, we show the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079

  5. Towards precise classification of cancers based on robust gene functional expression profiles

    Directory of Open Access Journals (Sweden)

    Zhu Jing

    2005-03-01

    Full Text Available Abstract Background Development of robust and efficient methods for analyzing and interpreting high-dimensional gene expression profiles continues to be a focus in computational biology. The accumulated experimental evidence supports the assumption that genes express and perform their functions in a modular fashion in cells. Therefore, there is an open space for development of timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we define a low-dimensional functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system, such as the Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure (s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion This modular approach is demonstrated to be a powerful alternative approach to analyzing high-dimensional microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data.
Furthermore, efficient integration with current biological knowledge
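
    The data-reduction step in this abstract, replacing the expression values of the genes in a functional module with one summary value per sample, can be sketched as follows. The abstract does not say which summary measure was used, so the arithmetic mean below is an assumption, and the gene names, module names, and the function functional_profile are hypothetical.

```python
from statistics import mean

def functional_profile(expression, modules):
    """Collapse gene-level expression into one value per module and sample.

    expression: {gene: [value for each sample]}
    modules:    {module name: [member genes]}
    Returns {module name: [mean over measured member genes, for each sample]}.
    """
    profile = {}
    for module, genes in modules.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue  # no measured gene is annotated to this module
        n_samples = len(expression[present[0]])
        profile[module] = [
            mean(expression[g][s] for g in present) for s in range(n_samples)
        ]
    return profile

# Hypothetical toy data: three genes measured in two samples.
expression = {"g1": [1.0, 3.0], "g2": [3.0, 5.0], "g3": [10.0, 0.0]}
modules = {"module_A": ["g1", "g2"], "module_B": ["g3", "gX"], "module_C": ["gY"]}
profile = functional_profile(expression, modules)
```

    The resulting module-level vectors have one entry per sample and can be passed to any classifier in place of the gene-level matrix.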

  6. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, given that the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene system and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. HSD3B and gene-gene interactions in a pathway-based analysis of genetic susceptibility to bladder cancer.

    Directory of Open Access Journals (Sweden)

    Angeline S Andrew

    Full Text Available Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case-control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene-gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3'UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85, 95% CI 1.31-2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls and 497 cases (adjusted OR 3.66, 95% CI 1.06-12.63). The effect of this prevalent SNP was stronger among males (OR 2.13, 95% CI 1.40-3.25) than females (OR 1.56, 95% CI 0.83-2.95) (SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between the T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3-4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
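
    As a reading aid for the odds ratios quoted above, the sketch below computes an unadjusted odds ratio and a 95% confidence interval on the log scale (Woolf's method) from a 2x2 table. The study itself reports adjusted estimates from logistic regression, so this is only the simplest analogue; the counts used in the example are invented and the function name odds_ratio_ci is hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    Returns (odds_ratio, lower, upper); z=1.96 gives an approximate 95% interval.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Invented counts: 20 of 30 cases and 10 of 30 controls carry the variant.
estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
```

    An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level; an interval that includes 1 does not.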

  8. SVMRFE based approach for prediction of most discriminatory gene target for type II diabetes

    Directory of Open Access Journals (Sweden)

    Atul Kumar

    2017-06-01

    Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, its main source of fuel, and it is becoming increasingly common all over the world. It is therefore necessary to identify new potential drug targets that can not only control the disease but also treat it. Support vector machines (SVMs) are classifiers that can separate discriminatory genes from non-discriminatory genes. SVMRFE, a modification of SVM, ranks the genes by their discriminatory power and eliminates the genes that are not involved in causing the disease. A gene regulatory network was constructed from the top-ranked coding genes to identify their role in causing diabetes. To further validate the results, a pathway study was performed to identify the involvement of the coding genes in type II diabetes. The genes obtained from this study showed significant involvement in causing the disease and may be used as potential drug targets.

  9. A dual selection based, targeted gene replacement tool for Magnaporthe grisea and Fusarium oxysporum.

    Science.gov (United States)

    Khang, Chang Hyun; Park, Sook-Young; Lee, Yong-Hwan; Kang, Seogchan

    2005-06-01

    Rapid progress in fungal genome sequencing presents many new opportunities for functional genomic analysis of fungal biology through the systematic mutagenesis of the genes identified through sequencing. However, the lack of efficient tools for targeted gene replacement is a limiting factor for fungal functional genomics, as it often necessitates the screening of a large number of transformants to identify the desired mutant. We developed an efficient method of gene replacement and evaluated factors affecting the efficiency of this method using two plant pathogenic fungi, Magnaporthe grisea and Fusarium oxysporum. This method is based on Agrobacterium tumefaciens-mediated transformation with a mutant allele of the target gene flanked by the herpes simplex virus thymidine kinase (HSVtk) gene as a conditional negative selection marker against ectopic transformants. The HSVtk gene product converts 5-fluoro-2'-deoxyuridine to a compound toxic to diverse fungi. Because ectopic transformants express HSVtk, while gene replacement mutants lack HSVtk, growing transformants on a medium amended with 5-fluoro-2'-deoxyuridine facilitates the identification of targeted mutants by counter-selecting against ectopic transformants. In addition to M. grisea and F. oxysporum, the method and associated vectors are likely to be applicable to manipulating genes in a broad spectrum of fungi, thus potentially serving as an efficient, universal functional genomic tool for harnessing the growing body of fungal genome sequence data to study fungal biology.

  10. Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures

    Directory of Open Access Journals (Sweden)

    Anders E. Berglund

    2017-01-01

    Background. Many gene-expression signatures exist for describing the biological state of profiled tumors. Principal Component Analysis (PCA) can be used to summarize a gene signature into a single score. Our hypothesis is that gene signatures can be validated when applied to new datasets, using inherent properties of PCA. Results. This validation is based on four key concepts. Coherence: elements of a gene signature should be correlated beyond chance. Uniqueness: the general direction of the data being examined can drive most of the observed signal. Robustness: if a gene signature is designed to measure a single biological effect, then this signal should be sufficiently strong and distinct compared to other signals within the signature. Transferability: the derived PCA gene signature score should describe the same biology in the target dataset as it does in the training dataset. Conclusions. The proposed validation procedure ensures that PCA-based gene signatures perform as expected when applied to datasets other than those that the signatures were trained upon. Complex signatures, describing multiple independent biological components, are also easily identified.

  11. An Improved Fuzzy Based Missing Value Estimation in DNA Microarray Validated by Gene Ranking

    Directory of Open Access Journals (Sweden)

    Sujay Saha

    2016-01-01

    Most gene expression data analysis algorithms require the entire gene expression matrix without any missing values. Hence, it is necessary to devise methods that impute missing data values accurately. A number of imputation algorithms exist to estimate those missing values. This work starts with a microarray dataset containing multiple missing values. We first apply a modified version of the existing fuzzy-theory-based method LRFDVImpute to impute multiple missing values of time series gene expression data and then validate the result of imputation by a genetic algorithm (GA)-based gene ranking methodology along with some regular statistical validation techniques, such as the RMSE method. To the best of our knowledge, gene ranking has not yet been used to validate the result of missing value estimation. First, the proposed method was tested on the very popular Spellman dataset, and the results show that error margins were drastically reduced compared to some previous works, which indirectly validates the statistical significance of the proposed method. It was then applied to four other 2-class benchmark datasets, namely the Colorectal Cancer tumours dataset (GDS4382), the Breast Cancer dataset (GSE349-350), the Prostate Cancer dataset, and DLBCL-FL (Leukaemia), for both missing value estimation and gene ranking, and the results show that the proposed method can reach 100% classification accuracy with very few dominant genes, which indirectly validates the biological significance of the proposed method.

  12. Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

    Directory of Open Access Journals (Sweden)

    Xing Li

    2014-01-01

    Background. Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of candidate genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method produces a reliable ranked gene list, with an AUC (area under curve) of 0.616 in classification. Some novel genes such as CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms; these are not recorded in the benchmark data set but have been reported in recently published literature. Conclusions. Our study demonstrated that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms.

  13. A non-inheritable maternal Cas9-based multiple-gene editing system in mice

    OpenAIRE

    Takayuki Sakurai; Akiko Kamiyoshi; Hisaka Kawate; Chie Mori; Satoshi Watanabe; Megumu Tanaka; Ryuichi Uetake; Masahiro Sato; Takayuki Shindo

    2016-01-01

    The CRISPR/Cas9 system is capable of editing multiple genes through one-step zygote injection. The preexisting method is largely based on the co-injection of Cas9 DNA (or mRNA) and guide RNAs (gRNAs); however, it is unclear how many genes can be simultaneously edited by this method, and a reliable means to generate transgenic (Tg) animals with multiple gene editing has yet to be developed. Here, we employed non-inheritable maternal Cas9 (maCas9) protein derived from Tg mice with systemic Cas9...

  14. Pairagon+N-SCAN_EST: a model-based gene annotation pipeline

    DEFF Research Database (Denmark)

    Arumugam, Manimozhiyan; Wei, Chaochun; Brown, Randall H

    2006-01-01

    This paper describes Pairagon+N-SCAN_EST, a gene annotation pipeline that uses only native alignments. For each expressed sequence it chooses the best genomic alignment. Systems like ENSEMBL and ExoGean rely on trans alignments, in which expressed sequences are aligned to the genomic loci...... with de novo gene prediction by using N-SCAN_EST. N-SCAN_EST is based on a generalized HMM probability model augmented with a phylogenetic conservation model and EST alignments. It can predict complete transcripts by extending or merging EST alignments, but it can also predict genes in regions without EST...

  15. Effective generation of transgenic pigs and mice by linker based sperm-mediated gene transfer.

    OpenAIRE

    Chang, Keejong; Qian, Jin; Jiang, MeiSheng; Liu, Yi-Hsin; Wu, Ming-Che; Chen, Chi-Dar; Lai, Chao-Kuen; Lo, Hsin-Lung; Hsiao, Chin-Ton; Brown, Lucy; Bolen, James; Huang, Hsiao-I; Ho, Pei-Yu; Shih, Ping Yao; Yao, Chen-Wen

    2002-01-01

    Background: Transgenic animals have become valuable tools for both research and applied purposes. The current method of gene transfer, microinjection, which is widely used in transgenic mouse production, has only had limited success in producing transgenic animals of larger or higher species. Here, we report a linker based sperm-mediated gene transfer method (LB-SMGT) that greatly improves the production efficiency of large transgenic animals. Results: The linker protein, a monoclonal ...

  16. Network-based prediction and knowledge mining of disease genes.

    Science.gov (United States)

    Carson, Matthew B; Lu, Hui

    2015-01-01

    In recent years, high-throughput protein interaction identification methods have generated a large amount of data. When combined with the results from other in vivo and in vitro experiments, a complex set of relationships between biological molecules emerges. The growing popularity of network analysis and data mining has allowed researchers to recognize indirect connections between these molecules. Due to the interdependent nature of network entities, evaluating proteins in this context can reveal relationships that may not otherwise be evident. We examined the human protein interaction network as it relates to human illness using the Disease Ontology. After calculating several topological metrics, we trained an alternating decision tree (ADTree) classifier to identify disease-associated proteins. Using a bootstrapping method, we created a tree to highlight conserved characteristics shared by many of these proteins. Subsequently, we reviewed a set of non-disease-associated proteins that were misclassified by the algorithm with high confidence and searched for evidence of a disease relationship. Our classifier was able to predict disease-related genes with 79% area under the receiver operating characteristic (ROC) curve (AUC), which indicates the tradeoff between sensitivity and specificity and is a good predictor of how a classifier will perform on future data sets. We found that a combination of several network characteristics including degree centrality, disease neighbor ratio, eccentricity, and neighborhood connectivity help to distinguish between disease- and non-disease-related proteins. Furthermore, the ADTree allowed us to understand which combinations of strongly predictive attributes contributed most to protein-disease classification. In our post-processing evaluation, we found several examples of potential novel disease-related proteins and corresponding literature evidence. 
In addition, we showed that first- and second-order neighbors in the PPI network

  17. Cell based-gene delivery approaches for the treatment of spinal cord injury and neurodegenerative disorders.

    Science.gov (United States)

    Taha, Masoumeh Fakhr

    2010-03-01

    Cell based-gene delivery has provided an important therapeutic strategy for different disorders in recent years. This strategy is based on the transplantation of genetically modified cells to express specific genes and to target the delivery of therapeutic factors, especially for the treatment of cancers and neurological, immunological, cardiovascular and haematopoietic disorders. Although preliminary reports are encouraging, and experimental studies indicate functional and structural improvements in animal models of different disorders, universal application of this strategy to human diseases requires more evidence. There are a number of parameters that need to be evaluated, including the optimal cell source, the most effective gene or genes to be delivered, the optimal vector and method of gene delivery into the cells and the most efficient route for the delivery of genetically modified cells into the patient. Also, some obstacles have to be overcome, including the safety and usefulness of the approaches and the stability of the improvements. Here, recent studies concerning cell-based gene delivery for spinal cord injury and some neurodegenerative disorders such as amyotrophic lateral sclerosis, Parkinson's disease and Alzheimer's disease are briefly reviewed, and their exciting consequences are discussed.

  18. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.

  19. Identification of novel risk genes associated with type 1 diabetes mellitus using a genome-wide gene-based association analysis.

    Science.gov (United States)

    Qiu, Ying-Hua; Deng, Fei-Yan; Li, Min-Jing; Lei, Shu-Feng

    2014-11-01

    Type 1 diabetes mellitus is a serious disorder characterized by destruction of pancreatic β-cells, culminating in absolute insulin deficiency. Genetic factors contribute to the susceptibility of type 1 diabetes mellitus. The aim of the present study was to identify more susceptibility genes of type 1 diabetes mellitus. We carried out an initial gene-based genome-wide association study in a total of 4,075 type 1 diabetes mellitus cases and 2,604 controls by using the Gene-based Association Test using Extended Simes procedure. Furthermore, we carried out replication studies, differential expression analysis and functional annotation clustering analysis to support the significance of the identified susceptibility genes. We identified 452 genes associated with type 1 diabetes mellitus, even after adapting the genome-wide threshold for significance (P diabetes mellitus, which were ignored in single-nucleotide polymorphism-based association analysis and were not previously reported. We found that 53 genes have supportive evidence from replication studies and/or differential expression studies. In particular, seven genes including four non-human leukocyte antigen (HLA) genes (RASIP1, STRN4, BCAR1 and MYL2) are replicated in at least one independent population and also differentially expressed in peripheral blood mononuclear cells or monocytes. Furthermore, the associated genes tend to enrich in immune-related pathways or Gene Ontology project terms. The present results suggest the high power of gene-based association analysis in detecting disease-susceptibility genes. Our findings provide more insights into the genetic basis of type 1 diabetes mellitus.

  20. Discovery of global genomic re-organization based on comparison of two newly sequenced rice mitochondrial genomes with cytoplasmic male sterility-related genes

    Directory of Open Access Journals (Sweden)

    Yamada Mari

    2010-03-01

    Background: Plant mitochondrial genomes are known for their complexity, and there is abundant evidence demonstrating that this organelle is important for plant sexual reproduction. Cytoplasmic male sterility (CMS) is a phenomenon caused by incompatibility between the nucleus and mitochondria that has been discovered in various plant species. As the exact sequence of steps leading to CMS has not yet been revealed, efforts should be made to elucidate the factors underlying the mechanism of this important trait for crop breeding. Results: Two CMS mitochondrial genomes, LD-CMS, derived from Oryza sativa L. ssp. indica (434,735 bp), and CW-CMS, derived from Oryza rufipogon Griff. (559,045 bp), were newly sequenced in this study. Compared to the previously sequenced Nipponbare (Oryza sativa L. ssp. japonica) mitochondrial genome, the presence of 54 out of 56 protein-encoding genes (including pseudo-genes), 22 tRNA genes (including pseudo-tRNAs), and three rRNA genes was conserved. Two other genes were not present in the CW-CMS mitochondrial genome, and one of them was present as part of the newly identified chimeric ORF, CW-orf307. At least 12 genomic recombination events were predicted between the LD-CMS mitochondrial genome and Nipponbare, and 15 between the CW-CMS genome and Nipponbare, and novel genetic structures were formed by these genomic rearrangements in the two CMS lines. At least one of the genomic rearrangements was completely unique to each CMS line and not present in 69 rice cultivars or 9 accessions of O. rufipogon. Conclusion: Our results demonstrate novel mitochondrial genomic rearrangements that are unique in CMS cytoplasm, and one of the genes that is unique in the CW mitochondrial genome, CW-orf307, appeared to be the candidate most likely responsible for the CW-CMS event. 
Genomic rearrangements were dynamic in the CMS lines in comparison with those of rice cultivars, suggesting that 'death' and possible 'birth' processes of the

  1. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

    Directory of Open Access Journals (Sweden)

    Xingli Guo

    The identification of disease-causing genes is a fundamental challenge in human health, is of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results of this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
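
    The weighted-sum score described in this abstract can be sketched as follows. This is a one-step illustration only, assuming dictionary-based similarity and interaction tables; it does not reproduce the paper's iterative algorithm or its topological correlation term, and every name and value below is hypothetical.

```python
def association_score(disease, gene, disease_sim, gene_weight, known):
    """One-step weighted sum over similar diseases and neighbouring genes.

    disease_sim[d][d2]: similarity between diseases d and d2 (hypothetical values)
    gene_weight[g][g2]: interaction weight between genes g and g2 in the network
    known:              set of (disease, gene) pairs already known to be associated
    """
    total = 0.0
    for other_disease, similarity in disease_sim.get(disease, {}).items():
        for other_gene, weight in gene_weight.get(gene, {}).items():
            # A known association between a similar disease and a neighbouring
            # gene contributes in proportion to both weights.
            if (other_disease, other_gene) in known:
                total += similarity * weight
    return total

# Toy data: D2 is moderately similar to D1, and G1 interacts with G2.
disease_sim = {"D1": {"D1": 1.0, "D2": 0.5}}
gene_weight = {"G1": {"G1": 1.0, "G2": 0.8}, "G3": {"G3": 1.0}}
known = {("D1", "G2"), ("D2", "G2")}

score_g1 = association_score("D1", "G1", disease_sim, gene_weight, known)
score_g3 = association_score("D1", "G3", disease_sim, gene_weight, known)
print(score_g1, score_g3)
```

    In this toy example G1 outranks G3 for disease D1 because G1 is adjacent to a gene that is already linked to D1 and to a similar disease, while G3 has no such neighbours.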

  2. [Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Science.gov (United States)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene sequence can be used to identify different bacterial species. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  3. Sex Determination in Insects: a binary decision based on alternative splicing

    OpenAIRE

    Salz, Helen K.

    2011-01-01

    The gene regulatory networks that control sex determination vary between species. Despite these differences, comparative studies in insects have found that alternative splicing is reiteratively used in evolution to control expression of the key sex determining genes. Sex determination is best understood in Drosophila where activation of the RNA binding protein encoding gene Sex-lethal is the central female-determining event. Sex-lethal serves as a genetic switch because once activated it cont...

  4. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    Science.gov (United States)

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
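
    The two distance measures contrasted in this minireview can be illustrated with a short sketch. It assumes pre-aligned core-genome sequences and allelic profiles stored as plain dictionaries. The function names, locus names and data are hypothetical, and real typing pipelines handle alignment, recombination filtering and allele calling, which are omitted here.

```python
def snp_distance(seq_a, seq_b):
    """Count positions that differ between two aligned core-genome sequences.

    Positions where either sequence has a non-ACGT character (gap, N) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x != y and x in "ACGT" and y in "ACGT"
    )

def allele_distance(profile_a, profile_b):
    """Count shared core loci whose allele numbers differ.

    Loci that are absent or uncalled (None) in either profile are ignored.
    """
    shared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])

# Toy isolates: two SNPs, one ambiguous site, and one differing core locus.
d_snp = snp_distance("ACGTACGT", "ACGTTCGA")
d_snp_masked = snp_distance("ACGNACGT", "ACGTACGA")
d_allele = allele_distance(
    {"locus1": 1, "locus2": 4, "locus3": 2},
    {"locus1": 1, "locus2": 5, "locus3": None},
)
print(d_snp, d_snp_masked, d_allele)
```

    One practical difference is that several SNPs inside the same gene collapse into a single allele difference in the gene-by-gene approach, so the two distances are not interchangeable and cutoffs for one cannot be reused for the other.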

  5. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

    Directory of Open Access Journals (Sweden)

    Emre Guney

    Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.

  6. A Morpholino-based screen to identify novel genes involved in craniofacial morphogenesis

    Science.gov (United States)

    Melvin, Vida Senkus; Feng, Weiguo; Hernandez-Lagunas, Laura; Artinger, Kristin Bruk; Williams, Trevor

    2014-01-01

    BACKGROUND The regulatory mechanisms underpinning facial development are conserved between diverse species. Therefore, results from model systems provide insight into the genetic causes of human craniofacial defects. Previously, we generated a comprehensive dataset examining gene expression during development and fusion of the mouse facial prominences. Here, we used this resource to identify genes that have dynamic expression patterns in the facial prominences, but for which only limited information exists concerning developmental function. RESULTS This set of ~80 genes was used for a high throughput functional analysis in the zebrafish system using Morpholino gene knockdown technology. This screen revealed three classes of cranial cartilage phenotypes depending upon whether knockdown of the gene affected the neurocranium, viscerocranium, or both. The targeted genes that produced consistent phenotypes encoded proteins linked to transcription (meis1, meis2a, tshz2, vgll4l), signaling (pkdcc, vlk, macc1, wu:fb16h09), and extracellular matrix function (smoc2). The majority of these phenotypes were not altered by reduction of p53 levels, demonstrating that both p53 dependent and independent mechanisms were involved in the craniofacial abnormalities. CONCLUSIONS This Morpholino-based screen highlights new genes involved in development of the zebrafish craniofacial skeleton with wider relevance to formation of the face in other species, particularly mouse and human. PMID:23559552

  7. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Background: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results: With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions: Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
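
    The idea of eliminating chunks of genes based on the entropy of the SVM weight distribution can be sketched as below. This is a simplified illustration, not the published E-RFE rule: the binning, the threshold `min_entropy`, and the decision to drop the whole lowest bin are assumptions, and the weights are supplied directly rather than obtained by training an SVM.

```python
import math

def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in bits) of the histogram of |weights| rescaled to [0, 1]."""
    mags = [abs(w) for w in weights]
    lo, hi = min(mags), max(mags)
    span = (hi - lo) or 1.0  # avoid division by zero when all weights are equal
    counts = [0] * n_bins
    for m in mags:
        idx = min(int((m - lo) / span * n_bins), n_bins - 1)
        counts[idx] += 1
    n = len(mags)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def genes_to_drop(weights, n_bins=10, min_entropy=1.0):
    """Indices of genes to eliminate in one RFE step (assumed, simplified rule).

    If the weight histogram is spread out (entropy >= min_entropy), drop every
    gene in the lowest bin at once; otherwise fall back to standard RFE and
    drop only the single gene with the smallest absolute weight.
    """
    mags = [abs(w) for w in weights]
    if weight_entropy(weights, n_bins) >= min_entropy:
        lo, hi = min(mags), max(mags)
        cutoff = lo + (hi - lo) / n_bins
        chunk = [i for i, m in enumerate(mags) if m < cutoff]
        if chunk:
            return chunk
    return [min(range(len(mags)), key=mags.__getitem__)]

# Hypothetical SVM weights for eight genes.
weights = [0.01, 0.02, 0.015, 0.9, 0.5, -0.7, 0.03, 0.3]
print(weight_entropy(weights), genes_to_drop(weights))
```

    Removing several low-weight genes per iteration is what reduces the number of SVM retraining rounds relative to standard one-gene-at-a-time RFE.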

  8. Single-gene prognostic signatures for advanced stage serous ovarian cancer based on 1257 patient samples.

    Science.gov (United States)

    Zhang, Fan; Yang, Kai; Deng, Kui; Zhang, Yuanyuan; Zhao, Weiwei; Xu, Huan; Rong, Zhiwei; Li, Kang

    2018-04-16

    We sought to identify stable single-gene prognostic signatures based on a large collection of advanced stage serous ovarian cancer (AS-OvCa) gene expression data and explore their functions. The empirical Bayes (EB) method was used to remove the batch effect and integrate 8 ovarian cancer datasets. Univariate Cox regression was used to evaluate the association between gene and overall survival (OS). The Database for Annotation, Visualization and Integrated Discovery (DAVID) tool was used for the functional annotation of genes for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The batch effect was removed by the EB method, and 1257 patient samples were used for further analysis. We selected 341 single-gene prognostic signatures with FDR matrix organization, focal adhesion and DNA replication which are closely associated with cancer. We used the EB method to remove the batch effect of 8 datasets, integrated these datasets and identified stable prognosis signatures for AS-OvCa.

  9. Prediction of highly expressed genes in microbes based on chromatin accessibility

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2007-02-01

    Full Text Available Abstract Background: It is well known that gene expression is dependent on chromatin structure in eukaryotes and it is likely that chromatin can play a role in bacterial gene expression as well. Here, we use a nucleosomal position preference measure of anisotropic DNA flexibility to predict highly expressed genes in microbial genomes. We compare these predictions with those based on codon adaptation index (CAI) values, and also with experimental data for 6 different microbial genomes, with a particular interest in experimental data from Escherichia coli. Moreover, position preference is examined further in 328 sequenced microbial genomes. Results: We find that absolute gene expression levels are correlated with the position preference in many microbial genomes. It is postulated that in these regions, the DNA may be more accessible to the transcriptional machinery. Moreover, ribosomal proteins and ribosomal RNA are encoded by DNA having significantly lower position preference values than other genes in fast-replicating microbes. Conclusion: This insight into DNA structure-dependent gene expression in microbes may be exploited for predicting the expression of non-translated genes such as non-coding RNAs that may not be predicted by any of the conventional codon usage bias approaches.
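
    The record above uses the codon adaptation index (CAI) as a baseline predictor of expression. As a rough illustration of the standard definition, the sketch below computes CAI as the geometric mean of per-codon relative adaptiveness values, each being a codon's count in a reference set divided by the count of the most used synonymous codon. The two-family codon table `FAMILIES`, the reference counts, and the example gene are toy assumptions; a real analysis would use the full genetic code and a reference set of highly expressed genes.

```python
import math

# Toy synonymous-codon families (assumption: only lysine and glutamate).
FAMILIES = {"K": ["AAA", "AAG"], "E": ["GAA", "GAG"]}


def relative_adaptiveness(reference_counts):
    """Map each codon to count / (max count among its synonymous codons)."""
    w = {}
    for codons in FAMILIES.values():
        top = max(reference_counts.get(c, 0) for c in codons)
        for c in codons:
            w[c] = reference_counts.get(c, 0) / top if top else 0.0
    return w


def cai(codons, w):
    """Geometric mean of w over the codons that have a positive weight."""
    logs = [math.log(w[c]) for c in codons if w.get(c, 0.0) > 0.0]
    return math.exp(sum(logs) / len(logs))


# Hypothetical codon counts in a reference set of highly expressed genes.
reference = {"AAA": 80, "AAG": 20, "GAA": 60, "GAG": 30}
weights = relative_adaptiveness(reference)

gene = ["AAA", "GAG", "AAG", "GAA"]  # hypothetical coding sequence, by codon
score = cai(gene, weights)
print(round(score, 4))
```

    A gene built only from the preferred codons scores 1.0, and the score falls as rarer synonymous codons are used.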

  10. Weighted functional linear regression models for gene-based association analysis.

    Science.gov (United States)

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
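
    The record above defines variant weights from allele frequencies via the beta distribution. A minimal sketch of that idea follows: each variant's weight is the Beta(a, b) density evaluated at its minor allele frequency. The shape parameters a = 1 and b = 25 are an assumption borrowed from common practice in kernel-based tests, not values stated in the record, and the frequencies are invented. This is not the FREGAT implementation.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def variant_weights(mafs, a=1.0, b=25.0):
    """Weight each variant by the beta density at its minor allele frequency."""
    return [beta_pdf(f, a, b) for f in mafs]


# Hypothetical minor allele frequencies for four variants in one gene.
mafs = [0.001, 0.01, 0.05, 0.30]
weights = variant_weights(mafs)
for f, w in zip(mafs, weights):
    print(f, round(w, 4))
```

    With these shape parameters the weights decrease as allele frequency rises, so rare variants contribute more to the gene-level statistic than common ones.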

  11. Tipping the Proteome with Gene-Based Vaccines: Weighing in on the Role of Nanomaterials

    International Nuclear Information System (INIS)

    Flores, K.J.; Craig, M.; Smith, J.J.; DeLong, R.K.; Wanekaya, A.; Dong, L.

    2012-01-01

    Since the first generation of DNA vaccines was introduced in 1988, remarkable progress has been made in improving their efficacy and immunogenicity. Although human clinical trials have shown that delivery of DNA vaccines is well tolerated and safe, the potency of these vaccines in humans is somewhat less than optimal. The development of a gene-based vaccine effective enough to be approved for clinical use in humans would be one of the most important advances in vaccines to date, if not the most important. This paper highlights the literature relating to gene-based vaccines, specifically DNA vaccines, and suggests possible approaches to boost their performance. In addition, we explore the idea that combining RNA and nanomaterials may hold the key to successful gene-based vaccines for prevention and treatment of disease.

  12. A sight on protein-based nanoparticles as drug/gene delivery systems.

    Science.gov (United States)

    Salatin, Sara; Jelvehgari, Mitra; Maleki-Dizaj, Solmaz; Adibkia, Khosro

    2015-01-01

    Polymeric nanomaterials have been applied extensively in the preparation of targeted and controlled-release drug/gene delivery systems. However, problems in the formulation of synthetic polymers, such as the use of toxic solvents and surfactants, have limited their applications. In this regard, natural biomolecules, including proteins and polysaccharides, are suitable alternatives due to their safety. According to the literature, protein-based nanoparticles possess many advantages for drug and gene delivery, such as biocompatibility, biodegradability and the ability to be functionalized with targeting ligands. This review provides a general view of the application of biodegradable protein-based nanoparticles in drug/gene delivery based on their origins. The unique physicochemical properties that help them to be formulated as pharmaceutical carriers are also discussed.

  13. Hessian regularization based non-negative matrix factorization for gene expression data clustering.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Wang, Congzhi

    2015-01-01

    Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze gene expression data. Data representation plays an important role in clustering analysis. Non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method whose natural properties give it stronger extrapolating power, especially for small sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for a subsequent clustering task. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and original NMF, which suggests the potential application of HR-NMF for gene expression data.

  14. GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

    Science.gov (United States)

    Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

    2010-04-01

    A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
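
    The record above contrasts GO-Bayes with the standard ORA based on the hypergeometric test. For orientation, the sketch below shows that baseline test for a single GO term: the upper-tail probability of seeing at least the observed number of annotated genes in the selected gene group. The function name and the example counts are illustrative assumptions; the Bayesian model itself is not reproduced here.

```python
from math import comb


def ora_p_value(n_universe, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe:  genes measured in the experiment
    n_annotated: genes in the universe annotated with the GO term
    n_selected:  genes in the identified group (e.g. differentially expressed)
    n_overlap:   selected genes that carry the annotation
    """
    total = comb(n_universe, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes, 5 annotated, 4 selected, 2 in the overlap.
p = ora_p_value(20, 5, 4, 2)
print(round(p, 4))
```

    Because each term is tested on its own counts, nothing in this calculation uses the parent, child, or sibling terms, which is the limitation the record says GO-Bayes addresses.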

  15. Expression-based clustering of CAZyme-encoding genes of Aspergillus niger.

    Science.gov (United States)

    Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P

    2017-11-23

    The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation, enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to the presence of plant biomass related carbon sources were analyzed for the wild-type strain N402, grown on a large range of carbon sources, and for the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX, grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes, which go beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified, based on their expression profile. Notably, in several cases the expression profile calls into question the function assignment of uncharacterized genes that was based on homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of enzymes encoded by these non-characterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and therefore an even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight into the complex regulatory systems that drive the conversion of plant biomass by fungi. In

  16. Cellular automata-based artificial life system of horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Ji-xin Liu

    2016-02-01

    Full Text Available Mutation and natural selection are the core of Darwin's idea of evolution, and many algorithms and models are based on this idea. However, a growing body of research on the evolution of prokaryotes indicates that horizontal gene transfer (HGT) is much more important and universal than the authors had imagined. Owing to this mechanism, prokaryotes not only adapt to nearly any environment on Earth, but also form a global genetic bank and a super communication network with all the genes of the prokaryotic world. Against this background, they present a novel cellular automata model, general gene transfer, to simulate and study vertical gene transfer and HGT in prokaryotes. At the same time, they use Schrödinger's life theory to formulate some evaluation indices and to discuss the intelligence and cognition of prokaryotes derived from HGT.

  17. Analysis of mammalian gene function through broad based phenotypic screens across a consortium of mouse clinics

    Science.gov (United States)

    Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl MJ; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie

    2015-01-01

    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse ES cell knockout resource provides a basis for characterisation of relationships between gene and phenotype. The EUMODIC consortium developed and validated robust methodologies for broad-based phenotyping of knockouts through a pipeline comprising 20 disease-orientated platforms. We developed novel statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no prior functional annotation. We captured data from over 27,000 mice finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. Novel phenotypes were uncovered for many genes with unknown function providing a powerful basis for hypothesis generation and further investigation in diverse systems. PMID:26214591

  18. Clustering gene expression data based on predicted differential effects of GV interaction.

    Science.gov (United States)

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarrays have become a popular technology in biological and medical research. However, systematic and stochastic variability in microarray data is expected and unavoidable, so the raw measurements carry inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may introduce biased interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and demonstrated the utility of our approach using an external validation.

  19. Study of hepatitis B virus gene mutations with enzymatic colorimetry-based DNA microarray.

    Science.gov (United States)

    Mao, Hailei; Wang, Huimin; Zhang, Donglei; Mao, Hongju; Zhao, Jianlong; Shi, Jian; Cui, Zhichu

    2006-01-01

    To establish a modified microarray method for detecting HBV gene mutations in the clinic. Site-specific oligonucleotide probes were immobilized to microarray slides and hybridized to biotin-labeled HBV gene fragments amplified from two-step PCR. Hybridized targets were transferred to nitrocellulose membranes, followed by intensity measurement using BCIP/NBT colorimetry. HBV genes from 99 Hepatitis B patients and 40 healthy blood donors were analyzed. Mutation frequencies of HBV pre-core/core and basic core promoter (BCP) regions were found to be significantly higher in the patient group (42%, 40% versus 2.5%, 5%, P colorimetry method exhibited the same level of sensitivity and reproducibility. An enzymatic colorimetry-based DNA microarray assay was successfully established to monitor HBV mutations. Pre-core/core and BCP mutations of HBV genes could be major causes of HBV infection in HBeAg-negative patients and could also be relevant to chronicity and aggravation of hepatitis B.

  20. Molecular characterisation of lumpy skin disease virus and sheeppox virus based on P32 gene

    Directory of Open Access Journals (Sweden)

    P.M.A.Rashid

    2017-06-01

    Full Text Available Lumpy skin disease virus (LSDV) and sheeppox virus (SPV) have a considerable economic impact on the cattle and small ruminant industry. They are listed in group A of contagious diseases by the World Organization for Animal Health (OIE). This study addressed molecular characterisation of the first LSDV outbreak and an endemic SPV in the Kurdistan region of Iraq based on the P32 gene. The results indicated that the P32 gene can be successfully used for diagnosis of LSDV. The phylogenetic and molecular analysis showed that there may be a new LSDV isolate circulating in Kurdistan which uniquely shared the same characteristic amino acid sequence with SPV and GPV, leucine at amino acid position 51 in the P32 gene, as well as a few genetically distinct SPV causing pox disease in Kurdistan sheep. This study provided sequence information of the P32 gene for several LSDV isolates, which positively affects the epidemiological study of Capripoxvirus.

  1. Herpes simplex virus type 1 strain KOS carries a defective US9 and a mutated US8A gene.

    Science.gov (United States)

    Negatsch, Alexandra; Mettenleiter, Thomas C; Fuchs, Walter

    2011-01-01

    The membrane protein encoded by the US9 gene of alphaherpesviruses plays an important role during virion assembly and transport in neurons. Here, we demonstrate that in herpes simplex virus type 1 (HSV-1) strain KOS, due to base substitutions, the predicted TATA-box of US9 is mutated, and a premature stop is present at codon 58 of US9, which contains 91 codons in other HSV-1 strains. The TATA-box mutation also removes the native stop codon of the adjacent US8A gene, leading to extension of the coding region from 160 to 191 codons. Northern blot analyses revealed reduced transcription of US9 in cells infected with HSV-1 KOS. Moreover, a US9-specific antiserum did not detect any gene products in Western blot and immunofluorescence analyses of KOS-infected cells, indicating that the truncated protein is not stable. In contrast, Western blot reactions of a pUS8A-specific antiserum confirmed enlargement of this protein in HSV-1 KOS.

  2. Network-based association of hypoxia-responsive genes with cardiovascular diseases

    International Nuclear Information System (INIS)

    Wang, Rui-Sheng; Oldham, William M; Loscalzo, Joseph

    2014-01-01

    Molecular oxygen is indispensable for cellular viability and function. Hypoxia is a stress condition in which oxygen demand exceeds supply. Low cellular oxygen content induces a number of molecular changes to activate regulatory pathways responsible for increasing the oxygen supply and optimizing cellular metabolism under limited oxygen conditions. Hypoxia plays critical roles in the pathobiology of many diseases, such as cancer, heart failure, myocardial ischemia, stroke, and chronic lung diseases. Although the complicated associations between hypoxia and cardiovascular (and cerebrovascular) diseases (CVD) have been recognized for some time, there are few studies that investigate their biological link from a systems biology perspective. In this study, we integrate hypoxia genes, CVD genes, and the human protein interactome in order to explore the relationship between hypoxia and cardiovascular diseases at a systems level. We show that hypoxia genes are much closer to CVD genes in the human protein interactome than that expected by chance. We also find that hypoxia genes play significant bridging roles in connecting different cardiovascular diseases. We construct a hypoxia-CVD bipartite network and find several interesting hypoxia-CVD modules with significant gene ontology similarity. Finally, we show that hypoxia genes tend to have more CVD interactors in the human interactome than in random networks of matching topology. Based on these observations, we can predict novel genes that may be associated with CVD. This network-based association study gives us a broad view of the relationships between hypoxia and cardiovascular diseases and provides new insights into the role of hypoxia in cardiovascular biology.

  3. Molecular characterization and expression of the M6 gene of grass carp hemorrhage virus (GCHV), an aquareovirus.

    Science.gov (United States)

    Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y

    2001-07-01

    The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.

  4. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    International Nuclear Information System (INIS)

    Gur-Dedeoglu, Bala; Konu, Ozlen; Kir, Serkan; Ozturk, Ahmet Rasit; Bozkurt, Betul; Ergul, Gulusan; Yulug, Isik G

    2008-01-01

    Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real-time qRT-PCR supported the meta-analysis results. 

  5. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    Directory of Open Access Journals (Sweden)

    Ergul Gulusan

    2008-12-01

    Full Text Available Abstract Background: Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. Methods: A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. Results: The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real

  6. Double-Bottom Chaotic Map Particle Swarm Optimization Based on Chi-Square Test to Determine Gene-Gene Interactions

    Science.gov (United States)

    Yang, Cheng-Hong; Chang, Hsueh-Wei

    2014-01-01

    Gene-gene interaction studies focus on the investigation of the association between the single nucleotide polymorphisms (SNPs) of genes for disease susceptibility. Statistical methods are widely used to search for a good model of gene-gene interaction for disease analysis, and the previously determined models have successfully explained the effects between SNPs and diseases. However, the huge numbers of potential combinations of SNP genotypes limit the use of statistical methods for analysing high-order interaction, and finding an available high-order model of gene-gene interaction remains a challenge. In this study, an improved particle swarm optimization with double-bottom chaotic maps (DBM-PSO) was applied to assist statistical methods in the analysis of associated variations to disease susceptibility. A big data set was simulated using the published genotype frequencies of 26 SNPs amongst eight genes for breast cancer. Results showed that the proposed DBM-PSO successfully determined two- to six-order models of gene-gene interaction for the risk association with breast cancer (odds ratio > 1.0; P value <0.05). Analysis results supported that the proposed DBM-PSO can identify good models and provide higher chi-square values than conventional PSO. This study indicates that DBM-PSO is a robust and precise algorithm for determination of gene-gene interaction models for breast cancer. PMID:24895547
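
    The record above scores candidate gene-gene interaction models by their chi-square value and reports odds ratios. The sketch below shows how such a score could be computed for one candidate SNP genotype combination from a 2×2 table of cases and controls. The table layout, the counts, and the function names are assumptions for illustration; the particle swarm search and the chaotic maps of DBM-PSO are not shown.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def odds_ratio(table):
    """Odds ratio for a 2x2 table [[a, b], [c, d]] computed as (a*d)/(b*c)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts. Rows: carries the genotype combination / does not.
# Columns: cases / controls.
table = [[30, 10], [20, 40]]
fitness = chi_square(table)
print(round(fitness, 4), odds_ratio(table))
```

    A search procedure such as PSO would evaluate this kind of score for many SNP combinations and keep the combinations with the largest values.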

  7. Mesenchymal stem cell-based gene therapy: A promising therapeutic strategy.

    Science.gov (United States)

    Mohammadian, Mozhdeh; Abasi, Elham; Akbarzadeh, Abolfazl

    2016-08-01

    Mesenchymal stem cells (MSCs) are multipotent stromal cells that exist in bone marrow, fat, and many other tissues, and can differentiate into a variety of cell types including osteoblasts, chondrocytes, and adipocytes, as well as myocytes and neurons. Moreover, they have great capacity for self-renewal while maintaining their multipotency. Their capacity for proliferation and differentiation, in addition to their immunomodulatory activity, makes them very promising candidates for cell-based regenerative medicine. MSCs can also mobilize to the site of damage: following intravenous transplantation, they migrate to the site of injury via their chemokine receptors. In this respect, they can be applied for MSC-based gene therapy. In this new therapeutic method, genes of interest are introduced into MSCs via viral and non-viral methods that lead to transgene expression in them. Although stem cell-based gene therapy is a relatively new strategy, it offers new hope for the treatment of a variety of genetic disorders. In the near future, MSCs can be of use in a vast number of clinical applications, because of their uncomplicated isolation, culture, and genetic manipulation. However, full consideration is still crucial before they are utilized for clinical trials, because the number of studies that demonstrate the advantageous effects of MSC-based gene therapy is still limited.

  8. Environmental Application of Reporter-Genes Based Biosensors for Chemical Contamination Screening

    Directory of Open Access Journals (Sweden)

    Matejczyk Marzena

    2014-12-01

    Full Text Available The paper presents results of research on possible applications of reporter-gene-based microorganisms, including a selective presentation of the drawbacks and advantages of recent methodological solutions in the construction of genetic systems for biosensing elements in environmental research. The most robust and popular genetic fusions and new trends in reporter gene technology are described, such as lacZ (β-galactosidase), xylE (catechol 2,3-dioxygenase), gfp (green fluorescent protein) and its mutated forms, lux (prokaryotic luciferase), luc (eukaryotic luciferase), phoA (alkaline phosphatase), gusA and gurA (β-glucuronidase), and antibiotic and heavy metal resistance. Reporter-gene-based biosensors using genetically modified bacteria and yeast work successfully for genotoxicity, bioavailability and oxidative stress assessment, and for detection and monitoring of toxic compounds in drinking water and different environmental samples: surface water, soil, and sediments.

  9. Development of an ELA-DRA gene typing method based on pyrosequencing technology.

    Science.gov (United States)

    Díaz, S; Echeverría, M G; It, V; Posik, D M; Rogberg-Muñoz, A; Pena, N L; Peral-García, P; Vega-Pla, J L; Giovambattista, G

    2008-11-01

    The polymorphism of the equine lymphocyte antigen (ELA) class II DRA gene has been detected by polymerase chain reaction-single-strand conformational polymorphism (PCR-SSCP) and reference strand-mediated conformation analysis. These methodologies allowed the identification of 11 ELA-DRA exon 2 sequences, three of which are widely distributed among domestic horse breeds. Herein, we describe the development of a pyrosequencing-based method applicable to ELA-DRA typing, by screening samples from eight different horse breeds previously typed by PCR-SSCP. This sequence-based method would be useful in high-throughput genotyping of major histocompatibility complex genes in horses and other animal species, making this system interesting as a rapid screening method for animal genotyping of immune-related genes.

  10. Cytomegalovirus replicon-based regulation of gene expression in vitro and in vivo.

    Directory of Open Access Journals (Sweden)

    Hermine Mohr

    Full Text Available There is increasing evidence for a connection between DNA replication and the expression of adjacent genes. Therefore, this study addressed the question of whether a herpesvirus origin of replication can be used to activate or increase the expression of adjacent genes. Cell lines carrying an episomal vector, in which reporter genes are linked to the murine cytomegalovirus (MCMV) origin of lytic replication (oriLyt), were constructed. Reporter gene expression was silenced by a histone-deacetylase-dependent mechanism, but was resolved upon lytic infection with MCMV. Replication of the episome was observed subsequent to infection, leading to the induction of gene expression by more than 1000-fold. oriLyt-based regulation thus provided a unique opportunity for virus-induced conditional gene expression without the need for an additional induction mechanism. This principle was exploited to show effective late trans-complementation of the toxic viral protein M50 and the glycoprotein gO of MCMV. Moreover, the application of this principle for intracellular immunization against herpesvirus infection was demonstrated. The results of the present study show that viral infection specifically activated the expression of a dominant-negative transgene, which inhibited viral growth. This conditional system was operative in explant cultures of transgenic mice, but not in vivo. Several applications are discussed.

  11. A pathway-based network analysis of hypertension-related genes

    Science.gov (United States)

    Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

    2016-02-01

    The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease with many causes, including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not the scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. Using a threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on combinations of drugs acting on multiple genes.
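
    The hub-selection step described above can be sketched as follows. This is only an illustration: the abstract does not define "integrated centrality", so the sketch assumes a score that averages degree and closeness centrality and rescales it so the top gene scores 1. The graph, the gene labels (g1 to g5) and the function names are hypothetical.

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Reachable-node count divided by summed shortest-path length (BFS)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

def hub_genes(graph, threshold):
    """Assumed 'integrated' score: mean of both centralities, scaled to max 1."""
    deg = degree_centrality(graph)
    clo = closeness_centrality(graph)
    raw = {v: (deg[v] + clo[v]) / 2 for v in graph}
    top = max(raw.values())
    return sorted(v for v in graph if raw[v] / top > threshold)

# Hypothetical toy pathway network (undirected adjacency sets).
toy_network = {
    "g1": {"g2", "g3", "g4"},
    "g2": {"g1"},
    "g3": {"g1"},
    "g4": {"g1", "g5"},
    "g5": {"g4"},
}
hubs = hub_genes(toy_network, threshold=0.71)
```

    With this toy network, only the two best-connected genes pass the 0.71 cut-off; a different definition of the integrated score would generally select a different set.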

  12. Machine learning approaches to supporting the identification of photoreceptor-enriched genes based on expression data

    Directory of Open Access Journals (Sweden)

    Simpson David

    2006-03-01

    Full Text Available Abstract Background Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision. Many retinal diseases occur as a result of inherited dysfunction of the rod and cone photoreceptor cells. Development and maintenance of photoreceptors requires appropriate regulation of the many genes specifically or highly expressed in these cells. Over the last decades, different experimental approaches have been developed to identify photoreceptor enriched genes. Recent progress in RNA analysis technology has generated large amounts of gene expression data relevant to retinal development. This paper assesses a machine learning methodology for supporting the identification of photoreceptor enriched genes based on expression data. Results Based on the analysis of publicly-available gene expression data from the developing mouse retina generated by serial analysis of gene expression (SAGE), this paper presents a predictive methodology comprising several in silico models for detecting key complex features and relationships encoded in the data, which may be useful to distinguish genes in terms of their functional roles. In order to understand temporal patterns of photoreceptor gene expression during retinal development, a two-way cluster analysis was firstly performed. By clustering SAGE libraries, a hierarchical tree reflecting relationships between developmental stages was obtained. By clustering SAGE tags, a more comprehensive expression profile for photoreceptor cells was revealed. To demonstrate the usefulness of machine learning-based models in predicting functional associations from the SAGE data, three supervised classification models were compared. The results indicated that a relatively simple instance-based model (KStar model) performed significantly better than relatively more complex algorithms, e.g. neural networks. 
To deal with the problem of functional class imbalance occurring in the dataset, two data re
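
    To illustrate what an instance-based classifier does, here is a minimal k-nearest-neighbour sketch. It is not the KStar model named in the abstract (KStar uses an entropy-based distance); plain Euclidean distance is swapped in for simplicity. The expression profiles, the labels "enriched" and "other", and the function name are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Label a query profile by majority vote among its k nearest
    training profiles (Euclidean distance)."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical expression profiles across three developmental stages.
training_set = [
    ((0.90, 0.80, 0.10), "enriched"),
    ((0.80, 0.90, 0.20), "enriched"),
    ((0.85, 0.70, 0.15), "enriched"),
    ((0.10, 0.20, 0.90), "other"),
    ((0.20, 0.10, 0.80), "other"),
]

label_a = knn_predict(training_set, (0.70, 0.80, 0.20))
label_b = knn_predict(training_set, (0.15, 0.15, 0.85))
```

    No model is fitted ahead of time: the training instances themselves are the model, which is what makes such methods simple relative to neural networks.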

  13. A comparison of 100 human genes using an alu element-based instability model.

    Directory of Open Access Journals (Sweden)

    George W Cook

    Full Text Available The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.

  14. A comparison of 100 human genes using an alu element-based instability model.

    Science.gov (United States)

    Cook, George W; Konkel, Miriam K; Walker, Jerilyn A; Bourgeois, Matthew G; Fullerton, Mitchell L; Fussell, John T; Herbold, Heath D; Batzer, Mark A

    2013-01-01

    The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.

  15. Avirulence (AVR) Gene-Based Diagnosis Complements Existing Pathogen Surveillance Tools for Effective Deployment of Resistance (R) Genes Against Rice Blast Disease.

    Science.gov (United States)

    Selisana, S M; Yanoria, M J; Quime, B; Chaipanya, C; Lu, G; Opulencia, R; Wang, G-L; Mitchell, T; Correll, J; Talbot, N J; Leung, H; Zhou, B

    2017-06-01

    Avirulence (AVR) genes in Magnaporthe oryzae, the fungal pathogen that causes the devastating rice blast disease, have been documented to be major targets subject to mutations to avoid recognition by resistance (R) genes. In this study, an AVR-gene-based diagnosis tool for determining the virulence spectrum of a rice blast pathogen population was developed and validated. A set of 77 single-spore field isolates was subjected to pathotype analysis using differential lines, each containing a single R gene, and classified into 20 virulent pathotypes, except for 4 isolates that lost pathogenicity. In all, 10 differential lines showed low frequency (95%), inferring the effectiveness of R genes present in the respective differential lines. In addition, the haplotypes of seven AVR genes were determined by polymerase chain reaction amplification and sequencing, if applicable. The calculated frequency of different AVR genes displayed significant variations in the population. AVRPiz-t and AVR-Pii were detected in 100 and 84.9% of the isolates, respectively. Five AVR genes such as AVR-Pik-D (20.5%) and AVR-Pik-E (1.4%), AVRPiz-t (2.7%), AVR-Pita (0%), AVR-Pia (0%), and AVR1-CO39 (0%) displayed low or even zero frequency. The frequency of AVR genes correlated almost perfectly with the resistance frequency of the cognate R genes in differential lines, except for International Rice Research Institute-bred blast-resistant lines IRBLzt-T, IRBLta-K1, and IRBLkp-K60. Both genetic analysis and molecular marker validation revealed an additional R gene, most likely Pi19 or its allele, in these three differential lines. This can explain the spuriously higher resistance frequency of each target R gene based on conventional pathotyping. This study demonstrates that AVR-gene-based diagnosis provides a precise, R-gene-specific, and differential line-free assessment method that can be used for determining the virulence spectrum of a rice blast pathogen population and for predicting the

  16. Genome-Wide Identification and Expression Analysis of the UGlcAE Gene Family in Tomato

    OpenAIRE

    Xing Ding; Jinhua Li; Yu Pan; Yue Zhang; Lei Ni; Yaling Wang; Xingguo Zhang

    2018-01-01

    UGlcAE is capable of interconverting UDP-d-galacturonic acid and UDP-d-glucuronic acid, and UDP-d-galacturonic acid is an activated precursor for the synthesis of pectins in plants. In this study, we identified nine UGlcAE protein-encoding genes in tomato. The nine UGlcAE genes were distributed on eight chromosomes in tomato, and the corresponding proteins contained one or two trans-membrane domains. The phylogenetic analysis showed that SlUGlcAE genes could be divided into s...

  17. RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

    Science.gov (United States)

    Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-01-01

    Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules of items (or genes) generated by association rule mining (ARM) algorithms confuses the decision maker. In this article, we propose a weighted rule-mining technique (RANWAR, or rank-based weighted association rule mining) that ranks the rules using two novel rule-interestingness measures, rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to bypass the problem. These measures depend on the rank of items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than state-of-the-art association rule mining algorithms and thus saves execution time. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant to the related diseases. Finally, the top rules produced by RANWAR that are not in Apriori are reported.
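
    The idea of weighting items by rank before scoring rules can be sketched as below. The abstract does not give the formulas for wcs and wcc, so the definitions here are assumptions made purely for illustration: the weight falls linearly with rank, weighted support is the mean item weight times the itemset frequency, and weighted confidence is a ratio of weighted supports. Gene names and function names are hypothetical.

```python
def rank_weights(ranked_items):
    """Assumed weighting: best-ranked item gets 1.0, falling linearly."""
    n = len(ranked_items)
    return {item: (n - i) / n for i, item in enumerate(ranked_items)}

def weighted_support(itemset, transactions, weights):
    """Assumed measure: mean item weight times the itemset frequency."""
    itemset = frozenset(itemset)
    freq = sum(1 for t in transactions if itemset <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * freq

def weighted_confidence(antecedent, consequent, transactions, weights):
    """Assumed measure: weighted support of the rule over its antecedent."""
    both = weighted_support(set(antecedent) | set(consequent),
                            transactions, weights)
    base = weighted_support(antecedent, transactions, weights)
    return both / base if base else 0.0

# Hypothetical ranked genes (best first) and discretised samples.
weights = rank_weights(["geneA", "geneB", "geneC", "geneD"])
samples = [
    {"geneA", "geneB"},
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneD"},
    {"geneB", "geneC"},
]
support_ab = weighted_support({"geneA", "geneB"}, samples, weights)
confidence_ab = weighted_confidence({"geneA"}, {"geneB"}, samples, weights)
```

    Under these assumed definitions, itemsets made of low-ranked genes score lower than equally frequent itemsets of high-ranked genes, so fewer of them clear a fixed support threshold.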

  18. A contribution to the study of plant development evolution based on gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Francisco J. Romero-Campero

    2013-08-01

    Full Text Available Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms.

  19. Beyond the Central Dogma: Model-Based Learning of How Genes Determine Phenotypes

    Science.gov (United States)

    Reinagel, Adam; Speth, Elena Bray

    2016-01-01

    In an introductory biology course, we implemented a learner-centered, model-based pedagogy that frequently engaged students in building conceptual models to explain how genes determine phenotypes. Model-building tasks were incorporated within case studies and aimed at eliciting students' understanding of 1) the origin of variation in a population…

  20. PINTA: a web server for network-based gene prioritization from expression data

    DEFF Research Database (Denmark)

    Nitsch, Daniela; Tranchevent, Léon-Charles; Goncalves, Joana P.

    2011-01-01

    PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction...

  1. The UDP glucuronosyltransferase gene superfamily: suggested nomenclature based on evolutionary divergence

    NARCIS (Netherlands)

    Burchell, B.; Nebert, D. W.; Nelson, D. R.; Bock, K. W.; Iyanagi, T.; Jansen, P. L.; Lancet, D.; Mulder, G. J.; Chowdhury, J. R.; Siest, G.

    1991-01-01

    A nomenclature system for the UDP glucuronosyltransferase superfamily is proposed, based on divergent evolution of the genes. A total of 26 distinct cDNAs in five mammalian species have been sequenced to date. Comparison of the deduced amino acid sequences leads to the definition of two families and

  2. Establishment of a Cre recombinase based mutagenesis protocol for markerless gene deletion in Streptococcus suis.

    Science.gov (United States)

    Koczula, A; Willenborg, J; Bertram, R; Takamatsu, D; Valentin-Weigand, P; Goethe, R

    2014-12-01

    The lack of knowledge about pathogenicity mechanisms of Streptococcus (S.) suis is, at least partially, attributed to limited methods for its genetic manipulation. Here, we established a Cre-lox based recombination system for markerless gene deletions in S. suis serotype 2 with high selective pressure and without undesired side effects. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.

    Science.gov (United States)

    Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju

    2017-04-27

    means of network and pathway-based methodology, we explored the pathogenetic mechanism underlying AD at a systems biology level. Results from our work could provide valuable clues for understanding the molecular mechanism underlying AD. In addition, the framework proposed in this study could be used to investigate the pathological molecular network and genes relevant to other complex diseases or phenotypes.

  4. Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

    Science.gov (United States)

    Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

    2014-01-01

    Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Extensive research is ongoing to identify genes, including their gene products, and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view of current OS knowledge that is quickly and easily accessible to researchers in the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts in the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort at an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific

  5. Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

    OpenAIRE

    Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.; Bruns, Christiane J.

    2012-01-01

    Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associa...

  6. Analyzing Plasmodium falciparum erythrocyte membrane protein 1 gene expression by a next generation sequencing based method

    DEFF Research Database (Denmark)

    Jespersen, Jakob S.; Petersen, Bent; Seguin-Orlando, Andaine

    2013-01-01

    ..., encoded by ~60 highly variable 'var' genes per haploid genome. PfEMP1 is exported to the surface of infected erythrocytes and is thought to be fundamental to immune evasion by adhesion to host and parasite factors. The highly variable nature has constituted a roadblock in var expression studies aimed at identifying PfEMP1 features associated with high virulence. Here we present the first effective method for sequence analysis of var genes expressed in field samples: a sequential PCR and next generation sequencing based technique applied on expressed var sequence tags and subsequently on long range PCR...

  7. Integration of Genome Scale Metabolic Networks and Gene Regulation of Metabolic Enzymes With Physiologically Based Pharmacokinetics.

    Science.gov (United States)

    Maldonado, Elaina M; Leoncikas, Vytautas; Fisher, Ciarán P; Moore, J Bernadette; Plant, Nick J; Kierzek, Andrzej M

    2017-11-01

    The scope of physiologically based pharmacokinetic (PBPK) modeling can be expanded by assimilating mechanistic models of intracellular processes from the field of systems biology. Genome scale metabolic networks (GSMNs) represent the whole set of metabolic enzymes expressed in human tissues. Dynamic models of the gene regulation of key drug metabolism enzymes are available. Here, we introduce GSMNs and review ongoing work on integration of PBPK, GSMNs, and metabolic gene regulation. We demonstrate example models. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  8. Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Science.gov (United States)

    2011-01-01

    Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were
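
    The recall and precision figures quoted above follow the standard information-retrieval definitions, which can be sketched as below. The article identifiers and the function name are hypothetical; the sketch does not reproduce the VO-SciMiner evaluation itself.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical evaluation: five truly relevant articles, four retrieved,
# one of which ("x1") is a false positive.
relevant_articles = {"p1", "p2", "p3", "p4", "p5"}
retrieved_articles = {"p1", "p2", "p3", "x1"}
precision, recall = precision_recall(retrieved_articles, relevant_articles)
```

    High precision with lower recall means most retrieved articles are relevant but some relevant ones are missed; the abstract reports both values as high on its test set.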

  9. Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

    Science.gov (United States)

    Podsakoff, G; Wong, K K; Chatterjee, S

    1994-09-01

    Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthesis inhibitors fluorodeoxyuridine and aphidicolin or by contact inhibition induced by confluence and serum starvation. Cells in logarithmic growth or DNA synthesis arrest were transduced with vCWR:beta gal, an AAV-based vector encoding beta-galactosidase under Rous sarcoma virus long terminal repeat promoter control. Under each condition tested, vCWR:beta Gal expression in nondividing cells was at least equivalent to that in actively proliferating cells, suggesting that mechanisms for virus attachment, nuclear transport, virion uncoating, and perhaps some limited second-strand synthesis of AAV vectors were present in nondividing cells. Southern hybridization analysis of vector sequences from cells transduced while in DNA synthetic arrest and expanded after release of the block confirmed ultimate integration of the vector genome into cellular chromosomal DNA. These findings may provide the basis for the use of AAV-based vectors for gene transfer into quiescent cell populations such as totipotent hematopoietic stem cells.

  10. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Full Text Available Abstract Background Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions This actionable gene-based

  11. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The inherently high-dimensional, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. Many feature selection approaches exist, but most of them suffer from problems such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes that belong to the cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with a more stringent condition (higher positive and lower negative thresholds) reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.
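
    The cluster-scoring step described above (summing expression over the genes of a cluster, then keeping or dropping the cluster by thresholds and a percentage) can be sketched as follows. The exact conditions are not given in the abstract, so the rule used here is an assumption: a cluster is kept when at least `min_fraction` of one class scores above a positive threshold while at least `min_fraction` of the other class scores below a negative threshold. All names, labels and values are hypothetical.

```python
def cluster_scores(expression, genes):
    """Per-sample sum of expression over the genes in one cluster."""
    n_samples = len(next(iter(expression.values())))
    return [sum(expression[g][j] for g in genes) for j in range(n_samples)]

def select_clusters(expression, clusters, labels,
                    pos_threshold, neg_threshold, min_fraction):
    """Keep clusters that separate the two classes under the assumed rule."""
    class_a, class_b = sorted(set(labels))  # exactly two classes expected
    kept = []
    for name, genes in clusters.items():
        scores = cluster_scores(expression, genes)
        by_class = {c: [s for s, lab in zip(scores, labels) if lab == c]
                    for c in (class_a, class_b)}

        def frac_above(c):
            return sum(s > pos_threshold for s in by_class[c]) / len(by_class[c])

        def frac_below(c):
            return sum(s < neg_threshold for s in by_class[c]) / len(by_class[c])

        a_high = frac_above(class_a) >= min_fraction and frac_below(class_b) >= min_fraction
        b_high = frac_above(class_b) >= min_fraction and frac_below(class_a) >= min_fraction
        if a_high or b_high:
            kept.append(name)
    return kept

# Hypothetical centred expression values: rows are genes, columns samples.
expression = {
    "g1": [2.0, 3.0, -2.0, -3.0],
    "g2": [1.0, 2.0, -1.0, -2.0],
    "g3": [0.1, -0.2, 0.2, -0.1],
    "g4": [0.0, 0.1, -0.1, 0.0],
}
clusters = {"c1": ["g1", "g2"], "c2": ["g3", "g4"]}
labels = ["class1", "class1", "class2", "class2"]
kept = select_clusters(expression, clusters, labels,
                       pos_threshold=1.0, neg_threshold=-1.0, min_fraction=1.0)
```

    Raising `pos_threshold`, lowering `neg_threshold`, or raising `min_fraction` makes the filter more stringent, in the spirit of the reduction from 1117 to 58 genes reported above.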

  12. Sex determination in insects: a binary decision based on alternative splicing.

    Science.gov (United States)

    Salz, Helen K

    2011-08-01

    The gene regulatory networks that control sex determination vary between species. Despite these differences, comparative studies in insects have found that alternative splicing is reiteratively used in evolution to control expression of the key sex-determining genes. Sex determination is best understood in Drosophila where activation of the RNA binding protein-encoding gene Sex-lethal is the central female-determining event. Sex-lethal serves as a genetic switch because once activated it controls its own expression by a positive feedback splicing mechanism. Sex fate choice is also maintained by self-sustaining positive feedback splicing mechanisms in other dipteran and hymenopteran insects, although different RNA binding protein-encoding genes function as the binary switch. Studies exploring the mechanisms of sex-specific splicing have revealed the extent to which sex determination is integrated with other developmental regulatory networks. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. Time warping of evolutionary distant temporal gene expression data based on noise suppression

    Directory of Open Access Journals (Sweden)

    Papatsenko Dmitri

    2009-10-01

    Full Text Available Abstract Background Comparative analysis of genome-wide temporal gene expression data has a broad potential area of application, including evolutionary biology, developmental biology, and medicine. However, at large evolutionary distances, the construction of global alignments and the consequent comparison of the time-series data are difficult. The main reason is the accumulation of variability in expression profiles of orthologous genes in the course of evolution. Results We applied Pearson distance matrices, in combination with other noise-suppression techniques and data filtering, to improve alignments. This novel framework enhanced the capacity to capture the similarities between temporal gene expression datasets separated by large evolutionary distances. We aligned and compared the temporal gene expression data in budding (Saccharomyces cerevisiae) and fission (Schizosaccharomyces pombe) yeast, which are separated by more than ~400 Myr of evolution. We found that the global alignment (time warping) properly matched the duration of cell cycle phases in these distant organisms, which was measured in prior studies. At the same time, when applied to individual ortholog pairs, this alignment procedure revealed groups of genes with distinct alignments, different from the global alignment. Conclusion Our alignment-based predictions of differences in the cell cycle phases between the two yeast species were in good agreement with the existing data, thus supporting the computational strategy adopted in this study. We propose that the existence of the alternative alignments, specific to distinct groups of genes, suggests the presence of different synchronization modes between the two organisms and possible functional decoupling of particular physiological gene networks in the course of evolution.
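
    As an illustration of the general idea only, the sketch below combines a Pearson distance matrix between time points with textbook dynamic time warping. It is not the authors' pipeline: their additional noise-suppression and filtering steps are not reproduced, the function names are hypothetical, and the rows of both inputs are assumed to be matched ortholog pairs with no constant columns.

```python
import numpy as np

def pearson_distance_matrix(a, b):
    """a: genes x Ta, b: genes x Tb, rows are matched ortholog pairs (assumed).
    Returns a Ta x Tb matrix of 1 - Pearson r between time-point columns."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    az = (a - a.mean(axis=0)) / a.std(axis=0)
    bz = (b - b.mean(axis=0)) / b.std(axis=0)
    return 1.0 - (az.T @ bz) / a.shape[0]

def dtw_path(dist):
    """Standard dynamic time warping over a precomputed distance matrix."""
    n, m = dist.shape
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1, j - 1], i - 1, j - 1),
                      (cost[i - 1, j], i - 1, j),
                      (cost[i, j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    return path[::-1], cost[n, m]

# Toy example: series B is series A with its first time point repeated.
series_a = np.array([[1, 2, 3], [2, 1, 0], [0, 3, 1], [3, 0, 2]])
series_b = series_a[:, [0, 0, 1, 2]]
path, total = dtw_path(pearson_distance_matrix(series_a, series_b))
```

    In the toy example the recovered path maps the first time point of A onto the first two time points of B, which is the kind of stretching a global alignment of cell cycle phases has to capture.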

  14. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of form knowledge for pan-ethnic-group products depends primarily on a designer's subjective experience, without user participation. Most studies focus on detecting the perceptual demands of consumers for the target product category. A form gene clustering method for pan-ethnic-group products, based on emotional semantics, is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer-aided product form clustering technology. A case study of form gene clustering for typical pan-ethnic-group products indicates that the method is feasible. This paper opens up a new direction for the future development of product form design, which improves the agility of the product design process in the era of Industry 4.0.

  15. Affinity-based biosensors as promising tools for gene doping detection.

    Science.gov (United States)

    Minunni, Maria; Scarano, Simona; Mascini, Marco

    2008-05-01

    Innovative bioanalytical approaches can be foreseen as interesting means for solving relevant emerging problems in anti-doping control. Sport authorities fear that the newer form of doping, so-called gene doping, based on a misuse of gene therapy, will be undetectable and thus much less preventable. The World Anti-Doping Agency has already asked scientists to assist in finding ways to prevent and detect this newest kind of doping. In this Opinion article we discuss the main aspects of gene doping, from the putative target analytes to suitable sampling strategies. Moreover, we discuss the potential application of affinity sensing in this field, which so far has been successfully applied to a variety of analytical problems, from clinical diagnostics to food and environmental analysis.

  16. Comparison of different cationized proteins as biomaterials for nanoparticle-based ocular gene delivery.

    Science.gov (United States)

    Zorzi, Giovanni K; Párraga, Jenny E; Seijo, Begoña; Sanchez, Alejandro

    2015-11-01

    Cationized polymers have been proposed as transfection agents for gene therapy. The present work aims to improve the understanding of the potential use of different cationized proteins (atelocollagen, albumin and gelatin) as nanoparticle components and to investigate the possibility of modulating the physicochemical properties of the resulting nanoparticle carriers by selecting specific protein characteristics in an attempt to improve current ocular gene-delivery approaches. The toxicity profiles, as well as internalization and transfection efficiency, of the developed nanoparticles can be modulated by modifying the molecular weight of the selected protein and the amine used for cationization. The most promising systems are nanoparticles based on intermediate molecular weight gelatin cationized with the endogenous amine spermine, which exhibit an adequate toxicological profile, as well as effective association and protection of pDNA or siRNA molecules, thereby resulting in higher transfection efficiency and gene silencing than the other studied formulations. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  18. Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium.

    Directory of Open Access Journals (Sweden)

    Fengxi Yang

    Full Text Available Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data are available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb of clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutants corresponding to different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis. The dataset generated in our study provides new insights into the molecular mechanisms

  19. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    Full Text Available White mold, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major disease of common bean (Phaseolus vulgaris L.). WM7.1 and WM8.3 are two quantitative trait loci (QTL) with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI) population segregating for both QTL, were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs) relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.

  20. PHYLOGENETIC RELATIONSHIPS AMONGST 10 Durio SPECIES BASED ON PCR-RFLP ANALYSIS OF TWO CHLOROPLAST GENES

    Directory of Open Access Journals (Sweden)

    Panca J. Santoso

    2013-07-01

    Full Text Available Twenty-seven species of Durio have been identified in Sabah and Sarawak, Malaysia, but their relationships have not been studied. This study was conducted to analyse phylogenetic relationships amongst 10 Durio species in Malaysia using PCR-RFLP on two chloroplast DNA genes, i.e. ndhC-trnV and rbcL. DNAs were extracted from young leaves of 11 accessions from 10 Durio species collected from the Tenom Agriculture Research Station, Sabah, and University Agriculture Park, Universiti Putra Malaysia. Two pairs of oligonucleotide primers, N1-N2 and rbcL1-rbcL2, were used to flank the target regions ndhC-trnV and rbcL. Eight restriction enzymes, HindIII, BsuRI, PstI, TaqI, MspI, SmaI, BshNI, and EcoR130I, were used to digest the amplicons. Based on the results of PCR-RFLP on the ndhC-trnV gene, the 10 Durio species were grouped into five distinct clusters, and the accessions generally showed high variation. However, based on the results of PCR-RFLP on the rbcL gene, the species were grouped into three distinct clusters, and generally showed low variation. This means that the ndhC-trnV gene is more reliable for phylogenetic analysis at lower taxonomic levels of Durio species or for diversity analysis, while the rbcL gene is a reliable marker for phylogenetic analysis at higher taxonomic levels. PCR-RFLP on the ndhC-trnV and rbcL genes could therefore be considered a useful marker system for phylogenetic analysis amongst Durio species. These findings might be used for further molecular marker-assisted work in Durio breeding programs.

  1. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  2. Improved in vivo gene transfer into tumor tissue by stabilization of pseudodendritic oligoethylenimine-based polyplexes.

    Science.gov (United States)

    Russ, Verena; Fröhlich, Thomas; Li, Yunqiu; Halama, Anna; Ogris, Manfred; Wagner, Ernst

    2010-02-01

    HD O is a low molecular weight pseudodendrimer containing oligoethylenimine and degradable hexanediol diacrylate diesters. DNA polyplexes display encouraging gene transfer efficiency in vitro and in vivo but also a limited stability under physiological conditions. This limitation must be overcome for further development into more sophisticated formulations. HD O polyplexes were laterally stabilized by crosslinking surface amines via bifunctional crosslinkers, bioreducible dithiobis(succinimidyl propionate) (DSP) or the nonreducible analog disuccinimidyl suberate (DSS). Optionally, in a subsequent step, the targeting ligand transferrin (Tf) was attached to DSP-linked HD O polyplexes via Schiff base formation between HD O amino groups and Tf aldehyde groups, which were introduced into Tf by periodate oxidation of the glycosylation sites. Crosslinked DNA polyplexes showed an increased stability against exchange reaction by salt or heparin. Disulfide bond containing DSP-linked polyplexes were susceptible to reducing conditions. These polyplexes displayed the highest gene expression levels in vitro and in vivo (upon intratumoral application in mice), and these were significantly elevated and prolonged over standard or DSS-stabilized HD O formulations. DSP-stabilized HD O polyplexes with or without Tf coating were well-tolerated after intravenous application. High gene expression levels were found in tumor tissue, with negligible gene expression in any other organ. Lateral stabilization of HD O polyplexes with DSP crosslinker enhanced gene transfer efficacy and was essential for the incorporation of a ligand (Tf) into a stable particle formulation.

  3. Probability-based collaborative filtering model for predicting gene-disease associations.

    Science.gov (United States)

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.
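
    For orientation, the "typical latent factorization model" that this abstract starts from can be sketched as a plain L2-regularized matrix factorization trained by gradient descent. This is only the generic baseline: the average and personal heterogeneous regularization terms of PCFM are not reproduced, and the function name, hyperparameters and toy matrix are assumptions chosen for illustration.

```python
import numpy as np

def factorize(R, mask, k=2, lam=0.1, lr=0.05, epochs=500, seed=0):
    """Fit R ~ G @ D.T on observed entries (mask == 1) with L2 regularization.

    R    : genes x diseases association matrix (toy data here)
    mask : same shape, 1 where the association status is observed
    All hyperparameter values are illustrative defaults, not from the paper.
    """
    rng = np.random.default_rng(seed)
    G = 0.1 * rng.standard_normal((R.shape[0], k))  # gene latent factors
    D = 0.1 * rng.standard_normal((R.shape[1], k))  # disease latent factors
    for _ in range(epochs):
        E = mask * (R - G @ D.T)                    # error on observed entries only
        G, D = G + lr * (E @ D - lam * G), D + lr * (E.T @ G - lam * D)
    return G, D

def observed_mse(R, mask, pred):
    return float((mask * (R - pred) ** 2).sum() / mask.sum())

# Toy gene-disease matrix with one hidden entry, (gene 0, disease 1).
R = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
mask = np.ones_like(R)
mask[0, 1] = 0

init_err = observed_mse(R, mask, np.zeros_like(R))
G, D = factorize(R, mask)
final_err = observed_mse(R, mask, G @ D.T)
```

    After fitting, `(G @ D.T)[0, 1]` is the model's score for the hidden gene-disease pair, which is how such a model would rank candidate associations that have no recorded relationship.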

  4. Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool.

    Science.gov (United States)

    Zaldivar, Andrew; Krichmar, Jeffrey L

    2014-01-01

    The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.

  5. Allen Brain Atlas-Driven Visualizations: A Web-Based Gene Expression Energy Visualization Tool

    Directory of Open Access Journals (Sweden)

    Andrew Zaldivar

    2014-05-01

    Full Text Available The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.

  6. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  7. SoFoCles: feature filtering for microarray classification based on gene ontology.

    Science.gov (United States)

    Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

    2010-02-01

    Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.

  8. A meta-analysis based method for prioritizing candidate genes involved in a pre-specific function

    Directory of Open Access Journals (Sweden)

    Jingjing Zhai

    2016-12-01

    Full Text Available The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is that of unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Towards this goal, we developed a novel gene prioritization method named RafSee, to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the rank of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The ‘leave-one-out’ cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.
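
    To make the idea of order statistics-based rank aggregation concrete, here is a minimal sketch for two rankings. It uses the joint order-statistic probability for two uniform rank ratios, Q = 2*r_lo*r_hi - r_lo^2, as the combined score. Whether RAP uses exactly this statistic is an assumption; the gene names, scores and function names are hypothetical, and the R package named in the abstract is not used here.

```python
def rank_ratios(scores):
    """Map each gene to its rank ratio in (0, 1]; the highest score gets 1/n."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}

def q_statistic(r_a, r_b):
    """P(U(1) <= r_lo and U(2) <= r_hi) for two independent uniform rank ratios."""
    lo, hi = sorted((r_a, r_b))
    return 2 * lo * hi - lo * lo

def aggregate(scores_a, scores_b):
    """Return genes ordered by combined Q statistic (smaller = better)."""
    ra, rb = rank_ratios(scores_a), rank_ratios(scores_b)
    q = {gene: q_statistic(ra[gene], rb[gene]) for gene in ra}
    return sorted(q, key=q.get)

# Hypothetical scores from a network-based method and a feature-based method.
network_scores = {"g1": 0.9, "g2": 0.5, "g3": 0.1, "g4": 0.3}
feature_scores = {"g1": 0.2, "g2": 0.8, "g3": 0.1, "g4": 0.6}
combined = aggregate(network_scores, feature_scores)
```

    A gene ranked near the top by either method gets a small Q value, so the aggregated list rewards agreement without discarding a gene that only one method ranks highly.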

  9. fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

    Science.gov (United States)

    Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

    2015-01-07

    Network studies of genes and proteins offer a functional basis for understanding the complexity of a gene or protein and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there is a set of eight candidate genes (acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim) that are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea of the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene-specific processes and functions as well as their compartmentalization in tissues. fabp4 and its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4, and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight into the interactions and functions associated with the fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses

    OpenAIRE

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-01-01

    Background Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesi...

  11. DGGE based whole-gene mutation scanning of the dystrophin gene in Duchenne and Becker muscular dystrophy patients

    NARCIS (Netherlands)

    Hofstra, RMW; Mulder, IM; Vossen, R; de Koning-Gans, PAM; Kraak, M; Ginjaar, IB; van der Hout, AH; Bakker, E; Buys, CHCM; van Essen, AJ; den Dunnen, JT

    2004-01-01

    Duchenne and Becker muscular dystrophy (DMD and BMD) are caused by mutations in the dystrophin gene. Large rearrangements in the gene are found in about two-thirds of DMD patients, with ~60% carrying deletions and 5-10% carrying duplications. Most of the remaining 30-35% of patients are

  12. DNA base sequence changes induced by ultraviolet light mutagenesis of a gene on a chromosome in Chinese hamster ovary cells

    Energy Technology Data Exchange (ETDEWEB)

    Romac, S; Leong, P; Sockett, H; Hutchinson, F [Yale Univ., New Haven, CT (USA). Dept. of Molecular Biophysics and Biochemistry

    1989-09-20

    The DNA base sequence changes induced by mutagenesis with ultraviolet light have been determined in a gene on a chromosome of cultured Chinese hamster ovary (CHO) cells. The gene was the Escherichia coli gpt gene, of which a single copy was stably incorporated and expressed in the CHO cell genome. The cells were irradiated with ultraviolet light and gpt⁻ colonies were selected by resistance to 6-thioguanine. The gpt gene was amplified from chromosomal DNA by use of the polymerase chain reaction (PCR) and the amplified DNA sequenced directly by the dideoxy method. Of the 58 sequenced mutants of independent origin, 53 were base change mutations. Forty-one base substitutions were single base changes, ten had two adjacent (or tandem) base changes, and one had two base changes separated by a single base-pair. Only one mutant had a multiple base change mutation with two or more well separated base changes. In contrast, much higher levels of such mutations were reported in ultraviolet mutagenesis of genes on a shuttle vector in primate cells. Two deletions of a single base-pair were observed and three deletions ranging from 6 to 37 base-pairs. The mutation spectrum in the gpt gene had similarities to the ultraviolet mutation spectra for several genes in prokaryotes, which suggests similarities in mutational mechanisms in prokaryotes and eukaryotes. (author).

  13. Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups.

    Science.gov (United States)

    Shiao, S Pamela K; Grayson, James; Yu, Chong Ho; Wasek, Brandi; Bottiglieri, Teodoro

    2018-02-16

    For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene-environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls ( p relation to gene-environment interactions in the prevention of CRC.

  14. Tumor Suppressor Gene-Based Nanotherapy: From Test Tube to the Clinic

    Directory of Open Access Journals (Sweden)

    Manish Shanker

    2011-01-01

    Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.

  15. GENECODIS-Grid: An online grid-based tool to predict functional information in gene lists

    International Nuclear Information System (INIS)

    Nogales, R.; Mejia, E.; Vicente, C.; Montes, E.; Delgado, A.; Perez Griffo, F. J.; Tirado, F.; Pascual-Montano, A.

    2007-01-01

    In this work we introduce GeneCodis-Grid, a grid-based alternative to a bioinformatics tool named GeneCodis that integrates different sources of biological information to search for biological features (annotations) that frequently co-occur in a set of genes and ranks them by statistical significance. GeneCodis-Grid is a web-based application that takes advantage of two independent grid networks and a computer cluster managed by a meta-scheduler, together with a web server that hosts the application. The mining of concurrent biological annotations provides significant information for the functional analysis of gene lists obtained by high-throughput experiments in biology. Due to the popularity of this tool, which has registered more than 13000 visits since its publication in January 2007, there is a strong need to let users from different sites access the system simultaneously. In addition, the complexity of some of the statistical tests used in this approach makes this technique a good candidate for implementation in an opportunistic Grid environment. (Author)

  16. Tumor suppressor gene-based nanotherapy: from test tube to the clinic.

    Science.gov (United States)

    Shanker, Manish; Jin, Jiankang; Branch, Cynthia D; Miyamoto, Shinya; Grimm, Elizabeth A; Roth, Jack A; Ramesh, Rajagopal

    2011-01-01

    Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.

  17. Gene expression-based molecular diagnostic system for malignant gliomas is superior to histological diagnosis.

    Science.gov (United States)

    Shirahata, Mitsuaki; Iwao-Koizumi, Kyoko; Saito, Sakae; Ueno, Noriko; Oda, Masashi; Hashimoto, Nobuo; Takahashi, Jun A; Kato, Kikuya

    2007-12-15

    Current morphology-based glioma classification methods do not adequately reflect the complex biology of gliomas, thus limiting their prognostic ability. In this study, we focused on anaplastic oligodendroglioma and glioblastoma, which typically follow distinct clinical courses. Our goal was to construct a clinically useful molecular diagnostic system based on gene expression profiling. The expression of 3,456 genes in 32 patients, 12 and 20 of whom had prognostically distinct anaplastic oligodendroglioma and glioblastoma, respectively, was measured by PCR array. Next to unsupervised methods, we did supervised analysis using a weighted voting algorithm to construct a diagnostic system discriminating anaplastic oligodendroglioma from glioblastoma. The diagnostic accuracy of this system was evaluated by leave-one-out cross-validation. The clinical utility was tested on a microarray-based data set of 50 malignant gliomas from a previous study. Unsupervised analysis showed divergent global gene expression patterns between the two tumor classes. A supervised binary classification model showed 100% (95% confidence interval, 89.4-100%) diagnostic accuracy by leave-one-out cross-validation using 168 diagnostic genes. Applied to a gene expression data set from a previous study, our model correlated better with outcome than histologic diagnosis, and also displayed 96.6% (28 of 29) consistency with the molecular classification scheme used for these histologically controversial gliomas in the original article. Furthermore, we observed that histologically diagnosed glioblastoma samples that shared anaplastic oligodendroglioma molecular characteristics tended to be associated with longer survival. Our molecular diagnostic system showed reproducible clinical utility and prognostic ability superior to traditional histopathologic diagnosis for malignant glioma.
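
    The record above names a weighted voting classifier evaluated by leave-one-out cross-validation but gives no formulas. The sketch below is one common form of weighted voting (signal-to-noise gene weights, votes measured from the class midpoint); whether the authors used exactly this variant is an assumption. The function names and the 0/1 class labels are hypothetical, and the selection of diagnostic genes is omitted.

```python
import numpy as np

def weighted_vote_predict(X_train, y_train, x_new):
    """Predict class 0 or 1 for x_new by signal-to-noise weighted voting.

    Assumed variant: each gene votes w * (x - midpoint), where w is the
    difference of class means divided by the sum of class standard deviations.
    """
    a, b = X_train[y_train == 0], X_train[y_train == 1]
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    # Small constant guards against genes with zero variance in both classes.
    w = (mu_a - mu_b) / (a.std(axis=0) + b.std(axis=0) + 1e-12)
    votes = w * (x_new - (mu_a + mu_b) / 2.0)
    # A positive total means the sample sits on the class-0 side of the midpoints.
    return 0 if votes.sum() > 0 else 1

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        hits += int(weighted_vote_predict(X[keep], y[keep], X[i]) == y[i])
    return hits / len(y)
```

    In a full pipeline any gene selection step would have to be repeated inside each held-out fold, otherwise the accuracy estimate is optimistic.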

  18. Naturally occurring mutations in the human 5-lipoxygenase gene promoter that modify transcription factor binding and reporter gene transcription.

    OpenAIRE

    In, K H; Asano, K; Beier, D; Grobholz, J; Finn, P W; Silverman, E K; Silverman, E S; Collins, T; Fischer, A R; Keith, T P; Serino, K; Kim, S W; De Sanctis, G T; Yandava, C; Pillari, A

    1997-01-01

    Five lipoxygenase (5-LO) is the first committed enzyme in the metabolic pathway leading to the synthesis of the leukotrienes. We examined genomic DNA isolated from 25 normal subjects and 31 patients with asthma (6 of whom had aspirin-sensitive asthma) for mutations in the known transcription factor binding regions and the protein encoding region of the 5-LO gene. A family of mutations in the G + C-rich transcription factor binding region was identified consisting of the deletion of one, delet...

  19. Frequency-based time-series gene expression recomposition using PRIISM

    Directory of Open Access Journals (Sweden)

    Rosa Bruce A

    2012-06-01

    Background: Circadian rhythm pathways influence the expression patterns of as much as 31% of the Arabidopsis genome through complicated interaction pathways, and have been found to be significantly disrupted by biotic and abiotic stress treatments, complicating treatment-response gene discovery methods due to clock pattern mismatches in the fold change-based statistics. The PRIISM (Pattern Recomposition for the Isolation of Independent Signals in Microarray data) algorithm outlined in this paper is designed to separate pattern changes induced by different forces, including treatment-response pathways and circadian clock rhythm disruptions. Results: Using the Fourier transform, high-resolution time-series microarray data is projected to the frequency domain. By identifying the clock frequency range from the core circadian clock genes, we separate the frequency spectrum into different sections containing treatment-frequency (representing up- or down-regulation by an adaptive treatment response), clock-frequency (representing the circadian clock-disruption response) and noise-frequency components. Then, we project the components’ spectra back to the expression domain to reconstruct isolated, independent gene expression patterns representing the effects of the different influences. By applying PRIISM on a high-resolution time-series Arabidopsis microarray dataset under a cold treatment, we systematically evaluated our method using maximum fold change and principal component analyses. The results of this study showed that the ranked treatment-frequency fold change results produce fewer false positives than the original methodology, and the 26-hour timepoint in our dataset was the best statistic for distinguishing the most known cold-response genes. In addition, six novel cold-response genes were discovered. PRIISM also provides gene expression data which represents only circadian clock influences, and may be useful for circadian clock studies
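
    The frequency-domain separation described in this record can be illustrated with a minimal sketch. It is not the PRIISM implementation: the function name, the 20-28 h clock band and the 8 h noise cutoff are assumptions chosen for illustration, and the series is assumed to be evenly sampled.

```python
import numpy as np

def split_by_frequency(expr, sample_hours, clock_band=(1 / 28, 1 / 20), noise_cutoff=1 / 8):
    """Split one gene's evenly sampled time series into three additive parts.

    Frequencies are in cycles per hour. Bins inside clock_band go to the
    "clock" part, bins above noise_cutoff go to "noise", and everything else
    (including the mean) goes to "treatment". All band limits are assumed values.
    """
    expr = np.asarray(expr, dtype=float)
    n = expr.size
    spectrum = np.fft.rfft(expr)                 # project to the frequency domain
    freqs = np.fft.rfftfreq(n, d=sample_hours)

    clock_mask = (freqs >= clock_band[0]) & (freqs <= clock_band[1])
    noise_mask = freqs > noise_cutoff
    treatment_mask = ~(clock_mask | noise_mask)

    parts = {}
    for name, mask in (("treatment", treatment_mask),
                       ("clock", clock_mask),
                       ("noise", noise_mask)):
        # Zero every bin outside the band, then project back to the expression domain.
        parts[name] = np.fft.irfft(np.where(mask, spectrum, 0), n=n)
    return parts

# Hourly samples over two days: a constant level plus a 24 h rhythm.
t = np.arange(48.0)
signal = 5.0 + 2.0 * np.sin(2 * np.pi * t / 24.0)
parts = split_by_frequency(signal, sample_hours=1.0)
```

    Because the three masks partition the spectrum, the three parts add back up to the input series, so no expression signal is discarded by the split.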

  20. Isolation and characterisation of cDNA clones representing the genes encoding the major tuber storage protein (dioscorin) of yam (Dioscorea cayenensis Lam.).

    Science.gov (United States)

    Conlan, R S; Griffiths, L A; Napier, J A; Shewry, P R; Mantell, S; Ainsworth, C

    1995-06-01

    cDNA clones encoding dioscorins, the major tuber storage proteins (M(r) 32,000) of yam (Dioscorea cayenensis) have been isolated. Two classes of clone (A and B, based on hybrid release translation product sizes and nucleotide sequence differences) which are 84.1% similar in their protein coding regions, were identified. The protein encoded by the open reading frame of the class A cDNA insert is of M(r) 30,015. The difference in observed and calculated molecular mass might be attributed to glycosylation. Nucleotide sequencing and in vitro transcription/translation suggest that the class A dioscorin proteins are synthesised with signal peptides of 18 amino acid residues which are cleaved from the mature peptide. The class A and class B proteins are 69.6% similar with respect to each other, but show no sequence identity with other plant proteins or with the major tuber storage proteins of potato (patatin) or sweet potato (sporamin). Storage protein gene expression was restricted to developing tubers and was not induced by growth conditions known to induce expression of tuber storage protein genes in other plant species. The codon usage of the dioscorin genes suggests that the Dioscoreaceae are more closely related to dicotyledonous than to monocotyledonous plants.

  1. Alteration of gene conversion patterns in Sordaria fimicola by supplementation with DNA bases.

    Science.gov (United States)

    Kitani, Y; Olive, L S

    1970-08-01

    Supplementation with DNA bases in crosses of Sordaria fimicola heterozygous for spore color markers (g(1), h(2)) within the gray-spore (g) locus has been found to cause significant alterations in patterns of gene conversion at the two mutant sites. Each base had its own characteristic effect in altering the conversion pattern, and responses of the two mutant sites to the four bases were different in several ways. Also, the responses of the two involved chromatids of the meiotic bivalent were different.

  2. Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

    Science.gov (United States)

    Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.

    2012-01-01

    Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associated with the utility of MSC-based therapy such as biosafety, immunoprivilege, transfection methods, and distribution in the host. PMID:22530882

  3. Integrative characterization of germ cell-specific genes from mouse spermatocyte UniGene library

    Directory of Open Access Journals (Sweden)

    Eddy Edward M

    2007-07-01

    Background: The primary regulator of spermatogenesis, a highly ordered and tightly regulated developmental process, is an intrinsic genetic program involving male germ cell-specific genes. Results: We analyzed the mouse spermatocyte UniGene library containing 2155 gene-oriented transcript clusters. We predict that 11% of these genes are testis-specific and systematically identified 24 authentic genes specifically and abundantly expressed in the testis via in silico and in vitro approaches. Northern blot analysis disclosed various transcript characteristics, such as expression level, size and the presence of isoforms. Expression analysis revealed developmentally regulated and stage-specific expression patterns in all of the genes. We further analyzed the genes at the protein and cellular levels. Transfection assays performed using GC-2 cells provided information on the cellular characteristics of the gene products. In addition, antibodies were generated against proteins encoded by some of the genes to facilitate their identification and characterization in spermatogenic cells and sperm. Our data suggest that a number of the gene products are implicated in transcriptional regulation, nuclear integrity, sperm structure and motility, and fertilization. In particular, we found for the first time that Mm.333010, predicted to contain a trypsin-like serine protease domain, is a sperm acrosomal protein. Conclusion: We identify 24 authentic genes with spermatogenic cell-specific expression, and provide comprehensive information about the genes. Our findings establish a new basis for future investigation into molecular mechanisms underlying male reproduction.

  4. Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

    Science.gov (United States)

    Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

    2018-06-01

    Circadian genes express periodically in an approximate 24-h period and the identification and study of these genes can provide deep understanding of the circadian control which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Comparing with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological
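
    The step of turning time-course expression into categorical-state data that denotes the changing trend could look like the sketch below. The abstract does not give the encoding, so the three states (+1 up, 0 flat, -1 down), the tolerance `tol` and the function name are all assumptions.

```python
import numpy as np

def to_trend_states(expr, tol=0.1):
    """Encode each consecutive change as +1 (up), -1 (down) or 0 (flat).

    tol is an assumed threshold below which a change counts as flat.
    """
    diffs = np.diff(np.asarray(expr, dtype=float))
    states = np.zeros(diffs.shape, dtype=int)
    states[diffs > tol] = 1
    states[diffs < -tol] = -1
    return states

# A profile that rises, plateaus, then falls.
example_states = to_trend_states([1.0, 2.0, 2.05, 1.0])
```

    A series of T time points yields T-1 states, which can then be clustered or fed to a classifier in place of the raw expression values.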

  5. Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

    Science.gov (United States)

    Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

    2014-12-01

    Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).

  6. Monoterpenoid-based preparations in beehives affect learning, memory, and gene expression in the bee brain.

    Science.gov (United States)

    Bonnafé, Elsa; Alayrangues, Julie; Hotier, Lucie; Massou, Isabelle; Renom, Allan; Souesme, Guillaume; Marty, Pierre; Allaoua, Marion; Treilhou, Michel; Armengaud, Catherine

    2017-02-01

    Bees are exposed in their environment to contaminants that can weaken the colony and contribute to bee declines. Monoterpenoid-based preparations can be introduced into hives to control the parasitic mite Varroa destructor. The long-term effects of monoterpenoids are poorly investigated. Olfactory conditioning of the proboscis extension reflex (PER) has been used to evaluate the impact of stressors on cognitive functions of the honeybee such as learning and memory. The authors tested the PER to odorants on bees after exposure to monoterpenoids in hives. Octopamine receptors, transient receptor potential-like (TRPL), and γ-aminobutyric acid channels are thought to play a critical role in the memory of food experience. Gene expression levels of Amoa1, Rdl, and trpl were evaluated in parallel in the bee brain because these genes code for the cellular targets of monoterpenoids and some pesticides and neural circuits of memory require their expression. The miticide impaired the PER to odors in the 3 wk following treatment. Short-term and long-term olfactory memories were improved months after introduction of the monoterpenoids into the beehives. Chronic exposure to the miticide had significant effects on Amoa1, Rdl, and trpl gene expressions and modified seasonal changes in the expression of these genes in the brain. The decrease of expression of these genes in winter could partly explain the improvement of memory. The present study has led to new insights into alternative treatments, especially on their effects on memory and expression of selected genes involved in this cognitive function. Environ Toxicol Chem 2017;36:337-345. © 2016 SETAC.

  7. Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA.

    Science.gov (United States)

    Yin, Li; Cai, Zhihui; Zhu, Baoan; Xu, Cunshuan

    2018-02-14

    Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA), which has proved to be an effective method in many diseases, to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules were selected for canonical pathway enrichment and upstream regulator analysis using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the degree of deterioration of HCC. The orange and yellow modules were enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high-degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel signature of five high-degree hub genes identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.

  8. Taxonomic resolutions based on 18S rRNA genes: a case study of subclass copepoda.

    Directory of Open Access Journals (Sweden)

    Shu Wu

    Biodiversity studies are commonly conducted using 18S rRNA genes. In this study, we compared the inter-species divergence of variable regions (V1-9) within the copepod 18S rRNA gene, and tested their taxonomic resolutions at different taxonomic levels. Our results indicate that the 18S rRNA gene is a good molecular marker for the study of copepod biodiversity, and our conclusions are as follows: 1) 18S rRNA genes are highly conserved intra-species (intra-species similarities are close to 100%) and could aid in species-level analyses, but with some limitations; 2) nearly-whole-length sequences and some partial regions (around V2, V4, and V9) of the 18S rRNA gene can be used to discriminate between samples at both the family and order levels (with a success rate of about 80%); 3) compared with other regions, V9 has a higher resolution at the genus level (with an identification success rate of about 80%); and 4) V7 is most divergent in length, and would be a good candidate marker for the phylogenetic study of Acartia species. This study also evaluated the correlation between similarity thresholds and the accuracy of using nuclear 18S rRNA genes for the classification of organisms in the subclass Copepoda. We suggest that sample identification accuracy should be considered when a molecular sequence divergence threshold is used for taxonomic identification, and that the lowest similarity threshold should be determined based on a pre-designated level of acceptable accuracy.
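
    As an illustration of the similarity-threshold idea in this record, the sketch below computes percent identity between two pre-aligned sequences and applies a cutoff. The function names, the treatment of gaps and the 97% cutoff are assumptions, not values from the study.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences, ignoring gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def same_taxon(seq_a, seq_b, threshold=97.0):
    """Hypothetical rule: call two samples the same taxon at or above threshold."""
    return percent_identity(seq_a, seq_b) >= threshold
```

    Following the abstract's recommendation, the threshold would be tuned per taxonomic level against a reference set until identification accuracy reaches the level considered acceptable.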

  9. Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

    International Nuclear Information System (INIS)

    Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

    2010-01-01

    Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring

  10. DHPLC-based mutation analysis of ENG and ALK-1 genes in HHT Italian population.

    Science.gov (United States)

    Lenato, Gennaro M; Lastella, Patrizia; Di Giacomo, Marilena C; Resta, Nicoletta; Suppressa, Patrizia; Pasculli, Giovanna; Sabbà, Carlo; Guanti, Ginevra

    2006-02-01

    Hereditary haemorrhagic telangiectasia (HHT or Rendu-Osler-Weber syndrome) is an autosomal dominant disorder characterized by localized angiodysplasia due to mutations in endoglin, ALK-1 gene, and a still unidentified locus. The lack of highly recurrent mutations, locus heterogeneity, and the presence of mutations in almost all coding exons of the two genes make the screening for mutations time-consuming and costly. In the present study, we developed a DHPLC-based protocol for mutation detection in ALK1 and ENG genes through retrospective analysis of known sequence variants, 20 causative mutations and 11 polymorphisms, and a prospective analysis on 47 probands with unknown mutation. Overall DHPLC analysis identified the causative mutation in 61 out of 66 DNA samples (92.4%). We found 31 different mutations in the ALK1 gene, of which 15 are novel, and 20, of which 12 are novel, in the ENG gene, thus providing for the first time the mutational spectrum in a cohort of Italian HHT patients. In addition, we characterized the splicing pattern of ALK1 gene in lymphoblastoid cells, both in normal controls and in two individuals carrying a mutation in the non-invariant -3 position of the acceptor splice site upstream of exon 6 (c.626-3C>G). Functional assay demonstrated the existence, also in normal individuals, of a small proportion of ALK1 alternative splicing, due to exon 5 skipping, and the presence of further aberrant splicing isoforms in the individuals carrying the c.626-3C>G mutation. 2006 Wiley-Liss, Inc.

  11. Pmr, a histone-like protein H1 (H-NS) family protein encoded by the IncP-7 plasmid pCAR1, is a key global regulator that alters host function.

    Science.gov (United States)

    Yun, Choong-Soo; Suzuki, Chiho; Naito, Kunihiko; Takeda, Toshiharu; Takahashi, Yurika; Sai, Fumiya; Terabayashi, Tsuguno; Miyakoshi, Masatoshi; Shintani, Masaki; Nishida, Hiromi; Yamane, Hisakazu; Nojiri, Hideaki

    2010-09-01

    Histone-like protein H1 (H-NS) family proteins are nucleoid-associated proteins (NAPs) conserved among many bacterial species. The IncP-7 plasmid pCAR1 is transmissible among various Pseudomonas strains and carries a gene encoding the H-NS family protein, Pmr. Pseudomonas putida KT2440 is a host of pCAR1, which harbors five genes encoding the H-NS family proteins PP_1366 (TurA), PP_3765 (TurB), PP_0017 (TurC), PP_3693 (TurD), and PP_2947 (TurE). Quantitative reverse transcription-PCR (qRT-PCR) demonstrated that the presence of pCAR1 does not affect the transcription of these five genes and that only pmr, turA, and turB were primarily transcribed in KT2440(pCAR1). In vitro pull-down assays revealed that Pmr strongly interacted with itself and with TurA, TurB, and TurE. Transcriptome comparisons of the pmr disruptant, KT2440, and KT2440(pCAR1) strains indicated that pmr disruption had greater effects on the host transcriptome than did pCAR1 carriage. The transcriptional levels of some genes that increased with pCAR1 carriage, such as the mexEF-oprN efflux pump genes and parI, reverted with pmr disruption to levels in pCAR1-free KT2440. Transcriptional levels of putative horizontally acquired host genes were not altered by pCAR1 carriage but were altered by pmr disruption. Identification of genome-wide Pmr binding sites by ChAP-chip (chromatin affinity purification coupled with high-density tiling chip) analysis demonstrated that Pmr preferentially binds to horizontally acquired DNA regions. The Pmr binding sites overlapped well with the location of the genes differentially transcribed following pmr disruption on both the plasmid and the chromosome. Our findings indicate that Pmr is a key factor in optimizing gene transcription on pCAR1 and the host chromosome.

  12. Disruption and analysis of the clpB, clpC, and clpE genes in Lactococcus lactis: ClpE, a new Clp family in gram-positive bacteria

    DEFF Research Database (Denmark)

    Ingmer, Hanne; Vogensen, Finn K.; Hammer, Karin

    1999-01-01

    In the genome of the gram-positive bacterium Lactococcus lactis MG1363, we have identified three genes (clpC, clpE, and clpB) which encode Clp proteins containing two conserved ATP binding domains. The proteins encoded by two of the genes belong to the previously described ClpB and ClpC families....... The clpE gene, however, encodes a member of a new Clp protein family that is characterized by a short N-terminal domain including a putative zinc binding domain (-CX2CX22CX2C-). Expression of the 83-kDa ClpE protein as well as of the two proteins encoded by clpB was strongly induced by heat shock and...... was shown to participate in the degradation of randomly folded proteins in L. lactis, could be necessary for degrading proteins generated by certain types of stress....

  13. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

    Science.gov (United States)

    Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

    2016-09-02

    Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
    These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal
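
    The partial-correlation analysis described in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names `pearson` and `partial_corr` are hypothetical, and the recursive textbook formula is assumed here as one standard way to control for several variables (for example evolutionary rate and alignment length) at once.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, controls):
    """Correlation of x and y after controlling for every variable in `controls`.

    Uses the recursive formula
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    removing one control variable per recursion level.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

    Passing two control vectors, such as `partial_corr(signal, gene_property, [rate, length])`, mirrors the "simultaneously controlled" setting; the variable names in that call are placeholders.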

  14. Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups

    Directory of Open Access Journals (Sweden)

    S. Pamela K. Shiao

    2018-02-01

    For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene–environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls (p < 0.05) on MTHFR C677T, MTR A2756G, MTRR A66G, and DHFR 19 bp except MTHFR A1298C. Four racial groups presented different polymorphism rates for four genes (all p < 0.05) except MTHFR A1298C. Following the ensemble method, the most influential factors were identified, and the best predictive models were generated by using the generalized regression models, with Akaike’s information criterion and leave-one-out cross validation methods. Body mass index (BMI) and gender were consistent predictors of CRC for both models when individual genes versus total polymorphism counts were used, and alcohol use was interactive with BMI status. Body mass index status was also interactive with both gender and MTHFR C677T gene polymorphism, and the exposure to environmental pollutants was an additional predictor. These results point to the important roles of environmental and modifiable factors in relation to gene–environment interactions in the prevention of CRC.

  15. Consequences of population topology for studying gene flow using link-based landscape genetic methods.

    Science.gov (United States)

    van Strien, Maarten J

    2017-07-01

    Many landscape genetic studies aim to determine the effect of landscape on gene flow between populations. These studies frequently employ link-based methods that relate pairwise measures of historical gene flow to measures of the landscape and the geographical distance between populations. However, apart from landscape and distance, there is a third important factor that can influence historical gene flow, that is, population topology (i.e., the arrangement of populations throughout a landscape). As the population topology is determined in part by the landscape configuration, I argue that it should play a more prominent role in landscape genetics. Making use of existing literature and theoretical examples, I discuss how population topology can influence results in landscape genetic studies and how it can be taken into account to improve the accuracy of these results. In support of my arguments, I have performed a literature review of landscape genetic studies published during the first half of 2015 as well as several computer simulations of gene flow between populations. First, I argue why one should carefully consider which population pairs should be included in link-based analyses. Second, I discuss several ways in which the population topology can be incorporated in response and explanatory variables. Third, I outline why it is important to sample populations in such a way that a good representation of the population topology is obtained. Fourth, I discuss how statistical testing for link-based approaches could be influenced by the population topology. I conclude the article with six recommendations geared toward better incorporating population topology in link-based landscape genetic studies.

  16. PR Interval Associated Genes, Atrial Remodeling and Rhythm Outcome of Catheter Ablation of Atrial Fibrillation—A Gene-Based Analysis of GWAS Data

    Directory of Open Access Journals (Sweden)

    Daniela Husser

    2017-12-01

    Background: PR interval prolongation has recently been shown to associate with advanced left atrial remodeling and atrial fibrillation (AF) recurrence after catheter ablation. While different genome-wide association studies (GWAS) have implicated 13 loci to associate with the PR interval as an AF endophenotype, their subsequent associations with AF remodeling and response to catheter ablation are unknown. Here, we perform a gene-based analysis of GWAS data to test the hypothesis that PR interval candidate genes also associate with left atrial remodeling and arrhythmia recurrence following AF catheter ablation. Methods and Results: Samples from 660 patients with paroxysmal (n = 370) or persistent AF (n = 290) undergoing AF catheter ablation were genotyped for ~1,000,000 SNPs. Gene-based association was investigated using VEGAS (versatile gene-based association study). Among the 13 candidate genes, SLC8A1, MEIS1, ITGA9, SCN5A, and SOX5 associated with the PR interval. Of those, ITGA9 and SOX5 were significantly associated with left atrial low voltage areas and left atrial diameter and subsequently with AF recurrence after radiofrequency catheter ablation. Conclusion: This study suggests contributions of ITGA9 and SOX5 to AF remodeling expressed as PR interval prolongation, low voltage areas and left atrial dilatation and subsequently to response to catheter ablation. Future and larger studies are necessary to replicate and apply these findings with the aim of designing AF pathophysiology-based multi-locus risk scores.

  17. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

    Science.gov (United States)

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-10-06

    Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, namely to construct the search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.

  18. PGMA-Based Cationic Nanoparticles with Polyhydric Iodine Units for Advanced Gene Vectors.

    Science.gov (United States)

    Sun, Yue; Hu, Hao; Yu, Bingran; Xu, Fu-Jian

    2016-11-16

    It is crucial for successful gene delivery to develop safe, effective, and multifunctional polycations. Iodine-based small molecules are widely used as contrast agents for CT imaging. Herein, a series of star-like poly(glycidyl methacrylate) (PGMA)-based cationic vectors (II-PGEA/II) with abundant flanking polyhydric iodine units are prepared for multifunctional gene delivery systems. The proposed II-PGEA/II star vector is composed of one iohexol intermediate (II) core and five ethanolamine (EA) and II-difunctionalized PGMA arms. The amphipathic II-PGEA/II vectors readily self-assemble into well-defined cationic nanoparticles, where massive hydroxyl groups can establish a hydration shell to stabilize the nanoparticles. The II introduction improves cell viabilities of polycations. Moreover, by controlling the suitable amount of introduced II units, the resultant II-PGEA/II nanoparticles can produce fairly good transfection performances in different cell lines. Particularly, the II-PGEA/II nanoparticles induce much better in vitro CT imaging abilities in tumor cells than iohexol (one commonly used commercial CT contrast agent). The present design of amphipathic PGMA-based nanoparticles with CT contrast agents would provide useful information for the development of new multifunctional gene delivery systems.

  19. A postprocessing method in the HMC framework for predicting gene function based on biological instrumental data

    Science.gov (United States)

    Feng, Shou; Fu, Ping; Zheng, Wenbin

    2018-03-01

    Predicting gene function based on biological instrumental data is a complicated and challenging hierarchical multi-label classification (HMC) problem. When using local approach methods to solve this problem, a preliminary results processing method is usually needed. This paper proposed a novel preliminary results processing method called the nodes interaction method. The nodes interaction method revises the preliminary results and guarantees that the predictions are consistent with the hierarchy constraint. This method exploits the label dependency and considers the hierarchical interaction between nodes when making decisions based on the Bayesian network in its first phase. In the second phase, this method further adjusts the results according to the hierarchy constraint. Implementing the nodes interaction method in the HMC framework also enhances the HMC performance for solving the gene function prediction problem based on the Gene Ontology (GO), the hierarchy of which is a directed acyclic graph that is more difficult to tackle. The experimental results validate the promising performance of the proposed method compared to state-of-the-art methods on eight benchmark yeast data sets annotated by the GO.
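
    The hierarchy constraint mentioned in this abstract (a gene should not be predicted for a GO term more confidently than for that term's ancestors) can be illustrated as follows. This is a generic min-over-ancestors correction, not the paper's nodes interaction method; the function name `enforce_hierarchy` and the toy DAG are hypothetical.

```python
def enforce_hierarchy(scores, parents):
    """Cap each node's score by the corrected scores of all its parents.

    scores  : dict mapping node -> preliminary prediction score in [0, 1]
    parents : dict mapping node -> list of parent nodes (a DAG, so a node
              may have several parents, as in the Gene Ontology)
    """
    corrected = {}

    def resolve(node):
        # Memoised depth-first walk towards the roots of the DAG.
        if node not in corrected:
            value = scores[node]
            for parent in parents.get(node, ()):
                value = min(value, resolve(parent))
            corrected[node] = value
        return corrected[node]

    for node in scores:
        resolve(node)
    return corrected
```

    After this step, thresholding the corrected scores at any cutoff yields a label set that is closed under the ancestor relation.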

  20. A cell-based in vitro alternative to identify skin sensitizers by gene expression

    International Nuclear Information System (INIS)

    Hooyberghs, Jef; Schoeters, Elke; Lambrechts, Nathalie; Nelissen, Inge; Witters, Hilda; Schoeters, Greet; Heuvel, Rosette van den

    2008-01-01

    The ethical and economic burden associated with animal testing for assessment of skin sensitization has triggered intensive research effort towards development and validation of alternative methods. In addition, new legislation on the registration and use of cosmetics and chemicals promotes the use of suitable alternatives for hazard assessment. Our previous studies demonstrated that human CD34+ progenitor-derived dendritic cells from cord blood express specific gene profiles upon exposure to low molecular weight sensitizing chemicals. This paper presents a classification model based on this cell type which is successful in discriminating sensitizing chemicals from non-sensitizing chemicals based on transcriptome analysis of 13 genes. Expression profiles of a set of 10 sensitizers and 11 non-sensitizers were analyzed by RT-PCR using 9 different exposure conditions and a total of 73 donor samples. Based on these data a predictive dichotomous classifier for skin sensitizers has been constructed, which is referred to as . In a first step the dimensionality of the input data was reduced by selectively rejecting a number of exposure conditions and genes. Next, the generalization of a linear classifier was evaluated by a cross-validation which resulted in a prediction performance with a concordance of 89%, a specificity of 97% and a sensitivity of 82%. These results show that the present model may be a useful human in vitro alternative for further use in a test strategy towards the reduction of animal use for skin sensitization
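
    For readers unfamiliar with the three performance figures quoted above, the sketch below shows how concordance, specificity and sensitivity are conventionally derived from a two-class confusion matrix. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Standard performance measures for a dichotomous classifier.

    tp, fn : sensitizers predicted as sensitizer / non-sensitizer
    tn, fp : non-sensitizers predicted as non-sensitizer / sensitizer
    """
    return {
        # Fraction of true sensitizers that are detected.
        "sensitivity": tp / (tp + fn),
        # Fraction of true non-sensitizers that are correctly cleared.
        "specificity": tn / (tn + fp),
        # Overall fraction of correct calls (also called accuracy).
        "concordance": (tp + tn) / (tp + fn + tn + fp),
    }

# Illustrative counts only, not the paper's data.
example = classifier_metrics(tp=8, fn=2, tn=9, fp=1)
```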

  1. Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Pugalendhi Ganesh Kumar

    This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, we introduced two levels of gene selection, filtering and embedding, for selection of potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure is implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA showed the smallest number of 16 most relevant genes associated with cancer using a minimal number of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively

  2. RNA-based, transient modulation of gene expression in human haematopoietic stem and progenitor cells

    Science.gov (United States)

    Diener, Yvonne; Jurk, Marion; Kandil, Britta; Choi, Yeong-Hoon; Wild, Stefan; Bissels, Ute; Bosio, Andreas

    2015-01-01

    Modulation of gene expression is a useful tool to study the biology of haematopoietic stem and progenitor cells (HSPCs) and might also be instrumental to expand these cells for therapeutic approaches. Most of the studies so far have employed stable gene modification by viral vectors that are burdensome when translating protocols into clinical settings. Our study aimed at exploring new ways to transiently modify HSPC gene expression using non-integrating, RNA-based molecules. First, we tested different methods to deliver these molecules into HSPCs. The delivery of siRNAs with chemical transfection methods such as lipofection or cationic polymers did not lead to target knockdown, although we observed more than 90% fluorescent cells using a fluorochrome-coupled siRNA. Confocal microscopic analysis revealed that despite extensive washing, siRNA stuck to or in the cell surface, thereby mimicking a transfection event. In contrast, electroporation resulted in efficient, siRNA-mediated protein knockdown. For transient overexpression of proteins, we used optimised mRNA molecules with modified 5′- and 3′-UTRs. Electroporation of mRNA encoding GFP resulted in fast, efficient and persistent protein expression for at least seven days. Our data provide a broad-ranging comparison of transfection methods for hard-to-transfect cells and offer new opportunities for DNA-free, non-integrating gene modulation in HSPCs. PMID:26599627

  3. A family-based association study of the HTR1B gene in eating disorders

    Directory of Open Access Journals (Sweden)

    Sandra Hernández

    Objective: To explore the association of three polymorphisms of the serotonin receptor 1Dβ gene (HTR1B) in the etiology of eating disorders and their relationship with clinical characteristics. Methods: We analyzed the G861C, A-161T, and A1180G polymorphisms of the HTR1B gene through a family-based association test (FBAT) in 245 nuclear families. The sample was stratified into anorexia nervosa (AN) spectrum and bulimia nervosa (BN) spectrum. In addition, we performed a quantitative FBAT analysis of anxiety severity, depression severity, and Yale-Brown-Cornell Eating Disorders Scale (YBC-EDS) in the AN and BN-spectrum groups. Results: FBAT analysis of the A-161T polymorphism found preferential transmission of allele A-161 in the overall sample. This association was stronger when the sample was stratified by spectrums, showing transmission disequilibrium between the A-161 allele and BN spectrum (z = 2.871, p = 0.004). Quantitative trait analysis showed an association between severity of anxiety symptoms and the C861 allele in AN-spectrum participants (z = 2.871, p = 0.004). We found no associations on analysis of depression severity or preoccupation and ritual scores in AN or BN-spectrum participants. Conclusions: Our preliminary findings suggest a role of the HTR1B gene in susceptibility to development of BN subtypes. Furthermore, this gene might have an impact on the severity of anxiety in AN-spectrum patients.

  4. An Intelligent Method of Product Scheme Design Based on Product Gene

    Directory of Open Access Journals (Sweden)

    Qing Song Ai

    2013-01-01

    Nowadays, in order to have some featured products, many customers tend to buy customized products instead of buying common ones in supermarkets. Manufacturing enterprises, with the purpose of improving their competitiveness, are likewise focusing on providing customized products with high quality and low cost. At present, how to produce customized products rapidly and cheaply is the key challenge for manufacturing enterprises. In this paper, an intelligent modeling approach for supporting the modeling of customized products is proposed, which may improve efficiency during the product design process. Specifically, the product gene (PG) method, which is an analogy of biological evolution in the engineering area, is employed to model products in a new way. Based on the product gene, we focus on the intelligent modeling method to generate product schemes rapidly and automatically. The process of our research includes three steps: (1) develop a product gene model for customized products; (2) find the obtainment and storage method for the product gene; and (3) propose a specific genetic algorithm used for calculating the solution of a customized product and generating new product schemes. Finally, a case study is applied to test the usefulness of our study.

  5. Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

    Science.gov (United States)

    Burt, Andrew J; William, H Manilal; Perry, Gregory; Khanal, Raja; Pauls, K Peter; Kelly, James D; Navabi, Alireza

    2015-01-01

    Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.

  6. Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles.

    Science.gov (United States)

    Yu, Yun; Warnow, Tandy; Nakhleh, Luay

    2011-11-01

    One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is Minimize Deep Coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily rooted. In this article, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. In addition, we devise MDC-based algorithms for cases when multiple alleles per species may be sampled. We study the performance of these methods in coalescent-based computer simulations.

  7. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Background: Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results: Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions: The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.
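
    Mining gene-based microsatellites from assembled sequences, as reported in this abstract, is commonly done by scanning for short tandem repeats. The sketch below is a minimal regular-expression version; the abstract does not state the search criteria, so the motif lengths (2 to 6 nt), the minimum of five repeats and the function name `find_microsatellites` are all assumptions.

```python
import re

def find_microsatellites(seq, min_repeats=5, motif_lengths=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-nt motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "ATAT" or "AA"), so each repeat is reported only once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits
```

    Start positions are 0-based; imperfect or compound repeats are not detected by this simple pattern.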

  8. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

    Science.gov (United States)

    Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

    2006-11-01

    To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
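
    The "10-times-repeated 10-fold cross validation" used to estimate accuracy can be sketched as an index generator. This illustrates only the resampling scheme, not the PAM classifier itself; the function name and the fixed random seed are assumptions.

```python
import random

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) tuples.

    Each repeat reshuffles the samples and splits them into k folds;
    every sample is held out exactly once per repeat.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Fold i takes every k-th shuffled index starting at offset i.
        folds = [order[i::k] for i in range(k)]
        for fold_id, test in enumerate(folds):
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield rep, fold_id, train, test
```

    With `n_samples=77`, as in the first patient set, this yields 100 train/test splits; a cross-validated accuracy would be the fraction of held-out samples classified correctly across those splits.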

  9. Somatic mutations in the transcriptional corepressor gene BCORL1 in adult acute myelogenous leukemia

    OpenAIRE

    Li, Meng; Collins, Roxane; Jiao, Yuchen; Ouillette, Peter; Bixby, Dale; Erba, Harry; Vogelstein, Bert; Kinzler, Kenneth W.; Papadopoulos, Nickolas; Malek, Sami N.

    2011-01-01

    To further our understanding of the genetic basis of acute myelogenous leukemia (AML), we determined the coding exon sequences of ∼ 18 000 protein-encoding genes in 8 patients with secondary AML. Here we report the discovery of novel somatic mutations in the transcriptional corepressor gene BCORL1 that is located on the X-chromosome. Analysis of BCORL1 in an unselected cohort of 173 AML patients identified a total of 10 mutated cases (6%) with BCORL1 mutations, whereas analysis of 19 AML cell...

  10. Expression of a highly basic peroxidase gene in NaCl-adapted tomato cell suspensions.

    Science.gov (United States)

    Medina, M I; Botella, M A; Quesada, M A; Valpuesta, V

    1997-05-05

    A tomato peroxidase gene, TPX2, that is only weakly expressed in the roots of young tomato seedlings is highly expressed in tomato suspension cells adapted to high external NaCl concentration. The protein encoded by this gene, with an isolectric point value of approximately 9.6, is found in the culture medium of the growing cells. Our data suggest that the expression of TPX2 in the salt-adapted cells is not the result of the elicitation imposed by the in vitro culture or the presence of high NaCl concentration in the medium.

  11. The Arabidopsis co-expression tool (act): a WWW-based tool and database for microarray-based gene expression analysis

    DEFF Research Database (Denmark)

    Jen, C. H.; Manfield, I. W.; Michalopoulos, D. W.

    2006-01-01

    We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (act), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression...... be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots......
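
    The first level of analysis named in this abstract, a correlation list of co-expressed genes, can be sketched as ranking all genes by their correlation with a query gene across microarray experiments. Pearson correlation is assumed here, and the function names and toy expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_list(query, expression):
    """Rank every other gene by correlation with the query gene's profile.

    expression : dict mapping gene name -> list of expression values,
                 one value per experiment, in the same order for all genes
    """
    reference = expression[query]
    ranked = [(gene, pearson(reference, profile))
              for gene, profile in expression.items() if gene != query]
    # Most strongly positively correlated genes come first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```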

  12. Inducible, tunable and multiplex human gene regulation using CRISPR-Cpf1-based transcription factors | Office of Cancer Genomics

    Science.gov (United States)

    Targeted and inducible regulation of mammalian gene expression is a broadly important research capability that may also enable development of novel therapeutics for treating human diseases. Here we demonstrate that a catalytically inactive RNA-guided CRISPR-Cpf1 nuclease fused to transcriptional activation domains can up-regulate endogenous human gene expression. We engineered drug-inducible Cpf1-based activators and show how this system can be used to tune the regulation of endogenous gene transcription in human cells.

  13. Identifying novel fruit-related genes in Arabidopsis thaliana based on the random walk with restart algorithm.

    Science.gov (United States)

    Zhang, Yunhua; Dai, Li; Liu, Ying; Zhang, YuHang; Wang, ShaoPeng

    2017-01-01

    Fruit is essential for plant reproduction and is responsible for protection and dispersal of seeds. The development and maturation of fruit is tightly regulated by numerous genetic factors that respond to environmental and internal stimulation. In this study, we attempted to identify novel fruit-related genes in a model organism, Arabidopsis thaliana, using a computational method. Based on validated fruit-related genes, the random walk with restart (RWR) algorithm was applied on a protein-protein interaction (PPI) network using these genes as seeds. The identified genes with high probabilities were filtered by the permutation test and linkage tests. In the permutation test, the genes that were selected due to the structure of the PPI network were discarded. In the linkage tests, the importance of each candidate gene was measured from two aspects: (1) its functional associations with validated genes and (2) its similarity with validated genes on gene ontology (GO) terms and KEGG pathways. Finally, 255 inferred genes were obtained. Subsequent extensive analysis of important genes revealed that they mainly contribute to ubiquitination (UBQ9, UBQ8, UBQ11, UBQ10), serine hydroxymethyl transfer (SHM7, SHM5, SHM6) or glycol-metabolism (HXKL2_ARATH, CSY5, GAPCP1), suggesting essential roles during the development and maturation of fruit in Arabidopsis thaliana.
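
    The random walk with restart step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the five-node network, the node names, the function name random_walk_with_restart, and the restart probability of 0.8 are all assumptions made for the example, and the permutation and linkage tests are not shown.

```python
# Random walk with restart (RWR) on a tiny, made-up interaction network.
# All node names, edges, and parameter values are hypothetical.

def random_walk_with_restart(adjacency, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probability of every node.

    adjacency: dict mapping each node to a list of its neighbours (undirected).
    seeds:     set of nodes the walker jumps back to when it restarts.
    """
    nodes = sorted(adjacency)
    # Restart vector: probability mass spread evenly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker returns to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue  # isolated nodes pass nothing on
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network: two "validated" seed genes and three unlabelled candidates.
adjacency = {
    "seed_a": ["gene_x", "gene_y"],
    "seed_b": ["gene_x"],
    "gene_x": ["seed_a", "seed_b", "gene_z"],
    "gene_y": ["seed_a"],
    "gene_z": ["gene_x"],
}
seeds = {"seed_a", "seed_b"}

scores = random_walk_with_restart(adjacency, seeds)
# Rank the non-seed genes by their steady-state probability.
candidates = sorted((n for n in scores if n not in seeds), key=scores.get, reverse=True)
print(candidates)
```

    In this toy network the candidate that touches both seeds ranks first, which is the intuition behind using RWR scores to propose new genes close to a validated set.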

  14. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  15. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  16. Methodological issues in detecting gene-gene interactions in breast cancer susceptibility: a population-based study in Ontario

    Directory of Open Access Journals (Sweden)

    Onay Venus

    2007-08-01

    BACKGROUND: There is growing evidence that gene-gene interactions are ubiquitous in determining the susceptibility to common human diseases. The investigation of such gene-gene interactions presents new statistical challenges for studies with relatively small sample sizes as the number of potential interactions in the genome can be large. Breast cancer provides a useful paradigm to study genetically complex diseases because commonly occurring single nucleotide polymorphisms (SNPs) may additively or synergistically disturb the system-wide communication of the cellular processes leading to cancer development. METHODS: In this study, we systematically studied SNP-SNP interactions among 19 SNPs from 18 key genes involved in major cancer pathways in a sample of 398 breast cancer cases and 372 controls from Ontario. We discuss the methodological issues associated with the detection of SNP-SNP interactions in this dataset by applying and comparing three commonly used methods: the logistic regression model, classification and regression trees (CART), and the multifactor dimensionality reduction (MDR) method. RESULTS: Our analyses show evidence for several simple (two-way) and complex (multi-way) SNP-SNP interactions associated with breast cancer. For example, all three methods identified XPD-[Lys751Gln]*IL10-[G(-1082)A] as the most significant two-way interaction. CART and MDR identified the same critical SNPs participating in complex interactions. Our results suggest that the use of multiple statistical approaches (or an integrated approach) rather than a single methodology could be the best strategy to elucidate complex gene interactions that have generally very different patterns. CONCLUSION: The strategy used here has the potential to identify complex biological relationships among breast cancer genes and processes. This will lead to the discovery of novel biological information, which will improve breast cancer risk management.

  17. An Individual-Based Diploid Model Predicts Limited Conditions Under Which Stochastic Gene Expression Becomes Advantageous

    KAUST Repository

    Matsumoto, Tomotaka

    2015-11-24

    Recent studies suggest the existence of stochasticity in gene expression (SGE) in many organisms, and its non-negligible effect on their phenotype and fitness. To date, however, how SGE affects the key parameters of population genetics is not well understood. SGE can increase the phenotypic variation and act as a load for individuals, if they are at the adaptive optimum in a stable environment. On the other hand, part of the phenotypic variation caused by SGE might become advantageous if individuals at the adaptive optimum become genetically less-adaptive, for example due to an environmental change. Furthermore, SGE of unimportant genes might have little or no fitness consequences. Thus, SGE can be advantageous, disadvantageous, or selectively neutral depending on its context. In addition, there might be a genetic basis that regulates the magnitude of SGE, which is often referred to as “modifier genes,” but little is known about the conditions under which such an SGE-modifier gene evolves. In the present study, we conducted individual-based computer simulations to examine these conditions in a diploid model. In the simulations, we considered a single locus that determines organismal fitness for simplicity, and that SGE on the locus creates fitness variation in a stochastic manner. We also considered another locus that modifies the magnitude of SGE. Our results suggested that SGE was always deleterious in stable environments and increased the fixation probability of deleterious mutations in this model. Even under frequently changing environmental conditions, only very strong natural selection made SGE adaptive. These results suggest that the evolution of SGE-modifier genes requires a strict balance among the strength of natural selection, magnitude of SGE, and frequency of environmental changes. However, the degree of dominance affected the condition under which SGE becomes advantageous, indicating a better opportunity for the evolution of SGE in different genetic

  18. A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli

    Directory of Open Access Journals (Sweden)

    Li Mingji

    2012-02-01

    BACKGROUND: For metabolic engineering, many rate-limiting steps may exist in the pathways of accumulating the target metabolites. Increasing the copy number of the desired genes in these pathways is a general method to solve the problem, for example, by employing a multi-copy plasmid-based expression system. However, this method may bring genetic instability, structural instability and metabolic burden to the host, while integration of the desired gene into the chromosome may cause inadequate transcription or expression. In this study, we developed a strategy for obtaining gene overexpression by engineering promoter clusters consisting of multiple core-tac-promoters (MCPtacs) in tandem. RESULTS: Through a uniquely designed in vitro assembling process, a series of promoter clusters were constructed. The transcription strength of these promoter clusters showed a stepwise enhancement with the increase of tandem repeats number until it reached the critical value of five. Application of the MCPtacs promoter clusters in polyhydroxybutyrate (PHB) production proved that it was efficient. Integration of the phaCAB genes with the 5CPtacs promoter cluster resulted in an engineered E. coli that can accumulate 23.7% PHB of the cell dry weight in batch cultivation. CONCLUSIONS: The transcription strength of the MCPtacs promoter cluster can be greatly improved by increasing the tandem repeats number of the core-tac-promoter. By integrating the desired gene together with the MCPtacs promoter cluster into the chromosome of E. coli, we can achieve high and stable overexpression with only a small size. This strategy has an application potential in many fields and can be extended to other bacteria.

  19. Using FlyBase, a Database of Drosophila Genes and Genomes.

    Science.gov (United States)

    Marygold, Steven J; Crosby, Madeline A; Goodman, Joshua L

    2016-01-01

    For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic, and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets, or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization, and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries.

  20. Transgenic Sugarcane Resistant to Sorghum mosaic virus Based on Coat Protein Gene Silencing by RNA Interference

    Directory of Open Access Journals (Sweden)

    Jinlong Guo

    2015-01-01

    As one of the critical diseases of sugarcane, sugarcane mosaic disease can lead to serious decline in stalk yield and sucrose content. It is mainly caused by Potyvirus sugarcane mosaic virus (SCMV) and/or Sorghum mosaic virus (SrMV), with additional differences in viral strains. RNA interference (RNAi) is a novel strategy for producing viral resistant plants. In this study, based on multiple sequence alignment conducted on genomic sequences of different strains and isolates of SrMV, the conserved region of coat protein (CP) genes was selected as the target gene and an interference sequence 423 bp in length was obtained through PCR amplification. The RNAi vector pGII00-HACP with an expression cassette containing both hairpin interference sequence and cp4-epsps herbicide-tolerant gene was transferred to sugarcane cultivar ROC22 via Agrobacterium-mediated transformation. After herbicide screening, PCR molecular identification, and artificial inoculation challenge, anti-SrMV positive transgenic lines were successfully obtained. SrMV resistance rate of the transgenic lines with the interference sequence was 87.5% based on SrMV challenge by artificial inoculation. The genetically modified SrMV-resistant lines of cultivar ROC22 provide resistant germplasm for breeding lines and can also serve as resistant lines having the same genetic background for study of resistance mechanisms.

  1. Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge

    Directory of Open Access Journals (Sweden)

    Wang Shu-Qiang

    2012-07-01

    BACKGROUND: A key challenge in the post genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse grained qualitative models, and cannot provide a quantitative view of regulatory systems. RESULTS: A binding affinity based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that only employ microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. CONCLUSIONS: We test the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of TF. Moreover, the kinetic parameters introduced in the proposed model are more biologically meaningful than those of previous models.

  2. Treatment planning of electroporation-based medical interventions: electrochemotherapy, gene electrotransfer and irreversible electroporation

    International Nuclear Information System (INIS)

    Zupanic, Anze; Kos, Bor; Miklavcic, Damijan

    2012-01-01

    In recent years, cancer electrochemotherapy (ECT), gene electrotransfer for gene therapy and DNA vaccination (GET) and tissue ablation with irreversible electroporation (IRE) have all entered clinical practice. We present a method for a personalized treatment planning procedure for ECT, GET and IRE, based on medical image analysis, numerical modelling of electroporation and optimization with the genetic algorithm, and several visualization tools for treatment plan assessment. Each treatment plan provides the attending physician with optimal positions of electrodes in the body and electric pulse parameters for optimal electroporation of the target tissues. For the studied case of a deep-seated tumour, the optimal treatment plans for ECT and IRE require at least two electrodes to be inserted into the target tissue, thus lowering the necessary voltage for electroporation and limiting damage to the surrounding healthy tissue. In GET, it is necessary to place the electrodes outside the target tissue to prevent damage to target cells intended to express the transfected genes. The presented treatment planning procedure is a valuable tool for clinical and experimental use and evaluation of electroporation-based treatments.

  3. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased markedly in recent years. However, because such data contain a very large number of genes and the relations among them are mostly nonlinear, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  4. Potential mechanisms for cell-based gene therapy to treat HIV/AIDS.

    Science.gov (United States)

    Herrera-Carrillo, Elena; Berkhout, Ben

    2015-02-01

    An estimated 35 million people are infected with HIV worldwide. Anti-retroviral therapy (ART) has reduced the morbidity and mortality of HIV-infected patients but efficacy requires strict adherence and the treatment is not curative. Most importantly, the emergence of drug-resistant virus strains and drug toxicity can restrict the long-term therapeutic efficacy in some patients. Therefore, novel treatment strategies that permanently control or eliminate the virus and restore the damaged immune system are required. Gene therapy against HIV infection has been the topic of intense investigations for the last two decades because it can theoretically provide such a durable anti-HIV control. In this review we discuss two major gene therapy strategies to combat HIV. One approach aims to kill HIV-infected cells and the other is based on the protection of cells from HIV infection. We discuss the underlying molecular mechanisms for candidate approaches to permanently block HIV infection, including the latest strategies and future therapeutic applications. Hematopoietic stem cell-based gene therapy for HIV/AIDS may eventually become an alternative for standard ART and should ideally provide a functional cure in which the virus is durably controlled without medication. Recent results from preclinical research and early-stage clinical trials support the feasibility and safety of this novel strategy.

  5. Zn(II)-dipicolylamine-based metallo-lipids as novel non-viral gene vectors.

    Science.gov (United States)

    Su, Rong-Chuan; Liu, Qiang; Yi, Wen-Jing; Zhao, Zhi-Gang

    2017-08-01

    In this study, a series of Zn(II)-dipicolylamine (Zn-DPA) based cationic lipids bearing different hydrophobic tails (long chains, α-tocopherol, cholesterol or diosgenin) were synthesized. Structure-activity relationship (SAR) of these lipids was studied in detail by investigating the effects of several structural aspects including the type of hydrophobic tails, the chain length and saturation degree. In addition, several assays were used to study their interactions with plasmid DNA, and results reveal that these lipids could condense DNA into nanosized particles with appropriate size and zeta-potentials. MTT-based cell viability assays showed that lipoplexes 5 had low cytotoxicity. The in vitro gene transfection studies showed that the hydrophobic tails clearly affected the transfection efficiency (TE), and hexadecanol-containing lipid 5b gave the best TE, which was 2.2 times higher than that of bPEI 25k in the presence of 10% serum. The results not only demonstrate that these lipids might be promising non-viral gene vectors, but also afford us clues for further optimization of lipidic gene delivery materials.

  6. Preparation and Characterization of Gelatin-Based Mucoadhesive Nanocomposites as Intravesical Gene Delivery Scaffolds

    Directory of Open Access Journals (Sweden)

    Ching-Wen Liu

    2014-01-01

    This study aimed to develop optimal gelatin-based mucoadhesive nanocomposites as scaffolds for intravesical gene delivery to the urothelium. Hydrogels were prepared by chemically crosslinking gelatin A or B with glutaraldehyde. Physicochemical and delivery properties including hydration ratio, viscosity, size, yield, thermosensitivity, and enzymatic degradation were studied, and scanning electron microscopy (SEM) was carried out. The optimal hydrogels (H), composed of 15% gelatin A175, displayed an 81.5% yield rate, 87.1% hydration ratio, 42.9 Pa·s viscosity, and 125.8 nm particle size. The crosslinking density of the hydrogels was determined by performing pronase degradation and ninhydrin assays. In vitro lentivirus (LV) release studies involving p24 capsid protein analysis in 293T cells revealed that hydrogels containing lentivirus (H-LV) had a higher cumulative release than that observed for LV alone (3.7-, 2.3-, and 2.3-fold at days 1, 3, and 5, respectively). Lentivirus from lentivector constructed green fluorescent protein (GFP) was then entrapped in hydrogels (H-LV-GFP). H-LV-GFP showed enhanced gene delivery in AY-27 cells in vitro and to rat urothelium by intravesical instillation in vivo. Cystometrogram showed mucoadhesive H-LV reduced peak micturition and threshold pressure and increased bladder compliance. In this study, we successfully developed the first optimal gelatin-based mucoadhesive nanocomposites as intravesical gene delivery scaffolds.

  7. Blood-based gene-expression predictors of PTSD risk and resilience among deployed marines: a pilot study.

    Science.gov (United States)

    Glatt, Stephen J; Tylee, Daniel S; Chandler, Sharon D; Pazol, Joel; Nievergelt, Caroline M; Woelk, Christopher H; Baker, Dewleen G; Lohr, James B; Kremen, William S; Litz, Brett T; Tsuang, Ming T

    2013-06-01

    Susceptibility to PTSD is determined by both genes and environment. Similarly, gene-expression levels in peripheral blood are influenced by both genes and environment, and expression levels of many genes show good correspondence between peripheral blood and brain. Therefore, our objectives were to test the following hypotheses: (1) pre-trauma expression levels of a gene subset (particularly immune-system genes) in peripheral blood would differ between trauma-exposed Marines who later developed PTSD and those who did not; (2) a predictive biomarker panel of the eventual emergence of PTSD among high-risk individuals could be developed based on gene expression in readily assessable peripheral blood cells; and (3) a predictive panel based on expression of individual exons would surpass the accuracy of a model based on expression of full-length gene transcripts. Gene-expression levels were assayed in peripheral blood samples from 50 U.S. Marines (25 eventual PTSD cases and 25 non-PTSD comparison subjects) prior to their deployment overseas to war-zones in Iraq or Afghanistan. The panel of biomarkers dysregulated in peripheral blood cells of eventual PTSD cases prior to deployment was significantly enriched for immune genes, achieved 70% prediction accuracy in an independent sample based on the expression of 23 full-length transcripts, and attained 80% accuracy in an independent sample based on the expression of one exon from each of five genes. If the observed profiles of pre-deployment mRNA-expression in eventual PTSD cases can be further refined and replicated, they could suggest avenues for early intervention and prevention among individuals at high risk for trauma exposure. Copyright © 2013 Wiley Periodicals, Inc.

  8. Integration of gene-based markers in a pearl millet genetic map for identification of candidate genes underlying drought tolerance quantitative trait loci

    Directory of Open Access Journals (Sweden)

    Sehgal Deepmala

    2012-01-01

    BACKGROUND: Identification of genes underlying drought tolerance (DT) quantitative trait loci (QTLs) will facilitate understanding of molecular mechanisms of drought tolerance, and also will accelerate genetic improvement of pearl millet through marker-assisted selection. We report a map based on genes with assigned functional roles in plant adaptation to drought and other abiotic stresses and demonstrate its use in identifying candidate genes underlying a major DT-QTL. RESULTS: Seventy-five single nucleotide polymorphism (SNP) and conserved intron spanning primer (CISP) markers were developed from available expressed sequence tags (ESTs) using four genotypes, H 77/833-2, PRLT 2/89-33, ICMR 01029 and ICMR 01004, representing parents of two mapping populations. A total of 228 SNPs were obtained from a 30.5 kb sequenced region, resulting in a SNP frequency of 1/134 bp. The positions of major pearl millet linkage group (LG) 2 DT-QTLs (reported from crosses H 77/833-2 × PRLT 2/89-33 and 841B × 863B) were added to the present consensus function map which identified 18 genes, coding for PSI reaction center subunit III, PHYC, actin, alanine glyoxylate aminotransferase, uridylate kinase, acyl-CoA oxidase, dipeptidyl peptidase IV, MADS-box, serine/threonine protein kinase, ubiquitin conjugating enzyme, zinc finger C-x8-C-x5-C-x3-H type, Hd3, acetyl CoA carboxylase, chlorophyll a/b binding protein, photolyase, protein phosphatase 1 regulatory subunit SDS22 and two hypothetical proteins, co-mapping in this DT-QTL interval. Many of these candidate genes were found to have significant association with QTLs of grain yield, flowering time and leaf rolling under drought stress conditions. CONCLUSIONS: We have exploited available pearl millet EST sequences to generate a mapped resource of seventy-five new gene-based markers for pearl millet and demonstrated its use in identifying candidate genes underlying a major DT-QTL in this species. The reported gene-based

  9. Comprehensive Protocols for CRISPR/Cas9-based Gene Editing in Human Pluripotent Stem Cells.

    Science.gov (United States)

    Santos, David P; Kiskinis, Evangelos; Eggan, Kevin; Merkle, Florian T

    2016-08-17

    Genome editing of human pluripotent stem cells (hPSCs) with the CRISPR/Cas9 system has the potential to revolutionize hPSC-based disease modeling, drug screening, and transplantation therapy. Here, we aim to provide a single resource to enable groups, even those with limited experience with hPSC culture or the CRISPR/Cas9 system, to successfully perform genome editing. The methods are presented in detail and are supported by a theoretical framework to allow for the incorporation of inevitable improvements in the rapidly evolving gene-editing field. We describe protocols to generate hPSC lines with gene-specific knock-outs, small targeted mutations, or knock-in reporters. © 2016 by John Wiley & Sons, Inc.

  10. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    Science.gov (United States)

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
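
    Several of the per-gene values listed in the abstract above (A + T%, nucleotide frequency, dinucleotide frequency and codon use) are simple functions of the DNA sequence. The sketch below shows one way they could be computed; it is not VGDB code (VGDB is described as a MySQL database with a Java GUI), and the function name composition_stats, the returned dictionary keys, and the 18-nucleotide example sequence are assumptions made for illustration. Molecular weight and isoelectric point are left out because they need amino acid property tables.

```python
from collections import Counter


def composition_stats(dna):
    """Compute simple composition statistics for one coding DNA sequence.

    Hypothetical helper: assumes `dna` contains only A, C, G, T and starts
    in reading frame 0.
    """
    seq = dna.upper()
    length = len(seq)
    counts = Counter(seq)

    # A + T content as a percentage of all nucleotides.
    at_percent = 100.0 * (counts["A"] + counts["T"]) / length

    # Relative frequency of each single nucleotide.
    nucleotide_freq = {base: counts[base] / length for base in "ACGT"}

    # Overlapping dinucleotides: positions (0,1), (1,2), (2,3), ...
    pairs = Counter(seq[i:i + 2] for i in range(length - 1))
    dinucleotide_freq = {pair: n / (length - 1) for pair, n in pairs.items()}

    # Codon use: non-overlapping triplets in frame 0; a trailing partial
    # codon, if any, is ignored.
    usable = length - length % 3
    codon_use = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    return {
        "at_percent": at_percent,
        "nucleotide_freq": nucleotide_freq,
        "dinucleotide_freq": dinucleotide_freq,
        "codon_use": dict(codon_use),
    }


# Made-up 18-nucleotide open reading frame used only as a demonstration.
stats = composition_stats("ATGAAATTTGGGCCCTAA")
print(round(stats["at_percent"], 2))
print(stats["codon_use"])
```

    Storing such derived values next to each sequence, as the abstract describes, is what allows query results to be sorted by any individual parameter without recomputing them.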

  11. Satellite DNA-based artificial chromosomes for use in gene therapy.

    Science.gov (United States)

    Hadlaczky, G

    2001-04-01

    Satellite DNA-based artificial chromosomes (SATACs) can be made by induced de novo chromosome formation in cells of different mammalian species. These artificially generated accessory chromosomes are composed of predictable DNA sequences and they contain defined genetic information. Prototype human SATACs have been successfully constructed in different cell types from 'neutral' endogenous DNA sequences from the short arm of the human chromosome 15. SATACs have already passed a number of hurdles crucial to their further development as gene therapy vectors, including: large-scale purification; transfer of purified artificial chromosomes into different cells and embryos; generation of transgenic animals and germline transmission with purified SATACs; and the tissue-specific expression of a therapeutic gene from an artificial chromosome in the milk of transgenic animals.

  12. P53 Gene Mutation as Biomarker of Radiation Induced Cell Injury and Genomic Instability

    International Nuclear Information System (INIS)

    Mukh-Syaifudin

    2006-01-01

    Gene expression profiling and mutation analysis have become some of the most widely used approaches for identifying genes and their functions, and for identifying and categorizing genes that can serve as markers of radiation effects, including cell and tissue sensitivity. Ionizing radiation produces genetic damage and changes in gene expression that may lead to cancer when the function, the expression, or both of specific proteins controlling cell proliferation are altered. The p53 protein, encoded by the p53 gene, plays an important role in protecting cells by inducing growth arrest and/or cell suicide (apoptosis) after deoxyribonucleic acid (DNA) damage induced by a mutagen such as ionizing radiation. Mutant, and thereby dysfunctional, forms of this gene have been found in more than 50% of various human cancers, but it is as yet unclear how p53 mutations lead to neoplastic development. Wild-type p53 has been postulated to play a role in DNA repair, suggesting that expression of mutant forms of p53 might alter cellular resistance to the DNA damage caused by radiation. Moreover, p53 is thought to function as a cell cycle checkpoint after irradiation, also suggesting that mutant p53 might change the cellular proliferative response to radiation. P53 mutations affect the cellular response to DNA damage, either by increasing DNA repair processes or, possibly, by increasing cellular tolerance to DNA damage. The association of p53 mutations with increased radioresistance suggests that alterations in the p53 gene might lead to oncogenic transformation. A currently favoured model of carcinogenesis also shows that the p53 gene is the major target of radiation. The majority of p53 mutations found so far are single base pair changes (point mutations), which result in amino acid substitutions or truncated forms of the p53 protein, and are widely distributed throughout the evolutionarily conserved regions of the gene. Examination of p53 mutations in human cancer also shows an association between particular carcinogens and

  13. Mesenchymal stem cell-based NK4 gene therapy in nude mice bearing gastric cancer xenografts

    Directory of Open Access Journals (Sweden)

    Zhu Y

    2014-12-01

    tissues after systemic injection. The microvessel density of tumor xenografts was decreased, and tumor cellular apoptosis was significantly induced in the mice treated with MSCs-NK4 compared to control mice. These findings demonstrate that MSC-based NK4 gene therapy can clearly inhibit the growth of gastric cancer xenografts, and that MSCs are a better vehicle for NK4 gene therapy than lentiviral vectors. Further studies are warranted to explore the efficacy and safety of MSC-based NK4 gene therapy in animals and cancer patients. Keywords: gastric cancer, gene therapy, tumor xenograft, hepatocyte growth factor, lentivirus, angiogenesis, apoptosis

  14. Shikonin enhances efficacy of a gene-based cancer vaccine via induction of RANTES

    Directory of Open Access Journals (Sweden)

    Chen Hui-Ming

    2012-04-01

    Background: Shikonin, a phytochemical purified from Lithospermum erythrorhizon, has been shown to confer diverse pharmacological activities, including acceleration of granuloma formation, wound healing, and anti-inflammatory effects, and is explored in this study for immune-modifier activities in vaccination. Transdermal gene-based vaccination is an attractive approach for delivering DNA transgenes encoding specific tumor antigens to host skin tissues. Skin dendritic cells (DCs), a potent antigen-presenting cell type, are known to play a critical role in transmitting and orchestrating tumor antigen-specific immunities against cancers. The present study hence employs these various components for experimentation. Method: The mRNA and protein expression of RANTES were detected by RT-PCR and ELISA, respectively. The regional expression of RANTES and tissue damage in test skin were evaluated via immunohistochemistry assay. A fluorescein isothiocyanate sensitization assay was performed to trace the trafficking of DCs from the skin vaccination site to draining lymph nodes. The adjuvant effect of shikonin on a gene gun-delivered human gp100 (hgp100) DNA cancer vaccine was studied in a human gp100-transfected B16 (B16/hgp100) tumor model. Results: Among the various phytochemicals tested, shikonin induced the highest level of expression of RANTES in normal skin tissues. In comparison, mouse RANTES cDNA gene transfection induced a higher level of mRANTES expression for a longer period, but caused more extensive skin damage. Topical application of shikonin onto the immunization site before gene gun-mediated vaccination augmented the population of skin DCs migrating into the draining lymph nodes. An hgp100 cDNA gene vaccination regimen with shikonin pretreatment as an adjuvant in a B16/hgp100 tumor model increased cytotoxic T lymphocyte activities in splenocytes and lymph node cells on target tumor cells. Conclusion: Together, our findings suggest that shikonin can

  15. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. 
Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based

  16. A candidate gene-based association study of tocopherol content and composition in rapeseed (Brassica napus)

    Directory of Open Access Journals (Sweden)

    Steffi Fritsche

    2012-06-01

    Rapeseed (Brassica napus L.) is the most important oil crop of temperate climates. Rapeseed oil contains tocopherols, also known as vitamin E, which is an indispensable nutrient for humans and animals due to its antioxidant and radical scavenging abilities. Moreover, tocopherols are also important for the oxidative stability of vegetable oils. Therefore, seed oil with increased tocopherol content or altered tocopherol composition is a target for breeding. We investigated the role of nucleotide variations within candidate genes from the tocopherol biosynthesis pathway. Field trials were carried out with 229 accessions from a worldwide B. napus collection which was divided into two panels of 96 and 133 accessions. Seed tocopherol content and composition were measured by HPLC. High heritabilities were found for both traits, ranging from 0.62 to 0.94. We identified polymorphisms by sequencing selected regions of the tocopherol genes from the 96 accession panel. Subsequently, we determined the population structure (Q) and relative kinship (K) as detected by genotyping with genome-wide distributed SSR markers. Association studies were performed using two models, the structure-based GLM+Q and the PK mixed model. Between 26 and 12 polymorphisms within two genes (BnaX.VTE3.a, BnaA.PDS1.c) were significantly associated with tocopherol traits. The SNPs explained up to 16.93 % of the genetic variance for tocopherol composition and up to 10.48 % for total tocopherol content. Based on the sequence information we designed CAPS markers for genotyping the 133 accessions from the 2nd panel. Significant associations with various tocopherol traits confirmed the results from the first experiment. We demonstrate that the polymorphisms within the tocopherol genes clearly impact tocopherol content and composition in B. napus seeds. We suggest that these nucleotide variations may be used as selectable markers for breeding rapeseed with enhanced tocopherol quality.

  17. Systematics of Plant-Pathogenic and Related Streptomyces Species Based on Phylogenetic Analyses of Multiple Gene Loci

    Science.gov (United States)

    The 10 species of Streptomyces implicated as the etiological agents in scab disease of potatoes or soft rot disease of sweet potatoes are distributed among 7 different phylogenetic clades in analyses based on 16S rRNA gene sequences, but high sequence similarity of this gene among Streptomyces speci...

  18. Tsw gene-based resistance is triggered by a functional RNA silencing suppressor protein of the Tomato spotted wilt virus

    NARCIS (Netherlands)

    Ronde, de D.; Butterbach, P.B.E.; Lohuis, H.; Hedil, M.; Lent, van J.W.M.; Kormelink, R.J.M.

    2013-01-01

    As a result of contradictory reports, the avirulence (Avr) determinant that triggers Tsw gene-based resistance in Capsicum annuum against the Tomato spotted wilt virus (TSWV) is still unresolved. Here, the N and NSs genes of resistance-inducing (RI) and resistance-breaking (RB) isolates were cloned

  19. Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

    Directory of Open Access Journals (Sweden)

    Andrew J Burt

    Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08), where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.

  20. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

    Directory of Open Access Journals (Sweden)

    Saurav Mallik

    2017-12-01

    For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages to find causal effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.

  1. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis.

    Science.gov (United States)

    Mallik, Saurav; Zhao, Zhongming

    2017-12-28

    For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages to find causal effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
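    The abstract above names Jaccard and Cosine measures between rule pairs but does not give the weighted rank-based formulas. As a rough orientation only, the sketch below shows the plain, unweighted Jaccard and cosine similarities between the gene sets of two rules; the function names and the gene symbols are illustrative assumptions, not part of the ConGEMs method.

```python
from math import sqrt


def jaccard_similarity(genes_a, genes_b):
    """Unweighted Jaccard similarity: shared genes over all genes in either rule."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(genes_a, genes_b):
    """Cosine similarity of the two rules viewed as binary gene-membership vectors."""
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical gene sets taken from two association rules.
rule_1 = {"TP53", "EGFR", "KRAS"}
rule_2 = {"EGFR", "KRAS", "MYC"}
print(jaccard_similarity(rule_1, rule_2))
print(cosine_similarity(rule_1, rule_2))
```

    A pairwise matrix of such scores is the kind of input a clustering step could take; the weighting and ranking used by the authors would replace these plain set counts.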

  2. Functional characterization of MAT1-1-specific mating-type genes in the homothallic ascomycete Sordaria macrospora provides new insights into essential and nonessential sexual regulators.

    Science.gov (United States)

    Klix, V; Nowrousian, M; Ringelberg, C; Loros, J J; Dunlap, J C; Pöggeler, S

    2010-06-01

    Mating-type genes in fungi encode regulators of mating and sexual development. Heterothallic ascomycete species require different sets of mating-type genes to control nonself-recognition and mating of compatible partners of different mating types. Homothallic (self-fertile) species also carry mating-type genes in their genome that are essential for sexual development. To analyze the molecular basis of homothallism and the role of mating-type genes during fruiting-body development, we deleted each of the three genes, SmtA-1 (MAT1-1-1), SmtA-2 (MAT1-1-2), and SmtA-3 (MAT1-1-3), contained in the MAT1-1 part of the mating-type locus of the homothallic ascomycete species Sordaria macrospora. Phenotypic analysis of deletion mutants revealed that the PPF domain protein-encoding gene SmtA-2 is essential for sexual reproduction, whereas the alpha domain protein-encoding genes SmtA-1 and SmtA-3 play no role in fruiting-body development. By means of cross-species microarray analysis using Neurospora crassa oligonucleotide microarrays hybridized with S. macrospora targets and quantitative real-time PCR, we identified genes expressed under the control of SmtA-1 and SmtA-2. Both genes are involved in the regulation of gene expression, including that of pheromone genes.

  3. AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.

    Science.gov (United States)

    Zhao, X G; Dai, W; Li, Y; Tian, L

    2011-11-01

    The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely because the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm for identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We have proposed novel AUC-based statistical ensemble methods for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function by a convex surrogate loss function, whose minimizer can be efficiently identified. With the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new methods over the existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores to differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset have been satisfactory. Aiming at directly maximizing the AUC, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful in high-dimensional settings. Contact: lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
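    To make the contrast between the AUC and a convex surrogate concrete, the sketch below computes the empirical AUC of a linear biomarker combination as the fraction of correctly ordered case/control pairs, alongside a pairwise hinge loss. The hinge loss is an assumed stand-in: the abstract does not say which surrogate the authors use, and the weights and marker values are made up for illustration.

```python
from itertools import product


def combine(weights, markers):
    """Linear score: weighted sum of one subject's biomarker values."""
    return sum(w * m for w, m in zip(weights, markers))


def empirical_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    total = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            total += 1.0
        elif case == control:
            total += 0.5
    return total / (len(case_scores) * len(control_scores))


def pairwise_hinge_loss(case_scores, control_scores, margin=1.0):
    """Convex surrogate (assumed): mean hinge penalty over all (case, control) pairs."""
    losses = [max(0.0, margin - (case - control))
              for case, control in product(case_scores, control_scores)]
    return sum(losses) / len(losses)


# Hypothetical two-biomarker data and weights.
weights = (1.0, 0.5)
cases = [(2.0, 1.0), (1.5, 0.5)]
controls = [(0.5, 1.0), (1.0, 0.0)]
case_scores = [combine(weights, m) for m in cases]
control_scores = [combine(weights, m) for m in controls]
print(empirical_auc(case_scores, control_scores))
print(pairwise_hinge_loss(case_scores, control_scores))
```

    The AUC is a step function of the weights, whereas the hinge loss is piecewise linear and convex in them, which is what makes gradient-based minimization and an added lasso penalty tractable.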

  4. Genome-Wide Constitutively Expressed Gene Analysis and New Reference Gene Selection Based on Transcriptome Data: A Case Study from Poplar/Canker Disease Interaction

    Directory of Open Access Journals (Sweden)

    Jiaping Zhao

    2017-10-01

    A number of transcriptome datasets for differential expression (DE) genes have been widely used for understanding organismal biology, but these datasets also contain untapped information that can be used to develop more precise analytical tools. With the use of transcriptome data generated from a poplar/canker disease interaction system, we describe a methodology to identify candidate reference genes from high-throughput sequencing data. This methodology will improve the accuracy of RT-qPCR and will lead to better standards for the normalization of expression data. Expression stability analysis from xylem and phloem of Populus beijingensis inoculated with the fungal canker pathogen Botryosphaeria dothidea revealed that 729 poplar transcripts (1.11%) were stably expressed, at a threshold level of coefficient of variance (CV) of FPKM < 20% and maximum fold change (MFC) of FPKM < 2.0. Expression stability and bioinformatics analysis suggested that commonly used house-keeping (HK) genes were not the most appropriate internal controls: 70 of the 72 commonly used HK genes were not stably expressed, 45 of the 72 produced multiple isoform transcripts, and some of their reported primers produced unspecific amplicons in PCR amplification. RT-qPCR analysis to compare and evaluate the expression stability of 10 commonly used poplar HK genes and 20 of the 729 newly-identified stably expressed transcripts showed that some of the newly-identified genes (such as SSU_S8e, LSU_L5e, and 20S_PSU) had higher stability ranking than most of the commonly used HK genes. Based on these results, we recommend a pipeline for deriving reference genes from transcriptome data. An appropriate candidate gene should have a unique transcript, constitutive expression, a CV value of expression < 20% (or possibly 30%) and an MFC value of expression < 2, and an expression level of 50–1,000 units.
Lastly, when four of the newly identified HK genes were used in the normalization of expression data for 20
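    The stability thresholds quoted in this abstract (CV of FPKM < 20% and MFC of FPKM < 2.0) can be applied with a few lines of code. The sketch below is a minimal illustration under stated assumptions: CV is taken as the sample standard deviation divided by the mean, MFC as the largest FPKM divided by the smallest, and the transcript names and FPKM values are hypothetical. The abstract does not specify these computational details.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (needs at least two values)."""
    average = mean(values)
    if average == 0:
        return float("inf")
    return stdev(values) / average


def max_fold_change(values):
    """Largest value divided by the smallest; infinite if any value is zero."""
    lowest = min(values)
    return float("inf") if lowest == 0 else max(values) / lowest


def stable_transcripts(fpkm_by_transcript, cv_max=0.20, mfc_max=2.0):
    """Names of transcripts passing both stability thresholds, sorted."""
    return sorted(
        name for name, values in fpkm_by_transcript.items()
        if coefficient_of_variation(values) < cv_max
        and max_fold_change(values) < mfc_max
    )


# Hypothetical FPKM values across four samples.
fpkm = {
    "transcript_A": [100.0, 105.0, 95.0, 102.0],  # steady expression
    "transcript_B": [10.0, 50.0, 30.0, 80.0],     # strongly variable
}
print(stable_transcripts(fpkm))
```

    The other recommended criteria (a unique transcript and an expression level of 50–1,000 units) would be additional filters on top of this one.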

  5. Prioritization of candidate genes for cattle reproductive traits, based on protein-protein interactions, gene expression, and text-mining

    DEFF Research Database (Denmark)

    Hulsegge, Ina; Woelders, Henri; Smits, Mari

    2013-01-01

    Reproduction is of significant economic importance in dairy cattle. Improved understanding of mechanisms that control estrous behavior and other reproduction traits could help in developing strategies to improve and/or monitor these traits. The objective of this study was to predict and rank gene...

  6. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    DEFF Research Database (Denmark)

    Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine

    2012-01-01

    the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric tests measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...

  7. Microarray-based analysis of differential gene expression between infective and noninfective larvae of Strongyloides stercoralis.

    Directory of Open Access Journals (Sweden)

    Roshan Ramanathan

    2011-05-01

    Differences between noninfective first-stage (L1) and infective third-stage (L3i) larvae of the parasitic nematode Strongyloides stercoralis at the molecular level are relatively uncharacterized. DNA microarrays were developed and utilized for this purpose. Oligonucleotide hybridization probes for the array were designed to bind 3,571 putative mRNA transcripts predicted by analysis of 11,335 expressed sequence tags (ESTs) obtained as part of the Nematode EST project. RNA obtained from S. stercoralis L3i and L1 was co-hybridized to each array after labeling the individual samples with different fluorescent tags. Bioinformatic predictions of gene function were developed using a novel cDNA Annotation System software. We identified 935 differentially expressed genes (469 L3i-biased; 466 L1-biased) having two-fold expression differences or greater and microarray signals with a p value < 0.01. Based on a functional analysis, L1 larvae have a larger number of genes putatively involved in transcription (p = 0.004), and L3i larvae have biased expression of putative heat shock proteins (such as hsp-90). Genes with products known to be immunoreactive in S. stercoralis-infected humans (such as SsIR and NIE) had L3i-biased expression. Abundantly expressed L3i contigs of interest included S. stercoralis orthologs of cytochrome oxidase ucr 2.1 and hsp-90, which may be potential chemotherapeutic targets. The S. stercoralis ortholog of fatty acid and retinol binding protein-1, successfully used in a vaccine against Ancylostoma ceylanicum, was identified among the 25 most highly expressed L3i genes. The sperm-containing glycoprotein domain, utilized in a vaccine against the nematode Cooperia punctata, was exclusively found in L3i-biased genes and may be a valuable S. stercoralis target of interest. A new DNA microarray tool for the examination of S. stercoralis biology has been developed and provides new and valuable insights regarding differences between infective and
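    The selection rule stated in this abstract (a two-fold or greater expression difference with p < 0.01) amounts to a simple two-condition filter. The sketch below is an assumed illustration of that rule only: the function name, the use of a log2 ratio, and the signal values are hypothetical, and the study's actual normalization and statistical test are not described here.

```python
from math import log2


def classify_bias(l3i_signal, l1_signal, p_value, min_fold=2.0, alpha=0.01):
    """Return 'L3i-biased', 'L1-biased', or None if the gene fails either cutoff."""
    log_ratio = log2(l3i_signal / l1_signal)
    if p_value >= alpha or abs(log_ratio) < log2(min_fold):
        return None
    return "L3i-biased" if log_ratio > 0 else "L1-biased"


# Hypothetical signals: (L3i intensity, L1 intensity, p value).
examples = {
    "gene_1": (200.0, 50.0, 0.001),
    "gene_2": (50.0, 200.0, 0.001),
    "gene_3": (100.0, 90.0, 0.001),
    "gene_4": (200.0, 50.0, 0.05),
}
for name, (l3i, l1, p) in examples.items():
    print(name, classify_bias(l3i, l1, p))
```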

  8. A novel mutual information-based Boolean network inference method from time-series gene expression data.

    Directory of Open Access Journals (Sweden)

    Shohag Barman

    Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
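    The first step described above, picking initial regulators by mutual information, can be illustrated on Boolean time series. The sketch below is a simplified assumption, not the MIBNI algorithm itself: it scores each candidate gene by the mutual information between its state at time t and the target's state at time t + 1 and ranks candidates individually, whereas the published method selects a regulator set and then refines it by swapping. All names and series are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        total += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return total


def lagged_mi(regulator, target, lag=1):
    """MI between the regulator at time t and the target at time t + lag."""
    return mutual_information(regulator[:-lag], target[lag:])


def rank_regulators(series_by_gene, target, top_k=1):
    """Candidate genes sorted by lagged MI with the target, highest first."""
    ranked = sorted(series_by_gene,
                    key=lambda gene: lagged_mi(series_by_gene[gene], target),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical Boolean time series; the target copies g1 with a one-step delay.
candidates = {"g1": [0, 1, 0, 1, 0], "g2": [0, 0, 1, 1, 0]}
target = [0, 0, 1, 0, 1]
print(rank_regulators(candidates, target))
```

    Scoring genes one at a time avoids the combinatorial search over regulator sets that the abstract identifies as the scalability bottleneck, at the cost of missing regulators that are informative only jointly.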

  9. A Peptide-based Vector for Efficient Gene Transfer In Vitro and In Vivo

    Science.gov (United States)

    Lehto, Taavi; Simonson, Oscar E; Mäger, Imre; Ezzat, Kariem; Sork, Helena; Copolovici, Dana-Maria; Viola, Joana R; Zaghloul, Eman M; Lundin, Per; Moreno, Pedro MD; Mäe, Maarja; Oskolkov, Nikita; Suhorutšenko, Julia; Smith, CI Edvard; Andaloussi, Samir EL

    2011-01-01

    Finding suitable nonviral delivery vehicles for nucleic acid–based therapeutics is a landmark goal in gene therapy. Cell-penetrating peptides (CPPs) are one class of delivery vectors that has been exploited for this purpose. However, since CPPs use endocytosis to enter cells, a large fraction of peptides remain trapped in endosomes. We have previously reported that stearylation of amphipathic CPPs, such as transportan 10 (TP10), dramatically increases transfection of oligonucleotides in vitro, partially by promoting endosomal escape. Therefore, we aimed to evaluate whether stearyl-TP10 could be used for the delivery of plasmids as well. Our results demonstrate that stearyl-TP10 forms stable nanoparticles with plasmids that efficiently enter different cell types in a ubiquitous manner, including primary cells, resulting in significantly higher gene expression levels than when using stearyl-Arg9 or unmodified CPPs. In fact, the transfection efficacy of stearyl-TP10 almost reached the levels of Lipofectamine 2000 (LF2000), but without any of the observed lipofection-associated toxicities. Most importantly, stearyl-TP10/plasmid nanoparticles are nonimmunogenic and mediate efficient gene delivery in vivo when administered intramuscularly (i.m.) or intradermally (i.d.), without any associated toxicity in mice. PMID:21343913

  10. Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

    Science.gov (United States)

    Nitta, Sachiko Kaihara; Numata, Keiji

    2013-01-01

    There has been great interest in the application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified into three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation processes have been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed. PMID:23344060

  11. Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

    Directory of Open Access Journals (Sweden)

    Keiji Numata

    2013-01-01

    There has been great interest in the application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified into three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation processes have been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed.

  12. Use of reporter-gene based bacteria to quantify phenanthrene biodegradation and toxicity in soil

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Doyun [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Moon, Hee Sun [School of Earth and Environmental Science, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Lin, Chu-Ching; Barkay, Tamar [Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901 (United States); Nam, Kyoungphile, E-mail: kpnam@snu.ac.k [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of)

    2011-02-15

    A phenanthrene-degrading bacterium, Sphingomonas paucimobilis EPA505, was used to construct two fluorescence-based reporter strains. Strain D, harboring the gfp gene, was constructed to generate green fluorescence when the strain started to biodegrade phenanthrene. Strain S, possessing the gef gene, was designed to die once phenanthrene biodegradation was initiated and thus to lose green fluorescence when visualized by live/dead cell staining. Confocal laser scanning microscopic observation followed by image analysis demonstrates that the fluorescence intensity generated by strain D increased and the intensity by strain S decreased linearly at phenanthrene concentrations of up to 200 mg/L. Such quantitative increase and decrease of fluorescence intensity in strain D (i.e., from 1 to 11.90 ± 0.72) and strain S (from 1 to 0.40 ± 0.07) were also evident in the presence of Ottawa sand spiked with phenanthrene up to 1000 mg/kg. The potential use of the reporter strains in quantitatively determining biodegradable or toxic phenanthrene was discussed. - Research highlights: A novel reporter bacterial strain has been developed. The bacterium can quantitatively determine the change in fluorescence intensity. The intensity can represent the bioavailable phenanthrene in solid matrix. - A cell-killing gene harboring reporter bacterium shows phenanthrene toxicity.

  13. Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

    Science.gov (United States)

    Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

    2017-08-01

    Correct inference of genetic regulations inside a cell from biological data such as time-series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, namely the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small- and medium-scale artificial genetic networks with and without different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.
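
The abstract above does not give the model equations, so the following is only a minimal sketch of one discrete-time RNN formulation that is common in the GRN literature: each gene's next expression level is a blend of its current level and a sigmoid of the weighted input from all genes. The weights, biases and time constants below are hypothetical, and ESWSA itself is not implemented; a metaheuristic of that kind would search for the parameter values that minimise `fitness`.

```python
import math


def rnn_grn_step(x, w, beta, tau, dt=1.0):
    """Advance expression levels by one step of a discrete-time RNN model.

    x    : current expression levels, one per gene (assumed scaled to [0, 1])
    w    : w[i][j] is the (hypothetical) regulatory weight of gene j on gene i
    beta : per-gene bias terms
    tau  : per-gene time constants; dt / tau[i] is assumed not to exceed 1
    """
    nxt = []
    for i in range(len(x)):
        drive = sum(w[i][j] * x[j] for j in range(len(x))) + beta[i]
        activation = 1.0 / (1.0 + math.exp(-drive))  # sigmoid, always in (0, 1)
        rate = dt / tau[i]
        nxt.append(rate * activation + (1.0 - rate) * x[i])
    return nxt


def simulate(x0, w, beta, tau, steps, dt=1.0):
    """Return the trajectory [x(0), x(1), ..., x(steps)]."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(rnn_grn_step(traj[-1], w, beta, tau, dt))
    return traj


def fitness(observed, w, beta, tau, dt=1.0):
    """Mean squared error between an observed time series and the simulation."""
    simulated = simulate(observed[0], w, beta, tau, len(observed) - 1, dt)
    n = len(observed) * len(observed[0])
    return sum(
        (s - o) ** 2
        for s_row, o_row in zip(simulated, observed)
        for s, o in zip(s_row, o_row)
    ) / n


# Toy two-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
w = [[0.0, -2.0], [3.0, 0.0]]
beta = [0.5, -1.0]
tau = [2.0, 2.0]
trajectory = simulate([0.2, 0.8], w, beta, tau, steps=5)
```

A positive `w[i][j]` is read as activation and a negative one as repression, which is why the inferred weight matrix can be interpreted as a regulatory network.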

  14. GeneTrailExpress: a web-based pipeline for the statistical evaluation of microarray experiments

    Directory of Open Access Journals (Sweden)

    Kohlbacher Oliver

    2008-12-01

    Full Text Available Abstract Background High-throughput methods that allow for measuring the expression of thousands of genes or proteins simultaneously have opened new avenues for studying biochemical processes. While the noisiness of the data necessitates an extensive pre-processing of the raw data, the high dimensionality requires effective statistical analysis methods that facilitate the identification of crucial biological features and relations. For these reasons, the evaluation and interpretation of expression data is a complex, labor-intensive multi-step process. While a variety of tools for normalizing, analysing, or visualizing expression profiles has been developed in the last years, most of these tools offer only functionality for accomplishing certain steps of the evaluation pipeline. Results Here, we present a web-based toolbox that provides rich functionality for all steps of the evaluation pipeline. Our tool GeneTrailExpress offers besides standard normalization procedures powerful statistical analysis methods for studying a large variety of biological categories and pathways. Furthermore, an integrated graph visualization tool, BiNA, enables the user to draw the relevant biological pathways applying cutting-edge graph-layout algorithms. Conclusion Our gene expression toolbox with its interactive visualization of the pathways and the expression values projected onto the nodes will simplify the analysis and interpretation of biochemical pathways considerably.

  15. Unveiling network-based functional features through integration of gene expression into protein networks.

    Science.gov (United States)

    Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

    2018-06-01

    Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. Map-Based Cloning of the Gene Associated With the Soybean Maturity Locus E3

    Science.gov (United States)

    Watanabe, Satoshi; Hideshima, Rumiko; Xia, Zhengjun; Tsubokura, Yasutaka; Sato, Shusei; Nakamoto, Yumi; Yamanaka, Naoki; Takahashi, Ryoji; Ishimoto, Masao; Anai, Toyoaki; Tabata, Satoshi; Harada, Kyuya

    2009-01-01

    Photosensitivity plays an essential role in the response of plants to their changing environments throughout their life cycle. In soybean [Glycine max (L.) Merrill], several associations between photosensitivity and maturity loci are known, but only limited information at the molecular level is available. The FT3 locus is one of the quantitative trait loci (QTL) for flowering time that corresponds to the maturity locus E3. To identify the gene responsible for this QTL, a map-based cloning strategy was undertaken. One phytochrome A gene (GmPhyA3) was considered a strong candidate for the FT3 locus. Allelism tests and gene sequence comparisons showed that alleles of Misuzudaizu (FT3/FT3; JP28856) and Harosoy (E3/E3; PI548573) were identical. The GmPhyA3 alleles of Moshidou Gong 503 (ft3/ft3; JP27603) and L62-667 (e3/e3; PI547716) showed weak or complete loss of function, respectively. High red/far-red (R/FR) long-day conditions enhanced the effects of the E3/FT3 alleles in various genetic backgrounds. Moreover, a mutant line harboring the nonfunctional GmPhyA3 flowered earlier than the original Bay (E3/E3; PI553043) under similar conditions. These results suggest that the variation in phytochrome A may contribute to the complex systems of soybean flowering response and geographic adaptation. PMID:19474204

  17. Identification of New Single Nucleotide Polymorphism-Based Markers for Inter- and Intraspecies Discrimination of Obligate Bacterial Parasites (Pasteuria spp.) of Invertebrates

    Science.gov (United States)

    Mauchline, Tim H.; Knox, Rachel; Mohan, Sharad; Powers, Stephen J.; Kerry, Brian R.; Davies, Keith G.; Hirsch, Penny R.

    2011-01-01

    Protein-encoding and 16S rRNA genes of Pasteuria penetrans populations from a wide range of geographic locations were examined. Most interpopulation single nucleotide polymorphisms (SNPs) were detected in the 16S rRNA gene. However, in order to fully resolve all populations, these were supplemented with SNPs from protein-encoding genes in a multilocus SNP typing approach. Examination of individual 16S rRNA gene sequences revealed the occurrence of “cryptic” SNPs which were not present in the consensus sequences of any P. penetrans population. Additionally, hierarchical cluster analysis separated P. penetrans 16S rRNA gene clones into four groups, and one of which contained sequences from the most highly passaged population, demonstrating that it is possible to manipulate the population structure of this fastidious bacterium. The other groups were made from representatives of the other populations in various proportions. Comparison of sequences among three Pasteuria species, namely, P. penetrans, P. hartismeri, and P. ramosa, showed that the protein-encoding genes provided greater discrimination than the 16S rRNA gene. From these findings, we have developed a toolbox for the discrimination of Pasteuria at both the inter- and intraspecies levels. We also provide a model to monitor genetic variation in other obligate hyperparasites and difficult-to-culture microorganisms. PMID:21803895

  18. Identification of new single nucleotide polymorphism-based markers for inter- and intraspecies discrimination of obligate bacterial parasites (Pasteuria spp.) of invertebrates.

    Science.gov (United States)

    Mauchline, Tim H; Knox, Rachel; Mohan, Sharad; Powers, Stephen J; Kerry, Brian R; Davies, Keith G; Hirsch, Penny R

    2011-09-01

    Protein-encoding and 16S rRNA genes of Pasteuria penetrans populations from a wide range of geographic locations were examined. Most interpopulation single nucleotide polymorphisms (SNPs) were detected in the 16S rRNA gene. However, in order to fully resolve all populations, these were supplemented with SNPs from protein-encoding genes in a multilocus SNP typing approach. Examination of individual 16S rRNA gene sequences revealed the occurrence of "cryptic" SNPs which were not present in the consensus sequences of any P. penetrans population. Additionally, hierarchical cluster analysis separated P. penetrans 16S rRNA gene clones into four groups, and one of which contained sequences from the most highly passaged population, demonstrating that it is possible to manipulate the population structure of this fastidious bacterium. The other groups were made from representatives of the other populations in various proportions. Comparison of sequences among three Pasteuria species, namely, P. penetrans, P. hartismeri, and P. ramosa, showed that the protein-encoding genes provided greater discrimination than the 16S rRNA gene. From these findings, we have developed a toolbox for the discrimination of Pasteuria at both the inter- and intraspecies levels. We also provide a model to monitor genetic variation in other obligate hyperparasites and difficult-to-culture microorganisms.

  19. Gene-ontology enrichment analysis in two independent family-based samples highlights biologically plausible processes for autism spectrum disorders.

    LENUS (Irish Health Repository)

    Anney, Richard J L

    2012-02-01

    Recent genome-wide association studies (GWAS) have implicated a range of genes from discrete biological pathways in the aetiology of autism. However, despite the strong influence of genetic factors, association studies have yet to identify statistically robust, replicated major effect genes or SNPs. We apply the principle of the SNP ratio test methodology described by O'Dushlaine et al. to over 2100 families from the Autism Genome Project (AGP). Using a two-stage design we examine association enrichment in 5955 unique gene-ontology classifications across four groupings based on two phenotypic and two ancestral classifications. Based on estimates from simulation we identify an excess of association enrichment across all analyses. We observe enrichment in association for sets of genes involved in diverse biological processes, including pyruvate metabolism, transcription factor activation, cell-signalling and cell-cycle regulation. Both genes and processes that show enrichment have previously been examined in autistic disorders and lend biological plausibility to these findings.

  20. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  1. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty lies in the high dimensionality of the data and the small sample size. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN), Interval Valued Classification (IVC) and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  2. Detection of Fusarium verticillioides by PCR-ELISA based on FUM21 gene.

    Science.gov (United States)

    Omori, Aline Myuki; Ono, Elisabete Yurie Sataque; Bordini, Jaqueline Gozzi; Hirozawa, Melissa Tiemi; Fungaro, Maria Helena Pelegrinelli; Ono, Mario Augusto

    2018-08-01

    Fusarium verticillioides is a primary corn pathogen and fumonisin producer which is associated with toxic effects in humans and animals. The traditional methods for detection of fungal contamination based on morphological characteristics are time-consuming and show low sensitivity and specificity. Therefore, the objective of this study was to develop a PCR-ELISA based on the FUM21 gene for F. verticillioides detection. The DNA of the F. verticillioides, Fusarium sp., Aspergillus sp. and Penicillium sp. isolates was analyzed by conventional PCR and PCR-ELISA to determine the specificity. The PCR-ELISA was specific to F. verticillioides isolates, showed a 2.5 pg detection limit and was 100-fold more sensitive than conventional PCR. In corn samples inoculated with F. verticillioides conidia, the detection limit of the PCR-ELISA was 1 × 10⁴ conidia/g and was also 100-fold more sensitive than conventional PCR. Naturally contaminated corn samples were analyzed by PCR-ELISA based on the FUM21 gene and PCR-ELISA absorbance values correlated positively (p PCR-ELISA developed in this study can be useful for F. verticillioides detection in corn samples. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
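
To illustrate what an information content-based semantic similarity measure computes, here is a minimal sketch of the Resnik measure, one well-known member of that family. It does not use the A-DaGO-Fun package or its API; the toy ontology, term names and annotation counts are all hypothetical, and each gene is assumed to be annotated to exactly one term.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signalling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}

# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signalling": 4,
    "lipid_metabolism": 1,
    "sugar_metabolism": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t), where p(t) is the share of annotations at or below t."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)


def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


sibling_similarity = resnik("lipid_metabolism", "sugar_metabolism")
distant_similarity = resnik("lipid_metabolism", "signalling")
```

In this toy example the two metabolism terms share the informative ancestor `metabolism`, whereas a metabolism term and `signalling` share only the root, whose information content is zero.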

  4. PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs

    Directory of Open Access Journals (Sweden)

    Xue Fuzhong

    2010-01-01

    Full Text Available Abstract Background Genetic association study is currently the primary vehicle for identification and characterization of disease-predisposing variant(s), which usually involves multiple single-nucleotide polymorphisms (SNPs). However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of being able to account for correlations between neighbouring SNPs, yet the assumption of Hardy-Weinberg equilibrium (HWE) and a potentially large number of degrees of freedom can harm their statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard, but their performance varies with the method of extracting principal components (PCs). Results PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses the PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs, i.e., cases only (CAES), controls only (COES) and cases and controls combined (CES). Extraction of PCs with COES is preferred to that with CAES and CES. Performance of the test was examined via simulations as well as analyses of data on rheumatoid arthritis and heroin addiction; the test maintained the nominal level under the null hypothesis and showed performance comparable to that of the permutation test. Conclusions PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs.
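
The abstract does not specify how the bootstrap confidence interval is built, so the following is only a generic sketch of the idea: given PC scores that have already been computed for cases and controls, a percentile bootstrap interval is formed for the difference in mean score, and an association is flagged when the interval excludes zero. The function names, the choice of statistic and the toy scores are assumptions made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_ci(case_scores, control_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(case_scores) - mean(control_scores)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes.
        cases = [rng.choice(case_scores) for _ in case_scores]
        controls = [rng.choice(control_scores) for _ in control_scores]
        diffs.append(mean(cases) - mean(controls))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def is_associated(case_scores, control_scores, **kwargs):
    """Flag an association when the bootstrap interval excludes zero."""
    lower, upper = bootstrap_ci(case_scores, control_scores, **kwargs)
    return lower > 0 or upper < 0


# Hypothetical first-PC scores for five cases and five controls.
case_pc1 = [1.5, 2.0, 2.5, 1.8, 2.2]
control_pc1 = [0.1, 0.5, 0.3, 0.0, 0.4]
interval = bootstrap_ci(case_pc1, control_pc1)
```

Because the whole gene is summarised by a few PC scores, a single interval is tested per component instead of one test per SNP, which is how this kind of approach sidesteps the multiple-testing concern raised in the abstract.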

  5. Amino acid-substituted gemini surfactant-based nanoparticles as safe and versatile gene delivery agents.

    Science.gov (United States)

    Singh, Jagbir; Yang, Peng; Michel, Deborah; Verrall, Ronald E; Foldvari, Marianna; Badea, Ildiko

    2011-05-01

    Gene based therapy represents an important advance in the treatment of diseases that heretofore have had either no treatment or cure. To capitalize on the true potential of gene therapy, there is a need to develop better delivery systems that can protect these therapeutic biomolecules and deliver them safely to the target sites. Recently, we have designed and developed a series of novel amino acid-substituted gemini surfactants with the general chemical formula C12H25(CH3)2N+-(CH2)3-N(AA)-(CH2)3-N+(CH3)2-C12H25 (AA = glycine, lysine, glycyl-lysine and lysyl-lysine). These compounds were synthesized and tested in rabbit epithelial cells using a model plasmid and a helper lipid. Plasmid/gemini/lipid (P/G/L) nanoparticles formulated using these novel compounds achieved higher gene expression than the nanoparticles containing the parent unsubstituted compound. In this study, we evaluated the cytotoxicity of P/G/L nanoparticles and explored the relationship between transfection efficiency/toxicity and their physicochemical characteristics (such as size, binding properties, etc.). An overall low toxicity is observed for all complexes with no significant difference among substituted and unsubstituted compounds. An interesting result revealed by the dye exclusion assay suggests a more balanced protection of the DNA by the glycine and glycyl-lysine substituted compounds. Thus, the higher transfection efficiency is attributed to the greater biocompatibility and flexibility of the amino acid/peptide-substituted gemini surfactants and demonstrates the feasibility of using amino acid-substituted gemini surfactants as gene carriers for the treatment of diseases affecting epithelial tissue.

  6. Recurrent neural network based hybrid model for reconstructing gene regulatory network.

    Science.gov (United States)

    Raza, Khalid; Alam, Mansaf

    2016-10-01

    One of the exciting problems in systems biology research is to decipher how the genome controls the development of complex biological systems. Gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer fruitful information related to the functional role of individual genes in a cellular system. Discovering GRNs leads to a wide range of applications, including identification of disease-related pathways providing novel tentative drug targets, prediction of disease response, and assistance in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with a generalized extended Kalman filter for weight update in the backpropagation through time training algorithm. The RNN is a complex neural network that gives a better settlement between biological closeness and mathematical flexibility to model GRN, and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy, and the Kalman filter performs well for estimation problems even with noisy data. Hence, we applied a non-linear version of the Kalman filter, known as the generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks: the DNA SOS repair network, the IRMA network, and two synthetic networks from the DREAM Challenge. We performed a comparison of our results with other state-of-the-art techniques, which shows the superiority of our proposed model. Further, 5% Gaussian noise has been induced in the dataset, and the result of the proposed model shows a negligible effect of noise on results, demonstrating the noise tolerance capability of the model. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. A new measure for gene expression biclustering based on non-parametric correlation.

    Science.gov (United States)

    Flores, Jose L; Inza, Iñaki; Larrañaga, Pedro; Calvo, Borja

    2013-12-01

    One of the emerging techniques for performing the analysis of DNA microarray data, known as biclustering, is the search for subsets of genes and conditions which are coherently expressed. These subgroups provide clues about the main biological processes. Until now, different approaches to this problem have been proposed. Most of them use the mean squared residue as the quality measure, but relevant and interesting patterns, such as shifting or scaling patterns, cannot be detected. Furthermore, recent papers show that there exist new coherence patterns involved in different kinds of cancer and tumors, such as inverse relationships between genes, which cannot be captured. The proposed measure is called Spearman's biclustering measure (SBM), which performs an estimation of the quality of a bicluster based on the non-linear correlation among genes and conditions simultaneously. The search for biclusters is performed by using an evolutionary technique called estimation of distribution algorithms, which uses the SBM measure as the fitness function. This approach has been examined from different points of view by using artificial and real microarrays. The assessment process has involved the use of quality indexes, a set of bicluster patterns of reference including new patterns, and a set of statistical tests. The performance has also been examined using real microarrays and by comparison with different algorithmic approaches such as Bimax, CC, OPSM, Plaid and xMotifs. SBM shows several advantages, such as the ability to recognize more complex coherence patterns such as shifting, scaling and inversion, and the capability to selectively marginalize genes and conditions depending on the statistical significance. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
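
The sketch below is not the SBM measure itself, whose definition the abstract does not give. It only illustrates the property of Spearman rank correlation that such a measure can build on: because it depends on ranks alone, a shifted or scaled copy of an expression profile correlates at (approximately) +1 with the original, and an inverted copy at (approximately) -1. The expression values are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression profile of one gene across five conditions.
gene = [1.0, 3.0, 2.0, 5.0, 4.0]
shifted = [v + 10 for v in gene]   # shifting pattern
scaled = [3 * v for v in gene]     # scaling pattern
inverted = [-v for v in gene]      # inverse relationship

rho_shifted = spearman(gene, shifted)
rho_scaled = spearman(gene, scaled)
rho_inverted = spearman(gene, inverted)
```

A fitness function for a bicluster search could then, for instance, reward submatrices whose rows show high absolute rank correlation with one another; that aggregation step is a design choice left open here.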

  8. Lineage relationship of prostate cancer cell types based on gene expression

    Directory of Open Access Journals (Sweden)

    Ware Carol B

    2011-05-01

    Full Text Available Abstract Background Prostate tumor heterogeneity is a major factor in disease management. Heterogeneity could be due to multiple cancer cell types with distinct gene expression. Of clinical importance is the so-called cancer stem cell type. Cell type-specific transcriptomes are used to examine lineage relationship among cancer cell types and their expression similarity to normal cell types including stem/progenitor cells. Methods Transcriptomes were determined by Affymetrix DNA array analysis for the following cell types. Putative prostate progenitor cell populations were characterized and isolated by expression of the membrane transporter ABCG2. Stem cells were represented by embryonic stem and embryonal carcinoma cells. The cancer cell types were Gleason pattern 3 (glandular histomorphology) and pattern 4 (aglandular) sorted from primary tumors, cultured prostate cancer cell lines originally established from metastatic lesions, and xenografts LuCaP 35 (adenocarcinoma phenotype) and LuCaP 49 (neuroendocrine/small cell carcinoma) grown in mice. No gene expression differences were detected among serial passages of the LuCaP xenografts. Results Based on transcriptomes, the different cancer cell types could be clustered into a luminal-like grouping and a non-luminal-like (also not basal-like) grouping. The non-luminal-like types showed expression more similar to that of stem/progenitor cells than the luminal-like types. However, none showed expression of stem cell genes known to maintain stemness. Conclusions Non-luminal-like types are all representatives of aggressive disease, and this could be attributed to the similarity in overall gene expression to stem and progenitor cell types.

  9. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and

  10. Gene expression based evidence of innate immune response activation in the epithelium with oral lichen planus

    Science.gov (United States)

    Adami, Guy R.; Yeung, Alexander C.F.; Stucki, Grant; Kolokythas, Antonia; Sroussi, Herve Y.; Cabay, Robert J.; Kuzin, Igor; Schwartz, Joel L.

    2014-01-01

    Objective Oral lichen planus (OLP) is a disease of the oral mucosa of unknown cause producing lesions with an intense band-like inflammatory infiltrate of T cells to the subepithelium and keratinocyte cell death. We performed gene expression analysis of the oral epithelium of lesions in subjects with OLP and its sister disease, oral lichenoid reaction (OLR), in order to better understand the role of the keratinocytes in these diseases. Design Fourteen patients with OLP or OLR were included in the study, along with a control group of 23 subjects with a variety of oral diseases and a normal group of 17 subjects with no clinically visible mucosal abnormalities. Various proteins have been associated with OLP, based on detection of secreted proteins or changes in RNA levels in tissue samples consisting of epithelium, stroma, and immune cells. The mRNA level of twelve of these genes expressed in the epithelium was tested in the three groups. Results Four genes showed increased expression in the epithelium of OLP patients: CD14, CXCL1, IL8, and TLR1, and at least two of these proteins, TLR1 and CXCL1, were expressed at substantial levels in oral keratinocytes. Conclusions Because of the large accumulation of T cells in lesions of OLP it has long been thought to be an adaptive immunity malfunction. We provide evidence that there is increased expression of innate immune genes in the epithelium with this illness, suggesting a role for this process in the disease and a possible target for treatment. PMID:24581860

  11. KMgene: a unified R package for gene-based association analysis for complex traits.

    Science.gov (United States)

    Yan, Qi; Fang, Zhou; Chen, Wei; Stegle, Oliver

    2018-02-09

    In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model (GLMM) framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene. Availability: http://cran.r-project.org/web/packages/KMgene. Contact: qi.yan@chp.edu or wei.chen@chp.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.

  12. Genetic characterization of Italian field strains of Schmallenberg virus based on N and NSs genes.

    Science.gov (United States)

    Izzo, Francesca; Cosseddu, Gian Mario; Polci, Andrea; Iapaolo, Federica; Pinoni, Chiara; Capobianco Dondona, Andrea; Valleriani, Fabrizia; Monaco, Federica

    2016-08-01

    Following its first identification in Germany in 2011, the Schmallenberg virus (SBV) has rapidly spread to many other European countries. Despite the wide dissemination, the molecular characterization of the circulating strains is limited to German, Belgian, Dutch, and Swiss viruses. To fill this gap, partial genetic characterization of 15 Italian field strains was performed, based on S segment genes. Samples were collected in 2012 in two different regions where outbreaks occurred during distinct epidemic seasons. The comparative sequence analysis demonstrated a high molecular stability of the circulating viruses; nevertheless, we identified several variants of the N and NSs proteins not described in other SBV isolates circulating in Europe.

  13. Applications of gene-based technologies for improving animal production and health in developing countries

    International Nuclear Information System (INIS)

    Makkar, H.P.S.; Viljoen, G.J.

    2005-01-01

    This book provides a compilation of peer-reviewed scientific contributions from authoritative researchers attending an international symposium convened by the Animal Production and Health Sub-programme of the Animal Production and Health (APH), Joint FAO/IAEA Programme in cooperation with the Animal Production and Health Division of the FAO. These Proceedings contain invaluable information on the role and future potential of gene-based technologies for improving animal production and health, possible applications and constraints in the use of this technology in developing countries and their specific research needs

  14. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    Science.gov (United States)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers work only with labeled data and neglect the large amount of data that lacks sufficient follow-up information. Consequently, the small sample size limits the design of appropriate classifiers. In this paper, a transductive learning method is presented that combines a filtering strategy within the transductive framework with a progressive labeling strategy. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, and can effectively solve the problem of estimating the proportion of positive and negative samples in the working set. Our experimental results demonstrate that the proposed technique has great potential in cancer prediction based on gene expression.

  15. A Cas9-based toolkit to program gene expression in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Apel, Amanda Reider; d'Espaux, Leo; Wehrs, Maren

    2017-01-01

    of these parts via a web-based tool, that automates the generation of DNA fragments for integration. Our system builds upon existing gene editing methods in the thoroughness with which the parts are standardized and characterized, the types and number of parts available and the ease with which our methodology...... can be used to perform genetic edits in yeast. We demonstrated the applicability of this toolkit by optimizing the expression of a challenging but industrially important enzyme, taxadiene synthase (TXS). This approach enabled us to diagnose an issue with TXS solubility, the resolution of which yielded...

  16. Finding differentially expressed genes in high dimensional data: Rank based test statistic via a distance measure.

    Science.gov (United States)

    Mathur, Sunil; Sadana, Ajit

    2015-12-01

    We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and makes no assumption about the distribution of the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, the Wilcoxon signed rank test, and significance analysis of microarrays (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set. © The Author(s) 2011.

  17. Sensitive detection of novel Indian isolate of BTV 21 using ns1 gene based real-time PCR assay

    Directory of Open Access Journals (Sweden)

    Gaya Prasad

    2013-06-01

    Full Text Available Aim: The study was conducted to develop an ns1 gene based sensitive real-time RT-PCR assay for diagnosis of Indian isolates of bluetongue virus (BTV). Materials and Methods: The BTV serotype 21 isolate (KMNO7) was isolated from Andhra Pradesh and propagated in the BHK-21 cell line in our laboratory. The nucleic acid (dsRNA) of the virus was extracted using the Trizol method and cDNA was prepared using a standard protocol. The cDNA was subjected to ns1 gene based group-specific PCR to confirm the isolate as BTV. The viral RNA was diluted tenfold and the detection limit of ns1 gene based RT-PCR was determined. Finally, the tenfold-diluted viral RNA was subjected to real-time RT-PCR using ns1 gene primers and a TaqMan probe to standardize the reaction and determine the detection limit. Results: The ns1 gene based group-specific PCR showed a single 366 bp amplicon in agarose gel electrophoresis, confirming the sample as BTV. The ns1 gene RT-PCR using tenfold-diluted viral RNA showed a detection limit of 70.0 fg in 1% agarose gel electrophoresis. The ns1 gene based real-time RT-PCR was successfully standardized and the detection limit was found to be 7.0 fg. Conclusion: The ns1 gene based real-time RT-PCR was successfully standardized and was found to be 10 times more sensitive than conventional RT-PCR. Key words: bluetongue, BTV21, RT-PCR, Real time RT-PCR, ns1 gene [Vet World 2013; 6(8): 554-557]

  18. Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

    Science.gov (United States)

    Golden, Joseph W; Josleyn, Matthew; Mucker, Eric M; Hung, Chien-Fu; Loudon, Peter T; Wu, T C; Hooper, Jay W

    2012-01-01

    Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.

  19. Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

    Directory of Open Access Journals (Sweden)

    Joseph W Golden

    Full Text Available Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.

  20. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
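    The idea of learning a similarity metric by regression and then voting among nearest neighbors can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the two base similarity measures in `base_sims`, the ordinary least-squares fit, and the vote-fraction confidence are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def base_sims(x, y):
    """Two hypothetical base similarity measures, one per feature."""
    return [1.0 - abs(x[0] - y[0]), 1.0 - abs(x[1] - y[1])]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weights(train):
    """Regress 'same class' (1/0) on base similarities over all training pairs."""
    rows, targets = [], []
    for (x, lx), (y, ly) in combinations(train, 2):
        rows.append([1.0] + base_sims(x, y))  # leading 1.0 is the intercept
        targets.append(1.0 if lx == ly else 0.0)
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X^T X) w = X^T y


def predict(train, weights, query, k=3):
    """Vote among the k training items most similar under the learned metric."""
    def learned_sim(x):
        s = base_sims(x, query)
        return weights[0] + sum(w * v for w, v in zip(weights[1:], s))
    neighbors = sorted(train, key=lambda item: learned_sim(item[0]), reverse=True)[:k]
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / k  # vote fraction as a crude confidence score


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
train = [((0.10, 0.90), "A"), ((0.20, 0.10), "A"), ((0.15, 0.50), "A"),
         ((0.90, 0.20), "B"), ((0.80, 0.80), "B"), ((0.85, 0.45), "B")]
weights = fit_weights(train)
label, confidence = predict(train, weights, (0.20, 0.80))
print(weights, label, confidence)
```

    In this toy setting the regression assigns a much larger weight to the informative base similarity than to the noisy one, which is the behavior the abstract describes as locating neighbors likely to share the target's class.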

  1. Gene-based vaccine development for improving animal production in developing countries. Possibilities and constraints

    International Nuclear Information System (INIS)

    Egerton, J.R.

    2005-01-01

    For vaccine production, recombinant antigens must be protective. Identifying protective antigens or candidate antigens is an essential precursor to vaccine development. Even when a protective antigen has been identified, cloning of its gene does not lead directly to vaccine development. The fimbrial protein of Dichelobacter nodosus, the agent of foot-rot in ruminants, was known to be protective. Recombinant vaccines against this infection are ineffective if expressed protein subunits are not assembled as mature fimbriae. Antigenic competition between different, but closely related, recombinant antigens limited the use of multivalent vaccines based on this technology. Recombinant antigens may need adjuvants to enhance response. DNA vaccines, potentiated with genes for different cytokines, may replace the need for aggressive adjuvants, and especially where cellular immunity is essential for protection. The expression of antigens from animal pathogens in plants and the demonstration of some immunity to a disease like rinderpest after ingestion of these, suggests an alternative approach to vaccination by injection. Research on disease pathogenesis and the identification of candidate antigens is specific to the disease agent. The definition of expression systems and the formulation of a vaccine for each disease must be followed by research to establish safety and efficacy. Where vaccines are based on unique gene sequences, the intellectual property is likely to be protected by patent. Organizations, licensed to produce recombinant vaccines, expect to recover their costs and to make a profit. The consequence is that genetically-derived vaccines are expensive. The capacity of vaccines to help animal owners of poorer countries depends not only on quality and cost but also on the veterinary infrastructure where they are used. Ensuring the existence of an effective animal health infrastructure in developing countries is as great a challenge for the developed world as

  2. Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

    Science.gov (United States)

    Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

    2018-04-01

    The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was constructed based on the statistically significant pathways; its prognostic function was tested using the training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs may be used to effectively distinguish Luminal A samples with different prognoses. Nine pathways were screened as significant, and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity for both the training and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1, may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.

  3. Pea Marker Database (PMD) - A new online database combining known pea (Pisum sativum L.) gene-based markers.

    Science.gov (United States)

    Kulaeva, Olga A; Zhernakov, Aleksandr I; Afonin, Alexey M; Boikov, Sergei S; Sulima, Anton S; Tikhonovich, Igor A; Zhukov, Vladimir A

    2017-01-01

    Pea (Pisum sativum L.) is the oldest model object of plant genetics and one of the most agriculturally important legumes in the world. Since the pea genome has not been sequenced yet, identification of genes responsible for mutant phenotypes or desirable agricultural traits is usually performed via genetic mapping followed by candidate gene search. Such mapping is best carried out using gene-based molecular markers, as it opens the possibility of exploiting genome synteny between pea and its close relative Medicago truncatula Gaertn., which has a sequenced and annotated genome. In the last 5 years, a large number of pea gene-based molecular markers have been designed and mapped owing to the rapid evolution of "next-generation sequencing" technologies. However, access to the complete set of markers designed worldwide is limited because the data are not uniform and are therefore hard to use. The Pea Marker Database was designed to combine the information about pea markers in the form of a user-friendly and practical online tool. Version 1 (PMD1) comprises information about 2484 genic markers, including their locations in linkage groups, the sequences of corresponding pea transcripts and the names of related genes in M. truncatula. Version 2 (PMD2) is an updated version comprising 15944 pea markers in the same format with several advanced features. To test the performance of the PMD, fine mapping of pea symbiotic genes Sym13 and Sym27 in linkage groups VII and V, respectively, was carried out. The results of mapping allowed us to propose the Sen1 gene (a homologue of the SEN1 gene of Lotus japonicus (Regel) K. Larsen) as the best candidate gene for Sym13, and to narrow the list of possible candidate genes for Sym27 to ten, thus proving PMD to be useful for pea gene mapping and cloning. All information contained in PMD1 and PMD2 is available at www.peamarker.arriam.ru.

  4. Area-Specific Cell Stimulation via Surface-Mediated Gene Transfer Using Apatite-Based Composite Layers

    Directory of Open Access Journals (Sweden)

    Yushin Yazaki

    2015-04-01

    Full Text Available Surface-mediated gene transfer systems using biocompatible calcium phosphate (CaP)-based composite layers have attracted attention as a tool for controlling cell behaviors. In the present study we aimed to demonstrate the potential of CaP-based composite layers to mediate area-specific dual gene transfer and to stimulate cells on an area-by-area basis in the same well. For this purpose we prepared two pairs of DNA–fibronectin–apatite composite (DF-Ap) layers using a pair of reporter genes and a pair of differentiation factor genes. The results of the area-specific dual gene transfer successfully demonstrated that the cells cultured on a pair of DF-Ap layers that were adjacently placed in the same well showed specific gene expression patterns depending on the gene that was immobilized in the underlying layer. Moreover, preliminary real-time PCR results indicated that multipotential C3H10T1/2 cells may have a potential to change into different types of cells depending on the differentiation factor gene that was immobilized in the underlying layer, even in the same well. Because DF-Ap layers have a potential to mediate area-specific cell stimulation on their surfaces, they could be useful in tissue engineering applications.

  5. Microarray-Based Gene Expression Profiling to Elucidate Effectiveness of Fermented Codonopsis lanceolata in Mice

    Directory of Open Access Journals (Sweden)

    Woon Yong Choi

    2014-04-01

    Full Text Available In this study, the effect of Codonopsis lanceolata fermented by lactic acid on controlling gene expression levels related to obesity was observed in an oligonucleotide chip microarray. Among 8170 genes, 393 genes were up regulated and 760 genes were down regulated with feeding of the fermented C. lanceolata (FCL). Another 374 genes were up regulated and 527 genes down regulated without feeding the sample. The genes were not affected by the FCL sample. It was interesting that among those genes, cytochrome P450, Dmbt1, LOC76487, and thyroid hormones, etc., were mostly up or down regulated. These genes are more related to lipid synthesis. We could conclude that the FCL possibly controlled the gene expression levels related to lipid synthesis, which resulted in reducing obesity. However, more detailed protein expression experiments should be carried out.

  6. A gene-based analysis of variants in the serum/glucocorticoid regulated kinase (SGK) genes with blood pressure responses to sodium intake: the GenSalt Study.

    Directory of Open Access Journals (Sweden)

    Changwei Li

    Full Text Available Serum and glucocorticoid regulated kinase (SGK) plays a critical role in the regulation of renal sodium transport. We examined the association between SGK genes and salt sensitivity of blood pressure (BP) using single-marker and gene-based association analysis. A 7-day low-sodium intervention (51.3 mmol sodium/day) followed by a 7-day high-sodium intervention (307.8 mmol sodium/day) was conducted among 1,906 Chinese participants. BP measurements were obtained at baseline and each intervention using a random-zero sphygmomanometer. Additive associations between each SNP and salt-sensitivity phenotypes were assessed using a mixed linear regression model to account for family dependencies. Gene-based analyses were conducted using the truncated p-value method. The Bonferroni method was used to adjust for multiple testing in all analyses. In single-marker association analyses, SGK1 marker rs2758151 was significantly associated with diastolic BP (DBP) response to high-sodium intervention (P = 0.0010). DBP responses (95% confidence interval) to high-sodium intervention for genotypes C/C, C/T, and T/T were 2.04 (1.57 to 2.52), 1.79 (1.42 to 2.16), and 0.85 (0.30 to 1.41) mmHg, respectively. Similar trends were observed for SBP and MAP responses although not significant (P = 0.15 and 0.0026, respectively). In addition, gene-based analyses demonstrated significant associations between SGK1 and SBP, DBP and MAP responses to high sodium intervention (P = 0.0002, 0.0076, and 0.00001, respectively). Neither SGK2 nor SGK3 was associated with the salt-sensitivity phenotypes in single-marker or gene-based analyses. The current study identified association of the SGK1 gene and BP salt-sensitivity in the Han Chinese population. Further studies are warranted to identify causal SGK1 gene variants.

  7. Association Study between BDNF Gene Polymorphisms and Autism by Three-Dimensional Gel-Based Microarray

    Directory of Open Access Journals (Sweden)

    Zuhong Lu

    2009-06-01

    Full Text Available Single nucleotide polymorphisms (SNPs) are important markers which can be used in association studies searching for susceptibility genes of complex diseases. High-throughput methods are needed for SNP genotyping in a large number of samples. In this study, we applied a polyacrylamide gel-based microarray combined with dual-color hybridization for an association study of four BDNF polymorphisms with autism. All the SNPs in both patients and controls could be analyzed quickly and correctly. Among the four SNPs, only the C270T polymorphism showed significant differences in the frequency of the allele (χ2 = 7.809, p = 0.005) and genotype (χ2 = 7.800, p = 0.020). In the haplotype association analysis, there was a significant difference in global haplotype distribution between the groups (χ2 = 28.19, p = 3.44e-005). We suggest that BDNF has a possible role in the pathogenesis of autism. The study also shows that the polyacrylamide gel-based microarray combined with dual-color hybridization is a rapid, simple and high-throughput method for SNP genotyping, and can be used for association studies of susceptibility genes with disorders in large samples.
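    As a worked check of the reported statistics, the short sketch below converts the two C270T chi-square values into p-values. The degrees of freedom are an assumption, since the abstract does not state them: 1 for the allele comparison (a 2x2 table) and 2 for the genotype comparison (a 2x3 table). The helper `chi2_sf` is written for this illustration and covers only those two cases.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or df = 2."""
    if df == 1:
        # For one degree of freedom the tail equals erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # For two degrees of freedom the tail equals exp(-x / 2).
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")


p_allele = chi2_sf(7.809, 1)     # assumed df = 1 (two alleles, two groups)
p_genotype = chi2_sf(7.800, 2)   # assumed df = 2 (three genotypes, two groups)
print(round(p_allele, 3), round(p_genotype, 3))
```

    Under these assumed degrees of freedom the computed values round to 0.005 and 0.020, matching the p-values quoted in the abstract.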

  8. Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

    Science.gov (United States)

    Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

    2011-08-01

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    Science.gov (United States)

    Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

    2009-01-01

    Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis based on selected gene profiles is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid such misleading outcomes, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932
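    The notion of a decision that names one, some, or all classes can be illustrated with a much simpler stand-in than the ν-1-SVM method of the paper. The sketch below assumes class posterior estimates are already available and returns the smallest set of classes whose cumulative posterior reaches a confidence level. The function name, the threshold, and the class labels are hypothetical, and no asymmetric penalties are modeled.

```python
def class_selective_decision(posteriors, confidence=0.9):
    """Return the smallest set of classes whose posteriors sum to >= confidence."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for label, p in ranked:
        chosen.append(label)
        total += p
        if total >= confidence:
            break
    return chosen


# Hypothetical posterior estimates for three tumor classes.
clear_case = class_selective_decision({"ALL": 0.95, "AML": 0.03, "MLL": 0.02})
ambiguous = class_selective_decision({"ALL": 0.55, "AML": 0.40, "MLL": 0.05})
uninformative = class_selective_decision({"ALL": 1 / 3, "AML": 1 / 3, "MLL": 1 / 3})
print(clear_case, ambiguous, uninformative)
```

    A single-class output corresponds to an ordinary diagnosis, a two-class output rejects the patient from only the remaining class, and an output naming every class amounts to full rejection.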

  10. Optimal consistency in microRNA expression analysis using reference-gene-based normalization.

    Science.gov (United States)

    Wang, Xi; Gardiner, Erin J; Cairns, Murray J

    2015-05-01

    Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
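    The contrast between reference-gene-based and global normalization can be shown on a toy example. This sketch is not the authors' pipeline: the probe names, the log2 values, and the use of median centering as the "global" method are assumptions. The case sample carries a technical offset of +1 on every probe plus a genuine downward shift of 2 on all three miRNAs.

```python
from statistics import mean, median


def rgb_normalize(sample, reference_ids):
    """Subtract the mean log2 signal of the reference genes from every probe."""
    offset = mean(sample[g] for g in reference_ids)
    return {g: v - offset for g, v in sample.items()}


def median_center(sample):
    """Global normalization stand-in: subtract the sample-wide median."""
    offset = median(sample.values())
    return {g: v - offset for g, v in sample.items()}


refs = ["ref1", "ref2"]            # hypothetical constitutively expressed genes
mirnas = ["miR-a", "miR-b", "miR-c"]
control = {"ref1": 10.0, "ref2": 12.0, "miR-a": 8.0, "miR-b": 9.0, "miR-c": 7.0}
case = {"ref1": 11.0, "ref2": 13.0, "miR-a": 7.0, "miR-b": 8.0, "miR-c": 6.0}

rgb_case, rgb_ctrl = rgb_normalize(case, refs), rgb_normalize(control, refs)
glob_case, glob_ctrl = median_center(case), median_center(control)

rgb_diff = {m: rgb_case[m] - rgb_ctrl[m] for m in mirnas}
global_diff = {m: glob_case[m] - glob_ctrl[m] for m in mirnas}
print(rgb_diff)      # the shared downward shift of the miRNAs is retained
print(global_diff)   # median centering absorbs the shift
```

    In this constructed case the reference-gene offset removes only the technical component, so the global miRNA shift survives, whereas median centering flattens it, which is the propensity the abstract attributes to global normalization.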

  11. rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

    Science.gov (United States)

    Guo, Liyuan; Wang, Jing

    2018-01-04

    Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) Web functions were modified according to data content, and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data queries for detailed information (related elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (iii) The types of regulatory elements were modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis

    Directory of Open Access Journals (Sweden)

    Stajich Jason E

    2006-11-01

    Background: To date, most fungal phylogenies have been derived from single gene comparisons, or from concatenated alignments of a small number of genes. The increase in fungal genome sequencing presents an opportunity to reconstruct evolutionary events using entire genomes. As a tool for future comparative, phylogenomic and phylogenetic studies, we used both supertrees and concatenated alignments to infer relationships between 42 species of fungi for which complete genome sequences are available. Results: A dataset of 345,829 genes was extracted from 42 publicly available fungal genomes. Supertree methods were employed to derive phylogenies from 4,805 single gene families. We found that the average consensus supertree method may suffer from long-branch attraction artifacts, while matrix representation with parsimony (MRP) appears to be immune from these. A genome phylogeny was also reconstructed from a concatenated alignment of 153 universally distributed orthologs. Our MRP supertree and concatenated phylogeny are highly congruent. Within the Ascomycota, the sub-phyla Pezizomycotina and Saccharomycotina were resolved. Both phylogenies infer that the Leotiomycetes are the closest sister group to the Sordariomycetes. There is some ambiguity regarding the placement of Stagonospora nodorum, the sole member of the class Dothideomycetes present in the dataset. Within the Saccharomycotina, a monophyletic clade containing organisms that translate CTG as serine instead of leucine is evident. There is also strong support for two groups within the CTG clade, one containing the fully sexual species Candida lusitaniae, Candida guilliermondii and Debaryomyces hansenii, and the second group containing Candida albicans, Candida dubliniensis, Candida tropicalis, Candida parapsilosis and Lodderomyces elongisporus. The second major clade within the Saccharomycotina contains species whose genomes have undergone a whole genome duplication (WGD), and their close
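
    The abstract above names matrix representation with parsimony (MRP) without describing its encoding step. As a rough illustration only, the sketch below uses the standard Baum-Ragan coding, in which every clade of every source tree becomes one binary character; whether the authors used exactly this coding is an assumption, the parsimony search itself is not shown, and the taxa A to E and the input layout are hypothetical.

```python
def mrp_matrix(trees, taxa):
    """Encode source trees as an MRP character matrix (Baum-Ragan coding).

    trees: list of (tree_taxa, clades) pairs, where tree_taxa is the set of
           taxa present in that source tree and clades is a list of sets,
           one per internal clade of the tree (hypothetical input layout).
    taxa:  ordered list of all taxa across the source trees.
    Returns a dict mapping each taxon to its character string.
    """
    columns = []
    for tree_taxa, clades in trees:
        for clade in clades:
            column = {}
            for taxon in taxa:
                if taxon not in tree_taxa:
                    column[taxon] = "?"   # taxon absent from this source tree
                elif taxon in clade:
                    column[taxon] = "1"   # taxon inside the clade
                else:
                    column[taxon] = "0"   # taxon in the tree, outside the clade
            columns.append(column)
    return {t: "".join(col[t] for col in columns) for t in taxa}


# Two toy source trees with partially overlapping taxon sets.
toy_trees = [
    ({"A", "B", "C", "D"}, [{"A", "B"}, {"A", "B", "C"}]),
    ({"B", "C", "D", "E"}, [{"D", "E"}]),
]
toy_matrix = mrp_matrix(toy_trees, ["A", "B", "C", "D", "E"])
```

    The resulting matrix would then be handed to a parsimony program to search for the supertree; missing taxa are coded as "?" so that trees with different taxon sets can be combined.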

  13. Molecular cloning of a Candida albicans gene (SSB1) coding for a protein related to the Hsp70 family.

    Science.gov (United States)

    Maneu, V; Cervera, A M; Martinez, J P; Gozalbo, D

    1997-06-15

    We have cloned and sequenced a Candida albicans gene (SSB1) encoding a potential member of the heat-shock protein seventy (hsp70) family. The protein encoded by this gene contains 613 amino acids and shows a high degree (85%) of sequence identity to the ssb subfamily (ssb1 and ssb2) of the Saccharomyces cerevisiae hsp70 family. The transcribed mRNA (2.1 kb) is present in similar amounts both in yeast and germ tube cells of C. albicans.

  14. Network-based differential gene expression analysis suggests cell cycle related genes regulated by E2F1 underlie the molecular difference between smoker and non-smoker lung adenocarcinoma

    Science.gov (United States)

    2013-01-01

    Background: Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results: In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions: In this paper, we develop nDGE to prioritize

  15. Sexual selection, genetic conflict, selfish genes, and the atypical patterns of gene expression in spermatogenic cells.

    Science.gov (United States)

    Kleene, Kenneth C

    2005-01-01

    This review proposes that the peculiar patterns of gene expression in spermatogenic cells are the consequence of powerful evolutionary forces known as sexual selection. Sexual selection is generally characterized by intense competition of males for females, an enormous variety of the strategies to maximize male reproductive success, exaggerated male traits at all levels of biological organization, co-evolution of sexual traits in males and females, and conflict between the sexual advantage of the male trait and the reproductive fitness of females and the individual fitness of both sexes. In addition, spermatogenesis is afflicted by selfish genes that promote their transmission to progeny while causing deleterious effects. Sexual selection, selfish genes, and genetic conflict provide compelling explanations for many atypical features of gene expression in spermatogenic cells including the gross overexpression of certain mRNAs, transcripts encoding truncated proteins that cannot carry out basic functions of the proteins encoded by the same genes in somatic cells, the large number of gene families containing paralogous genes encoding spermatogenic cell-specific isoforms, the large number of testis-cancer-associated genes that are expressed only in spermatogenic cells and malignant cells, and the overbearing role of Sertoli cells in regulating the number and quality of spermatozoa.

  16. New tools for Mendelian disease gene identification: PhenoDB variant analysis module; and GeneMatcher, a web-based tool for linking investigators with an interest in the same gene.

    Science.gov (United States)

    Sobreira, Nara; Schiettecatte, François; Boehm, Corinne; Valle, David; Hamosh, Ada

    2015-04-01

    Identifying the causative variant from among the thousands identified by whole-exome sequencing or whole-genome sequencing is a formidable challenge. To make this process as efficient and flexible as possible, we have developed a Variant Analysis Module coupled to our previously described Web-based phenotype intake tool, PhenoDB (http://researchphenodb.net and http://phenodb.org). When a small number of candidate-causative variants have been identified in a study of a particular patient or family, a second, more difficult challenge becomes proof of causality for any given variant. One approach to this problem is to find other cases with a similar phenotype and mutations in the same candidate gene. Alternatively, it may be possible to develop biological evidence for causality, an approach that is assisted by making connections to basic scientists studying the gene of interest, often in the setting of a model organism. Both of these strategies benefit from an open access, online site where individual clinicians and investigators could post genes of interest. To this end, we developed GeneMatcher (http://genematcher.org), a freely accessible Website that enables connections between clinicians and researchers across the world who share an interest in the same gene(s). © 2015 WILEY PERIODICALS, INC.

  17. The FTO (fat mass and obesity associated) gene codes for a novel member of the non-heme dioxygenase superfamily

    Directory of Open Access Journals (Sweden)

    Andrade-Navarro Miguel A

    2007-11-01

    Background: Genetic variants in the FTO (fat mass and obesity associated) gene have been associated with an increased risk of obesity. However, the function of its protein product has not been experimentally studied and previously reported sequence similarity analyses suggested the absence of homologs in existing protein databases. Here, we present the first detailed computational analysis of the sequence and predicted structure of the protein encoded by FTO. Results: We performed a sequence similarity search using the human FTO protein as query and then generated a profile using the multiple sequence alignment of the homologous sequences. Profile-to-sequence and profile-to-profile based comparisons identified remote homologs of the non-heme dioxygenase family. Conclusion: Our analysis suggests that human FTO is a member of the non-heme dioxygenase (Fe(II)- and 2-oxoglutarate-dependent dioxygenases) superfamily. Amino acid conservation patterns support this hypothesis and indicate that both 2-oxoglutarate and iron should be important for FTO function. This computational prediction of the function of FTO should suggest further steps for its experimental characterization and help to formulate hypotheses about the mechanisms by which it relates to obesity in humans.

  18. Pathway-based analysis of a melanoma genome-wide association study: analysis of genes related to tumour-immunosuppression.

    Directory of Open Access Journals (Sweden)

    Nils Schoof

    Systemic immunosuppression is a risk factor for melanoma, and sunburn-induced immunosuppression is thought to be causal. Genes in immunosuppression pathways are therefore candidate melanoma-susceptibility genes. If variants within these genes individually have a small effect on disease risk, the association may be undetected in genome-wide association (GWA) studies due to low power to reach a high significance level. Pathway-based approaches have been suggested as a method of incorporating a priori knowledge into the analysis of GWA studies. In this study, the association of 1113 single nucleotide polymorphisms (SNPs) in 43 genes (39 genomic regions) related to immunosuppression was analysed using a gene-set approach in 1539 melanoma cases and 3917 controls from the GenoMEL consortium GWA study. The association between melanoma susceptibility and the whole set of tumour-immunosuppression genes, and also predefined functional subgroups of genes, was considered. The analysis was based on a measure formed by summing the evidence from the most significant SNP in each gene, and significance was evaluated empirically by case-control label permutation. An association was found between melanoma and the complete set of genes (p(emp)=0.002), as well as the subgroups related to the generation of tolerogenic dendritic cells (p(emp)=0.006) and secretion of suppressive factors (p(emp)=0.0004), thus providing preliminary evidence of involvement of tumour-immunosuppression gene polymorphisms in melanoma susceptibility. The analysis was repeated on a second phase of the GenoMEL study, which showed no evidence of an association. As one of the first attempts to replicate a pathway-level association, our results suggest that low power and heterogeneity may present challenges.
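
    The gene-set measure described above (sum, over genes, of the evidence from each gene's most significant SNP, with significance assessed by permuting case-control labels) can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's implementation: the per-SNP statistic used here (squared difference in mean allele count between cases and controls) is a stand-in, and all function names and the input layout are hypothetical.

```python
import random


def snp_stat(genotypes, labels):
    """Stand-in per-SNP evidence: squared difference in mean allele count
    between cases (label 1) and controls (label 0). The study's actual
    per-SNP test is not given in the abstract."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    diff = sum(cases) / len(cases) - sum(controls) / len(controls)
    return diff * diff


def gene_set_stat(snps_by_gene, labels):
    """Sum over genes of the strongest single-SNP evidence in each gene."""
    return sum(
        max(snp_stat(genotypes, labels) for genotypes in snps)
        for snps in snps_by_gene.values()
    )


def empirical_p(snps_by_gene, labels, n_perm=1000, seed=0):
    """Empirical p-value from case-control label permutation."""
    rng = random.Random(seed)
    observed = gene_set_stat(snps_by_gene, labels)
    permuted = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)  # keeps the numbers of cases and controls fixed
        if gene_set_stat(snps_by_gene, permuted) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

    Permuting labels rather than genotypes keeps the correlation between SNPs within each gene intact, which is why the maximum-per-gene statistic can be calibrated without an analytic null distribution.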

  19. Genetic diversity of perch rhabdoviruses isolates based on the nucleoprotein and glycoprotein genes.

    Science.gov (United States)

    Talbi, Chiraz; Cabon, Joelle; Baud, Marine; Bourjaily, Maya; de Boisséson, Claire; Castric, Jeannette; Bigarré, Laurent

    2011-12-01

    Despite the increasing impact of rhabdoviruses in European percid farming, the diversity of the viral populations is still poorly investigated. To address this issue, we sequenced the partial nucleoprotein (N) and complete glycoprotein (G) genes of nine rhabdoviruses isolated from perch (Perca fluviatilis) between 1999 and 2010, mostly from France, and analyzed six of them by immunofluorescence antibody test (IFAT). Using two rabbit antisera raised against either the reference perch rhabdovirus (PRhV) isolated in 1980 or the perch isolate R6146, two serogroups were distinguished. Meanwhile, based on partial N and complete G gene analysis, perch rhabdoviruses were divided into four genogroups, A, B, D and E, with a maximum of 32.9% divergence (G gene) between isolates. A comparison of the G amino acid sequences of isolates from the two identified serogroups revealed several variable regions that might account for antigenic differences. Comparative analysis of perch isolates with other rhabdoviruses isolated from black bass, pike-perch and pike showed some strong phylogenetic relationships, suggesting cross-host transmission. Similarly, striking genetic similarities were shown between perch rhabdoviruses and isolates from other European countries and various ecological niches, most likely reflecting the circulation of viruses through fish trade as well as putative transfers from marine to freshwater fish. Phylogenetic relationships of the newly characterized viruses were also determined within the family Rhabdoviridae. The analysis revealed a genetic cluster containing only fish viruses, including all rhabdoviruses from perch, as well as siniperca chuatsi rhabdovirus (SCRV) and eel virus X (EVEX). This cluster was distinct from the one represented by spring viraemia of carp vesiculovirus (SVCV), pike fry rhabdovirus (PFRV) and mammalian vesiculoviruses. The new genetic data provided in the present study shed light on the diversity of rhabdoviruses infecting perch in

  20. Genomic DNA-based absolute quantification of gene expression in Vitis.

    Science.gov (United States)

    Gambetta, Gregory A; McElrone, Andrew J; Matthews, Mark A

    2013-07-01

    Many studies in which gene expression is quantified by polymerase chain reaction represent the expression of a gene of interest (GOI) relative to that of a reference gene (RG). Relative expression is founded on the assumptions that RG expression is stable across samples, treatments, organs, etc., and that reaction efficiencies of the GOI and RG are equal; assumptions which are often faulty. The true variability in RG expression and actual reaction efficiencies are seldom determined experimentally. Here we present a rapid and robust method for absolute quantification of expression in Vitis where varying concentrations of genomic DNA were used to construct GOI standard curves. This methodology was utilized to absolutely quantify and determine the variability of the previously validated RG ubiquitin (VvUbi) across three test studies in three different tissues (roots, leaves and berries). In addition, in each study a GOI was absolutely quantified. Data sets resulting from relative and absolute methods of quantification were compared and the differences were striking. VvUbi expression was significantly different in magnitude between test studies and variable among individual samples. Absolute quantification consistently reduced the coefficients of variation of the GOIs by more than half, often resulting in differences in statistical significance and in some cases even changing the fundamental nature of the result. Utilizing genomic DNA-based absolute quantification is fast and efficient. Through eliminating error introduced by assuming RG stability and equal reaction efficiencies between the RG and GOI this methodology produces less variation, increased accuracy and greater statistical power. © 2012 Scandinavian Plant Physiology Society.
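
    To make the standard-curve idea in the abstract above concrete, the sketch below fits a line of threshold cycle (Ct) against log10 of known template copy number and inverts it for an unknown sample. This is the generic qPCR standard-curve calculation, offered as an assumption about how such curves are typically used rather than as the authors' exact procedure; the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Reaction efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct_value, slope, intercept):
    """Invert the standard curve to estimate absolute template copies."""
    return 10 ** ((ct_value - intercept) / slope)
```

    Because each gene of interest gets its own curve, the efficiency is measured rather than assumed equal to that of a reference gene, which is the assumption the abstract identifies as often faulty.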

  1. Differential gene expression in an elite hybrid rice cultivar (Oryza sativa L.) and its parental lines based on SAGE data

    Directory of Open Access Journals (Sweden)

    Chen Chen

    2007-09-01

    Background: It was proposed that differentially-expressed genes, aside from genetic variations affecting protein processing and functioning, between hybrid and its parents provide essential candidates for studying heterosis or hybrid vigor. Based on our serial analysis of gene expression (SAGE) data from an elite Chinese super-hybrid rice (LYP9) and its parental cultivars (93-11 and PA64s) in three major tissue types (leaves, roots and panicles) at different developmental stages, we analyzed the transcriptome and looked for candidate genes related to rice heterosis. Results: By using an improved strategy of tag-to-gene mapping and two recently annotated genome assemblies (93-11 and PA64s), we identified 10,268 additional high-quality tags, reaching a grand total of 20,595 together with our previous result. We further detected 8.5% and 5.9% physically-mapped genes that are differentially-expressed among the triad (in at least one of the three stages) with P-values less than 0.05 and 0.01, respectively. These genes were distributed among 12 major gene expression patterns; among them, 406 up-regulated and 469 down-regulated genes (P Conclusion: We improved the tag-to-gene mapping strategy by combining information from transcript sequences and rice genome annotation, and obtained a more comprehensive view of genes related to rice heterosis. The candidates for heterosis-related genes among different genotypes provide a new avenue for exploring the molecular mechanism underlying heterosis.

  2. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    Directory of Open Access Journals (Sweden)

    Kelsey E. Grinde

    2017-09-01

    To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10^-6) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures.
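
    The abstract above does not spell out its bootstrap procedure, so the following is only a generic sketch of one common way to correct selection bias by resampling: in each bootstrap replicate that passes the significance filter, compare the effect estimate from the resampled individuals with the estimate from the individuals left out, and subtract the average difference from the naive estimate. It should not be read as the published method; the function name and the `estimate` and `is_significant` callables are hypothetical and must be supplied by the user.

```python
import random


def bootstrap_corrected_estimate(data, estimate, is_significant,
                                 n_boot=500, seed=0):
    """Generic bootstrap correction for bias from conditioning on significance.

    data:           list of per-individual records
    estimate:       callable mapping a list of records to an effect estimate
    is_significant: callable mapping a list of records to True/False
                    (for example, a gene-based test at a chosen threshold)
    """
    rng = random.Random(seed)
    n = len(data)
    differences = []
    for _ in range(n_boot):
        indices = [rng.randrange(n) for _ in range(n)]
        drawn = set(indices)
        in_sample = [data[i] for i in indices]
        out_sample = [data[i] for i in range(n) if i not in drawn]
        # Only replicates that pass the filter mimic the selection step.
        if not out_sample or not is_significant(in_sample):
            continue
        differences.append(estimate(in_sample) - estimate(out_sample))
    naive = estimate(data)
    bias = sum(differences) / len(differences) if differences else 0.0
    return naive - bias
```

    The left-out individuals play the role of an independent replication sample, so the in-sample minus out-of-sample difference estimates how much the filter inflates the effect; if no replicate passes the filter, the naive estimate is returned unchanged.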

  3. Genome-wide conserved non-coding microsatellite (CNMS) marker-based integrative genetical genomics for quantitative dissection of seed weight in chickpea.

    Science.gov (United States)

    Bajaj, Deepak; Saxena, Maneesha S; Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Tripathi, Shailesh; Upadhyaya, Hari D; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-03-01

    Phylogenetic footprinting identified 666 genome-wide paralogous and orthologous CNMS (conserved non-coding microsatellite) markers from 5'-untranslated and regulatory regions (URRs) of 603 protein-coding chickpea genes. The (CT)n and (GA)n CNMS carrying CTRMCAMV35S and GAGA8BKN3 regulatory elements, respectively, are abundant in the chickpea genome. The mapped genic CNMS markers with robust amplification efficiencies (94.7%) detected higher intraspecific polymorphic potential (37.6%) among genotypes, implying their immense utility in chickpea breeding and genetic analyses. Seventeen differentially expressed CNMS marker-associated genes showing strong preferential and seed tissue/developmental stage-specific expression in contrasting genotypes were selected to narrow down the gene targets underlying seed weight quantitative trait loci (QTLs)/eQTLs (expression QTLs) through integrative genetical genomics. The integration of transcript profiling with seed weight QTL/eQTL mapping, molecular haplotyping, and association analyses identified potential molecular tags (GAGA8BKN3 and RAV1AAT regulatory elements and alleles/haplotypes) in the LOB-domain-containing protein- and KANADI protein-encoding transcription factor genes controlling the cis-regulated expression for seed weight in the chickpea. This emphasizes the potential of CNMS marker-based integrative genetical genomics for the quantitative genetic dissection of complex seed weight in chickpea. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  4. Systems Pharmacology-Based Approach of Connecting Disease Genes in Genome-Wide Association Studies with Traditional Chinese Medicine.

    Science.gov (United States)

    Kim, Jihye; Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kang, Jaewoo; Tan, Aik Choon

    2018-01-01

    Traditional Chinese medicine (TCM), which originated in ancient China, has been practiced for thousands of years to treat various symptoms and diseases. However, the molecular mechanisms of TCM in treating these diseases remain unknown. In this study, we employ a systems pharmacology-based approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. We studied 102 TCM components and their target genes by analyzing microarray gene expression experiments. We constructed disease-gene networks from 2558 GWAS studies. We applied a systems pharmacology approach to prioritize disease-target genes. Using this bioinformatics approach, we analyzed 14,713 GWAS disease-TCM-target gene pairs and identified 115 disease-gene pairs with q value < 0.2. We validated several of these GWAS disease-TCM-target gene pairs with literature evidence, demonstrating that this computational approach could reveal novel indications for TCM. We also developed the TCM-Disease web application to facilitate TCM drug repurposing efforts. Systems pharmacology is a promising approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. The computational approaches described in this study could easily be extended to other disease-gene network analyses.

  5. Ecdysone Receptor-based Singular Gene Switches for Regulated Transgene Expression in Cells and Adult Rodent Tissues

    Directory of Open Access Journals (Sweden)

    Seoghyun Lee

    2016-01-01

    Controlled gene expression is an indispensable technique in biomedical research. Here, we report a convenient, straightforward, and reliable way to induce expression of a gene of interest with negligible background expression compared to the most widely used tetracycline (Tet)-regulated system. Exploiting a Drosophila ecdysone receptor (EcR)-based gene regulatory system, we generated nonviral and adenoviral singular vectors designated as pEUI(+) and pENTR-EUI, respectively, which contain all the required elements to guarantee regulated transgene expression (GAL4-miniVP16-EcR, termed GvEcR hereafter, and 10 tandem repeats of an upstream activation sequence promoter followed by a multiple cloning site). Through the transient and stable transfection of mammalian cell lines with reporter genes, we validated that tebufenozide, an ecdysone agonist, reversibly induced gene expression, in a dose- and time-dependent manner, with negligible background expression. In addition, we created an adenovirus derived from the pENTR-EUI vector that readily infected not only cultured cells but also rodent tissues and was sensitive to tebufenozide treatment for regulated transgene expression. These results suggest that EcR-based singular gene regulatory switches would be convenient tools for the induction of gene expression in cells and tissues in a tightly controlled fashion.

  6. AUDIOME: a tiered exome sequencing-based comprehensive gene panel for the diagnosis of heterogeneous nonsyndromic sensorineural hearing loss.

    Science.gov (United States)

    Guan, Qiaoning; Balciuniene, Jorune; Cao, Kajia; Fan, Zhiqian; Biswas, Sawona; Wilkens, Alisha; Gallo, Daniel J; Bedoukian, Emma; Tarpinian, Jennifer; Jayaraman, Pushkala; Sarmady, Mahdi; Dulik, Matthew; Santani, Avni; Spinner, Nancy; Abou Tayoun, Ahmad N; Krantz, Ian D; Conlin, Laura K; Luo, Minjie

    2018-03-29

    Purpose: Hereditary hearing loss is highly heterogeneous. To keep up with rapidly emerging disease-causing genes, we developed the AUDIOME test for nonsyndromic hearing loss (NSHL) using an exome sequencing (ES) platform and targeted analysis for the curated genes. Methods: A tiered strategy was implemented for this test. Tier 1 includes combined Sanger and targeted deletion analyses of the two most common NSHL genes and two mitochondrial genes. Nondiagnostic tier 1 cases are subjected to ES and array followed by targeted analysis of the remaining AUDIOME genes. Results: ES resulted in good coverage of the selected genes with 98.24% of targeted bases at >15×. A fill-in strategy was developed for the poorly covered regions, which generally fell within GC-rich or highly homologous regions. Prospective testing of 33 patients with NSHL revealed a diagnosis in 11 (33%) and a possible diagnosis in 8 cases (24.2%). Among those, 10 individuals had variants in tier 1 genes. The ES data in the remaining nondiagnostic cases are readily available for further analysis. Conclusion: The tiered and ES-based test provides an efficient and cost-effective diagnostic strategy for NSHL, with the potential to reflex to full exome to identify causal changes outside of the AUDIOME test. Genetics in Medicine advance online publication, 29 March 2018; doi:10.1038/gim.2018.48.

  7. Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean.

    Science.gov (United States)

    Galeano, Carlos H; Cortés, Andrés J; Fernández, Andrea C; Soler, Álvaro; Franco-Herrera, Natalia; Makunde, Godwill; Vanderleyden, Jos; Blair, Matthew W

    2012-06-26

    In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. In short, this study illustrates the power of intron-based markers for linkage and association mapping in

  8. Gene Therapy Vectors with Enhanced Transfection Based on Hydrogels Modified with Affinity Peptides

    Science.gov (United States)

    Shepard, Jaclyn A.; Wesson, Paul J.; Wang, Christine E.; Stevans, Alyson C.; Holland, Samantha J.; Shikanov, Ariella; Grzybowski, Bartosz A.; Shea, Lonnie D.

    2011-01-01

    Regenerative strategies for damaged tissue aim to present biochemical cues that recruit and direct progenitor cell migration and differentiation. Hydrogels capable of localized gene delivery are being developed to provide a support for tissue growth, and as a versatile method to induce the expression of inductive proteins; however, the duration, level, and localization of expression is often insufficient for regeneration. We thus investigated the modification of hydrogels with affinity peptides to enhance vector retention and increase transfection within the matrix. PEG hydrogels were modified with lysine-based repeats (K4, K8), which retained approximately 25% more vector than control peptides. Transfection increased 5- to 15-fold with K8 and K4, respectively, over the RDG control peptide. K8- and K4-modified hydrogels bound similar quantities of vector, yet the vector dissociation rate was reduced for K8, suggesting excessive binding that limited transfection. These hydrogels were subsequently applied to an in vitro co-culture model to induce NGF expression and promote neurite outgrowth. K4-modified hydrogels promoted maximal neurite outgrowth, likely due to retention of both the vector and the NGF. Thus, hydrogels modified with affinity peptides enhanced vector retention and increased gene delivery, and these hydrogels may provide a versatile scaffold for numerous regenerative medicine applications. PMID:21514659

  9. Effective generation of transgenic pigs and mice by linker based sperm-mediated gene transfer.

    Directory of Open Access Journals (Sweden)

    Shih Ping Yao

    2002-04-01

    Background: Transgenic animals have become valuable tools for both research and applied purposes. The current method of gene transfer, microinjection, which is widely used in transgenic mouse production, has only had limited success in producing transgenic animals of larger or higher species. Here, we report a linker based sperm-mediated gene transfer method (LB-SMGT) that greatly improves the production efficiency of large transgenic animals. Results: The linker protein, a monoclonal antibody (mAb C), is reactive to a surface antigen on sperm of all tested species including pig, mouse, chicken, cow, goat, sheep, and human. mAb C is a basic protein that binds to DNA through ionic interaction allowing exogenous DNA to be linked specifically to sperm. After fertilization of the egg, the DNA is shown to be successfully integrated into the genome of viable pig and mouse offspring with germ-line transfer to the F1 generation at a highly efficient rate: 37.5% of pigs and 33% of mice. The integration is demonstrated again by FISH analysis and F2 transmission in pigs. Furthermore, expression of the transgene is demonstrated in 61% (35/57) of transgenic pigs (F0 generation). Conclusions: Our data suggest that LB-SMGT could be used to generate transgenic animals efficiently in many different species.

  10. Elastin overexpression by cell-based gene therapy preserves matrix and prevents cardiac dilation

    Science.gov (United States)

    Li, Shu-Hong; Sun, Zhuo; Guo, Lily; Han, Mihan; Wood, Michael F G; Ghosh, Nirmalya; Alex Vitkin, I; Weisel, Richard D; Li, Ren-Ke

    2012-01-01

    After a myocardial infarction, thinning and expansion of the fibrotic scar contribute to progressive heart failure. The loss of elastin is a major contributor to adverse extracellular matrix remodelling of the infarcted heart, and restoration of the elastic properties of the infarct region can prevent ventricular dysfunction. We implanted cells genetically modified to overexpress elastin to re-establish the elastic properties of the infarcted myocardium and prevent cardiac failure. A full-length human elastin cDNA was cloned, subcloned into an adenoviral vector and then transduced into rat bone marrow stromal cells (BMSCs). In vitro studies showed that BMSCs expressed the elastin protein, which was deposited into the extracellular matrix. Transduced BMSCs were injected into the infarcted myocardium of adult rats. Control groups received either BMSCs transduced with the green fluorescent protein gene or medium alone. Elastin deposition in the infarcted myocardium was associated with preservation of myocardial tissue structural integrity (by birefringence of polarized light; P elastin showed the greatest functional improvement (P elastin in the infarcted heart preserved the elastic structure of the extracellular matrix, which, in turn, preserved diastolic function, prevented ventricular dilation and preserved cardiac function. This cell-based gene therapy provides a new approach to cardiac regeneration. PMID:22435995

  11. Advances in Viral Vector-Based TRAIL Gene Therapy for Cancer

    International Nuclear Information System (INIS)

    Norian, Lyse A.; James, Britnie R.; Griffith, Thomas S.

    2011-01-01

    Numerous biologic approaches are being investigated as anti-cancer therapies in an attempt to induce tumor regression while circumventing the toxic side effects associated with standard chemo- or radiotherapies. Among these, tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) has shown particular promise in pre-clinical and early clinical trials, due to its preferential ability to induce apoptotic cell death in cancer cells and its minimal toxicity. One limitation of TRAIL use is the fact that many tumor types display an inherent resistance to TRAIL-induced apoptosis. To circumvent this problem, researchers have explored a number of strategies to optimize TRAIL delivery and to improve its efficacy via co-administration with other anti-cancer agents. In this review, we will focus on TRAIL-based gene therapy approaches for the treatment of malignancies. We will discuss the main viral vectors that are being used for TRAIL gene therapy and the strategies that are currently being attempted to improve the efficacy of TRAIL as an anti-cancer therapeutic

  12. Rapid and tunable method to temporally control gene editing based on conditional Cas9 stabilization. | Office of Cancer Genomics

    Science.gov (United States)

    The CRISPR/Cas9 system is a powerful tool for studying gene function. Here, we describe a method that allows temporal control of CRISPR/Cas9 activity based on conditional Cas9 destabilization. We demonstrate that fusing an FKBP12-derived destabilizing domain to Cas9 (DD-Cas9) enables conditional Cas9 expression and temporal control of gene editing in the presence of an FKBP12 synthetic ligand. This system can be easily adapted to co-express, from the same promoter, DD-Cas9 with any other gene of interest without co-modulation of the latter.

  13. Variability in Dopamine Genes Dissociates Model-Based and Model-Free Reinforcement Learning.

    Science.gov (United States)

    Doll, Bradley B; Bath, Kevin G; Daw, Nathaniel D; Frank, Michael J

    2016-01-27

    Considerable evidence suggests that multiple learning systems can drive behavior. Choice can proceed reflexively from previous actions and their associated outcomes, as captured by "model-free" learning algorithms, or flexibly from prospective consideration of outcomes that might occur, as captured by "model-based" learning algorithms. However, differential contributions of dopamine to these systems are poorly understood. Dopamine is widely thought to support model-free learning by modulating plasticity in striatum. Model-based learning may also be affected by these striatal effects, or by other dopaminergic effects elsewhere, notably on prefrontal working memory function. Indeed, prominent demonstrations linking striatal dopamine to putatively model-free learning did not rule out model-based effects, whereas other studies have reported dopaminergic modulation of verifiably model-based learning, but without distinguishing a prefrontal versus striatal locus. To clarify the relationships between dopamine, neural systems, and learning strategies, we combine a genetic association approach in humans with two well-studied reinforcement learning tasks: one isolating model-based from model-free behavior and the other sensitive to key aspects of striatal plasticity. Prefrontal function was indexed by a polymorphism in the COMT gene, differences of which reflect dopamine levels in the prefrontal cortex. This polymorphism has been associated with differences in prefrontal activity and working memory. Striatal function was indexed by a gene coding for DARPP-32, which is densely expressed in the striatum where it is necessary for synaptic plasticity. We found evidence for our hypothesis that variations in prefrontal dopamine relate to model-based learning, whereas variations in striatal dopamine function relate to model-free learning. 
    Decisions can stem reflexively from their previously associated outcomes or flexibly from deliberative consideration of potential choice outcomes.
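
    The distinction between the two learning systems described in this record can be sketched in a few lines. The following is a minimal illustration, not the authors' fitted model: the function names, the learning rate `alpha`, the mixing weight `w`, and the toy transition table are all hypothetical. A model-free learner caches a value per action and nudges it toward each observed reward; a model-based learner computes action values prospectively from known transition probabilities and the values of the states that might follow.

```python
def model_free_update(q_values, action, reward, alpha=0.1):
    """Model-free step: move the chosen action's cached value toward the reward."""
    updated = dict(q_values)
    updated[action] += alpha * (reward - updated[action])
    return updated


def model_based_values(transitions, second_stage_q):
    """Model-based step: weight the best value reachable in each next state
    by the probability of reaching that state from the action."""
    return {
        action: sum(p * max(second_stage_q[state].values())
                    for state, p in probs.items())
        for action, probs in transitions.items()
    }


def hybrid_values(q_mf, q_mb, w):
    """Hypothetical mixture: w = 1 is purely model-based, w = 0 purely model-free."""
    return {a: w * q_mb[a] + (1 - w) * q_mf[a] for a in q_mf}
```

    In a sketch like this, individual differences would show up as differences in `w` or `alpha`; which parameters the study actually related to the COMT and DARPP-32 polymorphisms is not stated in the abstract.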

  14. Sphingolipid base modifying enzymes in sunflower (Helianthus annuus): cloning and characterization of a C4-hydroxylase gene and a new paralogous Δ8-desaturase gene.

    Science.gov (United States)

    Moreno-Pérez, Antonio J; Martínez-Force, Enrique; Garcés, Rafael; Salas, Joaquín J

    2011-05-15

    Sphingolipids are components of plant cell membranes that participate in the regulation of important physiological processes. Unlike their animal counterparts, plant sphingolipids are characterized by high levels of base C4-hydroxylation. Moreover, desaturation at the Δ8 position predominates over the Δ4 desaturation typically found in animal sphingolipids. These modifications are due to the action of C4-hydroxylases and Δ8-long chain base desaturases, and they are important for complex sphingolipids finally becoming functional. The long chain bases of sunflower sphingolipids have high levels of hydroxylated and unsaturated moieties. Here, a C4-long chain base hydroxylase from sunflower was functionally characterized; this enzyme could complement the sur2Δ mutation when heterologously expressed in a yeast mutant deficient in hydroxylation. This hydroxylase was ubiquitously expressed in sunflower, with the highest levels found in the developing cotyledons. In addition, we identified a new Δ8-long chain base desaturase gene that displays strong homology to a previously reported desaturase gene. This desaturase was also expressed in yeast and was able to change the long chain base composition of the transformed host. We studied the expression of this desaturase and compared it with that of the other isoform described in sunflower. The desaturase form studied in this paper displayed higher expression levels in developing seeds. Copyright © 2010 Elsevier GmbH. All rights reserved.

  15. Functional Characterization of MAT1-1-Specific Mating-Type Genes in the Homothallic Ascomycete Sordaria macrospora Provides New Insights into Essential and Nonessential Sexual Regulators

    Science.gov (United States)

    Klix, V.; Nowrousian, M.; Ringelberg, C.; Loros, J. J.; Dunlap, J. C.; Pöggeler, S.

    2010-01-01

    Mating-type genes in fungi encode regulators of mating and sexual development. Heterothallic ascomycete species require different sets of mating-type genes to control nonself-recognition and mating of compatible partners of different mating types. Homothallic (self-fertile) species also carry mating-type genes in their genome that are essential for sexual development. To analyze the molecular basis of homothallism and the role of mating-type genes during fruiting-body development, we deleted each of the three genes, SmtA-1 (MAT1-1-1), SmtA-2 (MAT1-1-2), and SmtA-3 (MAT1-1-3), contained in the MAT1-1 part of the mating-type locus of the homothallic ascomycete species Sordaria macrospora. Phenotypic analysis of deletion mutants revealed that the PPF domain protein-encoding gene SmtA-2 is essential for sexual reproduction, whereas the α domain protein-encoding genes SmtA-1 and SmtA-3 play no role in fruiting-body development. By means of cross-species microarray analysis using Neurospora crassa oligonucleotide microarrays hybridized with S. macrospora targets and quantitative real-time PCR, we identified genes expressed under the control of SmtA-1 and SmtA-2. Both genes are involved in the regulation of gene expression, including that of pheromone genes. PMID:20435701

  16. West German Study Group Phase III PlanB Trial: First Prospective Outcome Data for the 21-Gene Recurrence Score Assay and Concordance of Prognostic Markers by Central and Local Pathology Assessment.

    Science.gov (United States)

    Gluz, Oleg; Nitz, Ulrike A; Christgen, Matthias; Kates, Ronald E; Shak, Steven; Clemens, Michael; Kraemer, Stefan; Aktas, Bahriye; Kuemmel, Sherko; Reimer, Toralf; Kusche, Manfred; Heyl, Volker; Lorenz-Salehi, Fatemeh; Just, Marianne; Hofmann, Daniel; Degenhardt, Tom; Liedtke, Cornelia; Svedman, Christer; Wuerstlein, Rachel; Kreipe, Hans H; Harbeck, Nadia

    2016-07-10

    The 21-gene Recurrence Score (RS) assay is a validated prognostic/predictive tool in early hormone receptor-positive breast cancer (BC); however, only a few prospective outcome results have been available so far. In the phase III PlanB trial, RS was prospectively used to define a subset of patients who received only endocrine therapy. We present 3-year outcome data and concordance analysis (among biomarkers/RS). Central tumor bank was established prospectively from PlanB (intermediate and high-risk, locally human epidermal growth factor receptor 2-negative BC). After an early amendment, HR-positive, pN0-1 patients with RS ≤ 11 were recommended to omit chemotherapy. From 2009 to 2011, PlanB enrolled 3,198 patients with a median age of 56 years; 41.1% had node-positive and 32.5% grade 3 disease. In 348 patients (15.3%), chemotherapy was omitted based on RS ≤ 11. After 35 months median follow-up, 3-year disease-free survival in patients with RS ≤ 11 and endocrine therapy alone was 98% versus 92% and 98% in RS > 25 and RS 12 to 25 in chemotherapy-treated patients, respectively. Nodal status, central and local grade, the Ki-67 protein encoded by the MKI67 gene, estrogen receptor, progesterone receptor, tumor size, and RS were univariate prognostic factors for disease-free survival; only nodal status, both central and local grade, and RS were independent multivariate factors. Histologic grade was discordant between central and local laboratories in 44%. RS was positively but moderately correlated with the Ki-67 protein encoded by the MKI67 gene and grade and negatively correlated with progesterone receptor and estrogen receptor. In this prospective trial, patients with enhanced clinical risk and omitted chemotherapy on the basis of RS ≤ 11 had excellent 3-year survival. The substantial discordance observed between traditional prognostic markers and RS emphasizes the need for standardized assessment and supports the potential integration of standardized, well

  17. Intellectual property rights and gene-based technologies for animal production and health. Issues for developing countries

    International Nuclear Information System (INIS)

    Dutfield, G.

    2005-01-01

    Intellectual property rights (IPR) are legal and institutional devices to protect creations of the mind. With respect to gene-based innovation, the most significant IPR is the patent. Appropriate patent regimes have the potential to foster innovation in animal biotechnology and the transfer of gene-based technologies. Inappropriate patent systems may be counter-productive. Indeed, many critics are doubtful that the current international patent standards, based as they are on a combination of the United States of America's and European regimes, can help countries that lack the capacity to do much life science and biotechnology research to become more innovative or contribute to the acquisition, absorption and, where desirable, the adaptation of new gene-based technologies from outside. Present legislation in Europe, North America and internationally is considered, together with the controversies and important policy questions for developing countries, and the choices facing countries seeking to enhance their scientific and technological capacities in these areas. (author)

  18. Gene introduction into the mitochondria of Arabidopsis thaliana via peptide-based carriers

    Science.gov (United States)

    Chuah, Jo-Ann; Yoshizumi, Takeshi; Kodama, Yutaka; Numata, Keiji

    2015-01-01

    Available methods in plant genetic transformation are limited to nuclear and plastid transformations because similar procedures have not yet been established for the mitochondria. The double membrane and small size of the organelle, in addition to its large population in cells, are major obstacles in mitochondrial transfection. Here we report the intracellular delivery of exogenous DNA localized to the mitochondria of Arabidopsis thaliana using a combination of mitochondria-targeting peptide and cell-penetrating peptide. Low concentrations of peptides were sufficient to deliver DNA into the mitochondria, and expression of imported DNA reached detectable levels within a short incubation period (12 h). We found that electrostatic interaction with the cell membrane is not a critical factor for complex internalization; instead, improved intracellular penetration of mitochondria-targeted complexes significantly enhanced gene transfer efficiency. Our results delineate a simple and effective peptide-based method as a starting point for the development of more sophisticated plant mitochondrial transfection strategies.

  19. Gene delivery by microfluidic flow-through electroporation based on constant DC and AC field.

    Science.gov (United States)

    Geng, Tao; Zhan, Yihong; Lu, Chang

    2012-01-01

    Electroporation is one of the most widely used physical methods to deliver exogenous nucleic acids into cells with high efficiency and low toxicity. Conventional electroporation systems typically require expensive pulse generators to provide short electrical pulses at high voltage. In this work, we demonstrate a flow-through electroporation method for continuous transfection of cells based on disposable chips, a syringe pump, and a low-cost power supply that provides a constant voltage. We successfully transfect cells using either DC or AC voltage with high flow rates (ranging from 40 µl/min to 20 ml/min) and high efficiency (up to 75%). We also enable the entire cell membrane to be uniformly permeabilized and dramatically improve gene delivery by inducing complex migrations of cells during the flow.

  20. Development of new USER-based cloning vectors for multiple genes expression in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Kildegaard, Kanchana Rueksomtawin; Jensen, Niels Bjerg; Maury, Jerome

    2013-01-01

    auxotrophic and dominant markers for convenience of use. Our vector set also contains both integrating and multicopy vectors for stability of protein expression and high expression level. We will make the new vector system available to the yeast community and provide a comprehensive protocol for cloning...... the production strain with the proper phenotype and product yield. However, the sequential number of metabolic engineering is time-consuming. Furthermore, the number of available selectable markers is also limiting the number of genetic modifications. To overcome these limitations, we have developed a new set...... of shuttle vectors for convenience of use for high-throughput cloning and selectable marker recycling. The new USER-based cloning vectors consist of a unique USER site and a CRE-loxP-mediated marker recycling system. The USER site allows insertion of genes of interest along with a bidirectional promoter...

  1. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A.; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-01-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas. PMID:27410039

  2. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis.

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-07-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas.

  3. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis.

    Directory of Open Access Journals (Sweden)

    Hirotomo Kato

    2016-07-01

    Full Text Available A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas.

  4. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    Science.gov (United States)

    Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan

    2017-01-01

    To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect are crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p ... bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274
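
    The idea of a bootstrap correction for winner's curse can be sketched as follows. This is a generic out-of-bag scheme under stated assumptions, not necessarily the procedure proposed in the paper: the effect is taken to be the mean of hypothetical per-subject values, the significance test is a toy |mean| / standard-error threshold, and all function and parameter names are illustrative. In each bootstrap replicate the effect is estimated on the resampled subjects; when that replicate passes the significance filter, its estimate is compared with the estimate from the subjects left out of the resample, and the average gap serves as the bias estimate.

```python
import random
import statistics


def is_significant(sample, threshold):
    """Toy significance filter: |mean| / standard error must exceed threshold."""
    se = statistics.stdev(sample) / len(sample) ** 0.5
    if se == 0:
        return False
    return abs(statistics.fmean(sample)) / se > threshold


def bootstrap_corrected_effect(values, threshold=1.96, n_boot=500, seed=0):
    """Return (naive_estimate, bias_corrected_estimate) for one effect."""
    rng = random.Random(seed)
    n = len(values)
    naive = statistics.fmean(values)
    biases = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        drawn = set(idx)
        in_bag = [values[i] for i in idx]
        out_bag = [values[i] for i in range(n) if i not in drawn]
        if not out_bag:
            continue  # no held-out subjects to compare against
        if is_significant(in_bag, threshold):
            # Gap between the "selected" estimate and an independent one.
            biases.append(statistics.fmean(in_bag) - statistics.fmean(out_bag))
    bias = statistics.fmean(biases) if biases else 0.0
    return naive, naive - bias
```

    Conditioning on significance inside the loop is what mimics the selection step; without that filter the in-bag and out-of-bag estimates would differ only by noise.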

  5. Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

    Science.gov (United States)

    Doostparast Torshizi, Abolfazl; Petzold, Linda R

    2018-01-01

    Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded