WorldWideScience

Sample records for smn2 gene based

  1. Genetic and expression studies of SMN2 gene in Russian patients with spinal muscular atrophy type II and III

    Directory of Open Access Journals (Sweden)

    Schiöth Helgi B

    2011-07-01

    Background. Spinal muscular atrophy (SMA) type I, II and III is an autosomal recessive neuromuscular disorder caused by mutations in the survival motor neuron gene (SMN1). SMN2 is a centromeric copy gene that has been characterized as a major modifier of SMA severity. SMA type I patients have one or two SMN2 copies, while most SMA type II patients carry three SMN2 copies and SMA type III patients have three or four SMN2 copies. The SMN1 gene produces a full-length transcript (FL-SMN), while SMN2 is only able to produce a small portion of the FL-SMN because of a splice mutation which results in the production of abnormal SMNΔ7 mRNA. Methods. In this study we performed quantification of the SMN2 gene copy number in Russian patients affected by SMA type II and III (42 and 19 patients, respectively) by means of real-time PCR. Moreover, we present two families consisting of asymptomatic carriers of a homozygous absence of the SMN1 gene. We also developed a novel RT-qPCR-based assay to determine the FL-SMN/SMNΔ7 mRNA ratio as an SMA biomarker. Results. Comparison of the SMN2 copy number and clinical features revealed a significant correlation between mild clinical phenotype (SMA type III) and presence of four copies of the SMN2 gene. In both asymptomatic cases we found an increased number of SMN2 copies in the healthy carriers and a biallelic SMN1 absence. Furthermore, the novel assay revealed a difference between SMA patients and healthy controls. Conclusions. We suggest that the SMN2 gene copy quantification in SMA patients could be used as a prognostic tool for discrimination between the SMA type II and SMA type III diagnoses, whereas the FL-SMN/SMNΔ7 mRNA ratio could be a useful biomarker for detecting changes during SMA pharmacotherapy.
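
The abstract above does not say how the FL-SMN/SMNΔ7 ratio is computed from the RT-qPCR readout. As a rough illustration only, the sketch below assumes a plain delta-Ct calculation in which both amplicons amplify with the same per-cycle efficiency (2.0 means perfect doubling). The function name, the parameters and the Ct values are hypothetical and are not taken from the study.

```python
def fl_to_delta7_ratio(ct_fl, ct_delta7, efficiency=2.0):
    """Estimate the FL-SMN / SMN-delta7 transcript ratio from two Ct values.

    Assumption (not from the abstract): both amplicons share the same
    per-cycle amplification efficiency, so the abundance ratio is
    efficiency ** (Ct_delta7 - Ct_FL). A lower Ct means more template.
    """
    return efficiency ** (ct_delta7 - ct_fl)


# Hypothetical Ct values, for illustration only.
ratio_equal = fl_to_delta7_ratio(24.0, 24.0)   # same Ct: equal abundance
ratio_low_fl = fl_to_delta7_ratio(24.0, 22.0)  # FL crosses 2 cycles later
```

Under these assumptions a two-cycle delay for the full-length amplicon corresponds to a fourfold lower FL-SMN signal; a real assay would also need efficiency calibration and reference-gene normalization.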

  2. Association between the SMN2 gene copy number and clinical characteristics of patients with spinal muscular atrophy with homozygous deletion of exon 7 of the SMN1 gene

    Directory of Open Access Journals (Sweden)

    Žarkov Marija

    2015-01-01

    Background/Aim. Spinal muscular atrophy (SMA) is an autosomal recessive disease characterized by degeneration of alpha motor neurons in the spinal cord and the medulla oblongata, causing progressive muscle weakness and atrophy. The aim of this study was to determine the association between the SMN2 gene copy number and disease phenotype in Serbian patients with SMA with homozygous deletion of exon 7 of the SMN1 gene. Methods. The patients were identified using regional Serbian hospital databases. The investigated clinical characteristics of the disease were: patients’ gender, age at disease onset, achieved and current developmental milestones, disease duration, current age, and the presence of spinal deformities and joint contractures. The number of SMN1 and SMN2 gene copies was determined using real-time polymerase chain reaction (PCR). Results. Among 43 identified patients, 37 (86.0%) showed homozygous deletion of SMN1 exon 7. One (2.7%) of 37 patients had SMA type I with 3 SMN2 copies, 11 (29.7%) patients had SMA type II with 3.1 ± 0.7 copies, 17 (45.9%) patients had SMA type III with 3.7 ± 0.9 copies, while 8 (21.6%) patients had SMA type IV with 4.2 ± 0.9 copies. There was a progressive increase in the SMN2 gene copy number from type II towards type IV (p < 0.05). A higher SMN2 gene copy number was associated with better current motor performance (p < 0.05). Conclusion. In the Serbian patients with SMA, a higher SMN2 gene copy number correlated with less severe disease phenotype. A possible effect of other phenotype modifiers should not be neglected.
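
The subgroup percentages in the record above can be re-derived from the reported counts. The short check below is a sketch that takes the denominators as the abstract states them (43 identified patients, 37 with homozygous deletion); the variable and function names are illustrative.

```python
# Counts copied from the abstract; names are illustrative.
identified = 43
deleted = 37
by_type = {"I": 1, "II": 11, "III": 17, "IV": 8}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_deleted = pct(deleted, identified)
type_shares = {sma_type: pct(n, deleted) for sma_type, n in by_type.items()}
```

The four type counts add up to the 37 deletion-positive patients, and the rounded shares match the figures quoted in the abstract.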

  3. Mouse survival motor neuron alleles that mimic SMN2 splicing and are inducible rescue embryonic lethality early in development but not late.

    Directory of Open Access Journals (Sweden)

    Suzan M Hammond

    Spinal muscular atrophy (SMA) is caused by low survival motor neuron (SMN) levels, and patients represent a clinical spectrum due primarily to varying copies of the survival motor neuron-2 (SMN2) gene. Patient and animal studies show that disease severity is abrogated as SMN levels increase. Since therapies currently being pursued target the induction of SMN, it will be important to understand the dosage, timing and cellular requirements of SMN for disease etiology and potential therapeutic intervention. This requires new mouse models that can induce SMN temporally and/or spatially. Here we describe the generation of two hypomorphic Smn alleles, Smn(C-T-Neo) and Smn(2B-Neo). These alleles mimic SMN2 exon 7 splicing, titre Smn levels and are inducible. They were specifically designed so that up to three independent lines of mice could be generated; herein we describe two. In a homozygous state each allele results in embryonic lethality. Analysis of these mutants indicates that greater than 5% of Smn protein is required for normal development. The severe hypomorphic nature of these alleles is caused by inclusion of a loxP-flanked neomycin gene selection cassette in Smn intron 7, which can be removed with Cre recombinase. In vitro and in vivo experiments demonstrate these as inducible Smn alleles. When combined with an inducible Cre mouse, embryonic lethality caused by low Smn levels can be rescued early in gestation but not late. This provides direct genetic evidence that a therapeutic window for SMN inductive therapies may exist. Importantly, these lines fill a void for inducible Smn alleles. They also provide a base from which to generate a large repertoire of SMA models of varying disease severities when combined with other Smn alleles or SMN2-containing mice.

  4. Seamless Genetic Conversion of SMN2 to SMN1 via CRISPR/Cpf1 and Single-Stranded Oligodeoxynucleotides in Spinal Muscular Atrophy Patient-Specific Induced Pluripotent Stem Cells.

    Science.gov (United States)

    Zhou, Miaojin; Hu, Zhiqing; Qiu, Liyan; Zhou, Tao; Feng, Mai; Hu, Qian; Zeng, Baitao; Li, Zhuo; Sun, Qianru; Wu, Yong; Liu, Xionghao; Wu, Lingqian; Liang, Desheng

    2018-05-09

    Spinal muscular atrophy (SMA) is a kind of neuromuscular disease characterized by progressive motor neuron loss in the spinal cord. It is caused by mutations in the survival motor neuron 1 (SMN1) gene. SMN1 has a paralogous gene, survival motor neuron 2 (SMN2), in humans that is present in almost all SMA patients. The generation and genetic correction of SMA patient-specific induced pluripotent stem cells (iPSCs) is a viable, autologous therapeutic strategy for the disease. Here, c-Myc-free and non-integrating iPSCs were generated from the urine cells of an SMA patient using an episomal iPSC reprogramming vector, and a unique crRNA was designed that does not have similar sequences (≤3 mismatches) anywhere in the human reference genome. In situ gene conversion of the SMN2 gene to an SMN1-like gene in SMA-iPSCs was achieved using CRISPR/Cpf1 and single-stranded oligodeoxynucleotide with a high efficiency of 4/36. Seamlessly gene-converted iPSC lines contained no exogenous sequences and retained a normal karyotype. Significantly, the SMN expression and gems localization were rescued in the gene-converted iPSCs and their derived motor neurons. This is the first report of an efficient gene conversion mediated by Cpf1 homology-directed repair in human cells and may provide a universal gene therapeutic approach for most SMA patients.
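
The specificity criterion mentioned above (no similar sequence with ≤3 mismatches elsewhere in the genome) can be illustrated with a naive mismatch scan. This is a toy sketch, not the authors' pipeline: it assumes a single strand, ignores the PAM requirement, and uses made-up sequences. Real off-target searches rely on dedicated genome-wide tools.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def near_matches(guide, genome, max_mismatches=3):
    """Return (position, mismatch_count) for every window of `genome`
    that differs from `guide` in at most `max_mismatches` positions."""
    k = len(guide)
    hits = []
    for i in range(len(genome) - k + 1):
        d = mismatches(guide, genome[i:i + k])
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Toy sequences, hypothetical and far shorter than a real crRNA target.
toy_hits = near_matches("AAAA", "AAAATTTT", max_mismatches=1)

# Reported editing outcome: 4 converted clones out of 36 screened.
conversion_pct = round(100 * 4 / 36, 1)
```

A guide would count as unique under this criterion if the scan returned only its intended target site. The last line simply expresses the reported 4/36 as a percentage.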

  5. Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.

    Science.gov (United States)

    Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian

    2018-02-23

    Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. From the experimental results, HPHash again significantly improves the prediction performance. The codes of HPHash are available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.

  6. Scuba: scalable kernel-based gene prioritization.

    Science.gov (United States)

    Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

    2018-01-25

    The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba.

  7. Rational design of gene-based vaccines.

    Science.gov (United States)

    Barouch, Dan H

    2006-01-01

    Vaccine development has traditionally been an empirical discipline. Classical vaccine strategies include the development of attenuated organisms, whole killed organisms, and protein subunits, followed by empirical optimization and iterative improvements. While these strategies have been remarkably successful for a wide variety of viruses and bacteria, these approaches have proven more limited for pathogens that require cellular immune responses for their control. In this review, current strategies to develop and optimize gene-based vaccines are described, with an emphasis on novel approaches to improve plasmid DNA vaccines and recombinant adenovirus vector-based vaccines. Copyright 2006 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  8. Paper-based synthetic gene networks.

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

    2014-11-06

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.

  9. Paper-based Synthetic Gene Networks

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  10. DNA Array-Based Gene Profiling

    Science.gov (United States)

    Mocellin, Simone; Provenzano, Maurizio; Rossi, Carlo Riccardo; Pilati, Pierluigi; Nitti, Donato; Lise, Mario

    2005-01-01

    Cancer is a heterogeneous disease in most respects, including its cellularity, different genetic alterations, and diverse clinical behaviors. Traditional molecular analyses are reductionist, assessing only 1 or a few genes at a time, thus working with a biologic model too specific and limited to confront a process whose clinical outcome is likely to be governed by the combined influence of many genes. The potential of functional genomics is enormous, because for each experiment, thousands of relevant observations can be made simultaneously. Accordingly, DNA array, like other high-throughput technologies, might catalyze and ultimately accelerate the development of knowledge in tumor cell biology. Although in its infancy, the implementation of DNA array technology in cancer research has already provided investigators with novel data and intriguing new hypotheses on the molecular cascade leading to carcinogenesis, tumor aggressiveness, and sensitivity to antiblastic agents. Given the revolutionary implications that the use of this technology might have in the clinical management of patients with cancer, principles of DNA array-based tumor gene profiling need to be clearly understood for the data to be correctly interpreted and appreciated. In the present work, we discuss the technical features characterizing this powerful laboratory tool and review the applications so far described in the field of oncology. PMID:15621987

  11. [Smart therapeutics based on synthetic gene circuits].

    Science.gov (United States)

    Peng, Shuguang; Xie, Zhen

    2017-03-25

    Synthetic biology has had an important impact on biological research since its inception. Applying ideas and methods borrowed from electrical engineering, synthetic biology has uncovered many regulatory mechanisms of living systems and has transformed and expanded a range of biological components. It has therefore enabled a wide range of biomedical applications, including new ideas for disease diagnosis and treatment. This review describes the latest advances in disease diagnosis and therapy based on mammalian cell or bacterial synthetic gene circuits, and provides new ideas for future smart therapy design.

  12. Protective Effects of Butyrate-based Compounds on a Mouse Model for Spinal Muscular Atrophy

    Science.gov (United States)

    Butchbach, Matthew E. R.; Lumpkin, Casey J.; Harris, Ashlee W.; Saieva, Luciano; Edwards, Jonathan D.; Workman, Eileen; Simard, Louise R.; Pellizzoni, Livio; Burghes, Arthur H. M.

    2016-01-01

    Proximal spinal muscular atrophy (SMA) is a childhood-onset degenerative disease resulting from the selective loss of motor neurons in the spinal cord. SMA is caused by the loss of SMN1 (survival motor neuron 1) but retention of SMN2. The number of copies of SMN2 modifies disease severity in SMA patients as well as in mouse models, making SMN2 a target for therapeutics development. Sodium butyrate (BA) and its analogue (4PBA) have been shown to increase SMN2 expression in SMA cultured cells. In this study, we examined the effects of BA, 4PBA as well as two BA prodrugs—glyceryl tributyrate (BA3G) and VX563—on the phenotype of SMNΔ7 SMA mice. Treatment with 4PBA, BA3G and VX563 but not BA beginning at PND04 significantly improved the lifespan and delayed disease end stage, with administration of VX563 also improving the growth rate of these mice. 4PBA and VX563 improved the motor phenotype of SMNΔ7 SMA mice and prevented spinal motor neuron loss. Interestingly, neither 4PBA nor VX563 had an effect on SMN expression in the spinal cords of treated SMNΔ7 SMA mice; however, they inhibited histone deacetylase (HDAC) activity and restored the normal phosphorylation states of Akt and glycogen synthase kinase 3β, both of which are altered by SMN deficiency in vivo. These observations show that BA-based compounds with favourable pharmacokinetics ameliorate SMA pathology possibly by modulating HDAC and Akt signaling. PMID:26892876

  13. Modifier genes: Moving from pathogenesis to therapy.

    Science.gov (United States)

    McCabe, Edward R B

    2017-09-01

    This commentary will focus on how we can use our knowledge about the complexity of human disease and its pathogenesis to identify novel approaches to therapy. We know that even for single gene Mendelian disorders, patients with identical mutations often have different presentations and outcomes. This lack of genotype-phenotype correlation led us and others to examine the roles of modifier genes in the context of biological networks. These investigations have utilized vertebrate and invertebrate model organisms. Since one of the goals of research on modifier genes and networks is to identify novel therapeutic targets, the challenges to patient access and compliance because of the high costs of medications for rare genetic diseases must be recognized. A recent article explored protective modifiers, including plastin 3 (PLS3) and coronin 1C (CORO1C), in spinal muscular atrophy (SMA). SMA is an autosomal recessive deficit of survival motor neuron protein (SMN) caused by mutations in SMN1. However, the severity of SMA is determined primarily by the number of SMN2 copies, and this results in significant phenotypic variability. PLS3 was upregulated in siblings who were asymptomatic compared with those who had SMA2 or SMA3, but identical homozygous SMN1 deletions and equal numbers of SMN2 copies. CORO1C was identified by interrogation of the PLS3 interactome. Overexpression of these proteins rescued endocytosis in SMA models. In addition, antisense RNA for upregulation of SMN2 protein expression is being developed as another way of modifying the SMA phenotype. These investigations suggest the practical application of protective modifiers to rescue SMA phenotypes. Other examples of the potential therapeutic value of novel protective modifiers will be discussed, including in Duchenne muscular dystrophy and glycerol kinase deficiency. This work shows that while we live in an exciting era of genomic sequencing, a functional understanding of biology, the impact of its

  14. A powerful score-based test statistic for detecting gene-gene co-association.

    Science.gov (United States)

    Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

    2016-01-29

    The genetic variants identified by genome-wide association studies (GWAS) can only account for a small proportion of the total heritability of complex diseases. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one of the possible explanations for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, not only due to the traditional interaction under a nearly independent condition but also due to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and the specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

  15. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  16. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

    Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  17. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection...... is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...

  18. Ranking candidate disease genes from gene expression and protein interaction: a Katz-centrality based approach.

    Directory of Open Access Journals (Sweden)

    Jing Zhao

    Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data on gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions: that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as to identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, cooccurrence and shared pathophysiology between different disorders.
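
The title of the record above names Katz centrality, but the abstract does not give the exact formulation. The sketch below therefore shows only the generic textbook form, x = alpha * A x + beta, solved by fixed-point iteration. Using `beta` as a per-gene prior (for example a differential-expression score) is an assumption made here for illustration, and the three-node network is a toy example.

```python
def katz_centrality(adj, alpha=0.1, beta=None, tol=1e-12, max_iter=1000):
    """Generic Katz centrality by fixed-point iteration.

    adj   : square adjacency matrix (list of lists); weights may encode
            interaction strength.
    alpha : attenuation factor; must be below 1 / (largest eigenvalue
            of adj) for the iteration to converge.
    beta  : per-node baseline score (assumed here to be a prior such as
            a differential-expression score); defaults to 1 for all nodes.
    """
    n = len(adj)
    beta = [1.0] * n if beta is None else list(beta)
    x = beta[:]
    for _ in range(max_iter):
        new = [
            alpha * sum(adj[i][j] * x[j] for j in range(n)) + beta[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    raise RuntimeError("did not converge; alpha may be too large")


# Toy undirected path network 0 - 1 - 2 (hypothetical genes).
toy_adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = katz_centrality(toy_adj, alpha=0.1)
```

In this toy network the middle node scores highest because it is adjacent to both others, which mirrors the second assumption in the abstract: a gene ranks higher when it sits close to other candidates.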

  19. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    2017-09-01

    Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
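
The abstract above does not specify which network diffusion scheme was used. As one common example of the general idea, the sketch below implements a random walk with restart on a toy network. The restart probability, the column normalization and the four-node graph are assumptions chosen for illustration and may differ from the method in the paper.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-12, max_iter=10000):
    """Diffuse seed-gene scores over a network.

    Iterates p = (1 - restart) * W p + restart * p0, where W is the
    column-normalized adjacency matrix and p0 is uniform over the seeds.
    """
    n = len(adj)
    degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        new = []
        for i in range(n):
            # Mass flowing into node i from each neighbour j, split by degree(j).
            inflow = sum(adj[i][j] * p[j] / degree[j] for j in range(n) if degree[j])
            new.append((1 - restart) * inflow + restart * p0[i])
        if max(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    raise RuntimeError("diffusion did not converge")


# Toy path network 0 - 1 - 2 - 3 with a single seed gene at node 0.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
diffused = random_walk_with_restart(toy_adj, seeds={0})
```

Scores decay with network distance from the seed, so genes close to several known risk genes accumulate the most diffused signal. That is the intuition behind ranking genes and looking for connected high-scoring modules.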

  20. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
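
The covariation idea described above can be illustrated as a correlation between the per-branch evolutionary rates of two genes. The sketch below uses a plain Pearson correlation on made-up rates. The published ERC procedure involves rate normalization steps that the abstract does not describe, so this should be read as an illustration of the concept rather than the method itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical relative rates of two genes on the same five tree branches.
gene_a_rates = [0.8, 1.1, 1.5, 0.6, 1.0]
gene_b_rates = [0.9, 1.2, 1.4, 0.5, 1.0]
erc_ab = pearson(gene_a_rates, gene_b_rates)
```

A value near 1 means the two genes speed up and slow down on the same branches, which is the kind of signature the abstract uses to link functionally related disease genes.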

  1. RNAi-based silencing of genes encoding the vacuolar-ATPase ...

    African Journals Online (AJOL)

    RNAi-based silencing of genes encoding the vacuolar-ATPase subunits a and c in pink bollworm (Pectinophora gossypiella). Ahmed M. A. Mohammed. Abstract. RNA interference is a post-transcriptional gene regulation mechanism that is predominantly found in eukaryotic organisms. RNAi demonstrated a successful ...

  2. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Background. Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  3. Hepatitis B virus DNA polymerase gene polymorphism based ...

    African Journals Online (AJOL)

    Hepatitis B virus DNA polymerase gene polymorphism based prediction of genotypes in chronic HBV patients from Western India. Yashwant G. Chavan, Sharad R. Pawar, Minal Wani, Amol D. Raut, Rabindra N. Misra ...

  4. Evaluation of Gene-Based Family-Based Methods to Detect Novel Genes Associated With Familial Late Onset Alzheimer Disease

    Directory of Open Access Journals (Sweden)

    Maria V. Fernández

    2018-04-01

    Full Text Available Gene-based tests to study the combined effect of rare variants on a particular phenotype have been widely developed for case-control studies, but their evolution and adaptation for family-based studies, especially studies of complex incomplete families, has been slower. In this study, we have performed a practical examination of all the latest gene-based methods available for family-based study designs using both simulated and real datasets. We examined the performance of several collapsing, variance-component, and transmission disequilibrium tests across eight different software packages and 22 models utilizing a cohort of 285 families (N = 1,235) with late-onset Alzheimer disease (LOAD). After a thorough examination of each of these tests, we propose a methodological approach to identify, with high confidence, genes associated with the tested phenotype and we provide recommendations to select the best software and model for family-based gene-based analyses. Additionally, in our dataset, we identified PTK2B, a GWAS candidate gene for sporadic AD, along with six novel genes (CHRD, CLCN2, HDLBP, CPAMD8, NLRP9, and MAS1L) as candidate genes for familial LOAD.

  5. A Nonlinear Model for Gene-Based Gene-Environment Interaction

    Directory of Open Access Journals (Sweden)

    Jian Sa

    2016-06-01

    Full Text Available A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over single-variant-based approaches, in this work we proposed a sparse principal component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene, then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene in which their effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model has a nice interpretation property since the sparsity on the principal component loadings can tell the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset in a Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a system approach to evaluate gene-based G×E interaction.

  6. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Full Text Available Abstract Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.

  7. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Full Text Available Marbling is an important trait in characterizing beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate gene associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  8. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    AlShahrani, Mona; Hoehndorf, Robert

    2018-01-01

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising of multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  9. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    Alshahrani, Mona

    2018-04-30

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising of multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  10. Link-based quantitative methods to identify differentially coexpressed genes and gene Pairs

    Directory of Open Access Journals (Sweden)

    Ye Zhi-Qiang

    2011-08-01

    Full Text Available Abstract Background Differential coexpression analysis (DCEA) is increasingly used for investigating the global transcriptional mechanisms underlying phenotypic changes. Current DCEA methods mostly adopt a gene connectivity-based strategy to estimate differential coexpression, which is characterized by comparing the numbers of gene neighbors in different coexpression networks. Although it simplifies the calculation, this strategy mixes up the identities of different coexpression neighbors of a gene, and fails to differentiate significant differential coexpression changes from trivial ones. In particular, correlation reversal is easily missed although it probably indicates remarkable biological significance. Results We developed two link-based quantitative methods, DCp and DCe, to identify differentially coexpressed genes and gene pairs (links). Bearing the uniqueness of exploiting the quantitative coexpression change of each gene pair in the coexpression networks, both methods proved to be superior to currently popular methods in simulation studies. Re-mining of a publicly available type 2 diabetes (T2D) expression dataset from the perspective of differential coexpression analysis led to discoveries beyond those from differential expression analysis. Conclusions This work pointed out the critical weakness of current popular DCEA methods, and proposed two link-based DCEA algorithms that will contribute to the development of DCEA and help extend it to a broader spectrum.
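
    The link-based idea in the abstract above, scoring each gene pair by how much its coexpression changes between two conditions, can be illustrated with a minimal sketch. This is not the DCp or DCe algorithm; it is an assumed simplification that uses the raw difference in Pearson correlation per pair, and the function names and the dictionary input format are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def link_changes(expr_a, expr_b):
    """Per-pair coexpression change between two conditions.

    expr_a and expr_b map gene name -> list of expression values
    (hypothetical input format; both must share the same genes).
    Returns {(gene1, gene2): r_b - r_a} for every unordered gene pair.
    """
    changes = {}
    for g1, g2 in combinations(sorted(expr_a), 2):
        r_a = pearson(expr_a[g1], expr_a[g2])
        r_b = pearson(expr_b[g1], expr_b[g2])
        changes[(g1, g2)] = r_b - r_a
    return changes


# Toy data: the g1-g2 link flips from positive to negative correlation.
condition_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
condition_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [4, 3, 2, 1]}
changes = link_changes(condition_a, condition_b)
```

    Because each pair keeps its own signed change, a correlation reversal shows up as a large value for that specific link, whereas a count of neighbors per gene could stay the same in both conditions.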

  11. New Genome Similarity Measures based on Conserved Gene Adjacencies.

    Science.gov (United States)

    Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

    2017-06-01

    Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases, the computational costs to the general family-free case are the same, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expression powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.

  12. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Full Text Available Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  13. PCR-based detection of gene transfer vectors: application to gene doping surveillance.

    Science.gov (United States)

    Perez, Irene C; Le Guiner, Caroline; Ni, Weiyi; Lyles, Jennifer; Moullier, Philippe; Snyder, Richard O

    2013-12-01

    Athletes who illicitly use drugs to enhance their athletic performance are at risk of being banned from sports competitions. Consequently, some athletes may seek new doping methods that they expect to be capable of circumventing detection. With advances in gene transfer vector design and therapeutic gene transfer, and demonstrations of safety and therapeutic benefit in humans, there is an increased probability of the pursuit of gene doping by athletes. In anticipation of the potential for gene doping, assays have been established to directly detect complementary DNA of genes that are top candidates for use in doping, as well as vector control elements. The development of molecular assays that are capable of exposing gene doping in sports can serve as a deterrent and may also identify athletes who have illicitly used gene transfer for performance enhancement. PCR-based methods to detect foreign DNA with high reliability, sensitivity, and specificity include TaqMan real-time PCR, nested PCR, and internal threshold control PCR.

  14. Finding gene regulatory network candidates using the gene expression knowledge base.

    Science.gov (United States)

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  15. Model-based gene set analysis for Bioconductor.

    Science.gov (United States)

    Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

    2011-07-01

    Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. peter.robinson@charite.de; julien.gagneur@embl.de.

  16. A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes.

    Directory of Open Access Journals (Sweden)

    Samuel Sunghwan Cho

    Full Text Available Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method identifies genes simultaneously that are DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes, which are both DEGs and PAGs, than other methods. Simulation studies showed that they have more power than other methods

  17. A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes

    Science.gov (United States)

    Seo, Minseok; Shin, Su-kyung; Kwon, Eun-Young; Kim, Sung-Eun; Bae, Yun-Jung; Lee, Seungyeoun; Sung, Mi-Kyung; Choi, Myung-Sook; Park, Taesung

    2016-01-01

    Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method identifies genes simultaneously that are DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes, which are both DEGs and PAGs, than other methods. Simulation studies showed that they have more power than other methods. Through analysis of
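
    The two baseline strategies described in this abstract, the naïve intersection and the hierarchical (DEGs first, then PAGs) selection, might be sketched as follows. The sketch rests on assumptions not stated in the abstract: per-gene p-values are taken as given, a Bonferroni threshold is applied at each step, and all function names are hypothetical. It does not implement the authors' model-based method.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Genes whose p-value passes a Bonferroni threshold over the tested set."""
    m = len(p_values)
    if m == 0:
        return set()
    return {gene for gene, p in p_values.items() if p < alpha / m}


def naive_joint(deg_p, pag_p, alpha=0.05):
    """Naive approach: detect DEGs and PAGs separately, then intersect."""
    return bonferroni_hits(deg_p, alpha) & bonferroni_hits(pag_p, alpha)


def hierarchical_joint(deg_p, pag_p, alpha=0.05):
    """Hierarchical approach: detect DEGs first, then test PAGs among them.

    The second step corrects only for the genes that survived the first
    step (an assumption made for this sketch).
    """
    degs = bonferroni_hits(deg_p, alpha)
    restricted = {gene: pag_p[gene] for gene in degs if gene in pag_p}
    return bonferroni_hits(restricted, alpha)


# Toy p-values for four genes (made-up numbers for illustration only).
deg_p = {"a": 0.001, "b": 0.002, "c": 0.5, "d": 0.9}
pag_p = {"a": 0.001, "b": 0.02, "c": 0.001, "d": 0.8}
naive = naive_joint(deg_p, pag_p)
hierarchical = hierarchical_joint(deg_p, pag_p)
```

    In this toy input the hierarchical variant keeps gene "b" because the second-step correction covers only two genes, while the naïve variant drops it. Both remain two-step procedures, which is the limitation the abstract's one-step model-based test is meant to address.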

  18. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract Background The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  19. Detection of Gene Interactions Based on Syntactic Relations

    Directory of Open Access Journals (Sweden)

    Mi-Young Kim

    2008-01-01

    Full Text Available Interactions between proteins and genes are considered essential in the description of biomolecular phenomena, and networks of interactions are applied in a systems biology approach. Recently, many studies have sought to extract information from biomolecular text using natural language processing technology. Previous studies have asserted that linguistic information is useful for improving the detection of gene interactions. In particular, syntactic relations among linguistic information are good for detecting gene interactions. However, previous systems give a reasonably good precision but poor recall. To improve recall without sacrificing precision, this paper proposes a three-phase method for detecting gene interactions based on syntactic relations. In the first phase, we retrieve syntactic encapsulation categories for each candidate agent and target. In the second phase, we construct a verb list that indicates the nature of the interaction between pairs of genes. In the last phase, we determine direction rules to detect which of two genes is the agent or target. Even without biomolecular knowledge, our method performs reasonably well using a small training dataset. While the first phase contributes to improving recall, the second and third phases contribute to improving precision. In the experimental results using ICML 05 Workshop on Learning Language in Logic (LLL05) data, our proposed method gave an F-measure of 67.2% for the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance.

  20. Construction of coffee transcriptome networks based on gene annotation semantics

    Directory of Open Access Journals (Sweden)

    Castillo Luis F.

    2012-12-01

    Full Text Available Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST) and Interproscan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained along the Wordnet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.

  1. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    Full Text Available The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on Model-View-Control architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  2. Analysis of regulatory networks constructed based on gene ...

    Indian Academy of Sciences (India)

    2013-12-09

    Dec 9, 2013 ... early diagnosis of complex diseases or cancer without obvious symptoms. [Gong J., Diao B., Yao G. J., ... expression levels of thousands of genes in a specific cell or tissue. Previous ..... base of the brain. It mainly controls the ...

  3. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
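
    For orientation, the unweighted baseline that the abstract contrasts LEGO with can be sketched as a one-sided Fisher (hypergeometric) test on set overlap. This is not the LEGO method: every gene counts equally here, which is exactly the limitation the abstract describes. The function name and argument names are illustrative assumptions.

```python
from math import comb


def ora_p_value(universe_size, set_size, list_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe_size: number of genes in the background
    set_size:      number of background genes annotated to the gene set
    list_size:     number of genes in the query list
    overlap:       number of query genes that fall in the gene set
    """
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

    A call such as ora_p_value(20000, 150, 300, 12) would give the chance of seeing 12 or more of 150 pathway genes in a 300-gene list drawn at random from 20000 genes; a network-weighted method would replace the equal per-gene counts with weights.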

  4. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Science.gov (United States)

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying a phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build a classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms classification models built with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
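
    The abstract names conditional mutual information but gives neither the estimator nor the decision rule. As a minimal sketch, assuming expression values have already been discretized (for example into low/high), the plug-in estimate of I(A; phenotype | B) could be computed as below; the function name and the discretization are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences.

    x, y, z are equal-length sequences of hashable labels, e.g. a
    discretized gene A, the phenotype, and a discretized gene B.
    """
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], with counts
        # substituted for probabilities (the factors of n cancel).
        ratio = count * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (count / n) * log2(ratio)
    return cmi
```

    In a toy case where gene A tracks the phenotype only in samples with B = 1, the estimate is positive even though A alone may look uninformative, which is the kind of dependency the pair (A,B) is meant to capture.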

  5. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Science.gov (United States)

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential equation description of transcriptional dynamics. The sequence features of the promoter are used to derive the binding affinity on the basis of statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields results that are more biologically meaningful.

  6. Minimal gene selection for classification and diagnosis prediction based on gene expression profile

    Directory of Open Access Journals (Sweden)

    Alireza Mehridehnavi

    2013-01-01

    Conclusion: We have shown that using the two most significant genes, ranked by their S/N ratios, together with a suitable selection of training samples can classify DLBCL patients with rather good results. With the aid of these methods, we could compensate for the small number of patients, improve classification accuracy, and reduce computational complexity and thus running time.

  7. Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1999-01-01

    Base sequence analysis of the CCKAR gene (the gene for the A-type receptor for cholecystokinin) from the OLETF rat, a model rat for insulin-independent diabetes, was made based on the base sequence of the wild-type CCKAR gene, which had been clarified in the previous year. DNA was extracted from the pancreas of the OLETF rat and, after fragmentation, transduced into λ phage to construct the OLETF gene library. Then, a λ phage DNA clone bound to labelled cDNA of the CCKAR gene was analyzed and the gene structure was compared with that of the wild-type gene. It was demonstrated that the CCKAR gene of OLETF had a deletion (6800 bp) ranging from the promoter region to exon 2, suggesting that the CCKAR gene is not functional in the OLETF rat. The whole sequence of this mutant gene was registered in the Japan DNA Bank (D 50610). Then, F2 offspring rats were obtained by crossing OLETF (female) and F344 (male), and the time-course changes in the blood glucose level after glucose loading were compared among them. The blood glucose level after glucose loading was significantly higher in the homo-mutant F2 (CCKAR,-/-) as well as the parent OLETF rat than in the hetero-mutant F2 (CCKAR,-/+) or the wild-type rat (CCKAR,+/+). This suggests that the CCKAR gene might be involved in the control of blood glucose level and that an alteration of the expression level or the functions of the CCKAR gene might affect the blood glucose level. (M.N.)

  8. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
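
    The mutual-nearest-neighbor idea can be illustrated with a small sketch. It covers only the first step described in the abstract (building the interaction network); the clique-finding and merging steps of NNN are not shown. Euclidean distance, the neighborhood size k and all names are assumptions chosen for illustration, and the published method may use a different similarity measure.

```python
from math import dist


def k_nearest(profiles, gene, k):
    """Return the k genes closest to `gene` (ties broken by name)."""
    others = [g for g in profiles if g != gene]
    others.sort(key=lambda g: (dist(profiles[gene], profiles[g]), g))
    return set(others[:k])


def mutual_nn_edges(profiles, k):
    """Edges between genes that appear in each other's k-nearest lists.

    profiles maps a gene name to its expression vector. A gene whose
    neighbors do not reciprocate gets no edge and stays unclustered.
    """
    neighbors = {g: k_nearest(profiles, g, k) for g in profiles}
    return {
        frozenset((a, b))
        for a in profiles
        for b in neighbors[a]
        if a in neighbors[b]
    }
```

    Because an edge requires reciprocity, an outlying gene always has a nearest neighbor but is still left without edges when that neighbor has closer partners of its own.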

  9. Comparison of lists of genes based on functional profiles

    Directory of Open Access Journals (Sweden)

    Salicrú Miquel

    2011-10-01

    Full Text Available Abstract Background How to compare studies on the basis of their biological significance is a problem of central importance in high-throughput genomics. Many methods for performing such comparisons are based on the information in databases of functional annotation, such as those that form the Gene Ontology (GO). Typically, they consist of analyzing gene annotation frequencies in some pre-specified GO classes, in a class-by-class way, followed by p-value adjustment for multiple testing. Enrichment analysis, where a list of genes is compared against a wider universe of genes, is the most common example. Results A new global testing procedure and a method incorporating it are presented. Instead of testing separately for each GO class, a single global test for all classes under consideration is performed. The test is based on the distance between the functional profiles, defined as the joint frequencies of annotation in a given set of GO classes. These classes may be chosen at one or more GO levels. The new global test is more powerful and accurate with respect to type I errors than the usual class-by-class approach. When applied to some real datasets, the results suggest that the method may also provide useful information that complements the tests performed using a class-by-class approach if gene counts are sparse in some classes. An R library, goProfiles, implements these methods and is available from Bioconductor, http://bioconductor.org/packages/release/bioc/html/goProfiles.html. Conclusions The method provides an inferential basis for deciding whether two lists are functionally different. For global comparisons it is preferable to the global chi-square test of homogeneity. Furthermore, it may provide additional information if used in conjunction with class-by-class methods.

  10. Global Regulatory Differences for Gene- and Cell-Based Therapies

    DEFF Research Database (Denmark)

    Coppens, Delphi G M; De Bruin, Marie L; Leufkens, Hubert G M

    2017-01-01

    Gene- and cell-based therapies (GCTs) offer potential new treatment options for unmet medical needs. However, the use of conventional regulatory requirements for medicinal products to approve GCTs may impede patient access and therapeutic innovation. Furthermore, requirements differ between...... jurisdictions, complicating the global regulatory landscape. We provide a comparative overview of regulatory requirements for GCT approval in five jurisdictions and hypothesize on the consequences of the observed global differences on patient access and therapeutic innovation....

  11. Canonical correlation analysis for gene-based pleiotropy discovery.

    Directory of Open Access Journals (Sweden)

    Jose A Seoane

    2014-10-01

    Full Text Available Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.

  12. Sequence-based model of gap gene regulatory network.

    Science.gov (United States)

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of the genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3

  13. A Fisheye Viewer for microarray-based gene expression data.

    Science.gov (United States)

    Wu, Min; Thao, Cheng; Mu, Xiangming; Munson, Ethan V

    2006-10-13

    Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface--an electronic table (E-table) that uses fisheye distortion technology. The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.

  14. A fisheye viewer for microarray-based gene expression data

    Directory of Open Access Journals (Sweden)

    Munson Ethan V

    2006-10-01

    Full Text Available Abstract Background Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface – an electronic table (E-table) that uses fisheye distortion technology. Results The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. Conclusion This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.

  15. A modular positive feedback-based gene amplifier

    Directory of Open Access Journals (Sweden)

    Bhalerao Kaustubh D

    2010-02-01

    Full Text Available Abstract Background Positive feedback is a common mechanism used in the regulation of many gene circuits as it can amplify the response to inducers and also generate binary outputs and hysteresis. In the context of electrical circuit design, positive feedback is often considered in the design of amplifiers. Similar approaches, therefore, may be used for the design of amplifiers in synthetic gene circuits with applications, for example, in cell-based sensors. Results We developed a modular positive feedback circuit that can function as a genetic signal amplifier, heightening the sensitivity to inducer signals as well as increasing maximum expression levels without the need for an external cofactor. The design utilizes a constitutively active, autoinducer-independent variant of the quorum-sensing regulator LuxR. We experimentally tested the ability of the positive feedback module to separately amplify the output of a one-component tetracycline sensor and a two-component aspartate sensor. In each case, the positive feedback module amplified the response to the respective inducers, both with regards to the dynamic range and sensitivity. Conclusions The advantage of our design is that the actual feedback mechanism depends only on a single gene and does not require any other modulation. Furthermore, this circuit can amplify any transcriptional signal, not just one encoded within the circuit or tuned by an external inducer. As our design is modular, it can potentially be used as a component in the design of more complex synthetic gene circuits.

  16. Molecular typing of Staphylococcus aureus based on coagulase gene.

    Science.gov (United States)

    Javid, Faizan; Taku, Anil; Bhat, Mohd Altaf; Badroo, Gulzar Ahmad; Mudasir, Mir; Sofi, Tanveer Ahmad

    2018-04-01

    This study was conducted to study the coagulase gene-based genetic diversity of Staphylococcus aureus isolated from different samples of cattle, using restriction fragment length polymorphism (RFLP) and sequence-based phylogenetic analysis. A total of 192 different samples from mastitic milk, nasal cavity, and pus from skin wounds of cattle from Military Dairy Farm, Jammu, India, were screened for the presence of S. aureus. The presumptive isolates were confirmed by nuc gene-based polymerase chain reaction (PCR). The confirmed S. aureus isolates were subjected to coagulase (coa) gene PCR. The different coa genotypes observed were subjected to RFLP using the restriction enzymes HaeIII and AluI to obtain the different restriction patterns. One isolate from each restriction pattern was sequenced. These sequences were aligned for maximum homology using the BioEdit software, and similarity in the sequences was inferred with the help of a sequence identity matrix. Of 192 different samples, 39 (20.31%) isolates of S. aureus were confirmed by targeting the nuc gene using PCR. Of 39 S. aureus isolates, 25 (64.10%) isolates carried the coa gene. Four different genotypes of the coa gene, i.e., 514 bp, 595 bp, 757 bp, and 802 bp, were obtained. Two coa genotypes, 595 bp (15 isolates) and 802 bp (4 isolates), were observed in mastitic milk. The 514 bp (2 isolates) and 757 bp (4 isolates) coa genotypes were observed from nasal cavity and pus from skin wounds, respectively. On RFLP using both restriction enzymes, four different restriction patterns, P1, P2, P3, and P4, were observed. On sequencing, four different sequences having unique restriction patterns were obtained. The most identical sequences, with a value of 0.810, were found between isolates S. aureus 514 (nasal cavity) and S. aureus 595 (mastitic milk); thus, they are the most closely related. The most distant sequences, with a value of 0.483, were found between the S. aureus 514 and S. aureus 802 isolates. The study, being localized

  17. Information dimension analysis of bacterial essential and nonessential genes based on chaos game representation

    International Nuclear Information System (INIS)

    Zhou, Qian; Yu, Yong-ming

    2014-01-01

    Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of the essential genes. Selecting features associated with gene essentiality is fundamental to predicting essential genes with computational techniques. We use fractal theory to make a comparative analysis of essential and nonessential genes in bacteria. The information dimensions of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations (CGRs). It is found that a weak positive linear correlation exists between information dimension and gene length. Moreover, for genes of similar length, the average information dimension of essential genes is larger than that of nonessential genes. This indicates that essential genes show less regularity and higher complexity than nonessential genes. Our results show that for bacteria with a similar number of essential genes and nonessential genes, the CGR information dimension is helpful for the classification of essential genes and nonessential genes. Therefore, the gene CGR information dimension is very probably a useful gene feature for a genetic algorithm predicting essential genes. (paper)
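
    As a rough sketch of the quantities involved, a chaos game representation maps a DNA sequence to points in the unit square, and an information dimension can be estimated as the slope of box-counting Shannon entropy against log(1/box size). The corner assignment, the grid levels and the least-squares fit below are assumptions for illustration; the abstract does not state the exact procedure used.

```python
from collections import Counter
from math import log

# Assumed corner layout; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Chaos game: move halfway from the current point to each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def box_entropy(points, boxes_per_side):
    """Shannon entropy (nats) of point counts on a square grid."""
    last = boxes_per_side - 1
    counts = Counter(
        (min(int(x * boxes_per_side), last), min(int(y * boxes_per_side), last))
        for x, y in points
    )
    n = len(points)
    return -sum((c / n) * log(c / n) for c in counts.values())


def information_dimension(sequence, levels=(1, 2, 3, 4)):
    """Least-squares slope of entropy versus log(1/eps), with eps = 2**-level."""
    points = cgr_points(sequence)
    xs = [level * log(2) for level in levels]
    ys = [box_entropy(points, 2 ** level) for level in levels]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

    Under this sketch a highly repetitive sequence gives a dimension near 0 and a sequence resembling uniform random bases gives a value near 2, which matches the abstract's reading of a larger dimension as lower regularity.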

  18. Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1998-01-01

    The gene structures of the CCK-A type receptor (CCKAR) in human, rat and mouse were investigated, aiming to clarify whether aberration of the gene is involved in the incidence of diabetes and cholecystitis. In this fiscal year, 1997, the normal structure of the gene and its accurate base sequence were analyzed using DNA fragments bound to 32P-labelled cDNA of human CCKAR, originating from a leucocyte gene library. This gene contained about 2.2 x 10^5 base pairs, and the base sequence was completely determined and registered in the Japan DNA data bank (D85606). In addition, the genome structures and base sequences of mouse and rat CCKAR were analyzed and registered (D 85605 and D 50608, respectively). The differences in the base sequence of CCKAR among the species were found in the promoter region and the intron regions, suggesting that there might be differences in splicing among species. (M.N.)

  19. Design of Knowledge Bases for Plant Gene Regulatory Networks.

    Science.gov (United States)

    Mukundi, Eric; Gomez-Cano, Fabio; Ouma, Wilberforce Zachary; Grotewold, Erich

    2017-01-01

    Developing a knowledge base that contains all the information necessary for the researcher studying gene regulation in a particular organism can be accomplished in four stages. This begins with defining the data scope. We describe here the necessary information and resources, and outline the methods for obtaining data. The second stage consists of designing the schema, which involves defining the entire arrangement of the database in a systematic plan. The third stage is the implementation, defined by actualization of the database by using software according to a predefined schema. The final stage is development, where the database is made available to users in a web-accessible system. The result is a knowledgebase that integrates all the information pertaining to gene regulation, and which is easily expandable and transferable.

  20. Gene ontology based transfer learning for protein subcellular localization

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-02-01

    Full Text Available Abstract Background Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology. Results In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for

  1. Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

    Directory of Open Access Journals (Sweden)

    Xiaobo Guo

    Full Text Available Nonlinear dependence is common in the regulation mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally effective in detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated datasets: benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operating characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference.
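
    To make the distance correlation concrete, here is a minimal sketch of the sample statistic for two one-dimensional expression vectors, following the standard double-centering definition. It shows only the pairwise dependence measure, not the CLR-DC, MRNET-DC or REL-DC network inference steps, and the function names are illustrative.

```python
from math import sqrt


def _centered_distances(values):
    """Pairwise |vi - vj| matrix, double-centered by row, column and grand mean."""
    n = len(values)
    d = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]  # equal to column means (symmetric)
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1] for equal-length numeric sequences."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    if dvar_x * dvar_y == 0:
        return 0.0  # a constant vector carries no dependence information
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))
```

    For x = [-2, -1, 0, 1, 2] and y equal to x squared, the Pearson correlation is zero while the distance correlation is clearly positive, which is the kind of nonlinear dependence the abstract has in mind. This version takes O(n^2) time and memory per gene pair.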

  2. A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer

    Directory of Open Access Journals (Sweden)

    Mary Qu Yang

    Full Text Available Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatic mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatic mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways. Keywords: ccRCC, Causative mutation, Pathways, Protein-protein interaction, Gene module, eQTL

  3. A comprehensive family-based replication study of schizophrenia genes

    DEFF Research Database (Denmark)

    Aberg, Karolina A; Liu, Youfang; Bukszár, Jozsef

    2013-01-01

     768 control subjects from 6 databases and, after quality control 6298 individuals (including 3286 cases) from 1811 nuclear families. MAIN OUTCOMES AND MEASURES Case-control status for SCZ. RESULTS Replication results showed a highly significant enrichment of SNPs with small P values. Of the SNPs...... in an independent family-based replication study that, after quality control, consisted of 8107 SNPs. SETTING Linkage meta-analysis, brain transcriptome meta-analysis, candidate gene database, OMIM, relevant mouse studies, and expression quantitative trait locus databases. PATIENTS We included 11 185 cases and 10...

  4. Gene-based Association Approach Identify Genes Across Stress Traits in Fruit Flies

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Edwards, Stefan McKinnon; Sarup, Pernille Merete

    Identification of genes explaining variation in quantitative traits or genetic risk factors of human diseases requires both good phenotypic and genotypic data, but also efficient statistical methods. Genome-wide association studies may reveal association between phenotypic variation and variation...... approach grouping variants according to gene position, thus lowering the number of statistical tests performed and increasing the probability of identifying genes with small to moderate effects. Using this approach we identify numerous genes associated with different types of stresses in Drosophila...... melanogaster, but also identify common genes that affect the stress traits....

  5. Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis.

    Directory of Open Access Journals (Sweden)

    Hong Zhu

    Full Text Available Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interaction analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected 'highly verified' genes were measured by ELISA among our in-house RA cases and controls. A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 'overlapped', 76 'European-specific' and 74 'Asian-specific' genes. Among them, 105 genes had significant differential expression between RA patients and healthy controls in at least one dataset, especially 20 genes including 11 'overlapped' (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 'European-specific' (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 'Asian-specific' (RNASET2, HFE, BTN2A2, MAPK13) genes whose differential expression was significant in at least three datasets. The protein expression of two selected genes, FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02), in plasma was significantly different in our in-house samples. Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes on RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and detected a long list of overlapped or ethnic specific RA

  6. Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

    Science.gov (United States)

    Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

    2016-06-01

    Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a single framework to enhance interpretation and hypothesis generation. We propose a workflow that merges network construction with gene expression data mining focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
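
    As an illustration of the random walk with restart (RWR) step named in the record above, the sketch below scores genes on a single toy network by their proximity to known disease genes. It is a generic textbook formulation, not the authors' implementation; the function name, the restart probability of 0.3 and the five-gene path network are assumptions for illustration.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0.0] = 1.0      # avoid division by zero for isolated nodes
    w = adj / col_sums                   # column-normalized transition matrix
    p0 = np.zeros(n)
    p0[list(seeds)] = 1.0 / len(seeds)   # start uniformly on the known disease genes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (w @ p) + restart * p0
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p


# Hypothetical functional network: a path 0 - 1 - 2 - 3 - 4, with gene 0 as the seed.
adjacency = np.zeros((5, 5))
for i in range(4):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

scores = random_walk_with_restart(adjacency, seeds=[0])
# Rank unannotated candidates (all non-seed genes) by decreasing score.
candidates = [g for g in np.argsort(-scores) if g != 0]
```

    Under these assumptions, genes closer to the seed receive higher scores, which is the guilt-by-association intuition. Weighted network integration would combine several adjacency matrices into one before this step; the weights themselves are not given in the record.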

  8. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo

    2014-01-01

    Objective In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further

  9. Statistics on gene-based laser speckles with a small number of scatterers: implications for the detection of polymorphism in the Chlamydia trachomatis omp1 gene

    Science.gov (United States)

    Ulyanov, Sergey S.; Ulianova, Onega V.; Zaytsev, Sergey S.; Saltykov, Yury V.; Feodorova, Valentina A.

    2018-04-01

    The transformation mechanism for a nucleotide sequence of the Chlamydia trachomatis gene into a speckle pattern has been considered. The first and second-order statistics of gene-based speckles have been analyzed. It has been demonstrated that gene-based speckles do not obey Gaussian statistics and belong to the class of speckles with a small number of scatterers. It has been shown that gene polymorphism can be easily detected through analysis of the statistical characteristics of gene-based speckles.

  10. Optimization of conditions for gene delivery system based on PEI

    Directory of Open Access Journals (Sweden)

    Roya Cheraghi

    2017-01-01

    Full Text Available Objective(s: PEI based nanoparticle (NP due to dual capabilities of proton sponge and DNA binding is known as powerful tool for nucleic acid delivery to cells. However, serious cytotoxicity and complicated conditions, which govern NPs properties and its interactions with cells practically, hindered achievement to high transfection efficiency. Here, we have tried to optimize the properties of PEI/ firefly luciferase plasmid complexes and cellular condition to improve transfection efficiency. Materials and Methods: For this purpose, firefly luciferase, as a robust gene reporter, was complexed with PEI to prepare NPs with different size and charge. The physicochemical properties of nanoparticles were evaluated using agarose gel retardation and dynamic light scattering.  MCF7 and BT474 cells at different confluency were also transfected with prepared nanoparticles at various concentrations for short and long times. Results: The branched PEI can instantaneously bind to DNA and form cationic NPs. The results demonstrated the production of nanoparticles with size about 100-500 nm dependent on N/P ratio. Moreover, increase of nanoparticles concentration on the cell surface drastically improved the transfection rate, so at a concentration of 30 ng/ìl, the highest transfection efficiency was achieved. On the other side, at confluency between 40-60%, the maximum efficiency was obtained. The result demonstrated that N/P ratio of 12 could establish an optimized ratio between transfection efficiency and cytotoxicity of PEI/plasmid nanoparticles. The increase of NPs N/P ratio led to significant cytotoxicity. Conclusion: Obtained results verified the optimum conditions for PEI based gene delivery in different cell lines.

  11. Density based pruning for identification of differentially expressed genes from microarray data

    Directory of Open Access Journals (Sweden)

    Xu Jia

    2010-11-01

    Full Text Available Abstract Motivation Identification of differentially expressed genes from microarray datasets is one of the most important analyses for microarray data mining. Popular algorithms such as the statistical t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by considering other features of differentially expressed genes. Results We proposed a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of average difference of gene expression and average expression levels. A density based pruning algorithm (DB Pruning) is developed to screen out potential differentially expressed genes usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus database (GEO) with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as t-test, rank product, and fold change. Conclusions Density based pruning of non-differentially expressed genes is an effective method for enhancing statistical testing based algorithms for identifying differentially expressed genes. It improves t-test, rank product, and fold change by 11% to 50% in the numbers of identified true differentially expressed genes. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
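
    The two-dimensional feature space described in the record above can be sketched as follows. This is not the DB Pruning algorithm itself, whose details are not given in the abstract; it is a hedged illustration that assumes a k-nearest-neighbour distance as the density estimate and keeps a fixed fraction of the sparsest genes. The function name, `k`, `keep_fraction` and the simulated data are all hypothetical.

```python
import numpy as np


def sparse_region_candidates(group_a, group_b, k=5, keep_fraction=0.1):
    """Indices of genes in the sparsest part of the (difference, mean level) plane.

    group_a, group_b: arrays of shape (genes, samples), one per condition.
    """
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    mean_a = group_a.mean(axis=1)
    mean_b = group_b.mean(axis=1)
    # Feature 1: average expression difference; feature 2: average expression level.
    features = np.column_stack([mean_a - mean_b, (mean_a + mean_b) / 2.0])
    features = (features - features.mean(axis=0)) / features.std(axis=0)
    # Pairwise Euclidean distances (quadratic memory, acceptable for a toy example).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Distance to the k-th nearest neighbour; column 0 after sorting is the gene itself.
    kth = np.sort(dist, axis=1)[:, k]
    n_keep = max(1, int(round(keep_fraction * len(kth))))
    # A large k-th neighbour distance means low local density.
    return np.argsort(kth)[::-1][:n_keep]


# Simulated data: 200 unchanged genes plus 5 genes shifted in condition A.
rng = np.random.default_rng(0)
base = rng.normal(8.0, 1.0, size=(205, 1))
cond_a = base + rng.normal(0.0, 0.1, size=(205, 4))
cond_b = base + rng.normal(0.0, 0.1, size=(205, 4))
shifted = [200, 201, 202, 203, 204]
cond_a[shifted] += 3.0

kept = sparse_region_candidates(cond_a, cond_b)
```

    In this simulation the shifted genes sit far from the dense cloud of unchanged genes, so they survive the pruning; a downstream test such as a t-test would then be applied only to the retained genes.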

  12. Identification of Constrained Cancer Driver Genes Based on Mutation Timing

    Science.gov (United States)

    Sakoparnig, Thomas; Fried, Patrick; Beerenwinkel, Niko

    2015-01-01

    Cancer drivers are genomic alterations that provide cells containing them with a selective advantage over their local competitors, whereas neutral passengers do not change the somatic fitness of cells. Cancer-driving mutations are usually discriminated from passenger mutations by their higher degree of recurrence in tumor samples. However, there is increasing evidence that many additional driver mutations may exist that occur at very low frequencies among tumors. This observation has prompted alternative methods for driver detection, including finding groups of mutually exclusive mutations and incorporating prior biological knowledge about gene function or network structure. Dependencies among drivers due to epistatic interactions can also result in low mutation frequencies, but this effect has been ignored in driver detection so far. Here, we present a new computational approach for identifying genomic alterations that occur at low frequencies because they depend on other events. Unlike passengers, these constrained mutations display punctuated patterns of occurrence in time. We test this driver–passenger discrimination approach based on mutation timing in extensive simulation studies, and we apply it to cross-sectional copy number alteration (CNA) data from ovarian cancer, CNA and single-nucleotide variant (SNV) data from breast tumors and SNV data from colorectal cancer. Among the top ranked predicted drivers, we find low-frequency genes that have already been shown to be involved in carcinogenesis, as well as many new candidate drivers. The mutation timing approach is orthogonal and complementary to existing driver prediction methods. It will help identify, from cancer genome data, the alterations that drive tumor progression. PMID:25569148

  13. Gene-based interaction analysis shows GABAergic genes interacting with parenting in adolescent depressive symptoms

    NARCIS (Netherlands)

    Van Assche, Evelien; Moons, Tim; Cinar, Ozan; Viechtbauer, Wolfgang; Oldehinkel, Albertine J.; Van Leeuwen, Karla; Verschueren, Karine; Colpin, Hilde; Lambrechts, Diether; Van den Noortgate, Wim; Goossens, Luc; Claes, Stephan; van Winkel, Ruud

    2017-01-01

    BACKGROUND: Most gene-environment interaction studies (G × E) have focused on single candidate genes. This approach is criticized for its expectations of large effect sizes and occurrence of spurious results. We describe an approach that accounts for the polygenic nature of most psychiatric

  14. Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test.

    Science.gov (United States)

    Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph

    2013-11-07

    The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the
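
    The idea of scaling marker Z-statistics by their correlation matrix can be illustrated with a simple sum statistic. This is a generic sketch, not the summary-statistic test of the record above, whose exact form and misspecification correction are not given in the abstract. It relies only on the fact that, under the null, the sum of Z-statistics with correlation matrix R has variance equal to the sum of all entries of R. The function name and toy inputs are assumptions.

```python
import math

import numpy as np


def ld_scaled_sum_test(z, corr):
    """Region-level statistic from per-marker Z-statistics and their correlation matrix."""
    z = np.asarray(z, dtype=float)
    corr = np.asarray(corr, dtype=float)
    variance = corr.sum()                 # Var(sum of Z) = 1' R 1 under the null
    stat = z.sum() / math.sqrt(variance)  # approximately N(0, 1) under the null
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # two-sided normal p-value
    return stat, p_value


# Hypothetical region with three markers, each with Z = 2.
z_scores = [2.0, 2.0, 2.0]
independent = np.eye(3)        # no linkage disequilibrium
perfect_ld = np.ones((3, 3))   # markers are fully redundant

stat_indep, p_indep = ld_scaled_sum_test(z_scores, independent)
stat_ld, p_ld = ld_scaled_sum_test(z_scores, perfect_ld)
```

    With fully redundant markers the regional statistic collapses to the single-marker value of 2, whereas treating the markers as independent would overstate the evidence. Only the Z-statistics and an estimate of the marker correlation are needed, not the original genotype data.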

  15. Screening key genes for abdominal aortic aneurysm based on gene expression omnibus dataset.

    Science.gov (United States)

    Wan, Li; Huang, Jingyong; Ni, Haizhen; Yu, Guanfeng

    2018-02-13

    Abdominal aortic aneurysm (AAA) is a common cardiovascular system disease with high mortality. The aim of this study was to identify potential genes for diagnosis and therapy in AAA. We searched and downloaded mRNA expression data from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) between AAA and normal individuals. Then, Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, a transcription factor (TF) network and a protein-protein interaction (PPI) network were used to explore the function of genes. Additionally, immunohistochemical (IHC) staining was used to validate the expression of identified genes. Finally, the diagnostic value of identified genes was assessed by receiver operating characteristic (ROC) analysis in the GEO database. A total of 1199 DEGs (188 up-regulated and 1011 down-regulated) were identified between AAA and normal individuals. KEGG pathway analysis showed that vascular smooth muscle contraction and pathways in cancer were significantly enriched signal pathways. The top 10 up-regulated and top 10 down-regulated DEGs were used to construct TF and PPI networks. Some genes with high degrees such as NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16 and FOXO1 were identified to be related to AAA. The results of IHC staining showed that CCR7 and PDGFA were up-regulated in tissue samples of AAA. ROC analysis showed that NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA had potential diagnostic value for AAA. The identified genes including NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA might be involved in the pathology of AAA.

  16. The Integrative Method Based on the Module-Network for Identifying Driver Genes in Cancer Subtypes

    Directory of Open Access Journals (Sweden)

    Xinguo Lu

    2018-01-01

    Full Text Available With advances in next-generation sequencing (NGS) technologies, a large number of multiple types of high-throughput genomics data are available. A great challenge in exploring cancer progression is to identify the driver genes from the variant genes by analyzing and integrating multiple types of genomics data. Breast cancer is known as a heterogeneous disease. The identification of subtype-specific driver genes is critical to guide the diagnosis, assessment of prognosis and treatment of breast cancer. We developed an integrated framework based on gene expression profiles and copy number variation (CNV) data to identify breast cancer subtype-specific driver genes. In this framework, we employed a statistical machine-learning method to select gene subsets and utilized a module-network analysis method to identify potential candidate driver genes. The final subtype-specific driver genes were acquired by pairwise comparison between subtypes. To validate the specificity of the driver genes, the gene expression data of these genes were applied to classify the patient samples with 10-fold cross validation, and enrichment analysis was also conducted on the identified driver genes. The experimental results show that the proposed integrative method can identify the potential driver genes and that the classifier with these genes achieved better performance than with genes identified by other methods.

  17. GeneRecon Users' Manual — A coalescent based tool for fine-scale association mapping

    DEFF Research Database (Denmark)

    Mailund, T

    2006-01-01

    GeneRecon is a software package for linkage disequilibrium mapping using coalescent theory. It is based on Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. GeneRecon explicitly models the genealogy of a sample of th...

  18. A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design

    Science.gov (United States)

    Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

    2013-01-01

    For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods. PMID:23620809

  19. A hybrid network-based method for the detection of disease-related genes

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

    2018-02-01

    Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.

  20. RNAi-Based Identification of Gene-Specific Nuclear Cofactor Networks Regulating Interleukin-1 Target Genes

    Directory of Open Access Journals (Sweden)

    Johanna Meier-Soelch

    2018-04-01

    Full Text Available The potent proinflammatory cytokine interleukin (IL)-1 triggers gene expression through the NF-κB signaling pathway. Here, we investigated the cofactor requirements of strongly regulated IL-1 target genes whose expression is impaired in p65 NF-κB-deficient murine embryonic fibroblasts. By two independent small-hairpin (sh)RNA screens, we examined 170 genes annotated to encode nuclear cofactors for their role in Cxcl2 mRNA expression and identified 22 factors that modulated basal or IL-1-inducible Cxcl2 levels. The functions of 16 of these factors were validated for Cxcl2 and further analyzed for their role in regulation of 10 additional IL-1 target genes by RT-qPCR. These data reveal that each inducible gene has its own (quantitative) requirement of cofactors to maintain basal levels and to respond to IL-1. Twelve factors (Epc1, H2afz, Kdm2b, Kdm6a, Mbd3, Mta2, Phf21a, Ruvbl1, Sin3b, Suv420h1, Taf1, and Ube3a) have not been previously implicated in inflammatory cytokine functions. Bioinformatics analysis indicates that they are components of complex nuclear protein networks that regulate chromatin functions and gene transcription. Collectively, these data suggest that downstream from the essential NF-κB signal each cytokine-inducible target gene has further subtle requirements for individual sets of nuclear cofactors that shape its transcriptional activation profile.

  1. A new measure for functional similarity of gene products based on Gene Ontology

    Directory of Open Access Journals (Sweden)

    Lengauer Thomas

    2006-06-01

    Full Text Available Abstract Background Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

  2. TXTGate: profiling gene groups with text-based information

    DEFF Research Database (Denmark)

    Glenisson, P.; Coessens, B.; Van Vooren, S.

    2004-01-01

    We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed towards the analysis of groups of genes. By means of tailored vocabularies, term- as well as gene-centric views are offered on selected textual...

  3. Targeted delivery of genes to endothelial cells and cell- and gene-based therapy in pulmonary vascular diseases.

    Science.gov (United States)

    Suen, Colin M; Mei, Shirley H J; Kugathasan, Lakshmi; Stewart, Duncan J

    2013-10-01

    Pulmonary arterial hypertension (PAH) is a devastating disease that, despite significant advances in medical therapies over the last several decades, continues to have an extremely poor prognosis. Gene therapy is a method to deliver therapeutic genes to replace defective or mutant genes or supplement existing cellular processes to modify disease. Over the last few decades, several viral and nonviral methods of gene therapy have been developed for preclinical PAH studies with varying degrees of efficacy. However, these gene delivery methods face challenges of immunogenicity, low transduction rates, and nonspecific targeting which have limited their translation to clinical studies. More recently, the emergence of regenerative approaches using stem and progenitor cells such as endothelial progenitor cells (EPCs) and mesenchymal stem cells (MSCs) have offered a new approach to gene therapy. Cell-based gene therapy is an approach that augments the therapeutic potential of EPCs and MSCs and may deliver on the promise of reversal of established PAH. These new regenerative approaches have shown tremendous potential in preclinical studies; however, large, rigorously designed clinical studies will be necessary to evaluate clinical efficacy and safety. © 2013 American Physiological Society. Compr Physiol 3:1749-1779, 2013.

  4. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

    Science.gov (United States)

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-03-01

    comprehensive gene data set of sex pheromone biosynthesis and degradation enzyme related genes in DBM created by genome- and transcriptome-wide identification, characterization and expression profiling. Our findings provide a basis to better understand the function of genes with tissue enriched expression. The results also provide information on the genes involved in sex pheromone biosynthesis and degradation, and may be useful to identify potential gene targets for pest control strategies by disrupting the insect-insect communication using pheromone-based behavioral antagonists.

  5. GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies.

    Science.gov (United States)

    Yung, Ling Sing; Yang, Can; Wan, Xiang; Yu, Weichuan

    2011-05-01

    Collecting millions of genetic variations is feasible with the advanced genotyping technology. With a huge amount of genetic variations data in hand, developing efficient algorithms to carry out the gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphic processing units (GPUs) are highly parallel hardware and provide massive computing resources. We are, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions. We implement the BOOST method based on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with Nvidia GeForce GTX 285 display card. GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.
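
    The Boolean-representation idea behind BOOST can be illustrated with a small sketch. It assumes (this is not taken from the GBOOST code) that each SNP is stored as three bitmasks, one per genotype value, so that the 3 x 3 genotype contingency table of a SNP pair reduces to bitwise AND plus a population count. The function names and the toy genotypes are hypothetical.

```python
# Illustrative sketch only: bitmask encoding of genotypes, not the GBOOST implementation.

def encode_snp(genotypes):
    """Return three integer bitmasks, one per genotype value (0, 1, 2).

    Bit i of masks[g] is set when sample i carries genotype g.
    """
    masks = [0, 0, 0]
    for sample_index, genotype in enumerate(genotypes):
        masks[genotype] |= 1 << sample_index
    return masks


def pair_table(masks_a, masks_b):
    """Build the 3 x 3 contingency table of two SNPs with AND + popcount."""
    return [[bin(ma & mb).count("1") for mb in masks_b] for ma in masks_a]


# Hypothetical genotypes for six samples at two SNPs.
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 0, 2, 1, 1, 2]
table = pair_table(encode_snp(snp_a), encode_snp(snp_b))
print(table)
```

    Because every cell is computed independently from a pair of masks, the same operation maps naturally onto many parallel threads, which is consistent with the motivation for a GPU port given in the abstract.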

  6. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  7. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
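
    The notion of a delayed correlation between two genes can be sketched as follows. This is a minimal illustration that assumes a plain Pearson correlation between one profile and a time-shifted copy of the other; the TdGRN toolbox itself and its decision analysis are not reproduced, and the function names and expression values are hypothetical.

```python
# Illustrative sketch only: lagged Pearson correlation, not the TdGRN toolbox.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def delayed_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlapping window."""
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return the lag in 0..max_lag with the highest delayed correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: delayed_correlation(regulator, target, lag))


# Hypothetical profiles: the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 6, 3, 1]
target = [0, 0, 1, 3, 2, 5, 4, 6]
print(best_lag(regulator, target, max_lag=3))
```

    With real data the overlapping window shrinks as the lag grows, so correlations at large lags rest on fewer time points and deserve more caution.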

  8. Candidate genes and pathogenesis investigation for sepsis-related acute respiratory distress syndrome based on gene expression profile.

    Science.gov (United States)

    Wang, Min; Yan, Jingjun; He, Xingxing; Zhong, Qiang; Zhan, Chengye; Li, Shusheng

    2016-04-18

    Acute respiratory distress syndrome (ARDS) is a potentially devastating form of acute inflammatory lung injury as well as a major cause of acute respiratory failure. Although researchers have made significant progress in elucidating the pathophysiology of this complex syndrome over the years, the absence of a universally accepted, detailed disease mechanism has led to a series of practical problems for a definitive treatment. This study aimed to predict genes or pathways associated with sepsis-related ARDS based on a public microarray dataset and to further explore the molecular mechanism of ARDS. A total of 122 up-regulated and 91 down-regulated differentially expressed genes (DEGs) were obtained. The up- and down-regulated DEGs were mainly involved in functions such as mitotic cell cycle and pathways such as cell cycle. Protein-protein interaction network analysis of ARDS revealed 20 hub genes including cyclin B1 (CCNB1), cyclin B2 (CCNB2) and topoisomerase II alpha (TOP2A). A total of seven transcription factors including forkhead box protein M1 (FOXM1) and 30 target genes were revealed in the transcription factor-target gene regulation network. Furthermore, co-cited genes including CCNB2-CCNB1 were revealed by literature mining for relations among ARDS-related genes. Pathways such as mitotic cell cycle were closely related to the development of ARDS. Genes including CCNB1, CCNB2 and TOP2A, as well as transcription factors such as FOXM1, might be used as novel gene therapy targets for sepsis-related ARDS.

  9. Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile.

    Science.gov (United States)

    Lin, Zhe; Lin, Yongsheng

    2017-09-05

    The aim of this study was to explore potential crucial genes associated with steroid-induced necrosis of femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. The gene expression profile GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples, was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using the LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and Cytoscape. In total, 59 up-regulated DEGs and 156 down-regulated DEGs were identified. The up-regulated DEGs were mainly involved in functions related to immunity (e.g. Fcer1A and Il7R), and the down-regulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 outstanding genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 down-regulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions such as muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R) and the DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progress of SINFH, which still needs to be confirmed by experiments. Copyright © 2017 Elsevier B.V. All rights reserved.
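
    Degree centrality, one of the two rankings mentioned above, can be sketched on a toy network. The three mutual interactions among Tnni2, Mylpf and Myl1 follow the abstract; the fourth edge and the node "GeneX" are hypothetical, and the study's actual 123-node network is not reproduced here.

```python
# Illustrative sketch only: normalized degree centrality on a toy edge list.
from collections import defaultdict


def degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


edges = [
    ("Tnni2", "Mylpf"),
    ("Tnni2", "Myl1"),
    ("Mylpf", "Myl1"),
    ("Tnni2", "GeneX"),  # hypothetical interaction, for illustration only
]
centrality = degree_centrality(edges)
ranking = sorted(centrality, key=centrality.get, reverse=True)
print(ranking[0], centrality[ranking[0]])
```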

  10. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Josephine Erhiakporeh

    2016-07-06

    Jul 6, 2016 ... candidate genes for drought tolerance in sesame. (Sesamum ... Our results provided genomic resources for further functional analysis and genetic engineering .... reverse transcribed using the Reverse Transcription System.

  11. The progress of PET based reporter gene imaging

    International Nuclear Information System (INIS)

    Zhao Wei; Zhang Xiuli

    2005-01-01

    More than two decades of intense research have allowed gene therapy to move from the laboratory to the clinical setting, where its use for the treatment of human pathologies has been considerably increased in the last years. However, many crucial questions remain to be solved in this challenging field. In vivo imaging with positron emission tomography (PET) by combination of the appropriate PET reporter gene and PET reporter probe could provide invaluable qualitative and quantitative information to answer multiple unsolved questions about gene therapy. PET imaging could be used to define parameters not available by other techniques that are of substantial interest not only for the proper understanding of the gene therapy process, but also for its future development and clinical application in humans. (authors)

  12. Embryo quality predictive models based on cumulus cells gene expression

    Directory of Open Access Journals (Sweden)

    Devjak R

    2016-06-01

    Full Text Available Since the introduction of in vitro fertilization (IVF) in clinical practice of infertility treatment, the indicators for high quality embryos have been investigated. Cumulus cells (CC) have a specific gene expression profile according to the developmental potential of the oocyte they are surrounding, and therefore, specific gene expression could be used as a biomarker. The aim of our study was to combine more than one biomarker to observe improvement in prediction value of embryo development. In this study, 58 CC samples from 17 IVF patients were analyzed. This study was approved by the Republic of Slovenia National Medical Ethics Committee. Gene expression analysis [quantitative real-time polymerase chain reaction (qPCR)] for five genes, analyzed according to embryo quality level, was performed. Two prediction models were tested for embryo quality prediction: a binary logistic and a decision tree model. As the main outcome, gene expression levels for five genes were taken and the area under the curve (AUC) for two prediction models was calculated. Among tested genes, AMHR2 and LIF showed significant expression difference between high quality and low quality embryos. These two genes were used for the construction of two prediction models: the binary logistic model yielded an AUC of 0.72 ± 0.08 and the decision tree model yielded an AUC of 0.73 ± 0.03. Two different prediction models yielded similar predictive power to differentiate high and low quality embryos. In terms of eventual clinical decision making, the decision tree model resulted in easy-to-interpret rules that are highly applicable in clinical practice.
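
    For readers unfamiliar with the AUC figures quoted above, the quantity can be sketched as the probability that a randomly chosen high quality embryo receives a higher model score than a randomly chosen low quality one. The scores below are hypothetical and are not taken from the study; the function name is likewise an assumption.

```python
# Illustrative sketch only: AUC as a pairwise ranking probability.

def auc(scores_positive, scores_negative):
    """Fraction of positive/negative pairs ranked correctly; ties count one half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_positive
        for n in scores_negative
    )
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical model scores for high and low quality embryos.
high_quality = [0.9, 0.8, 0.4]
low_quality = [0.7, 0.3, 0.2]
print(round(auc(high_quality, low_quality), 3))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which places the reported values of about 0.72 to 0.73 in the moderate range.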

  13. A sight on the current nanoparticle-based gene delivery vectors

    Science.gov (United States)

    Dizaj, Solmaz Maleki; Jafari, Samira; Khosroushahi, Ahmad Yari

    2014-05-01

    Nowadays, gene delivery for therapeutic purposes is considered one of the most promising strategies to cure both genetic and acquired human diseases. The design of efficient gene delivery vectors with high transfection efficiency and low cytotoxicity is considered the major challenge in delivering a target gene to specific tissues or cells. On this basis, investigations of non-viral gene vectors able to overcome physiological barriers are increasing. Among the non-viral vectors, nanoparticles have shown remarkable properties for gene delivery, such as the ability to target specific tissues or cells, protect the target gene against nuclease degradation, improve DNA stability, and increase transformation efficiency or safety. This review attempts to present current nanoparticles grouped by their lipid, polymer, hybrid, and inorganic properties. Among them, hybrids, as efficient vectors, are utilized in gene delivery in terms of materials (synthetic or natural), design, and in vitro/in vivo transformation efficiency.

  14. Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

    Science.gov (United States)

    Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

    2017-10-01

    Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 ( www.comparativego.com ). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.

  15. A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

    Directory of Open Access Journals (Sweden)

    Qiang Su

    2017-01-01

    Full Text Available Background. To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects distinguished genes first using the K-S test, and then, it uses CFS to select genes from those selected by the K-S test. Results. We adopted support vector machines (SVM) as the classification tool and used the criteria of accuracy to evaluate the performance of the classifiers on the selected gene subsets. This approach compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms for 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than those of the K-S test, CFS, mRMR, and ReliefF algorithms. Conclusions. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
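
    The first filtering stage described above can be sketched with a two-sample K-S statistic computed per gene. This is a minimal illustration under stated assumptions: it ranks genes by the raw statistic, omits p-values and the subsequent CFS stage, and uses hypothetical gene names and expression values.

```python
# Illustrative sketch only: per-gene two-sample K-S statistic, without the CFS stage.

def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of the two samples."""
    largest_gap = 0.0
    for value in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for x in sample_a if x <= value) / len(sample_a)
        cdf_b = sum(1 for x in sample_b if x <= value) / len(sample_b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical expression values: (tumor samples, normal samples) per gene.
expression = {
    "gene_1": ([5.1, 5.3, 5.8, 6.0], [2.0, 2.2, 2.5, 2.1]),
    "gene_2": ([3.0, 3.2, 2.9, 3.1], [3.0, 3.1, 2.8, 3.3]),
}
scores = {gene: ks_statistic(tumor, normal)
          for gene, (tumor, normal) in expression.items()}
ranked_genes = sorted(scores, key=scores.get, reverse=True)
print(ranked_genes, scores)
```

    A gene whose two class distributions do not overlap reaches the maximum statistic of 1, so it is kept for the second stage, while a gene with nearly identical distributions scores close to 0.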

  16. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of

  17. Gene mutation-based and specific therapies in precision medicine.

    Science.gov (United States)

    Wang, Xiangdong

    2016-04-01

    Precision medicine has been initiated and is gaining more and more attention from preclinical and clinical scientists. A number of key elements or critical parts in precision medicine have been described and emphasized to establish a systems understanding of precision medicine. The principle of precision medicine is to treat patients on the basis of genetic alterations after gene mutations are identified, although questions and challenges still remain before clinical application. Therapeutic strategies of precision medicine should be considered according to gene mutation, after biological and functional mechanisms of mutated gene expression or epigenetics, or the correspondent protein, are clearly validated. It is time to explore and develop a strategy to target and correct mutated genes by direct elimination, restoration, correction or repair of mutated sequences/genes. Nevertheless, there are still numerous challenges to integrating widespread genomic testing into individual cancer therapies and into decision making for one or another treatment. There are wide-ranging and complex issues to be solved before precision medicine becomes clinical reality. Thus, precision medicine can be considered an extension and part of clinical and translational medicine and a new alternative among clinical therapies and strategies, and it can have an important impact on disease cures and patient prognoses. © 2015 The Author. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  18. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  19. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available Abstract Background Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
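
    Normalized mutual information as a relevance measure can be sketched as follows. The abstract does not state which normalization the authors use, so the sketch assumes one common choice, I(X;Y) / sqrt(H(X) H(Y)), applied to discretized expression levels; the gene profiles and class labels are hypothetical.

```python
# Illustrative sketch only: one common normalization of mutual information.
from collections import Counter
from math import log2, sqrt


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((count / n) * log2(count / n) for count in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """Mutual information scaled to [0, 1]; 0.0 if either variable is constant."""
    h_x, h_y = entropy(xs), entropy(ys)
    if h_x == 0 or h_y == 0:
        return 0.0
    return mutual_information(xs, ys) / sqrt(h_x * h_y)


# Hypothetical discretized expression levels and cancer class labels.
labels = ["A", "A", "B", "B"]
informative_gene = ["hi", "hi", "lo", "lo"]
uninformative_gene = ["hi", "lo", "hi", "lo"]
print(normalized_mi(informative_gene, labels),
      normalized_mi(uninformative_gene, labels))
```

    The same function applied to a pair of genes instead of a gene and the labels gives a redundancy score, so a selection loop can favor genes with high relevance and low redundancy against those already chosen.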

  20. LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

    Science.gov (United States)

    Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

    2012-01-01

    Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene

  1. Comparative study on gene set and pathway topology-based enrichment methods.

    Science.gov (United States)

    Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

    2015-10-22

    Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
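
    The traditional view of a pathway as a simple gene list can be illustrated with an over-representation test. The abstract does not name the three gene set methods that were compared, so the sketch below assumes the generic hypergeometric tail test as one representative; the function name and the counts are hypothetical.

```python
# Illustrative sketch only: hypergeometric over-representation test for one pathway.
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random.

    n_universe: all genes measured; n_pathway: genes annotated to the pathway;
    n_selected: differentially expressed genes; n_overlap: genes in both sets.
    """
    largest_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / total


# Hypothetical counts: all 5 selected genes fall into a 5-gene pathway out of 20 genes.
p_value = enrichment_p_value(n_universe=20, n_pathway=5, n_selected=5, n_overlap=5)
print(p_value)
```

    The test uses only set membership, which is exactly the information a topology-based method extends with the interactions between pathway members.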

  2. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Directory of Open Access Journals (Sweden)

    Enrico Glaab

    Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  3. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

    Science.gov (United States)

    Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

    2012-01-01

    Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.

  4. Towards gene therapy based on femtosecond optical transfection

    Science.gov (United States)

    Antkowiak, M.; Torres-Mapa, M. L.; McGinty, J.; Chahine, M.; Bugeon, L.; Rose, A.; Finn, A.; Moleirinho, S.; Okuse, K.; Dallman, M.; French, P.; Harding, S. E.; Reynolds, P.; Gunn-Moore, F.; Dholakia, K.

    2012-06-01

    Gene therapy holds great promise in the treatment and prevention of a variety of diseases. However, crucial to the study and development of this therapeutic approach is a reliable and efficient technique of gene and drug delivery into primary cell types. These cells, freshly derived from an organ or tissue, mimic more closely the in vivo state and present more physiologically relevant information compared to cultured cell lines. However, primary cells are known to be difficult to transfect and are typically transfected using viral methods, which are not only questionable in the context of an in vivo application but also rely on time-consuming vector construction and may result in cell de-differentiation and loss of functionality. At the same time, well established non-viral methods do not guarantee satisfactory efficiency and viability. Recently, optical laser-mediated poration of the cell membrane has received interest as a viable gene and drug delivery technique. It has been shown to deliver a variety of biomolecules and genes into cultured mammalian cells; however, its applicability to primary cells remains to be proven. We demonstrate how optical transfection can be an enabling technique in research areas such as neuropathic pain, neurodegenerative diseases, heart failure and immune or inflammatory-related diseases. Several primary cell types are used in this study, namely cardiomyocytes, dendritic cells, and neurons. We present our recent progress in optimizing this technique's efficiency and post-treatment cell viability for these types of cells and discuss future directions towards in vivo applications.

  5. Whole genome homology-based identification of candidate genes ...

    African Journals Online (AJOL)

    Sesame (Sesamum indicum L.) is one of the most important oilseed crops. It is mainly grown in arid and semi-arid regions with occurrence of unpredictable drought which is one of the major constraints of its production. However, the lack of gene resources associated with drought tolerance hinders sesame genetic ...

  6. Microarray-Based Identification of Transcription Factor Target Genes

    NARCIS (Netherlands)

    Gorte, M.; Horstman, A.; Page, R.B.; Heidstra, R.; Stromberg, A.; Boutilier, K.A.

    2011-01-01

    Microarray analysis is widely used to identify transcriptional changes associated with genetic perturbation or signaling events. Here we describe its application in the identification of plant transcription factor target genes with emphasis on the design of suitable DNA constructs for controlling TF

  7. RNAi-based silencing of genes encoding the vacuolar- ATPase ...

    African Journals Online (AJOL)

    2016-11-09

    Nov 9, 2016 ... Spodoptera exigua larval development by silencing chitin synthase gene with RNA interference. Bull. Entomol. Res. 98:613-619. Dow JAT (1999). The Multifunctional Drosophila melanogaster V-. ATPase is encoded by a multigene family. J. Bioenerg. Biomembr. 31:75-83. Fire A, Xu SQ, Montgomery MK, ...

  8. GO(vis), a gene ontology visualization tool based on multi-dimensional values.

    Science.gov (United States)

    Ning, Zi; Jiang, Zhenran

    2010-05-01

    Most of gene product similarity measurements concentrate on the information content of Gene Ontology (GO) terms or use a path-based similarity between GO terms, which may ignore other important information contained in the structure of the ontology. In our study, we integrate different GO similarity measure approaches to analyze the functional relationship of genes and gene products with a new triangle-based visualization tool called GO(Vis). The purpose of this tool is to demonstrate the effect of three important information factors when measuring the similarity between gene products. One advantage of this tool is that its important ratio can be adjusted to meet different measuring requirements according to the biological knowledge of each factor. The experimental results demonstrate that GO(Vis) can display diagrams of the functional relationship for gene products effectively.

  9. Gene-based testing of interactions in association studies of quantitative traits.

    Directory of Open Access Journals (Sweden)

    Li Ma

    Various methods have been developed for identifying gene-gene interactions in genome-wide association studies (GWAS). However, most methods focus on individual markers as the testing unit, and the large number of such tests drastically erodes statistical power. In this study, we propose novel interaction tests of quantitative traits that are gene-based and that confer advantages in both statistical power and biological interpretation. The framework of gene-based gene-gene interaction (GGG) tests combines marker-based interaction tests between all pairs of markers in two genes to produce a gene-level test for interaction between the two. The tests are based on an analytical formula we derive for the correlation between marker-based interaction tests due to linkage disequilibrium. We propose four GGG tests that extend the following P value combining methods: minimum P value, extended Simes procedure, truncated tail strength, and truncated P value product. Extensive simulations point to correct type I error rates of all tests and show that the two truncated tests are more powerful than the other tests in cases where the markers involved in the underlying interaction are not directly genotyped and in cases of multiple underlying interactions. We applied our tests to pairs of genes that exhibit a protein-protein interaction to test for gene-level interactions underlying lipid levels using genotype data from the Atherosclerosis Risk in Communities study. We identified five novel interactions that are not evident from marker-based interaction testing and successfully replicated one of these interactions, between SMAD3 and NEDD9, in an independent sample from the Multi-Ethnic Study of Atherosclerosis. We conclude that our GGG tests show improved power to identify gene-level interactions in existing, as well as emerging, association studies.
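    Three of the P value combining methods named in this abstract can be illustrated in their textbook forms. The sketch below is only a minimal illustration and is not the authors' GGG tests: it assumes independent marker-pair P values, whereas the abstract describes extensions that account for correlation due to linkage disequilibrium. All function names and the default truncation point `tau` are hypothetical.

```python
def min_p_combined(p_values):
    """Minimum-P combination with a Sidak correction.

    Assumption: the m tests are independent (the GGG tests in the
    abstract instead correct for correlation between tests).
    """
    m = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** m


def simes_combined(p_values):
    """Classical Simes combination: min over ranks i of m * p_(i) / i."""
    m = len(p_values)
    ordered = sorted(p_values)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


def truncated_product_statistic(p_values, tau=0.05):
    """Truncated product statistic: product of all P values <= tau.

    Only the statistic is computed here; turning it into a P value
    needs its null distribution, which this sketch does not cover.
    """
    w = 1.0
    for p in p_values:
        if p <= tau:
            w *= p
    return w


marker_pair_p = [0.01, 0.04, 0.03, 0.5]  # made-up P values for illustration
print(min_p_combined(marker_pair_p))
print(simes_combined(marker_pair_p))
print(truncated_product_statistic(marker_pair_p))
```

    Smaller combined values indicate stronger gene-level evidence; with correlated markers these independence-based versions would generally be miscalibrated, which is the problem the abstract's analytical correlation formula is meant to address.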

  10. Two-Way Gene Interaction From Microarray Data Based on Correlation Methods.

    Science.gov (United States)

    Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh

    2016-06-01

    Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. The purpose of this study is to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a Gene Co-expression Network is to score all pairs of gene vectors. The second step is to select a score threshold and connect all gene pairs whose scores exceed this value. In the foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman's rank correlation coefficient and Blomqvist's measure, and compared them with Pearson's correlation coefficient. We surveyed six genes associated with venous thrombosis, made a matrix entry representing the score for the corresponding gene pair, and obtained two-way interactions using Pearson's correlation, Spearman's rank correlation, and Blomqvist's coefficient. Finally, these methods were compared with Cytoscape, based on BIND, and Gene Ontology, based on molecular function visual methods; R software version 3.2 and Bioconductor were used to perform these methods. The Pearson and Spearman correlations gave the same results, which were confirmed by the Cytoscape and GO visual methods; however, the results from Blomqvist's coefficient were not confirmed by the visual methods. Some correlation-coefficient results therefore disagree with the visualization, possibly because of the small amount of data.
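    The two steps described above (score every gene pair, then connect pairs whose score exceeds a threshold) can be sketched as follows. This is a minimal illustration in plain Python rather than the R/Bioconductor workflow used in the study; the function names, the toy expression values, and the use of the absolute score when thresholding are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation (undefined if either vector is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def blomqvist(x, y):
    """Blomqvist's medial correlation; points on a median line are ignored."""
    mx, my = median(x), median(y)
    signs = [(a - mx) * (b - my) for a, b in zip(x, y)]
    concordant = sum(s > 0 for s in signs)
    discordant = sum(s < 0 for s in signs)
    return (concordant - discordant) / (concordant + discordant)


def coexpression_edges(expression, score, threshold):
    """Step 1: score all gene pairs. Step 2: keep pairs above the threshold.

    Assumption: the absolute score is thresholded, so strong negative
    correlations also produce an edge.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        s = score(expression[g1], expression[g2])
        if abs(s) > threshold:
            edges.append((g1, g2, s))
    return edges


# Toy expression vectors (hypothetical, not data from the study).
toy = {"A": [1, 2, 3, 4, 5, 6], "B": [2, 4, 6, 8, 10, 12], "C": [6, 1, 4, 3, 5, 2]}
for name, fn in (("pearson", pearson), ("spearman", spearman), ("blomqvist", blomqvist)):
    print(name, coexpression_edges(toy, fn, 0.8))
```

    Running the same thresholding with each coefficient makes it easy to see where the three measures agree on an edge and where they differ, which is the comparison the abstract reports.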

  11. Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Jing; Ma, Zihao; Carr, Steven A.; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W.; Ellis, Matthew J. C.; Townsend, R. Reid; Smith, Richard D.; McDermott, Jason E.; Chen, Xian; Paulovich, Amanda G.; Boja, Emily S.; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Rodland, Karin D.; Liebler, Daniel C.; Zhang, Bing

    2016-11-11

    Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies

  12. Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

    OpenAIRE

    Podsakoff, G; Wong, K K; Chatterjee, S

    1994-01-01

    Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthe...

  13. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E... Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
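    The comparison of neighboring-gene correlations against randomly selected gene pairs can be sketched as below. This is only an illustrative outline under stated assumptions: the abstract does not specify the FDR procedure, so the sketch stops at an empirical P value per neighboring pair, and all function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation (undefined if either vector is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def neighbor_correlations(ordered_genes, expression):
    """Correlation of each gene with the next gene along the chromosome."""
    return [
        (g1, g2, pearson(expression[g1], expression[g2]))
        for g1, g2 in zip(ordered_genes, ordered_genes[1:])
    ]


def random_pair_correlations(genes, expression, n_pairs, rng):
    """Null distribution: correlations of randomly drawn gene pairs."""
    null = []
    for _ in range(n_pairs):
        g1, g2 = rng.sample(genes, 2)
        null.append(pearson(expression[g1], expression[g2]))
    return null


def empirical_p(observed, null):
    """Share of null correlations at least as large as the observed one
    (with the usual add-one correction so the result is never zero)."""
    return (1 + sum(r >= observed for r in null)) / (1 + len(null))


# Toy data: gene order along a chromosome and expression across 4 arrays.
toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1], "g4": [1, 3, 2, 4]}
order = ["g1", "g2", "g3", "g4"]
null = random_pair_correlations(order, toy, 200, random.Random(0))
for g1, g2, r in neighbor_correlations(order, toy):
    print(g1, g2, round(r, 3), empirical_p(r, null))
```

    In a genome-scale analysis the resulting P values would then be passed to a multiple-testing correction before runs of significantly co-expressed neighbors are merged into candidate clusters.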

  14. Network-Based Method for Identifying Co- Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    Science.gov (United States)

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
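    The searching procedure described above can be sketched with a breadth-first search. This is a simplified illustration, not the authors' implementation: it assumes an unweighted, undirected interaction network stored as an adjacency dictionary and returns a single shortest path per seed pair, while the permutation test and the two screening rules are not covered. Gene names and function names are hypothetical.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return one shortest path from source to target via BFS, or None."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph.get(node, ())):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = [neighbor]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def candidate_genes(graph, seeds_a, seeds_b):
    """Inner nodes of shortest paths linking the two seed gene sets."""
    seeds = set(seeds_a) | set(seeds_b)
    candidates = set()
    for a in seeds_a:
        for b in seeds_b:
            path = shortest_path(graph, a, b)
            if path:
                candidates.update(n for n in path[1:-1] if n not in seeds)
    return candidates


# Toy network: BONE1 and NERVE1 are seed genes, X links them, Y is a dead end.
toy_network = {
    "BONE1": {"X"},
    "X": {"BONE1", "NERVE1", "Y"},
    "NERVE1": {"X"},
    "Y": {"X"},
}
print(candidate_genes(toy_network, ["BONE1"], ["NERVE1"]))
```

    On a weighted interaction network, the breadth-first search would be replaced by a weighted shortest-path algorithm such as Dijkstra's, with the rest of the sketch unchanged.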

  15. Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

    Directory of Open Access Journals (Sweden)

    Zhai Chengxiang

    2010-05-01

    Background: Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular, Gene Ontology (GO). However, the annotation of genes is a labor-intensive process, and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered. Results: We propose a statistical method that uses the primary literature, i.e. free text, as the source to perform overrepresentation analysis. The method is based on the statistical framework of a mixture model and addresses the methodological flaws in several existing programs. We implemented this method within a literature mining system, BeeSpace, taking advantage of its analysis environment and added features that facilitate the interactive analysis of gene sets. Through experimentation with several datasets, we showed that our program can effectively summarize the important conceptual themes of large gene sets, even when traditional GO-based analysis does not yield informative results. Conclusions: We conclude that the current work will provide biologists with a tool that effectively complements the existing ones for overrepresentation analysis from genomic experiments. Our program, Genelist Analyzer, is freely available at: http://workerbee.igb.uiuc.edu:8080/BeeSpace/Search.jsp

  16. Queueing-Based Synchronization and Entrainment for Synthetic Gene Oscillators

    Science.gov (United States)

    Mather, William; Butzin, Nicholas; Hochendoner, Philip; Ogle, Curtis

    Synthetic gene oscillators have been a major focus of synthetic biology research since the beginning of the field 15 years ago. They have proven to be useful both for biotechnological applications as well as a testing ground to significantly develop our understanding of the design principles behind synthetic and native gene oscillators. In particular, the principles governing synchronization and entrainment of biological oscillators have been explored using a synthetic biology approach. Our work combines experimental and theoretical approaches to specifically investigate how a bottleneck for protein degradation, which is present in most if not all existing synthetic oscillators, can be leveraged to robustly synchronize and entrain biological oscillators. We use both the terminology and mathematical tools of queueing theory to intuitively explain the role of this bottleneck in both synchronization and entrainment, which extends prior work demonstrating the usefulness of queueing theory in synthetic and native gene circuits. We conclude with an investigation of how synchronization and entrainment may be sensitive to the presence of multiple proteolytic pathways in a cell that couple weakly through crosstalk. This work was supported by NSF Grant #1330180.

  17. Evaluation of gene importance in microarray data based upon probability of selection

    Directory of Open Access Journals (Sweden)

    Fu Li M

    2005-03-01

    Background: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced to the gene level, a long-standing research problem is to identify specific gene expression patterns linked to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, recognizing disease-related gene expression patterns embedded in the microarray data has proved challenging. In selecting a small set of biologically significant genes for classifier design, the high data dimensionality inherent in this problem creates a substantial amount of uncertainty. Results: Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value, in that a smaller value implies higher information content according to information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. Conclusion: In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.
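    The idea of scoring a gene by how often it is selected across repeated trials can be illustrated with a simple null model. The abstract does not give the authors' formula, so the sketch below substitutes a plain binomial tail: it assumes that, under the null, each trial selects a gene independently with probability `p_null` (for instance k selected genes out of N candidates). Function names and the example numbers are hypothetical.

```python
from math import comb


def selection_counts(trials):
    """Count in how many selection trials each gene appears."""
    counts = {}
    for selected in trials:
        for gene in set(selected):
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def binomial_tail(count, n_trials, p_null):
    """P(X >= count) for X ~ Binomial(n_trials, p_null).

    Assumed null model: every trial picks the gene independently with
    probability p_null. A small value means the gene is selected more
    often than random selection would explain.
    """
    return sum(
        comb(n_trials, k) * p_null ** k * (1 - p_null) ** (n_trials - k)
        for k in range(count, n_trials + 1)
    )


# Three hypothetical selection trials on different sample subsets.
trials = [["geneA", "geneB"], ["geneA"], ["geneA", "geneC"]]
counts = selection_counts(trials)
p_null = 2 / 100  # e.g. 2 genes picked per trial out of 100 candidates
for gene, count in sorted(counts.items()):
    print(gene, count, binomial_tail(count, len(trials), p_null))
```

    Genes selected in most trials receive very small tail probabilities, while genes picked only once remain close to what random selection would produce.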

  18. Design-Based Learning for Biology: Genetic Engineering Experience Improves Understanding of Gene Expression

    Science.gov (United States)

    Ellefson, Michelle R.; Brinker, Rebecca A.; Vernacchio, Vincent J.; Schunn, Christian D.

    2008-01-01

    Gene expression is a difficult topic for students to learn and comprehend, at least partially because it involves various biochemical structures and processes occurring at the microscopic level. Designer Bacteria, a design-based learning (DBL) unit for high-school students, applies principles of DBL to the teaching of gene expression. Throughout…

  19. A phylogenetic analysis of the genus Psathyrostachys (Poaceae) based on one nuclear gene, three plastid genes, and morphology

    DEFF Research Database (Denmark)

    Petersen, Gitte; Seberg, Ole; Baden, Claus

    2004-01-01

    A phylogenetic analysis of the small, Central Asian genus Psathyrostachys Nevski is presented. The analysis is based on morphological characters and nucleotide sequence data from one nuclear gene, DMC1, and three plastid genes, rbcL, rpoA, and rpoC2. Separate analyses of the three data partitions...... (morphology, nuclear sequences, and plastid sequences) result in mostly congruent trees. The plastid and nuclear sequences produce completely congruent trees, and only the trees based on plastid sequences and morphological characters are incongruent. Combined analysis of all data results in a fairly well......-resolved strict consensus tree: Ps. rupestris is the sister to the remaining species, which are divided into two clades: one including Ps. fragilis and Ps. caduca, the other including Ps. juncea, Ps. huashanica, Ps. lanuginosa, Ps. stoloniformis, and Ps. kronenburgii. Pubescent culms and more than 20 mm long...

  20. Research on the Bionics Design of Automobile Styling Based on the Form Gene

    Science.gov (United States)

    Aili, Zhao; Long, Jiang

    2017-09-01

    From the perspective of form gene heritage, this paper analyzes the gene make-up, cultural inheritance and aesthetic features in the evolution and development of the forms of brand automobiles and proposes a bionic design concept and methods for automobile styling design. This innovative method must be based on the form gene, and the consistency and combination of form elements must be maintained during the design. Taking the design of Maserati as an example, the paper illustrates the design method and philosophy in terms of form gene expression and bionic design innovation for future automobile styling.

  1. Ortholog-based screening and identification of genes related to intracellular survival.

    Science.gov (United States)

    Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

    2018-04-20

    Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.

  2. A Region-Based GeneSIS Segmentation Algorithm for the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    Stelios K. Mylonas

    2015-03-01

    This paper proposes an object-based segmentation/classification scheme for remotely sensed images, based on a novel variant of the recently proposed Genetic Sequential Image Segmentation (GeneSIS) algorithm. GeneSIS segments the image in an iterative manner, whereby at each iteration a single object is extracted via a genetic-based object extraction algorithm. Contrary to the previous pixel-based GeneSIS, where the candidate objects to be extracted were evaluated through the fuzzy content of their included pixels, in the newly developed region-based GeneSIS algorithm a watershed-driven fine segmentation map is initially obtained from the original image, which serves as the basis for the subsequent GeneSIS segmentation. Furthermore, in order to enhance the spatial search capabilities, we introduce a more descriptive encoding scheme in the object extraction algorithm, where the structural search modules are represented by polygonal shapes. Our objectives in the new framework are as follows: enhance the flexibility of the algorithm in extracting more flexible object shapes, ensure high classification accuracies, and reduce the execution time of the segmentation, while at the same time preserving all the inherent attributes of the GeneSIS approach. Finally, exploiting the inherent attribute of GeneSIS to produce multiple segmentations, we also propose two segmentation fusion schemes that operate on the ensemble of segmentations generated by GeneSIS. Our approaches are tested on one urban and two agricultural images. The results show that region-based GeneSIS has considerably lower computational demands compared to the pixel-based one. Furthermore, the suggested methods achieve higher classification accuracies and good segmentation maps compared to a series of existing algorithms.

  3. FiGS: a filter-based gene selection workbench for microarray data

    Directory of Open Access Journals (Sweden)

    Yun Taegyun

    2010-01-01

    Background: The selection of genes that discriminate disease classes from microarray data is widely used for the identification of diagnostic biomarkers. Although various gene selection methods are currently available and some of them have shown excellent performance, no single method can retain the best performance for all types of microarray datasets. It is desirable to use a comparative approach to find the best gene selection result after rigorous testing of different methodological strategies for a given microarray dataset. Results: FiGS is a web-based workbench that automatically compares various gene selection procedures and provides the optimal gene selection result for an input microarray dataset. FiGS builds up diverse gene selection procedures by aligning different feature selection techniques and classifiers. In addition to the highly reputed techniques, FiGS diversifies the gene selection procedures by incorporating gene clustering options in the feature selection step and different data pre-processing options in the classifier training step. All candidate gene selection procedures are evaluated by the .632+ bootstrap errors and listed with their classification accuracies and selected gene sets. FiGS runs on parallelized computing nodes that can handle heavy computations. FiGS is freely accessible at http://gexp.kaist.ac.kr/figs. Conclusion: FiGS is a web-based application that automates an extensive search for the optimized gene selection analysis for a microarray dataset in a parallel computing environment. FiGS will provide both an efficient and comprehensive means of acquiring optimal gene sets that discriminate disease states from microarray datasets.
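    The .632+ bootstrap error mentioned above combines the training (resubstitution) error with the out-of-bag bootstrap error, weighting them by an estimate of overfitting. The sketch below follows the standard Efron and Tibshirani definition and takes the three error rates as inputs; it is not taken from FiGS itself, and the function and parameter names are hypothetical.

```python
def bootstrap_632_plus(err_train, err_oob, gamma):
    """.632+ bootstrap error estimate.

    err_train: error on the training data (resubstitution error)
    err_oob:   average error on samples left out of each bootstrap draw
    gamma:     no-information error rate (error expected if features
               and class labels were independent)
    """
    # The out-of-bag error is capped at the no-information rate.
    err_oob = min(err_oob, gamma)
    # Relative overfitting rate R in [0, 1].
    if err_oob > err_train and gamma > err_train:
        r = (err_oob - err_train) / (gamma - err_train)
    else:
        r = 0.0
    # Weight moves from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_train + w * err_oob


# Hypothetical error rates for one candidate gene selection procedure.
print(bootstrap_632_plus(err_train=0.0, err_oob=0.2, gamma=0.5))
```

    When the classifier does not overfit (R = 0) the estimate reduces to the ordinary .632 bootstrap, and when it overfits maximally (R = 1) it falls back on the out-of-bag error alone.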

  4. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

    Science.gov (United States)

    Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

    2016-03-01

    Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a work flow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when it is using RNA-Seq as sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/ (contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu). Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  5. Partial least squares based gene expression analysis in estrogen receptor positive and negative breast tumors.

    Science.gov (United States)

    Ma, W; Zhang, T-F; Lu, P; Lu, S H

    2014-01-01

    Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-). A previous study proposed that, under trastuzumab-based neoadjuvant chemotherapy, tumor initiating cell (TIC) featured ER- tumors respond better than ER+ tumors. Exploring the molecular differences between these two groups may help in developing new therapeutic strategies, especially for ER- patients. Using gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed partial least squares (PLS) based analysis, which is more sensitive than common variance/regression analysis. We identified 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving the immune system, metabolism and genetic information processing. Network analysis identified five hub genes with degrees higher than 10, including APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide new understanding of the molecular differences between TIC featured ER- and ER+ breast tumors, in the hope of supporting therapeutic studies.

  6. Efficient CRISPR/Cas9-based gene knockout in watermelon.

    Science.gov (United States)

    Tian, Shouwei; Jiang, Linjian; Gao, Qiang; Zhang, Jie; Zong, Mei; Zhang, Haiying; Ren, Yi; Guo, Shaogui; Gong, Guoyi; Liu, Fan; Xu, Yong

    2017-03-01

    CRISPR/Cas9 system can precisely edit genomic sequence and effectively create knockout mutations in T0 generation watermelon plants. Genome editing offers great advantage to reveal gene function and generate agronomically important mutations to crops. Recently, RNA-guided genome editing system using the type II clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) has been applied to several plant species, achieving successful targeted mutagenesis. Here, we report the genome of watermelon, an important fruit crop, can also be precisely edited by CRISPR/Cas9 system. ClPDS, phytoene desaturase in watermelon, was selected as the target gene because its mutant bears evident albino phenotype. CRISPR/Cas9 system performed genome editing, such as insertions or deletions at the expected position, in transfected watermelon protoplast cells. More importantly, all transgenic watermelon plants harbored ClPDS mutations and showed clear or mosaic albino phenotype, indicating that CRISPR/Cas9 system has technically 100% of genome editing efficiency in transgenic watermelon lines. Furthermore, there were very likely no off-target mutations, indicated by examining regions that were highly homologous to sgRNA sequences. Our results show that CRISPR/Cas9 system is a powerful tool to effectively create knockout mutations in watermelon.

  7. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, by utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  8. Exploring the role of peptides in polymer-based gene delivery.

    Science.gov (United States)

    Sun, Yanping; Yang, Zhen; Wang, Chunxi; Yang, Tianzhi; Cai, Cuifang; Zhao, Xiaoyun; Yang, Li; Ding, Pingtian

    2017-09-15

    Polymers are widely studied as non-viral gene vectors because of their strong DNA binding ability, capacity to carry large payload, flexibility of chemical modifications, low immunogenicity, and facile processes for manufacturing. However, high cytotoxicity and low transfection efficiency substantially restrict their application in clinical trials. Incorporating functional peptides is a promising approach to address these issues. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we systematically summarize the role of peptides in polymer-based gene delivery, and elaborate how to rationally design polymer-peptide based gene delivery vectors. Polymers are widely studied as non-viral gene vectors, but suffer from high cytotoxicity and low transfection efficiency. Incorporating short, bioactive peptides into polymer-based gene delivery systems can address this issue. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we highlight the peptides' roles in polymer-based gene delivery, and elaborate how to utilize various functional peptides to enhance the transfection efficiency of polymers. The optimized peptide-polymer vectors should be able to alter their structures and functions according to biological microenvironments and utilize inherent intracellular pathways of cells, and consequently overcome the barriers during gene delivery to enhance transfection efficiency. Copyright © 2017 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

  9. A relative variation-based method to unraveling gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Yali Wang

    Full Text Available Gene regulatory network (GRN) reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that, compared with complicated methods such as those based on nonlinear differential equations, methods based on a simple statistic, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, a relative expression level variation (RELV) based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, the probability for the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate on the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
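
    The Z-score baseline that this abstract compares against can be sketched in a few lines. This is a minimal illustration, not the RELV algorithm itself: the function name `knockout_z_scores`, the data layout, and the choice of normalizing by the spread across knockout experiments are all assumptions made for the example.

```python
from statistics import pstdev


def knockout_z_scores(knockout, wild_type):
    """Score candidate edges i -> j from single-gene knockout data.

    knockout[i][j] is the (hypothetical) steady-state expression of gene j
    after gene i has been knocked out; wild_type[j] is the unperturbed level.
    """
    n = len(wild_type)
    # Spread of each measured gene across all knockout experiments.
    spread = [pstdev([row[j] for row in knockout]) for j in range(n)]
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and spread[j] > 0:
                z[i][j] = abs(knockout[i][j] - wild_type[j]) / spread[j]
    return z


# Toy cascade 0 -> 1 -> 2: knocking out gene 0 also lowers gene 2.
wild_type = [1.0, 1.0, 1.0]
knockout = [
    [0.0, 0.2, 0.3],   # gene 0 removed
    [1.0, 0.0, 0.25],  # gene 1 removed
    [1.0, 1.0, 0.0],   # gene 2 removed
]
scores = knockout_z_scores(knockout, wild_type)
```

    In this toy cascade the indirect pair (0, 2) scores almost as high as the direct pair (1, 2), which is the direct-versus-indirect ambiguity the abstract describes.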

  10. A network-based gene expression signature informs prognosis and treatment for colorectal cancer patients.

    Directory of Open Access Journals (Sweden)

    Mingguang Shi

    Full Text Available Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC) patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes. We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002). Similar results were obtained with reversed training and testing datasets (p = 0.007). Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006), but not beneficial to the low-risk patients (p = 0.491). The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network-based
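
    The concordance figure quoted above is the kind of number a concordance index produces. The sketch below shows one common definition (Harrell's C with right-censoring); whether the study used exactly this variant is an assumption, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk):
    """Fraction of comparable patient pairs ordered correctly by risk score.

    A pair (i, j) is comparable when patient i has the shorter follow-up
    time and actually relapsed (events[i] is truthy). It is concordant when
    the model gave patient i the higher risk score; ties count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), relapse flags, and model scores.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a concordance of about 0.76 sits between the two.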

  11. Gene-based meta-analysis of genome-wide association studies implicates new loci involved in obesity

    DEFF Research Database (Denmark)

    Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W

    2015-01-01

    ) approach to assign variants to genes and to calculate gene-based P-values based on simulations. The VEGAS method was applied to each cohort separately before a gene-based meta-analysis was performed. In Stage 1, two known (FTO and TMEM18) and six novel (PEX2, MTFR2, SSFA2, IARS2, CEP295 and TXNDC12) loci...

  12. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
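
    To make the BPDS encoding concrete, here is a small sketch of a three-gene Boolean network written as polynomials over the two-element field, where multiplication is AND and adding 1 is NOT. The particular update rules and the helper `fits_time_series` are invented for illustration; the evolutionary search described in the abstract is not reproduced.

```python
def step(state):
    """One synchronous update of a toy Boolean polynomial dynamical system.

    Each coordinate is a polynomial over GF(2):
      f0 = 1 + x2      (NOT x2)
      f1 = x0          (copy of x0)
      f2 = x0 * x1     (x0 AND x1)
    """
    x0, x1, x2 = state
    return ((1 + x2) % 2, x0 % 2, (x0 * x1) % 2)


def trajectory(state, n_steps):
    """Return the list of states visited, starting from `state`."""
    visited = [state]
    for _ in range(n_steps):
        state = step(state)
        visited.append(state)
    return visited


def fits_time_series(series):
    """Check whether the model reproduces every observed transition."""
    return all(step(a) == b for a, b in zip(series, series[1:]))


path = trajectory((0, 0, 0), 5)
```

    Starting from (0, 0, 0), this toy model runs through five distinct states and returns to its starting point, so a candidate model can be scored by how many observed transitions it reproduces.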

  13. Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

    Science.gov (United States)

    Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

    2005-07-15

    A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in the model plant Arabidopsis thaliana which are inducible by a phytohormone, abscisic acid (ABA), and abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA-specific cis-elements, the ABA-responsive element (ABRE) and its coupling element (CE), in A. thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A. thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
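
    The promoter-scanning step can be illustrated with a short sketch. The two regular expressions below are illustrative placeholders, not the ABRE and CE models built in the paper, and the 50 bp spacing limit, the forward-strand-only scan, and the function names are likewise assumptions.

```python
import re

# Placeholder motif patterns (assumptions for this sketch only).
ABRE_LIKE = re.compile(r"ACGTG[GT]C")
CE_LIKE = re.compile(r"C[AC]CGCG")


def has_module(promoter, max_gap=50):
    """True if an ABRE-like and a CE-like site start within max_gap bases."""
    seq = promoter.upper()
    abre_sites = [m.start() for m in ABRE_LIKE.finditer(seq)]
    ce_sites = [m.start() for m in CE_LIKE.finditer(seq)]
    return any(abs(a - c) <= max_gap for a in abre_sites for c in ce_sites)


def candidate_genes(promoters, max_gap=50):
    """Return the sorted gene names whose promoter carries the module."""
    return sorted(g for g, seq in promoters.items() if has_module(seq, max_gap))


# Hypothetical promoter fragments keyed by made-up gene names.
promoters = {
    "geneA": "TTTACGTGGCAAAACACGCGTT",  # both motifs
    "geneB": "TTTACGTGGCAAAA",          # ABRE-like only
    "geneC": "GGGGGG",                  # neither
}
hits = candidate_genes(promoters)
```

    A fuller version would also scan the reverse complement and rank genes by motif score rather than simple presence.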

  14. Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Keita Mori

    2013-01-01

    Full Text Available Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.

  15. Outreach Science Education: Evidence-Based Studies in a Gene Technology Lab

    Science.gov (United States)

    Scharfenberg, Franz-Josef; Bogner, Franz X.

    2014-01-01

    Nowadays, outreach labs are important informal learning environments in science education. After summarizing research on the goals outreach labs focus on, we describe our evidence-based gene technology lab as a model of a research-driven outreach program. Evaluation-based optimizations of hands-on teaching based on cognitive load theory (additional…

  16. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

    Science.gov (United States)

    Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

    2016-01-01

    Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.

  17. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Full Text Available Abstract Background Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results In this paper, we propose the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the gene expressions of individual genes are considered as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions The results demonstrate that our WDCM produces clusters

  18. Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

    Science.gov (United States)

    Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

    2016-01-01

    Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genome data and high-throughput experimental data, many computational methods for identifying essential proteins have been proposed, which are useful complements to experimental methods. In this review, we show the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079

  19. Towards precise classification of cancers based on robust gene functional expression profiles

    Directory of Open Access Journals (Sweden)

    Zhu Jing

    2005-03-01

    Full Text Available Abstract Background Development of robust and efficient methods for analyzing and interpreting high dimension gene expression profiles continues to be a focus in computational biology. The accumulated experimental evidence supports the assumption that genes express and perform their functions in modular fashions in cells. Therefore, there is an open space for development of timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we define a low dimension functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system such as Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure (s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion This modular approach is demonstrated to be a powerful alternative approach to analyzing high dimension microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data. Furthermore, efficient integration with current biological knowledge
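
    The reduction from gene-level values to a module-level profile can be sketched as follows. The abstract does not say which summary measure was used, so the median here is an assumption, as are the function name, the data layout, and the module labels.

```python
from statistics import median


def functional_profile(expression, modules, summary=median):
    """Collapse gene-level expression into one value per module and sample.

    expression maps a gene name to its list of per-sample values;
    modules maps a module name to the genes annotated to it.
    Genes without expression data are ignored.
    """
    profile = {}
    for name, genes in modules.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue
        n_samples = len(expression[present[0]])
        profile[name] = [
            summary([expression[g][s] for g in present])
            for s in range(n_samples)
        ]
    return profile


# Hypothetical two-sample data set with made-up gene and module names.
expression = {"g1": [1.0, 4.0], "g2": [3.0, 6.0], "g3": [10.0, 0.0]}
modules = {"module_A": ["g1", "g2"], "module_B": ["g3", "g_missing"]}
reduced = functional_profile(expression, modules)
```

    The resulting dictionary has one row per module instead of one per gene, which is the lower-dimensional input a classifier such as a decision tree would then be trained on.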

  20. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network given the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. HSD3B and gene-gene interactions in a pathway-based analysis of genetic susceptibility to bladder cancer.

    Directory of Open Access Journals (Sweden)

    Angeline S Andrew

    Full Text Available Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case-control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene-gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3'UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85, 95% CI 1.31-2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls, 497 cases (adjusted OR 3.66, 95% CI 1.06-12.63). The effect of this prevalent SNP was stronger among males (OR 2.13, 95% CI 1.40-3.25) than females (OR 1.56, 95% CI 0.83-2.95) (SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3-4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
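
    For readers unfamiliar with the "OR, 95% CI" notation, the sketch below computes an unadjusted odds ratio and a Wald confidence interval from a 2x2 table. The study reports adjusted odds ratios from logistic regression, which this does not reproduce, and the counts used are invented.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    The interval is computed on the log scale:
    exp(ln(OR) +/- z * sqrt(1/a + 1/b + 1/c + 1/d)).
    All four cell counts must be positive.
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: variant carriers and non-carriers among cases and controls.
estimate, low, high = odds_ratio_ci(20, 10, 10, 20)
```

    An interval that excludes 1, as for the males in the abstract, indicates an association at the 5% level, whereas the interval for females (0.83-2.95) includes 1.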

  2. SVMRFE based approach for prediction of most discriminatory gene target for type II diabetes

    Directory of Open Access Journals (Sweden)

    Atul Kumar

    2017-06-01

    Full Text Available Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, its main source of fuel, and it is becoming increasingly common all over the world. It is therefore necessary to identify new potential drug targets that not only control the disease but can also treat it. Support vector machines (SVMs) are classifiers with the potential to separate discriminatory genes from non-discriminatory genes. SVMRFE, a modification of SVM, ranks the genes based on their discriminatory power and eliminates the genes that are not involved in causing the disease. A gene regulatory network was constructed from the top-ranked coding genes to identify their role in causing diabetes. To further validate the results, a pathway study was performed to identify the involvement of the coding genes in type II diabetes. The genes obtained from this study showed a significant involvement in causing the disease and may be used as potential drug targets.

  3. A dual selection based, targeted gene replacement tool for Magnaporthe grisea and Fusarium oxysporum.

    Science.gov (United States)

    Khang, Chang Hyun; Park, Sook-Young; Lee, Yong-Hwan; Kang, Seogchan

    2005-06-01

    Rapid progress in fungal genome sequencing presents many new opportunities for functional genomic analysis of fungal biology through the systematic mutagenesis of the genes identified through sequencing. However, the lack of efficient tools for targeted gene replacement is a limiting factor for fungal functional genomics, as it often necessitates the screening of a large number of transformants to identify the desired mutant. We developed an efficient method of gene replacement and evaluated factors affecting the efficiency of this method using two plant pathogenic fungi, Magnaporthe grisea and Fusarium oxysporum. This method is based on Agrobacterium tumefaciens-mediated transformation with a mutant allele of the target gene flanked by the herpes simplex virus thymidine kinase (HSVtk) gene as a conditional negative selection marker against ectopic transformants. The HSVtk gene product converts 5-fluoro-2'-deoxyuridine to a compound toxic to diverse fungi. Because ectopic transformants express HSVtk, while gene replacement mutants lack HSVtk, growing transformants on a medium amended with 5-fluoro-2'-deoxyuridine facilitates the identification of targeted mutants by counter-selecting against ectopic transformants. In addition to M. grisea and F. oxysporum, the method and associated vectors are likely to be applicable to manipulating genes in a broad spectrum of fungi, thus potentially serving as an efficient, universal functional genomic tool for harnessing the growing body of fungal genome sequence data to study fungal biology.

  4. Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures

    Directory of Open Access Journals (Sweden)

    Anders E. Berglund

    2017-01-01

    Full Text Available Background. Many gene-expression signatures exist for describing the biological state of profiled tumors. Principal Component Analysis (PCA) can be used to summarize a gene signature into a single score. Our hypothesis is that gene signatures can be validated when applied to new datasets, using inherent properties of PCA. Results. This validation is based on four key concepts. Coherence: elements of a gene signature should be correlated beyond chance. Uniqueness: the general direction of the data being examined can drive most of the observed signal. Robustness: if a gene signature is designed to measure a single biological effect, then this signal should be sufficiently strong and distinct compared to other signals within the signature. Transferability: the derived PCA gene signature score should describe the same biology in the target dataset as it does in the training dataset. Conclusions. The proposed validation procedure ensures that PCA-based gene signatures perform as expected when applied to datasets other than those that the signatures were trained upon. Complex signatures, describing multiple independent biological components, are also easily identified.
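
    Summarizing a signature into a single PCA score can be sketched without any external library by finding the first principal component with power iteration. This is only an illustration under stated assumptions: the function name and data layout are invented, the explained-variance fraction is offered as a rough proxy for a dominant single signal, and none of it is the paper's validation procedure.

```python
def pc1_scores(samples, n_iter=200):
    """Project samples onto the first principal component of the signature.

    samples is a list of samples, each a list of expression values for the
    signature genes. Returns (scores, explained), where explained is the
    fraction of total variance captured by the first component.
    Power iteration starts from the all-ones vector, so it can fail in the
    rare case where that vector is orthogonal to the first component.
    """
    n, p = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in samples]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    total = sum(cov[a][a] for a in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return scores, (eigenvalue / total if total else 0.0)


# Hypothetical three-gene signature measured in four samples.
scores, explained = pc1_scores([[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]])
```

    In this toy data the three genes move together perfectly, so the first component captures essentially all of the variance; a signature whose genes were unrelated would give a much lower fraction.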

  5. An Improved Fuzzy Based Missing Value Estimation in DNA Microarray Validated by Gene Ranking

    Directory of Open Access Journals (Sweden)

    Sujay Saha

    2016-01-01

    Full Text Available Most of the gene expression data analysis algorithms require the entire gene expression matrix without any missing values. Hence, it is necessary to devise methods which would impute missing data values accurately. There exist a number of imputation algorithms to estimate those missing values. This work starts with a microarray dataset containing multiple missing values. We first apply the modified version of the fuzzy theory based existing method LRFDVImpute to impute multiple missing values of time series gene expression data and then validate the result of imputation by a genetic algorithm (GA) based gene ranking methodology along with some regular statistical validation techniques, like the RMSE method. Gene ranking, to the best of our knowledge, has not been used yet to validate the result of missing value estimation. Firstly, the proposed method has been tested on the very popular Spellman dataset and results show that error margins have been drastically reduced compared to some previous works, which indirectly validates the statistical significance of the proposed method. Then it has been applied on four other 2-class benchmark datasets, like the Colorectal Cancer tumours dataset (GDS4382), Breast Cancer dataset (GSE349-350), Prostate Cancer dataset, and DLBCL-FL (Leukaemia), for both missing value estimation and ranking the genes, and the results show that the proposed method can reach 100% classification accuracy with very few dominant genes, which indirectly validates the biological significance of the proposed method.
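
    The RMSE check mentioned in the abstract is typically run by hiding known entries, imputing them, and comparing against the held-out truth. A minimal sketch, with a hypothetical function name and toy matrices (the abstract does not state whether a normalized variant was used):

```python
import math


def imputation_rmse(true_matrix, imputed_matrix, missing_positions):
    """Root mean squared error over the artificially removed entries only.

    missing_positions is a list of (row, column) pairs that were hidden
    before imputation and whose true values are known.
    """
    if not missing_positions:
        raise ValueError("no held-out entries to evaluate")
    squared = [
        (true_matrix[i][j] - imputed_matrix[i][j]) ** 2
        for i, j in missing_positions
    ]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: two hidden entries, each imputed with an error of 0.5.
truth = [[1.0, 2.0], [3.0, 4.0]]
imputed = [[1.0, 2.5], [2.5, 4.0]]
error = imputation_rmse(truth, imputed, [(0, 1), (1, 0)])
```

    Lower values mean the imputed entries sit closer to the held-out measurements.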

  6. Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

    Directory of Open Access Journals (Sweden)

    Xing Li

    2014-01-01

    Full Text Available Background. Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of candidate genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are used to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method yields a reliable ranked gene list with an AUC (area under curve) of 0.616 in classification. Some novel genes such as CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms; these are not recorded in the benchmark data set but have been reported in recently published literature. Conclusions. Our study demonstrated that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identify candidate genes of symptoms.
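
    The ranking step can be pictured as network propagation. The sketch below assumes the commonly described PRINCE-style iteration F = alpha * W' * F + (1 - alpha) * Y, with W' the degree-normalized adjacency matrix and Y the prior scores; the toy graph, the value of alpha, and the function name are illustrative assumptions rather than the settings used in the study.

```python
import math

def propagate(adj, prior, alpha=0.9, iters=200):
    """Iterate F = alpha * W' F + (1 - alpha) * prior on a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Symmetric normalization: W'[i][j] = A[i][j] / sqrt(d_i * d_j).
    w = [[adj[i][j] / math.sqrt(degree[i] * degree[j]) if adj[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    scores = prior[:]
    for _ in range(iters):
        scores = [alpha * sum(w[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * prior[i] for i in range(n)]
    return scores

# Toy path network 0 - 1 - 2 - 3 with prior evidence only on node 0.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
scores = propagate(path, [1.0, 0.0, 0.0, 0.0])
ranking = sorted(range(4), key=lambda i: -scores[i])
```

    Nodes closer to the prior evidence receive higher scores, which is the behaviour a gene ranking built on this scheme relies on.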

  7. A non-inheritable maternal Cas9-based multiple-gene editing system in mice

    OpenAIRE

    Takayuki Sakurai; Akiko Kamiyoshi; Hisaka Kawate; Chie Mori; Satoshi Watanabe; Megumu Tanaka; Ryuichi Uetake; Masahiro Sato; Takayuki Shindo

    2016-01-01

    The CRISPR/Cas9 system is capable of editing multiple genes through one-step zygote injection. The preexisting method is largely based on the co-injection of Cas9 DNA (or mRNA) and guide RNAs (gRNAs); however, it is unclear how many genes can be simultaneously edited by this method, and a reliable means to generate transgenic (Tg) animals with multiple gene editing has yet to be developed. Here, we employed non-inheritable maternal Cas9 (maCas9) protein derived from Tg mice with systemic Cas9...

  8. Pairagon+N-SCAN_EST: a model-based gene annotation pipeline

    DEFF Research Database (Denmark)

    Arumugam, Manimozhiyan; Wei, Chaochun; Brown, Randall H

    2006-01-01

    This paper describes Pairagon+N-SCAN_EST, a gene annotation pipeline that uses only native alignments. For each expressed sequence it chooses the best genomic alignment. Systems like ENSEMBL and ExoGean rely on trans alignments, in which expressed sequences are aligned to the genomic loci...... with de novo gene prediction by using N-SCAN_EST. N-SCAN_EST is based on a generalized HMM probability model augmented with a phylogenetic conservation model and EST alignments. It can predict complete transcripts by extending or merging EST alignments, but it can also predict genes in regions without EST...

  9. Effective generation of transgenic pigs and mice by linker based sperm-mediated gene transfer.

    OpenAIRE

    Chang, Keejong; Qian, Jin; Jiang, MeiSheng; Liu, Yi-Hsin; Wu, Ming-Che; Chen, Chi-Dar; Lai, Chao-Kuen; Lo, Hsin-Lung; Hsiao, Chin-Ton; Brown, Lucy; Bolen, James; Huang, Hsiao-I; Ho, Pei-Yu; Shih, Ping Yao; Yao, Chen-Wen

    2002-01-01

    Abstract Background Transgenic animals have become valuable tools for both research and applied purposes. The current method of gene transfer, microinjection, which is widely used in transgenic mouse production, has only had limited success in producing transgenic animals of larger or higher species. Here, we report a linker based sperm-mediated gene transfer method (LB-SMGT) that greatly improves the production efficiency of large transgenic animals. Results The linker protein, a monoclonal ...

  10. Network-based prediction and knowledge mining of disease genes.

    Science.gov (United States)

    Carson, Matthew B; Lu, Hui

    2015-01-01

    In recent years, high-throughput protein interaction identification methods have generated a large amount of data. When combined with the results from other in vivo and in vitro experiments, a complex set of relationships between biological molecules emerges. The growing popularity of network analysis and data mining has allowed researchers to recognize indirect connections between these molecules. Due to the interdependent nature of network entities, evaluating proteins in this context can reveal relationships that may not otherwise be evident. We examined the human protein interaction network as it relates to human illness using the Disease Ontology. After calculating several topological metrics, we trained an alternating decision tree (ADTree) classifier to identify disease-associated proteins. Using a bootstrapping method, we created a tree to highlight conserved characteristics shared by many of these proteins. Subsequently, we reviewed a set of non-disease-associated proteins that were misclassified by the algorithm with high confidence and searched for evidence of a disease relationship. Our classifier was able to predict disease-related genes with 79% area under the receiver operating characteristic (ROC) curve (AUC), which indicates the tradeoff between sensitivity and specificity and is a good predictor of how a classifier will perform on future data sets. We found that a combination of several network characteristics including degree centrality, disease neighbor ratio, eccentricity, and neighborhood connectivity help to distinguish between disease- and non-disease-related proteins. Furthermore, the ADTree allowed us to understand which combinations of strongly predictive attributes contributed most to protein-disease classification. In our post-processing evaluation, we found several examples of potential novel disease-related proteins and corresponding literature evidence. 
    In addition, we showed that first- and second-order neighbors in the PPI network
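
    Two of the network characteristics named above can be computed in a few lines. This is a minimal sketch on a toy graph; the definition of disease neighbor ratio used here (the fraction of a protein's neighbors that are already disease-associated) is an assumption about what the authors mean, and all names are hypothetical.

```python
def degree_centrality(graph):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

def disease_neighbor_ratio(graph, disease_proteins):
    """Fraction of each node's neighbors that are known disease proteins (assumed definition)."""
    ratios = {}
    for node, neighbors in graph.items():
        if neighbors:
            hits = sum(1 for m in neighbors if m in disease_proteins)
            ratios[node] = hits / len(neighbors)
        else:
            ratios[node] = 0.0
    return ratios

# Toy undirected interaction network stored as adjacency sets.
toy_ppi = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}, "D": set()}
centrality = degree_centrality(toy_ppi)
ratio = disease_neighbor_ratio(toy_ppi, {"B"})
```

    Features like these would then be fed, one row per protein, to a classifier such as the ADTree mentioned in the abstract.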

  11. Cell based-gene delivery approaches for the treatment of spinal cord injury and neurodegenerative disorders.

    Science.gov (United States)

    Taha, Masoumeh Fakhr

    2010-03-01

    Cell-based gene delivery has provided an important therapeutic strategy for different disorders in recent years. This strategy is based on the transplantation of cells genetically modified to express specific genes and to target the delivery of therapeutic factors, especially for the treatment of cancers and neurological, immunological, cardiovascular and haematopoietic disorders. Although preliminary reports are encouraging, and experimental studies indicate functional and structural improvements in animal models of different disorders, universal application of this strategy to human diseases requires more evidence. A number of parameters need to be evaluated, including the optimal cell source, the most effective gene or genes to be delivered, the optimal vector and method of gene delivery into the cells, and the most efficient route for delivering the genetically modified cells into the patient. Also, some obstacles have to be overcome, including the safety and usefulness of the approaches and the stability of the improvements. Here, recent studies concerning cell-based gene delivery for spinal cord injury and some neurodegenerative disorders such as amyotrophic lateral sclerosis, Parkinson's disease and Alzheimer's disease are briefly reviewed, and their exciting consequences are discussed.

  12. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
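
    What "point-based" means here can be shown with the third-degree spherical-radial cubature rule: 2n equally weighted points placed at the mean plus or minus sqrt(n) times the columns of a square root of the covariance. The sketch below only propagates a mean through a nonlinear function; it is not the paper's inference framework, and the example function and names are hypothetical.

```python
import math

def cholesky(a):
    """Lower-triangular factor L with L * L^T = a, for a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def cubature_points(mean, cov):
    """Return the 2n third-degree cubature points and their common weight 1 / (2n)."""
    n = len(mean)
    low = cholesky(cov)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            points.append([mean[i] + sign * scale * low[i][j] for i in range(n)])
    return points, 1.0 / (2 * n)

def propagate_mean(f, mean, cov):
    """Approximate E[f(x)] for x ~ N(mean, cov) using the cubature points."""
    points, weight = cubature_points(mean, cov)
    images = [f(p) for p in points]
    return [weight * sum(img[i] for img in images) for i in range(len(images[0]))]

# Hypothetical nonlinear map; the rule is exact for polynomials up to degree three.
predicted = propagate_mean(lambda x: [x[0] ** 2 + x[1]], [1.0, 2.0], [[4.0, 0.0], [0.0, 1.0]])
```

    For this quadratic example the exact expectation is var(x0) + mean(x0)^2 + mean(x1) = 4 + 1 + 2 = 7, which the four points reproduce.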

  13. Identification of novel risk genes associated with type 1 diabetes mellitus using a genome-wide gene-based association analysis.

    Science.gov (United States)

    Qiu, Ying-Hua; Deng, Fei-Yan; Li, Min-Jing; Lei, Shu-Feng

    2014-11-01

    Type 1 diabetes mellitus is a serious disorder characterized by destruction of pancreatic β-cells, culminating in absolute insulin deficiency. Genetic factors contribute to the susceptibility of type 1 diabetes mellitus. The aim of the present study was to identify more susceptibility genes of type 1 diabetes mellitus. We carried out an initial gene-based genome-wide association study in a total of 4,075 type 1 diabetes mellitus cases and 2,604 controls by using the Gene-based Association Test using Extended Simes procedure. Furthermore, we carried out replication studies, differential expression analysis and functional annotation clustering analysis to support the significance of the identified susceptibility genes. We identified 452 genes associated with type 1 diabetes mellitus, even after adapting the genome-wide threshold for significance (P diabetes mellitus, which were ignored in single-nucleotide polymorphism-based association analysis and were not previously reported. We found that 53 genes have supportive evidence from replication studies and/or differential expression studies. In particular, seven genes including four non-human leukocyte antigen (HLA) genes (RASIP1, STRN4, BCAR1 and MYL2) are replicated in at least one independent population and also differentially expressed in peripheral blood mononuclear cells or monocytes. Furthermore, the associated genes tend to enrich in immune-related pathways or Gene Ontology project terms. The present results suggest the high power of gene-based association analysis in detecting disease-susceptibility genes. Our findings provide more insights into the genetic basis of type 1 diabetes mellitus.
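
    The gene-based test named above builds on the Simes procedure for combining the SNP-level P values within a gene. The sketch below implements only the classical Simes combination, min over j of m * p_(j) / j; the extended version is understood to replace these counts with effective numbers of independent tests to account for linkage disequilibrium, and that adjustment is deliberately omitted here. The function name and input values are hypothetical.

```python
def simes_p_value(p_values):
    """Classical Simes combination of the P values for the SNPs in one gene."""
    ordered = sorted(p_values)
    m = len(ordered)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))

# Three hypothetical SNP-level P values mapped to the same gene.
gene_p = simes_p_value([0.01, 0.04, 0.03])
```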

  14. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

    Directory of Open Access Journals (Sweden)

    Xingli Guo

    Full Text Available The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed to compute it. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene among the top ten in 622 of the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results obtained with this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.

  15. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Science.gov (United States)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  16. Gene therapy for spinomuscular atrophy: a biomedical advance, a missed opportunity for more equitable drug pricing.

    Science.gov (United States)

    Friedmann, T

    2017-09-01

    An experimental approach for gene therapy of spinomuscular atrophy has been reported to prevent development of the neuromuscular features of this lethal and previously untreatable disorder. The approach involves treatment of patients suffering from SMN1-associated infantile form of the disease with a splice-switching antisense oligonucleotide (ASO) that corrects aberrant splicing of the nearly identical SMN2 gene to allow the generation of functional SMN protein, thereby mitigating the development of the disease. This technique represents the first apparently effective therapy for spinal muscular atrophy (SMA) and an important documentation for ASO technology for therapy of neurodegenerative disease. These results with one form of SMA are likely to be relevant for similar applications to other SMA types and are likely to inspire application to a number of other intractable neurodegenerative diseases such as Huntington's disease, amyotrophic lateral sclerosis and possibly even the extremely common Parkinson's and Alzheimer's diseases and others. Nevertheless, the scientific and medical importance of this advance is marred by a pricing policy by the corporate sponsors that may complicate accessibility of the drug for some desperate patients.

  17. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    Science.gov (United States)

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  18. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

    Directory of Open Access Journals (Sweden)

    Emre Guney

    Full Text Available Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequence variations underlying human disorders help to build a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results suggest that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent than in the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.

  19. A Morpholino-based screen to identify novel genes involved in craniofacial morphogenesis

    Science.gov (United States)

    Melvin, Vida Senkus; Feng, Weiguo; Hernandez-Lagunas, Laura; Artinger, Kristin Bruk; Williams, Trevor

    2014-01-01

    BACKGROUND The regulatory mechanisms underpinning facial development are conserved between diverse species. Therefore, results from model systems provide insight into the genetic causes of human craniofacial defects. Previously, we generated a comprehensive dataset examining gene expression during development and fusion of the mouse facial prominences. Here, we used this resource to identify genes that have dynamic expression patterns in the facial prominences, but for which only limited information exists concerning developmental function. RESULTS This set of ~80 genes was used for a high throughput functional analysis in the zebrafish system using Morpholino gene knockdown technology. This screen revealed three classes of cranial cartilage phenotypes depending upon whether knockdown of the gene affected the neurocranium, viscerocranium, or both. The targeted genes that produced consistent phenotypes encoded proteins linked to transcription (meis1, meis2a, tshz2, vgll4l), signaling (pkdcc, vlk, macc1, wu:fb16h09), and extracellular matrix function (smoc2). The majority of these phenotypes were not altered by reduction of p53 levels, demonstrating that both p53 dependent and independent mechanisms were involved in the craniofacial abnormalities. CONCLUSIONS This Morpholino-based screen highlights new genes involved in development of the zebrafish craniofacial skeleton with wider relevance to formation of the face in other species, particularly mouse and human. PMID:23559552

  20. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Full Text Available Abstract Background We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
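
    The entropy measure that drives chunk elimination can be sketched as the Shannon entropy of a histogram of the classifier weights. The binning scheme, the number of bins, and the function name below are assumptions made for illustration; the abstract does not specify how the weights are normalized or how the entropy is turned into an elimination decision.

```python
import math

def weight_entropy(weights, bins=4):
    """Shannon entropy (in bits) of the histogram of weights rescaled to [0, 1]."""
    low, high = min(weights), max(weights)
    if high == low:
        return 0.0  # All weights identical: a single occupied bin.
    counts = [0] * bins
    for w in weights:
        index = min(int((w - low) / (high - low) * bins), bins - 1)
        counts[index] += 1
    n = len(weights)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Hypothetical weight vectors: evenly spread versus concentrated near zero.
spread = weight_entropy([0.0, 0.3, 0.6, 1.0])
concentrated = weight_entropy([0.0, 0.01, 0.02, 0.01, 0.0, 1.0])
```

    A low entropy indicates that most weights fall in the same bin, which is the situation in which removing many low-weight genes in one step is plausible.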

  1. Single-gene prognostic signatures for advanced stage serous ovarian cancer based on 1257 patient samples.

    Science.gov (United States)

    Zhang, Fan; Yang, Kai; Deng, Kui; Zhang, Yuanyuan; Zhao, Weiwei; Xu, Huan; Rong, Zhiwei; Li, Kang

    2018-04-16

    We sought to identify stable single-gene prognostic signatures based on a large collection of advanced stage serous ovarian cancer (AS-OvCa) gene expression data and explore their functions. The empirical Bayes (EB) method was used to remove the batch effect and integrate 8 ovarian cancer datasets. Univariate Cox regression was used to evaluate the association between gene and overall survival (OS). The Database for Annotation, Visualization and Integrated Discovery (DAVID) tool was used for the functional annotation of genes for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The batch effect was removed by the EB method, and 1257 patient samples were used for further analysis. We selected 341 single-gene prognostic signatures with FDR matrix organization, focal adhesion and DNA replication which are closely associated with cancer. We used the EB method to remove the batch effect of 8 datasets, integrated these datasets and identified stable prognosis signatures for AS-OvCa.

  2. Prediction of highly expressed genes in microbes based on chromatin accessibility

    Directory of Open Access Journals (Sweden)

    Ussery David W

    2007-02-01

    Full Text Available Abstract Background It is well known that gene expression is dependent on chromatin structure in eukaryotes and it is likely that chromatin can play a role in bacterial gene expression as well. Here, we use a nucleosomal position preference measure of anisotropic DNA flexibility to predict highly expressed genes in microbial genomes. We compare these predictions with those based on codon adaptation index (CAI) values, and also with experimental data for 6 different microbial genomes, with a particular interest in experimental data from Escherichia coli. Moreover, position preference is examined further in 328 sequenced microbial genomes. Results We find that absolute gene expression levels are correlated with the position preference in many microbial genomes. It is postulated that in these regions, the DNA may be more accessible to the transcriptional machinery. Moreover, ribosomal proteins and ribosomal RNA are encoded by DNA having significantly lower position preference values than other genes in fast-replicating microbes. Conclusion This insight into DNA structure-dependent gene expression in microbes may be exploited for predicting the expression of non-translated genes such as non-coding RNAs that may not be predicted by any of the conventional codon usage bias approaches.
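
    For comparison, the codon adaptation index used as the baseline predictor is the geometric mean of per-codon relative adaptiveness values. The sketch below assumes those values are already available in a lookup table; the two-entry table is a hypothetical toy, not a real reference set, and codons missing from the table are skipped.

```python
import math

def codon_adaptation_index(sequence, relative_adaptiveness):
    """Geometric mean of the relative adaptiveness of the codons in a coding sequence."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - len(sequence) % 3, 3)]
    logs = [math.log(relative_adaptiveness[c]) for c in codons if c in relative_adaptiveness]
    return math.exp(sum(logs) / len(logs))

# Hypothetical table: AAA is the preferred lysine codon, AAG is used a quarter as often.
toy_table = {"AAA": 1.0, "AAG": 0.25}
cai = codon_adaptation_index("AAAAAG", toy_table)
```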

  3. Weighted functional linear regression models for gene-based association analysis.

    Science.gov (United States)

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
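
    Weights "defined by allele frequencies via the beta distribution" can be sketched by evaluating a beta density at each variant's minor allele frequency. The shape parameters (1, 25) used below are a common convention in rare-variant testing and are an assumption here, since the abstract does not state which parameters were used; this is also not the FREGAT interface.

```python
import math

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def variant_weights(minor_allele_frequencies, a=1.0, b=25.0):
    """One weight per variant; with a=1, b=25 rarer variants receive larger weights."""
    return [beta_pdf(maf, a, b) for maf in minor_allele_frequencies]

# Hypothetical minor allele frequencies for a rare and a common variant.
weights = variant_weights([0.01, 0.2])
```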

  4. Tipping the Proteome with Gene-Based Vaccines: Weighing in on the Role of Nanomaterials

    International Nuclear Information System (INIS)

    Flores, K.J.; Craig, M.; Smith, J.J.; DeLong, R.K.; Wanekaya, A.; Dong, L.

    2012-01-01

    Since the first generation of DNA vaccines was introduced in 1988, remarkable progress has been made in improving their efficacy and immunogenicity. Although human clinical trials have shown that delivery of DNA vaccines is well tolerated and safe, the potency of these vaccines in humans is somewhat less than optimal. The development of a gene-based vaccine effective enough to be approved for clinical use in humans would be one of the most important advances in vaccines to date, if not the most important. This paper highlights the literature relating to gene-based vaccines, specifically DNA vaccines, and suggests possible approaches to boost their performance. In addition, we explore the idea that combining RNA and nanomaterials may hold the key to successful gene-based vaccines for prevention and treatment of disease.

  5. A sight on protein-based nanoparticles as drug/gene delivery systems.

    Science.gov (United States)

    Salatin, Sara; Jelvehgari, Mitra; Maleki-Dizaj, Solmaz; Adibkia, Khosro

    2015-01-01

    Polymeric nanomaterials have been extensively applied for the preparation of targeted and controlled release drug/gene delivery systems. However, problems involved in the formulation of synthetic polymers, such as the use of toxic solvents and surfactants, have limited their desirable applications. In this regard, natural biomolecules including proteins and polysaccharides are suitable alternatives due to their safety. According to the literature, protein-based nanoparticles possess many advantages for drug and gene delivery, such as biocompatibility, biodegradability and the ability to be functionalized with targeting ligands. This review provides a general overview of the application of biodegradable protein-based nanoparticles in drug/gene delivery based on their origins. Their unique physicochemical properties that help them to be formulated as pharmaceutical carriers are also discussed.

  6. Hessian regularization based non-negative matrix factorization for gene expression data clustering.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Wang, Congzhi

    2015-01-01

    Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze gene expression data. Data representation plays an important role in clustering analysis. Non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method whose natural properties give it stronger extrapolating power, especially for small sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for a subsequent clustering task. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and the original NMF, which suggests the potential application of HR-NMF to gene expression data.
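
    As background for the comparison above, the original (unregularized) NMF can be fitted with the standard multiplicative update rules for the squared-error objective. The sketch below shows only that baseline; the Hessian regularization term proposed in the paper is not included, and the toy matrix, rank, and helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def squared_error(v, w, h):
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=300, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (genes x samples) as w * h with multiplicative updates."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(len(v))]
    h = [[rng.random() + 0.1 for _ in range(len(v[0]))] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(len(h[0]))]
             for a in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(len(w))]
    return w, h

# Toy rank-2 expression matrix: two groups of genes with distinct sample patterns.
toy = [[1.0, 0.0, 1.0], [2.0, 0.0, 2.0], [0.0, 3.0, 3.0], [0.0, 1.0, 1.0]]
w0, h0 = nmf(toy, rank=2, iters=0)
w, h = nmf(toy, rank=2)
```

    Clustering would then be performed on the rows of w or the columns of h rather than on the raw expression values.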

  7. GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

    Science.gov (United States)

    Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

    2010-04-01

    A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
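
    The standard ORA that the abstract contrasts with GO-Bayes can be sketched as a one-sided hypergeometric test applied to a single GO term in isolation. This is only the baseline test, not the Bayesian method itself; the function name and the gene counts below are hypothetical.

```python
from math import comb

def hypergeom_ora_pvalue(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 genes measured, 5 annotated with the term,
# 4 genes in the selected group, 3 of which carry the annotation.
p = hypergeom_ora_pvalue(20, 5, 4, 3)
```

    Because each term is tested on its own, closely related parent and child terms are scored independently, which is the limitation the Bayesian approach is described as addressing.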

  8. Expression-based clustering of CAZyme-encoding genes of Aspergillus niger.

    Science.gov (United States)

    Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P

    2017-11-23

    The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation, enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to plant biomass related carbon sources were analyzed for the wild-type strain N402 grown on a large range of carbon sources and for the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes, which go beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified based on their expression profiles. Notably, in several cases the expression profile calls into question the function assignment of uncharacterized genes that was based on homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of the enzymes encoded by these uncharacterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and therefore an even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight into the complex regulatory systems that drive the conversion of plant biomass by fungi. In

  9. Cellular automata-based artificial life system of horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Ji-xin Liu

    2016-02-01

    Full Text Available Mutation and natural selection are the core of Darwin's idea of evolution, and many algorithms and models are based on this idea. However, a growing body of research on the evolution of prokaryotes indicates that horizontal gene transfer (HGT) may be far more important and universal than the authors had imagined. Owing to this mechanism, prokaryotes not only adapt to nearly any environment on Earth, but also form a global genetic bank and a super communication network comprising all the genes of the prokaryotic world. Against this background, the authors present a novel cellular automata model, general gene transfer, to simulate and study vertical gene transfer and HGT in prokaryotes. They also use Schrödinger's theory of life to formulate evaluation indices and to discuss the intelligence and cognition of prokaryotes that derive from HGT.

  10. Analysis of mammalian gene function through broad based phenotypic screens across a consortium of mouse clinics

    Science.gov (United States)

    Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl MJ; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie

    2015-01-01

    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse ES cell knockout resource provides a basis for characterisation of relationships between gene and phenotype. The EUMODIC consortium developed and validated robust methodologies for broad-based phenotyping of knockouts through a pipeline comprising 20 disease-orientated platforms. We developed novel statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no prior functional annotation. We captured data from over 27,000 mice finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. Novel phenotypes were uncovered for many genes with unknown function providing a powerful basis for hypothesis generation and further investigation in diverse systems. PMID:26214591

  11. Clustering gene expression data based on predicted differential effects of GV interaction.

    Science.gov (United States)

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarrays have become a popular technology in biological and medical research. However, systematic and stochastic variability in microarray data is expected and unavoidable, so the raw measurements carry inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may bias the interpretation when identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches is proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrate the application of our method with a gene expression dataset and demonstrate the utility of our approach using an external validation.

  12. Study of hepatitis B virus gene mutations with enzymatic colorimetry-based DNA microarray.

    Science.gov (United States)

    Mao, Hailei; Wang, Huimin; Zhang, Donglei; Mao, Hongju; Zhao, Jianlong; Shi, Jian; Cui, Zhichu

    2006-01-01

    To establish a modified microarray method for detecting HBV gene mutations in the clinic. Site-specific oligonucleotide probes were immobilized to microarray slides and hybridized to biotin-labeled HBV gene fragments amplified from two-step PCR. Hybridized targets were transferred to nitrocellulose membranes, followed by intensity measurement using BCIP/NBT colorimetry. HBV genes from 99 Hepatitis B patients and 40 healthy blood donors were analyzed. Mutation frequencies of HBV pre-core/core and basic core promoter (BCP) regions were found to be significantly higher in the patient group (42%, 40% versus 2.5%, 5%, P colorimetry method exhibited the same level of sensitivity and reproducibility. An enzymatic colorimetry-based DNA microarray assay was successfully established to monitor HBV mutations. Pre-core/core and BCP mutations of HBV genes could be major causes of HBV infection in HBeAg-negative patients and could also be relevant to chronicity and aggravation of hepatitis B.

  13. Molecular characterisation of lumpy skin disease virus and sheeppox virus based on P32 gene

    Directory of Open Access Journals (Sweden)

    P.M.A.Rashid

    2017-06-01

    Full Text Available Lumpy skin disease virus (LSDV) and sheeppox virus (SPV) have a considerable economic impact on the cattle and small ruminant industry. They are listed in group A of contagious diseases by the World Organization for Animal Health (OIE). This study addressed the molecular characterisation of the first LSDV outbreak and an endemic SPV in the Kurdistan region of Iraq based on the P32 gene. The results indicated that the P32 gene can be successfully used for diagnosis of LSDV. The phylogenetic and molecular analysis showed that there may be a new LSDV isolate circulating in Kurdistan that uniquely shares a characteristic amino acid with SPV and GPV (leucine at amino acid position 51 in the P32 gene), as well as a few genetically distinct SPV isolates causing pox disease in Kurdistan sheep. This study provides P32 gene sequence information for several LSDV isolates, which benefits the epidemiological study of Capripoxvirus.

  14. Network-based association of hypoxia-responsive genes with cardiovascular diseases

    International Nuclear Information System (INIS)

    Wang, Rui-Sheng; Oldham, William M; Loscalzo, Joseph

    2014-01-01

    Molecular oxygen is indispensable for cellular viability and function. Hypoxia is a stress condition in which oxygen demand exceeds supply. Low cellular oxygen content induces a number of molecular changes to activate regulatory pathways responsible for increasing the oxygen supply and optimizing cellular metabolism under limited oxygen conditions. Hypoxia plays critical roles in the pathobiology of many diseases, such as cancer, heart failure, myocardial ischemia, stroke, and chronic lung diseases. Although the complicated associations between hypoxia and cardiovascular (and cerebrovascular) diseases (CVD) have been recognized for some time, there are few studies that investigate their biological link from a systems biology perspective. In this study, we integrate hypoxia genes, CVD genes, and the human protein interactome in order to explore the relationship between hypoxia and cardiovascular diseases at a systems level. We show that hypoxia genes are much closer to CVD genes in the human protein interactome than that expected by chance. We also find that hypoxia genes play significant bridging roles in connecting different cardiovascular diseases. We construct a hypoxia-CVD bipartite network and find several interesting hypoxia-CVD modules with significant gene ontology similarity. Finally, we show that hypoxia genes tend to have more CVD interactors in the human interactome than in random networks of matching topology. Based on these observations, we can predict novel genes that may be associated with CVD. This network-based association study gives us a broad view of the relationships between hypoxia and cardiovascular diseases and provides new insights into the role of hypoxia in cardiovascular biology. (paper)
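
    The "closer than expected by chance" comparison in this abstract can be sketched as follows: compute the mean shortest-path distance from each gene in one set to its nearest gene in another set, then compare it with the same statistic for randomly drawn gene sets of equal size. This is a simplified stand-in and not the authors' procedure; the toy path-shaped graph, the gene labels and the function names are hypothetical, and the random sets here ignore node degree.

```python
import random
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def mean_closest_distance(graph, set_a, set_b):
    """Average, over genes in set_a, of the distance to the nearest gene in set_b.
    Assumes every gene in set_a can reach at least one gene in set_b."""
    total = 0
    for gene in set_a:
        dist = bfs_distances(graph, gene)
        total += min(dist[b] for b in set_b if b in dist)
    return total / len(set_a)

def proximity_pvalue(graph, set_a, set_b, n_random=1000, seed=0):
    """Empirical probability that a random gene set is at least as close to set_b."""
    rng = random.Random(seed)
    observed = mean_closest_distance(graph, set_a, set_b)
    nodes = sorted(graph)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(nodes, len(set_a))
        if mean_closest_distance(graph, sample, set_b) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)

# Toy interactome (hypothetical): a chain g0 - g1 - ... - g9.
names = [f"g{i}" for i in range(10)]
graph = {n: [] for n in names}
for left, right in zip(names, names[1:]):
    graph[left].append(right)
    graph[right].append(left)

observed, p_value = proximity_pvalue(graph, ["g1", "g2"], ["g0"])
```

    A closer match to the abstract's comparison against "random networks of matching topology" would randomize the network itself while preserving degrees rather than resampling gene sets.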

  15. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    International Nuclear Information System (INIS)

    Gur-Dedeoglu, Bala; Konu, Ozlen; Kir, Serkan; Ozturk, Ahmet Rasit; Bozkurt, Betul; Ergul, Gulusan; Yulug, Isik G

    2008-01-01

    Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real-time qRT-PCR supported the meta-analysis results. 
The
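
    One common way to use resampling to judge differential expression between two sample classes is a permutation test on the difference in class means, sketched below for a single gene. The abstract does not specify the authors' exact resampling scheme, so this is an assumed, generic variant; the expression values and the function name are hypothetical.

```python
import random
from statistics import mean

def permutation_pvalue(group_a, group_b, n_resamples=2000, seed=0):
    """Share of label shufflings whose absolute mean difference is at least
    as large as the observed one (with a +1 correction)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical log-expression values for one gene.
normal = [5.1, 4.9, 5.0, 5.2, 4.8]
tumor = [7.9, 8.1, 8.0, 8.2, 7.8]
p_value = permutation_pvalue(normal, tumor)
```

    In a meta-analysis setting the same test would be repeated per gene and per dataset, and only genes that remain significant across datasets would be kept.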

  16. A resampling-based meta-analysis for detection of differential gene expression in breast cancer

    Directory of Open Access Journals (Sweden)

    Ergul Gulusan

    2008-12-01

    Full Text Available Abstract Background Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. Methods A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. Results The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. 
The expression results of the selected genes obtained through real
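
    The comparability of the two microarray studies is reported as a Pearson correlation coefficient. A minimal sketch of that statistic follows; the per-gene values are invented for illustration and the function name is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical per-gene fold changes measured in two independent studies.
study_1 = [1.2, -0.4, 2.1, 0.3, -1.5]
study_2 = [1.0, -0.2, 1.8, 0.5, -1.1]
r = pearson_r(study_1, study_2)
```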

  17. Double-Bottom Chaotic Map Particle Swarm Optimization Based on Chi-Square Test to Determine Gene-Gene Interactions

    Science.gov (United States)

    Yang, Cheng-Hong; Chang, Hsueh-Wei

    2014-01-01

    Gene-gene interaction studies focus on the investigation of the association between the single nucleotide polymorphisms (SNPs) of genes for disease susceptibility. Statistical methods are widely used to search for a good model of gene-gene interaction for disease analysis, and the previously determined models have successfully explained the effects between SNPs and diseases. However, the huge numbers of potential combinations of SNP genotypes limit the use of statistical methods for analysing high-order interaction, and finding an available high-order model of gene-gene interaction remains a challenge. In this study, an improved particle swarm optimization with double-bottom chaotic maps (DBM-PSO) was applied to assist statistical methods in the analysis of associated variations to disease susceptibility. A big data set was simulated using the published genotype frequencies of 26 SNPs amongst eight genes for breast cancer. Results showed that the proposed DBM-PSO successfully determined two- to six-order models of gene-gene interaction for the risk association with breast cancer (odds ratio > 1.0; P value <0.05). Analysis results supported that the proposed DBM-PSO can identify good models and provide higher chi-square values than conventional PSO. This study indicates that DBM-PSO is a robust and precise algorithm for determination of gene-gene interaction models for breast cancer. PMID:24895547
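
    The chi-square value and odds ratio that the abstract uses to rate a candidate SNP-combination model can be computed from a 2x2 table of cases and controls with and without the genotype combination. The sketch below shows only that scoring step, with no swarm optimization; the counts and function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table
        cases with combination (a)     controls with combination (b)
        cases without combination (c)  controls without combination (d)"""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def odds_ratio(a, b, c, d):
    """Odds of carrying the combination in cases relative to controls."""
    return (a * d) / (b * c)

# Hypothetical counts for one candidate genotype combination.
chi2 = chi_square_2x2(30, 10, 20, 40)   # 100 / 6, roughly 16.67
ratio = odds_ratio(30, 10, 20, 40)      # 6.0
```

    A search procedure such as PSO would evaluate this score for many candidate combinations and keep those with the highest chi-square value; at 1 degree of freedom, values above about 3.84 correspond to P < 0.05.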

  18. Mesenchymal stem cell-based gene therapy: A promising therapeutic strategy.

    Science.gov (United States)

    Mohammadian, Mozhdeh; Abasi, Elham; Akbarzadeh, Abolfazl

    2016-08-01

    Mesenchymal stem cells (MSCs) are multipotent stromal cells that exist in bone marrow, fat, and many other tissues, and can differentiate into a variety of cell types including osteoblasts, chondrocytes, and adipocytes, as well as myocytes and neurons. Moreover, they have great capacity for self-renewal while maintaining their multipotency. Their capacity for proliferation and differentiation, in addition to their immunomodulatory activity, makes them very promising candidates for cell-based regenerative medicine. MSCs can also mobilize to the site of damage: following intravenous transplantation, they migrate to the site of injury via their chemokine receptors. For this reason, they can be applied in MSC-based gene therapy. In this new therapeutic method, genes of interest are introduced into MSCs by viral or non-viral methods, leading to transgene expression in the cells. Although stem cell-based gene therapy is a relatively new strategy, it offers new hope for the treatment of a variety of genetic disorders. In the near future, MSCs may be of use in a vast number of clinical applications because of their uncomplicated isolation, culture, and genetic manipulation. However, careful consideration is still required before they are used in clinical trials, because the number of studies demonstrating the beneficial effects of MSC-based gene therapy is still limited.

  19. Environmental Application of Reporter-Genes Based Biosensors for Chemical Contamination Screening

    Directory of Open Access Journals (Sweden)

    Matejczyk Marzena

    2014-12-01

    Full Text Available The paper presents research on the possible applications of reporter-gene-based microorganisms, including a selective presentation of the advantages and drawbacks of recent methodological solutions for constructing the genetic systems of biosensing elements for environmental research. The most robust and popular genetic fusions and new trends in reporter gene technology are described, such as LacZ (β-galactosidase), xylE (catechol 2,3-dioxygenase), gfp (green fluorescent protein) and its mutated forms, lux (prokaryotic luciferase), luc (eukaryotic luciferase), phoA (alkaline phosphatase), gusA and gurA (β-glucuronidase), and antibiotic and heavy-metal resistance. Reporter-gene-based biosensors that use genetically modified bacteria and yeast work successfully for genotoxicity, bioavailability and oxidative stress assessment, and for the detection and monitoring of toxic compounds in drinking water and various environmental samples: surface water, soil and sediments.

  20. Development of an ELA-DRA gene typing method based on pyrosequencing technology.

    Science.gov (United States)

    Díaz, S; Echeverría, M G; It, V; Posik, D M; Rogberg-Muñoz, A; Pena, N L; Peral-García, P; Vega-Pla, J L; Giovambattista, G

    2008-11-01

    The polymorphism of the equine lymphocyte antigen (ELA) class II DRA gene has been detected by polymerase chain reaction-single-strand conformational polymorphism (PCR-SSCP) and reference strand-mediated conformation analysis. These methodologies allowed the identification of 11 ELA-DRA exon 2 sequences, three of which are widely distributed among domestic horse breeds. Herein, we describe the development of a pyrosequencing-based method applicable to ELA-DRA typing, by screening samples from eight different horse breeds previously typed by PCR-SSCP. This sequence-based method would be useful in high-throughput genotyping of major histocompatibility complex genes in horses and other animal species, making this system attractive as a rapid screening method for animal genotyping of immune-related genes.

  1. Cytomegalovirus replicon-based regulation of gene expression in vitro and in vivo.

    Directory of Open Access Journals (Sweden)

    Hermine Mohr

    Full Text Available There is increasing evidence for a connection between DNA replication and the expression of adjacent genes. Therefore, this study addressed the question of whether a herpesvirus origin of replication can be used to activate or increase the expression of adjacent genes. Cell lines carrying an episomal vector, in which reporter genes are linked to the murine cytomegalovirus (MCMV) origin of lytic replication (oriLyt), were constructed. Reporter gene expression was silenced by a histone-deacetylase-dependent mechanism, but was resolved upon lytic infection with MCMV. Replication of the episome was observed subsequent to infection, leading to the induction of gene expression by more than 1000-fold. oriLyt-based regulation thus provided a unique opportunity for virus-induced conditional gene expression without the need for an additional induction mechanism. This principle was exploited to show effective late trans-complementation of the toxic viral protein M50 and the glycoprotein gO of MCMV. Moreover, the application of this principle for intracellular immunization against herpesvirus infection was demonstrated. The results of the present study show that viral infection specifically activated the expression of a dominant-negative transgene, which inhibited viral growth. This conditional system was operative in explant cultures of transgenic mice, but not in vivo. Several applications are discussed.

  2. A pathway-based network analysis of hypertension-related genes

    Science.gov (United States)

    Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

    2016-02-01

    The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, and it is especially useful for finding core functions and the global behavior of biological systems. Hypertension is a complex disease with many causes, including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not the scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. Using a threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on combinations of drugs targeting multiple genes.
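
    The hub-selection step can be illustrated with a centrality score and a cutoff. The abstract does not define its "integrated centrality", so the sketch below substitutes degree normalized by the largest degree in the network, which is an assumption; only the 0.71 threshold comes from the record. The toy graph and node labels are hypothetical.

```python
def normalized_degree(graph):
    """Degree of each node divided by the largest degree in the network."""
    max_degree = max(len(neighbors) for neighbors in graph.values())
    return {node: len(neighbors) / max_degree for node, neighbors in graph.items()}

def hub_genes(graph, threshold=0.71):
    """Nodes whose score exceeds the threshold, in alphabetical order."""
    scores = normalized_degree(graph)
    return sorted(node for node, score in scores.items() if score > threshold)

# Toy network (hypothetical): A is linked to every other node, B and C to each other.
graph = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
    "E": ["A"],
}
hubs = hub_genes(graph)
```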

  3. Machine learning approaches to supporting the identification of photoreceptor-enriched genes based on expression data

    Directory of Open Access Journals (Sweden)

    Simpson David

    2006-03-01

    Full Text Available Abstract Background Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision. Many retinal diseases occur as a result of inherited dysfunction of the rod and cone photoreceptor cells. Development and maintenance of photoreceptors requires appropriate regulation of the many genes specifically or highly expressed in these cells. Over the last decades, different experimental approaches have been developed to identify photoreceptor enriched genes. Recent progress in RNA analysis technology has generated large amounts of gene expression data relevant to retinal development. This paper assesses a machine learning methodology for supporting the identification of photoreceptor enriched genes based on expression data. Results Based on the analysis of publicly-available gene expression data from the developing mouse retina generated by serial analysis of gene expression (SAGE), this paper presents a predictive methodology comprising several in silico models for detecting key complex features and relationships encoded in the data, which may be useful to distinguish genes in terms of their functional roles. In order to understand temporal patterns of photoreceptor gene expression during retinal development, a two-way cluster analysis was first performed. By clustering SAGE libraries, a hierarchical tree reflecting relationships between developmental stages was obtained. By clustering SAGE tags, a more comprehensive expression profile for photoreceptor cells was revealed. To demonstrate the usefulness of machine learning-based models in predicting functional associations from the SAGE data, three supervised classification models were compared. The results indicated that a relatively simple instance-based model (KStar model) performed significantly better than relatively more complex algorithms, e.g. neural networks. 
To deal with the problem of functional class imbalance occurring in the dataset, two data re

  4. A comparison of 100 human genes using an alu element-based instability model.

    Directory of Open Access Journals (Sweden)

    George W Cook

    Full Text Available The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.

  5. A comparison of 100 human genes using an alu element-based instability model.

    Science.gov (United States)

    Cook, George W; Konkel, Miriam K; Walker, Jerilyn A; Bourgeois, Matthew G; Fullerton, Mitchell L; Fussell, John T; Herbold, Heath D; Batzer, Mark A

    2013-01-01

    The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.

  6. Avirulence (AVR) Gene-Based Diagnosis Complements Existing Pathogen Surveillance Tools for Effective Deployment of Resistance (R) Genes Against Rice Blast Disease.

    Science.gov (United States)

    Selisana, S M; Yanoria, M J; Quime, B; Chaipanya, C; Lu, G; Opulencia, R; Wang, G-L; Mitchell, T; Correll, J; Talbot, N J; Leung, H; Zhou, B

    2017-06-01

    Avirulence (AVR) genes in Magnaporthe oryzae, the fungal pathogen that causes the devastating rice blast disease, have been documented to be major targets subject to mutations to avoid recognition by resistance (R) genes. In this study, an AVR-gene-based diagnosis tool for determining the virulence spectrum of a rice blast pathogen population was developed and validated. A set of 77 single-spore field isolates was subjected to pathotype analysis using differential lines, each containing a single R gene, and classified into 20 virulent pathotypes, except for 4 isolates that lost pathogenicity. In all, 10 differential lines showed low frequency (95%), inferring the effectiveness of R genes present in the respective differential lines. In addition, the haplotypes of seven AVR genes were determined by polymerase chain reaction amplification and sequencing, if applicable. The calculated frequency of different AVR genes displayed significant variations in the population. AVRPiz-t and AVR-Pii were detected in 100 and 84.9% of the isolates, respectively. Five AVR genes such as AVR-Pik-D (20.5%) and AVR-Pik-E (1.4%), AVRPiz-t (2.7%), AVR-Pita (0%), AVR-Pia (0%), and AVR1-CO39 (0%) displayed low or even zero frequency. The frequency of AVR genes correlated almost perfectly with the resistance frequency of the cognate R genes in differential lines, except for International Rice Research Institute-bred blast-resistant lines IRBLzt-T, IRBLta-K1, and IRBLkp-K60. Both genetic analysis and molecular marker validation revealed an additional R gene, most likely Pi19 or its allele, in these three differential lines. This can explain the spuriously higher resistance frequency of each target R gene based on conventional pathotyping. This study demonstrates that AVR-gene-based diagnosis provides a precise, R-gene-specific, and differential line-free assessment method that can be used for determining the virulence spectrum of a rice blast pathogen population and for predicting the

  7. RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

    Science.gov (United States)

    Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-01-01

    Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules over items (or genes) produced by association rule mining (ARM) algorithms confuses the decision maker. In this article, we propose a weighted rule-mining technique (RANWAR, or rank-based weighted association rule mining) that ranks the rules using two novel rule-interestingness measures, viz., rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to bypass the problem. These measures basically depend on the rank of items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than the state-of-the-art association rule mining algorithms and thus saves execution time. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant to the related diseases. Finally, the top rules produced by RANWAR that are not in Apriori are reported.
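
    The abstract does not state the formulas for wcs and wcc, so the following is only a minimal sketch of the general idea of weighting items by rank before scoring rules. The linear weight scheme, the scaling of plain support by the mean item weight, and every function name here are assumptions made for illustration; they are not RANWAR's published definitions.

```python
from typing import Dict, FrozenSet, List


def rank_weights(ranked_genes: List[str]) -> Dict[str, float]:
    """Map each gene to a weight in (0, 1]; the top-ranked gene gets 1.0.

    Assumed linear scheme: weight = (n - position) / n, position starting at 0.
    """
    n = len(ranked_genes)
    return {gene: (n - pos) / n for pos, gene in enumerate(ranked_genes)}


def weighted_support(itemset: FrozenSet[str],
                     transactions: List[FrozenSet[str]],
                     weights: Dict[str, float]) -> float:
    """Plain support of the itemset scaled by the mean weight of its items."""
    if not itemset or not transactions:
        return 0.0
    support = sum(itemset <= t for t in transactions) / len(transactions)
    mean_weight = sum(weights[g] for g in itemset) / len(itemset)
    return support * mean_weight


def weighted_confidence(antecedent: FrozenSet[str],
                        consequent: FrozenSet[str],
                        transactions: List[FrozenSet[str]],
                        weights: Dict[str, float]) -> float:
    """Weighted support of the full rule divided by that of the antecedent.

    This is a heuristic score; unlike plain confidence it is not
    guaranteed to lie in [0, 1].
    """
    denom = weighted_support(antecedent, transactions, weights)
    if denom == 0:
        return 0.0
    full = weighted_support(antecedent | consequent, transactions, weights)
    return full / denom


# Hypothetical toy data: four genes ranked g1 (best) to g4 (worst).
weights = rank_weights(["g1", "g2", "g3", "g4"])
transactions = [
    frozenset({"g1", "g2"}),
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g4"}),
    frozenset({"g2", "g3"}),
]
ws = weighted_support(frozenset({"g1", "g2"}), transactions, weights)
wc = weighted_confidence(frozenset({"g1"}), frozenset({"g2"}), transactions, weights)
```

    Under a scheme of this kind, itemsets made of low-ranked genes score lower than equally frequent itemsets of high-ranked genes, which is one way a rank-based method could prune the candidate set.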

  8. A contribution to the study of plant development evolution based on gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Francisco J. Romero-Campero

    2013-08-01

    Full Text Available Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms.

  9. Beyond the Central Dogma: Model-Based Learning of How Genes Determine Phenotypes

    Science.gov (United States)

    Reinagel, Adam; Speth, Elena Bray

    2016-01-01

    In an introductory biology course, we implemented a learner-centered, model-based pedagogy that frequently engaged students in building conceptual models to explain how genes determine phenotypes. Model-building tasks were incorporated within case studies and aimed at eliciting students' understanding of 1) the origin of variation in a population…

  10. PINTA: a web server for network-based gene prioritization from expression data

    DEFF Research Database (Denmark)

    Nitsch, Daniela; Tranchevent, Léon-Charles; Goncalves, Joana P.

    2011-01-01

    PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction...

  11. The UDP glucuronosyltransferase gene superfamily: suggested nomenclature based on evolutionary divergence

    NARCIS (Netherlands)

    Burchell, B.; Nebert, D. W.; Nelson, D. R.; Bock, K. W.; Iyanagi, T.; Jansen, P. L.; Lancet, D.; Mulder, G. J.; Chowdhury, J. R.; Siest, G.

    1991-01-01

    A nomenclature system for the UDP glucuronosyltransferase superfamily is proposed, based on divergent evolution of the genes. A total of 26 distinct cDNAs in five mammalian species have been sequenced to date. Comparison of the deduced amino acid sequences leads to the definition of two families and

  12. Establishment of a Cre recombinase based mutagenesis protocol for markerless gene deletion in Streptococcus suis.

    Science.gov (United States)

    Koczula, A; Willenborg, J; Bertram, R; Takamatsu, D; Valentin-Weigand, P; Goethe, R

    2014-12-01

    The lack of knowledge about pathogenicity mechanisms of Streptococcus (S.) suis is, at least partially, attributed to limited methods for its genetic manipulation. Here, we established a Cre-lox based recombination system for markerless gene deletions in S. suis serotype 2 with high selective pressure and without undesired side effects. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.

    Science.gov (United States)

    Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju

    2017-04-27

    means of network and pathway-based methodology, we explored the pathogenetic mechanism underlying AD at a systems biology level. Results from our work could provide valuable clues for understanding the molecular mechanism underlying AD. In addition, the framework proposed in this study could be used to investigate the pathological molecular network and genes relevant to other complex diseases or phenotypes.

  14. Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

    Science.gov (United States)

    Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

    2014-01-01

    Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific

  15. Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

    OpenAIRE

    Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.; Bruns, Christiane J.

    2012-01-01

    Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associa...

  16. Analyzing Plasmodium falciparum erythrocyte membrane protein 1 gene expression by a next generation sequencing based method

    DEFF Research Database (Denmark)

    Jespersen, Jakob S.; Petersen, Bent; Seguin-Orlando, Andaine

    2013-01-01

    at identifying PfEMP1 features associated with high virulence. Here we present the first effective method for sequence analysis of var genes expressed in field samples: a sequential PCR and next generation sequencing based technique applied on expressed var sequence tags and subsequently on long range PCR......, encoded by ~60 highly variable 'var' genes per haploid genome. PfEMP1 is exported to the surface of infected erythrocytes and is thought to be fundamental to immune evasion by adhesion to host and parasite factors. The highly variable nature has constituted a roadblock in var expression studies aimed...

  17. Integration of Genome Scale Metabolic Networks and Gene Regulation of Metabolic Enzymes With Physiologically Based Pharmacokinetics.

    Science.gov (United States)

    Maldonado, Elaina M; Leoncikas, Vytautas; Fisher, Ciarán P; Moore, J Bernadette; Plant, Nick J; Kierzek, Andrzej M

    2017-11-01

    The scope of physiologically based pharmacokinetic (PBPK) modeling can be expanded by assimilation of the mechanistic models of intracellular processes from systems biology field. The genome scale metabolic networks (GSMNs) represent a whole set of metabolic enzymes expressed in human tissues. Dynamic models of the gene regulation of key drug metabolism enzymes are available. Here, we introduce GSMNs and review ongoing work on integration of PBPK, GSMNs, and metabolic gene regulation. We demonstrate example models. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  18. Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Science.gov (United States)

    2011-01-01

    Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were

  19. Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

    Science.gov (United States)

    Podsakoff, G; Wong, K K; Chatterjee, S

    1994-09-01

    Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthesis inhibitors fluorodeoxyuridine and aphidicolin or by contact inhibition induced by confluence and serum starvation. Cells in logarithmic growth or DNA synthesis arrest were transduced with vCWR:beta gal, an AAV-based vector encoding beta-galactosidase under Rous sarcoma virus long terminal repeat promoter control. Under each condition tested, vCWR:beta Gal expression in nondividing cells was at least equivalent to that in actively proliferating cells, suggesting that mechanisms for virus attachment, nuclear transport, virion uncoating, and perhaps some limited second-strand synthesis of AAV vectors were present in nondividing cells. Southern hybridization analysis of vector sequences from cells transduced while in DNA synthetic arrest and expanded after release of the block confirmed ultimate integration of the vector genome into cellular chromosomal DNA. These findings may provide the basis for the use of AAV-based vectors for gene transfer into quiescent cell populations such as totipotent hematopoietic stem cells.

  20. Actionable gene-based classification toward precision medicine in gastric cancer

    Directory of Open Access Journals (Sweden)

    Hiroshi Ichikawa

    2017-10-01

    Full Text Available Abstract Background Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions This actionable gene-based

  1. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The inherent high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases the computational time and cost, but also improves the classification performance. Among the different feature selection approaches, however, most suffer from several problems such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent conditions (a higher positive and a lower negative threshold) reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.
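
    The cluster-scoring step described above can be sketched as follows. The abstract gives neither the threshold values nor the exact rule, so the decision rule used here (keep a cluster when its summed expression is at or beyond a positive or negative threshold in at least a given fraction of samples), the parameter names, and the toy data are all assumptions for illustration only.

```python
from typing import Dict, List


def cluster_scores(expr: Dict[str, List[float]], genes: List[str]) -> List[float]:
    """Per-sample sum of expression levels over the genes of one cluster."""
    n_samples = len(expr[genes[0]])
    return [sum(expr[g][s] for g in genes) for s in range(n_samples)]


def select_clusters(expr: Dict[str, List[float]],
                    clusters: Dict[str, List[str]],
                    pos_threshold: float,
                    neg_threshold: float,
                    min_fraction: float) -> List[str]:
    """Return names of clusters judged important under the assumed rule.

    A cluster is kept when, in at least `min_fraction` of the samples,
    its summed expression is >= pos_threshold or <= neg_threshold.
    """
    kept = []
    for name, genes in clusters.items():
        scores = cluster_scores(expr, genes)
        extreme = sum(1 for x in scores
                      if x >= pos_threshold or x <= neg_threshold)
        if extreme / len(scores) >= min_fraction:
            kept.append(name)
    return kept


# Hypothetical toy data: four genes measured in three samples.
expr = {
    "a": [2.0, 1.5, -0.1],
    "b": [1.0, 1.0, 0.2],
    "c": [0.1, -0.2, 0.0],
    "d": [-0.1, 0.1, 0.1],
}
clusters = {"c1": ["a", "b"], "c2": ["c", "d"]}
selected = select_clusters(expr, clusters,
                           pos_threshold=2.0, neg_threshold=-2.0,
                           min_fraction=0.5)
```

    Raising `pos_threshold` and lowering `neg_threshold` makes the filter more stringent, which corresponds to the step in the abstract where the gene count drops from 1117 to 58.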

  2. Time warping of evolutionary distant temporal gene expression data based on noise suppression

    Directory of Open Access Journals (Sweden)

    Papatsenko Dmitri

    2009-10-01

    Full Text Available Abstract Background Comparative analysis of genome wide temporal gene expression data has a broad potential area of application, including evolutionary biology, developmental biology, and medicine. However, at large evolutionary distances, the construction of global alignments and the consequent comparison of the time-series data are difficult. The main reason is the accumulation of variability in expression profiles of orthologous genes, in the course of evolution. Results We applied Pearson distance matrices, in combination with other noise-suppression techniques and data filtering to improve alignments. This novel framework enhanced the capacity to capture the similarities between the temporal gene expression datasets separated by large evolutionary distances. We aligned and compared the temporal gene expression data in budding (Saccharomyces cerevisiae) and fission (Schizosaccharomyces pombe) yeast, which are separated by more than ~400 myr of evolution. We found that the global alignment (time warping) properly matched the duration of cell cycle phases in these distant organisms, which was measured in prior studies. At the same time, when applied to individual ortholog pairs, this alignment procedure revealed groups of genes with distinct alignments, different from the global alignment. Conclusion Our alignment-based predictions of differences in the cell cycle phases between the two yeast species were in a good agreement with the existing data, thus supporting the computational strategy adopted in this study. We propose that the existence of the alternative alignments, specific to distinct groups of genes, suggests presence of different synchronization modes between the two organisms and possible functional decoupling of particular physiological gene networks in the course of evolution.
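
    To illustrate how a Pearson distance matrix can feed a time-warping alignment, here is a minimal sketch. It uses textbook dynamic time warping, which is swapped in as a stand-in because the abstract does not describe the authors' alignment algorithm or their noise-suppression and filtering steps; the data layout (one expression vector per time point, over the same ordered set of orthologs) and all names are assumptions.

```python
from math import sqrt
from typing import List, Tuple


def pearson_distance(x: List[float], y: List[float]) -> float:
    """Return 1 - r, where r is the Pearson correlation of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # correlation undefined; treated as uncorrelated here
    return 1.0 - cov / (sx * sy)


def distance_matrix(series_a: List[List[float]],
                    series_b: List[List[float]]) -> List[List[float]]:
    """Pearson distance between every time point of A and every one of B."""
    return [[pearson_distance(a, b) for b in series_b] for a in series_a]


def dtw_path(dist: List[List[float]]) -> List[Tuple[int, int]]:
    """Classic dynamic time warping; returns the optimal (i, j) index path."""
    n, m = len(dist), len(dist[0])
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist[i - 1][j - 1] + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return path


# Hypothetical toy data: species B spends two time points in the first phase.
series_a = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
series_b = [[1, 2, 3], [1, 2, 3], [3, 1, 2], [2, 3, 1]]
dist = distance_matrix(series_a, series_b)
path = dtw_path(dist)
```

    In the toy data the first time point of A is matched to the first two time points of B, which is the kind of stretching that lets an alignment absorb differences in phase duration between organisms.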

  3. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of pan-ethnic-group products form knowledge primarily depends on a designer's subjective experience without user participation. The majority of studies primarily focus on the detection of the perceptual demands of consumers from the target product category. A pan-ethnic-group products form gene clustering method based on emotional semantic is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer aided product form clustering technology. A case of form gene clustering about the typical pan-ethnic-group products is investigated which indicates that the method is feasible. This paper opens up a new direction for the future development of product form design which improves the agility of product design process in the era of Industry 4.0.

  4. Affinity-based biosensors as promising tools for gene doping detection.

    Science.gov (United States)

    Minunni, Maria; Scarano, Simona; Mascini, Marco

    2008-05-01

    Innovative bioanalytical approaches can be foreseen as interesting means for solving relevant emerging problems in anti-doping control. Sport authorities fear that the newer form of doping, so-called gene doping, based on a misuse of gene therapy, will be undetectable and thus much less preventable. The World Anti-Doping Agency has already asked scientists to assist in finding ways to prevent and detect this newest kind of doping. In this Opinion article we discuss the main aspects of gene doping, from the putative target analytes to suitable sampling strategies. Moreover, we discuss the potential application of affinity sensing in this field, which so far has been successfully applied to a variety of analytical problems, from clinical diagnostics to food and environmental analysis.

  5. Comparison of different cationized proteins as biomaterials for nanoparticle-based ocular gene delivery.

    Science.gov (United States)

    Zorzi, Giovanni K; Párraga, Jenny E; Seijo, Begoña; Sanchez, Alejandro

    2015-11-01

    Cationized polymers have been proposed as transfection agents for gene therapy. The present work aims to improve the understanding of the potential use of different cationized proteins (atelocollagen, albumin and gelatin) as nanoparticle components and to investigate the possibility of modulating the physicochemical properties of the resulting nanoparticle carriers by selecting specific protein characteristics in an attempt to improve current ocular gene-delivery approaches. The toxicity profiles, as well as internalization and transfection efficiency, of the developed nanoparticles can be modulated by modifying the molecular weight of the selected protein and the amine used for cationization. The most promising systems are nanoparticles based on intermediate molecular weight gelatin cationized with the endogenous amine spermine, which exhibit an adequate toxicological profile, as well as effective association and protection of pDNA or siRNA molecules, thereby resulting in higher transfection efficiency and gene silencing than the other studied formulations. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated to the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression module in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  7. Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium.

    Directory of Open Access Journals (Sweden)

    Fengxi Yang

    Full Text Available Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data is available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutants corresponding to different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis. This dataset generated in our study provides new insights into the molecular mechanisms

  8. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    Full Text Available White mold, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major disease of common bean (Phaseolus vulgaris L.). WM7.1 and WM8.3 are two quantitative trait loci (QTL) with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI) population segregating for both QTL were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs) relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.

  9. PHYLOGENETIC RELATIONSHIPS AMONGST 10 Durio SPECIES BASED ON PCR-RFLP ANALYSIS OF TWO CHLOROPLAST GENES

    Directory of Open Access Journals (Sweden)

    Panca J. Santoso

    2013-07-01

    Full Text Available Twenty-seven species of Durio have been identified in Sabah and Sarawak, Malaysia, but their relationships have not been studied. This study was conducted to analyse phylogenetic relationships amongst 10 Durio species in Malaysia using PCR-RFLP on two chloroplast DNA genes, i.e. ndhC-trnV and rbcL. DNAs were extracted from young leaves of 11 accessions from 10 Durio species collected from the Tenom Agriculture Research Station, Sabah, and University Agriculture Park, Universiti Putra Malaysia. Two pairs of oligonucleotide primers, N1-N2 and rbcL1-rbcL2, were used to flank the target regions ndhC-trnV and rbcL. Eight restriction enzymes, HindIII, BsuRI, PstI, TaqI, MspI, SmaI, BshNI, and EcoR130I, were used to digest the amplicons. Based on the results of PCR-RFLP on the ndhC-trnV gene, the 10 Durio species were grouped into five distinct clusters, and the accessions generally showed high variation. However, based on the results of PCR-RFLP on the rbcL gene, the species were grouped into three distinct clusters, and generally showed low variation. This means that the ndhC-trnV gene is more reliable for phylogenetic analysis at lower taxonomic levels of Durio species or for diversity analysis, while the rbcL gene is a reliable marker for phylogenetic analysis at higher taxonomic levels. PCR-RFLP on the ndhC-trnV and rbcL genes could therefore be considered useful markers for phylogenetic analysis amongst Durio species. These findings might be used for further molecular marker-assisted work in Durio breeding programs.

  10. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  11. Improved in vivo gene transfer into tumor tissue by stabilization of pseudodendritic oligoethylenimine-based polyplexes.

    Science.gov (United States)

    Russ, Verena; Fröhlich, Thomas; Li, Yunqiu; Halama, Anna; Ogris, Manfred; Wagner, Ernst

    2010-02-01

    HD O is a low molecular weight pseudodendrimer containing oligoethylenimine and degradable hexanediol diacrylate diesters. DNA polyplexes display encouraging gene transfer efficiency in vitro and in vivo but also a limited stability under physiological conditions. This limitation must be overcome for further development into more sophisticated formulations. HD O polyplexes were laterally stabilized by crosslinking surface amines via bifunctional crosslinkers, bioreducible dithiobis(succinimidyl propionate) (DSP) or the nonreducible analog disuccinimidyl suberate (DSS). Optionally, in a subsequent step, the targeting ligand transferrin (Tf) was attached to DSP-linked HD O polyplexes via Schiff base formation between HD O amino groups and Tf aldehyde groups, which were introduced into Tf by periodate oxidation of the glycosylation sites. Crosslinked DNA polyplexes showed an increased stability against exchange reaction by salt or heparin. Disulfide bond containing DSP-linked polyplexes were susceptible to reducing conditions. These polyplexes displayed the highest gene expression levels in vitro and in vivo (upon intratumoral application in mice), and these were significantly elevated and prolonged over standard or DSS-stabilized HD O formulations. DSP-stabilized HD O polyplexes with or without Tf coating were well-tolerated after intravenous application. High gene expression levels were found in tumor tissue, with negligible gene expression in any other organ. Lateral stabilization of HD O polyplexes with DSP crosslinker enhanced gene transfer efficacy and was essential for the incorporation of a ligand (Tf) into a stable particle formulation.

  12. Probability-based collaborative filtering model for predicting gene-disease associations.

    Science.gov (United States)

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.
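
    The two similarity metrics named in this abstract have standard definitions, sketched here for two numeric profiles of equal length. How PCFM builds those profiles and feeds the similarities into its regularization terms is not stated in the abstract, so the vectors in the usage note are purely hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Vector space (cosine) similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0  # define similarity with a zero vector as 0


def pearson_correlation(u, v):
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])
```

    For example, two hypothetical gene profiles `[1, 1, 0]` and `[1, 0, 0]` share one of their nonzero entries; `cosine_similarity` scores them at about 0.71, while `pearson_correlation` first removes each profile's mean and so scores them lower.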

  13. Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool.

    Science.gov (United States)

    Zaldivar, Andrew; Krichmar, Jeffrey L

    2014-01-01

    The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.

  14. Allen Brain Atlas-Driven Visualizations: A Web-Based Gene Expression Energy Visualization Tool

    Directory of Open Access Journals (Sweden)

    Andrew Zaldivar

    2014-05-01

    Full Text Available The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.

  15. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  16. SoFoCles: feature filtering for microarray classification based on gene ontology.

    Science.gov (United States)

    Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

    2010-02-01

    Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.

  17. A meta-analysis based method for prioritizing candidate genes involved in a pre-specific function

    Directory of Open Access Journals (Sweden)

    Jingjing Zhai

    2016-12-01

    Full Text Available The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is that of unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Towards this goal, we developed a novel gene prioritization method named RafSee, to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the rank of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The ‘leave-one-out’ cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.
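
    One common way to aggregate ranks with order statistics is sketched here; it is not necessarily the exact RAP procedure, which the abstract does not spell out. Each gene's ranks are converted to rank ratios, and the score is the joint probability that the order statistics of uniform random ratios would all be at least as small. The function names and the gene labels in the usage note are hypothetical.

```python
from math import factorial


def q_statistic(rank_ratios):
    """Joint order-statistic probability for a list of rank ratios in (0, 1].

    Uses the recurrence V_0 = 1,
    V_k = sum_{i=1..k} (-1)^(i-1) * V_{k-i} / i! * r_{N-k+1}^i,
    and returns N! * V_N, with the ratios sorted in ascending order.
    """
    r = sorted(rank_ratios)
    n = len(r)
    v = [1.0]
    for k in range(1, n + 1):
        x = r[n - k]  # r_{N-k+1} in 1-based notation
        total = 0.0
        for i in range(1, k + 1):
            total += (-1) ** (i - 1) * v[k - i] / factorial(i) * x ** i
        v.append(total)
    return factorial(n) * v[n]


def aggregate(rankings):
    """Combine several rankings (dicts of gene -> 1-based rank over the same genes).

    Returns the genes ordered from best (smallest score) to worst.
    """
    genes = list(rankings[0])
    n = len(genes)
    scores = {g: q_statistic([ranking[g] / n for ranking in rankings]) for g in genes}
    return sorted(genes, key=lambda g: scores[g])
```

    With two hypothetical rankings `{"A": 1, "B": 2, "C": 3, "D": 4}` and `{"A": 2, "B": 4, "C": 1, "D": 3}`, `aggregate` orders the genes A, C, B, D: gene C moves ahead of B because its top rank in the second list outweighs its lower rank in the first.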

  18. fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

    Science.gov (United States)

    Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

    2015-01-07

    Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there is a set of eight candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses

    OpenAIRE

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-01-01

    Background Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesi...

  20. DGGE based whole-gene mutation scanning of the dystrophin gene in Duchenne and Becker muscular dystrophy patients

    NARCIS (Netherlands)

    Hofstra, RMW; Mulder, IM; Vossen, R; de Koning-Gans, PAM; Kraak, M; Ginjaar, IB; van der Hout, AH; Bakker, E; Buys, CHCM; van Essen, AJ; den Dunnen, JT

    2004-01-01

    Duchenne and Becker muscular dystrophy (DMD and BMD) are caused by mutations in the dystrophin gene. Large rearrangements in the gene are found in about two-thirds of DMD patients, with ~60% carrying deletions and 5-10% carrying duplications. Most of the remaining 30-35% of patients are

  1. DNA base sequence changes induced by ultraviolet light mutagenesis of a gene on a chromosome in Chinese hamster ovary cells

    Energy Technology Data Exchange (ETDEWEB)

    Romac, S; Leong, P; Sockett, H; Hutchinson, F [Yale Univ., New Haven, CT (USA). Dept. of Molecular Biophysics and Biochemistry

    1989-09-20

    The DNA base sequence changes induced by mutagenesis with ultraviolet light have been determined in a gene on a chromosome of cultured Chinese hamster ovary (CHO) cells. The gene was the Escherichia coli gpt gene, of which a single copy was stably incorporated and expressed in the CHO cell genome. The cells were irradiated with ultraviolet light and gpt⁻ colonies were selected by resistance to 6-thioguanine. The gpt gene was amplified from chromosomal DNA by use of the polymerase chain reaction (PCR) and the amplified DNA sequenced directly by the dideoxy method. Of the 58 sequenced mutants of independent origin, 53 were base change mutations. Forty-one base substitutions were single base changes, ten had two adjacent (or tandem) base changes, and one had two base changes separated by a single base-pair. Only one mutant had a multiple base change mutation with two or more well separated base changes. In contrast, much higher levels of such mutations were reported in ultraviolet mutagenesis of genes on a shuttle vector in primate cells. Two deletions of a single base-pair were observed and three deletions ranging from 6 to 37 base-pairs. The mutation spectrum in the gpt gene had similarities to the ultraviolet mutation spectra for several genes in prokaryotes, which suggests similarities in mutational mechanisms in prokaryotes and eukaryotes.

  2. Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups.

    Science.gov (United States)

    Shiao, S Pamela K; Grayson, James; Yu, Chong Ho; Wasek, Brandi; Bottiglieri, Teodoro

    2018-02-16

    For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene-environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls ( p relation to gene-environment interactions in the prevention of CRC.

  3. Tumor Suppressor Gene-Based Nanotherapy: From Test Tube to the Clinic

    Directory of Open Access Journals (Sweden)

    Manish Shanker

    2011-01-01

    Full Text Available Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.

  4. GENECODIS-Grid: An online grid-based tool to predict functional information in gene lists

    International Nuclear Information System (INIS)

    Nogales, R.; Mejia, E.; Vicente, C.; Montes, E.; Delgado, A.; Perez Griffo, F. J.; Tirado, F.; Pascual-Montano, A.

    2007-01-01

    In this work we introduce GeneCodis-Grid, a grid-based alternative to a bioinformatics tool named Genecodis that integrates different sources of biological information to search for biological features (annotations) that frequently co-occur in a set of genes and rank them by statistical significance. GeneCodis-Grid is a web-based application that takes advantage of two independent grid networks and a computer cluster managed by a meta-scheduler and a web server that hosts the application. The mining of concurrent biological annotations provides significant information for the functional analysis of gene lists obtained by high throughput experiments in biology. Due to the large popularity of this tool, which has registered more than 13000 visits since its publication in January 2007, there is a strong need to make it easier for users from different sites to access the system simultaneously. In addition, the complexity of some of the statistical tests used in this approach has made this technique a good candidate for implementation in an opportunistic Grid environment.

  5. Tumor suppressor gene-based nanotherapy: from test tube to the clinic.

    Science.gov (United States)

    Shanker, Manish; Jin, Jiankang; Branch, Cynthia D; Miyamoto, Shinya; Grimm, Elizabeth A; Roth, Jack A; Ramesh, Rajagopal

    2011-01-01

    Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.

  6. Gene expression-based molecular diagnostic system for malignant gliomas is superior to histological diagnosis.

    Science.gov (United States)

    Shirahata, Mitsuaki; Iwao-Koizumi, Kyoko; Saito, Sakae; Ueno, Noriko; Oda, Masashi; Hashimoto, Nobuo; Takahashi, Jun A; Kato, Kikuya

    2007-12-15

    Current morphology-based glioma classification methods do not adequately reflect the complex biology of gliomas, thus limiting their prognostic ability. In this study, we focused on anaplastic oligodendroglioma and glioblastoma, which typically follow distinct clinical courses. Our goal was to construct a clinically useful molecular diagnostic system based on gene expression profiling. The expression of 3,456 genes in 32 patients, 12 and 20 of whom had prognostically distinct anaplastic oligodendroglioma and glioblastoma, respectively, was measured by PCR array. Next to unsupervised methods, we did supervised analysis using a weighted voting algorithm to construct a diagnostic system discriminating anaplastic oligodendroglioma from glioblastoma. The diagnostic accuracy of this system was evaluated by leave-one-out cross-validation. The clinical utility was tested on a microarray-based data set of 50 malignant gliomas from a previous study. Unsupervised analysis showed divergent global gene expression patterns between the two tumor classes. A supervised binary classification model showed 100% (95% confidence interval, 89.4-100%) diagnostic accuracy by leave-one-out cross-validation using 168 diagnostic genes. Applied to a gene expression data set from a previous study, our model correlated better with outcome than histologic diagnosis, and also displayed 96.6% (28 of 29) consistency with the molecular classification scheme used for these histologically controversial gliomas in the original article. Furthermore, we observed that histologically diagnosed glioblastoma samples that shared anaplastic oligodendroglioma molecular characteristics tended to be associated with longer survival. Our molecular diagnostic system showed reproducible clinical utility and prognostic ability superior to traditional histopathologic diagnosis for malignant glioma.
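
    A weighted voting classifier evaluated by leave-one-out cross-validation can be sketched as follows. This is a minimal illustration of the general technique, assuming the usual signal-to-noise weight per gene and a midpoint decision boundary; the abstract does not give the authors' exact weighting or their selection of the 168 diagnostic genes, and gene selection is omitted here. Class labels 0 and 1 and all function names are hypothetical.

```python
from statistics import mean, stdev


def train(samples, labels):
    """Return one (weight, boundary) pair per gene for classes 0 and 1.

    weight   = (mean_0 - mean_1) / (sd_0 + sd_1)   (signal-to-noise ratio)
    boundary = (mean_0 + mean_1) / 2
    Each class needs at least two training samples for the standard deviation.
    """
    model = []
    for g in range(len(samples[0])):
        a = [s[g] for s, y in zip(samples, labels) if y == 0]
        b = [s[g] for s, y in zip(samples, labels) if y == 1]
        spread = stdev(a) + stdev(b)
        weight = (mean(a) - mean(b)) / spread if spread > 0 else 0.0
        model.append((weight, (mean(a) + mean(b)) / 2))
    return model


def predict(model, sample):
    """Sum the per-gene votes; a positive total means class 0, otherwise class 1."""
    score = sum(w * (x - b) for (w, b), x in zip(model, sample))
    return 0 if score > 0 else 1


def loocv_accuracy(samples, labels):
    """Hold out each sample in turn, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(samples)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if predict(train(rest_x, rest_y), samples[i]) == labels[i]:
            hits += 1
    return hits / len(samples)
```

    Because the held-out sample never contributes to the weights or boundaries, `loocv_accuracy` estimates how the classifier behaves on unseen tumors. In practice, gene selection would also have to be repeated inside each fold to avoid an optimistic bias.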

  7. Frequency-based time-series gene expression recomposition using PRIISM

    Directory of Open Access Journals (Sweden)

    Rosa Bruce A

    2012-06-01

    Full Text Available Abstract Background Circadian rhythm pathways influence the expression patterns of as much as 31% of the Arabidopsis genome through complicated interaction pathways, and have been found to be significantly disrupted by biotic and abiotic stress treatments, complicating treatment-response gene discovery methods due to clock pattern mismatches in the fold change-based statistics. The PRIISM (Pattern Recomposition for the Isolation of Independent Signals in Microarray data) algorithm outlined in this paper is designed to separate pattern changes induced by different forces, including treatment-response pathways and circadian clock rhythm disruptions. Results Using the Fourier transform, high-resolution time-series microarray data is projected to the frequency domain. By identifying the clock frequency range from the core circadian clock genes, we separate the frequency spectrum into different sections containing treatment-frequency (representing up- or down-regulation by an adaptive treatment response), clock-frequency (representing the circadian clock-disruption response) and noise-frequency components. Then, we project the components’ spectra back to the expression domain to reconstruct isolated, independent gene expression patterns representing the effects of the different influences. By applying PRIISM on a high-resolution time-series Arabidopsis microarray dataset under a cold treatment, we systematically evaluated our method using maximum fold change and principal component analyses. The results of this study showed that the ranked treatment-frequency fold change results produce fewer false positives than the original methodology, and the 26-hour timepoint in our dataset was the best statistic for distinguishing the most known cold-response genes. In addition, six novel cold-response genes were discovered. PRIISM also provides gene expression data which represents only circadian clock influences, and may be useful for circadian clock studies
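
    The project, split and reconstruct idea described in this abstract can be sketched with a plain discrete Fourier transform. This is a toy illustration, not the PRIISM implementation. The clock band (periods of 20 to 28 hours), the noise cutoff (periods of 8 hours or shorter), the rule that every remaining frequency counts as treatment, and the function names are all assumptions made here; the paper derives its clock range from core clock genes. A naive O(n²) transform keeps the sketch dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each reconstructed point."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def split_bands(series, dt_hours, clock_band=(1 / 28, 1 / 20), noise_cutoff=1 / 8):
    """Split one gene's time series into treatment, clock and noise components.

    Frequencies are in cycles per hour. Mirrored bins k and n - k get the same
    label, so every reconstructed component is a real-valued series.
    """
    n = len(series)
    spectrum = dft(series)
    parts = {name: [0j] * n for name in ("treatment", "clock", "noise")}
    for k, coeff in enumerate(spectrum):
        freq = min(k, n - k) / (n * dt_hours)
        if clock_band[0] <= freq <= clock_band[1]:
            parts["clock"][k] = coeff
        elif freq >= noise_cutoff:
            parts["noise"][k] = coeff
        else:
            parts["treatment"][k] = coeff
    return {name: idft(spec) for name, spec in parts.items()}
```

    Because every frequency bin is assigned to exactly one component, the three reconstructed series always sum back to the original expression profile. A slow trend is not perfectly periodic over the sampling window, so some of it leaks into the clock band; with real data the choice of band edges matters.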

  8. Alteration of gene conversion patterns in Sordaria fimicola by supplementation with DNA bases.

    Science.gov (United States)

    Kitani, Y; Olive, L S

    1970-08-01

    Supplementation with DNA bases in crosses of Sordaria fimicola heterozygous for spore color markers (g(1), h(2)) within the gray-spore (g) locus has been found to cause significant alterations in patterns of gene conversion at the two mutant sites. Each base had its own characteristic effect in altering the conversion pattern, and responses of the two mutant sites to the four bases were different in several ways. Also, the responses of the two involved chromatids of the meiotic bivalent were different.

  9. Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

    Science.gov (United States)

    Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.

    2012-01-01

    Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associated with the utility of MSC-based therapy such as biosafety, immunoprivilege, transfection methods, and distribution in the host. PMID:22530882

  10. The Smn-independent beneficial effects of trichostatin A on an intermediate mouse model of spinal muscular atrophy.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    Full Text Available Spinal muscular atrophy is an autosomal recessive neuromuscular disease characterized by the progressive loss of alpha motor neurons in the spinal cord. Trichostatin A (TSA) is a histone deacetylase inhibitor with beneficial effects in spinal muscular atrophy mouse models that carry the human SMN2 transgene. It is currently unclear whether TSA specifically targets the SMN2 gene or whether other genes respond to TSA and in turn provide neuroprotection in SMA mice. We have taken advantage of the Smn2B/- mouse model that does not harbor the human SMN2 transgene, to test the hypothesis that TSA has its beneficial effects through a non-SMN mediated pathway. TSA increased the median lifespan of Smn2B/- mice from twenty days to eight weeks. As well, there was a significant attenuation of weight loss and improved motor behavior. Pen test and righting reflex both showed significant improvement, and motor neurons in the spinal cord of Smn2B/- mice were protected from degeneration. Both the size and maturity of neuromuscular junctions were significantly improved in TSA treated Smn2B/- mice. Of interest, TSA treatment did not increase the levels of Smn protein in mouse embryonic fibroblasts or myoblasts obtained from the Smn2B/- mice. In addition, no change in the level of Smn transcripts or protein in the brain or spinal cord of TSA-treated SMA model mice was observed. Furthermore, TSA did not increase Smn protein levels in the hind limb muscle, heart, or liver of Smn2B/- mice. We therefore conclude that TSA likely exerts its effects independent of the endogenous mouse Smn gene. As such, identification of the pathways regulated by TSA in the Smn2B/- mice could lead to the development of novel therapeutics for treating SMA.

  11. Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

    Science.gov (United States)

    Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

    2018-06-01

    Circadian genes are expressed periodically with a period of approximately 24 h, and the identification and study of these genes can provide a deep understanding of the circadian control, which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Compared with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological
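
    The abstract above mentions transforming time-course expression data into categorical-state data that denote the changing trend of expression. The record does not give the encoding used, so the following is only a minimal sketch of one plausible encoding (rising, falling, flat between consecutive time points). The function name `to_trend_states` and the tolerance `tol` are hypothetical and are not taken from the paper.

```python
import numpy as np

def to_trend_states(expression, tol=0.1):
    """Encode a time-course profile as trend states between consecutive points.

    +1 = rising, -1 = falling, 0 = flat (absolute change within `tol`).
    Both the three-state scheme and `tol` are assumptions for illustration.
    """
    diffs = np.diff(np.asarray(expression, dtype=float))
    states = np.zeros(diffs.shape, dtype=int)
    states[diffs > tol] = 1
    states[diffs < -tol] = -1
    return states

# Toy profile with six time points gives five trend states.
profile = [1.0, 1.5, 2.0, 2.05, 1.2, 0.8]
print(to_trend_states(profile).tolist())
```

    State vectors of this kind could then be clustered or fed to a classifier; which downstream model is appropriate depends on the data and is not implied by this sketch.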

  12. Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

    Science.gov (United States)

    Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

    2014-12-01

    Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).

  13. Monoterpenoid-based preparations in beehives affect learning, memory, and gene expression in the bee brain.

    Science.gov (United States)

    Bonnafé, Elsa; Alayrangues, Julie; Hotier, Lucie; Massou, Isabelle; Renom, Allan; Souesme, Guillaume; Marty, Pierre; Allaoua, Marion; Treilhou, Michel; Armengaud, Catherine

    2017-02-01

    Bees are exposed in their environment to contaminants that can weaken the colony and contribute to bee declines. Monoterpenoid-based preparations can be introduced into hives to control the parasitic mite Varroa destructor. The long-term effects of monoterpenoids are poorly investigated. Olfactory conditioning of the proboscis extension reflex (PER) has been used to evaluate the impact of stressors on cognitive functions of the honeybee such as learning and memory. The authors tested the PER to odorants on bees after exposure to monoterpenoids in hives. Octopamine receptors, transient receptor potential-like (TRPL), and γ-aminobutyric acid channels are thought to play a critical role in the memory of food experience. Gene expression levels of Amoa1, Rdl, and trpl were evaluated in parallel in the bee brain because these genes code for the cellular targets of monoterpenoids and some pesticides and neural circuits of memory require their expression. The miticide impaired the PER to odors in the 3 wk following treatment. Short-term and long-term olfactory memories were improved months after introduction of the monoterpenoids into the beehives. Chronic exposure to the miticide had significant effects on Amoa1, Rdl, and trpl gene expressions and modified seasonal changes in the expression of these genes in the brain. The decrease of expression of these genes in winter could partly explain the improvement of memory. The present study has led to new insights into alternative treatments, especially on their effects on memory and expression of selected genes involved in this cognitive function. Environ Toxicol Chem 2017;36:337-345. © 2016 SETAC.

  14. Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA.

    Science.gov (United States)

    Yin, Li; Cai, Zhihui; Zhu, Baoan; Xu, Cunshuan

    2018-02-14

    Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA), which has been proved to be an effective method in many diseases, to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules were selected to perform canonical pathway enrichment and upstream regulator analysis by using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules were enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel five high degree hub genes signature that was identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.
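
    For readers unfamiliar with WGCNA, the core of a weighted co-expression network is an adjacency matrix obtained by raising absolute gene-gene correlations to a soft-thresholding power. The sketch below illustrates only that step with NumPy; it is not the authors' pipeline, the function names are hypothetical, and the power `beta=6` is a commonly used default assumed here rather than a value reported in the abstract.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor(i, j)|**beta for a samples x genes matrix.

    `beta` is an assumed soft-thresholding power; in practice it is chosen
    from the data (for example by a scale-free topology criterion).
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    adjacency = np.abs(corr) ** beta
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections
    return adjacency

def connectivity(adjacency):
    """Sum of connection weights per gene (higher = more hub-like)."""
    return adjacency.sum(axis=0)

# Toy data: 4 samples x 3 genes; gene 1 is exactly twice gene 0.
expr = [[1, 2, 5],
        [2, 4, 1],
        [3, 6, 4],
        [4, 8, 2]]
adj = soft_threshold_adjacency(expr)
print(np.round(adj, 4))
print(np.round(connectivity(adj), 4))
```

    Raising correlations to a power suppresses weak correlations while keeping strong ones close to 1, which is why the perfectly correlated pair dominates the toy network. Module detection and the PPI-based hub gene selection described in the abstract are separate steps not shown here.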

  15. Taxonomic resolutions based on 18S rRNA genes: a case study of subclass copepoda.

    Directory of Open Access Journals (Sweden)

    Shu Wu

    Full Text Available Biodiversity studies are commonly conducted using 18S rRNA genes. In this study, we compared the inter-species divergence of variable regions (V1-9) within the copepod 18S rRNA gene, and tested their taxonomic resolutions at different taxonomic levels. Our results indicate that the 18S rRNA gene is a good molecular marker for the study of copepod biodiversity, and our conclusions are as follows: (1) 18S rRNA genes are highly conserved intra-species (intra-species similarities are close to 100%) and could aid in species-level analyses, but with some limitations; (2) nearly-whole-length sequences and some partial regions (around V2, V4, and V9) of the 18S rRNA gene can be used to discriminate between samples at both the family and order levels (with a success rate of about 80%); (3) compared with other regions, V9 has a higher resolution at the genus level (with an identification success rate of about 80%); and (4) V7 is most divergent in length, and would be a good candidate marker for the phylogenetic study of Acartia species. This study also evaluated the correlation between similarity thresholds and the accuracy of using nuclear 18S rRNA genes for the classification of organisms in the subclass Copepoda. We suggest that sample identification accuracy should be considered when a molecular sequence divergence threshold is used for taxonomic identification, and that the lowest similarity threshold should be determined based on a pre-designated level of acceptable accuracy.
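
    The abstract above discusses assigning samples to taxa with a sequence similarity threshold. As a rough illustration of what such a threshold test involves, the sketch below computes percent identity between two already-aligned sequences and compares it with a cutoff. The function names and the 97% default are hypothetical choices for illustration; the study's point is precisely that the cutoff should be derived from a pre-designated acceptable accuracy rather than fixed in advance.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned, equal-length sequences.

    Columns with a gap ('-') in either sequence are skipped; this gap
    handling is an assumption, and other conventions exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

def same_taxon(seq_a, seq_b, threshold=97.0):
    """True if identity meets the (assumed) similarity threshold."""
    return percent_identity(seq_a, seq_b) >= threshold

print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(same_taxon("ACGTACGTAC", "ACGTACGTAA", threshold=90.0))
```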

  16. Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

    International Nuclear Information System (INIS)

    Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

    2010-01-01

    Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring

  17. DHPLC-based mutation analysis of ENG and ALK-1 genes in HHT Italian population.

    Science.gov (United States)

    Lenato, Gennaro M; Lastella, Patrizia; Di Giacomo, Marilena C; Resta, Nicoletta; Suppressa, Patrizia; Pasculli, Giovanna; Sabbà, Carlo; Guanti, Ginevra

    2006-02-01

    Hereditary haemorrhagic telangiectasia (HHT or Rendu-Osler-Weber syndrome) is an autosomal dominant disorder characterized by localized angiodysplasia due to mutations in endoglin, ALK-1 gene, and a still unidentified locus. The lack of highly recurrent mutations, locus heterogeneity, and the presence of mutations in almost all coding exons of the two genes make the screening for mutations time-consuming and costly. In the present study, we developed a DHPLC-based protocol for mutation detection in ALK1 and ENG genes through retrospective analysis of known sequence variants, 20 causative mutations and 11 polymorphisms, and a prospective analysis on 47 probands with unknown mutation. Overall DHPLC analysis identified the causative mutation in 61 out of 66 DNA samples (92.4%). We found 31 different mutations in the ALK1 gene, of which 15 are novel, and 20, of which 12 are novel, in the ENG gene, thus providing for the first time the mutational spectrum in a cohort of Italian HHT patients. In addition, we characterized the splicing pattern of ALK1 gene in lymphoblastoid cells, both in normal controls and in two individuals carrying a mutation in the non-invariant -3 position of the acceptor splice site upstream of exon 6 (c.626-3C>G). Functional assay demonstrated the existence, also in normal individuals, of a small proportion of ALK1 alternative splicing, due to exon 5 skipping, and the presence of further aberrant splicing isoforms in the individuals carrying the c.626-3C>G mutation. 2006 Wiley-Liss, Inc.

  18. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

    Science.gov (United States)

    Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

    2016-09-02

    Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal, in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, along with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal

  19. Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups

    Directory of Open Access Journals (Sweden)

    S. Pamela K. Shiao

    2018-02-01

    Full Text Available For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene–environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls (p < 0.05) on MTHFR C677T, MTR A2756G, MTRR A66G, and DHFR 19 bp, except MTHFR A1298C. Four racial groups presented different polymorphism rates for four genes (all p < 0.05) except MTHFR A1298C. Following the ensemble method, the most influential factors were identified, and the best predictive models were generated by using the generalized regression models, with Akaike’s information criterion and leave-one-out cross validation methods. Body mass index (BMI) and gender were consistent predictors of CRC for both models when individual genes versus total polymorphism counts were used, and alcohol use was interactive with BMI status. Body mass index status was also interactive with both gender and MTHFR C677T gene polymorphism, and the exposure to environmental pollutants was an additional predictor. These results point to the important roles of environmental and modifiable factors in relation to gene–environment interactions in the prevention of CRC.

  20. Consequences of population topology for studying gene flow using link-based landscape genetic methods.

    Science.gov (United States)

    van Strien, Maarten J

    2017-07-01

    Many landscape genetic studies aim to determine the effect of landscape on gene flow between populations. These studies frequently employ link-based methods that relate pairwise measures of historical gene flow to measures of the landscape and the geographical distance between populations. However, apart from landscape and distance, there is a third important factor that can influence historical gene flow, that is, population topology (i.e., the arrangement of populations throughout a landscape). As the population topology is determined in part by the landscape configuration, I argue that it should play a more prominent role in landscape genetics. Making use of existing literature and theoretical examples, I discuss how population topology can influence results in landscape genetic studies and how it can be taken into account to improve the accuracy of these results. In support of my arguments, I have performed a literature review of landscape genetic studies published during the first half of 2015 as well as several computer simulations of gene flow between populations. First, I argue why one should carefully consider which population pairs should be included in link-based analyses. Second, I discuss several ways in which the population topology can be incorporated in response and explanatory variables. Third, I outline why it is important to sample populations in such a way that a good representation of the population topology is obtained. Fourth, I discuss how statistical testing for link-based approaches could be influenced by the population topology. I conclude the article with six recommendations geared toward better incorporating population topology in link-based landscape genetic studies.

  1. PR Interval Associated Genes, Atrial Remodeling and Rhythm Outcome of Catheter Ablation of Atrial Fibrillation—A Gene-Based Analysis of GWAS Data

    Directory of Open Access Journals (Sweden)

    Daniela Husser

    2017-12-01

    Full Text Available Background: PR interval prolongation has recently been shown to associate with advanced left atrial remodeling and atrial fibrillation (AF) recurrence after catheter ablation. While different genome-wide association studies (GWAS) have implicated 13 loci to associate with the PR interval as an AF endophenotype, their subsequent associations with AF remodeling and response to catheter ablation are unknown. Here, we perform a gene-based analysis of GWAS data to test the hypothesis that PR interval candidate genes also associate with left atrial remodeling and arrhythmia recurrence following AF catheter ablation. Methods and Results: Samples from 660 patients with paroxysmal (n = 370) or persistent AF (n = 290) undergoing AF catheter ablation were genotyped for ~1,000,000 SNPs. Gene-based association was investigated using VEGAS (versatile gene-based association study). Among the 13 candidate genes, SLC8A1, MEIS1, ITGA9, SCN5A, and SOX5 associated with the PR interval. Of those, ITGA9 and SOX5 were significantly associated with left atrial low voltage areas and left atrial diameter and subsequently with AF recurrence after radiofrequency catheter ablation. Conclusion: This study suggests contributions of ITGA9 and SOX5 to AF remodeling expressed as PR interval prolongation, low voltage areas and left atrial dilatation and subsequently to response to catheter ablation. Future and larger studies are necessary to replicate and apply these findings with the aim of designing AF pathophysiology-based multi-locus risk scores.

  2. Gene expression change in human dental pulp cells exposed to a low-level toxic concentration of triethylene glycol dimethacrylate: an RNA-seq analysis.

    Science.gov (United States)

    Cho, Sung-Geun; Lee, Jin-Woo; Heo, Jung Sun; Kim, Sun-Young

    2014-09-01

    Dental composite resin restoration of a defective tooth may cause unpolymerized resin monomers to leach into dental pulp tissue. The aim of this study was to investigate the early gene expression change over time of human dental pulp cells (HDPCs) treated with a low-level toxic concentration of Triethylene Glycol Dimethacrylate (TEGDMA), a common dental resin monomer, by adopting the novel high-throughput transcriptome analysis of RNA-seq. The low-level toxic concentration of TEGDMA was determined through MTT assays with serially diluted concentrations. After the HDPCs were exposed to TEGDMA for 6, 12, 24 or 48 hr, the total RNA of the samples was prepared for RNA-seq. qRT-PCR for several genes was performed for validation of RNA-seq results. In the treated group, 1280 genes were differentially expressed compared with the control group. Five patterns of time-series gene expression profiles were identified through k-means clustering analysis. Angiogenesis, cell adhesion and migration, extracellular matrix organization, response to extracellular stimulus, inflammatory response and mineralization-related process were major gene ontology terms in functional annotation clustering. HMOX1, OSGIN1, SMN2, SRXN1, AKR1C1, SPP1 and TOMM40L were highly up-regulated genes, and WRAP53 and CCL2 were highly down-regulated genes over time. qRT-PCR for several genes exhibited a high level of agreement with RNA-seq. TEGDMA induced the HDPCs to show massive and dynamic gene expression changes over time. The previously suggested toxic mechanism of TEGDMA was not only verified, but new genes whose functions have yet to be determined were also found. © 2014 Nordic Association for the Publication of BCPT (former Nordic Pharmacological Society).

  3. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

    Science.gov (United States)

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-10-06

    Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for learning network structure profiles, namely the construction of the search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.

  4. PGMA-Based Cationic Nanoparticles with Polyhydric Iodine Units for Advanced Gene Vectors.

    Science.gov (United States)

    Sun, Yue; Hu, Hao; Yu, Bingran; Xu, Fu-Jian

    2016-11-16

    It is crucial for successful gene delivery to develop safe, effective, and multifunctional polycations. Iodine-based small molecules are widely used as contrast agents for CT imaging. Herein, a series of star-like poly(glycidyl methacrylate) (PGMA)-based cationic vectors (II-PGEA/II) with abundant flanking polyhydric iodine units are prepared for multifunctional gene delivery systems. The proposed II-PGEA/II star vector is composed of one iohexol intermediate (II) core and five ethanolamine (EA) and II-difunctionalized PGMA arms. The amphipathic II-PGEA/II vectors readily self-assemble into well-defined cationic nanoparticles, where massive hydroxyl groups can establish a hydration shell to stabilize the nanoparticles. The II introduction improves cell viabilities of polycations. Moreover, by controlling the suitable amount of introduced II units, the resultant II-PGEA/II nanoparticles can produce fairly good transfection performances in different cell lines. Particularly, the II-PGEA/II nanoparticles induce much better in vitro CT imaging abilities in tumor cells than iohexol (one commonly used commercial CT contrast agent). The present design of amphipathic PGMA-based nanoparticles with CT contrast agents would provide useful information for the development of new multifunctional gene delivery systems.

  5. A postprocessing method in the HMC framework for predicting gene function based on biological instrumental data

    Science.gov (United States)

    Feng, Shou; Fu, Ping; Zheng, Wenbin

    2018-03-01

    Predicting gene function based on biological instrumental data is a complicated and challenging hierarchical multi-label classification (HMC) problem. When using local approach methods to solve this problem, a preliminary results processing method is usually needed. This paper proposed a novel preliminary results processing method called the nodes interaction method. The nodes interaction method revises the preliminary results and guarantees that the predictions are consistent with the hierarchy constraint. This method exploits the label dependency and considers the hierarchical interaction between nodes when making decisions based on the Bayesian network in its first phase. In the second phase, this method further adjusts the results according to the hierarchy constraint. Implementing the nodes interaction method in the HMC framework also enhances the HMC performance for solving the gene function prediction problem based on the Gene Ontology (GO), the hierarchy of which is a directed acyclic graph that is more difficult to tackle. The experimental results validate the promising performance of the proposed method compared to state-of-the-art methods on eight benchmark yeast data sets annotated by the GO.
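
    The hierarchy constraint mentioned above means that a gene predicted to have a GO term must also be predicted to have all ancestors of that term. The sketch below shows one simple, generic way to make per-node scores consistent on a directed acyclic graph by capping each node's score at the minimum of its parents' scores. It is not the nodes interaction method proposed in the paper; the function name, the node labels and the toy scores are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def enforce_hierarchy(scores, parents):
    """Cap each node's score at the minimum adjusted score of its parents.

    scores:  dict mapping node -> preliminary score in [0, 1]
    parents: dict mapping node -> iterable of parent nodes (DAG; a node
             may have several parents, as in the Gene Ontology)
    Every parent is assumed to have an entry in `scores`.
    """
    graph = {node: parents.get(node, ()) for node in scores}
    adjusted = dict(scores)
    # static_order() yields parents before their children.
    for node in TopologicalSorter(graph).static_order():
        for parent in parents.get(node, ()):
            adjusted[node] = min(adjusted[node], adjusted[parent])
    return adjusted

# Toy DAG: C has two parents, A and B.
scores = {"root": 0.9, "A": 0.6, "B": 0.8, "C": 0.7}
parents = {"A": ["root"], "B": ["root"], "C": ["A", "B"]}
print(enforce_hierarchy(scores, parents))
```

    After this step, thresholding the adjusted scores at any cutoff yields a label set that is closed under ancestors, which is the consistency property the hierarchy constraint requires.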

  6. A cell-based in vitro alternative to identify skin sensitizers by gene expression

    International Nuclear Information System (INIS)

    Hooyberghs, Jef; Schoeters, Elke; Lambrechts, Nathalie; Nelissen, Inge; Witters, Hilda; Schoeters, Greet; Heuvel, Rosette van den

    2008-01-01

    The ethical and economic burden associated with animal testing for assessment of skin sensitization has triggered intensive research effort towards development and validation of alternative methods. In addition, new legislation on the registration and use of cosmetics and chemicals promotes the use of suitable alternatives for hazard assessment. Our previous studies demonstrated that human CD34+ progenitor-derived dendritic cells from cord blood express specific gene profiles upon exposure to low molecular weight sensitizing chemicals. This paper presents a classification model based on this cell type which is successful in discriminating sensitizing chemicals from non-sensitizing chemicals based on transcriptome analysis of 13 genes. Expression profiles of a set of 10 sensitizers and 11 non-sensitizers were analyzed by RT-PCR using 9 different exposure conditions and a total of 73 donor samples. Based on these data a predictive dichotomous classifier for skin sensitizers has been constructed, which is referred to as . In a first step the dimensionality of the input data was reduced by selectively rejecting a number of exposure conditions and genes. Next, the generalization of a linear classifier was evaluated by a cross-validation which resulted in a prediction performance with a concordance of 89%, a specificity of 97% and a sensitivity of 82%. These results show that the present model may be a useful human in vitro alternative for further use in a test strategy towards the reduction of animal use for skin sensitization
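
    The performance figures quoted above (concordance, specificity, sensitivity) are standard quantities derived from a confusion matrix. As a reminder of how they relate, the sketch below computes them from counts of true and false positives and negatives. The counts in the example are hypothetical and are not the study's data; the function name is likewise illustrative.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and concordance from confusion-matrix counts.

    tp/fn: sensitizers classified correctly / incorrectly
    tn/fp: non-sensitizers classified correctly / incorrectly
    """
    return {
        "sensitivity": tp / (tp + fn),                    # true positive rate
        "specificity": tn / (tn + fp),                    # true negative rate
        "concordance": (tp + tn) / (tp + fn + tn + fp),   # overall accuracy
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
print(classifier_metrics(tp=8, fn=2, tn=9, fp=1))
```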

  7. Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Pugalendhi Ganesh Kumar

    Full Text Available This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, in this study, we introduced two levels of gene selection such as filtering and embedding for selection of potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure is implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA showed the smallest number of 16 most relevant genes associated with cancer using a minimal number of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively

  8. RNA-based, transient modulation of gene expression in human haematopoietic stem and progenitor cells

    Science.gov (United States)

    Diener, Yvonne; Jurk, Marion; Kandil, Britta; Choi, Yeong-Hoon; Wild, Stefan; Bissels, Ute; Bosio, Andreas

    2015-01-01

    Modulation of gene expression is a useful tool to study the biology of haematopoietic stem and progenitor cells (HSPCs) and might also be instrumental to expand these cells for therapeutic approaches. Most of the studies so far have employed stable gene modification by viral vectors that are burdensome when translating protocols into clinical settings. Our study aimed at exploring new ways to transiently modify HSPC gene expression using non-integrating, RNA-based molecules. First, we tested different methods to deliver these molecules into HSPCs. The delivery of siRNAs with chemical transfection methods such as lipofection or cationic polymers did not lead to target knockdown, although we observed more than 90% fluorescent cells using a fluorochrome-coupled siRNA. Confocal microscopic analysis revealed that despite extensive washing, siRNA stuck to or in the cell surface, thereby mimicking a transfection event. In contrast, electroporation resulted in efficient, siRNA-mediated protein knockdown. For transient overexpression of proteins, we used optimised mRNA molecules with modified 5′- and 3′-UTRs. Electroporation of mRNA encoding GFP resulted in fast, efficient and persistent protein expression for at least seven days. Our data provide a broad-ranging comparison of transfection methods for hard-to-transfect cells and offer new opportunities for DNA-free, non-integrating gene modulation in HSPCs. PMID:26599627

  9. A family-based association study of the HTR1B gene in eating disorders

    Directory of Open Access Journals (Sweden)

    Sandra Hernández

    Full Text Available Objective: To explore the association of three polymorphisms of the serotonin receptor 1Dβ gene (HTR1B) in the etiology of eating disorders and their relationship with clinical characteristics. Methods: We analyzed the G861C, A-161T, and A1180G polymorphisms of the HTR1B gene through a family-based association test (FBAT) in 245 nuclear families. The sample was stratified into anorexia nervosa (AN) spectrum and bulimia nervosa (BN) spectrum. In addition, we performed a quantitative FBAT analysis of anxiety severity, depression severity, and Yale-Brown-Cornell Eating Disorders Scale (YBC-EDS) in the AN and BN-spectrum groups. Results: FBAT analysis of the A-161T polymorphism found preferential transmission of allele A-161 in the overall sample. This association was stronger when the sample was stratified by spectrums, showing transmission disequilibrium between the A-161 allele and BN spectrum (z = 2.871, p = 0.004). Quantitative trait analysis showed an association between severity of anxiety symptoms and the C861 allele in AN-spectrum participants (z = 2.871, p = 0.004). We found no associations on analysis of depression severity or preoccupation and ritual scores in AN or BN-spectrum participants. Conclusions: Our preliminary findings suggest a role of the HTR1B gene in susceptibility to development of BN subtypes. Furthermore, this gene might have an impact on the severity of anxiety in AN-spectrum patients.
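
    The abstract does not describe how the FBAT statistic is computed. As a rough illustration of what "transmission disequilibrium" measures, the sketch below uses the classical transmission disequilibrium test (TDT), a simpler relative of FBAT and not the authors' software. The function names and the example counts are hypothetical.

```python
import math

def tdt_z(transmitted: int, untransmitted: int) -> float:
    """Classical TDT z statistic for one allele.

    Counts refer to heterozygous parents only: how often the allele was
    passed to an affected child versus how often it was not.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) / math.sqrt(total)

def two_sided_p(z: float) -> float:
    """Two-sided p-value of z under a standard normal null distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical counts, for illustration only.
z = tdt_z(transmitted=60, untransmitted=40)
print(round(z, 3), round(two_sided_p(z), 4))
```

    Under the normal approximation, a z value of 2.871 corresponds to a two-sided p of about 0.004, which agrees with the z and p pair reported in the abstract.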

  10. An Intelligent Method of Product Scheme Design Based on Product Gene

    Directory of Open Access Journals (Sweden)

    Qing Song Ai

    2013-01-01

    Full Text Available Nowadays, in order to have some featured products, many customers tend to buy customized products instead of buying common ones in supermarkets. Manufacturing enterprises, with the purpose of improving their competitiveness, are likewise focusing on providing customized products with high quality and low cost. At present, how to produce customized products rapidly and cheaply is the key challenge for manufacturing enterprises. In this paper, an intelligent modeling approach for supporting the modeling of customized products is proposed, which may improve efficiency during the product design process. Specifically, the product gene (PG) method, which is an analogy of biological evolution in the engineering area, is employed to model products in a new way. Based on the product gene, we focus on an intelligent modeling method that generates product schemes rapidly and automatically. The process of our research includes three steps: (1) develop a product gene model for customized products; (2) find the obtainment and storage method for the product gene; and (3) propose a specific genetic algorithm used for calculating the solution of a customized product and generating new product schemes. Finally, a case study is applied to test the usefulness of our study.

  11. Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

    Science.gov (United States)

    Burt, Andrew J; William, H Manilal; Perry, Gregory; Khanal, Raja; Pauls, K Peter; Kelly, James D; Navabi, Alireza

    2015-01-01

    Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.

  12. Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles.

    Science.gov (United States)

    Yu, Yun; Warnow, Tandy; Nakhleh, Luay

    2011-11-01

    One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is Minimize Deep Coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily rooted. In this article, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. In addition, we devise MDC-based algorithms for cases when multiple alleles per species may be sampled. We study the performance of these methods in coalescent-based computer simulations.

  13. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Abstract Background Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~ 1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.
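
    The abstract does not say which tool or thresholds were used to mine microsatellites. A minimal sketch of the general idea, scanning a sequence for short motifs repeated in tandem, could look like the following. The function name, the motif length range and the minimum repeat count are all assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    A hit is a motif of min_unit..max_unit bases that occurs at least
    min_repeats times back to back. Thresholds are illustrative only.
    """
    # Lazy quantifier: try the shortest motif first, then require the same
    # motif (backreference \2) at least min_repeats - 1 more times.
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(2)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(1)) // len(motif)))
    return hits

print(find_ssrs("TTAGAGAGAGAGCC"))
```

    For the toy sequence above, the scan reports one dinucleotide repeat, the motif AG repeated five times starting at position 2 (0-based).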

  14. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

    Science.gov (United States)

    Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

    2006-11-01

    To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
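
    The 10-times-repeated 10-fold cross validation mentioned above can be sketched as an index generator. This is a generic illustration, not the authors' pipeline: the function name and the seed are hypothetical, and the classifier itself is left out.

```python
import random

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) tuples.

    Each repeat reshuffles the samples and splits them into k folds;
    every sample is used for testing exactly once per repeat.
    """
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k nearly equal folds
        for f in range(k):
            test = folds[f]
            train = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train, test

# 77 samples, as in the first patient set described above.
splits = list(repeated_kfold(77))
print(len(splits))
```

    To keep the accuracy estimate honest, any gene selection would normally be redone on the training indices of each split rather than once on the full set; whether this is what "complete" means in the abstract is an assumption.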

  15. The Arabidopsis co-expression tool (act): a WWW-based tool and database for microarray-based gene expression analysis

    DEFF Research Database (Denmark)

    Jen, C. H.; Manfield, I. W.; Michalopoulos, D. W.

    2006-01-01

    We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (act), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression... be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots...
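
    A correlation list of co-expressed genes can be illustrated with a few lines of code. The sketch below ranks genes by Pearson correlation with a query gene. The choice of Pearson correlation, the function names and the toy expression values are assumptions; the record does not specify how the tool computes its lists.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

def coexpression_list(query, expression):
    """Rank all other genes by correlation with the query gene."""
    ranked = [
        (gene, pearson(expression[query], values))
        for gene, values in expression.items()
        if gene != query
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical expression values across four arrays.
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 3, 2, 4],
}
for gene, r in coexpression_list("geneA", expression):
    print(gene, round(r, 2))
```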

  16. Inducible, tunable and multiplex human gene regulation using CRISPR-Cpf1-based transcription factors | Office of Cancer Genomics

    Science.gov (United States)

    Targeted and inducible regulation of mammalian gene expression is a broadly important research capability that may also enable development of novel therapeutics for treating human diseases. Here we demonstrate that a catalytically inactive RNA-guided CRISPR-Cpf1 nuclease fused to transcriptional activation domains can up-regulate endogenous human gene expression. We engineered drug-inducible Cpf1-based activators and show how this system can be used to tune the regulation of endogenous gene transcription in human cells.

  17. Identifying novel fruit-related genes in Arabidopsis thaliana based on the random walk with restart algorithm.

    Science.gov (United States)

    Zhang, Yunhua; Dai, Li; Liu, Ying; Zhang, YuHang; Wang, ShaoPeng

    2017-01-01

    Fruit is essential for plant reproduction and is responsible for protection and dispersal of seeds. The development and maturation of fruit is tightly regulated by numerous genetic factors that respond to environmental and internal stimulation. In this study, we attempted to identify novel fruit-related genes in a model organism, Arabidopsis thaliana, using a computational method. Based on validated fruit-related genes, the random walk with restart (RWR) algorithm was applied on a protein-protein interaction (PPI) network using these genes as seeds. The identified genes with high probabilities were filtered by the permutation test and linkage tests. In the permutation test, the genes that were selected due to the structure of the PPI network were discarded. In the linkage tests, the importance of each candidate gene was measured from two aspects: (1) its functional associations with validated genes and (2) its similarity with validated genes on gene ontology (GO) terms and KEGG pathways. Finally, 255 inferred genes were obtained. Subsequent extensive analysis of important genes revealed that they mainly contribute to ubiquitination (UBQ9, UBQ8, UBQ11, UBQ10), serine hydroxymethyl transfer (SHM7, SHM5, SHM6) or glycol-metabolism (HXKL2_ARATH, CSY5, GAPCP1), suggesting essential roles during the development and maturation of fruit in Arabidopsis thaliana.
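
    The random walk with restart step can be sketched on a toy network. The code below is a generic RWR iteration, assuming an undirected network stored as an adjacency dictionary in which every node has at least one neighbour. The restart probability of 0.8 and all names are illustrative choices, since the abstract gives no parameter values.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probability of every node.

    At each step the walker returns to a seed with probability `restart`,
    otherwise it moves to a uniformly chosen neighbour.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy path network A - B - C with A as the only seed (hypothetical).
network = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = random_walk_with_restart(network, seeds={"A"})
print({n: round(s, 4) for n, s in scores.items()})
```

    Nodes closer to the seeds receive higher scores, which is why high-probability non-seed genes are treated as candidates before the permutation and linkage filters are applied.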

  18. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  19. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  20. Methodological issues in detecting gene-gene interactions in breast cancer susceptibility: a population-based study in Ontario

    Directory of Open Access Journals (Sweden)

    Onay Venus

    2007-08-01

    Full Text Available Abstract Background There is growing evidence that gene-gene interactions are ubiquitous in determining the susceptibility to common human diseases. The investigation of such gene-gene interactions presents new statistical challenges for studies with relatively small sample sizes as the number of potential interactions in the genome can be large. Breast cancer provides a useful paradigm to study genetically complex diseases because commonly occurring single nucleotide polymorphisms (SNPs) may additively or synergistically disturb the system-wide communication of the cellular processes leading to cancer development. Methods In this study, we systematically studied SNP-SNP interactions among 19 SNPs from 18 key genes involved in major cancer pathways in a sample of 398 breast cancer cases and 372 controls from Ontario. We discuss the methodological issues associated with the detection of SNP-SNP interactions in this dataset by applying and comparing three commonly used methods: the logistic regression model, classification and regression trees (CART), and the multifactor dimensionality reduction (MDR) method. Results Our analyses show evidence for several simple (two-way) and complex (multi-way) SNP-SNP interactions associated with breast cancer. For example, all three methods identified XPD-[Lys751Gln]*IL10-[G(-1082)A] as the most significant two-way interaction. CART and MDR identified the same critical SNPs participating in complex interactions. Our results suggest that the use of multiple statistical approaches (or an integrated approach) rather than a single methodology could be the best strategy to elucidate complex gene interactions that have generally very different patterns. Conclusion The strategy used here has the potential to identify complex biological relationships among breast cancer genes and processes. This will lead to the discovery of novel biological information, which will improve breast cancer risk management.
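
    Of the three methods compared, MDR is the least familiar, so a stripped-down sketch of its core step for one SNP pair may help. The code pools two-locus genotype cells into a high-risk group when their case-to-control ratio exceeds the overall ratio, then scores the resulting classifier. It omits the cross-validation and the search over SNP combinations that a full MDR analysis performs, and the function name and toy data are hypothetical.

```python
from collections import Counter

def mdr_pair(genotypes_a, genotypes_b, is_case):
    """Return (high_risk_cells, training_accuracy) for one SNP pair.

    Genotypes are coded 0/1/2; is_case holds 1 for cases, 0 for controls.
    Assumes at least one control is present.
    """
    cases, controls = Counter(), Counter()
    for a, b, y in zip(genotypes_a, genotypes_b, is_case):
        (cases if y else controls)[(a, b)] += 1
    threshold = sum(cases.values()) / sum(controls.values())
    # A cell is high risk if it holds relatively more cases than the sample.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] > threshold * controls[cell]
    }
    correct = sum(
        ((a, b) in high_risk) == bool(y)
        for a, b, y in zip(genotypes_a, genotypes_b, is_case)
    )
    return high_risk, correct / len(is_case)

# Toy data with a pure interaction: risk depends on the pair, not on either SNP alone.
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
status = [0, 1, 1, 0, 0, 1, 1, 0]
print(mdr_pair(snp_a, snp_b, status))
```

    In the toy data neither SNP is associated with status on its own, yet the pair classifies every sample correctly, which is the kind of pattern a main-effects logistic model would miss without an interaction term.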

  1. An Individual-Based Diploid Model Predicts Limited Conditions Under Which Stochastic Gene Expression Becomes Advantageous

    KAUST Repository

    Matsumoto, Tomotaka

    2015-11-24

    Recent studies suggest the existence of stochasticity in gene expression (SGE) in many organisms, and its non-negligible effect on their phenotype and fitness. To date, however, how SGE affects the key parameters of population genetics is not well understood. SGE can increase the phenotypic variation and act as a load for individuals, if they are at the adaptive optimum in a stable environment. On the other hand, part of the phenotypic variation caused by SGE might become advantageous if individuals at the adaptive optimum become genetically less-adaptive, for example due to an environmental change. Furthermore, SGE of unimportant genes might have little or no fitness consequences. Thus, SGE can be advantageous, disadvantageous, or selectively neutral depending on its context. In addition, there might be a genetic basis that regulates the magnitude of SGE, which is often referred to as “modifier genes,” but little is known about the conditions under which such an SGE-modifier gene evolves. In the present study, we conducted individual-based computer simulations to examine these conditions in a diploid model. In the simulations, we considered a single locus that determines organismal fitness for simplicity, and that SGE on the locus creates fitness variation in a stochastic manner. We also considered another locus that modifies the magnitude of SGE. Our results suggested that SGE was always deleterious in stable environments and increased the fixation probability of deleterious mutations in this model. Even under frequently changing environmental conditions, only very strong natural selection made SGE adaptive. These results suggest that the evolution of SGE-modifier genes requires strict balance among the strength of natural selection, magnitude of SGE, and frequency of environmental changes. However, the degree of dominance affected the condition under which SGE becomes advantageous, indicating a better opportunity for the evolution of SGE in different genetic

  2. A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli

    Directory of Open Access Journals (Sweden)

    Li Mingji

    2012-02-01

    Full Text Available Abstract Background For metabolic engineering, many rate-limiting steps may exist in the pathways of accumulating the target metabolites. Increasing the copy number of the desired genes in these pathways is a general method to solve the problem, for example, the employment of the multi-copy plasmid-based expression system. However, this method may bring genetic instability, structural instability and metabolic burden to the host, while integration of the desired gene into the chromosome may cause inadequate transcription or expression. In this study, we developed a strategy for obtaining gene overexpression by engineering promoter clusters consisting of multiple core-tac-promoters (MCPtacs) in tandem. Results Through a uniquely designed in vitro assembling process, a series of promoter clusters were constructed. The transcription strength of these promoter clusters showed a stepwise enhancement with the increase in the number of tandem repeats until it reached the critical value of five. Application of the MCPtacs promoter clusters in polyhydroxybutyrate (PHB) production proved that it was efficient. Integration of the phaCAB genes with the 5CPtacs promoter cluster resulted in an engineered E. coli that can accumulate 23.7% PHB of the cell dry weight in batch cultivation. Conclusions The transcription strength of the MCPtacs promoter cluster can be greatly improved by increasing the number of tandem repeats of the core-tac-promoter. By integrating the desired gene together with the MCPtacs promoter cluster into the chromosome of E. coli, we can achieve high and stable overexpression with only a small size. This strategy has an application potential in many fields and can be extended to other bacteria.

  3. Using FlyBase, a Database of Drosophila Genes and Genomes.

    Science.gov (United States)

    Marygold, Steven J; Crosby, Madeline A; Goodman, Joshua L

    2016-01-01

    For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic, and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets, or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization, and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries.

  4. Transgenic Sugarcane Resistant to Sorghum mosaic virus Based on Coat Protein Gene Silencing by RNA Interference

    Directory of Open Access Journals (Sweden)

    Jinlong Guo

    2015-01-01

    Full Text Available As one of the critical diseases of sugarcane, sugarcane mosaic disease can lead to serious decline in stalk yield and sucrose content. It is mainly caused by Potyvirus sugarcane mosaic virus (SCMV) and/or Sorghum mosaic virus (SrMV), with additional differences in viral strains. RNA interference (RNAi) is a novel strategy for producing viral resistant plants. In this study, based on multiple sequence alignment conducted on genomic sequences of different strains and isolates of SrMV, the conserved region of coat protein (CP) genes was selected as the target gene and an interference sequence 423 bp in length was obtained through PCR amplification. The RNAi vector pGII00-HACP with an expression cassette containing both the hairpin interference sequence and the cp4-epsps herbicide-tolerant gene was transferred to sugarcane cultivar ROC22 via Agrobacterium-mediated transformation. After herbicide screening, PCR molecular identification, and artificial inoculation challenge, anti-SrMV positive transgenic lines were successfully obtained. The SrMV resistance rate of the transgenic lines with the interference sequence was 87.5% based on SrMV challenge by artificial inoculation. The genetically modified SrMV-resistant lines of cultivar ROC22 provide resistant germplasm for breeding lines and can also serve as resistant lines having the same genetic background for study of resistance mechanisms.

  5. Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge

    Directory of Open Access Journals (Sweden)

    Wang Shu-Qiang

    2012-07-01

    Full Text Available Abstract Background A key challenge in the post genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse grained qualitative models, and cannot provide a quantitative view of regulatory systems. Results A binding affinity based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of the transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that only employ microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. Conclusions We test the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of the TF. Moreover, the kinetic parameters introduced in the proposed model are more biologically meaningful than those of previous models.

  6. Treatment planning of electroporation-based medical interventions: electrochemotherapy, gene electrotransfer and irreversible electroporation

    International Nuclear Information System (INIS)

    Zupanic, Anze; Kos, Bor; Miklavcic, Damijan

    2012-01-01

    In recent years, cancer electrochemotherapy (ECT), gene electrotransfer for gene therapy and DNA vaccination (GET) and tissue ablation with irreversible electroporation (IRE) have all entered clinical practice. We present a method for a personalized treatment planning procedure for ECT, GET and IRE, based on medical image analysis, numerical modelling of electroporation and optimization with the genetic algorithm, and several visualization tools for treatment plan assessment. Each treatment plan provides the attending physician with optimal positions of electrodes in the body and electric pulse parameters for optimal electroporation of the target tissues. For the studied case of a deep-seated tumour, the optimal treatment plans for ECT and IRE require at least two electrodes to be inserted into the target tissue, thus lowering the necessary voltage for electroporation and limiting damage to the surrounding healthy tissue. In GET, it is necessary to place the electrodes outside the target tissue to prevent damage to target cells intended to express the transfected genes. The presented treatment planning procedure is a valuable tool for clinical and experimental use and evaluation of electroporation-based treatments. (paper)

  7. Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

    Directory of Open Access Journals (Sweden)

    Bülent Haznedar

    2017-02-01

    Full Text Available Classification is an important data mining technique used in many fields, notably medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased markedly in recent years. However, because such data contain a very large number of genes and the relations among them are mostly nonlinear, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

  8. Potential mechanisms for cell-based gene therapy to treat HIV/AIDS.

    Science.gov (United States)

    Herrera-Carrillo, Elena; Berkhout, Ben

    2015-02-01

    An estimated 35 million people are infected with HIV worldwide. Anti-retroviral therapy (ART) has reduced the morbidity and mortality of HIV-infected patients but efficacy requires strict adherence and the treatment is not curative. Most importantly, the emergence of drug-resistant virus strains and drug toxicity can restrict the long-term therapeutic efficacy in some patients. Therefore, novel treatment strategies that permanently control or eliminate the virus and restore the damaged immune system are required. Gene therapy against HIV infection has been the topic of intense investigations for the last two decades because it can theoretically provide such a durable anti-HIV control. In this review we discuss two major gene therapy strategies to combat HIV. One approach aims to kill HIV-infected cells and the other is based on the protection of cells from HIV infection. We discuss the underlying molecular mechanisms for candidate approaches to permanently block HIV infection, including the latest strategies and future therapeutic applications. Hematopoietic stem cell-based gene therapy for HIV/AIDS may eventually become an alternative for standard ART and should ideally provide a functional cure in which the virus is durably controlled without medication. Recent results from preclinical research and early-stage clinical trials support the feasibility and safety of this novel strategy.

  9. Zn(II)-dipicolylamine-based metallo-lipids as novel non-viral gene vectors.

    Science.gov (United States)

    Su, Rong-Chuan; Liu, Qiang; Yi, Wen-Jing; Zhao, Zhi-Gang

    2017-08-01

    In this study, a series of Zn(II)-dipicolylamine (Zn-DPA) based cationic lipids bearing different hydrophobic tails (long chains, α-tocopherol, cholesterol or diosgenin) were synthesized. The structure-activity relationship (SAR) of these lipids was studied in detail by investigating the effects of several structural aspects, including the type of hydrophobic tail, the chain length and the degree of saturation. In addition, several assays were used to study their interactions with plasmid DNA, and the results reveal that these lipids could condense DNA into nanosized particles with appropriate size and zeta-potentials. MTT-based cell viability assays showed that lipoplexes 5 had low cytotoxicity. The in vitro gene transfection studies showed that the hydrophobic tails clearly affected the transfection efficiency (TE), and hexadecanol-containing lipid 5b gave the best TE, which was 2.2 times higher than that of bPEI 25k in the presence of 10% serum. The results not only demonstrate that these lipids might be promising non-viral gene vectors, but also afford clues for further optimization of lipidic gene delivery materials.

  10. Preparation and Characterization of Gelatin-Based Mucoadhesive Nanocomposites as Intravesical Gene Delivery Scaffolds

    Directory of Open Access Journals (Sweden)

    Ching-Wen Liu

    2014-01-01

    Full Text Available This study aimed to develop optimal gelatin-based mucoadhesive nanocomposites as scaffolds for intravesical gene delivery to the urothelium. Hydrogels were prepared by chemically crosslinking gelatin A or B with glutaraldehyde. Physicochemical and delivery properties including hydration ratio, viscosity, size, yield, thermosensitivity, and enzymatic degradation were studied, and scanning electron microscopy (SEM) was carried out. The optimal hydrogels (H), composed of 15% gelatin A175, displayed an 81.5% yield rate, 87.1% hydration ratio, 42.9 Pa·s viscosity, and 125.8 nm particle size. The crosslinking density of the hydrogels was determined by performing pronase degradation and ninhydrin assays. In vitro lentivirus (LV) release studies involving p24 capsid protein analysis in 293T cells revealed that hydrogels containing lentivirus (H-LV) had a higher cumulative release than that observed for LV alone (3.7-, 2.3-, and 2.3-fold at days 1, 3, and 5, respectively). Lentivirus from lentivector constructed green fluorescent protein (GFP) was then entrapped in hydrogels (H-LV-GFP). H-LV-GFP showed enhanced gene delivery in AY-27 cells in vitro and to rat urothelium by intravesical instillation in vivo. Cystometrogram showed mucoadhesive H-LV reduced peak micturition and threshold pressure and increased bladder compliance. In this study, we successfully developed the first optimal gelatin-based mucoadhesive nanocomposites as intravesical gene delivery scaffolds.

  11. Blood-based gene-expression predictors of PTSD risk and resilience among deployed marines: a pilot study.

    Science.gov (United States)

    Glatt, Stephen J; Tylee, Daniel S; Chandler, Sharon D; Pazol, Joel; Nievergelt, Caroline M; Woelk, Christopher H; Baker, Dewleen G; Lohr, James B; Kremen, William S; Litz, Brett T; Tsuang, Ming T

    2013-06-01

    Susceptibility to PTSD is determined by both genes and environment. Similarly, gene-expression levels in peripheral blood are influenced by both genes and environment, and expression levels of many genes show good correspondence between peripheral blood and brain. Therefore, our objectives were to test the following hypotheses: (1) pre-trauma expression levels of a gene subset (particularly immune-system genes) in peripheral blood would differ between trauma-exposed Marines who later developed PTSD and those who did not; (2) a predictive biomarker panel of the eventual emergence of PTSD among high-risk individuals could be developed based on gene expression in readily assessable peripheral blood cells; and (3) a predictive panel based on expression of individual exons would surpass the accuracy of a model based on expression of full-length gene transcripts. Gene-expression levels were assayed in peripheral blood samples from 50 U.S. Marines (25 eventual PTSD cases and 25 non-PTSD comparison subjects) prior to their deployment overseas to war-zones in Iraq or Afghanistan. The panel of biomarkers dysregulated in peripheral blood cells of eventual PTSD cases prior to deployment was significantly enriched for immune genes, achieved 70% prediction accuracy in an independent sample based on the expression of 23 full-length transcripts, and attained 80% accuracy in an independent sample based on the expression of one exon from each of five genes. If the observed profiles of pre-deployment mRNA-expression in eventual PTSD cases can be further refined and replicated, they could suggest avenues for early intervention and prevention among individuals at high risk for trauma exposure. Copyright © 2013 Wiley Periodicals, Inc.

  12. Integration of gene-based markers in a pearl millet genetic map for identification of candidate genes underlying drought tolerance quantitative trait loci

    Directory of Open Access Journals (Sweden)

    Sehgal Deepmala

    2012-01-01

    Full Text Available Abstract Background Identification of genes underlying drought tolerance (DT) quantitative trait loci (QTLs) will facilitate understanding of molecular mechanisms of drought tolerance, and also will accelerate genetic improvement of pearl millet through marker-assisted selection. We report a map based on genes with assigned functional roles in plant adaptation to drought and other abiotic stresses and demonstrate its use in identifying candidate genes underlying a major DT-QTL. Results Seventy-five single nucleotide polymorphism (SNP) and conserved intron spanning primer (CISP) markers were developed from available expressed sequence tags (ESTs) using four genotypes, H 77/833-2, PRLT 2/89-33, ICMR 01029 and ICMR 01004, representing parents of two mapping populations. A total of 228 SNPs were obtained from 30.5 kb sequenced region resulting in a SNP frequency of 1/134 bp. The positions of major pearl millet linkage group (LG) 2 DT-QTLs (reported from crosses H 77/833-2 × PRLT 2/89-33 and 841B × 863B) were added to the present consensus function map which identified 18 genes, coding for PSI reaction center subunit III, PHYC, actin, alanine glyoxylate aminotransferase, uridylate kinase, acyl-CoA oxidase, dipeptidyl peptidase IV, MADS-box, serine/threonine protein kinase, ubiquitin conjugating enzyme, zinc finger C-x8-C-x5-C-x3-H type, Hd3, acetyl CoA carboxylase, chlorophyll a/b binding protein, photolyase, protein phosphatase1 regulatory subunit SDS22 and two hypothetical proteins, co-mapping in this DT-QTL interval. Many of these candidate genes were found to have significant association with QTLs of grain yield, flowering time and leaf rolling under drought stress conditions. Conclusions We have exploited available pearl millet EST sequences to generate a mapped resource of seventy-five new gene-based markers for pearl millet and demonstrated its use in identifying candidate genes underlying a major DT-QTL in this species.
The reported gene-based

  13. Comprehensive Protocols for CRISPR/Cas9-based Gene Editing in Human Pluripotent Stem Cells.

    Science.gov (United States)

    Santos, David P; Kiskinis, Evangelos; Eggan, Kevin; Merkle, Florian T

    2016-08-17

    Genome editing of human pluripotent stem cells (hPSCs) with the CRISPR/Cas9 system has the potential to revolutionize hPSC-based disease modeling, drug screening, and transplantation therapy. Here, we aim to provide a single resource to enable groups, even those with limited experience with hPSC culture or the CRISPR/Cas9 system, to successfully perform genome editing. The methods are presented in detail and are supported by a theoretical framework to allow for the incorporation of inevitable improvements in the rapidly evolving gene-editing field. We describe protocols to generate hPSC lines with gene-specific knock-outs, small targeted mutations, or knock-in reporters. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  14. Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

    Science.gov (United States)

    Hiscock, D; Upton, C

    2000-05-01

    The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .

  15. Satellite DNA-based artificial chromosomes for use in gene therapy.

    Science.gov (United States)

    Hadlaczky, G

    2001-04-01

    Satellite DNA-based artificial chromosomes (SATACs) can be made by induced de novo chromosome formation in cells of different mammalian species. These artificially generated accessory chromosomes are composed of predictable DNA sequences and they contain defined genetic information. Prototype human SATACs have been successfully constructed in different cell types from 'neutral' endogenous DNA sequences from the short arm of the human chromosome 15. SATACs have already passed a number of hurdles crucial to their further development as gene therapy vectors, including: large-scale purification; transfer of purified artificial chromosomes into different cells and embryos; generation of transgenic animals and germline transmission with purified SATACs; and the tissue-specific expression of a therapeutic gene from an artificial chromosome in the milk of transgenic animals.

  16. Mesenchymal stem cell-based NK4 gene therapy in nude mice bearing gastric cancer xenografts

    Directory of Open Access Journals (Sweden)

    Zhu Y

    2014-12-01

    tissues after systemic injection. The microvessel density of tumor xenografts was decreased, and tumor cellular apoptosis was significantly induced in the mice treated with MSCs-NK4 compared to control mice. These findings demonstrate that MSC-based NK4 gene therapy can obviously inhibit the growth of gastric cancer xenografts, and MSCs are a better vehicle for NK4 gene therapy than lentiviral vectors. Further studies are warranted to explore the efficacy and safety of the MSC-based NK4 gene therapy in animals and cancer patients. Keywords: gastric cancer, gene therapy, tumor xenograft, hepatocyte growth factor, lentivirus, angiogenesis, apoptosis

  17. Shikonin enhances efficacy of a gene-based cancer vaccine via induction of RANTES

    Directory of Open Access Journals (Sweden)

    Chen Hui-Ming

    2012-04-01

    Full Text Available Abstract Background Shikonin, a phytochemical purified from Lithospermum erythrorhizon, has been shown to confer diverse pharmacological activities, including accelerating granuloma formation, wound healing, anti-inflammation and others, and is explored for immune-modifier activities for vaccination in this study. Transdermal gene-based vaccine is an attractive approach for delivery of DNA transgenes encoding specific tumor antigens to host skin tissues. Skin dendritic cells (DCs), a potent antigen-presenting cell type, are known to play a critical role in transmitting and orchestrating tumor antigen-specific immunities against cancers. The present study hence employs these various components for experimentation. Method The mRNA and protein expression of RANTES were detected by RT-PCR and ELISA, respectively. The regional expression of RANTES and tissue damage in test skin were evaluated via immunohistochemistry assay. Fluorescein isothiocyanate sensitization assay was performed to trace the trafficking of DCs from the skin vaccination site to draining lymph nodes. Adjuvantic effect of shikonin on gene gun-delivered human gp100 (hgp100) DNA cancer vaccine was studied in a human gp100-transfected B16 (B16/hgp100) tumor model. Results Among various phytochemicals tested, shikonin induced the highest level of expression of RANTES in normal skin tissues. In comparison, mouse RANTES cDNA gene transfection induced a higher level of mRANTES expression for a longer period, but caused more extensive skin damage. Topical application of shikonin onto the immunization site before gene gun-mediated vaccination augmented the population of skin DCs migrating into the draining lymph nodes. A hgp100 cDNA gene vaccination regimen with shikonin pretreatment as an adjuvant in a B16/hgp100 tumor model increased cytotoxic T lymphocyte activities in splenocytes and lymph node cells on target tumor cells. Conclusion Together, our findings suggest that shikonin can

  18. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. 
Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based

  19. A candidate gene-based association study of tocopherol content and composition in rapeseed (Brassica napus)

    Directory of Open Access Journals (Sweden)

    Steffi Fritsche

    2012-06-01

    Full Text Available Rapeseed (Brassica napus L.) is the most important oil crop of temperate climates. Rapeseed oil contains tocopherols, also known as vitamin E, which is an indispensable nutrient for humans and animals due to its antioxidant and radical scavenging abilities. Moreover, tocopherols are also important for the oxidative stability of vegetable oils. Therefore, seed oil with increased tocopherol content or altered tocopherol composition is a target for breeding. We investigated the role of nucleotide variations within candidate genes from the tocopherol biosynthesis pathway. Field trials were carried out with 229 accessions from a worldwide B. napus collection which was divided into two panels of 96 and 133 accessions. Seed tocopherol content and composition were measured by HPLC. High heritabilities were found for both traits, ranging from 0.62 to 0.94. We identified polymorphisms by sequencing selected regions of the tocopherol genes from the 96 accession panel. Subsequently, we determined the population structure (Q) and relative kinship (K) as detected by genotyping with genome-wide distributed SSR markers. Association studies were performed using two models, the structure-based GLM+Q and the PK mixed model. Between 26 and 12 polymorphisms within two genes (BnaX.VTE3.a, BnaA.PDS1.c) were significantly associated with tocopherol traits. The SNPs explained up to 16.93 % of the genetic variance for tocopherol composition and up to 10.48 % for total tocopherol content. Based on the sequence information we designed CAPS markers for genotyping the 133 accessions from the 2nd panel. Significant associations with various tocopherol traits confirmed the results from the first experiment. We demonstrate that the polymorphisms within the tocopherol genes clearly impact tocopherol content and composition in B. napus seeds. We suggest that these nucleotide variations may be used as selectable markers for breeding rapeseed with enhanced tocopherol quality.

  20. Systematics of Plant-Pathogenic and Related Streptomyces Species Based on Phylogenetic Analyses of Multiple Gene Loci

    Science.gov (United States)

    The 10 species of Streptomyces implicated as the etiological agents in scab disease of potatoes or soft rot disease of sweet potatoes are distributed among 7 different phylogenetic clades in analyses based on 16S rRNA gene sequences, but high sequence similarity of this gene among Streptomyces speci...

  1. Tsw gene-based resistance is triggered by a functional RNA silencing suppressor protein of the Tomato spotted wilt virus

    NARCIS (Netherlands)

    Ronde, de D.; Butterbach, P.B.E.; Lohuis, H.; Hedil, M.; Lent, van J.W.M.; Kormelink, R.J.M.

    2013-01-01

    As a result of contradictory reports, the avirulence (Avr) determinant that triggers Tsw gene-based resistance in Capsicum annuum against the Tomato spotted wilt virus (TSWV) is still unresolved. Here, the N and NSs genes of resistance-inducing (RI) and resistance-breaking (RB) isolates were cloned

  2. Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

    Directory of Open Access Journals (Sweden)

    Andrew J Burt

    Full Text Available Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08), where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.

  3. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

    Directory of Open Access Journals (Sweden)

    Saurav Mallik

    2017-12-01

    Full Text Available For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages for finding causal relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.

  4. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis.

    Science.gov (United States)

    Mallik, Saurav; Zhao, Zhongming

    2017-12-28

    For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages for finding causal relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.

  5. AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.

    Science.gov (United States)

    Zhao, X G; Dai, W; Li, Y; Tian, L

    2011-11-01

    The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely because the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm for identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We propose novel AUC-based statistical ensemble methods for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function by a convex surrogate loss function, whose minimizer can be efficiently identified. With the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new methods over the existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores to differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset have been satisfactory. By aiming to maximize the AUC directly, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful in high-dimensional settings. lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
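
    The idea of swapping the discontinuous AUC for a convex surrogate can be illustrated with a minimal sketch. The abstract does not say which surrogate the authors use, so the pairwise logistic loss below is an assumption chosen for illustration, as are the function names, the plain gradient descent, and the absence of any lasso penalty.

```python
import numpy as np

def empirical_auc(scores, labels):
    """Fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def pairwise_logistic_loss(w, X, labels):
    """Convex surrogate (assumed here): mean of log(1 + exp(-(s_case - s_control)))."""
    s = X @ w
    diff = s[labels == 1][:, None] - s[labels == 0][None, :]
    return np.mean(np.logaddexp(0.0, -diff))

def fit_weights(X, labels, lr=0.1, steps=500):
    """Plain gradient descent on the surrogate; no regularization in this sketch."""
    pos = X[labels == 1]
    neg = X[labels == 0]
    # One row per (case, control) pair: difference of their biomarker vectors.
    D = (pos[:, None, :] - neg[None, :, :]).reshape(-1, X.shape[1])
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margin = D @ w
        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)), computed stably.
        coeff = -np.exp(-np.logaddexp(0.0, margin))
        w -= lr * (D * coeff[:, None]).mean(axis=0)
    return w
```

    Because the surrogate is smooth and convex in the weights, any standard optimizer could replace the hand-written loop; the pairwise construction grows with the product of case and control counts, so it is only practical for modest sample sizes.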

  6. Genome-Wide Constitutively Expressed Gene Analysis and New Reference Gene Selection Based on Transcriptome Data: A Case Study from Poplar/Canker Disease Interaction

    Directory of Open Access Journals (Sweden)

    Jiaping Zhao

    2017-10-01

    Full Text Available A number of transcriptome datasets for differential expression (DE) genes have been widely used for understanding organismal biology, but these datasets also contain untapped information that can be used to develop more precise analytical tools. With the use of transcriptome data generated from poplar/canker disease interaction system, we describe a methodology to identify candidate reference genes from high-throughput sequencing data. This methodology will improve the accuracy of RT-qPCR and will lead to better standards for the normalization of expression data. Expression stability analysis from xylem and phloem of Populus bejingensis inoculated with the fungal canker pathogen Botryosphaeria dothidea revealed that 729 poplar transcripts (1.11%) were stably expressed, at a threshold level of coefficient of variation (CV) of FPKM < 20% and maximum fold change (MFC) of FPKM < 2.0. Expression stability and bioinformatics analysis suggested that commonly used house-keeping (HK) genes were not the most appropriate internal controls: 70 of the 72 commonly used HK genes were not stably expressed, 45 of the 72 produced multiple isoform transcripts, and some of their reported primers produced unspecific amplicons in PCR amplification. RT-qPCR analysis to compare and evaluate the expression stability of 10 commonly used poplar HK genes and 20 of the 729 newly-identified stably expressed transcripts showed that some of the newly-identified genes (such as SSU_S8e, LSU_L5e, and 20S_PSU) had higher stability ranking than most of the commonly used HK genes. Based on these results, we recommend a pipeline for deriving reference genes from transcriptome data. An appropriate candidate gene should have a unique transcript, constitutive expression, CV value of expression < 20% (or possibly 30%) and MFC value of expression < 2, and an expression level of 50–1,000 units.
Lastly, when four of the newly identified HK genes were used in the normalization of expression data for 20
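
    The numeric part of the recommended filter (CV < 20%, MFC < 2, expression level between 50 and 1,000 units) can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function name is hypothetical, "expression level" is taken to mean the mean FPKM across samples, the sample standard deviation is used for the CV, and the unique-transcript criterion is not checked.

```python
import numpy as np

def stable_reference_candidates(fpkm, gene_ids, cv_max=0.20, mfc_max=2.0,
                                mean_range=(50.0, 1000.0)):
    """Return gene ids whose FPKM rows pass the CV, MFC and mean-level filters.

    fpkm: 2-D array-like, one row per gene and one column per sample.
    """
    fpkm = np.asarray(fpkm, dtype=float)
    mean = fpkm.mean(axis=1)
    sd = fpkm.std(axis=1, ddof=1)
    # Rows containing zeros give inf or nan here, which fail the comparisons below.
    with np.errstate(divide="ignore", invalid="ignore"):
        cv = sd / mean                              # coefficient of variation
        mfc = fpkm.max(axis=1) / fpkm.min(axis=1)   # maximum fold change
    keep = ((cv < cv_max) & (mfc < mfc_max)
            & (mean >= mean_range[0]) & (mean <= mean_range[1]))
    return [g for g, k in zip(gene_ids, keep) if k]
```

    Raising `cv_max` to 0.30 corresponds to the looser threshold the abstract mentions as a possibility.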

  7. Prioritization of candidate genes for cattle reproductive traits, based on protein-protein interactions, gene expression, and text-mining

    DEFF Research Database (Denmark)

    Hulsegge, Ina; Woelders, Henri; Smits, Mari

    2013-01-01

    Reproduction is of significant economic importance in dairy cattle. Improved understanding of mechanisms that control estrous behavior and other reproduction traits could help in developing strategies to improve and/or monitor these traits. The objective of this study was to predict and rank gene...

  8. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    DEFF Research Database (Denmark)

    Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine

    2012-01-01

    the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric tests measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...

  9. Microarray-based analysis of differential gene expression between infective and noninfective larvae of Strongyloides stercoralis.

    Directory of Open Access Journals (Sweden)

    Roshan Ramanathan

    2011-05-01

    Full Text Available Differences between noninfective first-stage (L1) and infective third-stage (L3i) larvae of parasitic nematode Strongyloides stercoralis at the molecular level are relatively uncharacterized. DNA microarrays were developed and utilized for this purpose. Oligonucleotide hybridization probes for the array were designed to bind 3,571 putative mRNA transcripts predicted by analysis of 11,335 expressed sequence tags (ESTs) obtained as part of the Nematode EST project. RNA obtained from S. stercoralis L3i and L1 was co-hybridized to each array after labeling the individual samples with different fluorescent tags. Bioinformatic predictions of gene function were developed using a novel cDNA Annotation System software. We identified 935 differentially expressed genes (469 L3i-biased; 466 L1-biased) having two-fold expression differences or greater and microarray signals with a p value < 0.01. Based on a functional analysis, L1 larvae have a larger number of genes putatively involved in transcription (p = 0.004), and L3i larvae have biased expression of putative heat shock proteins (such as hsp-90). Genes with products known to be immunoreactive in S. stercoralis-infected humans (such as SsIR and NIE) had L3i biased expression. Abundantly expressed L3i contigs of interest included S. stercoralis orthologs of cytochrome oxidase ucr 2.1 and hsp-90, which may be potential chemotherapeutic targets. The S. stercoralis ortholog of fatty acid and retinol binding protein-1, successfully used in a vaccine against Ancylostoma ceylanicum, was identified among the 25 most highly expressed L3i genes. The sperm-containing glycoprotein domain, utilized in a vaccine against the nematode Cooperia punctata, was exclusively found in L3i biased genes and may be a valuable S. stercoralis target of interest. A new DNA microarray tool for the examination of S. stercoralis biology has been developed and provides new and valuable insights regarding differences between infective and
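
    The selection rule stated in the abstract (at least a two-fold expression difference and p < 0.01) can be written as a short filter. The sketch below is only illustrative: the record layout, the function name and the gene identifiers in the usage note are hypothetical, and it assumes per-gene signals and p values have already been computed by the microarray analysis.

```python
import math

def classify_bias(records, min_fold=2.0, max_p=0.01):
    """Split genes into L3i-biased and L1-biased lists.

    records: iterable of (gene_id, l3i_signal, l1_signal, p_value) tuples.
    """
    l3i_biased, l1_biased = [], []
    threshold = math.log2(min_fold)
    for gene_id, l3i_signal, l1_signal, p_value in records:
        # Skip non-significant genes and signals that cannot form a ratio.
        if p_value >= max_p or l3i_signal <= 0 or l1_signal <= 0:
            continue
        log_ratio = math.log2(l3i_signal / l1_signal)
        if log_ratio >= threshold:
            l3i_biased.append(gene_id)
        elif log_ratio <= -threshold:
            l1_biased.append(gene_id)
    return l3i_biased, l1_biased
```

    For example, a hypothetical record `("g1", 400, 100, 0.001)` would be counted as L3i-biased, while `("g3", 150, 100, 0.001)` would be dropped because its fold change is below two.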

  10. A novel mutual information-based Boolean network inference method from time-series gene expression data.

    Directory of Open Access Journals (Sweden)

    Shohag Barman

    Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
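
The first step described in this abstract, ranking candidate regulators by mutual information with a target gene, can be illustrated with a small sketch. This is not the MIBNI implementation: the gene names, the toy time series and the helper names `mutual_information` and `best_regulator` are all hypothetical, and the iterative gene-swapping stage is omitted.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def best_regulator(series, target):
    """Score every gene's state at time t against the target's state at t + 1."""
    future = series[target][1:]
    scores = {g: mutual_information(v[:-1], future) for g, v in series.items()}
    return max(scores, key=scores.get), scores


# Hypothetical Boolean time series (assumption): gene "c" at t + 1 is NOT "a" at t.
SERIES = {
    "a": [0, 1, 1, 0, 1, 0, 0, 1],
    "b": [1, 1, 0, 0, 1, 1, 0, 0],
    "c": [0, 1, 0, 0, 1, 0, 1, 1],
}

best, scores = best_regulator(SERIES, "c")
print(best, scores)
```

Because the score is computed between a candidate at time t and the target at time t + 1, a regulator that fully determines the target's next state reaches the entropy of the target series, which is the largest value the score can take.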

  11. A Peptide-based Vector for Efficient Gene Transfer In Vitro and In Vivo

    Science.gov (United States)

    Lehto, Taavi; Simonson, Oscar E; Mäger, Imre; Ezzat, Kariem; Sork, Helena; Copolovici, Dana-Maria; Viola, Joana R; Zaghloul, Eman M; Lundin, Per; Moreno, Pedro MD; Mäe, Maarja; Oskolkov, Nikita; Suhorutšenko, Julia; Smith, CI Edvard; Andaloussi, Samir EL

    2011-01-01

    Finding suitable nonviral delivery vehicles for nucleic acid–based therapeutics is a landmark goal in gene therapy. Cell-penetrating peptides (CPPs) are one class of delivery vectors that has been exploited for this purpose. However, since CPPs use endocytosis to enter cells, a large fraction of peptides remain trapped in endosomes. We have previously reported that stearylation of amphipathic CPPs, such as transportan 10 (TP10), dramatically increases transfection of oligonucleotides in vitro partially by promoting endosomal escape. Therefore, we aimed to evaluate whether stearyl-TP10 could be used for the delivery of plasmids as well. Our results demonstrate that stearyl-TP10 forms stable nanoparticles with plasmids that efficiently enter different cell types in a ubiquitous manner, including primary cells, resulting in significantly higher gene expression levels than when using stearyl-Arg9 or unmodified CPPs. In fact, the transfection efficacy of stearyl-TP10 almost reached the levels of Lipofectamine 2000 (LF2000), but without any of the observed lipofection-associated toxicities. Most importantly, stearyl-TP10/plasmid nanoparticles are nonimmunogenic and mediate efficient gene delivery in vivo when administered intramuscularly (i.m.) or intradermally (i.d.), without any associated toxicity in mice. PMID:21343913

  12. Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

    Science.gov (United States)

    Nitta, Sachiko Kaihara; Numata, Keiji

    2013-01-01

    There has been a great interest in application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified in three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation process has been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed. PMID:23344060

  13. Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

    Directory of Open Access Journals (Sweden)

    Keiji Numata

    2013-01-01

    Full Text Available There has been a great interest in application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified in three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation process has been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed.

  14. Use of reporter-gene based bacteria to quantify phenanthrene biodegradation and toxicity in soil

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Doyun [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Moon, Hee Sun [School of Earth and Environmental Science, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Lin, Chu-Ching; Barkay, Tamar [Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901 (United States); Nam, Kyoungphile, E-mail: kpnam@snu.ac.k [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of)

    2011-02-15

    A phenanthrene-degrading bacterium, Sphingomonas paucimobilis EPA505 was used to construct two fluorescence-based reporter strains. Strain D harboring gfp gene was constructed to generate green fluorescence when the strain started to biodegrade phenanthrene. Strain S possessing gef gene was designed to die once phenanthrene biodegradation was initiated and thus to lose green fluorescence when visualized by a live/dead cell staining. Confocal laser scanning microscopic observation followed by image analysis demonstrates that the fluorescence intensity generated by strain D increased and the intensity by strain S decreased linearly at the phenanthrene concentration of up to 200 mg/L. Such quantitative increase and decrease of fluorescence intensity in strain D (i.e., from 1 to 11.90 ± 0.72) and strain S (from 1 to 0.40 ± 0.07) were also evident in the presence of Ottawa sand spiked with the phenanthrene up to 1000 mg/kg. The potential use of the reporter strains in quantitatively determining biodegradable or toxic phenanthrene was discussed. - Research highlights: A novel reporter bacterial strain has been developed. The bacterium can quantitatively determine the change in fluorescence intensity. The intensity can represent the bioavailable phenanthrene in solid matrix. - A cell-killing gene harboring reporter bacterium shows phenanthrene toxicity.

  15. Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

    Science.gov (United States)

    Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

    2017-08-01

    Correct inference of genetic regulations inside a cell from biological data such as time-series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, namely the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small- and medium-scale artificial genetic networks, without and with the presence of different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against real time-series gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.

  16. GeneTrailExpress: a web-based pipeline for the statistical evaluation of microarray experiments

    Directory of Open Access Journals (Sweden)

    Kohlbacher Oliver

    2008-12-01

    Full Text Available Abstract Background High-throughput methods that allow for measuring the expression of thousands of genes or proteins simultaneously have opened new avenues for studying biochemical processes. While the noisiness of the data necessitates an extensive pre-processing of the raw data, the high dimensionality requires effective statistical analysis methods that facilitate the identification of crucial biological features and relations. For these reasons, the evaluation and interpretation of expression data is a complex, labor-intensive multi-step process. While a variety of tools for normalizing, analysing, or visualizing expression profiles has been developed in the last years, most of these tools offer only functionality for accomplishing certain steps of the evaluation pipeline. Results Here, we present a web-based toolbox that provides rich functionality for all steps of the evaluation pipeline. Our tool GeneTrailExpress offers besides standard normalization procedures powerful statistical analysis methods for studying a large variety of biological categories and pathways. Furthermore, an integrated graph visualization tool, BiNA, enables the user to draw the relevant biological pathways applying cutting-edge graph-layout algorithms. Conclusion Our gene expression toolbox with its interactive visualization of the pathways and the expression values projected onto the nodes will simplify the analysis and interpretation of biochemical pathways considerably.

  17. Unveiling network-based functional features through integration of gene expression into protein networks.

    Science.gov (United States)

    Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

    2018-06-01

    Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Map-Based Cloning of the Gene Associated With the Soybean Maturity Locus E3

    Science.gov (United States)

    Watanabe, Satoshi; Hideshima, Rumiko; Xia, Zhengjun; Tsubokura, Yasutaka; Sato, Shusei; Nakamoto, Yumi; Yamanaka, Naoki; Takahashi, Ryoji; Ishimoto, Masao; Anai, Toyoaki; Tabata, Satoshi; Harada, Kyuya

    2009-01-01

    Photosensitivity plays an essential role in the response of plants to their changing environments throughout their life cycle. In soybean [Glycine max (L.) Merrill], several associations between photosensitivity and maturity loci are known, but only limited information at the molecular level is available. The FT3 locus is one of the quantitative trait loci (QTL) for flowering time that corresponds to the maturity locus E3. To identify the gene responsible for this QTL, a map-based cloning strategy was undertaken. One phytochrome A gene (GmPhyA3) was considered a strong candidate for the FT3 locus. Allelism tests and gene sequence comparisons showed that alleles of Misuzudaizu (FT3/FT3; JP28856) and Harosoy (E3/E3; PI548573) were identical. The GmPhyA3 alleles of Moshidou Gong 503 (ft3/ft3; JP27603) and L62-667 (e3/e3; PI547716) showed weak or complete loss of function, respectively. High red/far-red (R/FR) long-day conditions enhanced the effects of the E3/FT3 alleles in various genetic backgrounds. Moreover, a mutant line harboring the nonfunctional GmPhyA3 flowered earlier than the original Bay (E3/E3; PI553043) under similar conditions. These results suggest that the variation in phytochrome A may contribute to the complex systems of soybean flowering response and geographic adaptation. PMID:19474204

  19. Gene-ontology enrichment analysis in two independent family-based samples highlights biologically plausible processes for autism spectrum disorders.

    LENUS (Irish Health Repository)

    Anney, Richard J L

    2012-02-01

    Recent genome-wide association studies (GWAS) have implicated a range of genes from discrete biological pathways in the aetiology of autism. However, despite the strong influence of genetic factors, association studies have yet to identify statistically robust, replicated major effect genes or SNPs. We apply the principle of the SNP ratio test methodology described by O'Dushlaine et al. to over 2100 families from the Autism Genome Project (AGP). Using a two-stage design we examine association enrichment in 5955 unique gene-ontology classifications across four groupings based on two phenotypic and two ancestral classifications. Based on estimates from simulation we identify excess of association enrichment across all analyses. We observe enrichment in association for sets of genes involved in diverse biological processes, including pyruvate metabolism, transcription factor activation, cell-signalling and cell-cycle regulation. Both genes and processes that show enrichment have previously been examined in autistic disorders and offer biological plausibility to these findings.

  20. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is thus that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  1. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is thus that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  2. Detection of Fusarium verticillioides by PCR-ELISA based on FUM21 gene.

    Science.gov (United States)

    Omori, Aline Myuki; Ono, Elisabete Yurie Sataque; Bordini, Jaqueline Gozzi; Hirozawa, Melissa Tiemi; Fungaro, Maria Helena Pelegrinelli; Ono, Mario Augusto

    2018-08-01

    Fusarium verticillioides is a primary corn pathogen and fumonisin producer which is associated with toxic effects in humans and animals. The traditional methods for detection of fungal contamination based on morphological characteristics are time-consuming and show low sensitivity and specificity. Therefore, the objective of this study was to develop a PCR-ELISA based on the FUM21 gene for F. verticillioides detection. The DNA of the F. verticillioides, Fusarium sp., Aspergillus sp. and Penicillium sp. isolates was analyzed by conventional PCR and PCR-ELISA to determine the specificity. The PCR-ELISA was specific to F. verticillioides isolates, showed a 2.5 pg detection limit and was 100-fold more sensitive than conventional PCR. In corn samples inoculated with F. verticillioides conidia, the detection limit of the PCR-ELISA was 1 × 10^4 conidia/g and was also 100-fold more sensitive than conventional PCR. Naturally contaminated corn samples were analyzed by PCR-ELISA based on the FUM21 gene and PCR-ELISA absorbance values correlated positively (p PCR-ELISA developed in this study can be useful for F. verticillioides detection in corn samples. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
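
As an illustration of what an information content-based semantic similarity measure computes, the sketch below implements the Resnik measure, one well-known measure of this family, on a tiny made-up ontology. It is not taken from A-DaGO-Fun: the term names, the annotation counts and the helper names are hypothetical, and real use would load the GO graph and annotation files instead.

```python
from math import log

# Hypothetical toy ontology (assumption): child term -> set of parent terms.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signalling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
}

# Hypothetical counts of genes annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signalling": 4,
    "lipid_metabolism": 2,
    "sugar_metabolism": 2,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -ln p(t), where p(t) covers annotations to t and its descendants."""
    total = sum(DIRECT_COUNTS.values())
    count = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -log(count / total)


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


print(resnik("lipid_metabolism", "sugar_metabolism"))
print(resnik("lipid_metabolism", "signalling"))
```

Two terms that share only the root get a similarity of zero, because the root covers every annotation and therefore carries no information.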

  4. PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs

    Directory of Open Access Journals (Sweden)

    Xue Fuzhong

    2010-01-01

    Full Text Available Abstract Background Genetic association study is currently the primary vehicle for identification and characterization of disease-predisposing variant(s), which usually involves multiple single-nucleotide polymorphisms (SNPs) available. However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of being able to account for correlations between neighbouring SNPs, yet assuming Hardy-Weinberg equilibrium (HWE) and potentially large number degrees of freedom can harm its statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard but their performance varies with methods of extracting principal components (PCs). Results PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses the PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs, i.e., cases only (CAES), controls only (COES) and cases and controls combined (CES). Extraction of PCs with COES is preferred to that with CAES and CES. Performance of the test was examined via simulations as well as analyses on data of rheumatoid arthritis and heroin addiction, which maintains nominal level under null hypothesis and showed comparable performance with permutation test. Conclusions PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs.
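
The general idea of a bootstrap confidence interval test on PC scores can be sketched as follows. This is only an illustration under stated assumptions, not the authors' PCA-BCIT procedure: it assumes the first-PC scores of cases and controls have already been computed, uses a percentile bootstrap interval for the difference in mean score, and treats an interval that excludes zero as evidence of association. The scores and the function name `bootstrap_ci` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(cases, controls, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(cases) - mean(controls)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        resampled_cases = rng.choices(cases, k=len(cases))
        resampled_controls = rng.choices(controls, k=len(controls))
        diffs.append(mean(resampled_cases) - mean(resampled_controls))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical first-PC scores (assumption), not data from the study.
CASE_SCORES = [1.2, 0.9, 1.5, 1.1, 1.3, 0.8, 1.4, 1.0]
CONTROL_SCORES = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1]

lower, upper = bootstrap_ci(CASE_SCORES, CONTROL_SCORES)
associated = not (lower <= 0.0 <= upper)
print(lower, upper, associated)
```

Resampling within each group keeps the case and control sample sizes fixed, which mirrors a case-control design; other resampling schemes are possible and may be closer to what the paper does.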

  5. Amino acid-substituted gemini surfactant-based nanoparticles as safe and versatile gene delivery agents.

    Science.gov (United States)

    Singh, Jagbir; Yang, Peng; Michel, Deborah; Verrall, Ronald E; Foldvari, Marianna; Badea, Ildiko

    2011-05-01

    Gene based therapy represents an important advance in the treatment of diseases that heretofore have had either no treatment or cure. To capitalize on the true potential of gene therapy, there is a need to develop better delivery systems that can protect these therapeutic biomolecules and deliver them safely to the target sites. Recently, we have designed and developed a series of novel amino acid-substituted gemini surfactants with the general chemical formula C(12)H(25) (CH(3))(2)N(+)-(CH(2))(3)-N(AA)-(CH(2))(3)-N(+) (CH(3))(2)-C(12)H(25) (AA= glycine, lysine, glycyl-lysine and, lysyl-lysine). These compounds were synthesized and tested in rabbit epithelial cells using a model plasmid and a helper lipid. Plasmid/gemini/lipid (P/G/L) nanoparticles formulated using these novel compounds achieved higher gene expression than the nanoparticles containing the parent unsubstituted compound. In this study, we evaluated the cytotoxicity of P/G/L nanoparticles and explored the relationship between transfection efficiency/toxicity and their physicochemical characteristics (such as size, binding properties, etc.). An overall low toxicity is observed for all complexes with no significant difference among substituted and unsubstituted compounds. An interesting result revealed by the dye exclusion assay suggests a more balanced protection of the DNA by the glycine and glycyl-lysine substituted compounds. Thus, the higher transfection efficiency is attributed to the greater biocompatibility and flexibility of the amino acid/peptide-substituted gemini surfactants and demonstrates the feasibility of using amino acid-substituted gemini surfactants as gene carriers for the treatment of diseases affecting epithelial tissue.

  6. Recurrent neural network based hybrid model for reconstructing gene regulatory network.

    Science.gov (United States)

    Raza, Khalid; Alam, Mansaf

    2016-10-01

    One of the exciting problems in systems biology research is to decipher how genome controls the development of complex biological system. The gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer fruitful information related to functional role of individual gene in a cellular system. Discovering GRNs lead to a wide range of applications, including identification of disease related pathways providing novel tentative drug targets, helps to predict disease response, and also assists in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with generalized extended Kalman filter for weight update in backpropagation through time training algorithm. The RNN is a complex neural network that gives a better settlement between biological closeness and mathematical flexibility to model GRN; and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy and Kalman filter performs well for estimation problem even in noisy data. Hence, we applied non-linear version of Kalman filter, known as generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks such as DNA SOS repair network, IRMA network, and two synthetic networks from DREAM Challenge. We performed a comparison of our results with other state-of-the-art techniques which shows superiority of our proposed model. Further, 5% Gaussian noise has been induced in the dataset and result of the proposed model shows negligible effect of noise on results, demonstrating the noise tolerance capability of the model. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. A new measure for gene expression biclustering based on non-parametric correlation.

    Science.gov (United States)

    Flores, Jose L; Inza, Iñaki; Larrañaga, Pedro; Calvo, Borja

    2013-12-01

    One of the emerging techniques for performing the analysis of the DNA microarray data known as biclustering is the search of subsets of genes and conditions which are coherently expressed. These subgroups provide clues about the main biological processes. Until now, different approaches to this problem have been proposed. Most of them use the mean squared residue as quality measure but relevant and interesting patterns can not be detected such as shifting, or scaling patterns. Furthermore, recent papers show that there exist new coherence patterns involved in different kinds of cancer and tumors such as inverse relationships between genes which can not be captured. The proposed measure is called Spearman's biclustering measure (SBM) which performs an estimation of the quality of a bicluster based on the non-linear correlation among genes and conditions simultaneously. The search of biclusters is performed by using a evolutionary technique called estimation of distribution algorithms which uses the SBM measure as fitness function. This approach has been examined from different points of view by using artificial and real microarrays. The assessment process has involved the use of quality indexes, a set of bicluster patterns of reference including new patterns and a set of statistical tests. It has been also examined the performance using real microarrays and comparing to different algorithmic approaches such as Bimax, CC, OPSM, Plaid and xMotifs. SBM shows several advantages such as the ability to recognize more complex coherence patterns such as shifting, scaling and inversion and the capability to selectively marginalize genes and conditions depending on the statistical significance. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
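
The abstract's claim that a rank-based correlation can recognize shifting, scaling and inverse patterns can be checked with a short sketch. It computes the plain Spearman rank correlation between two expression profiles; it is not the full SBM definition, which scores genes and conditions of a bicluster simultaneously, and the profile values are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical expression profile of one gene across five conditions.
BASE = [2.0, 5.0, 3.0, 8.0, 1.0]
SHIFTED = [v + 10.0 for v in BASE]   # shifting pattern
SCALED = [3.0 * v for v in BASE]     # scaling pattern
INVERTED = [-v for v in BASE]        # inverse relationship

print(spearman(BASE, SHIFTED), spearman(BASE, SCALED), spearman(BASE, INVERTED))
```

Shifted and scaled profiles keep the same ordering of conditions, so their rank correlation with the base profile is maximal, while an inverted profile reverses the ordering and is detected through the sign.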

  8. Lineage relationship of prostate cancer cell types based on gene expression

    Directory of Open Access Journals (Sweden)

    Ware Carol B

    2011-05-01

    Full Text Available Abstract Background Prostate tumor heterogeneity is a major factor in disease management. Heterogeneity could be due to multiple cancer cell types with distinct gene expression. Of clinical importance is the so-called cancer stem cell type. Cell type-specific transcriptomes are used to examine lineage relationship among cancer cell types and their expression similarity to normal cell types including stem/progenitor cells. Methods Transcriptomes were determined by Affymetrix DNA array analysis for the following cell types. Putative prostate progenitor cell populations were characterized and isolated by expression of the membrane transporter ABCG2. Stem cells were represented by embryonic stem and embryonal carcinoma cells. The cancer cell types were Gleason pattern 3 (glandular histomorphology) and pattern 4 (aglandular) sorted from primary tumors, cultured prostate cancer cell lines originally established from metastatic lesions, xenografts LuCaP 35 (adenocarcinoma phenotype) and LuCaP 49 (neuroendocrine/small cell carcinoma) grown in mice. No detectable gene expression differences were detected among serial passages of the LuCaP xenografts. Results Based on transcriptomes, the different cancer cell types could be clustered into a luminal-like grouping and a non-luminal-like (also not basal-like) grouping. The non-luminal-like types showed expression more similar to that of stem/progenitor cells than the luminal-like types. However, none showed expression of stem cell genes known to maintain stemness. Conclusions Non-luminal-like types are all representatives of aggressive disease, and this could be attributed to the similarity in overall gene expression to stem and progenitor cell types.

  9. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
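
    The quantile discretization step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the number of bins or the handling of ties, so the function name, the choice of four bins, and the sample values below are all assumptions made for illustration; the median rank score step is not shown.

```python
def quantile_discretize(values, n_bins=4):
    """Replace each value in one sample by the index of its rank-based bin.

    Assumption: bins are equal-sized by rank within a single sample, so the
    result no longer depends on the platform's measurement scale.
    Ties are broken by position (sorted() is stable).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


# Hypothetical measurements of the same four genes on two platforms.
cdna_log_ratios = [0.1, -1.2, 2.3, 0.7]            # cDNA array, log-ratios
oligo_intensities = [850.0, 120.0, 9100.0, 2300.0]  # oligonucleotide array

cdna_bins = quantile_discretize(cdna_log_ratios)
oligo_bins = quantile_discretize(oligo_intensities)
```

    Because only within-sample ranks are used, the two samples above map to the same discrete profile even though their raw scales differ by orders of magnitude, which is the property a cross-platform classifier needs.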

  10. Gene expression based evidence of innate immune response activation in the epithelium with oral lichen planus

    Science.gov (United States)

    Adami, Guy R.; Yeung, Alexander C.F.; Stucki, Grant; Kolokythas, Antonia; Sroussi, Herve Y.; Cabay, Robert J.; Kuzin, Igor; Schwartz, Joel L.

    2014-01-01

    Objective Oral lichen planus (OLP) is a disease of the oral mucosa of unknown cause producing lesions with an intense band-like inflammatory infiltrate of T cells to the subepithelium and keratinocyte cell death. We performed gene expression analysis of the oral epithelium of lesions in subjects with OLP and its sister disease, oral lichenoid reaction (OLR), in order to better understand the role of the keratinocytes in these diseases. Design Fourteen patients with OLP or OLR were included in the study, along with a control group of 23 subjects with a variety of oral diseases and a normal group of 17 subjects with no clinically visible mucosal abnormalities. Various proteins have been associated with OLP, based on detection of secreted proteins or changes in RNA levels in tissue samples consisting of epithelium, stroma, and immune cells. The mRNA level of twelve of these genes expressed in the epithelium was tested in the three groups. Results Four genes showed increased expression in the epithelium of OLP patients: CD14, CXCL1, IL8, and TLR1, and at least two of these proteins, TLR1 and CXCL1, were expressed at substantial levels in oral keratinocytes. Conclusions Because of the large accumulation of T cells in lesions of OLP it has long been thought to be an adaptive immunity malfunction. We provide evidence that there is increased expression of innate immune genes in the epithelium with this illness, suggesting a role for this process in the disease and a possible target for treatment. PMID:24581860

  11. KMgene: a unified R package for gene-based association analysis for complex traits.

    Science.gov (United States)

    Yan, Qi; Fang, Zhou; Chen, Wei; Stegle, Oliver

    2018-02-09

    In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model (GLMM) framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene. Availability: http://cran.r-project.org/web/packages/KMgene. Contact: qi.yan@chp.edu or wei.chen@chp.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.

  12. Genetic characterization of Italian field strains of Schmallenberg virus based on N and NSs genes.

    Science.gov (United States)

    Izzo, Francesca; Cosseddu, Gian Mario; Polci, Andrea; Iapaolo, Federica; Pinoni, Chiara; Capobianco Dondona, Andrea; Valleriani, Fabrizia; Monaco, Federica

    2016-08-01

    Following its first identification in Germany in 2011, the Schmallenberg virus (SBV) has rapidly spread to many other European countries. Despite the wide dissemination, the molecular characterization of the circulating strains is limited to German, Belgian, Dutch, and Swiss viruses. To fill this gap, partial genetic characterization of 15 Italian field strains was performed, based on S segment genes. Samples were collected in 2012 in two different regions where outbreaks occurred during distinct epidemic seasons. The comparative sequence analysis demonstrated a high molecular stability of the circulating viruses; nevertheless, we identified several variants of the N and NSs proteins not described in other SBV isolates circulating in Europe.

  13. Applications of gene-based technologies for improving animal production and health in developing countries

    International Nuclear Information System (INIS)

    Makkar, H.P.S.; Viljoen, G.J.

    2005-01-01

    This book provides a compilation of peer-reviewed scientific contributions from authoritative researchers attending an international symposium convened by the Animal Production and Health Sub-programme of the Animal Production and Health (APH), Joint FAO/IAEA Programme in cooperation with the Animal Production and Health Division of the FAO. These Proceedings contain invaluable information on the role and future potential of gene-based technologies for improving animal production and health, possible applications and constraints in the use of this technology in developing countries and their specific research needs

  14. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    Science.gov (United States)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers work only with labeled data and neglect the large amount of data that lacks sufficient follow-up information. Consequently, the small sample size limits the design of appropriate classifiers. In this paper, a transductive learning method that combines a filtering strategy within the transductive framework with a progressive labeling strategy is presented. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, and can effectively solve the problem of estimating the proportion of positive and negative samples in the working set. Our experimental results demonstrate that the proposed technique has great potential in cancer prediction based on gene expression.

  15. A Cas9-based toolkit to program gene expression in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Apel, Amanda Reider; d'Espaux, Leo; Wehrs, Maren

    2017-01-01

    of these parts via a web-based tool, that automates the generation of DNA fragments for integration. Our system builds upon existing gene editing methods in the thoroughness with which the parts are standardized and characterized, the types and number of parts available and the ease with which our methodology...... can be used to perform genetic edits in yeast. We demonstrated the applicability of this toolkit by optimizing the expression of a challenging but industrially important enzyme, taxadiene synthase (TXS). This approach enabled us to diagnose an issue with TXS solubility, the resolution of which yielded...

  16. Finding differentially expressed genes in high dimensional data: Rank based test statistic via a distance measure.

    Science.gov (United States)

    Mathur, Sunil; Sadana, Ajit

    2015-12-01

    We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as paired t-test, Wilcoxon signed rank test, and significance analysis of microarray (SAM) under certain non-normal distributions. The asymptotic distribution of the test statistic, and the p-value function are discussed. The application of proposed method is shown using a real-life data set. © The Author(s) 2011.

  17. Sensitive detection of novel Indian isolate of BTV 21 using ns1 gene based real-time PCR assay

    Directory of Open Access Journals (Sweden)

    Gaya Prasad

    2013-06-01

    Full Text Available Aim: The study was conducted to develop an ns1 gene based sensitive real-time RT-PCR assay for diagnosis of Indian isolates of bluetongue virus (BTV). Materials and Methods: The BTV serotype 21 isolate (KMNO7) was isolated from Andhra Pradesh and propagated in the BHK-21 cell line in our laboratory. The nucleic acid (dsRNA) of the virus was extracted using the Trizol method and cDNA was prepared using a standard protocol. The cDNA was subjected to ns1 gene based group-specific PCR to confirm the isolate as BTV. The viral RNA was serially diluted tenfold and the detection limit of ns1 gene based RT-PCR was determined. Finally, the tenfold diluted viral RNA was subjected to real-time RT-PCR using ns1 gene primers and a TaqMan probe to standardize the reaction and determine the detection limit. Results: The ns1 gene based group-specific PCR showed a single 366 bp amplicon in agarose gel electrophoresis, confirming the sample as BTV. The ns1 gene RT-PCR using tenfold diluted viral RNA showed a detection limit of 70.0 fg in 1% agarose gel electrophoresis. The ns1 gene based real-time RT-PCR was successfully standardized and the detection limit was found to be 7.0 fg. Conclusion: The ns1 gene based real-time RT-PCR was successfully standardized and it was found to be 10 times more sensitive than conventional RT-PCR. Key words: bluetongue, BTV21, RT-PCR, Real time RT-PCR, ns1 gene [Vet World 2013; 6(8): 554-557]

  18. Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

    Science.gov (United States)

    Golden, Joseph W; Josleyn, Matthew; Mucker, Eric M; Hung, Chien-Fu; Loudon, Peter T; Wu, T C; Hooper, Jay W

    2012-01-01

    Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.

  19. Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

    Directory of Open Access Journals (Sweden)

    Joseph W Golden

    Full Text Available Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.

  20. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
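
    The idea of a similarity metric formed as a weighted combination of base similarities, followed by a vote among the nearest neighbors, can be sketched as follows. This is only an illustration under stated assumptions: the weights are taken as already fitted (the regression step is not shown), the confidence score is a simple similarity-weighted vote share rather than the authors' voting scheme, and all names and numbers are hypothetical.

```python
def combined_similarity(base_sims, weights):
    """Weighted sum of base similarity measures for one gene pair."""
    return sum(w * s for w, s in zip(weights, base_sims))


def predict(neighbors, weights, k=3):
    """Predict a class for a target gene from labeled training genes.

    neighbors: list of (label, base_sims) pairs, where base_sims holds the
    base similarities between the target gene and that training gene.
    Assumes non-negative combined similarities among the top k.
    Returns (predicted_label, confidence in [0, 1]).
    """
    scored = sorted(
        ((combined_similarity(sims, weights), label) for label, sims in neighbors),
        reverse=True,
    )[:k]
    votes = {}
    for sim, label in scored:
        votes[label] = votes.get(label, 0.0) + sim
    best = max(votes, key=votes.get)
    return best, votes[best] / sum(votes.values())


# Hypothetical weights for two data sources (e.g. expression, sequence).
weights = [0.7, 0.3]
neighbors = [
    ("transport", [0.9, 0.2]),
    ("transport", [0.8, 0.5]),
    ("metabolism", [0.4, 0.9]),
    ("metabolism", [0.1, 0.1]),
]
label, confidence = predict(neighbors, weights, k=3)
```

    With these made-up inputs the two "transport" genes dominate the top three neighbors, so the sketch returns "transport" with a vote share of roughly 0.72.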

  1. Gene-based vaccine development for improving animal production in developing countries. Possibilities and constraints

    International Nuclear Information System (INIS)

    Egerton, J.R.

    2005-01-01

    For vaccine production, recombinant antigens must be protective. Identifying protective antigens or candidate antigens is an essential precursor to vaccine development. Even when a protective antigen has been identified, cloning of its gene does not lead directly to vaccine development. The fimbrial protein of Dichelobacter nodosus, the agent of foot-rot in ruminants, was known to be protective. Recombinant vaccines against this infection are ineffective if expressed protein subunits are not assembled as mature fimbriae. Antigenic competition between different, but closely related, recombinant antigens limited the use of multivalent vaccines based on this technology. Recombinant antigens may need adjuvants to enhance response. DNA vaccines, potentiated with genes for different cytokines, may replace the need for aggressive adjuvants, and especially where cellular immunity is essential for protection. The expression of antigens from animal pathogens in plants and the demonstration of some immunity to a disease like rinderpest after ingestion of these, suggests an alternative approach to vaccination by injection. Research on disease pathogenesis and the identification of candidate antigens is specific to the disease agent. The definition of expression systems and the formulation of a vaccine for each disease must be followed by research to establish safety and efficacy. Where vaccines are based on unique gene sequences, the intellectual property is likely to be protected by patent. Organizations, licensed to produce recombinant vaccines, expect to recover their costs and to make a profit. The consequence is that genetically-derived vaccines are expensive. The capacity of vaccines to help animal owners of poorer countries depends not only on quality and cost but also on the veterinary infrastructure where they are used. Ensuring the existence of an effective animal health infrastructure in developing countries is as great a challenge for the developed world as

  2. Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

    Science.gov (United States)

    Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

    2018-04-01

    The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways; its prognostic function was tested using the training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs may be used to effectively distinguish Luminal A samples with different prognoses. There were 9 pathways screened as significant pathways and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity to both the training and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1 may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.

  3. Pea Marker Database (PMD) - A new online database combining known pea (Pisum sativum L.) gene-based markers.

    Science.gov (United States)

    Kulaeva, Olga A; Zhernakov, Aleksandr I; Afonin, Alexey M; Boikov, Sergei S; Sulima, Anton S; Tikhonovich, Igor A; Zhukov, Vladimir A

    2017-01-01

    Pea (Pisum sativum L.) is the oldest model object of plant genetics and one of the most agriculturally important legumes in the world. Since the pea genome has not been sequenced yet, identification of genes responsible for mutant phenotypes or desirable agricultural traits is usually performed via genetic mapping followed by candidate gene search. Such mapping is best carried out using gene-based molecular markers, as it opens the possibility for exploiting genome synteny between pea and its close relative Medicago truncatula Gaertn., which has a sequenced and annotated genome. In the last 5 years, a large number of pea gene-based molecular markers have been designed and mapped owing to the rapid evolution of "next-generation sequencing" technologies. However, access to the complete set of markers designed worldwide is limited because the data are not uniform and are therefore hard to use. The Pea Marker Database was designed to combine the information about pea markers in the form of a user-friendly and practical online tool. Version 1 (PMD1) comprises information about 2484 genic markers, including their locations in linkage groups, the sequences of corresponding pea transcripts and the names of related genes in M. truncatula. Version 2 (PMD2) is an updated version comprising 15944 pea markers in the same format with several advanced features. To test the performance of the PMD, fine mapping of pea symbiotic genes Sym13 and Sym27 in linkage groups VII and V, respectively, was carried out. The results of mapping allowed us to propose the Sen1 gene (a homologue of SEN1 gene of Lotus japonicus (Regel) K. Larsen) as the best candidate gene for Sym13, and to narrow the list of possible candidate genes for Sym27 to ten, thus proving PMD to be useful for pea gene mapping and cloning. All information contained in PMD1 and PMD2 is available at www.peamarker.arriam.ru.

  4. Area-Specific Cell Stimulation via Surface-Mediated Gene Transfer Using Apatite-Based Composite Layers

    Directory of Open Access Journals (Sweden)

    Yushin Yazaki

    2015-04-01

    Full Text Available Surface-mediated gene transfer systems using biocompatible calcium phosphate (CaP)-based composite layers have attracted attention as a tool for controlling cell behaviors. In the present study we aimed to demonstrate the potential of CaP-based composite layers to mediate area-specific dual gene transfer and to stimulate cells on an area-by-area basis in the same well. For this purpose we prepared two pairs of DNA–fibronectin–apatite composite (DF-Ap) layers using a pair of reporter genes and a pair of differentiation factor genes. The results of the area-specific dual gene transfer successfully demonstrated that the cells cultured on a pair of DF-Ap layers that were adjacently placed in the same well showed specific gene expression patterns depending on the gene that was immobilized in the underlying layer. Moreover, preliminary real-time PCR results indicated that multipotential C3H10T1/2 cells may have a potential to change into different types of cells depending on the differentiation factor gene that was immobilized in the underlying layer, even in the same well. Because DF-Ap layers have a potential to mediate area-specific cell stimulation on their surfaces, they could be useful in tissue engineering applications.

  5. Microarray-Based Gene Expression Profiling to Elucidate Effectiveness of Fermented Codonopsis lanceolata in Mice

    Directory of Open Access Journals (Sweden)

    Woon Yong Choi

    2014-04-01

    Full Text Available In this study, the effect of Codonopsis lanceolata fermented by lactic acid on controlling gene expression levels related to obesity was observed in an oligonucleotide chip microarray. Among 8170 genes, 393 genes were up regulated and 760 genes were down regulated in feeding the fermented C. lanceolata (FCL). Another 374 genes were up regulated and 527 genes down regulated without feeding the sample. The genes were not affected by the FCL sample. It was interesting that among those genes, Cytochrome P450, Dmbt1, LOC76487, and thyroid hormones, etc., were mostly up or down regulated. These genes are more related to lipid synthesis. We could conclude that the FCL possibly controlled the gene expression levels related to lipid synthesis, which resulted in reducing obesity. However, more detailed protein expression experiments should be carried out.

  6. A gene-based analysis of variants in the serum/glucocorticoid regulated kinase (SGK) genes with blood pressure responses to sodium intake: the GenSalt Study.

    Directory of Open Access Journals (Sweden)

    Changwei Li

    Full Text Available Serum and glucocorticoid regulated kinase (SGK) plays a critical role in the regulation of renal sodium transport. We examined the association between SGK genes and salt sensitivity of blood pressure (BP) using single-marker and gene-based association analysis. A 7-day low-sodium (51.3 mmol sodium/day) followed by a 7-day high-sodium intervention (307.8 mmol sodium/day) was conducted among 1,906 Chinese participants. BP measurements were obtained at baseline and each intervention using a random-zero sphygmomanometer. Additive associations between each SNP and salt-sensitivity phenotypes were assessed using a mixed linear regression model to account for family dependencies. Gene-based analyses were conducted using the truncated p-value method. The Bonferroni method was used to adjust for multiple testing in all analyses. In single-marker association analyses, SGK1 marker rs2758151 was significantly associated with diastolic BP (DBP) response to high-sodium intervention (P = 0.0010). DBP responses (95% confidence interval) to high-sodium intervention for genotypes C/C, C/T, and T/T were 2.04 (1.57 to 2.52), 1.79 (1.42 to 2.16), and 0.85 (0.30 to 1.41) mmHg, respectively. Similar trends were observed for SBP and MAP responses although not significant (P = 0.15 and 0.0026, respectively). In addition, gene-based analyses demonstrated significant associations between SGK1 and SBP, DBP and MAP responses to high sodium intervention (P = 0.0002, 0.0076, and 0.00001, respectively). Neither SGK2 nor SGK3 was associated with the salt-sensitivity phenotypes in single-marker or gene-based analyses. The current study identified association of the SGK1 gene and BP salt-sensitivity in the Han Chinese population. Further studies are warranted to identify causal SGK1 gene variants.
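
    The abstract names a truncated p-value method and Bonferroni adjustment without giving details. One common form of such a gene-level test is the truncated product statistic, which combines only the SNP p-values at or below a cutoff. The sketch below assumes that form, uses a hypothetical cutoff of 0.05 and made-up SNP p-values, and estimates significance by simulating independent uniform p-values; real SNPs in one gene are correlated, so a permutation scheme would normally replace that step.

```python
import math
import random


def truncated_product_stat(pvalues, tau=0.05):
    """Log of the product of all p-values <= tau (0.0 if none qualify)."""
    return sum(math.log(p) for p in pvalues if p <= tau)


def monte_carlo_pvalue(pvalues, tau=0.05, n_sim=10000, seed=0):
    """Gene-level p-value under the (simplifying) independence assumption."""
    rng = random.Random(seed)
    observed = truncated_product_stat(pvalues, tau)
    hits = 0
    for _ in range(n_sim):
        # 1.0 - random() lies in (0, 1], which keeps log() defined.
        null = [1.0 - rng.random() for _ in pvalues]
        if truncated_product_stat(null, tau) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


# Hypothetical single-SNP p-values for one gene.
snp_pvalues = [0.001, 0.2, 0.04]
stat = truncated_product_stat(snp_pvalues)
gene_p = monte_carlo_pvalue(snp_pvalues)
# Hypothetical: three genes tested, overall alpha of 0.05.
threshold = bonferroni_threshold(0.05, 3)
```

    A gene would be declared significant in this sketch when `gene_p` falls below `threshold`.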

  7. Association Study between BDNF Gene Polymorphisms and Autism by Three-Dimensional Gel-Based Microarray

    Directory of Open Access Journals (Sweden)

    Zuhong Lu

    2009-06-01

    Full Text Available Single nucleotide polymorphisms (SNPs) are important markers which can be used in association studies searching for susceptible genes of complex diseases. High-throughput methods are needed for SNP genotyping in a large number of samples. In this study, we applied polyacrylamide gel-based microarray combined with dual-color hybridization for association study of four BDNF polymorphisms with autism. All the SNPs in both patients and controls could be analyzed quickly and correctly. Among four SNPs, only C270T polymorphism showed significant differences in the frequency of the allele (χ2 = 7.809, p = 0.005) and genotype (χ2 = 7.800, p = 0.020). In the haplotype association analysis, there was significant difference in global haplotype distribution between the groups (χ2 = 28.19, p = 3.44e-005). We suggest that BDNF has a possible role in the pathogenesis of autism. The study also shows that the polyacrylamide gel-based microarray combined with dual-color hybridization is a rapid, simple and high-throughput method for SNP genotyping, and can be used for association study of susceptible genes with disorders in large samples.
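
    The allele and genotype statistics reported here can be related to their p-values with a short calculation. The sketch below assumes the allele comparison has 1 degree of freedom (a 2x2 table) and the genotype comparison has 2 (a 2x3 table), which the abstract does not state; the allele counts in the 2x2 example are hypothetical, since the abstract gives no counts.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_pvalue_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_pvalue_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Statistics reported in the abstract for C270T.
allele_p = chi2_pvalue_df1(7.809)
genotype_p = chi2_pvalue_df2(7.800)

# Hypothetical allele counts: rows are patients/controls, columns are T/C.
example_chi2 = chi_square_2x2(30, 70, 15, 85)
```

    Under these degree-of-freedom assumptions the computed values agree with the reported p = 0.005 and p = 0.020 to the precision given in the abstract.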

  8. Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

    Science.gov (United States)

    Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

    2011-08-01

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    Science.gov (United States)

    Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

    2009-01-01

    Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid misleading results, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932
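
    The decision rule behind class-selective rejection can be illustrated with a small sketch: each decision is a non-empty subset of classes, and the subset with the lowest expected loss is chosen. This is not the ν-1-SVM method of the paper. It assumes class posterior probabilities are available, and the loss function, the ambiguity cost of 0.2 per extra class, and the class names are all hypothetical.

```python
from itertools import combinations


def all_decisions(classes):
    """Every non-empty subset of classes; the full set means 'reject from none'."""
    return [
        frozenset(combo)
        for size in range(1, len(classes) + 1)
        for combo in combinations(classes, size)
    ]


def loss(true_class, decision, miss_cost, ambiguity_cost=0.2):
    """Class-dependent penalty for a miss, smaller penalty for each extra class kept."""
    if true_class not in decision:
        return miss_cost[true_class]
    return ambiguity_cost * (len(decision) - 1)


def best_decision(posteriors, miss_cost):
    """Subset of classes minimizing the expected loss under the posteriors."""
    classes = sorted(posteriors)

    def expected_loss(decision):
        return sum(posteriors[c] * loss(c, decision, miss_cost) for c in classes)

    return min(all_decisions(classes), key=expected_loss)


miss_cost = {"A": 1.0, "B": 1.0, "C": 1.0}
ambiguous = best_decision({"A": 0.48, "B": 0.47, "C": 0.05}, miss_cost)
confident = best_decision({"A": 0.90, "B": 0.07, "C": 0.03}, miss_cost)
```

    In the ambiguous case the rule keeps both A and B and rejects only C; in the confident case it commits to A alone. Raising one entry of `miss_cost` makes the rule more reluctant to exclude that class, which is the asymmetric-penalty idea described in the abstract.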

  10. Optimal consistency in microRNA expression analysis using reference-gene-based normalization.

    Science.gov (United States)

    Wang, Xi; Gardiner, Erin J; Cairns, Murray J

    2015-05-01

    Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
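
    The core idea of reference-gene-based normalization described above can be sketched as follows. This is a minimal illustration, assuming log2-scale intensities and a user-supplied set of reference rows; the function name `rgb_normalize` and the choice of the mean as the reference summary are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def rgb_normalize(log_expr, ref_rows):
    """Subtract each sample's mean reference-gene level from all features.

    log_expr: (n_features, n_samples) array of log2 intensities.
    ref_rows: row indices treated as reference genes (hypothetical choice).
    """
    log_expr = np.asarray(log_expr, dtype=float)
    ref_level = log_expr[ref_rows, :].mean(axis=0)  # one value per sample
    return log_expr - ref_level                     # broadcast over rows

# Toy data: 3 features x 2 samples, rows 0 and 1 act as reference genes.
toy = np.array([[5.0, 6.0], [7.0, 8.0], [10.0, 12.0]])
normalized = rgb_normalize(toy, [0, 1])
```

    Because only the reference rows define the per-sample offset, a genuine global shift in the remaining features is preserved rather than flattened, which is the advantage the abstract attributes to this approach.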

  11. rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

    Science.gov (United States)

    Guo, Liyuan; Wang, Jing

    2018-01-04

    Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content, and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (iii) The type of regulatory elements was modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis

    Directory of Open Access Journals (Sweden)

    Stajich Jason E

    2006-11-01

    Full Text Available Abstract Background To date, most fungal phylogenies have been derived from single gene comparisons, or from concatenated alignments of a small number of genes. The increase in fungal genome sequencing presents an opportunity to reconstruct evolutionary events using entire genomes. As a tool for future comparative, phylogenomic and phylogenetic studies, we used both supertrees and concatenated alignments to infer relationships between 42 species of fungi for which complete genome sequences are available. Results A dataset of 345,829 genes was extracted from 42 publicly available fungal genomes. Supertree methods were employed to derive phylogenies from 4,805 single gene families. We found that the average consensus supertree method may suffer from long-branch attraction artifacts, while matrix representation with parsimony (MRP) appears to be immune from these. A genome phylogeny was also reconstructed from a concatenated alignment of 153 universally distributed orthologs. Our MRP supertree and concatenated phylogeny are highly congruent. Within the Ascomycota, the sub-phyla Pezizomycotina and Saccharomycotina were resolved. Both phylogenies infer that the Leotiomycetes are the closest sister group to the Sordariomycetes. There is some ambiguity regarding the placement of Stagonospora nodurum, the sole member of the class Dothideomycetes present in the dataset. Within the Saccharomycotina, a monophyletic clade containing organisms that translate CTG as serine instead of leucine is evident. There is also strong support for two groups within the CTG clade, one containing the fully sexual species Candida lusitaniae, Candida guilliermondii and Debaryomyces hansenii, and the second group containing Candida albicans, Candida dubliniensis, Candida tropicalis, Candida parapsilosis and Lodderomyces elongisporus. The second major clade within the Saccharomycotina contains species whose genomes have undergone a whole genome duplication (WGD), and their close

  13. Network-based differential gene expression analysis suggests cell cycle related genes regulated by E2F1 underlie the molecular difference between smoker and non-smoker lung adenocarcinoma

    Science.gov (United States)

    2013-01-01

    Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype, then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent, while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions In this paper, we develop nDGE to prioritize

  14. New tools for Mendelian disease gene identification: PhenoDB variant analysis module; and GeneMatcher, a web-based tool for linking investigators with an interest in the same gene.

    Science.gov (United States)

    Sobreira, Nara; Schiettecatte, François; Boehm, Corinne; Valle, David; Hamosh, Ada

    2015-04-01

    Identifying the causative variant from among the thousands identified by whole-exome sequencing or whole-genome sequencing is a formidable challenge. To make this process as efficient and flexible as possible, we have developed a Variant Analysis Module coupled to our previously described Web-based phenotype intake tool, PhenoDB (http://researchphenodb.net and http://phenodb.org). When a small number of candidate-causative variants have been identified in a study of a particular patient or family, a second, more difficult challenge becomes proof of causality for any given variant. One approach to this problem is to find other cases with a similar phenotype and mutations in the same candidate gene. Alternatively, it may be possible to develop biological evidence for causality, an approach that is assisted by making connections to basic scientists studying the gene of interest, often in the setting of a model organism. Both of these strategies benefit from an open access, online site where individual clinicians and investigators could post genes of interest. To this end, we developed GeneMatcher (http://genematcher.org), a freely accessible Website that enables connections between clinicians and researchers across the world who share an interest in the same gene(s). © 2015 WILEY PERIODICALS, INC.

  15. Pathway-based analysis of a melanoma genome-wide association study: analysis of genes related to tumour-immunosuppression.

    Directory of Open Access Journals (Sweden)

    Nils Schoof

    Full Text Available Systemic immunosuppression is a risk factor for melanoma, and sunburn-induced immunosuppression is thought to be causal. Genes in immunosuppression pathways are therefore candidate melanoma-susceptibility genes. If variants within these genes individually have a small effect on disease risk, the association may be undetected in genome-wide association (GWA) studies due to low power to reach a high significance level. Pathway-based approaches have been suggested as a method of incorporating a priori knowledge into the analysis of GWA studies. In this study, the association of 1113 single nucleotide polymorphisms (SNPs) in 43 genes (39 genomic regions) related to immunosuppression has been analysed using a gene-set approach in 1539 melanoma cases and 3917 controls from the GenoMEL consortium GWA study. The association between melanoma susceptibility and the whole set of tumour-immunosuppression genes, and also predefined functional subgroups of genes, was considered. The analysis was based on a measure formed by summing the evidence from the most significant SNP in each gene, and significance was evaluated empirically by case-control label permutation. An association was found between melanoma and the complete set of genes (p(emp)=0.002), as well as the subgroups related to the generation of tolerogenic dendritic cells (p(emp)=0.006) and secretion of suppressive factors (p(emp)=0.0004), thus providing preliminary evidence of involvement of tumour-immunosuppression gene polymorphisms in melanoma susceptibility. The analysis was repeated on a second phase of the GenoMEL study, which showed no evidence of an association. As one of the first attempts to replicate a pathway-level association, our results suggest that low power and heterogeneity may present challenges.
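
    The gene-set measure and permutation test described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-SNP evidence used here (n times the squared correlation between allele count and case status) is a stand-in chosen for simplicity, and the function names are hypothetical; the abstract does not specify the study's actual per-SNP statistic.

```python
import numpy as np

def snp_evidence(genotypes, labels):
    """Per-SNP evidence: n * r^2 between allele count and case status.
    Illustrative stand-in, not the statistic used in the study."""
    g = genotypes - genotypes.mean(axis=0)
    y = labels - labels.mean()
    num = (g * y[:, None]).sum(axis=0) ** 2
    den = (g ** 2).sum(axis=0) * (y ** 2).sum()
    # Monomorphic SNPs (den == 0) contribute zero evidence.
    return len(labels) * num / np.where(den > 0, den, np.inf)

def set_statistic(genotypes, labels, gene_ids):
    """Sum, over genes, of the evidence of each gene's most significant SNP."""
    ev = snp_evidence(genotypes, labels)
    return sum(ev[gene_ids == g].max() for g in np.unique(gene_ids))

def empirical_p(genotypes, labels, gene_ids, n_perm=999, seed=0):
    """Empirical p-value by permuting case-control labels."""
    genotypes = np.asarray(genotypes, dtype=float)
    labels = np.asarray(labels, dtype=float)
    gene_ids = np.asarray(gene_ids)
    rng = np.random.default_rng(seed)
    observed = set_statistic(genotypes, labels, gene_ids)
    hits = sum(
        set_statistic(genotypes, rng.permutation(labels), gene_ids) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

    Permuting the labels while keeping genotypes fixed preserves the correlation between SNPs within each gene, which is why taking the best SNP per gene does not need a separate multiple-testing adjustment in this scheme.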

  16. Genetic diversity of perch rhabdoviruses isolates based on the nucleoprotein and glycoprotein genes.

    Science.gov (United States)

    Talbi, Chiraz; Cabon, Joelle; Baud, Marine; Bourjaily, Maya; de Boisséson, Claire; Castric, Jeannette; Bigarré, Laurent

    2011-12-01

    Despite the increasing impact of rhabdoviruses in European percid farming, the diversity of the viral populations is still poorly investigated. To address this issue, we sequenced the partial nucleoprotein (N) and complete glycoprotein (G) genes of nine rhabdoviruses isolated from perch (Perca fluviatilis) between 1999 and 2010, mostly from France, and analyzed six of them by immunofluorescence antibody test (IFAT). Using two rabbit antisera raised against either the reference perch rhabdovirus (PRhV) isolated in 1980 or the perch isolate R6146, two serogroups were distinguished. Meanwhile, based on partial N and complete G gene analysis, perch rhabdoviruses were divided into four genogroups, A-B-D and E, with a maximum of 32.9% divergence (G gene) between isolates. A comparison of the G amino acid sequences of isolates from the two identified serogroups revealed several variable regions that might account for antigenic differences. Comparative analysis of perch isolates with other rhabdoviruses isolated from black bass, pike-perch and pike showed some strong phylogenetic relationships, suggesting cross-host transmission. Similarly, striking genetic similarities were shown between perch rhabdoviruses and isolates from other European countries and various ecological niches, most likely reflecting the circulation of viruses through fish trade as well as putative transfers from marine to freshwater fish. Phylogenetic relationships of the newly characterized viruses were also determined within the family Rhabdoviridae. The analysis revealed a genetic cluster containing only fish viruses, including all rhabdoviruses from perch, as well as siniperca chuatsi rhabdovirus (SCRV) and eel virus X (EVEX). This cluster was distinct from the one represented by spring viraemia of carp vesiculovirus (SVCV), pike fry rhabdovirus (PFRV) and mammalian vesiculoviruses. The new genetic data provided in the present study shed light on the diversity of rhabdoviruses infecting perch in

  17. Genomic DNA-based absolute quantification of gene expression in Vitis.

    Science.gov (United States)

    Gambetta, Gregory A; McElrone, Andrew J; Matthews, Mark A

    2013-07-01

    Many studies in which gene expression is quantified by polymerase chain reaction represent the expression of a gene of interest (GOI) relative to that of a reference gene (RG). Relative expression is founded on the assumptions that RG expression is stable across samples, treatments, organs, etc., and that reaction efficiencies of the GOI and RG are equal; assumptions which are often faulty. The true variability in RG expression and actual reaction efficiencies are seldom determined experimentally. Here we present a rapid and robust method for absolute quantification of expression in Vitis where varying concentrations of genomic DNA were used to construct GOI standard curves. This methodology was utilized to absolutely quantify and determine the variability of the previously validated RG ubiquitin (VvUbi) across three test studies in three different tissues (roots, leaves and berries). In addition, in each study a GOI was absolutely quantified. Data sets resulting from relative and absolute methods of quantification were compared and the differences were striking. VvUbi expression was significantly different in magnitude between test studies and variable among individual samples. Absolute quantification consistently reduced the coefficients of variation of the GOIs by more than half, often resulting in differences in statistical significance and in some cases even changing the fundamental nature of the result. Utilizing genomic DNA-based absolute quantification is fast and efficient. Through eliminating error introduced by assuming RG stability and equal reaction efficiencies between the RG and GOI this methodology produces less variation, increased accuracy and greater statistical power. © 2012 Scandinavian Plant Physiology Society.
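
    The standard-curve idea behind absolute quantification can be sketched as below. This is a generic qPCR calculation, assuming threshold-cycle (Ct) values measured on a dilution series of known template copy numbers; the function names and the example numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_standard_curve(copies, ct):
    """Fit Ct = slope * log10(copies) + intercept on a dilution series.

    Returns the slope, the intercept and the amplification efficiency
    implied by the slope (1.0 means perfect doubling per cycle).
    """
    slope, intercept = np.polyfit(np.log10(copies), ct, 1)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency

def quantify(ct, slope, intercept):
    """Convert measured Ct values to absolute copy numbers."""
    return 10 ** ((np.asarray(ct, dtype=float) - intercept) / slope)

# Synthetic dilution series with an ideal slope of about -3.32.
std_copies = np.array([1e2, 1e3, 1e4, 1e5])
std_ct = 38.0 - 3.321928 * np.log10(std_copies)
slope, intercept, eff = fit_standard_curve(std_copies, std_ct)
unknown_copies = quantify([25.0], slope, intercept)
```

    Fitting a separate curve for each gene of interest lets the measured efficiency replace the assumption of equal efficiencies, and no reference gene enters the calculation at all.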

  18. Differential gene expression in an elite hybrid rice cultivar (Oryza sativa L.) and its parental lines based on SAGE data

    Directory of Open Access Journals (Sweden)

    Chen Chen

    2007-09-01

    Full Text Available Abstract Background It was proposed that differentially-expressed genes, aside from genetic variations affecting protein processing and functioning, between a hybrid and its parents provide essential candidates for studying heterosis or hybrid vigor. Based on our serial analysis of gene expression (SAGE) data from an elite Chinese super-hybrid rice (LYP9) and its parental cultivars (93-11 and PA64s) in three major tissue types (leaves, roots and panicles) at different developmental stages, we analyzed the transcriptome and looked for candidate genes related to rice heterosis. Results By using an improved strategy of tag-to-gene mapping and two recently annotated genome assemblies (93-11 and PA64s), we identified 10,268 additional high-quality tags, reaching a grand total of 20,595 together with our previous result. We further detected 8.5% and 5.9% physically-mapped genes that are differentially-expressed among the triad (in at least one of the three stages) with P-values less than 0.05 and 0.01, respectively. These genes are distributed among 12 major gene expression patterns; among them, 406 up-regulated and 469 down-regulated genes (P Conclusion We improved the tag-to-gene mapping strategy by combining information from transcript sequences and rice genome annotation, and obtained a more comprehensive view of genes related to rice heterosis. The candidates for heterosis-related genes among different genotypes provide a new avenue for exploring the molecular mechanism underlying heterosis.

  19. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    Directory of Open Access Journals (Sweden)

    Kelsey E. Grinde

    2017-09-01

    Full Text Available To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10⁻⁶) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures.
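
    The general idea of a bootstrap correction for winner's curse can be sketched as follows. This is a rough illustration and is not claimed to be the authors' procedure: it assumes a quantitative trait, uses a simple burden-score test as a hypothetical gene-based test, and estimates the bias as the average gap between in-bootstrap and out-of-bag effect estimates among resamples where the gene-based test is significant. All function names and the threshold are assumptions.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x; NaN when x has no variation (or is empty)."""
    xc = x - x.mean()
    denom = (xc ** 2).sum()
    return np.nan if denom == 0 else (xc * (y - y.mean())).sum() / denom

def burden_significant(G, y, z_crit=1.96):
    """Hypothetical gene-based test: burden score versus trait."""
    burden = G.sum(axis=1)
    if burden.std() == 0 or y.std() == 0:
        return False
    r = np.corrcoef(burden, y)[0, 1]
    z = r * np.sqrt(len(y))  # crude score-type statistic under the null
    return abs(z) > z_crit

def bootstrap_corrected_effect(G, y, j, n_boot=200, seed=0):
    """Return (naive, corrected) effect estimates for variant j."""
    rng = np.random.default_rng(seed)
    n = len(y)
    naive = slope(G[:, j], y)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)              # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)    # out-of-bag individuals
        if not burden_significant(G[idx], y[idx]):
            continue                             # condition on significance
        d = slope(G[idx, j], y[idx]) - slope(G[oob, j], y[oob])
        if np.isfinite(d):
            diffs.append(d)
    bias = np.mean(diffs) if diffs else 0.0
    return naive, naive - bias
```

    The out-of-bag estimate is not conditioned on the significance of the gene-based test, so its gap from the in-sample estimate approximates the selection-induced bias that is then subtracted from the naive estimate.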

  20. Systems Pharmacology-Based Approach of Connecting Disease Genes in Genome-Wide Association Studies with Traditional Chinese Medicine.

    Science.gov (United States)

    Kim, Jihye; Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kang, Jaewoo; Tan, Aik Choon

    2018-01-01

    Traditional Chinese medicine (TCM), which originated in ancient China, has been practiced for thousands of years to treat various symptoms and diseases. However, the molecular mechanisms of TCM in treating these diseases remain unknown. In this study, we employ a systems pharmacology-based approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. We studied 102 TCM components and their target genes by analyzing microarray gene expression experiments. We constructed disease-gene networks from 2558 GWAS studies. We applied a systems pharmacology approach to prioritize disease-target genes. Using this bioinformatics approach, we analyzed 14,713 GWAS disease-TCM-target gene pairs and identified 115 disease-gene pairs with q value < 0.2. We validated several of these GWAS disease-TCM-target gene pairs with literature evidence, demonstrating that this computational approach could reveal novel indications for TCM. We also developed the TCM-Disease web application to facilitate traditional Chinese medicine drug repurposing efforts. Systems pharmacology is a promising approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. The computational approaches described in this study could be easily extended to other disease-gene network analyses.

  1. Partial Least Squares Based Gene Expression Analysis in EBV-Positive and EBV-Negative Posttransplant Lymphoproliferative Disorders.

    Science.gov (United States)

    Wu, Sa; Zhang, Xin; Li, Zhi-Ming; Shi, Yan-Xia; Huang, Jia-Jia; Xia, Yi; Yang, Hang; Jiang, Wen-Qi

    2013-01-01

    Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profiling facilitates the identification of biological differences between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array-specific factors. The aim of this study is to investigate the gene expression differences between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. With a microarray data set from the Gene Expression Omnibus database, we performed PLS based analysis. We acquired 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significant over-representation of dysregulated genes in immune response and cancer-related biological processes. Network analysis identified three hub genes with degrees higher than 15, including CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have previously been reported to interact with EBV. Our findings shed light on the expression differences between EBV positive and negative PTLDs, with the hope of offering theoretical support for future therapeutic studies.
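
    The hub-gene criterion mentioned above (degree higher than 15) amounts to counting each gene's distinct interaction partners. A minimal sketch follows, assuming the network is available as an undirected edge list; the function name and the toy edges are hypothetical and do not represent the study's network.

```python
from collections import Counter

def hub_genes(edges, min_degree=15):
    """Return {gene: degree} for genes whose degree exceeds min_degree.

    edges: iterable of (gene_a, gene_b) pairs from an undirected network.
    Duplicate and reversed pairs are counted once; self-loops are ignored.
    """
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter(gene for edge in unique_edges for gene in edge)
    return {gene: d for gene, d in degree.items() if d > min_degree}

# Toy network: HUB has 16 partners, MINOR has 3.
toy_edges = [("HUB", f"gene_{i}") for i in range(16)]
toy_edges += [("MINOR", f"gene_{i}") for i in range(3)]
toy_edges += [("gene_0", "HUB")]  # reversed duplicate, counted once
hubs = hub_genes(toy_edges)
```

    Deduplicating edges before counting matters in practice, since interaction databases often list the same pair in both orientations.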

  2. Ecdysone Receptor-based Singular Gene Switches for Regulated Transgene Expression in Cells and Adult Rodent Tissues

    Directory of Open Access Journals (Sweden)

    Seoghyun Lee

    2016-01-01

    Full Text Available Controlled gene expression is an indispensable technique in biomedical research. Here, we report a convenient, straightforward, and reliable way to induce expression of a gene of interest with negligible background expression compared to the most widely used tetracycline (Tet)-regulated system. Exploiting a Drosophila ecdysone receptor (EcR)-based gene regulatory system, we generated nonviral and adenoviral singular vectors designated as pEUI(+) and pENTR-EUI, respectively, which contain all the required elements to guarantee regulated transgene expression (GAL4-miniVP16-EcR, termed GvEcR hereafter, and 10 tandem repeats of an upstream activation sequence promoter followed by a multiple cloning site). Through the transient and stable transfection of mammalian cell lines with reporter genes, we validated that tebufenozide, an ecdysone agonist, reversibly induced gene expression, in a dose- and time-dependent manner, with negligible background expression. In addition, we created an adenovirus derived from the pENTR-EUI vector that readily infected not only cultured cells but also rodent tissues and was sensitive to tebufenozide treatment for regulated transgene expression. These results suggest that EcR-based singular gene regulatory switches would be convenient tools for the induction of gene expression in cells and tissues in a tightly controlled fashion.

  3. AUDIOME: a tiered exome sequencing-based comprehensive gene panel for the diagnosis of heterogeneous nonsyndromic sensorineural hearing loss.

    Science.gov (United States)

    Guan, Qiaoning; Balciuniene, Jorune; Cao, Kajia; Fan, Zhiqian; Biswas, Sawona; Wilkens, Alisha; Gallo, Daniel J; Bedoukian, Emma; Tarpinian, Jennifer; Jayaraman, Pushkala; Sarmady, Mahdi; Dulik, Matthew; Santani, Avni; Spinner, Nancy; Abou Tayoun, Ahmad N; Krantz, Ian D; Conlin, Laura K; Luo, Minjie

    2018-03-29

    Purpose Hereditary hearing loss is highly heterogeneous. To keep up with rapidly emerging disease-causing genes, we developed the AUDIOME test for nonsyndromic hearing loss (NSHL) using an exome sequencing (ES) platform and targeted analysis for the curated genes. Methods A tiered strategy was implemented for this test. Tier 1 includes combined Sanger and targeted deletion analyses of the two most common NSHL genes and two mitochondrial genes. Nondiagnostic tier 1 cases are subjected to ES and array followed by targeted analysis of the remaining AUDIOME genes. Results ES resulted in good coverage of the selected genes with 98.24% of targeted bases at >15×. A fill-in strategy was developed for the poorly covered regions, which generally fell within GC-rich or highly homologous regions. Prospective testing of 33 patients with NSHL revealed a diagnosis in 11 (33%) and a possible diagnosis in 8 cases (24.2%). Among those, 10 individuals had variants in tier 1 genes. The ES data in the remaining nondiagnostic cases are readily available for further analysis. Conclusion The tiered and ES-based test provides an efficient and cost-effective diagnostic strategy for NSHL, with the potential to reflex to full exome to identify causal changes outside of the AUDIOME test. Genetics in Medicine advance online publication, 29 March 2018; doi:10.1038/gim.2018.48.

  4. Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean.

    Science.gov (United States)

    Galeano, Carlos H; Cortés, Andrés J; Fernández, Andrea C; Soler, Álvaro; Franco-Herrera, Natalia; Makunde, Godwill; Vanderleyden, Jos; Blair, Matthew W

    2012-06-26

    In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. In short, this study illustrates the power of intron-based markers for linkage and association mapping in

  5. Gene Therapy Vectors with Enhanced Transfection Based on Hydrogels Modified with Affinity Peptides

    Science.gov (United States)

    Shepard, Jaclyn A.; Wesson, Paul J.; Wang, Christine E.; Stevans, Alyson C.; Holland, Samantha J.; Shikanov, Ariella; Grzybowski, Bartosz A.; Shea, Lonnie D.

    2011-01-01

    Regenerative strategies for damaged tissue aim to present biochemical cues that recruit and direct progenitor cell migration and differentiation. Hydrogels capable of localized gene delivery are being developed to provide a support for tissue growth, and as a versatile method to induce the expression of inductive proteins; however, the duration, level, and localization of expression are often insufficient for regeneration. We thus investigated the modification of hydrogels with affinity peptides to enhance vector retention and increase transfection within the matrix. PEG hydrogels were modified with lysine-based repeats (K4, K8), which retained approximately 25% more vector than control peptides. Transfection increased 5- to 15-fold with K8 and K4, respectively, over the RDG control peptide. K8- and K4-modified hydrogels bound similar quantities of vector, yet the vector dissociation rate was reduced for K8, suggesting excessive binding that limited transfection. These hydrogels were subsequently applied to an in vitro co-culture model to induce NGF expression and promote neurite outgrowth. K4-modified hydrogels promoted maximal neurite outgrowth, likely due to retention of both the vector and the NGF. Thus, hydrogels modified with affinity peptides enhanced vector retention and increased gene delivery, and these hydrogels may provide a versatile scaffold for numerous regenerative medicine applications. PMID:21514659

  6. Effective generation of transgenic pigs and mice by linker based sperm-mediated gene transfer.

    Directory of Open Access Journals (Sweden)

    Shih Ping Yao

    2002-04-01

    Full Text Available Abstract Background Transgenic animals have become valuable tools for both research and applied purposes. The current method of gene transfer, microinjection, which is widely used in transgenic mouse production, has only had limited success in producing transgenic animals of larger or higher species. Here, we report a linker based sperm-mediated gene transfer method (LB-SMGT) that greatly improves the production efficiency of large transgenic animals. Results The linker protein, a monoclonal antibody (mAb C), is reactive to a surface antigen on sperm of all tested species including pig, mouse, chicken, cow, goat, sheep, and human. mAb C is a basic protein that binds to DNA through ionic interaction, allowing exogenous DNA to be linked specifically to sperm. After fertilization of the egg, the DNA is shown to be successfully integrated into the genome of viable pig and mouse offspring with germ-line transfer to the F1 generation at a highly efficient rate: 37.5% of pigs and 33% of mice. The integration is demonstrated again by FISH analysis and F2 transmission in pigs. Furthermore, expression of the transgene is demonstrated in 61% (35/57) of transgenic pigs (F0 generation). Conclusions Our data suggest that LB-SMGT could be used to generate transgenic animals efficiently in many different species.

  7. Elastin overexpression by cell-based gene therapy preserves matrix and prevents cardiac dilation

    Science.gov (United States)

    Li, Shu-Hong; Sun, Zhuo; Guo, Lily; Han, Mihan; Wood, Michael F G; Ghosh, Nirmalya; Alex Vitkin, I; Weisel, Richard D; Li, Ren-Ke

    2012-01-01

    After a myocardial infarction, thinning and expansion of the fibrotic scar contribute to progressive heart failure. The loss of elastin is a major contributor to adverse extracellular matrix remodelling of the infarcted heart, and restoration of the elastic properties of the infarct region can prevent ventricular dysfunction. We implanted cells genetically modified to overexpress elastin to re-establish the elastic properties of the infarcted myocardium and prevent cardiac failure. A full-length human elastin cDNA was cloned, subcloned into an adenoviral vector and then transduced into rat bone marrow stromal cells (BMSCs). In vitro studies showed that BMSCs expressed the elastin protein, which was deposited into the extracellular matrix. Transduced BMSCs were injected into the infarcted myocardium of adult rats. Control groups received either BMSCs transduced with the green fluorescent protein gene or medium alone. Elastin deposition in the infarcted myocardium was associated with preservation of myocardial tissue structural integrity (by birefringence of polarized light; P elastin showed the greatest functional improvement (P elastin in the infarcted heart preserved the elastic structure of the extracellular matrix, which, in turn, preserved diastolic function, prevented ventricular dilation and preserved cardiac function. This cell-based gene therapy provides a new approach to cardiac regeneration. PMID:22435995

  8. Advances in Viral Vector-Based TRAIL Gene Therapy for Cancer

    International Nuclear Information System (INIS)

    Norian, Lyse A.; James, Britnie R.; Griffith, Thomas S.

    2011-01-01

    Numerous biologic approaches are being investigated as anti-cancer therapies in an attempt to induce tumor regression while circumventing the toxic side effects associated with standard chemo- or radiotherapies. Among these, tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) has shown particular promise in pre-clinical and early clinical trials, due to its preferential ability to induce apoptotic cell death in cancer cells and its minimal toxicity. One limitation of TRAIL use is the fact that many tumor types display an inherent resistance to TRAIL-induced apoptosis. To circumvent this problem, researchers have explored a number of strategies to optimize TRAIL delivery and to improve its efficacy via co-administration with other anti-cancer agents. In this review, we will focus on TRAIL-based gene therapy approaches for the treatment of malignancies. We will discuss the main viral vectors that are being used for TRAIL gene therapy and the strategies that are currently being attempted to improve the efficacy of TRAIL as an anti-cancer therapeutic

  9. Rapid and tunable method to temporally control gene editing based on conditional Cas9 stabilization. | Office of Cancer Genomics

    Science.gov (United States)

    The CRISPR/Cas9 system is a powerful tool for studying gene function. Here, we describe a method that allows temporal control of CRISPR/Cas9 activity based on conditional Cas9 destabilization. We demonstrate that fusing an FKBP12-derived destabilizing domain to Cas9 (DD-Cas9) enables conditional Cas9 expression and temporal control of gene editing in the presence of an FKBP12 synthetic ligand. This system can be easily adapted to co-express, from the same promoter, DD-Cas9 with any other gene of interest without co-modulation of the latter.

  10. Variability in Dopamine Genes Dissociates Model-Based and Model-Free Reinforcement Learning.

    Science.gov (United States)

    Doll, Bradley B; Bath, Kevin G; Daw, Nathaniel D; Frank, Michael J

    2016-01-27

    Considerable evidence suggests that multiple learning systems can drive behavior. Choice can proceed reflexively from previous actions and their associated outcomes, as captured by "model-free" learning algorithms, or flexibly from prospective consideration of outcomes that might occur, as captured by "model-based" learning algorithms. However, differential contributions of dopamine to these systems are poorly understood. Dopamine is widely thought to support model-free learning by modulating plasticity in striatum. Model-based learning may also be affected by these striatal effects, or by other dopaminergic effects elsewhere, notably on prefrontal working memory function. Indeed, prominent demonstrations linking striatal dopamine to putatively model-free learning did not rule out model-based effects, whereas other studies have reported dopaminergic modulation of verifiably model-based learning, but without distinguishing a prefrontal versus striatal locus. To clarify the relationships between dopamine, neural systems, and learning strategies, we combine a genetic association approach in humans with two well-studied reinforcement learning tasks: one isolating model-based from model-free behavior and the other sensitive to key aspects of striatal plasticity. Prefrontal function was indexed by a polymorphism in the COMT gene, differences of which reflect dopamine levels in the prefrontal cortex. This polymorphism has been associated with differences in prefrontal activity and working memory. Striatal function was indexed by a gene coding for DARPP-32, which is densely expressed in the striatum where it is necessary for synaptic plasticity. We found evidence for our hypothesis that variations in prefrontal dopamine relate to model-based learning, whereas variations in striatal dopamine function relate to model-free learning. 
Decisions can stem reflexively from their previously associated outcomes or flexibly from deliberative consideration of potential choice outcomes

  11. Sphingolipid base modifying enzymes in sunflower (Helianthus annuus): cloning and characterization of a C4-hydroxylase gene and a new paralogous Δ8-desaturase gene.

    Science.gov (United States)

    Moreno-Pérez, Antonio J; Martínez-Force, Enrique; Garcés, Rafael; Salas, Joaquín J

    2011-05-15

    Sphingolipids are components of plant cell membranes that participate in the regulation of important physiological processes. Unlike their animal counterparts, plant sphingolipids are characterized by high levels of base C4-hydroxylation. Moreover, desaturation at the Δ8 position predominates over the Δ4 desaturation typically found in animal sphingolipids. These modifications are due to the action of C4-hydroxylases and Δ8-long chain base desaturases, and they are important for complex sphingolipids finally becoming functional. The long chain bases of sunflower sphingolipids have high levels of hydroxylated and unsaturated moieties. Here, a C4-long chain base hydroxylase was functionally characterized in sunflower plant, an enzyme that could complement the sur2Δ mutation when heterologously expressed in this yeast mutant deficient in hydroxylation. This hydroxylase was ubiquitously expressed in sunflower, with the highest levels found in the developing cotyledons. In addition, we identified a new Δ8-long base chain desaturase gene that displays strong homology to a previously reported desaturase gene. This desaturase was also expressed in yeast and was able to change the long chain base composition of the transformed host. We studied the expression of this desaturase and compared it with that of the other isoform described in sunflower. The desaturase form studied in this paper displayed higher expression levels in developing seeds. Copyright © 2010 Elsevier GmbH. All rights reserved.

  12. Intellectual property rights and gene-based technologies for animal production and health. Issues for developing countries

    International Nuclear Information System (INIS)

    Dutfield, G.

    2005-01-01

    Intellectual property rights (IPR) are legal and institutional devices to protect creations of the mind. With respect to gene-based innovation, the most significant IPR is patents. Appropriate patent regimes have the potential to foster innovation in animal biotechnology and the transfer of gene-based technologies. Inappropriate patent systems may be counter-productive. Indeed, many critics are doubtful that the current international patent standards, based as they are on a combination of the United States of America's and European regimes, can help countries that lack the capacity to do much life science and biotechnology research to become more innovative or contribute to the acquisition, absorption and, where desirable, the adaptation of new gene-based technologies from outside. Present legislation in Europe, North America and internationally is considered, together with the controversies and important policy questions for developing countries, and the choices facing countries seeking to enhance their scientific and technological capacities in these areas. (author)

  13. Gene introduction into the mitochondria of Arabidopsis thaliana via peptide-based carriers

    Science.gov (United States)

    Chuah, Jo-Ann; Yoshizumi, Takeshi; Kodama, Yutaka; Numata, Keiji

    2015-01-01

    Available methods in plant genetic transformation are nuclear and plastid transformations because similar procedures have not yet been established for the mitochondria. The double membrane and small size of the organelle, in addition to its large population in cells, are major obstacles in mitochondrial transfection. Here we report the intracellular delivery of exogenous DNA localized to the mitochondria of Arabidopsis thaliana using a combination of mitochondria-targeting peptide and cell-penetrating peptide. Low concentrations of peptides were sufficient to deliver DNA into the mitochondria and expression of imported DNA reached detectable levels within a short incubation period (12 h). We found that electrostatic interaction with the cell membrane is not a critical factor for complex internalization, instead, improved intracellular penetration of mitochondria-targeted complexes significantly enhanced gene transfer efficiency. Our results delineate a simple and effective peptide-based method, as a starting point for the development of more sophisticated plant mitochondrial transfection strategies.

  14. Gene delivery by microfluidic flow-through electroporation based on constant DC and AC field.

    Science.gov (United States)

    Geng, Tao; Zhan, Yihong; Lu, Chang

    2012-01-01

    Electroporation is one of the most widely used physical methods to deliver exogenous nucleic acids into cells with high efficiency and low toxicity. Conventional electroporation systems typically require expensive pulse generators to provide short electrical pulses at high voltage. In this work, we demonstrate a flow-through electroporation method for continuous transfection of cells based on disposable chips, a syringe pump, and a low-cost power supply that provides a constant voltage. We successfully transfect cells using either DC or AC voltage with high flow rates (ranging from 40 µl/min to 20 ml/min) and high efficiency (up to 75%). We also enable the entire cell membrane to be uniformly permeabilized and dramatically improve gene delivery by inducing complex migrations of cells during the flow.

  15. Development of new USER-based cloning vectors for multiple genes expression in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Kildegaard, Kanchana Rueksomtawin; Jensen, Niels Bjerg; Maury, Jerome

    2013-01-01

    auxotrophic and dominant markers for convenience of use. Our vector set also contains both integrating and multicopy vectors for stability of protein expression and high expression level. We will make the new vector system available to the yeast community and provide a comprehensive protocol for cloning...... the production strain with the proper phenotype and product yield. However, the sequential number of metabolic engineering is time-consuming. Furthermore, the number of available selectable markers is also limiting the number of genetic modifications. To overcome these limitations, we have developed a new set...... of shuttle vectors for convenience of use for high-throughput cloning and selectable marker recycling. The new USER-based cloning vectors consist of a unique USER site and a CRE-loxP-mediated marker recycling system. The USER site allows insertion of genes of interest along with a bidirectional promoter...

  16. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A.; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-01-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas. PMID:27410039

  17. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis.

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-07-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas.

  18. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis.

    Directory of Open Access Journals (Sweden)

    Hirotomo Kato

    2016-07-01

    Full Text Available A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas.

  19. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    Science.gov (United States)

    Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan

    2017-01-01

    To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274

  20. Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

    Science.gov (United States)

    Doostparast Torshizi, Abolfazl; Petzold, Linda R

    2018-01-01

    Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from the Human Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  1. Temperature based daily incoming solar radiation modeling based on gene expression programming, neuro-fuzzy and neural network computing techniques.

    Science.gov (United States)

    Landeras, G.; López, J. J.; Kisi, O.; Shiri, J.

    2012-04-01

    The correct observation/estimation of surface incoming solar radiation (RS) is very important for many agricultural, meteorological and hydrological applications. While most weather stations are equipped with air temperature sensors, sensors for solar radiation are less common and the quality of the data they provide is sometimes poor. In these cases it is necessary to estimate this variable. Temperature based modeling procedures are reported in this study for estimating daily incoming solar radiation by using Gene Expression Programming (GEP) for the first time, and other artificial intelligence models such as Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference System (ANFIS). Traditional temperature based solar radiation equations were also included in this study and compared with artificial intelligence based approaches. Root mean square error (RMSE), mean absolute error (MAE), RMSE-based skill score (SSRMSE), MAE-based skill score (SSMAE) and the r2 criterion of Nash and Sutcliffe were used to assess the models' performances. An ANN (a four-input multilayer perceptron with ten neurons in the hidden layer) presented the best performance among the studied models (RMSE of 2.93 MJ m-2 d-1). A four-input ANFIS model proved to be an interesting alternative to ANNs (RMSE of 3.14 MJ m-2 d-1). Very few studies have estimated solar radiation with ANFIS, and the present one demonstrated the ability of ANFIS to model solar radiation based on temperatures and extraterrestrial radiation. This study also demonstrated, for the first time, the ability of GEP models to model solar radiation based on daily atmospheric variables. Although the accuracy of GEP models was slightly lower than that of the ANFIS and ANN models, the genetic programming models (i.e., GEP) are superior to other artificial intelligence models in giving a simple explicit equation for the
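    The evaluation criteria named in this record can be sketched in a few lines. The following is an illustrative sketch only, not the authors' code: the function names and sample values are hypothetical, and the skill score is assumed to take the common form 1 - (model error / reference error), which the abstract does not spell out.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of residual variance
    to the variance of the observations around their mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def skill_score(err_model, err_reference):
    """Assumed skill-score form: positive when the model error is
    smaller than the reference error."""
    return 1.0 - err_model / err_reference

# Hypothetical daily radiation values in MJ m-2 d-1, for illustration only.
observed = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 13.0]
print(rmse(observed, predicted), mae(observed, predicted),
      nash_sutcliffe(observed, predicted))
```

    Under this assumed definition, a skill score computed from the two RMSE values reported above (2.93 against 3.14) would be small and positive, which matches the abstract's description of the ANN as only modestly better than ANFIS.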

  2. Genes involved in immunity and apoptosis are associated with human presbycusis based on microarray analysis.

    Science.gov (United States)

    Dong, Yang; Li, Ming; Liu, Puzhao; Song, Haiyan; Zhao, Yuping; Shi, Jianrong

    2014-06-01

    Genes involved in immunity and apoptosis were associated with human presbycusis. CCR3 and GILZ played an important role in the pathogenesis of presbycusis, probably through regulating chemokine receptor, T-cell apoptosis, or T-cell activation pathways. To identify genes associated with human presbycusis and explore the molecular mechanism of presbycusis. Hearing function was tested by pure-tone audiometry. Microarray analysis was performed to identify presbycusis-correlated genes by Illumina Human-6 BeadChip using the peripheral blood samples of subjects. To identify biological process categories and pathways associated with presbycusis-correlated genes, bioinformatics analysis was carried out by Gene Ontology Tree Machine (GOTM) and database for annotation, visualization, and integrated discovery (DAVID). Quantitative RT-PCR (qRT-PCR) was used to validate the microarray data. Microarray analysis identified 469 up-regulated genes and 323 down-regulated genes. Both the dominant biological processes by Gene Ontology (GO) analysis and the enriched pathways by Kyoto encyclopedia of genes and genomes (KEGG) and BIOCARTA showed that genes involved in immunity and apoptosis were associated with presbycusis. In addition, CCR3, GILZ, CXCL10, and CX3CR1 genes showed consistent difference between groups for both the gene chip and qRT-PCR data. The differences of CCR3 and GILZ between presbycusis patients and controls were statistically significant (p < 0.05).

  3. A potential disruptive technology in vaccine development: gene-based vaccines and their application to infectious diseases.

    Science.gov (United States)

    Kaslow, David C

    2004-10-01

    Vaccine development requires an amalgamation of disparate disciplines and has unique economic and regulatory drivers. Non-viral gene-based delivery systems, such as formulated plasmid DNA, are new and potentially disruptive technologies capable of providing 'cheaper, simpler, and more convenient-to-use' vaccines. Typically and somewhat ironically, disruptive technologies have poorer product performance, at least in the near-term, compared with the existing conventional technologies. Because successful product development requires that the product's performance must meet or exceed the efficacy threshold for a desired application, the appropriate selection of the initial product applications for a disruptive technology is critical for its successful evolution. In this regard, the near-term successes of gene-based vaccines will likely be for protection against bacterial toxins and acute viral and bacterial infections. Recent breakthroughs, however, herald increasing rather than languishing performance improvements in the efficacy of gene-based vaccines. Whether gene-based vaccines ultimately succeed in eliciting protective immunity in humans to persistent intracellular pathogens, such as HIV, malaria and tuberculosis, for which the conventional vaccine technologies have failed, remains to be determined. A success against any one of the persistent intracellular pathogens would be sufficient proof that gene-based vaccines represent a disruptive technology against which future vaccine technologies will be measured.

  4. Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.)

    Directory of Open Access Journals (Sweden)

    Zhao Yongli

    2012-12-01

    Full Text Available Abstract Background Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. Results In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism to differentiate the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Conclusion Date palm EST sequences are a good resource for developing gene-based markers. These genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity study, QTL mapping, and molecular breeding.
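    The SSR search described in this record can be illustrated with a short regular-expression scan for perfect di-, tri- and tetranucleotide repeats. This is a minimal sketch, not the tool the authors used: the function names are hypothetical, and the minimum repeat counts are assumed thresholds, since the abstract does not state the ones applied in the study.

```python
import re

# Assumed minimum repeat counts per motif length (not from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit
    (e.g. 'AGAG' is 'AG' twice, 'AAA' is 'A' three times)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))

def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif,
                             len(match.group(0)) // unit_len))
    return hits

# Hypothetical EST fragment containing one AG repeat and one AAG repeat.
example = "TTT" + "AG" * 6 + "CCGT" + "AAG" * 5 + "TT"
print(find_ssrs(example))
```

    The primitive-motif check keeps a long AG run from also being counted as an AGAG tetranucleotide repeat, which would otherwise inflate the tetranucleotide class in tallies like the percentages reported above.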

  5. Heat-transfer-based detection of SNPs in the PAH gene of PKU patients

    Directory of Open Access Journals (Sweden)

    Vanden Bon N

    2014-03-01

    Full Text Available Natalie Vanden Bon,1 Bart van Grinsven,2 Mohammed Sharif Murib,2 Weng Siang Yeap,2 Ken Haenen,2,3 Ward De Ceuninck,2,3 Patrick Wagner,2,3 Marcel Ameloot,1 Veronique Vermeeren,1 Luc Michiels1; 1Biomedical Research Institute, Hasselt University, Diepenbeek, Belgium; 2Institute for Materials Research, Hasselt University, Diepenbeek, Belgium; 3IMOMEC, Diepenbeek, Belgium. Abstract: Conventional neonatal diagnosis of phenylketonuria is based on the presence of abnormal levels of phenylalanine in the blood. However, for carrier detection and prenatal diagnosis, direct detection of disease-correlated mutations is needed. To speed up and simplify mutation screening in genes, new technologies are developed. In this study, a heat-transfer method is evaluated as a mutation-detection technology in entire exons of the phenylalanine hydroxylase (PAH) gene. This method is based on the change in heat-transfer resistance (Rth) upon thermal denaturation of dsDNA (double-stranded DNA) on nanocrystalline diamond. First, ssDNA (single-stranded DNA) fragments that span the size range of the PAH exons were successfully immobilized on nanocrystalline diamond. Next, it was studied whether an Rth change could be observed during the thermal denaturation of these DNA fragments after hybridization to their complementary counterpart. A clear Rth shift during the denaturation of exon 5, exon 9, and exon 12 dsDNA was observed, corresponding to lengths of up to 123 bp. Finally, Rth was shown to detect prevalent single-nucleotide polymorphisms, c.473G>A (R158Q), c.932T>C (p.L311P), and c.1222C>T (R408W), correlated with phenylketonuria, displaying an effect related to the different melting temperatures of homoduplexes and heteroduplexes. Keywords: mutation detection, heat-transfer resistance, melting temperature, nanocrystalline diamond, persistence length

  6. Identification of Gene Modules Associated with Low Temperatures Response in Bambara Groundnut by Network-Based Analysis.

    Directory of Open Access Journals (Sweden)

    Venkata Suresh Bonthala

    Full Text Available Bambara groundnut (Vigna subterranea (L.) Verdc.) is an African legume and is a promising underutilized crop with good seed nutritional values. Low temperature stress in a number of African countries at night, such as Botswana, can affect the growth and development of bambara groundnut, leading to losses in potential crop yield. Therefore, in this study we developed a computational pipeline to identify and analyze the genes and gene modules associated with low temperature stress responses in bambara groundnut using the cross-species microarray technique (as bambara groundnut has no microarray chip) coupled with network-based analysis. Analyses of the bambara groundnut transcriptome using cross-species gene expression data resulted in the identification of 375 and 659 differentially expressed genes (p<0.01) under the sub-optimal (23°C) and very sub-optimal (18°C) temperatures, respectively, of which 110 genes are commonly shared between the two stress conditions. The construction of a Highest Reciprocal Rank-based gene co-expression network, followed by its partition using a Heuristic Cluster Chiseling Algorithm resulted in 6 and 7 gene modules in sub-optimal and very sub-optimal temperature stresses being identified, respectively. Modules of sub-optimal temperature stress are principally enriched with carbohydrate and lipid metabolic processes, while most of the modules of very sub-optimal temperature stress are significantly enriched with responses to stimuli and various metabolic processes. Several transcription factors (from MYB, NAC, WRKY, WHIRLY & GATA classes) that may regulate the downstream genes involved in response to stimulus in order for the plant to withstand very sub-optimal temperature stress were highlighted. The identified gene modules could be useful in breeding for low-temperature stress tolerant bambara groundnut varieties.

  7. Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes.

    Directory of Open Access Journals (Sweden)

    Christof Winter

    Full Text Available Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice.
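    The idea of ranking genes "in a manner similar to Google's PageRank" can be illustrated with a generic personalized-PageRank iteration, in which each gene's own expression-based score acts as a prior and the network redistributes rank between neighbours. This is a sketch under stated assumptions, not the authors' published algorithm: the function name, the damping value, the toy network and the scores are all hypothetical, and the graph is assumed to be undirected with every gene having at least one neighbour.

```python
def network_rank(adjacency, scores, damping=0.3, iterations=100):
    """PageRank-style ranking with a per-gene prior.

    adjacency: dict mapping each gene to a list of neighbouring genes
               (undirected, every gene with at least one neighbour).
    scores:    dict of non-negative per-gene scores, e.g. the absolute
               correlation of expression with survival time.
    damping:   share of rank taken from the network rather than the prior.
    """
    genes = list(scores)
    total = sum(scores.values())
    prior = {g: scores[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(iterations):
        updated = {}
        for g in genes:
            # Each neighbour passes on its rank split evenly over its edges.
            from_network = sum(rank[n] / len(adjacency[n])
                               for n in adjacency[g])
            updated[g] = (1 - damping) * prior[g] + damping * from_network
        rank = updated
    return rank

# Hypothetical toy network: hub gene A has a weak score of its own but is
# connected to three genes with stronger scores.
toy_network = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
toy_scores = {"A": 0.1, "B": 0.3, "C": 0.3, "D": 0.3}
ranks = network_rank(toy_network, toy_scores)
print(sorted(ranks.items(), key=lambda item: -item[1]))
```

    In this toy case the hub gene rises above its own expression-based score because its neighbours are informative, which is the kind of behaviour that combining expression with network information is meant to produce.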

  8. Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data

    Directory of Open Access Journals (Sweden)

    de los Reyes Benildo G

    2008-04-01

    Full Text Available Abstract Background Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published data. Integrating these complementary datasets helps infer a mutually consistent transcriptional regulatory network (TRN) with strong similarity to the structure of the underlying genetic regulatory modules. Decomposing the TRN into a small set of recurring regulatory patterns, called network motifs (NM), facilitates the inference. Identifying NMs defined by specific transcription factors (TF) establishes the framework structure of a TRN and allows the inference of TF-target gene relationships. This paper introduces a computational framework for utilizing data from multiple sources to infer TF-target gene relationships on the basis of NMs. The data include time course gene expression profiles, genome-wide location analysis data, binding sequence data, and gene ontology (GO) information. Results The proposed computational framework was tested using gene expression data associated with cell cycle progression in yeast. Among 800 cell cycle related genes, 85 were identified as candidate TFs and classified into four previously defined NMs. The NMs for a subset of TFs are obtained from literature. Support vector machine (SVM) classifiers were used to estimate NMs for the remaining TFs. The potential downstream target genes for the TFs were clustered into 34 biologically significant groups. The relationships between TFs and potential target gene clusters were examined by training recurrent neural networks whose topologies mimic the NMs to which the TFs are classified. The identified relationships between TFs and gene clusters were evaluated using the following biological validation and statistical analyses: (1) Gene set enrichment

  9. Sieve-based relation extraction of gene regulatory networks from biological literature.

    Science.gov (United States)

    Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

    2015-01-01

    Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice of transforming
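
    The slot error rate used above to score a predicted network against a reference can be sketched as follows. This is a hedged illustration under stated assumptions: each network is a set of (source, target, relation type) triples with at most one relation per ordered pair, a wrong relation type on a correct pair counts as a substitution, and the function name and gene labels are hypothetical.

```python
def slot_error_rate(predicted, reference):
    """Slot error rate: (substitutions + deletions + insertions) / reference size.

    predicted, reference -- iterables of (source, target, relation_type) triples.
    Assumes at most one relation per ordered (source, target) pair.
    """
    ref_types = {(s, t): r for s, t, r in reference}
    pred_types = {(s, t): r for s, t, r in predicted}
    if not ref_types:
        raise ValueError("reference network must contain at least one edge")

    # Same gene pair, different relation type.
    substitutions = sum(
        1 for pair, rel in pred_types.items()
        if pair in ref_types and ref_types[pair] != rel
    )
    # Predicted edges that the reference does not contain.
    insertions = sum(1 for pair in pred_types if pair not in ref_types)
    # Reference edges that were not predicted at all.
    deletions = sum(1 for pair in ref_types if pair not in pred_types)

    return (substitutions + deletions + insertions) / len(ref_types)


# Placeholder gene labels, not taken from the sporulation network.
reference = {("g1", "g2", "activation"), ("g2", "g3", "inhibition"),
             ("g3", "g4", "activation"), ("g1", "g4", "binding")}
predicted = {("g1", "g2", "activation"), ("g2", "g3", "inhibition"),
             ("g3", "g4", "inhibition"),   # substitution
             ("g4", "g1", "activation")}   # insertion; ("g1", "g4") is a deletion
print(slot_error_rate(predicted, reference))  # 0.75
```

    Because insertions are counted in the numerator but not in the denominator, the rate can exceed 1 for a prediction with many spurious edges; lower values are better.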

  10. Gene-based technologies for livestock industries in the 3rd millennium

    International Nuclear Information System (INIS)

    Cunningham, E.P.

    2005-01-01

    The first complete genome sequence of an organism was for yeast, in 1996. Since then, the much larger task of doing a complete human sequence has been completed. Those of major domestic animals are following rapidly. It will always be impossible to foresee the full potential of such an explosion in knowledge, but aspects of gene-based technologies are already beginning to have an impact in the livestock sector. The first and most obvious area of impact concerns feed supply, which constitutes 50-75 percent of total costs in many livestock systems. Production costs for maize and soybean are being reduced by genetic modification of the crop for herbicide and insect resistance. Maize has been modified to reduce phosphorus and nitrogen excretion in swine and poultry, and also to provide a more valuable amino acid balance. Genetic modification of the animal is also possible. Most dramatically, the insertion of a growth hormone in the DNA of fish accelerates growth. However, in this and all other cases, the genetic modification (GM) of animals has produced profound physiological disturbances. At the same time, the administration of GM-produced growth hormone to dairy cows is now routine in the United States of America and several other countries. This is not permitted in Europe, where the attitude to all GM technologies has been much more cautious. Conventional selection programmes continue to deliver steady genetic improvement in all animal populations. New molecular methods offer the prospect of enhancing genetic gains, particularly for traits that are difficult or expensive to measure, or which have low heritability. Gene technologies have much to contribute to the control of disease in animals. As pressure to reduce antibiotic and drug use increases, genetically modified vaccines with proven specificity and distinguishable from natural infections are already in use. DNA typing is helping with rapid and precise diagnosis. In addition, the interaction of some pathogens

  11. Dual delivery systems based on polyamine analog BENSpm as prodrug and gene delivery vectors

    Science.gov (United States)

    Zhu, Yu

    Combination drug and gene therapy shows promise in cancer treatment. However, the success of such a strategy requires careful selection of the therapeutic agents, as well as development of efficient delivery vectors. BENSpm (N1,N11-bisethylnorspermine), a polyamine analogue targeting the intracellular polyamine pathway, draws our special attention for the following reasons: (1) the polyamine pathway is frequently dysregulated in cancer; (2) BENSpm exhibits multiple functions to interfere with the polyamine pathway, such as up-regulating polyamine metabolism enzymes and down-regulating polyamine biosynthesis enzymes. Therefore BENSpm depletes all natural polyamines and leads to apoptosis and cell growth inhibition in a wide range of cancers; (3) preclinical studies proved that BENSpm can act synergistically with various chemotherapy agents, making it a promising candidate in combination therapy; (4) multiple positive charges in BENSpm make it a suitable building block for cationic polymers, which can be further applied to gene delivery. In this dissertation, our goal was to design a dual-function delivery vector based on BENSpm that can function as a gene delivery vector and, after intracellular degradation, as an active anticancer agent targeting dysregulated polyamine metabolism. We first demonstrated strong synergism between BENSpm and a potential therapeutic gene product, TRAIL. Strong synergism was obtained in both estrogen-dependent MCF-7 breast cancer cells and triple-negative MDA-MB-231 breast cancer cells. Significant dose reduction of TRAIL in combination with BENSpm in MDA-MB-231 cells, together with the fact that BENSpm rendered MCF-7 cells more sensitive to TRAIL treatment, verified our rationale for designing a BENSpm-based delivery platform. This was expected to be beneficial for overcoming drug resistance in chemotherapy, as well as boosting the therapeutic effect of therapeutic genes. We first designed a lipid-based BENSpm dual vector (Lipo

  12. Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation.

    Science.gov (United States)

    Li, Min; Zhang, Jiayi; Liu, Qing; Wang, Jianxin; Wu, Fang-Xiang

    2014-01-01

    Predicting disease-related genes is one of the most important tasks in bioinformatics and systems biology. With the advances in high-throughput techniques, a large number of protein-protein interactions are available, which makes it possible to identify disease-related genes at the network level. However, network-based identification of disease-related genes is still a challenge because a considerable number of false positives still exist in the currently available protein interaction networks (PIN). Considering the fact that the majority of genetic disorders tend to manifest only in a single or a few tissues, we constructed tissue-specific networks (TSN) by integrating PIN and tissue-specific data. We further weighted the constructed tissue-specific network (WTSN) using DNA methylation, as it plays an irreplaceable role in the development of complex diseases. A PageRank-based method was developed to identify disease-related genes from the constructed networks. To validate the effectiveness of the proposed method, we constructed PIN, weighted PIN (WPIN), TSN, and WTSN for colon cancer and leukemia, respectively. The experimental results on colon cancer and leukemia show that the combination of tissue-specific data and DNA methylation can help to identify disease-related genes more accurately. Moreover, the PageRank-based method was effective in predicting disease-related genes in the case studies of colon cancer and leukemia. Tissue-specific data and DNA methylation are two important factors in the study of human diseases. The same method implemented on the WTSN can achieve better results than when it is implemented on the original PIN, WPIN, or TSN. The PageRank-based method outperforms the degree centrality-based method for identifying disease-related genes from the WTSN.
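
    The construction of a weighted tissue-specific network and its PageRank-based scoring can be sketched roughly as below. The abstract does not state how edges are weighted, so the rule used here (the mean of two per-gene methylation-derived scores) is purely an assumption, as are the function names and the toy inputs.

```python
def build_wtsn(pin_edges, tissue_genes, methylation):
    """Filter a protein interaction network (PIN) to one tissue and weight it.

    An edge is kept only if both genes are expressed in the tissue. The weight
    is the mean of the two genes' methylation-derived scores (an assumed rule).
    """
    weighted = {}
    for a, b in pin_edges:
        if a != b and a in tissue_genes and b in tissue_genes:
            weight = (methylation.get(a, 0.0) + methylation.get(b, 0.0)) / 2
            weighted[frozenset((a, b))] = weight
    return weighted


def weighted_pagerank(weighted_edges, damping=0.85, iterations=100):
    """PageRank on an undirected weighted graph with uniform teleportation."""
    adjacency = {}
    for pair, weight in weighted_edges.items():
        a, b = tuple(pair)
        adjacency.setdefault(a, {})[b] = weight
        adjacency.setdefault(b, {})[a] = weight
    genes = sorted(adjacency)
    n = len(genes)
    strength = {g: sum(adjacency[g].values()) for g in genes}

    score = {g: 1.0 / n for g in genes}
    for _ in range(iterations):
        # Genes whose edges all have zero weight spread their score uniformly.
        dangling = sum(score[g] for g in genes if strength[g] == 0)
        new_score = {}
        for g in genes:
            incoming = sum(score[m] * w / strength[m]
                           for m, w in adjacency[g].items() if strength[m] > 0)
            new_score[g] = (1 - damping) / n + damping * (incoming + dangling / n)
        score = new_score
    return score


# Toy inputs with placeholder gene names.
pin = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
tissue = {"A", "B", "C", "D"}            # E is not expressed in this tissue
methyl = {"A": 0.2, "B": 0.8, "C": 0.6, "D": 0.4}
wtsn = build_wtsn(pin, tissue, methyl)
scores = weighted_pagerank(wtsn)
```

    A degree centrality baseline would simply count the edges per gene in `wtsn`; the PageRank variant additionally lets score flow along heavier edges.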

  13. Protein-Protein Interactions Prediction Based on Iterative Clique Extension with Gene Ontology Filtering

    Directory of Open Access Journals (Sweden)

    Lei Yang

    2014-01-01

    Full Text Available Cliques (maximal complete subnets) in a protein-protein interaction (PPI) network are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPIs compensate for the deficiencies of data from biological experiments. However, clique-based prediction methods depend only on the topology of the network. The false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method that combines clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.
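
    The idea of predicting interactions by clique extension and then filtering them with GO annotations can be sketched in a much simplified form. This is not the published algorithm: here a missing edge is proposed only when its two proteins share two mutually interacting neighbors (so that adding the edge would create a 4-clique), and the GO rule is a plain shared-term count. The function name, the threshold, and the protein and term labels are assumptions.

```python
from itertools import combinations


def predict_interactions(edges, go_terms, min_shared_terms=1):
    """Propose missing PPI edges that would complete a clique, filtered by GO.

    edges    -- iterable of (protein_a, protein_b) known interactions
    go_terms -- dict protein -> set of GO term identifiers
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    predictions = []
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already a known interaction
        common = neighbors[u] & neighbors[v]
        # Topology rule: two common neighbors that interact with each other
        # mean that adding (u, v) would yield a clique of four proteins.
        completes_clique = any(
            y in neighbors[x] for x, y in combinations(sorted(common), 2)
        )
        # Annotation rule: the pair must share enough GO terms.
        shared = go_terms.get(u, set()) & go_terms.get(v, set())
        if completes_clique and len(shared) >= min_shared_terms:
            predictions.append((u, v))
    return predictions


# Placeholder proteins and annotation labels.
ppi = [("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P2", "P4"),
       ("P3", "P4"), ("P5", "P3"), ("P5", "P4")]
annotations = {"P1": {"term_a"}, "P2": {"term_a"}, "P3": {"term_a"},
               "P4": {"term_a"}, "P5": {"term_b"}}
print(predict_interactions(ppi, annotations))  # [('P1', 'P2')]
```

    The pairs involving P5 pass the topology rule but are rejected by the annotation rule, which is the kind of false-positive control the GO filter is meant to provide.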

  14. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    Directory of Open Access Journals (Sweden)

    Ujjwal Maulik

    Full Text Available Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from biological datasets. At first, a novel statistical strategy is utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post

  15. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    Science.gov (United States)

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from biological datasets. At first, a novel statistical strategy is utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data
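
    The difference between frequent itemsets and frequent closed itemsets, which the abstract credits for the savings, can be seen in a small brute-force sketch. It is generic closed-itemset mining on already discretized data, not the StatBicRM procedure; the function name, the encoding of gene states as strings such as "g1+", and the toy samples are assumptions.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for frequent itemsets that are closed.

    transactions -- list of sets, one per sample, holding discretized gene
                    states (hypothetical labels such as "g1+" or "g3-")
    An itemset is closed if no proper superset has the same support.
    Brute force over all subsets, so only suitable for tiny examples.
    """
    items = sorted(set().union(*transactions))
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                support[frozenset(candidate)] = count

    # A superset with equal support is itself frequent, so it is in `support`.
    return {
        itemset: count
        for itemset, count in support.items()
        if not any(itemset < other and count == support[other]
                   for other in support)
    }


samples = [{"g1+", "g2+", "g3-"}, {"g1+", "g2+"}, {"g1+", "g3-"}]
closed = closed_frequent_itemsets(samples, min_support=2)
```

    In this toy case five itemsets are frequent but only three are closed, because "g2+" and "g3-" never occur without "g1+"; rules are then derived from the smaller closed set.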

  16. Genome Wide Association Study of SNP-, Gene-, and Pathway-based Approaches to Identify Genes Influencing Susceptibility to Staphylococcus aureus Infections

    Directory of Open Access Journals (Sweden)

    Zhan eYe

    2014-05-01

    Full Text Available Background: We conducted a genome-wide association study (GWAS) to identify specific genetic variants that underlie susceptibility to disease caused by Staphylococcus aureus in humans. Methods: Cases (n=309) and controls (n=2,925) were genotyped at 508,921 single nucleotide polymorphisms (SNPs). Cases had at least one laboratory- and clinician-confirmed disease caused by S. aureus whereas controls did not. R-package (for SNP association), EIGENSOFT (to estimate and adjust for population stratification), and gene-based (VEGAS) and pathway-based (DAVID, PANTHER, and Ingenuity Pathway Analysis) analyses were performed. Results: No SNP reached genome-wide significance. Four SNPs exceeded the p [...] Conclusion: We identified potential susceptibility genes for S. aureus diseases in this preliminary study but confirmation by other studies is needed. The observed associations could be relevant given the complexity of S. aureus as a pathogen and its ability to exploit multiple biological pathways to cause infections in humans.

  17. [Phylogenetic analysis of closely related Leuconostoc citreum species based on partial housekeeping genes].

    Science.gov (United States)

    Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong

    2013-07-04

    Using the 16S rRNA, dnaA, murC and pyrG gene sequences, we identified the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR amplification of the dnaA, murC and pyrG gene sequences, which were determined to assess their suitability as phylogenetic markers. Then, we estimated the genetic distances and constructed phylogenetic trees for the 16S rRNA gene and the three above-mentioned housekeeping genes, combined with the corresponding published sequences. Comparison of the phylogenetic trees showed that the topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene tree. The homology among closely related Leu. citreum species differed for the dnaA, murC, pyrG and 16S rRNA gene sequences, ranging from 75.5% to 97.2%, 50.2% to 99.7%, 65.0% to 99.8% and 98.5% to 100%, respectively. The phylogenetic relationships based on the three housekeeping gene sequences were highly consistent with the results for the 16S rRNA gene sequence, while the genetic distances of these housekeeping genes were much higher than those of the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.
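
    Percentages of sequence similarity and genetic distances of the kind reported above are derived from pairwise comparisons of aligned sequences. The sketch below shows one simple way to compute them, assuming the sequences are already aligned to equal length and that gap positions are skipped; the abstract does not say which distance model was used, so the uncorrected p-distance here is an illustrative choice and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical bases over aligned, gap-free positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def p_distance(seq_a, seq_b):
    """Uncorrected genetic distance: proportion of differing positions."""
    return 1.0 - percent_identity(seq_a, seq_b) / 100.0


# Toy aligned fragments, not real dnaA, murC, pyrG or 16S rRNA sequences.
print(percent_identity("ATGC-A", "ATGTTA"))  # 80.0
```

    A conserved marker such as the 16S rRNA gene gives identities close to 100% between close relatives, whereas faster-evolving housekeeping genes give larger distances and therefore more resolution.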

  18. Chronic obstructive pulmonary disease candidate gene prioritization based on metabolic networks and functional information.

    Directory of Open Access Journals (Sweden)

    Xinyan Wang

    Full Text Available Chronic obstructive pulmonary disease (COPD) is a multi-factor disease in which metabolic disturbances play important roles. In this paper, functional information was integrated into a COPD-related metabolic network to assess similarity between genes. Then a gene prioritization method was applied to the COPD-related metabolic network to prioritize COPD candidate genes. The gene prioritization method was superior to ToppGene and ToppNet in both literature validation and functional enrichment analysis. Top-ranked genes prioritized from the metabolic perspective with functional information could promote a better understanding of the molecular mechanism of this disease. The top 100 genes might be potential markers for diagnosis and effective therapies.

  19. Accurate, model-based tuning of synthetic gene expression using introns in S. cerevisiae.

    Directory of Open Access Journals (Sweden)

    Ido Yofe

    2014-06-01

    Full Text Available Introns are key regulators of eukaryotic gene expression and present a potentially powerful tool for the design of synthetic eukaryotic gene expression systems. However, intronic control over gene expression is governed by a multitude of complex, incompletely understood regulatory mechanisms. Despite this lack of detailed mechanistic understanding, here we show how a relatively simple model enables accurate and predictable tuning of a synthetic gene expression system in yeast using several predictive intron features such as transcript folding and sequence motifs. Using only natural Saccharomyces cerevisiae introns as regulators, we demonstrate fine and accurate control over gene expression spanning a 100-fold expression range. These results broaden the engineering toolbox of synthetic gene expression systems and provide a framework in which precise and robust tuning of gene expression is accomplished.

  20. Liposome-based DNA carriers may induce cellular stress response and change gene expression pattern in transfected cells

    Science.gov (United States)

    2011-01-01

    Background During functional studies on the rat stress-inducible Hspa1b (hsp70.1) gene we noticed that some liposome-based DNA carriers, which are used for transfection, induce its promoter activity. This observation concerned commercial liposome formulations (LA), Lipofectin and Lipofectamine 2000. This work was aimed to understand better the mechanism of this phenomenon and its potential biological and practical consequences. Results We found that a reporter gene driven by Hspa1b promoter is activated both in the case of transient transfections and in the stably transfected cells treated with LA. Using several deletion clones containing different fragments of Hspa1b promoter, we found that the regulatory elements responsible for most efficient LA-driven inducibility were located between nucleotides -269 and +85, relative to the transcription start site. Further studies showed that the induction mechanism was independent of the classical HSE-HSF interaction that is responsible for gene activation during heat stress. Using DNA microarrays we also detected significant activation of the endogenous Hspa1b gene in cells treated with Lipofectamine 2000. Several other stress genes were also induced, along with numerous genes involved in cellular metabolism, cell cycle control and pro-apoptotic pathways. Conclusions Our observations suggest that i) some cationic liposomes may not be suitable for functional studies on hsp promoters, ii) lipofection may cause unintended changes in global gene expression in the transfected cells. PMID:21663599

  1. Liposome-based DNA carriers may induce cellular stress response and change gene expression pattern in transfected cells

    Directory of Open Access Journals (Sweden)

    Lisowska Katarzyna Marta

    2011-06-01

    Full Text Available Abstract Background During functional studies on the rat stress-inducible Hspa1b (hsp70.1) gene we noticed that some liposome-based DNA carriers, which are used for transfection, induce its promoter activity. This observation concerned commercial liposome formulations (LA), Lipofectin and Lipofectamine 2000. This work was aimed to understand better the mechanism of this phenomenon and its potential biological and practical consequences. Results We found that a reporter gene driven by Hspa1b promoter is activated both in the case of transient transfections and in the stably transfected cells treated with LA. Using several deletion clones containing different fragments of Hspa1b promoter, we found that the regulatory elements responsible for most efficient LA-driven inducibility were located between nucleotides -269 and +85, relative to the transcription start site. Further studies showed that the induction mechanism was independent of the classical HSE-HSF interaction that is responsible for gene activation during heat stress. Using DNA microarrays we also detected significant activation of the endogenous Hspa1b gene in cells treated with Lipofectamine 2000. Several other stress genes were also induced, along with numerous genes involved in cellular metabolism, cell cycle control and pro-apoptotic pathways. Conclusions Our observations suggest that i) some cationic liposomes may not be suitable for functional studies on hsp promoters, ii) lipofection may cause unintended changes in global gene expression in the transfected cells.

  2. dictyExpress: a Dictyostelium discoideum gene expression database with an explorative data analysis web-based interface

    Science.gov (United States)

    Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz

    2009-01-01

    Background Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. Results We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. Conclusion dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms. PMID:19706156

  3. Multiple-endpoints gene alteration-based (MEGA) assay: A toxicogenomics approach for water quality assessment of wastewater effluents.

    Science.gov (United States)

    Fukushima, Toshikazu; Hara-Yamamura, Hiroe; Nakashima, Koji; Tan, Lea Chua; Okabe, Satoshi

    2017-12-01

    Wastewater effluents contain a significant number of toxic contaminants, which, even at low concentrations, display a wide variety of toxic actions. In this study, we developed a multiple-endpoints gene alteration-based (MEGA) assay, a real-time PCR-based transcriptomic analysis, to assess the water quality of wastewater effluents for human health risk assessment and management. Twenty-one genes from the human hepatoblastoma cell line (HepG2), covering the basic health-relevant stress responses such as response to xenobiotics, genotoxicity, and cytotoxicity, were selected and incorporated into the MEGA assay. The genes related to the p53-mediated DNA damage response and cytochrome P450 were selected as markers for genotoxicity and response to xenobiotics, respectively. Additionally, the genes that were dose-dependently regulated by exposure to the wastewater effluents were chosen as markers for cytotoxicity. The alterations in the expression of an individual gene, induced by exposure to the wastewater effluents, were evaluated by real-time PCR and the results were validated by genotoxicity (e.g., comet assay) and cell-based cytotoxicity tests. In summary, the MEGA assay is a real-time PCR-based assay that targets cellular responses to contaminants present in wastewater effluents at the transcriptional level; it is rapid, cost-effective, and high-throughput and can thus complement any chemical analysis for water quality assessment and management.

  4. Cloning of low dose radiation induced gene RIG1 by RACE based on non-cloned cDNA library

    International Nuclear Information System (INIS)

    Luo Ying; Sui Jianli; Tie Yi; Zhang Yuanping; Zhou Pingkun; Sun Zhixian

    2001-01-01

    Objective: To obtain the full-length cDNA of the new radiation-induced gene RIG1 based on its EST fragment. Methods: Based on a non-cloned cDNA library, enhanced nested RACE PCR and biotin-avidin labelled probes for magnetic bead purification were used to obtain the full-length cDNA of RIG1. Results: About 1 kb of the 3' end of the RIG1 gene was successfully cloned by this set of methods, and cloning of the RIG1 5' end is proceeding well. Conclusion: The result is consistent with the experimental design. This set of protocols is useful for cloning full-length genes based on EST fragments.

  5. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Science.gov (United States)

    Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

    2015-01-01

    Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.

  6. Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.

  7. Genealogy-based methods for inference of historical recombination and gene flow and their application in Saccharomyces cerevisiae.

    Science.gov (United States)

    Jenkins, Paul A; Song, Yun S; Brem, Rachel B

    2012-01-01

    Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.

  8. iSyTE 2.0: a database for expression-based gene discovery in the eye

    Science.gov (United States)

    Kakrana, Atul; Yang, Andrian; Anand, Deepti; Djordjevic, Djordje; Ramachandruni, Deepti; Singh, Abhyudai; Huang, Hongzhan

    2018-01-01

    Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.bioinformatics.udel.edu/iSyTE) based on well-curated, comprehensive genome-level lens expression data as a one-stop portal for the effective visualization and analysis of candidate genes in lens development and disease. iSyTE 2.0 includes all publicly available lens Affymetrix and Illumina microarray datasets representing a broad range of embryonic and postnatal stages from wild-type and specific gene-perturbation mouse mutants with eye defects. Further, we developed a new user-friendly web interface for direct access and cogent visualization of the curated expression data, which supports convenient searches and a range of downstream analyses. The utility of these new iSyTE 2.0 features is illustrated through examples of established genes associated with lens development and pathobiology, which serve as tutorials for its application by the end-user. iSyTE 2.0 will facilitate the prioritization of eye development and disease-linked candidate genes in studies involving transcriptomics or next-generation sequencing data, linkage analysis and GWAS approaches. PMID:29036527

  9. RNAi-based therapeutic nanostrategy: IL-8 gene silencing in pancreatic cancer cells using gold nanorods delivery vehicles

    International Nuclear Information System (INIS)

    Panwar, Nishtha; Yang, Chengbin; Yin, Feng; Chuan, Tjin Swee; Yong, Ken-Tye; Yoon, Ho Sup

    2015-01-01

    RNA interference (RNAi)-based gene silencing possesses great ability for therapeutic intervention in pancreatic cancer. Among various oncogene mutations, Interleukin-8 (IL-8) gene mutations are found to be overexpressed in many pancreatic cell lines. In this work, we demonstrate IL-8 gene silencing by employing an RNAi-based gene therapy approach and this is achieved by using gold nanorods (AuNRs) for efficient delivery of IL-8 small interfering RNA (siRNA) to the pancreatic cell lines of MiaPaCa-2 and Panc-1. Upon comparing to Panc-1 cells, we found that the dominant expression of the IL-8 gene in MiaPaCa-2 cells resulted in an aggressive behavior towards the processes of cell invasion and metastasis. We have hence investigated the suitability of using AuNRs as novel non-viral nanocarriers for the efficient uptake and delivery of IL-8 siRNA in realizing gene knockdown of both MiaPaCa-2 and Panc-1 cells. Flow cytometry and fluorescence imaging techniques have been applied to confirm transfection and release of IL-8 siRNA. The ratio of AuNRs and siRNA has been optimized and transfection efficiencies as high as 88.40 ± 2.14% have been achieved. Upon successful delivery of IL-8 siRNA into cancer cells, the effects of IL-8 gene knockdown are quantified in terms of gene expression, cell invasion, cell migration and cell apoptosis assays. Statistical comparative studies for both MiaPaCa-2 and Panc-1 cells are presented in this work. IL-8 gene silencing has been demonstrated with knockdown efficiencies of 81.02 ± 10.14% and 75.73 ± 6.41% in MiaPaCa-2 and Panc-1 cells, respectively. Our results are then compared with a commercial transfection reagent, Oligofectamine, serving as positive control. The gene knockdown results illustrate the potential role of AuNRs as non-viral gene delivery vehicles for RNAi-based targeted cancer therapy applications. (paper)

  10. A Shortest-Path-Based Method for the Analysis and Prediction of Fruit-Related Genes in Arabidopsis thaliana.

    Science.gov (United States)

    Zhu, Liucun; Zhang, Yu-Hang; Su, Fangchu; Chen, Lei; Huang, Tao; Cai, Yu-Dong

    2016-01-01

    Biologically, fruits are defined as seed-bearing reproductive structures in angiosperms that develop from the ovary. The fertilization, development and maturation of fruits are crucial for plant reproduction and are precisely regulated by intrinsic genetic regulatory factors. In this study, we used Arabidopsis thaliana as a model organism and attempted to identify novel genes related to fruit-associated biological processes. Specifically, using validated genes, we applied a shortest-path-based method to identify several novel genes in a large network constructed using the protein-protein interactions observed in Arabidopsis thaliana. The described analyses indicate that several of the discovered genes are associated with fruit fertilization, development and maturation in Arabidopsis thaliana.
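
The shortest-path idea described in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted, undirected interaction network and uses breadth-first search; the abstract does not say whether edges were weighted or which search algorithm the authors used, and the toy network, the gene labels, and the helper names `shortest_path` and `candidate_genes` are all hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path from source to target (BFS), or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def candidate_genes(graph, known_genes):
    """Count how often each non-seed node lies on a shortest path between seeds."""
    counts = {}
    for a, b in combinations(sorted(known_genes), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for node in path[1:-1]:  # interior nodes only
            if node not in known_genes:
                counts[node] = counts.get(node, 0) + 1
    return counts


# Hypothetical toy network stored as adjacency lists (not real data).
toy_network = {
    "G1": ["X"],
    "X": ["G1", "G2", "G3"],
    "G2": ["X", "Y"],
    "Y": ["G2", "G3"],
    "G3": ["X", "Y"],
}
seeds = {"G1", "G2", "G3"}  # stand-ins for validated fruit-related genes
result = candidate_genes(toy_network, seeds)
print(result)
```

In this toy network the node `X` sits on every shortest path between pairs of seed genes, so it would be the only candidate reported; a real analysis would also need some form of statistical filtering, which this sketch omits.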

  11. Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics.

    Science.gov (United States)

    de Angelis, Martin Hrabě; Nicholson, George; Selloum, Mohammed; White, Jacqui; Morgan, Hugh; Ramirez-Solis, Ramiro; Sorg, Tania; Wells, Sara; Fuchs, Helmut; Fray, Martin; Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl Mj; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; 
Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie; Holmes, Chris; Steel, Karen P; Herault, Yann; Gailus-Durner, Valérie; Mallon, Ann-Marie; Brown, Steve Dm

    2015-09-01

    The function of the majority of genes in the mouse and human genomes remains unknown. The mouse embryonic stem cell knockout resource provides a basis for the characterization of relationships between genes and phenotypes. The EUMODIC consortium developed and validated robust methodologies for the broad-based phenotyping of knockouts through a pipeline comprising 20 disease-oriented platforms. We developed new statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no previous functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. New phenotypes were uncovered for many genes with previously unknown function, providing a powerful basis for hypothesis generation and further investigation in diverse systems.

  12. Heterologous Reconstitution of the Intact Geodin Gene Cluster in Aspergillus nidulans through a Simple and Versatile PCR Based Approach

    DEFF Research Database (Denmark)

    Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere

    2013-01-01

    ...of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor...

  13. The force analysis for superparamagnetic nanoparticles-based gene delivery in an oscillating magnetic field

    Energy Technology Data Exchange (ETDEWEB)

    Sun, Jiajia [State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, No. 28 Xianning West Road, Xi’an, Shaanxi Province 710049 (China); Shi, Zongqian, E-mail: zqshi@mail.xjtu.edu.cn [State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, No. 28 Xianning West Road, Xi’an, Shaanxi Province 710049 (China); Jia, Shenli [State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, No. 28 Xianning West Road, Xi’an, Shaanxi Province 710049 (China); Zhang, Pengbo [Department of Anesthesiology, Second Affiliated Hospital of Xi’an Jiaotong University School of Medicine, No.157 West 5 Road, Xi’an, Shaanxi Province 710004 (China)

    2017-04-01

    Owing to their peculiar magnetic properties and their ability to function in cell-level biological interactions, superparamagnetic nanoparticles (SMNP) have become an attractive carrier for gene delivery. Superparamagnetic nanoparticles with a surface-bound gene vector can be attracted to the surface of cells by the Kelvin force provided by an external magnetic field. In this article, the influence of an oscillating magnetic field on the characteristics of magnetofection is studied in terms of the magnetophoretic velocity. The magnetic field of a cylindrical permanent magnet is calculated by the equivalent current source (ECS) method, and the Kelvin force is derived by using the effective moment method. The results show that the static magnetic field accelerates the sedimentation of the particles, and drives the particles inward towards the axis of the magnet. Based on the investigation of the magnetophoretic velocity of the particle under a horizontally oscillating magnetic field, an oscillating velocity within the amplitude of the magnet oscillation is observed. Furthermore, simulation results indicate that the oscillating amplitude plays an important role in regulating the active region, where the particles may present oscillating motion. The analysis of the magnetophoretic velocity gives insight into the physical mechanism of magnetofection. It is also helpful for the optimal design of the magnetofection system. - Highlights: • We compare the results of the ECS method and FEA method with the commercial software, Ansys. • We analyze the physical mechanism of the oscillating motion of the particles in the presence of an oscillating magnet. • We discuss the influence of the oscillating amplitude of the magnet on the behavior of the particle.

  14. Classification based upon gene expression data: bias and precision of error rates.

    Science.gov (United States)

    Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L

    2007-06-01

    Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
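
The two-level (nested) cross-validation recommended in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' R/PAMR code: the one-feature threshold classifier, the tunable `margin`, the fold counts, and the toy data are all hypothetical stand-ins, and class 1 is assumed to have the larger feature values. The point it shows is that the tuning parameter is chosen inside each outer training set, so the outer test fold never influences the choice.

```python
import random


def k_folds(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(xs, ys, margin):
    """Toy classifier: midpoint of the two class means, shifted by margin."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / max(1, ys.count(0))
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / max(1, ys.count(1))
    return (mean0 + mean1) / 2 + margin


def error_rate(threshold, xs, ys):
    """Fraction of samples misclassified when predicting class 1 if x > threshold."""
    wrong = sum((x > threshold) != (y == 1) for x, y in zip(xs, ys))
    return wrong / len(ys)


def split(xs, ys, fold):
    """Return (train_x, train_y, test_x, test_y) for one held-out fold."""
    held = set(fold)
    tr_x = [x for i, x in enumerate(xs) if i not in held]
    tr_y = [y for i, y in enumerate(ys) if i not in held]
    return tr_x, tr_y, [xs[i] for i in fold], [ys[i] for i in fold]


def cv_error(xs, ys, margin, k, rng):
    """Single-level k-fold CV error for one fixed margin."""
    errors = []
    for fold in k_folds(len(xs), k, rng):
        tr_x, tr_y, te_x, te_y = split(xs, ys, fold)
        errors.append(error_rate(fit_threshold(tr_x, tr_y, margin), te_x, te_y))
    return sum(errors) / len(errors)


def nested_cv_error(xs, ys, margins, outer_k=5, inner_k=4, seed=0):
    """Two-level CV: tune margin on inner folds, estimate error on outer folds."""
    rng = random.Random(seed)
    outer_errors = []
    for fold in k_folds(len(xs), outer_k, rng):
        tr_x, tr_y, te_x, te_y = split(xs, ys, fold)
        # Inner loop sees the outer training data only.
        best = min(margins, key=lambda m: cv_error(tr_x, tr_y, m, inner_k, rng))
        outer_errors.append(error_rate(fit_threshold(tr_x, tr_y, best), te_x, te_y))
    return sum(outer_errors) / len(outer_errors)


# Hypothetical, well-separated toy data: 20 samples per class.
toy_x = [i * 0.1 for i in range(20)] + [5 + i * 0.1 for i in range(20)]
toy_y = [0] * 20 + [1] * 20
estimate = nested_cv_error(toy_x, toy_y, margins=[-0.5, 0.0, 0.5])
print(estimate)
```

Reporting the minimum of `cv_error` over all margins on the full dataset would correspond to the optimization bias the abstract warns about; the outer loop here is what removes it.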

  15. In Silico Analysis of Microarray-Based Gene Expression Profiles Predicts Tumor Cell Response to Withanolides

    Directory of Open Access Journals (Sweden)

    Thomas Efferth

    2012-05-01

    Withania somnifera (L.) Dunal (Indian ginseng, winter cherry, Solanaceae) is widely used in traditional medicine. Roots are either chewed or used to prepare beverages (aqueous decocts). The major secondary metabolites of Withania somnifera are the withanolides, which are C-28-steroidal lactone triterpenoids. Withania somnifera extracts exert chemopreventive and anticancer activities in vitro and in vivo. The aims of the present in silico study were, firstly, to investigate whether tumor cells develop cross-resistance between standard anticancer drugs and withanolides and, secondly, to elucidate the molecular determinants of sensitivity and resistance of tumor cells towards withanolides. Using IC50 concentrations of eight different withanolides (withaferin A, withaferin A diacetate, 3-azerininylwithaferin A, withafastuosin D diacetate, 4-B-hydroxy-withanolide E, isowithanololide E, withafastuosin E, and withaperuvin) and 19 established anticancer drugs, we analyzed the cross-resistance profile of 60 tumor cell lines. The cell lines revealed cross-resistance between the eight withanolides. Consistent cross-resistance between withanolides and nitrosoureas (carmustin, lomustin, and semimustin) was also observed. Then, we performed transcriptomic microarray-based COMPARE and hierarchical cluster analyses of mRNA expression to identify mRNA expression profiles predicting sensitivity or resistance towards withanolides. Genes from diverse functional groups were significantly associated with response of tumor cells to withaferin A diacetate, e.g. genes functioning in DNA damage and repair, stress response, cell growth regulation, extracellular matrix components, cell adhesion and cell migration, constituents of the ribosome, cytoskeletal organization and regulation, signal transduction, transcription factors, and others.

  16. An efficient model for auxiliary diagnosis of hepatocellular carcinoma based on gene expression programming.

    Science.gov (United States)

    Zhang, Li; Chen, Jiasheng; Gao, Chunming; Liu, Chuanmiao; Xu, Kuihua

    2018-03-16

    Hepatocellular carcinoma (HCC) is a leading cause of cancer-related death worldwide. Early diagnosis of HCC greatly helps to achieve long-term disease-free survival. However, HCC is usually difficult to diagnose at an early stage. The aim of this study was to create a prediction model to diagnose HCC based on gene expression programming (GEP). GEP is an evolutionary algorithm and a domain-independent problem-solving technique. Clinical data show that six serum biomarkers, including gamma-glutamyl transferase, C-reactive protein, carcinoembryonic antigen, alpha-fetoprotein, carbohydrate antigen 153, and carbohydrate antigen 199, are related to HCC characteristics. In this study, the prediction of HCC was made based on these six biomarkers (195 HCC patients and 215 non-HCC controls) by setting up optimal joint models with GEP. The GEP model discriminated 353 out of 410 subjects, representing a determination coefficient of 86.28% (283/328) and 85.37% (70/82) for training and test sets, respectively. Compared to the results from the support vector machine, the artificial neural network, and the multilayer perceptron, GEP showed a better outcome. The results suggested that GEP modeling was a promising and excellent tool in diagnosis of hepatocellular carcinoma, and it could be widely used in HCC auxiliary diagnosis. Graphical abstract: The process to establish an efficient model for auxiliary diagnosis of hepatocellular carcinoma.
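
The percentages quoted in the abstract above can be checked directly from the counts it reports. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the pooled figure is a derived quantity that the abstract itself does not state.

```python
# Counts taken from the abstract: correctly classified / total subjects.
train_correct, train_total = 283, 328
test_correct, test_total = 70, 82

# Per-set proportion of correctly classified subjects, in percent.
train_pct = round(100 * train_correct / train_total, 2)
test_pct = round(100 * test_correct / test_total, 2)

# Pooled figure over both sets (derived here, not reported in the abstract).
all_correct = train_correct + test_correct
all_total = train_total + test_total
pooled_pct = round(100 * all_correct / all_total, 2)

print(train_pct, test_pct, all_correct, all_total, pooled_pct)
```

The totals agree with the 353 of 410 subjects mentioned in the text, and 410 also matches the 195 patients plus 215 controls.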

  17. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators of protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of the different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. The integration of weighted gene association networks based on information entropy.

    Science.gov (United States)

    Yang, Fan; Wu, Duzhi; Lin, Limei; Yang, Jian; Yang, Tinghong; Zhao, Jing

    2017-01-01

    Constructing genome-scale weighted gene association networks (WGAN) from multiple data sources is one of the research hot spots in systems biology. In this paper, we employ information entropy to describe the degree of uncertainty of gene-gene links and propose a strategy for data integration of weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biological information while still keeping high functional relevance between linked gene pairs. The new WGAN shows satisfactory performance in disease gene prediction, which suggests the reliability of our integration strategy. Compared with existing integration methods, our method takes advantage of the inherent characteristics of the component networks and pays less attention to the biological background of the data. It can make full use of existing biological networks with low computational effort.
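
The abstract above does not give the entropy formula or the integration rule, so the following is only a hedged illustration of the general idea. It assumes each link weight can be read as the probability that the link is real, measures its uncertainty with the standard binary (Shannon) entropy, and then combines weights from several networks with a hypothetical certainty-weighted average. The function `integrate` is an assumption made for illustration and should not be taken as the authors' method.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a link that exists with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain link or a certain non-link has no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def integrate(weights):
    """Hypothetical rule: average the weights, favouring low-entropy ones."""
    certainties = [1.0 - binary_entropy(w) for w in weights]
    total = sum(certainties)
    if total == 0.0:
        # Every source is maximally uncertain: fall back to a plain mean.
        return sum(weights) / len(weights)
    return sum(c * w for c, w in zip(certainties, weights)) / total


# Illustrative weights for one gene pair reported by two networks.
print(binary_entropy(0.5))    # maximal uncertainty
print(integrate([0.5, 0.9]))  # the 0.5 weight carries no certainty
```

Under this assumed rule a weight of 0.5 contributes nothing, because it is as uncertain as a link weight can be, while weights near 0 or 1 dominate the combined score.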

  19. GGDonto ontology as a knowledge-base for genetic diseases and disorders of glycan metabolism and their causative genes.

    Science.gov (United States)

    Solovieva, Elena; Shikanai, Toshihide; Fujita, Noriaki; Narimatsu, Hisashi

    2018-04-18

    Inherited mutations in glyco-related genes can affect the biosynthesis and degradation of glycans and result in severe genetic diseases and disorders. The Glyco-Disease Genes Database (GDGDB), which provides information about these diseases and disorders as well as their causative genes, has been developed by the Research Center for Medical Glycoscience (RCMG) and released in April 2010. GDGDB currently provides information on about 80 genetic diseases and disorders caused by single-gene mutations in glyco-related genes. Many biomedical resources provide information about genetic disorders and genes involved in their pathogenesis, but resources focused on genetic disorders known to be related to glycan metabolism are lacking. With the aim of providing more comprehensive knowledge on genetic diseases and disorders of glycan biosynthesis and degradation, we enriched the content of the GDGDB database and improved the methods for data representation. We developed the Genetic Glyco-Diseases Ontology (GGDonto) and a RDF/SPARQL-based user interface using Semantic Web technologies. In particular, we represented the GGDonto content using Semantic Web languages, such as RDF, RDFS, SKOS, and OWL, and created an interactive user interface based on SPARQL queries. This user interface provides features to browse the hierarchy of the ontology, view detailed information on diseases and related genes, and find relevant background information. Moreover, it provides the ability to filter and search information by faceted and keyword searches. Focused on the molecular etiology, pathogenesis, and clinical manifestations of genetic diseases and disorders of glycan metabolism and developed as a knowledge-base for this scientific field, GGDonto provides comprehensive information on various topics, including links to aid the integration with other scientific resources. The availability and accessibility of this knowledge will help users better understand how genetic defects impact the

  20. Sex-based differences in gene expression in hippocampus following postnatal lead exposure

    International Nuclear Information System (INIS)

    Schneider, J.S.; Anderson, D.W.; Sonnenahalli, H.; Vadigepalli, R.

    2011-01-01

    The influence of sex as an effect modifier of childhood lead poisoning has received little systematic attention. Considering the paucity of information available concerning the interactive effects of lead and sex on the brain, the current study examined the interactive effects of lead and sex on gene expression patterns in the hippocampus, a structure involved in learning and memory. Male or female rats were fed either 1500 ppm lead-containing chow or control chow for 30 days beginning at weaning. Blood lead levels were 26.7 ± 2.1 μg/dl and 27.1 ± 1.7 μg/dl for females and males, respectively. The expression of 175 unique genes was differentially regulated between control male and female rats. A total of 167 unique genes were differentially expressed in response to lead in either males or females. Lead exposure had a significant effect without a significant difference between male and female responses in 77 of these genes. In another set of 71 genes, there were significant differences in male vs. female response. A third set of 30 genes was differentially expressed in opposite directions in males vs. females, with the majority of genes expressed at a lower level in females than in males. Highly differentially expressed genes in males and females following lead exposure were associated with diverse biological pathways and functions. These results show that a brief exposure to lead produced significant changes in expression of a variety of genes in the hippocampus and that the response of the brain to a given lead exposure may vary depending on sex. - Highlights: → Postnatal lead exposure has a significant effect on hippocampal gene expression patterns. → At least one set of genes was affected in opposite directions in males and females. → Differentially expressed genes were associated with diverse biological pathways.

  1. Microarray-based genomic surveying of gene polymorphisms in Chlamydia trachomatis

    OpenAIRE

    Brunelle, Brian W; Nicholson, Tracy L; Stephens, Richard S

    2004-01-01

    By comparing two fully sequenced genomes of Chlamydia trachomatis using competitive hybridization on DNA microarrays, a logarithmic correlation was demonstrated between the signal ratio of the arrays and the 75-99% range of nucleotide identities of the genes. Variable genes within 14 uncharacterized strains of C. trachomatis were identified by array analysis and verified by DNA sequencing. These genes may be crucial for understanding chlamydial virulence and pathogenesis.

  2. An efficient nonviral gene-delivery vector based on hyperbranched cationic glycogen derivatives

    Directory of Open Access Journals (Sweden)

    Liang X

    2014-01-01

    Xuan Liang,1,* Xianyue Ren,2,* Zhenzhen Liu,1 Yingliang Liu,1 Jue Wang,2 Jingnan Wang,2 Li-Ming Zhang,1 David YB Deng,2 Daping Quan,1 Liqun Yang1 1Institute of Polymer Science, School of Chemistry and Chemical Engineering, Key Laboratory of Designed Synthesis and Application of Polymer Material, Key Laboratory for Polymeric Composite and Functional Materials of Ministry of Education, Sun Yat-Sen University, Guangzhou, People's Republic of China; 2Research Center of Translational Medicine, The First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, People's Republic of China *Both these authors contributed equally to this work Background: The purpose of this study was to synthesize and evaluate hyperbranched cationic glycogen derivatives as an efficient nonviral gene-delivery vector. Methods: A series of hyperbranched cationic glycogen derivatives conjugated with 3-(dimethylamino)-1-propylamine (DMAPA-Glyp) and 1-(2-aminoethyl)piperazine (AEPZ-Glyp) residues were synthesized and characterized by Fourier-transform infrared and hydrogen-1 nuclear magnetic resonance spectroscopy. Their buffer capacity was assessed by acid–base titration in aqueous NaCl solution. Plasmid deoxyribonucleic acid (pDNA) condensation ability and protection against DNase I degradation of the glycogen derivatives were assessed using agarose gel electrophoresis. The zeta potentials and particle sizes of the glycogen derivative/pDNA complexes were measured, and the images of the complexes were observed using atomic force microscopy. Blood compatibility and cytotoxicity were evaluated by hemolysis assay and MTT (3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyltetrazolium bromide) assay, respectively. pDNA transfection efficiency mediated by the cationic glycogen derivatives was evaluated by flow cytometry and fluorescence microscopy in the 293T (human embryonic kidney) and the CNE2 (human nasopharyngeal carcinoma) cell lines. In vivo delivery of pDNA in model animals (Sprague Dawley

  3. Quantitative multiplex quantum dot in-situ hybridisation based gene expression profiling in tissue microarrays identifies prognostic genes in acute myeloid leukaemia

    Energy Technology Data Exchange (ETDEWEB)

    Tholouli, Eleni [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); MacDermott, Sarah [The Medical School, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Hoyland, Judith [School of Biomedicine, Faculty of Medical and Human Sciences, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Yin, John Liu [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); Byers, Richard, E-mail: richard.byers@cmft.nhs.uk [School of Cancer and Enabling Sciences, Faculty of Medical and Human Sciences, The University of Manchester, Stopford Building, Oxford Road, M13 9PT Manchester (United Kingdom)

    2012-08-24

    Highlights: ► Development of a quantitative high throughput in situ expression profiling method. ► Application to a tissue microarray of 242 AML bone marrow samples. ► Identification of HOXA4, HOXA9, Meis1 and DNMT3A as prognostic markers in AML. -- Abstract: Measurement and validation of microarray gene signatures in routine clinical samples is problematic and a rate limiting step in translational research. In order to facilitate measurement of microarray identified gene signatures in routine clinical tissue a novel method combining quantum dot based oligonucleotide in situ hybridisation (QD-ISH) and post-hybridisation spectral image analysis was used for multiplex in-situ transcript detection in archival bone marrow trephine samples from patients with acute myeloid leukaemia (AML). Tissue-microarrays were prepared into which white cell pellets were spiked as a standard. Tissue microarrays were made using routinely processed bone marrow trephines from 242 patients with AML. QD-ISH was performed for six candidate prognostic genes using triplex QD-ISH for DNMT1, DNMT3A, DNMT3B, and for HOXA4, HOXA9, Meis1. Scrambled oligonucleotides were used to correct for background staining followed by normalisation of expression against the expression values for the white cell pellet standard. Survival analysis demonstrated that low expression of HOXA4 was associated with poorer overall survival (p = 0.009), whilst high expression of HOXA9 (p < 0.0001), Meis1 (p = 0.005) and DNMT3A (p = 0.04) were associated with early treatment failure. These results demonstrate application of a standardised, quantitative multiplex QD-ISH method for identification of prognostic markers in formalin-fixed paraffin-embedded clinical samples, facilitating measurement of gene expression signatures in routine clinical samples.

  4. Heterologous reconstitution of the intact geodin gene cluster in Aspergillus nidulans through a simple and versatile PCR based approach.

    Directory of Open Access Journals (Sweden)

    Morten Thrane Nielsen

    Full Text Available Fungal natural products are a rich resource for bioactive molecules. To fully exploit this potential it is necessary to link genes to metabolites. Genetic information for numerous putative biosynthetic pathways has become available in recent years through genome sequencing. However, the lack of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor responsible for activation of the geodin gene cluster and ATEG_08460 (gedL) encodes a halogenase that catalyzes conversion of sulochrin to dihydrogeodin. We expect that our approach for transferring intact biosynthetic pathways to a fungus with a well developed genetic toolbox will be instrumental in characterizing the many exciting pathways for secondary metabolite production that are currently being uncovered by the fungal genome sequencing projects.

  5. Prediction of lymphatic metastasis based on gene expression profile analysis after brachytherapy for early-stage oral tongue carcinoma

    International Nuclear Information System (INIS)

    Watanabe, Hiroshi; Mogushi, Kaoru; Miura, Masahiko; Yoshimura, Ryo-ichi; Kurabayashi, Tohru; Shibuya, Hitoshi; Tanaka, Hiroshi; Noda, Shuhei; Iwakawa, Mayumi; Imai, Takashi

    2008-01-01

    Background and purpose: The management of lymphatic metastasis of early-stage oral tongue carcinoma patients is crucial for its prognosis. The purpose of this study was to evaluate the predictive ability of lymphatic metastasis after brachytherapy (BRT) for early-stage tongue carcinoma based on gene expression profiling. Patients and methods: Pre-therapeutic biopsies from 39 patients with T1 or T2 tongue cancer were analyzed for gene expression signatures using Codelink Uniset Human 20K Bioarray. All patients were treated with low dose-rate BRT for their primary lesions and underwent strict follow-up under a wait-and-see policy for cervical lymphatic metastasis. Candidate genes were selected for predicting lymph-node status in the reference group by the permutation test. Predictive accuracy was further evaluated by the prediction strength (PS) scoring system using an independent validation group. Results: We selected a set of 19 genes whose expression differed significantly between classes with or without lymphatic metastasis in the reference group. The lymph-node status in the validation group was predicted by the PS scoring system with an accuracy of 76%. Conclusions: Gene expression profiling using 19 genes in primary tumor tissues may allow prediction of lymphatic metastasis after BRT for early-stage oral tongue carcinoma

  6. PCR-based detection of resistance genes in anaerobic bacteria isolated from intra-abdominal infections.

    Science.gov (United States)

    Tran, Chau Minh; Tanaka, Kaori; Watanabe, Kunitomo

    2013-04-01

    Little information is available on the distribution of antimicrobial resistance genes in anaerobes in Japan. To understand the background of antimicrobial resistance in anaerobes involved in intra-abdominal infections, we investigated the distribution of eight antimicrobial resistance genes (cepA, cfiA, cfxA, ermF, ermB, mefA, tetQ, and nim) and a mutation in the gyrA gene in a total of 152 organisms (Bacteroides spp., Prevotella spp., Fusobacterium spp., Porphyromonas spp., Bilophila wadsworthia, Desulfovibrio desulfuricans, Veillonella spp., gram-positive cocci, and non-spore-forming gram-positive bacilli) isolated between 2003 and 2004 in Japan. The cepA gene was distributed primarily in Bacteroides fragilis. Gene cfxA was detected in about 9 % of the Bacteroides isolates and 75 % of the Prevotella spp. isolates and did not appear to contribute to cephamycin resistance. Two strains of B. fragilis contained the metallo-β-lactamase gene cfiA, but they did not produce the protein product. Gene tetQ was detected in about 81, 44, and 63 % of B. fragilis isolates, other Bacteroides spp., and Prevotella spp. isolates, respectively. The ermF gene was detected in 25, 13, 56, 64, and 16 % of Bacteroides spp., Prevotella spp., Fusobacterium spp., B. wadsworthia, and anaerobic cocci, respectively. Gene mefA was found in only 10 % of the B. fragilis strains and 3 % of the non-B. fragilis strains. Genes nim and ermB were not detected in any isolate. Substitution at position 82 (Ser to Phe) in gyrA was detected in B. fragilis isolates that were less susceptible or resistant to moxifloxacin. This study is the first report on the distribution of resistance genes in anaerobes isolated from intra-abdominal infections in Japan. We expect that the results might help in understanding the resistance mechanisms of specific anaerobes.

  7. Metagenomic-based study of the phylogenetic and functional gene diversity in Galápagos land and marine iguanas.

    Science.gov (United States)

    Hong, Pei-Ying; Mao, Yuejian; Ortiz-Kofoed, Shannon; Shah, Rushabh; Cann, Isaac; Mackie, Roderick I

    2015-02-01

    In this study, a metagenome-based analysis of the fecal samples from the macrophytic algae-consuming marine iguana (MI; Amblyrhynchus cristatus) and terrestrial biomass-consuming land iguanas (LI; Conolophus spp.) was conducted. Phylogenetic affiliations of the fecal microbiome were more similar between both iguanas than to other mammalian herbivorous hosts. However, functional gene diversities in both MI and LI iguana hosts differed in relation to the diet, where the MI fecal microbiota had a functional diversity that clustered apart from the other terrestrial-biomass consuming reptilian and mammalian hosts. A further examination of the carbohydrate-degrading genes revealed that several of the prevalent glycosyl hydrolases (GH), glycosyl transferases (GT), carbohydrate binding modules (CBM), and carbohydrate esterases (CE) gene classes were conserved among all examined herbivorous hosts, reiterating the important roles these genes play in the breakdown and metabolism of herbivorous diets. Genes encoding some classes of carbohydrate-degrading families, including GH2, GH13, GT2, GT4, CBM50, CBM48, CE4, and CE11, as well as genes associated with sulfur metabolism and dehalogenation, were highly enriched or unique to the MI. In contrast, gene sequences that relate to archaeal methanogenesis were detected only in LI fecal microbiome, and genes coding for GH13, GH66, GT2, GT4, CBM50, CBM13, CE4, and CE8 carbohydrate active enzymes were highly abundant in the LI. Bacterial populations were enriched on various carbohydrates substrates (e.g., glucose, arabinose, xylose). The majority of the enriched bacterial populations belong to genera Clostridium spp. and Enterococcus spp. that likely accounted for the high prevalence of GH13 and GH2, as well as the GT families (e.g., GT2, GT4, GT28, GT35, and GT51) that were ubiquitously present in the fecal microbiota of all herbivorous hosts.

  8. Genomic organization, annotation, and ligand-receptor inferences of chicken chemokines and chemokine receptor genes based on comparative genomics

    Directory of Open Access Journals (Sweden)

    Sze Sing-Hoi

    2005-03-01

    Full Text Available Abstract Background Chemokines and their receptors play important roles in host defense, organogenesis, hematopoiesis, and neuronal communication. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this report, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chicken chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. Results Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. All of the chicken chemokines contained a conserved CC, CXC, CX3C, or XC motif, whereas all the chemokine receptors had seven conserved transmembrane helices, four extracellular domains with a conserved cysteine, and a conserved DRYLAIV sequence in the second intracellular domain. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. Conclusion The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. All identified chicken chemokines and their cognate receptors were identified in the chicken genome except CCR9, whose ligand was not identified in this study. The organization of these genes suggests that there were a substantial number of these genes present before divergence between aves and mammals and more gene duplications of CC, CXC, CCR, and CXCR subfamilies in mammals than in aves after the divergence.

  9. Next generation sequencing based transcriptome analysis of septic-injury responsive genes in the beetle Tribolium castaneum.

    Directory of Open Access Journals (Sweden)

    Boran Altincicek

    Full Text Available Beetles (Coleoptera) are the most diverse animal group on earth and interact with numerous symbiotic or pathogenic microbes in their environments. The red flour beetle Tribolium castaneum is a genetically tractable model beetle species and its whole genome sequence has recently been determined. To advance our understanding of the molecular basis of beetle immunity here we analyzed the whole transcriptome of T. castaneum by high-throughput next generation sequencing technology. Here, we demonstrate that the Illumina/Solexa sequencing approach of cDNA samples from T. castaneum including over 9.7 million reads with 72 base pairs (bp) length (approximately 700 million bp sequence information with about 30× transcriptome coverage) confirms the expression of most predicted genes and enabled subsequent qualitative and quantitative transcriptome analysis. This approach recapitulates our recent quantitative real-time PCR studies of immune-challenged and naïve T. castaneum beetles, validating our approach. Furthermore, this sequencing analysis resulted in the identification of 73 differentially expressed genes upon immune-challenge with statistical significance by comparing expression data to calculated values derived by fitting to generalized linear models. We identified up regulation of diverse immune-related genes (e.g. Toll receptor, serine proteinases, DOPA decarboxylase and thaumatin) and of numerous genes encoding proteins with yet unknown functions. Of note, septic-injury resulted also in the elevated expression of genes encoding heat-shock proteins or cytochrome P450s supporting the view that there is crosstalk between immune and stress responses in T. castaneum. The present study provides a first comprehensive overview of septic-injury responsive genes in T. castaneum beetles. Identified genes advance our understanding of T. castaneum specific gene expression alteration upon immune-challenge in particular and may help to understand beetle immunity.
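
    The sequencing figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that coverage means total sequenced bases divided by transcriptome size, which the abstract does not state explicitly, and all variable names are hypothetical.

```python
# Figures quoted in the abstract.
reads = 9_700_000       # "over 9.7 million reads"
read_length_bp = 72     # read length in base pairs
coverage = 30           # "about 30x transcriptome coverage"

# Total sequence information produced by the run.
total_bp = reads * read_length_bp

# Assumption (not stated in the abstract): coverage = total bases / transcriptome size,
# so the implied transcriptome size is total bases / coverage.
implied_transcriptome_bp = total_bp / coverage

print(f"total sequence: {total_bp / 1e6:.1f} million bp")
print(f"implied transcriptome size: {implied_transcriptome_bp / 1e6:.1f} Mb")
```

    Under that assumption the read count and read length reproduce the "approximately 700 million bp" figure, and the quoted coverage would correspond to a transcriptome of a little over 20 Mb.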

  10. Metagenomic-Based Study of the Phylogenetic and Functional Gene Diversity in Galápagos Land and Marine Iguanas

    KAUST Repository

    Hong, Pei-Ying

    2014-12-19

    In this study, a metagenome-based analysis of the fecal samples from the macrophytic algae-consuming marine iguana (MI; Amblyrhynchus cristatus) and terrestrial biomass-consuming land iguanas (LI; Conolophus spp.) was conducted. Phylogenetic affiliations of the fecal microbiome were more similar between both iguanas than to other mammalian herbivorous hosts. However, functional gene diversities in both MI and LI iguana hosts differed in relation to the diet, where the MI fecal microbiota had a functional diversity that clustered apart from the other terrestrial-biomass consuming reptilian and mammalian hosts. A further examination of the carbohydrate-degrading genes revealed that several of the prevalent glycosyl hydrolases (GH), glycosyl transferases (GT), carbohydrate binding modules (CBM), and carbohydrate esterases (CE) gene classes were conserved among all examined herbivorous hosts, reiterating the important roles these genes play in the breakdown and metabolism of herbivorous diets. Genes encoding some classes of carbohydrate-degrading families, including GH2, GH13, GT2, GT4, CBM50, CBM48, CE4, and CE11, as well as genes associated with sulfur metabolism and dehalogenation, were highly enriched or unique to the MI. In contrast, gene sequences that relate to archaeal methanogenesis were detected only in LI fecal microbiome, and genes coding for GH13, GH66, GT2, GT4, CBM50, CBM13, CE4, and CE8 carbohydrate active enzymes were highly abundant in the LI. Bacterial populations were enriched on various carbohydrates substrates (e.g., glucose, arabinose, xylose). The majority of the enriched bacterial populations belong to genera Clostridium spp. and Enterococcus spp. that likely accounted for the high prevalence of GH13 and GH2, as well as the GT families (e.g., GT2, GT4, GT28, GT35, and GT51) that were ubiquitously present in the fecal microbiota of all herbivorous hosts.

  11. Junk DNA enhances pEI-based non-viral gene delivery

    NARCIS (Netherlands)

    Gaal, E.V.B. van; Oosting, R.S.; Hennink, W.E.; Crommelin, D.J.A.; Mastrobattista, E.

    Gene therapy aims at delivering exogenous DNA into the nuclei of target cells to establish expression of a therapeutic protein. Non-viral gene delivery is examined as a safer alternative to viral approaches, but is presently characterized by a low efficiency. In the past years several non-viral

  12. Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses

    NARCIS (Netherlands)

    Orr, J.L.; Back, W.; Gu, J.; Leegwater, P.H.; Govindarajan, P.; Conroy, J.; Ducro, B.J.; Arendonk, van J.A.M.

    2010-01-01

    The recent completion of the horse genome and commercial availability of an equine SNP genotyping array has facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of

  13. Effective Nanoparticle-based Gene Delivery by a Protease Triggered Charge Switch

    DEFF Research Database (Denmark)

    Gjetting, Torben; Jølck, Rasmus Irming; Andresen, Thomas Lars

    2014-01-01

    Gene carriers made from synthetic materials are of interest in relation to gene therapy but suffer from lack of transfection efficiency upon systemic delivery. To address this problem, a novel lipo-peptide-PEG conjugate constituted by a lipid-anchor, a peptide sensitive to proteases and a poly (e...

  14. An Independent Filter for Gene Set Testing Based on Spectral Enrichment

    NARCIS (Netherlands)

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in

  15. GeneRecon—A coalescent based tool for fine-scale association mapping

    DEFF Research Database (Denmark)

    Mailund, Thomas; Schierup, Mikkel Heide; Pedersen, Christian Nørgaard Storm

    2006-01-01

    GeneRecon is a tool for fine-scale association mapping using a coalescence model. GeneRecon takes as input case-control data from phased or unphased SNP and micro-satellite genotypes. The posterior distribution of disease locus position is obtained by Metropolis Hastings sampling in the state space...

  16. Prediction of highly expressed genes in microbes based on chromatin accessibility

    DEFF Research Database (Denmark)

    Willenbrock, Hanni; Ussery, David

    2007-01-01

    BACKGROUND: It is well known that gene expression is dependent on chromatin structure in eukaryotes and it is likely that chromatin can play a role in bacterial gene expression as well. Here, we use a nucleosomal position preference measure of anisotropic DNA flexibility to predict highly expressed...

  17. Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood.

    Science.gov (United States)

    Wu, Yufeng

    2012-03-01

    Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.

  18. Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

    Science.gov (United States)

    de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

    2016-08-01

    Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment wide significance (corrected pneratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.

  19. Transcriptome characterization and sequencing-based identification of salt-responsive genes in Millettia pinnata, a semi-mangrove plant.

    Science.gov (United States)

    Huang, Jianzi; Lu, Xiang; Yan, Hao; Chen, Shouyi; Zhang, Wanke; Huang, Rongfeng; Zheng, Yizhi

    2012-04-01

    Semi-mangroves form a group of transitional species between glycophytes and halophytes, and hold unique potential for learning molecular mechanisms underlying plant salt tolerance. Millettia pinnata is a semi-mangrove plant that can survive a wide range of saline conditions in the absence of specialized morphological and physiological traits. By employing the Illumina sequencing platform, we generated ~192 million short reads from four cDNA libraries of M. pinnata and processed them into 108,598 unisequences with a high depth of coverage. The mean length and total length of these unisequences were 606 bp and 65.8 Mb, respectively. A total of 54,596 (50.3%) unisequences were assigned Nr annotations. Functional classification revealed the involvement of unisequences in various biological processes related to metabolism and environmental adaptation. We identified 23,815 candidate salt-responsive genes with significantly differential expression under seawater and freshwater treatments. Based on the reverse transcription-polymerase chain reaction (RT-PCR) and real-time PCR analyses, we verified the changes in expression levels for a number of candidate genes. The functional enrichment analyses for the candidate genes showed tissue-specific patterns of transcriptome remodelling upon salt stress in the roots and the leaves. The transcriptome of M. pinnata will provide valuable gene resources for future application in crop improvement. In addition, this study sets a good example for large-scale identification of salt-responsive genes in non-model organisms using the sequencing-based approach.
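
    The assembly statistics in this abstract are mutually consistent, as a short check shows. This is a minimal sketch using only the numbers quoted above; because the mean length is itself rounded, the recomputed total is approximate, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
unisequences = 108_598    # number of assembled unisequences
mean_length_bp = 606      # mean unisequence length (rounded in the abstract)
annotated = 54_596        # unisequences with Nr annotations

# Total assembled length, approximated from the rounded mean length.
total_length_mb = unisequences * mean_length_bp / 1e6

# Share of unisequences that received an Nr annotation.
annotated_pct = 100 * annotated / unisequences

print(f"total length: {total_length_mb:.1f} Mb")
print(f"annotated: {annotated_pct:.1f}%")
```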

  20. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    Science.gov (United States)

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
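
    The GC3 measure defined in this abstract (fraction of cytosine and guanine in the third position of a codon) is simple to compute for a coding sequence. The sketch below is an illustration only, not the authors' pipeline: it assumes the input is an in-frame coding sequence, drops any trailing partial codon, and the function names are hypothetical. The 0.75286 threshold is the value quoted in the abstract.

```python
GC3_RICH_THRESHOLD = 0.75286  # threshold quoted in the abstract


def gc3(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon, if any, is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # length covered by whole codons
    third_bases = seq[2:usable:3]      # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def is_gc3_rich(cds: str) -> bool:
    """Classify a coding sequence as GC3-rich using the quoted threshold."""
    return gc3(cds) >= GC3_RICH_THRESHOLD


# Toy sequences (not oil palm genes): codons ATG, GCC, AAG end in G, C, G.
print(gc3("ATGGCCAAG"))
print(is_gc3_rich("ATGGCAAAA"))
```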

  1. Detection of 22 common leukemic fusion genes using a single-step multiplex qRT-PCR-based assay.

    Science.gov (United States)

    Lyu, Xiaodong; Wang, Xianwei; Zhang, Lina; Chen, Zhenzhu; Zhao, Yu; Hu, Jieying; Fan, Ruihua; Song, Yongping

    2017-07-25

    Fusion genes generated from chromosomal translocation play an important role in hematological malignancies. Detection of fusion genes currently employs either conventional RT-PCR methods or fluorescent in situ hybridization (FISH), where both methods involve tedious methodologies and require prior characterization of chromosomal translocation events as determined by cytogenetic analysis. In this study, we describe a real-time quantitative reverse transcription PCR (qRT-PCR)-based multi-fusion gene screening method with the capacity to detect 22 fusion genes commonly found in leukemia. This method does not require pre-characterization of gene translocation events, thereby facilitating immediate diagnosis and therapeutic management. We performed fluorescent qRT-PCR (F-qRT-PCR) using a commercially-available multi-fusion gene detection kit on a patient cohort of 345 individuals comprising 108 cases diagnosed with acute myeloid leukemia (AML) for initial evaluation; remaining patients within the cohort were assayed for confirmatory diagnosis. Results obtained by F-qRT-PCR were compared alongside patient analysis by cytogenetic characterization. Gene translocations detected by F-qRT-PCR in AML cases were diagnosed in 69.4% of the patient cohort, which was comparatively similar to 68.5% as diagnosed by cytogenetic analysis, thereby demonstrating 99.1% concordance. Overall gene fusion was detected in 53.7% of the overall patient population by F-qRT-PCR, 52.9% by cytogenetic prediction in leukemia, and 9.1% in non-leukemia patients by both methods. The overall concordance rate was calculated to be 99.0%. Fusion genes were detected by F-qRT-PCR in 97.3% of patients with CML, followed by 69.4% with AML, 33.3% with acute lymphoblastic leukemia (ALL), 9.1% with myelodysplastic syndromes (MDS), and 0% with chronic lymphocytic leukemia (CLL). We describe the use of a F-qRT-PCR-based multi-fusion gene screening method as an efficient one-step diagnostic procedure as an
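
    The AML detection rates and the concordance figure quoted in this abstract fit together if the percentages refer to the 108 AML cases. The sketch below makes that assumption explicit, and further assumes that the only discordant cases are those positive by one method and not the other; neither assumption is confirmed by the abstract, and the variable names are hypothetical.

```python
aml_cases = 108  # AML cases in the initial evaluation (from the abstract)

# Assumption: the quoted rates are fractions of the 108 AML cases.
pcr_positive = round(0.694 * aml_cases)    # positives by F-qRT-PCR
cyto_positive = round(0.685 * aml_cases)   # positives by cytogenetic analysis

# Assumption: discordance is the minimum possible, i.e. the difference in counts.
discordant = abs(pcr_positive - cyto_positive)
concordance_pct = 100 * (aml_cases - discordant) / aml_cases

print(pcr_positive, cyto_positive, discordant)
print(f"concordance: {concordance_pct:.1f}%")
```

    Under these assumptions a single discordant case out of 108 reproduces the reported 99.1% concordance.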

  2. Using the Pathogen-Host Interactions database (PHI-base) to investigate plant pathogen genomes and genes implicated in virulence

    Directory of Open Access Journals (Sweden)

    Martin Urban

    2015-08-01

    Full Text Available New pathogen-host interaction mechanisms can be revealed by integrating mutant phenotype data with genetic information. PHI-base is a multi-species manually curated database combining peer-reviewed published phenotype data from plant and animal pathogens and gene/protein information in a single database.

  3. Genome-wide screening for genes whose deletions confer sensitivity to mutagenic purine base analogs in yeast

    Directory of Open Access Journals (Sweden)

    Kozmin Stanislav G

    2005-06-01

    Full Text Available Abstract Background N-hydroxylated base analogs, such as 6-hydroxylaminopurine (HAP) and 2-amino-6-hydroxylaminopurine (AHA), are strong mutagens in various organisms due to their ambiguous base-pairing properties. The systems protecting cells from HAP and related noncanonical purines in Escherichia coli include specialized deoxyribonucleoside triphosphatase RdgB, DNA repair endonuclease V, and a molybdenum cofactor-dependent system. Fewer HAP-detoxification systems have been identified in yeast Saccharomyces cerevisiae and other eukaryotes. Cellular systems protecting from AHA are unknown. In the present study, we performed a genome-wide search for genes whose deletions confer sensitivity to HAP and AHA in yeast. Results We screened the library of yeast deletion mutants for sensitivity to the toxic and mutagenic action of HAP and AHA. We identified novel genes involved in the genetic control of base analogs sensitivity, including genes controlling purine metabolism, cytoskeleton organization, and amino acid metabolism. Conclusion We developed a method for screening the yeast deletion library for sensitivity to the mutagenic and toxic action of base analogs and identified 16 novel genes controlling pathways of protection from HAP. Three of them also protect from AHA.

  4. Phylogeny of the New World diploid cottons (Gossypium L., Malvaceae) based on sequences of three low-copy nuclear genes.

    Science.gov (United States)

    I. Alvarez; R. Cronn; J.F. Wendel

    2005-01-01

    American diploid cottons (Gossypium L., subgenus Houzingenia Fryxell) form a monophyletic group of 13 species distributed mainly in western Mexico, extending into Arizona, Baja California, and with one disjunct species each in the Galapagos Islands and Peru. Prior phylogenetic analyses based on an alcohol dehydrogenase gene (...

  5. Prediction of essential proteins based on subcellular localization and gene expression correlation.

    Science.gov (United States)

    Fan, Yetian; Tang, Xiwei; Hu, Xiaohua; Wu, Wei; Ping, Qing

    2017-12-01

    Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test out essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction. The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction. In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartments information, with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.
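
    One ingredient named in this abstract, the Pearson correlation coefficient (PCC) computed from gene expression data, can be sketched in plain Python. The abstract does not say how SCP merges the PCC-based ranking with the PageRank-based ranking, so the linear blend below, its `alpha` weight, and all function names are hypothetical placeholders rather than the published method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant profiles
    return cov / sqrt(var_x * var_y)


def combined_score(network_score, expression_score, alpha=0.5):
    """Hypothetical linear blend of a network-based and an expression-based score."""
    return alpha * network_score + (1 - alpha) * expression_score


# Toy expression profiles (illustrative values, not real data).
print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([1, 2, 3], [3, 2, 1]))
```

    A blend of this kind only makes sense if both scores are first brought to a comparable scale, which is another detail the abstract leaves open.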

  6. Physicochemical and biological characterization of 1,2-dialkoylamidopropane-based lipoplexes for gene delivery.

    Science.gov (United States)

    Aljaberi, Ahmad; Saleh, Suhair; Abu Khadra, Khalid M; Kearns, Molinda; Savva, Michalakis

    2015-04-01

    Elucidation of the molecular and formulation requirements for efficient lipofection is a prerequisite to enhance the biological activity of cationic lipid-mediated gene delivery systems. To this end, the in vitro lipofection activity of the ionizable asymmetric 1,2-dialkoylamidopropane-based derivatives bearing a single primary amine group as the cationic head group was evaluated. The electrostatic interactions of these cationic lipids with plasmid DNA in serum-free medium were investigated by means of gel electrophoresis retardation and Eth-Br quenching assays. The effect of the inclusion of the helper lipid DOPE in the formulation on these interactions was also considered. The physicochemical properties of these lipids in terms of bilayer fluidity and extent of ionization were investigated using fluorescence anisotropy and surface potential techniques, respectively. The results showed that only the active lipid, 1,2lmp[5], existed in a liquid crystalline state at physiological temperature. Moreover, the extent of ionization of this lipid in assemblies was significantly higher than that of its saturated analogues. Inclusion of the helper lipid DOPE improved the encapsulation and association between 1,2lmp[5] and plasmid DNA, which was reflected by the significant boost of lipofection activity of the 1,2lmp[5]/DOPE formulation as compared to the lipid alone. In conclusion, membrane fluidity and sufficient protonation of the ionizable cationic lipid are required for efficient association and encapsulation of plasmid DNA and to elicit improved in vitro lipofection activity. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Species Authentication of Common Meat Based on PCR Analysis of the Mitochondrial COI Gene.

    Science.gov (United States)

    Dai, Zhenyu; Qiao, Jiao; Yang, Siran; Hu, Shen; Zuo, Jingjing; Zhu, Weifeng; Huang, Chunhong

    2015-07-01

    Adulteration of meat products and costly animal-derived commodities with their inferior/cheaper counterparts is a grievous global problem. Species authentication is still technically challenging, especially for deep-processed products. The present study describes the design of seven sets of species-specific primers based on a highly heterozygous region of the mitochondrial cytochrome c oxidase subunit I (COI) gene. These primers were proven to have high species specificity, with no cross-reactions or unexpected products from different DNA sources. A multiplex PCR assay was achieved for rapid and economical identification of four commonly consumed meats (pork, beef, chicken, and mutton). The conventional PCR assay was sensitive down to 0.001 ng of DNA template in the reaction. The developed method was also capable of detecting as little as 0.1 mg of adulterated pork (0.05 % wt/wt) in an artificially counterfeited mutton. Validation tests showed that the assay is specific, reproducible, and robust in commercial deep-processed meats, leatherware, and feather commodities. The proposed method will be greatly beneficial to consumers, the food industry, and leather and feather commodity manufacturers.

  8. Palindromic Molecule Beacon-Based Cascade Amplification for Colorimetric Detection of Cancer Genes.

    Science.gov (United States)

    Shen, Zhi-Fa; Li, Feng; Jiang, Yi-Fan; Chen, Chang; Xu, Huo; Li, Cong-Cong; Yang, Zhe; Wu, Zai-Sheng

    2018-03-06

    A highly sensitive and selective colorimetric assay based on a multifunctional molecular beacon with palindromic tail (PMB) was proposed for the detection of target p53 gene. The PMB probe can serve as recognition element, primer, and polymerization template and contains a nicking site and a C-rich region complementary to a DNAzyme. In the presence of target DNA, the hairpin of PMB is opened, and the released palindromic tails intermolecularly hybridize with each other, triggering the autonomous polymerization/nicking/displacement cycles. Although only one type of probe is involved, the system can execute triple and continuous polymerization strand displacement amplifications, generating large amounts of G-quadruplex fragments. These G-rich fragments can bind to hemin and form the DNAzymes that possess the catalytic activity similar to horseradish peroxidase, catalyzing the oxidation of ABTS by H2O2 and producing the colorimetric signal. Utilizing the newly proposed sensing system, target DNA can be detected down to 10 pM with a linear response range from 10 pM to 200 nM, and mutant target DNAs are able to be distinguished even by the naked eye. The desirable detection sensitivity, high specificity, and operation convenience without any separation step and chemical modification demonstrate that the palindromic molecular beacon holds the potential for detecting and monitoring a variety of nucleic acid-related biomarkers.

  9. Sensitive detection of enteropathogenic E. coli using a bfpA gene-based electrochemical sensor

    International Nuclear Information System (INIS)

    Zhang, Wei; Luo, Caihui; Zhong, Liang; Zhao, Dan; Ding, Shijia; Nie, Shichang; Cheng, Wei

    2013-01-01

    We have developed a sensitive assay for enteropathogenic E. coli (EPEC) by integrating DNA extraction, specific polymerase chain reaction (PCR) and DNA detection using an electrode modified with the bundle-forming pilus (bfpA) structural gene. The PCR amplified products are captured on the electrode and hybridized with biotinylated detection probes to form a sandwich hybrid containing two biotinylated detection probes. The sandwich hybridization structure binds numerous streptavidin-alkaline phosphatase conjugates to the electrode through biotin-streptavidin linkages. Electrochemical readout is based on dual signal amplification by both the sandwich hybridization structure and the enzyme. The electrode can satisfactorily discriminate complementary and mismatched oligonucleotides. Under optimal conditions, synthetic target DNA can be detected in the 1 pM to 10 nM concentration range, with a detection limit of 0.3 pM. EPEC can be quantified in the range of 10 to 10^7 CFU mL^-1 within 3.5 h. The method is also believed to present a powerful platform for the screening of pathogenic microorganisms in clinical diagnostics, food safety and environmental monitoring. (author)

  10. Phylogenetic reconstruction of the family Acrypteridae (Orthoptera: Acridoidea) based on mitochondrial cytochrome B gene.

    Science.gov (United States)

    Huo, Guangming; Jiang, Guofang; Sun, Zhengli; Liu, Dianfeng; Zhang, Yalin; Lu, Lin

    2007-04-01

    Sequences from the mitochondrial cytochrome b gene (Cyt b) were determined for 25 species from the superfamily Acridoidea, and the homologous sequences of 19 species of grasshoppers were downloaded from the GenBank data library. The purpose was to develop a molecular phylogeny of the Arcypteridae, and to interpret the phylogenetic position of the family within the superfamily Acridoidea. Phylogeny was reconstructed by Maximum-parsimony (MP) and Bayesian criteria using Yunnanites coriacea and Tagasta marginella as outgroups. The alignment length of the fragments was 384 bp after excluding ambiguous sites, including 167 parsimony informative sites. In the fragments, the percentages of A + T and G + C were 70.7% and 29.3%, respectively. The monophyly of Arcypteridae is not supported by phylogenetic trees. Within the Arcypteridae, neither Arcypterinae nor Ceracrinae is supported as a monophyletic group. The current genus Chorthippus is not a monophyletic group and appears to be polyphyletic. The present results are significantly different from the classification scheme of Arcypteridae, which is based on morphology.
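The two alignment statistics quoted above (base composition and the number of parsimony-informative sites) can be computed along these lines. This is a minimal sketch on a made-up toy alignment, not the authors' data or software; it uses the standard definition that a site is parsimony-informative when at least two character states each occur in at least two sequences.

```python
from collections import Counter

def at_gc_percent(seqs):
    """Return (A+T %, G+C %) over all unambiguous bases in the sequences."""
    counts = Counter(b for s in seqs for b in s.upper() if b in "ACGT")
    total = sum(counts.values())
    at = 100.0 * (counts["A"] + counts["T"]) / total
    return at, 100.0 - at

def parsimony_informative_sites(alignment):
    """Count columns in which at least two bases each appear at least twice."""
    informative = 0
    for column in zip(*alignment):
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
    return informative

# Hypothetical toy alignment (four sequences, five aligned sites).
toy = ["ATGCA", "ATGTA", "ACGTA", "ACGCA"]
at, gc = at_gc_percent(toy)
print(at, gc, parsimony_informative_sites(toy))
```

For the toy alignment, columns 2 and 4 are informative, and the composition is 60% A + T and 40% G + C.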

  11. Hessian regularization based symmetric nonnegative matrix factorization for clustering gene expression and microbiome data.

    Science.gov (United States)

    Ma, Yuanyuan; Hu, Xiaohua; He, Tingting; Jiang, Xingpeng

    2016-12-01

    Nonnegative matrix factorization (NMF) has received considerable attention due to its interpretation of observed samples as combinations of different components, and has been successfully used as a clustering method. As an extension of NMF, Symmetric NMF (SNMF) inherits the advantages of NMF. Unlike NMF, however, SNMF takes a nonnegative similarity matrix as an input, and two lower rank nonnegative matrices (H, H^T) are computed as an output to approximate the original similarity matrix. Laplacian regularization has improved the clustering performance of NMF and SNMF. However, Laplacian regularization (LR), as a classic manifold regularization method, suffers from weak extrapolating ability. To address this, we propose a novel variant of SNMF, called Hessian regularization based symmetric nonnegative matrix factorization (HSNMF). In contrast to Laplacian regularization, Hessian regularization fits the data perfectly and extrapolates nicely to unseen data. We conduct extensive experiments on several datasets including text data, gene expression data and HMP (Human Microbiome Project) data. The results show that the proposed method outperforms other methods, which suggests the potential application of HSNMF in biological data clustering. Copyright © 2016. Published by Elsevier Inc.
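For orientation, the factorization described above is usually written as the first problem below, and manifold-regularized variants commonly add a trace penalty as in the second. The abstract itself gives no formulas, so the symbols are assumptions introduced here: A is the nonnegative n × n similarity matrix, H the n × k nonnegative factor, R a regularization matrix (the graph Laplacian for LR, a Hessian energy matrix for a Hessian-regularized variant), and λ ≥ 0 a trade-off weight. The exact objective used for HSNMF may differ.

```latex
\min_{H \ge 0} \; \lVert A - H H^{T} \rVert_F^{2}
\qquad\text{and}\qquad
\min_{H \ge 0} \; \lVert A - H H^{T} \rVert_F^{2}
  + \lambda \, \operatorname{tr}\!\left( H^{T} R \, H \right)
```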

  12. 18S rRNA is a reliable normalisation gene for real time PCR based on influenza virus infected cells

    Directory of Open Access Journals (Sweden)

    Kuchipudi Suresh V

    2012-10-01

    Full Text Available Abstract Background One requisite of quantitative reverse transcription PCR (qRT-PCR) is to normalise the data with an internal reference gene that is invariant regardless of treatment, such as virus infection. Several studies have found variability in the expression of commonly used housekeeping genes, such as beta-actin (ACTB) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), under different experimental settings. However, ACTB and GAPDH remain widely used in the studies of host gene response to virus infections, including influenza viruses. To date no detailed study has been described that compares the suitability of commonly used housekeeping genes in influenza virus infections. The present study evaluated several commonly used housekeeping genes [ACTB, GAPDH, 18S ribosomal RNA (18S rRNA), ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide (ATP5B) and ATP synthase, H+ transporting, mitochondrial Fo complex, subunit C1 (subunit 9) (ATP5G1)] to identify the most stably expressed gene in human, pig, chicken and duck cells infected with a range of influenza A virus subtypes. Results The relative expression stability of commonly used housekeeping genes was determined in primary human bronchial epithelial cells (HBECs), pig tracheal epithelial cells (PTECs), and chicken and duck primary lung-derived cells infected with five influenza A virus subtypes. Analysis of qRT-PCR data from virus and mock infected cells using NormFinder and BestKeeper software programmes found that 18S rRNA was the most stable gene in HBECs, PTECs and avian lung cells. Conclusions Based on the presented data from cell culture models (HBECs, PTECs, chicken and duck lung cells) infected with a range of influenza viruses, we found that 18S rRNA is the most stable reference gene for normalising qRT-PCR data. Expression levels of the other housekeeping genes evaluated in this study (including ACTB and GAPDH) were highly affected by influenza virus infection and

  13. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

    Directory of Open Access Journals (Sweden)

    Hutchison Clyde A

    2006-01-01

    Full Text Available Abstract Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.

  14. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

    Science.gov (United States)

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-19

    Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
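The first step described above, collecting every ORF of at least 30 codons, might look like the sketch below. It is an illustration only: the abstract does not say how ORF boundaries were defined, so this version assumes an ORF runs from an ATG to the next in-frame stop codon, scans all six reading frames, and reports coordinates on the strand that was scanned. The function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, codons) for ATG-to-stop ORFs.

    start/end are 0-based, end-exclusive positions on the scanned strand
    (for '-' that is the reverse complement); the codon count includes the
    ATG and excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first in-frame start codon
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None                   # look for the next ORF
    return orfs

# Toy sequence: ATG followed by 29 alanine codons and a stop (30 codons).
toy = "ATG" + "GCT" * 29 + "TAA"
print(find_orfs(toy))
```

The resulting ORFs would then be translated and clustered; that grouping step depends on similarity searches and is not sketched here.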

  15. Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

    Science.gov (United States)

    Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

    2009-02-01

    Abscisic acid (ABA), the well-known plant stress hormone, plays a key role in the regulation of a subset of stress responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered the CGMCACGTGB motif, which contains the ABA Responsive Element (ABRE) core (ACGT), as an over-represented motif among the promoters of ABA responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein coding genes potentially regulated by an ABA-dependent molecular genetic network. RT-PCR analysis of 45 arbitrarily chosen genes from the predicted 402 genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA responsive genes showed enrichment of signal transduction and stress related genes among diverse functional categories.
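Scanning promoter sequences for a degenerate motif such as CGMCACGTGB can be sketched with a regular expression built from the IUPAC nucleotide codes (M = A or C; B = C, G or T). This is a hedged illustration rather than the authors' pipeline: the promoter string is invented, the function names are hypothetical, and only the forward strand is searched.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def find_motif(sequence, motif):
    """Return (0-based position, matched site) for each forward-strand hit."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical promoter fragment containing one site.
promoter = "TTACGCCACGTGTAAGG"
print(find_motif(promoter, "CGMCACGTGB"))
```

A genome-wide scan would apply `find_motif` to each extracted promoter and, if both orientations matter, to the reverse complement as well.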

  16. RNA-based ovarian cancer research from 'a gene to systems biomedicine' perspective.

    Science.gov (United States)

    Gov, Esra; Kori, Medi; Arga, Kazim Yalcin

    2017-08-01

    Ovarian cancer remains the leading cause of death from a gynecologic malignancy, and treatment of this disease is harder than any other type of female reproductive cancer. Improvements in the diagnosis and development of novel and effective treatment strategies for complex pathophysiologies, such as ovarian cancer, require a better understanding of disease emergence and mechanisms of progression through systems medicine approaches. RNA-level analyses generate new information that can help in understanding the mechanisms behind disease pathogenesis, in identifying new biomarkers and therapeutic targets, and in new drug discovery. Whole RNA sequencing and coding and non-coding RNA expression array datasets have shed light on the mechanisms underlying disease progression and have identified mRNAs, miRNAs, and lncRNAs involved in ovarian cancer progression. In addition, the results from these analyses indicate that various signalling pathways and biological processes are associated with ovarian cancer. Here, we present a comprehensive literature review on RNA-based ovarian cancer research and highlight the benefits of integrative approaches within the systems biomedicine concept for future ovarian cancer research. We invite the ovarian cancer and systems biomedicine research fields to join forces to achieve the interdisciplinary caliber and rigor required to find real-life solutions to common, devastating, and complex diseases such as ovarian cancer. CAF: cancer-associated fibroblasts; COG: Cluster of Orthologous Groups; DEA: disease enrichment analysis; EOC: epithelial ovarian carcinoma; ESCC: oesophageal squamous cell carcinoma; GSI: gamma secretase inhibitor; GO: Gene Ontology; GSEA: gene set enrichment analysis; HAS: Hungarian Academy of Sciences; lncRNAs: long non-coding RNAs; MAPK/ERK: mitogen-activated protein kinase/extracellular signal-regulated kinases; NGS: next-generation sequencing; ncRNAs: non-coding RNAs; OvC: ovarian cancer; PI3K

  17. Real-time PCR based on SYBR-Green I fluorescence: An alternative to the TaqMan assay for a relative quantification of gene rearrangements, gene amplifications and micro gene deletions

    Directory of Open Access Journals (Sweden)

    Puisieux Alain

    2003-10-01

    Full Text Available Abstract Background Real-time PCR is increasingly being adopted for RNA quantification and genetic analysis. At present the most popular real-time PCR assay is based on the hybridisation of a dual-labelled probe to the PCR product, and the development of a signal by loss of fluorescence quenching as PCR degrades the probe. Though this so-called 'TaqMan' approach has proved easy to optimise in practice, the dual-labelled probes are relatively expensive. Results We have designed a new assay based on SYBR-Green I binding that is quick, reliable, easily optimised and compares well with the published assay. Here we demonstrate its general applicability by measuring copy number in three different genetic contexts: the quantification of a gene rearrangement (T-cell receptor excision circles (TREC)) in peripheral blood mononuclear cells; the detection and quantification of GLI, MYC-C and MYC-N gene amplification in cell lines and cancer biopsies; and detection of deletions in the OPA1 gene in dominant optic atrophy. Conclusion Our assay has important clinical applications, providing accurate diagnostic results in less time, from less biopsy material and at less cost than assays currently employed such as FISH or Southern blotting.
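Relative quantification from real-time PCR threshold cycles is often done with the comparative Ct (2^-ΔΔCt) calculation, sketched below. The abstract does not state which calculation the authors used, so this is an assumed, generic example: the function name, the Ct values and the default per-cycle amplification factor of 2 (perfect efficiency) are all illustrative.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_calibrator, ct_ref_calibrator,
                      efficiency=2.0):
    """Target amount in a sample relative to a calibrator (comparative Ct).

    Each target Ct is first normalised to a reference locus measured in the
    same DNA (delta Ct); the sample is then compared with the calibrator
    (delta-delta Ct). 'efficiency' is the fold increase per PCR cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return efficiency ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values: the target crosses threshold one cycle earlier
# than in the calibrator (about twofold gain) or one cycle later (about
# half, as expected for a heterozygous deletion).
print(relative_quantity(24.0, 20.0, 25.0, 20.0))  # → 2.0
print(relative_quantity(26.0, 20.0, 25.0, 20.0))  # → 0.5
```

When primer efficiencies differ noticeably from 2, an efficiency-corrected model with separately measured values for each amplicon would be more appropriate.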

  18. High rate of translocation-based gene birth on the Drosophila Y chromosome.

    Science.gov (United States)

    Tobler, Ray; Nolte, Viola; Schlötterer, Christian

    2017-10-31

    The Y chromosome is a unique genetic environment defined by a lack of recombination and male-limited inheritance. The Drosophila Y chromosome has been gradually acquiring genes from the rest of the genome, with only seven Y-linked genes being gained over the past 63 million years (0.12 gene gains per million years). Using a next-generation sequencing (NGS)-powered genomic scan, we show that gene transfers to the Y chromosome are much more common than previously suspected: at least 25 have arisen across three Drosophila species over the past 5.4 million years (1.67 per million years for each lineage). The gene transfer rate is significantly lower in Drosophila melanogaster than in the Drosophila simulans clade, primarily due to Y-linked retrotranspositions being significantly more common in the latter. Despite all Y-linked gene transfers being evolutionarily recent (Drosophila Y chromosome to be more dynamic than previously appreciated. Our analytical method provides a powerful means to identify Y-linked gene transfers and will help illuminate the evolutionary dynamics of the Y chromosome in Drosophila and other species. Copyright © 2017 the Author(s). Published by PNAS.

  19. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    Directory of Open Access Journals (Sweden)

    Liu Qingzhong

    2011-12-01

    Full Text Available Abstract Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.

  20. Dysregulated Pathway Identification of Alzheimer's Disease Based on Internal Correlation Analysis of Genes and Pathways.

    Science.gov (United States)

    Kong, Wei; Mou, Xiaoyang; Di, Benteng; Deng, Jin; Zhong, Ruxing; Wang, Shuaiqun

    2017-11-20

    Dysregulated pathway identification is an important task that can provide insight into the underlying biological processes of disease. Current pathway-identification methods focus on a set of co-expressed genes and single pathways and ignore the correlation between genes and pathways. The method proposed in this study takes into account the internal correlations not only between genes but also between pathways to identify dysregulated pathways related to Alzheimer's disease (AD), the most common form of dementia. In order to find the significantly differential genes for AD, mutual informatio