osjag gene based: Topics by WorldWideScience.org

Sample records for osjag gene based

A Nonlinear Model for Gene-Based Gene-Environment Interaction

Directory of Open Access Journals (Sweden)

Jian Sa

2016-06-01

A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods focus predominantly on the interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over the single-variant approach, in this work we proposed a sparse principal component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for the SNPs in a gene; the effect of each principal component was then modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene whose effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model is readily interpretable, since the sparsity of the principal component loadings indicates the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset from a Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a systems approach to evaluating gene-based G×E interaction.
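As a rough worked example of the Bonferroni correction mentioned in this abstract: with 12,005 genes tested, the per-gene significance cutoff is the family-wise level divided by 12,005. The abstract does not state the family-wise level, so the 0.05 used below is an assumption, and the function name is illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 12,005 genes comes from the abstract; alpha = 0.05 is an assumed level.
threshold = bonferroni_threshold(0.05, 12005)
print(f"per-gene cutoff: {threshold:.3e}")
```

Under that assumed level, a gene-level interaction would need a p-value of roughly four in a million to count as significant, which helps explain why only one of the 12,005 genes passed.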
A powerful score-based test statistic for detecting gene-gene co-association.

Science.gov (United States)

Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

2016-01-29

The genetic variants identified by genome-wide association studies (GWAS) can account for only a small proportion of the total heritability of complex disease. The existence of gene-gene joint effects, which comprise the main effects and their co-association, is one possible explanation for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and the specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including the single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. A real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.
Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.

Science.gov (United States)

Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian

2018-02-23

Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions that encode massive GO terms via compact binary codes. After that, HPHash uses these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. In these experiments, HPHash again significantly improves the prediction performance. The code of HPHash is available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.
Evaluation of Gene-Based Family-Based Methods to Detect Novel Genes Associated With Familial Late Onset Alzheimer Disease

Directory of Open Access Journals (Sweden)

Maria V. Fernández

2018-04-01

Gene-based tests to study the combined effect of rare variants on a particular phenotype have been widely developed for case-control studies, but their evolution and adaptation for family-based studies, especially studies of complex incomplete families, has been slower. In this study, we have performed a practical examination of all the latest gene-based methods available for family-based study designs using both simulated and real datasets. We examined the performance of several collapsing, variance-component, and transmission disequilibrium tests across eight different software packages and 22 models utilizing a cohort of 285 families (N = 1,235) with late-onset Alzheimer disease (LOAD). After a thorough examination of each of these tests, we propose a methodological approach to identify, with high confidence, genes associated with the tested phenotype and we provide recommendations to select the best software and model for family-based gene-based analyses. Additionally, in our dataset, we identified PTK2B, a GWAS candidate gene for sporadic AD, along with six novel genes (CHRD, CLCN2, HDLBP, CPAMD8, NLRP9, and MAS1L) as candidate genes for familial LOAD.
KBERG: KnowledgeBase for Estrogen Responsive Genes

DEFF Research Database (Denmark)

Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam

2007-01-01

Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection...... is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...
Scuba: scalable kernel-based gene prioritization.

Science.gov (United States)

Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

2018-01-25

The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba.
Finding gene regulatory network candidates using the gene expression knowledge base.

Science.gov (United States)

Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

2014-12-10

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Link-based quantitative methods to identify differentially coexpressed genes and gene Pairs

Directory of Open Access Journals (Sweden)

Ye Zhi-Qiang

2011-08-01

Background: Differential coexpression analysis (DCEA) is increasingly used for investigating the global transcriptional mechanisms underlying phenotypic changes. Current DCEA methods mostly adopt a gene connectivity-based strategy to estimate differential coexpression, which is characterized by comparing the numbers of gene neighbors in different coexpression networks. Although it simplifies the calculation, this strategy mixes up the identities of the different coexpression neighbors of a gene, and fails to distinguish significant differential coexpression changes from trivial ones. In particular, correlation reversal is easily missed, although it probably indicates remarkable biological significance. Results: We developed two link-based quantitative methods, DCp and DCe, to identify differentially coexpressed genes and gene pairs (links). Because they uniquely exploit the quantitative coexpression change of each gene pair in the coexpression networks, both methods proved superior to currently popular methods in simulation studies. Re-mining of a publicly available type 2 diabetes (T2D) expression dataset from the perspective of differential coexpression analysis led to discoveries beyond those from differential expression analysis. Conclusions: This work pointed out the critical weakness of currently popular DCEA methods, and proposed two link-based DCEA algorithms that will contribute to the development of DCEA and help extend it to a broader spectrum.
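The contrast between counting neighbors and tracking each link can be illustrated with a small sketch. It computes the Pearson correlation of every gene pair in two conditions and flags pairs whose correlation changes sign. This is not the DCp or DCe algorithm, whose details the abstract does not give; the function names and the toy expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def link_changes(expr_a, expr_b):
    """For each gene pair, return (r in condition A, r in condition B, sign reversed?)."""
    changes = {}
    for g1, g2 in combinations(sorted(expr_a), 2):
        r_a = pearson(expr_a[g1], expr_a[g2])
        r_b = pearson(expr_b[g1], expr_b[g2])
        changes[(g1, g2)] = (r_a, r_b, r_a * r_b < 0)
    return changes


# Toy data (made up): the pair is positively correlated in A, negatively in B.
condition_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8]}
condition_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2]}
result = link_changes(condition_a, condition_b)
```

In this toy case each gene has exactly one neighbor in both conditions, so a pure neighbor count would report no change, while the per-link view exposes the reversal.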
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

Science.gov (United States)

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-11

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
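For orientation, the baseline that this abstract contrasts LEGO with can be sketched as a one-sided hypergeometric (Fisher-type) test on the overlap between a query gene list and a gene set. The sketch below is that unweighted baseline only, not LEGO's network-weighted statistic; the function and parameter names are illustrative.

```python
from math import comb


def ora_p_value(n_background, n_set, n_query, n_overlap):
    """P(overlap >= n_overlap) when n_query genes are drawn at random
    from n_background genes, n_set of which belong to the gene set."""
    largest_overlap = min(n_set, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_set, k) * comb(n_background - n_set, n_query - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / total


# Toy numbers (made up): 10 background genes, a 5-gene set,
# a 5-gene query list that falls entirely inside the set.
p = ora_p_value(10, 5, 5, 5)
```

Every gene in the set counts equally here, which is the simplification the abstract criticizes: the test sees a pathway only as a flat list of members.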
Paper-based synthetic gene networks.

Science.gov (United States)

Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

2014-11-06

Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.
Paper-based Synthetic Gene Networks

Science.gov (United States)

Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

2014-01-01

Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167
Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

KAUST Repository

AlShahrani, Mona; Hoehndorf, Robert

2018-01-01

In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

KAUST Repository

Alshahrani, Mona

2018-04-30

In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

Directory of Open Access Journals (Sweden)

Dajeong Lim

2014-01-01

Marbling is an important trait in characterizing beef quality and a major factor in determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find candidate genes associated with marbling, we used a weighted gene coexpression network analysis of the expression values of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.
A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes

Science.gov (United States)

Seo, Minseok; Shin, Su-kyung; Kwon, Eun-Young; Kim, Sung-Eun; Bae, Yun-Jung; Lee, Seungyeoun; Sung, Mi-Kyung; Choi, Myung-Sook; Park, Taesung

2016-01-01

Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in the intersection of the lists of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs, or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method simultaneously identifies genes that are both DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes that are both DEGs and PAGs than other methods. Simulation studies showed that they have more power than other methods. Through analysis of
A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes.

Directory of Open Access Journals (Sweden)

Samuel Sunghwan Cho

Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in the intersection of the lists of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs, or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method simultaneously identifies genes that are both DEGs and PAGs. This method uses standard regression models but adopts a different null hypothesis from ordinary regression models, which allows us to perform joint identification in one step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes that are both DEGs and PAGs than other methods. Simulation studies showed that they have more power than other methods
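The two two-step baselines described in this abstract can be sketched as follows, starting from per-gene p-values. This is only an illustration of the naïve and hierarchical ideas, not the proposed model-based method. The function names, the 0.05 level, the Bonferroni adjustment in the second stage, and the toy p-values are all assumptions.

```python
def naive_joint(deg_p, pag_p, alpha=0.05):
    """Naive approach: call DEGs and PAGs separately, then intersect the lists."""
    degs = {gene for gene, p in deg_p.items() if p < alpha}
    pags = {gene for gene, p in pag_p.items() if p < alpha}
    return degs & pags


def hierarchical_joint(deg_p, pag_p, alpha=0.05):
    """Hierarchical approach: call DEGs first, then test association only
    among them. The second-stage Bonferroni cutoff is an assumed choice."""
    degs = {gene for gene, p in deg_p.items() if p < alpha}
    if not degs:
        return set()
    cutoff = alpha / len(degs)
    return {gene for gene in degs if pag_p.get(gene, 1.0) < cutoff}


# Toy p-values (made up) for three genes.
deg_p = {"A": 0.001, "B": 0.04, "C": 0.2}
pag_p = {"A": 0.01, "B": 0.03, "C": 0.001}
naive_hits = naive_joint(deg_p, pag_p)
hierarchical_hits = hierarchical_joint(deg_p, pag_p)
```

Both baselines make two separate decisions per gene, which is the structure the abstract's one-step approach is meant to replace.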
Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

International Nuclear Information System (INIS)

Kono, Akira

1998-01-01

The gene structures of the CCK A-type receptor (CCKAR) in human, rat and mouse were investigated to clarify whether aberration of the gene is involved in the incidence of diabetes and cholecystitis. In this fiscal year (1997), the normal structure of the gene and its accurate base sequence were analyzed using DNA fragments that bound to ³²P-labelled cDNA of human CCKAR, obtained from a leucocyte gene library. This gene contained about 2.2 × 10⁵ base pairs, and the base sequence was completely determined and registered in the Japan DNA data bank (D85606). In addition, the genome structures and base sequences of mouse and rat CCKAR were analyzed and registered (D85605 and D50608, respectively). The differences in the base sequence of CCKAR among the species were found in the promoter region and the intron regions, suggesting that there might be differences in splicing among species. (M.N.)
PCR-based detection of gene transfer vectors: application to gene doping surveillance.

Science.gov (United States)

Perez, Irene C; Le Guiner, Caroline; Ni, Weiyi; Lyles, Jennifer; Moullier, Philippe; Snyder, Richard O

2013-12-01

Athletes who illicitly use drugs to enhance their athletic performance are at risk of being banned from sports competitions. Consequently, some athletes may seek new doping methods that they expect to be capable of circumventing detection. With advances in gene transfer vector design and therapeutic gene transfer, and demonstrations of safety and therapeutic benefit in humans, there is an increased probability of the pursuit of gene doping by athletes. In anticipation of the potential for gene doping, assays have been established to directly detect complementary DNA of genes that are top candidates for use in doping, as well as vector control elements. The development of molecular assays that are capable of exposing gene doping in sports can serve as a deterrent and may also identify athletes who have illicitly used gene transfer for performance enhancement. PCR-based methods to detect foreign DNA with high reliability, sensitivity, and specificity include TaqMan real-time PCR, nested PCR, and internal threshold control PCR.
Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

International Nuclear Information System (INIS)

Kono, Akira

1999-01-01

Base sequence analysis of the CCKAR gene (the gene for the A-type receptor for cholecystokinin) from the OLETF rat, a rat model of insulin-independent diabetes, was carried out based on the base sequence of the wild-type CCKAR gene, which had been clarified in the previous year. DNA was extracted from the pancreas of the OLETF rat, fragmented, and transduced into λ phage to construct an OLETF gene library. λ phage DNA clones that bound labelled CCKAR cDNA were then analyzed, and the gene structure was compared with that of the wild-type gene. The CCKAR gene of OLETF was shown to have a deletion (6800 bp) ranging from the promoter region to exon 2, suggesting that the CCKAR gene is not functional in the OLETF rat. The whole sequence of this mutant gene was registered in the Japan DNA Bank (D50610). Then, F2 offspring were obtained by crossing OLETF (female) and F344 (male) rats, and the time courses of blood glucose level after glucose loading were compared among them. The blood glucose level after glucose loading was significantly higher in the homozygous mutant F2 (CCKAR -/-), as in the parent OLETF rat, than in the heterozygous mutant F2 (CCKAR -/+) or the wild-type rat (CCKAR +/+). This suggests that the CCKAR gene might be involved in the control of blood glucose level, and that an alteration of the expression level or the functions of the CCKAR gene might affect the blood glucose level. (M.N.)
A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design

Science.gov (United States)

Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

2013-01-01

For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods. PMID:23620809

Exploring the role of peptides in polymer-based gene delivery.

Science.gov (United States)

Sun, Yanping; Yang, Zhen; Wang, Chunxi; Yang, Tianzhi; Cai, Cuifang; Zhao, Xiaoyun; Yang, Li; Ding, Pingtian

2017-09-15

Polymers are widely studied as non-viral gene vectors because of their strong DNA binding ability, capacity to carry large payload, flexibility of chemical modifications, low immunogenicity, and facile processes for manufacturing. However, high cytotoxicity and low transfection efficiency substantially restrict their application in clinical trials. Incorporating functional peptides is a promising approach to address these issues. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we systematically summarize the role of peptides in polymer-based gene delivery, and elaborate how to rationally design polymer-peptide based gene delivery vectors. Polymers are widely studied as non-viral gene vectors, but suffer from high cytotoxicity and low transfection efficiency. Incorporating short, bioactive peptides into polymer-based gene delivery systems can address this issue. Peptides demonstrate various functions in polymer-based gene delivery systems, such as targeting to specific cells, breaching membrane barriers, facilitating DNA condensation and release, and lowering cytotoxicity. In this review, we highlight the peptides' roles in polymer-based gene delivery, and elaborate how to utilize various functional peptides to enhance the transfection efficiency of polymers. The optimized peptide-polymer vectors should be able to alter their structures and functions according to biological microenvironments and utilize inherent intracellular pathways of cells, and consequently overcome the barriers during gene delivery to enhance transfection efficiency. Copyright © 2017 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Speeding disease gene discovery by sequence based candidate prioritization

Directory of Open Access Journals (Sweden)

Porteous David J

2005-03-01

Full Text Available Abstract Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

Directory of Open Access Journals (Sweden)

Ettore Mosca

2017-09-01

Full Text Available Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
Fast gene ontology based clustering for microarray experiments.

Science.gov (United States)

Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

2008-11-21

Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Multi-label literature classification based on the Gene Ontology graph

Directory of Open Access Journals (Sweden)

Lu Xinghua

2008-12-01

Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate
Gene-based testing of interactions in association studies of quantitative traits.

Directory of Open Access Journals (Sweden)

Li Ma

Full Text Available Various methods have been developed for identifying gene-gene interactions in genome-wide association studies (GWAS). However, most methods focus on individual markers as the testing unit, and the large number of such tests drastically erodes statistical power. In this study, we propose novel interaction tests of quantitative traits that are gene-based and that confer advantage in both statistical power and biological interpretation. The framework of gene-based gene-gene interaction (GGG) tests combines marker-based interaction tests between all pairs of markers in two genes to produce a gene-level test for interaction between the two. The tests are based on an analytical formula we derive for the correlation between marker-based interaction tests due to linkage disequilibrium. We propose four GGG tests that extend the following P value combining methods: minimum P value, extended Simes procedure, truncated tail strength, and truncated P value product. Extensive simulations point to correct type I error rates of all tests and show that the two truncated tests are more powerful than the other tests in cases of markers involved in the underlying interaction not being directly genotyped and in cases of multiple underlying interactions. We applied our tests to pairs of genes that exhibit a protein-protein interaction to test for gene-level interactions underlying lipid levels using genotype data from the Atherosclerosis Risk in Communities study. We identified five novel interactions that are not evident from marker-based interaction testing and successfully replicated one of these interactions, between SMAD3 and NEDD9, in an independent sample from the Multi-Ethnic Study of Atherosclerosis. We conclude that our GGG tests show improved power to identify gene-level interactions in existing, as well as emerging, association studies.
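The idea of collapsing marker-pair P values into one gene-level P value can be illustrated with two textbook combining rules. The following is a minimal sketch only: it uses the standard Simes procedure and a Bonferroni-corrected minimum P value, both of which assume (roughly) independent tests, and it does not implement the linkage-disequilibrium-adjusted extensions described in the abstract. The function names and the example P values are hypothetical.

```python
def simes_pvalue(pvalues):
    """Standard Simes combined P value: min over ranks i of m * p_(i) / i."""
    if not pvalues:
        raise ValueError("need at least one P value")
    m = len(pvalues)
    ordered = sorted(pvalues)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(combined, 1.0)


def min_p_bonferroni(pvalues):
    """Minimum P value, Bonferroni-corrected for the number of tests."""
    if not pvalues:
        raise ValueError("need at least one P value")
    return min(1.0, len(pvalues) * min(pvalues))


# Hypothetical interaction P values for all marker pairs between two genes.
marker_pair_pvalues = [0.02, 0.021, 0.022, 0.5]
gene_level_simes = simes_pvalue(marker_pair_pvalues)
gene_level_minp = min_p_bonferroni(marker_pair_pvalues)
```

In this example several moderately small P values make the Simes result smaller than the corrected minimum P value, which is the kind of situation (multiple weak signals within a gene pair) where combining rules differ.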
Fast Gene Ontology based clustering for microarray experiments

Directory of Open Access Journals (Sweden)

Ovaska Kristian

2008-11-01

Full Text Available Abstract Background Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Two-Way Gene Interaction From Microarray Data Based on Correlation Methods.

Science.gov (United States)

Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh

2016-06-01

Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. The purpose of this study is to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a Gene Co-expression Network is to score all pairs of gene vectors. The second step is to select a score threshold and connect all gene pairs whose scores exceed this value. In the foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman's rank correlation coefficient and Blomqvist's measure, and compared them with Pearson's correlation coefficient. We surveyed six genes of venous thrombosis disease, made a matrix entry representing the score for the corresponding gene pair, and obtained two-way interactions using Pearson's correlation, Spearman's rank correlation, and Blomqvist's coefficient. Finally, these methods were compared with Cytoscape, based on BIND, and Gene Ontology, based on molecular function visual methods; R software version 3.2 and Bioconductor were used to perform these methods. Based on the Pearson and Spearman correlations, the results were the same and were confirmed by Cytoscape and GO visual methods; however, Blomqvist's coefficient was not confirmed by visual methods. Some results of the correlation coefficients are not the same with visualization. The reason may be due to the small number of data.
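The two construction steps described above (score every gene pair, then connect pairs whose score exceeds a threshold) can be sketched as follows. This is an illustration under stated assumptions, not the authors' R/Bioconductor code: the gene names and expression values are made up, the threshold is applied to the absolute score, and Blomqvist's coefficient is computed as the mean sign of the product of deviations from the medians, with points lying on a median counted as zero (conventions for such points vary).

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def blomqvist(x, y):
    """Blomqvist's medial correlation (points on a median contribute 0)."""
    mx, my = median(x), median(y)
    products = [(a - mx) * (b - my) for a, b in zip(x, y)]
    return mean((p > 0) - (p < 0) for p in products)


def coexpression_edges(expression, score, threshold):
    """Step 1: score all gene pairs. Step 2: keep pairs above the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        if abs(score(expression[g1], expression[g2])) > threshold:
            edges.append((g1, g2))
    return edges


# Hypothetical expression vectors (one value per sample) for three genes.
expression = {
    "G1": [1, 2, 3, 4, 5],
    "G2": [2, 4, 6, 8, 10.5],
    "G3": [5, 3, 4, 1, 2],
}
edges = coexpression_edges(expression, pearson, threshold=0.9)
```

Swapping `pearson` for `spearman` or `blomqvist` in the last call gives the corresponding nonparametric network, which is how the three coefficients can be compared on the same data.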
Density based pruning for identification of differentially expressed genes from microarray data

Directory of Open Access Journals (Sweden)

Xu Jia

2010-11-01

Full Text Available Abstract Motivation Identification of differentially expressed genes from microarray datasets is one of the most important analyses for microarray data mining. Popular algorithms such as the statistical t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by considering other features of differentially expressed genes. Results We proposed a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of average difference of gene expression and average expression levels. A density based pruning algorithm (DB pruning) is developed to screen out potential differentially expressed genes usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus database (GEO) with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as t-test, rank product, and fold change. Conclusions Density based pruning of non-differentially expressed genes is an effective method for enhancing statistical testing based algorithms for identifying differentially expressed genes. It improves t-test, rank product, and fold change by 11% to 50% in the numbers of identified true differentially expressed genes. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
Comparative study on gene set and pathway topology-based enrichment methods.

Science.gov (United States)

Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

2015-10-22

Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. Both
A sight on the current nanoparticle-based gene delivery vectors

Science.gov (United States)

Dizaj, Solmaz Maleki; Jafari, Samira; Khosroushahi, Ahmad Yari

2014-05-01

Nowadays, gene delivery for therapeutic purposes is considered one of the most promising strategies to cure both genetic and acquired human diseases. The design of efficient gene delivery vectors possessing high transfection efficiency and low cytotoxicity is considered the major challenge for delivering a target gene to specific tissues or cells. On this basis, investigations of non-viral gene vectors with the ability to overcome physiological barriers are increasing. Among the non-viral vectors, nanoparticles have shown remarkable properties for gene delivery, such as the ability to target specific tissues or cells, protect the target gene against nuclease degradation, improve DNA stability, and increase transformation efficiency or safety. This review attempts to present current nanoparticles according to their lipid, polymer, hybrid, and inorganic properties. Among them, hybrids, as efficient vectors, are utilized in gene delivery in terms of materials (synthetic or natural), design, and in vitro/in vivo transformation efficiency.
GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies.

Science.gov (United States)

Yung, Ling Sing; Yang, Can; Wan, Xiang; Yu, Weichuan

2011-05-01

Collecting millions of genetic variations is feasible with the advanced genotyping technology. With a huge amount of genetic variations data in hand, developing efficient algorithms to carry out the gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphic processing units (GPUs) are highly parallel hardware and provide massive computing resources. We are, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions. We implement the BOOST method based on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with Nvidia GeForce GTX 285 display card. GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.
Model-based gene set analysis for Bioconductor.

Science.gov (United States)

Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

2011-07-01

Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. Contact: peter.robinson@charite.de; julien.gagneur@embl.de.
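For orientation, the single-category procedure that the abstract contrasts with MGSA can be sketched in a few lines. The block below computes the one-sided Fisher's exact test P value for over-representation, which equals the upper tail of the hypergeometric distribution. It is a plain-Python illustration, not the mgsa package API; the function name, argument names, and example counts are hypothetical.

```python
from math import comb


def one_sided_fisher_pvalue(universe_size, category_size, study_size, overlap):
    """P(X >= overlap) when study_size genes are drawn without replacement
    from universe_size genes, category_size of which belong to the category."""
    total = comb(universe_size, study_size)
    upper = min(category_size, study_size)
    tail = sum(
        comb(category_size, k) * comb(universe_size - category_size, study_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 annotated to the category,
# 4 genes in the study set, 3 of which are annotated to the category.
p_value = one_sided_fisher_pvalue(20, 5, 4, 3)
```

Because each category is tested on its own, overlapping or nested categories tend to be flagged together, which is the redundancy problem the abstract describes.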
Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

Science.gov (United States)

Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

2012-07-15

Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of EOperon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis.

Directory of Open Access Journals (Sweden)

Hong Zhu

Full Text Available Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interactions analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected 'highly verified' genes were measured by ELISA among our in-house RA cases and controls. A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 'overlapped', 76 'European-specific' and 74 'Asian-specific' genes. Among them, 105 genes had significant differential expression between RA patients and healthy controls in at least one dataset, especially 20 genes, including 11 'overlapped' (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 'European-specific' (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 'Asian-specific' (RNASET2, HFE, BTN2A2, MAPK13) genes, whose differential expression was significant in at least three datasets. The protein expressions of two selected genes FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02) in plasma were significantly different in our in-house samples. Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes on RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and detected a long list of overlapped or ethnic specific RA
Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction

Energy Technology Data Exchange (ETDEWEB)

Wang, Jing; Ma, Zihao; Carr, Steven A.; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W.; Ellis, Matthew J. C.; Townsend, R. Reid; Smith, Richard D.; McDermott, Jason E.; Chen, Xian; Paulovich, Amanda G.; Boja, Emily S.; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Rodland, Karin D.; Liebler, Daniel C.; Zhang, Bing

2016-11-11

Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies
A relative variation-based method to unraveling gene regulatory networks.

Directory of Open Access Journals (Sweden)

Yali Wang

Full Text Available Gene regulatory network (GRN) reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that compared with complicated methods like those based on nonlinear differential equations, etc., methods based on a simple statistic, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, a relative expression level variation (RELV) based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, probability for the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate on the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
Identification of novel risk genes associated with type 1 diabetes mellitus using a genome-wide gene-based association analysis.

Science.gov (United States)

Qiu, Ying-Hua; Deng, Fei-Yan; Li, Min-Jing; Lei, Shu-Feng

2014-11-01

Type 1 diabetes mellitus is a serious disorder characterized by destruction of pancreatic β-cells, culminating in absolute insulin deficiency. Genetic factors contribute to the susceptibility of type 1 diabetes mellitus. The aim of the present study was to identify more susceptibility genes of type 1 diabetes mellitus. We carried out an initial gene-based genome-wide association study in a total of 4,075 type 1 diabetes mellitus cases and 2,604 controls by using the Gene-based Association Test using Extended Simes procedure. Furthermore, we carried out replication studies, differential expression analysis and functional annotation clustering analysis to support the significance of the identified susceptibility genes. We identified 452 genes associated with type 1 diabetes mellitus, even after adapting the genome-wide threshold for significance (P diabetes mellitus, which were ignored in single-nucleotide polymorphism-based association analysis and were not previously reported. We found that 53 genes have supportive evidence from replication studies and/or differential expression studies. In particular, seven genes including four non-human leukocyte antigen (HLA) genes (RASIP1, STRN4, BCAR1 and MYL2) are replicated in at least one independent population and also differentially expressed in peripheral blood mononuclear cells or monocytes. Furthermore, the associated genes tend to enrich in immune-related pathways or Gene Ontology project terms. The present results suggest the high power of gene-based association analysis in detecting disease-susceptibility genes. Our findings provide more insights into the genetic basis of type 1 diabetes mellitus.
Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures

Directory of Open Access Journals (Sweden)

Anders E. Berglund

2017-01-01

Full Text Available Background. Many gene-expression signatures exist for describing the biological state of profiled tumors. Principal Component Analysis (PCA) can be used to summarize a gene signature into a single score. Our hypothesis is that gene signatures can be validated when applied to new datasets, using inherent properties of PCA. Results. This validation is based on four key concepts. Coherence: elements of a gene signature should be correlated beyond chance. Uniqueness: the general direction of the data being examined can drive most of the observed signal. Robustness: if a gene signature is designed to measure a single biological effect, then this signal should be sufficiently strong and distinct compared to other signals within the signature. Transferability: the derived PCA gene signature score should describe the same biology in the target dataset as it does in the training dataset. Conclusions. The proposed validation procedure ensures that PCA-based gene signatures perform as expected when applied to datasets other than those that the signatures were trained upon. Complex signatures, describing multiple independent biological components, are also easily identified.
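Summarizing a signature into a single PCA score can be sketched as projecting each sample onto the first principal component of the signature genes. The block below is a minimal pure-Python illustration using power iteration on the covariance matrix; it also reports the fraction of variance captured by that component as a rough indicator of how dominant the signal is. It does not implement the four validation criteria from the abstract, and the function name and expression values are hypothetical.

```python
from math import sqrt


def pc1_signature_score(samples, n_iter=500):
    """samples: one row per sample, one column per signature gene.
    Returns (scores, loadings, explained_fraction) for the first component."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Power iteration; assumes the start vector is not orthogonal to PC1.
    v = [1.0 / sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return scores, v, eigenvalue / total_variance


# Hypothetical expression of three signature genes in five samples.
samples = [
    [1.0, 2.0, 1.5],
    [2.0, 4.1, 3.0],
    [3.0, 6.0, 4.4],
    [4.0, 8.2, 6.1],
    [5.0, 9.9, 7.5],
]
scores, loadings, explained = pc1_signature_score(samples)
```

Note that the sign of a principal component is arbitrary in general, so scores from different datasets should be oriented consistently before they are compared.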
Ranking candidate disease genes from gene expression and protein interaction: a Katz-centrality based approach.

Directory of Open Access Journals (Sweden)

Jing Zhao

Full Text Available Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data of gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions: that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, co-occurrence and shared pathophysiology between different disorders.
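The network half of this idea can be sketched with plain Katz centrality on a weighted interaction graph. This is a minimal illustration only: the abstract does not give the exact scoring formula, so the way expression evidence would enter (for example through a per-gene `beta`) is left out, and the function name, parameters and toy network below are all hypothetical.

```python
def katz_centrality(adj, alpha=0.1, beta=1.0, iters=100):
    """Katz scores by fixed-point iteration of x = beta + alpha * A x.

    adj maps each node to a dict of neighbor -> edge weight. For an
    undirected network, list every edge in both directions. The iteration
    converges only if alpha is below 1 / (largest eigenvalue of A).
    """
    scores = {node: 0.0 for node in adj}
    for _ in range(iters):
        scores = {
            node: beta + alpha * sum(w * scores[nbr] for nbr, w in adj[node].items())
            for node in adj
        }
    return scores


# Hypothetical toy interaction network (a chain G2 - G1 - G3 - G4).
network = {
    "G1": {"G2": 1.0, "G3": 1.0},
    "G2": {"G1": 1.0},
    "G3": {"G1": 1.0, "G4": 1.0},
    "G4": {"G3": 1.0},
}
scores = katz_centrality(network)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes in the middle of the chain collect contributions from more walks than the end nodes, so they rank higher.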

Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

Directory of Open Access Journals (Sweden)

Nolan Priedigkeit

2015-02-01

Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
Canonical correlation analysis for gene-based pleiotropy discovery.

Directory of Open Access Journals (Sweden)

Jose A Seoane

2014-10-01

Full Text Available Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.
RNAi-based silencing of genes encoding the vacuolar-ATPase ...

African Journals Online (AJOL)

RNAi-based silencing of genes encoding the vacuolar-ATPase subunits a and c in pink bollworm (Pectinophora gossypiella). Ahmed M. A. Mohammed. Abstract. RNA interference is a post-transcriptional gene regulation mechanism that is predominantly found in eukaryotic organisms. RNAi demonstrated a successful ...
New Genome Similarity Measures based on Conserved Gene Adjacencies.

Science.gov (United States)

Doerr, Daniel; Kowada, Luis Antonio B; Araujo, Eloi; Deshpande, Shachi; Dantas, Simone; Moret, Bernard M E; Stoye, Jens

2017-06-01

Many important questions in molecular biology, evolution, and biomedicine can be addressed by comparative genomic approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example, to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomic methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this article, we study an intermediate approach between family-based and family-free genomic similarity measures. Introducing this simpler model, called gene connections, we focus on the combinatorial aspects of gene family-free genome comparison. While in most cases, the computational costs to the general family-free case are the same, we also find an instance where the gene connections model has lower complexity. Within the gene connections model, we define three variants of genomic similarity measures that have different expression powers. We give polynomial-time algorithms for two of them, while we show NP-hardness for the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.
Expression-based clustering of CAZyme-encoding genes of Aspergillus niger.

Science.gov (United States)

Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P

2017-11-23

The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to the presence of plant biomass related carbon sources were analyzed of a wild-type strain N402 that was grown on a large range of carbon sources and of the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX that were grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes, which goes beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified, based on their expression profile. Notably, in several cases the expression profile puts questions on the function assignment of uncharacterized genes that was based on homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of enzymes encoded by these non-characterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and a therefore even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight in the complex regulatory systems that drive the conversion of plant biomass by fungi. In
Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

Directory of Open Access Journals (Sweden)

Chen Xin

2012-10-01

Full Text Available Background. The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods. In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results. We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions. Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

Science.gov (United States)

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework and offers an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene
DNA Array-Based Gene Profiling

Science.gov (United States)

Mocellin, Simone; Provenzano, Maurizio; Rossi, Carlo Riccardo; Pilati, Pierluigi; Nitti, Donato; Lise, Mario

2005-01-01

Cancer is a heterogeneous disease in most respects, including its cellularity, different genetic alterations, and diverse clinical behaviors. Traditional molecular analyses are reductionist, assessing only 1 or a few genes at a time, thus working with a biologic model too specific and limited to confront a process whose clinical outcome is likely to be governed by the combined influence of many genes. The potential of functional genomics is enormous, because for each experiment, thousands of relevant observations can be made simultaneously. Accordingly, DNA array, like other high-throughput technologies, might catalyze and ultimately accelerate the development of knowledge in tumor cell biology. Although in its infancy, the implementation of DNA array technology in cancer research has already provided investigators with novel data and intriguing new hypotheses on the molecular cascade leading to carcinogenesis, tumor aggressiveness, and sensitivity to antiblastic agents. Given the revolutionary implications that the use of this technology might have in the clinical management of patients with cancer, principles of DNA array-based tumor gene profiling need to be clearly understood for the data to be correctly interpreted and appreciated. In the present work, we discuss the technical features characterizing this powerful laboratory tool and review the applications so far described in the field of oncology. PMID:15621987
A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer

Directory of Open Access Journals (Sweden)

Mary Qu Yang

Full Text Available Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatic mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatic mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways. Keywords: ccRCC, Causative mutation, Pathways, Protein-protein interaction, Gene module, eQTL
Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

Directory of Open Access Journals (Sweden)

Xiaobo Guo

Full Text Available Nonlinear dependence is common in the regulation mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC) has been shown to be powerful and computationally effective at detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underlying distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated datasets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference.
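To make the dependence measure concrete, here is a minimal sketch of the sample distance correlation for two one-dimensional expression profiles, using the standard double-centered distance matrices. It is not the CLR-DC, MRNET-DC or REL-DC pipeline itself; the function names and the toy data are hypothetical.

```python
import math


def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]  # equals column means (d is symmetric)
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length sequences."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(a[i][j] * b[i][j] for i, j in pairs) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i, j in pairs) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


# Hypothetical regulator/target pair with a purely quadratic relationship.
regulator = [-2.0, -1.0, 0.0, 1.0, 2.0]
target = [v ** 2 for v in regulator]
dc = distance_correlation(regulator, target)
```

The Pearson correlation of this pair is zero, while the distance correlation is clearly positive, which is the property that motivates using it for nonlinear regulatory relationships.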
A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

Directory of Open Access Journals (Sweden)

Qiang Su

2017-01-01

Full Text Available Background. To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects distinguished genes first using the K-S test, and then, it uses CFS to select genes from those selected by the K-S test. Results. We adopted support vector machines (SVM) as the classification tool and used the criteria of accuracy to evaluate the performance of the classifiers on the selected gene subsets. This approach compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms for 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than those of the K-S test, CFS, mRMR, and ReliefF algorithms. Conclusions. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
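The first stage of this two-step scheme can be sketched as follows: compute the two-sample K-S statistic for each gene between the two classes and rank genes by it. This is only an illustration under stated assumptions; the subsequent CFS step and the SVM evaluation are not reproduced, and the function names and expression values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for t in a + b:  # the maximum gap is attained at an observed value
        cdf_a = bisect_right(a, t) / len(a)
        cdf_b = bisect_right(b, t) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_genes(expression):
    """expression maps gene -> (values in class 1, values in class 2)."""
    return sorted(expression, key=lambda g: ks_statistic(*expression[g]), reverse=True)


# Hypothetical expression values (tumor samples, normal samples).
expression = {
    "geneA": ([5.1, 5.3, 5.2, 5.4], [1.0, 1.2, 1.1, 0.9]),
    "geneB": ([2.0, 2.1, 1.9, 2.2], [2.05, 1.95, 2.15, 2.0]),
}
ranked = rank_genes(expression)
```

Genes whose class distributions barely overlap get a statistic near 1 and move to the top of the list, which is the pool a redundancy-aware second stage would then prune.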
GOBO: gene expression-based outcome for breast cancer online.

Directory of Open Access Journals (Sweden)

Markus Ringnér

Full Text Available Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: 1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, 2) identification of co-expressed genes for creation of potential metagenes, 3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.
Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

Science.gov (United States)

Zhou, Xionghui; Liu, Juan

2014-01-01

Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
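The quantity behind a gene dependency pair can be sketched as the conditional mutual information I(A; phenotype | B) estimated from discretized data. This is a minimal plug-in estimator, assuming expression has already been binned into discrete levels; the abstract does not state the estimator or the thresholding rule the authors used, and the function name and toy data are hypothetical.

```python
from collections import Counter
import math


def conditional_mutual_information(a, p, b):
    """Plug-in estimate of I(A; P | B) in bits for discrete sequences."""
    n = len(a)
    n_apb = Counter(zip(a, p, b))
    n_ab = Counter(zip(a, b))
    n_pb = Counter(zip(p, b))
    n_b = Counter(b)
    cmi = 0.0
    for (ai, pi, bi), count in n_apb.items():
        # p(a,p,b) * log2[ p(a,p,b) p(b) / (p(a,b) p(p,b)) ], written with counts.
        ratio = count * n_b[bi] / (n_ab[(ai, bi)] * n_pb[(pi, bi)])
        cmi += (count / n) * math.log2(ratio)
    return cmi


# Hypothetical binarized data: the phenotype equals A XOR B, so gene A is
# informative about the phenotype only once gene B is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
cmi = conditional_mutual_information(gene_a, phenotype, gene_b)
```

In this toy case the marginal mutual information between gene A and the phenotype is zero, yet the conditional value is one full bit, which is the kind of dependence the pair (A,B) is meant to capture.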
A hybrid network-based method for the detection of disease-related genes

Science.gov (United States)

Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

2018-02-01

Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
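As an illustration of what network topological features can look like, the sketch below computes two common ones, degree and local clustering coefficient, for every protein in an undirected interaction list. The abstract does not name the features the authors used, so this choice is an assumption, and the function name and edge list are hypothetical.

```python
def topological_features(edges):
    """Per-node degree and local clustering coefficient for an undirected graph."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    features = {}
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        # Count edges that connect two neighbors of this node (each pair once).
        links = sum(1 for x in nbrs for y in nbrs if x < y and y in neighbors[x])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        features[node] = {"degree": k, "clustering": clustering}
    return features


# Hypothetical PPI edges: a triangle P1-P2-P3 with P4 attached to P3.
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
features = topological_features(ppi_edges)
```

Each node's feature dictionary could then be flattened into a row of a feature matrix for whatever classifier is used downstream.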
Weighted functional linear regression models for gene-based association analysis.

Science.gov (United States)

Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

2018-01-01

Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
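Weights defined by allele frequencies via the beta distribution can be sketched as the beta density evaluated at each variant's minor allele frequency. The shape parameters below (a = 1, b = 25) are an assumption chosen because they up-weight rare variants; the abstract does not state which parameters were used, and this is not the FREGAT interface. Function names and frequencies are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 <= x <= 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def variant_weights(minor_allele_freqs, a=1.0, b=25.0):
    """One weight per variant; a=1, b=25 is an assumed, rare-variant-favoring choice."""
    return [beta_pdf(maf, a, b) for maf in minor_allele_freqs]


# Hypothetical minor allele frequencies for three variants in one gene.
weights = variant_weights([0.01, 0.05, 0.30])
```

With these parameters the weight falls off steeply as frequency rises, so a variant at 1% frequency counts far more than one at 30%.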
Mesenchymal stem cell-based gene therapy: A promising therapeutic strategy.

Science.gov (United States)

Mohammadian, Mozhdeh; Abasi, Elham; Akbarzadeh, Abolfazl

2016-08-01

Mesenchymal stem cells (MSCs) are multipotent stromal cells that exist in bone marrow, fat, and so many other tissues, and can differentiate into a variety of cell types including osteoblasts, chondrocytes, and adipocytes, as well as myocytes and neurons. Moreover, they have great capacity for self-renewal while maintaining their multipotency. Their capacity for proliferation and differentiation, in addition to their immunomodulatory activity, makes them very promising candidates for cell-based regenerative medicine. Moreover, MSCs have the ability of mobilization to the site of damage; therefore, they can automatically migrate to the site of injury via their chemokine receptors following intravenous transplantation. In this respect, they can be applied for MSC-based gene therapy. In this new therapeutic method, genes of interest are introduced into MSCs via viral and non-viral-based methods that lead to transgene expression in them. Although stem cell-based gene therapy is a relatively new strategy, it lights a new hope for the treatment of a variety of genetic disorders. In the near future, MSCs can be of use in a vast number of clinical applications, because of their uncomplicated isolation, culture, and genetic manipulation. However, full consideration is still crucial before they are utilized for clinical trials, because the number of studies that signify the advantageous effects of MSC-based gene therapy are still limited.
An Entropy-based gene selection method for cancer classification using microarray data

Directory of Open Access Journals (Sweden)

Krishnan Arun

2005-03-01

Full Text Available Background. Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results. The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion. This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
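The relevance/redundancy trade-off can be sketched with a greedy selector that scores each candidate gene by its normalized mutual information with the class labels minus its mean normalized mutual information with the genes already chosen. This is a simplified illustration: the normalization used here (twice the mutual information divided by the sum of entropies) and the greedy rule are assumptions, the clustering, partitioning and leave-one-out stages are omitted, and expression is assumed to be discretized already. All names and data are hypothetical.

```python
from collections import Counter
import math


def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def normalized_mi(x, y):
    """Mutual information scaled to [0, 1] by the mean of the two entropies."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mi = hx + hy - entropy(list(zip(x, y)))
    return 2 * mi / (hx + hy)


def select_genes(genes, labels, k):
    """Greedily pick k genes with high relevance and low mean redundancy."""
    selected, candidates = [], list(genes)
    while candidates and len(selected) < k:
        def merit(g):
            relevance = normalized_mi(genes[g], labels)
            if not selected:
                return relevance
            redundancy = sum(normalized_mi(genes[g], genes[s]) for s in selected)
            return relevance - redundancy / len(selected)
        best = max(candidates, key=merit)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical discretized profiles: g2 duplicates g1, g3 carries new information.
labels = [0, 0, 1, 1, 2, 2]
genes = {
    "g1": [0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 1, 1, 1, 1],
    "g3": [0, 0, 0, 0, 1, 1],
}
chosen = select_genes(genes, labels, k=2)
```

Because g1 and g2 are identical, picking both would add no information, so the selector pairs one of them with g3 instead.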
A pathway-based network analysis of hypertension-related genes

Science.gov (United States)

Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

2016-02-01

Complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease caused by many reasons including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. By the threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.
Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

Science.gov (United States)

Hur, Junguk; Özgür, Arzucan; He, Yongqun

2017-03-14

Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this big network, a sub-network consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of
Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

Science.gov (United States)

Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

2005-07-15

A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors to cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in the model plant Arabidopsis thaliana that are inducible by a phytohormone, abscisic acid (ABA), and by abiotic stresses such as drought, cold and salinity. We first construct and analyze two ABA-specific cis-elements, the ABA-responsive element (ABRE) and its coupling element (CE), in A. thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A. thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
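As a rough illustration of the promoter-scanning step described above, the sketch below flags genes whose promoter contains both elements within a fixed distance of each other. The two regular expressions, the 50 bp spacing limit, the forward-strand-only search and the gene names are placeholders assumed for this example; they are not the ABRE and CE models constructed in the study.

```python
import re

# Placeholder motif patterns (assumptions, not the study's element models).
ABRE_PATTERN = re.compile(r"ACGTG[GT]C")
CE_PATTERN = re.compile(r"CACCG")

def has_module(promoter, max_gap=50):
    """True if an ABRE-like and a CE-like match start within max_gap bp of each other."""
    seq = promoter.upper()
    abre_starts = [m.start() for m in ABRE_PATTERN.finditer(seq)]
    ce_starts = [m.start() for m in CE_PATTERN.finditer(seq)]
    return any(abs(a - c) <= max_gap for a in abre_starts for c in ce_starts)

def predict_targets(promoters, max_gap=50):
    """Return the sorted IDs of genes whose promoter carries the two-element module."""
    return sorted(g for g, seq in promoters.items() if has_module(seq, max_gap))

# Hypothetical promoter sequences.
promoters = {
    "geneA": "TTT" + "ACGTGGC" + "AAAA" + "CACCG" + "TT",  # both elements, close together
    "geneB": "ACGTGGC" + "A" * 100 + "CACCG",              # both elements, too far apart
    "geneC": "TTTTCACCGTTT",                               # CE-like match only
}
targets = predict_targets(promoters)
```

A fuller version would also scan the reverse complement and rank candidates, for example by the number or quality of module matches.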

Evaluation of gene importance in microarray data based upon probability of selection

Directory of Open Access Journals (Sweden)

Fu Li M

2005-03-01

Full Text Available Abstract Background Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linking to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, it has posed a challenging issue to recognize disease-related gene expression patterns embedded in the microarray data. In selecting a small set of biologically significant genes for classifier design, the nature of high data dimensionality inherent in this problem creates a substantial amount of uncertainty. Results Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value in that a smaller value implies higher information content from information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. Conclusion In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.
GO(vis), a gene ontology visualization tool based on multi-dimensional values.

Science.gov (United States)

Ning, Zi; Jiang, Zhenran

2010-05-01

Most gene product similarity measurements concentrate on the information content of Gene Ontology (GO) terms or use a path-based similarity between GO terms, which may ignore other important information contained in the structure of the ontology. In our study, we integrate different GO similarity measure approaches to analyze the functional relationship of genes and gene products with a new triangle-based visualization tool called GO(Vis). The purpose of this tool is to demonstrate the effect of three important information factors when measuring the similarity between gene products. One advantage of this tool is that its important ratio can be adjusted to meet different measuring requirements according to the biological knowledge of each factor. The experimental results demonstrate that GO(Vis) can display diagrams of the functional relationship for gene products effectively.
FiGS: a filter-based gene selection workbench for microarray data

Directory of Open Access Journals (Sweden)

Yun Taegyun

2010-01-01

Full Text Available Abstract Background The selection of genes that discriminate disease classes from microarray data is widely used for the identification of diagnostic biomarkers. Although various gene selection methods are currently available and some of them have shown excellent performance, no single method can retain the best performance for all types of microarray datasets. It is desirable to use a comparative approach to find the best gene selection result after rigorous testing of different methodological strategies for a given microarray dataset. Results FiGS is a web-based workbench that automatically compares various gene selection procedures and provides the optimal gene selection result for an input microarray dataset. FiGS builds up diverse gene selection procedures by aligning different feature selection techniques and classifiers. In addition to the highly reputed techniques, FiGS diversifies the gene selection procedures by incorporating gene clustering options in the feature selection step and different data pre-processing options in the classifier training step. All candidate gene selection procedures are evaluated by the .632+ bootstrap errors and listed with their classification accuracies and selected gene sets. FiGS runs on parallelized computing nodes that support heavy computations. FiGS is freely accessible at http://gexp.kaist.ac.kr/figs. Conclusion FiGS is a web-based application that automates an extensive search for the optimized gene selection analysis for a microarray dataset in a parallel computing environment. FiGS will provide both an efficient and comprehensive means of acquiring optimal gene sets that discriminate disease states from microarray datasets.
The Arabidopsis co-expression tool (act): a WWW-based tool and database for microarray-based gene expression analysis

DEFF Research Database (Denmark)

Jen, C. H.; Manfield, I. W.; Michalopoulos, D. W.

2006-01-01

We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (act), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression... be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots...
Network-based association of hypoxia-responsive genes with cardiovascular diseases

International Nuclear Information System (INIS)

Wang, Rui-Sheng; Oldham, William M; Loscalzo, Joseph

2014-01-01

Molecular oxygen is indispensable for cellular viability and function. Hypoxia is a stress condition in which oxygen demand exceeds supply. Low cellular oxygen content induces a number of molecular changes to activate regulatory pathways responsible for increasing the oxygen supply and optimizing cellular metabolism under limited oxygen conditions. Hypoxia plays critical roles in the pathobiology of many diseases, such as cancer, heart failure, myocardial ischemia, stroke, and chronic lung diseases. Although the complicated associations between hypoxia and cardiovascular (and cerebrovascular) diseases (CVD) have been recognized for some time, there are few studies that investigate their biological link from a systems biology perspective. In this study, we integrate hypoxia genes, CVD genes, and the human protein interactome in order to explore the relationship between hypoxia and cardiovascular diseases at a systems level. We show that hypoxia genes are much closer to CVD genes in the human protein interactome than expected by chance. We also find that hypoxia genes play significant bridging roles in connecting different cardiovascular diseases. We construct a hypoxia-CVD bipartite network and find several interesting hypoxia-CVD modules with significant gene ontology similarity. Finally, we show that hypoxia genes tend to have more CVD interactors in the human interactome than in random networks of matching topology. Based on these observations, we can predict novel genes that may be associated with CVD. This network-based association study gives us a broad view of the relationships between hypoxia and cardiovascular diseases and provides new insights into the role of hypoxia in cardiovascular biology.
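The "closer than expected by chance" comparison above can be sketched as follows. This is only one plausible reading, assumed for illustration: proximity is taken as the mean hop distance from each gene in one set to the nearest gene in the other, and the null model draws random gene sets of the same size without matching degree or topology, which is simpler than the randomization the study describes. The graph is assumed to be connected, and the function names and toy network are hypothetical.

```python
import random
from collections import deque

def distance_to_nearest(graph, targets):
    """Multi-source BFS: hop distance from every reachable node to its nearest target."""
    dist = {t: 0 for t in targets}
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def mean_proximity(dist, genes):
    """Average distance from the given genes to the nearest target (connected graph assumed)."""
    return sum(dist[g] for g in genes) / len(genes)

def proximity_test(graph, set_a, set_b, n_random=1000, seed=0):
    """Return (observed proximity, empirical p-value) against random same-size gene sets."""
    dist = distance_to_nearest(graph, set_b)
    observed = mean_proximity(dist, set_a)
    rng = random.Random(seed)
    nodes = sorted(graph)
    hits = sum(
        1 for _ in range(n_random)
        if mean_proximity(dist, rng.sample(nodes, len(set_a))) <= observed
    )
    return observed, (hits + 1) / (n_random + 1)

# Toy "interactome": a chain of 20 nodes; both gene sets sit at one end.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 20] for i in range(20)}
observed, p_value = proximity_test(chain, set_a={2, 3}, set_b={0, 1})
```

Because both sets sit at the same end of the chain, the observed proximity (1.5 hops) is smaller than for most random pairs of nodes, giving a small empirical p-value.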
A sight on protein-based nanoparticles as drug/gene delivery systems.

Science.gov (United States)

Salatin, Sara; Jelvehgari, Mitra; Maleki-Dizaj, Solmaz; Adibkia, Khosro

2015-01-01

Polymeric nanomaterials have been extensively applied for the preparation of targeted and controlled release drug/gene delivery systems. However, problems involved in the formulation of synthetic polymers, such as the use of toxic solvents and surfactants, have limited their desirable applications. In this regard, natural biomolecules including proteins and polysaccharides are suitable alternatives due to their safety. According to the literature, protein-based nanoparticles possess many advantages for drug and gene delivery, such as biocompatibility, biodegradability and the ability to be functionalized with targeting ligands. This review provides a general overview of the application of biodegradable protein-based nanoparticles in drug/gene delivery based on their origins. Their unique physicochemical properties that help them to be formulated as pharmaceutical carriers are also discussed.
Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

Directory of Open Access Journals (Sweden)

Emre Guney

Full Text Available Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.
Statistics on gene-based laser speckles with a small number of scatterers: implications for the detection of polymorphism in the Chlamydia trachomatis omp1 gene

Science.gov (United States)

Ulyanov, Sergey S.; Ulianova, Onega V.; Zaytsev, Sergey S.; Saltykov, Yury V.; Feodorova, Valentina A.

2018-04-01

The transformation mechanism for a nucleotide sequence of the Chlamydia trachomatis gene into a speckle pattern has been considered. The first and second-order statistics of gene-based speckles have been analyzed. It has been demonstrated that gene-based speckles do not obey Gaussian statistics and belong to the class of speckles with a small number of scatterers. It has been shown that gene polymorphism can be easily detected through analysis of the statistical characteristics of gene-based speckles.
Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

Directory of Open Access Journals (Sweden)

Mario Fruzangohar

Full Text Available The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on the Model-View-Controller architecture and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.
Hepatitis B virus DNA polymerase gene polymorphism based ...

African Journals Online (AJOL)

Hepatitis B virus DNA polymerase gene polymorphism based prediction of genotypes in chronic HBV patients from Western India. Yashwant G. Chavan, Sharad R. Pawar, Minal Wani, Amol D. Raut, Rabindra N. Misra ...
A robust approach based on Weibull distribution for clustering gene expression data

Directory of Open Access Journals (Sweden)

Gong Binsheng

2011-05-01

Full Text Available Abstract Background Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the gene expressions of individual genes are considered as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance compared to the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions The results demonstrate that our WDCM produces clusters
A Region-Based GeneSIS Segmentation Algorithm for the Classification of Remotely Sensed Images

Directory of Open Access Journals (Sweden)

Stelios K. Mylonas

2015-03-01

Full Text Available This paper proposes an object-based segmentation/classification scheme for remotely sensed images, based on a novel variant of the recently proposed Genetic Sequential Image Segmentation (GeneSIS) algorithm. GeneSIS segments the image in an iterative manner, whereby at each iteration a single object is extracted via a genetic-based object extraction algorithm. Contrary to the previous pixel-based GeneSIS, where the candidate objects to be extracted were evaluated through the fuzzy content of their included pixels, in the newly developed region-based GeneSIS algorithm a watershed-driven fine segmentation map is initially obtained from the original image, which serves as the basis for the forthcoming GeneSIS segmentation. Furthermore, in order to enhance the spatial search capabilities, we introduce a more descriptive encoding scheme in the object extraction algorithm, where the structural search modules are represented by polygonal shapes. Our objectives in the new framework are posed as follows: enhance the flexibility of the algorithm in extracting more flexible object shapes, assure high level classification accuracies, and reduce the execution time of the segmentation, while at the same time preserving all the inherent attributes of the GeneSIS approach. Finally, exploiting the inherent attribute of GeneSIS to produce multiple segmentations, we also propose two segmentation fusion schemes that operate on the ensemble of segmentations generated by GeneSIS. Our approaches are tested on an urban and two agricultural images. The results show that region-based GeneSIS has considerably lower computational demands compared to the pixel-based one. Furthermore, the suggested methods achieve higher classification accuracies and good segmentation maps compared to a series of existing algorithms.
Avirulence (AVR) Gene-Based Diagnosis Complements Existing Pathogen Surveillance Tools for Effective Deployment of Resistance (R) Genes Against Rice Blast Disease.

Science.gov (United States)

Selisana, S M; Yanoria, M J; Quime, B; Chaipanya, C; Lu, G; Opulencia, R; Wang, G-L; Mitchell, T; Correll, J; Talbot, N J; Leung, H; Zhou, B

2017-06-01

Avirulence (AVR) genes in Magnaporthe oryzae, the fungal pathogen that causes the devastating rice blast disease, have been documented to be major targets subject to mutations to avoid recognition by resistance (R) genes. In this study, an AVR-gene-based diagnosis tool for determining the virulence spectrum of a rice blast pathogen population was developed and validated. A set of 77 single-spore field isolates was subjected to pathotype analysis using differential lines, each containing a single R gene, and classified into 20 virulent pathotypes, except for 4 isolates that lost pathogenicity. In all, 10 differential lines showed low frequency (95%), inferring the effectiveness of R genes present in the respective differential lines. In addition, the haplotypes of seven AVR genes were determined by polymerase chain reaction amplification and sequencing, if applicable. The calculated frequency of different AVR genes displayed significant variations in the population. AVRPiz-t and AVR-Pii were detected in 100 and 84.9% of the isolates, respectively. Five AVR genes such as AVR-Pik-D (20.5%) and AVR-Pik-E (1.4%), AVRPiz-t (2.7%), AVR-Pita (0%), AVR-Pia (0%), and AVR1-CO39 (0%) displayed low or even zero frequency. The frequency of AVR genes correlated almost perfectly with the resistance frequency of the cognate R genes in differential lines, except for International Rice Research Institute-bred blast-resistant lines IRBLzt-T, IRBLta-K1, and IRBLkp-K60. Both genetic analysis and molecular marker validation revealed an additional R gene, most likely Pi19 or its allele, in these three differential lines. This can explain the spuriously higher resistance frequency of each target R gene based on conventional pathotyping. This study demonstrates that AVR-gene-based diagnosis provides a precise, R-gene-specific, and differential line-free assessment method that can be used for determining the virulence spectrum of a rice blast pathogen population and for predicting the
Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

Science.gov (United States)

Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

2018-04-01

Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health and more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing does not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
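The two distance measures contrasted above can be sketched in a few lines. This is an illustrative toy, not a typing pipeline: it assumes the core genome sequences are already aligned, skips ambiguous or gapped positions, and represents an allelic profile as a hypothetical mapping from locus name to allele number, with `None` for a missing call.

```python
def snp_distance(seq_a, seq_b):
    """Count differing positions in two aligned sequences, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1 for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != y and x in "ACGT" and y in "ACGT"
    )

def allelic_distance(profile_a, profile_b):
    """Count loci with different allele numbers, using only loci called in both profiles."""
    shared = [
        locus for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])

# Hypothetical isolates.
d_snp = snp_distance("ACGTACGT", "ACGTTCGA")          # differs at two positions
d_snp_masked = snp_distance("ACGTNCGT", "ACGTTCGA")   # the N position is skipped
d_allele = allelic_distance(
    {"locus1": 1, "locus2": 5, "locus3": 2},
    {"locus1": 1, "locus2": 7, "locus3": None},       # locus3 has no call
)
```

Note that several SNPs inside one gene change the SNP distance by several units but the allelic distance by only one, which is one reason cutoffs are not interchangeable between the two schemes.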
Thermodynamics-based models of transcriptional regulation with gene sequence.

Science.gov (United States)

Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

2015-12-01

Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.
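To make the general idea concrete, here is a minimal textbook-style sketch of a thermodynamic occupancy model coupled to a differential equation for mRNA. It is not the model proposed in the abstract: the single-site occupancy formula, the Boltzmann weight with energy in units of kT, the linear production and decay terms, the forward Euler integration and every parameter value are assumptions made for illustration.

```python
import math

def binding_constant(energy_kt):
    """Boltzmann weight exp(-E) for a binding energy E given in units of kT (assumed form)."""
    return math.exp(-energy_kt)

def occupancy(k, tf_conc):
    """Equilibrium probability that a single site is bound: K*c / (1 + K*c)."""
    x = k * tf_conc
    return x / (1.0 + x)

def simulate_mrna(alpha, delta, occ, m0=0.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dm/dt = alpha * occ - delta * m."""
    m = m0
    for _ in range(steps):
        m += dt * (alpha * occ - delta * m)
    return m

# Hypothetical parameters: binding energy -ln(3) kT gives K = 3; TF concentration 1.
k = binding_constant(-math.log(3.0))
occ = occupancy(k, tf_conc=1.0)          # about 0.75
alpha, delta = 2.0, 0.5                  # production and decay rates (arbitrary units)
m_final = simulate_mrna(alpha, delta, occ)
m_steady = alpha * occ / delta           # analytical steady state
```

Under these assumptions the simulated mRNA level converges to the analytical steady state `alpha * occ / delta`, so stronger binding (a more negative energy) raises expression through the occupancy term.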
Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

Science.gov (United States)

Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

2016-06-01

Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a singular framework to enhance interpretation and hypotheses generation. We propose a workflow that merges network construction with gene expression data mining focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application software (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
A meta-analysis based method for prioritizing candidate genes involved in a pre-specific function

Directory of Open Access Journals (Sweden)

Jingjing Zhai

2016-12-01

Full Text Available The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is that of unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Towards this goal, we developed a novel gene prioritization method named RafSee to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the ranks of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The ‘leave-one-out’ cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.
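The rank-fusion step can be illustrated with a deliberately simpler stand-in for the order statistics-based meta-analysis named above: each gene is scored by the geometric mean of its normalized ranks across the input rankings. The function name, the tie-breaking rule and the example rankings are hypothetical, and every ranking is assumed to contain the same genes.

```python
import math

def aggregate_ranks(rankings):
    """Fuse rankings (best first) by the geometric mean of normalized ranks; lower is better."""
    genes = rankings[0]
    n = len(genes)
    scores = {}
    for gene in genes:
        ratios = [(ranking.index(gene) + 1) / n for ranking in rankings]
        scores[gene] = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(genes, key=lambda g: (scores[g], g))

# Hypothetical outputs of a network-based ranker and a feature-based ranker.
network_ranking = ["g1", "g2", "g3", "g4"]
feature_ranking = ["g3", "g1", "g2", "g4"]
fused = aggregate_ranks([network_ranking, feature_ranking])
```

Here `g3` moves ahead of `g2` in the fused list because the second ranker places it first, which is the kind of complementarity a fusion step is meant to exploit.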
Gene set-based module discovery in the breast cancer transcriptome

Directory of Open Access Journals (Sweden)

Zhang Michael Q

2009-02-01

Full Text Available Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated to the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.
Rational design of gene-based vaccines.

Science.gov (United States)

Barouch, Dan H

2006-01-01

Vaccine development has traditionally been an empirical discipline. Classical vaccine strategies include the development of attenuated organisms, whole killed organisms, and protein subunits, followed by empirical optimization and iterative improvements. While these strategies have been remarkably successful for a wide variety of viruses and bacteria, these approaches have proven more limited for pathogens that require cellular immune responses for their control. In this review, current strategies to develop and optimize gene-based vaccines are described, with an emphasis on novel approaches to improve plasmid DNA vaccines and recombinant adenovirus vector-based vaccines. Copyright 2006 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Detection of Gene Interactions Based on Syntactic Relations

Directory of Open Access Journals (Sweden)

Mi-Young Kim

2008-01-01

Full Text Available Interactions between proteins and genes are considered essential in the description of biomolecular phenomena, and networks of interactions are applied in a systems biology approach. Recently, many studies have sought to extract information from biomolecular text using natural language processing technology. Previous studies have asserted that linguistic information is useful for improving the detection of gene interactions. In particular, syntactic relations among linguistic information are good for detecting gene interactions. However, previous systems give reasonably good precision but poor recall. To improve recall without sacrificing precision, this paper proposes a three-phase method for detecting gene interactions based on syntactic relations. In the first phase, we retrieve syntactic encapsulation categories for each candidate agent and target. In the second phase, we construct a verb list that indicates the nature of the interaction between pairs of genes. In the last phase, we determine direction rules to detect which of two genes is the agent or target. Even without biomolecular knowledge, our method performs reasonably well using a small training dataset. While the first phase contributes to improving recall, the second and third phases contribute to improving precision. In the experimental results using ICML 05 Workshop on Learning Language in Logic (LLL05) data, our proposed method gave an F-measure of 67.2% for the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance.

Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile.

Science.gov (United States)

Lin, Zhe; Lin, Yongsheng

2017-09-05

The aim of this study was to explore potential crucial genes associated with steroid-induced necrosis of the femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. The gene expression profile GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples, was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using the LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and Cytoscape. In total, 59 up-regulated DEGs and 156 downregulated DEGs were identified. The up-regulated DEGs were mainly involved in functions related to immunity (e.g. Fcer1A and Il7R), and the downregulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 outstanding genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 downregulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions like muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R) and the DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progression of SINFH, which still needs to be confirmed by experiments. Copyright © 2017 Elsevier B.V. All rights reserved.
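As an illustration of the degree centrality ranking mentioned in the abstract above, the following minimal Python sketch computes normalized degree centrality from an undirected edge list. The toy edges are assumptions for illustration only: the triangle among Tnni2, Mylpf and Myl1 reflects the abstract's statement that these genes interact with each other, while `GeneX` is a hypothetical node, and none of this reproduces the study's 123-node STRING network.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for an undirected edge list.

    Each node's degree is divided by (n - 1), where n is the number of
    nodes that appear in at least one edge. Self-loops are ignored.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nb) / (n - 1) for node, nb in neighbours.items()}


# Toy network (illustrative only, not data from the study).
toy_edges = [
    ("Tnni2", "Mylpf"),
    ("Tnni2", "Myl1"),
    ("Mylpf", "Myl1"),
    ("Tnni2", "GeneX"),  # GeneX is a hypothetical placeholder node
]
centrality = degree_centrality(toy_edges)
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy network `Tnni2` touches every other node, so it ranks first; a real analysis would load the full interaction list exported from a PPI resource.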
Actionable gene-based classification toward precision medicine in gastric cancer

Directory of Open Access Journals (Sweden)

Hiroshi Ichikawa

2017-10-01

Full Text Available Abstract Background Intertumoral heterogeneity represents a significant hurdle to identifying optimized targeted therapies in gastric cancer (GC). To realize precision medicine for GC patients, an actionable gene alteration-based molecular classification that directly associates GCs with targeted therapies is needed. Methods A total of 207 Japanese patients with GC were included in this study. Formalin-fixed, paraffin-embedded (FFPE) tumor tissues were obtained from surgical or biopsy specimens and were subjected to DNA extraction. We generated comprehensive genomic profiling data using a 435-gene panel including 69 actionable genes paired with US Food and Drug Administration-approved targeted therapies, and the evaluation of Epstein-Barr virus (EBV) infection and microsatellite instability (MSI) status. Results Comprehensive genomic sequencing detected at least one alteration of 435 cancer-related genes in 194 GCs (93.7%) and of 69 actionable genes in 141 GCs (68.1%). We classified the 207 GCs into four The Cancer Genome Atlas (TCGA) subtypes using the genomic profiling data: EBV (N = 9), MSI (N = 17), chromosomal instability (N = 119), and genomically stable subtype (N = 62). Actionable gene alterations were not specific and were widely observed throughout all TCGA subtypes. To discover a novel classification which more precisely selects candidates for targeted therapies, 207 GCs were classified using hypermutated phenotype and the mutation profile of 69 actionable genes. We identified a hypermutated group (N = 32), while the others (N = 175) were sub-divided into six clusters including five with actionable gene alterations: ERBB2 (N = 25), CDKN2A and CDKN2B (N = 10), KRAS (N = 10), BRCA2 (N = 9), and ATM cluster (N = 12). The clinical utility of this classification was demonstrated by a case of unresectable GC with a remarkable response to anti-HER2 therapy in the ERBB2 cluster. Conclusions This actionable gene-based
GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

Science.gov (United States)

Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

2010-04-01

A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
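The standard ORA that the abstract above contrasts with GO-Bayes can be sketched as a one-sided hypergeometric test applied to a single GO term. The snippet below is a minimal illustration of that baseline only, not of the GO-Bayes method itself; the function name and the example counts are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for a hypergeometric variable X.

    N: genes in the background universe
    K: background genes annotated to the GO term
    n: genes in the selected (e.g. differentially expressed) list
    k: selected genes annotated to the GO term
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return favourable / total


# Hypothetical counts: 20 background genes, 5 annotated to the term,
# 4 genes selected, all 4 of them annotated.
p_value = hypergeom_upper_tail(4, 20, 5, 4)
```

Because each term is tested in isolation here, the dependence between parent, child and sibling GO terms is ignored, which is the limitation the abstract describes.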
Gene-based meta-analysis of genome-wide association studies implicates new loci involved in obesity

DEFF Research Database (Denmark)

Hägg, Sara; Ganna, Andrea; Van Der Laan, Sander W

2015-01-01

) approach to assign variants to genes and to calculate gene-based P-values based on simulations. The VEGAS method was applied to each cohort separately before a gene-based meta-analysis was performed. In Stage 1, two known (FTO and TMEM18) and six novel (PEX2, MTFR2, SSFA2, IARS2, CEP295 and TXNDC12) loci...
Analysis of regulatory networks constructed based on gene ...

Indian Academy of Sciences (India)

2013-12-09

Dec 9, 2013 ... early diagnosis of complex diseases or cancer without obvious symptoms. [Gong J., Diao B., Yao G. J., ... expression levels of thousands of genes in a specific cell or tissue. Previous ..... base of the brain. It mainly controls the ...
Targeted delivery of genes to endothelial cells and cell- and gene-based therapy in pulmonary vascular diseases.

Science.gov (United States)

Suen, Colin M; Mei, Shirley H J; Kugathasan, Lakshmi; Stewart, Duncan J

2013-10-01

Pulmonary arterial hypertension (PAH) is a devastating disease that, despite significant advances in medical therapies over the last several decades, continues to have an extremely poor prognosis. Gene therapy is a method to deliver therapeutic genes to replace defective or mutant genes or supplement existing cellular processes to modify disease. Over the last few decades, several viral and nonviral methods of gene therapy have been developed for preclinical PAH studies with varying degrees of efficacy. However, these gene delivery methods face challenges of immunogenicity, low transduction rates, and nonspecific targeting which have limited their translation to clinical studies. More recently, the emergence of regenerative approaches using stem and progenitor cells such as endothelial progenitor cells (EPCs) and mesenchymal stem cells (MSCs) have offered a new approach to gene therapy. Cell-based gene therapy is an approach that augments the therapeutic potential of EPCs and MSCs and may deliver on the promise of reversal of established PAH. These new regenerative approaches have shown tremendous potential in preclinical studies; however, large, rigorously designed clinical studies will be necessary to evaluate clinical efficacy and safety. © 2013 American Physiological Society. Compr Physiol 3:1749-1779, 2013.
Hessian regularization based non-negative matrix factorization for gene expression data clustering.

Science.gov (United States)

Liu, Xiao; Shi, Jun; Wang, Congzhi

2015-01-01

Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze gene expression data. Data representation plays an important role in clustering analysis. Non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method whose natural properties give it stronger extrapolating power, especially for small sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for a subsequent clustering task. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and original NMF, which suggests the potential application of HR-NMF for gene expression data.
Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

Directory of Open Access Journals (Sweden)

Keita Mori

2013-01-01

Full Text Available Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.
Gene ontology based transfer learning for protein subcellular localization

Directory of Open Access Journals (Sweden)

Zhou Shuigeng

2011-02-01

Full Text Available Abstract Background Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, both of which cannot estimate the individual discriminative abilities of the three aspects of gene ontology. Results In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for
Pairagon+N-SCAN_EST: a model-based gene annotation pipeline

DEFF Research Database (Denmark)

Arumugam, Manimozhiyan; Wei, Chaochun; Brown, Randall H

2006-01-01

This paper describes Pairagon+N-SCAN_EST, a gene annotation pipeline that uses only native alignments. For each expressed sequence it chooses the best genomic alignment. Systems like ENSEMBL and ExoGean rely on trans alignments, in which expressed sequences are aligned to the genomic loci...... with de novo gene prediction by using N-SCAN_EST. N-SCAN_EST is based on a generalized HMM probability model augmented with a phylogenetic conservation model and EST alignments. It can predict complete transcripts by extending or merging EST alignments, but it can also predict genes in regions without EST...
A Fisheye Viewer for microarray-based gene expression data.

Science.gov (United States)

Wu, Min; Thao, Cheng; Mu, Xiangming; Munson, Ethan V

2006-10-13

Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface--an electronic table (E-table) that uses fisheye distortion technology. The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.
A modular positive feedback-based gene amplifier

Directory of Open Access Journals (Sweden)

Bhalerao Kaustubh D

2010-02-01

Full Text Available Abstract Background Positive feedback is a common mechanism used in the regulation of many gene circuits as it can amplify the response to inducers and also generate binary outputs and hysteresis. In the context of electrical circuit design, positive feedback is often considered in the design of amplifiers. Similar approaches, therefore, may be used for the design of amplifiers in synthetic gene circuits with applications, for example, in cell-based sensors. Results We developed a modular positive feedback circuit that can function as a genetic signal amplifier, heightening the sensitivity to inducer signals as well as increasing maximum expression levels without the need for an external cofactor. The design utilizes a constitutively active, autoinducer-independent variant of the quorum-sensing regulator LuxR. We experimentally tested the ability of the positive feedback module to separately amplify the output of a one-component tetracycline sensor and a two-component aspartate sensor. In each case, the positive feedback module amplified the response to the respective inducers, both with regards to the dynamic range and sensitivity. Conclusions The advantage of our design is that the actual feedback mechanism depends only on a single gene and does not require any other modulation. Furthermore, this circuit can amplify any transcriptional signal, not just one encoded within the circuit or tuned by an external inducer. As our design is modular, it can potentially be used as a component in the design of more complex synthetic gene circuits.
Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

Science.gov (United States)

Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M; Krasnogor, Natalio

2012-01-01

Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.
An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

Science.gov (United States)

Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo

2014-01-01

Objective In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further
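To make the random walk with restart step named in the abstract above more concrete, here is a minimal Python sketch on a single unweighted network. It is an illustration under stated assumptions, not the authors' implementation: the adjacency dictionary is assumed to be symmetric, the restart probability of 0.3 is an arbitrary choice, and the kernelized score functions and network integration are not shown.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adj:   dict mapping each node to a collection of neighbours
           (assumed symmetric, i.e. an undirected network)
    seeds: known disease genes; the walker restarts uniformly among them
    """
    nodes = list(adj)
    seeds = [s for s in seeds if s in adj]
    if not seeds:
        raise ValueError("at least one seed must be present in the network")
    p0 = {v: 0.0 for v in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u]
            if adj[u]:
                # spread the walking mass evenly over the neighbours
                for v in adj[u]:
                    nxt[v] += share / len(adj[u])
            else:
                nxt[u] += share  # isolated node keeps its own mass
        change = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path network a - b - c with a single seed gene "a" (hypothetical).
toy_adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = random_walk_with_restart(toy_adj, seeds=["a"])
```

Candidate genes would then be ranked by score after removing the seeds; on the toy path the score decreases with distance from the seed.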
Molecular typing of Staphylococcus aureus based on coagulase gene.

Science.gov (United States)

Javid, Faizan; Taku, Anil; Bhat, Mohd Altaf; Badroo, Gulzar Ahmad; Mudasir, Mir; Sofi, Tanveer Ahmad

2018-04-01

This study was conducted to examine the coagulase gene-based genetic diversity of Staphylococcus aureus isolated from different samples of cattle using restriction fragment length polymorphism (RFLP) and sequence-based phylogenetic analysis. A total of 192 different samples from mastitic milk, nasal cavity, and pus from skin wounds of cattle from Military Dairy Farm, Jammu, India, were screened for the presence of S. aureus. The presumptive isolates were confirmed by nuc gene-based polymerase chain reaction (PCR). The confirmed S. aureus isolates were subjected to coagulase (coa) gene PCR. The different coa genotypes observed were subjected to RFLP using the restriction enzymes HaeIII and AluI to obtain the different restriction patterns. One isolate from each restriction pattern was sequenced. These sequences were aligned for maximum homology using the BioEdit software, and similarity among the sequences was inferred with the help of a sequence identity matrix. Of 192 different samples, 39 (20.31%) isolates of S. aureus were confirmed by targeting the nuc gene using PCR. Of 39 S. aureus isolates, 25 (64.10%) carried the coa gene. Four different genotypes of the coa gene, i.e., 514 bp, 595 bp, 757 bp, and 802 bp, were obtained. Two coa genotypes, 595 bp (15 isolates) and 802 bp (4 isolates), were observed in mastitic milk. The 514 bp (2 isolates) and 757 bp (4 isolates) coa genotypes were observed from nasal cavity and pus from skin wounds, respectively. On RFLP using both restriction enzymes, four different restriction patterns P1, P2, P3, and P4 were observed. On sequencing, four different sequences having unique restriction patterns were obtained. The most similar sequences, with an identity value of 0.810, were those of isolates S. aureus 514 (nasal cavity) and S. aureus 595 (mastitic milk), and thus they are the most closely related, while the most distant sequences, with a value of 0.483, were those of the S. aureus 514 and S. aureus 802 isolates. The study, being localized
A network-based gene expression signature informs prognosis and treatment for colorectal cancer patients.

Directory of Open Access Journals (Sweden)

Mingguang Shi

Full Text Available Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC) patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes. We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002). Similar results were obtained with reversed training and testing datasets (p = 0.007). Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006), but not beneficial to the low-risk patients (p = 0.491). The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network-based
Ortholog-based screening and identification of genes related to intracellular survival.

Science.gov (United States)

Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

2018-04-20

Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.
Comparison of lists of genes based on functional profiles

Directory of Open Access Journals (Sweden)

Salicrú Miquel

2011-10-01

Full Text Available Abstract Background How to compare studies on the basis of their biological significance is a problem of central importance in high-throughput genomics. Many methods for performing such comparisons are based on the information in databases of functional annotation, such as those that form the Gene Ontology (GO). Typically, they consist of analyzing gene annotation frequencies in some pre-specified GO classes, in a class-by-class way, followed by p-value adjustment for multiple testing. Enrichment analysis, where a list of genes is compared against a wider universe of genes, is the most common example. Results A new global testing procedure and a method incorporating it are presented. Instead of testing separately for each GO class, a single global test for all classes under consideration is performed. The test is based on the distance between the functional profiles, defined as the joint frequencies of annotation in a given set of GO classes. These classes may be chosen at one or more GO levels. The new global test is more powerful and accurate with respect to type I errors than the usual class-by-class approach. When applied to some real datasets, the results suggest that the method may also provide useful information that complements the tests performed using a class-by-class approach if gene counts are sparse in some classes. An R library, goProfiles, implements these methods and is available from Bioconductor, http://bioconductor.org/packages/release/bioc/html/goProfiles.html. Conclusions The method provides an inferential basis for deciding whether two lists are functionally different. For global comparisons it is preferable to the global chi-square test of homogeneity. Furthermore, it may provide additional information if used in conjunction with class-by-class methods.
Candidate genes and pathogenesis investigation for sepsis-related acute respiratory distress syndrome based on gene expression profile.

Science.gov (United States)

Wang, Min; Yan, Jingjun; He, Xingxing; Zhong, Qiang; Zhan, Chengye; Li, Shusheng

2016-04-18

Acute respiratory distress syndrome (ARDS) is a potentially devastating form of acute inflammatory lung injury as well as a major cause of acute respiratory failure. Although researchers have made significant progress in elucidating the pathophysiology of this complex syndrome over the years, the absence of a universally accepted, detailed disease mechanism has so far led to a series of practical problems for definitive treatment. This study aimed to predict genes or pathways associated with sepsis-related ARDS based on a public microarray dataset and to further explore the molecular mechanism of ARDS. A total of 122 up-regulated and 91 down-regulated differentially expressed genes (DEGs) were obtained. The up- and down-regulated DEGs were mainly involved in functions like mitotic cell cycle and pathways like cell cycle. Analysis of the protein-protein interaction network of ARDS revealed 20 hub genes including cyclin B1 (CCNB1), cyclin B2 (CCNB2) and topoisomerase II alpha (TOP2A). A total of seven transcription factors including forkhead box protein M1 (FOXM1) and 30 target genes were revealed in the transcription factor-target gene regulation network. Furthermore, co-cited gene pairs including CCNB2-CCNB1 were revealed by literature mining of the relations among ARDS-related genes. Pathways like mitotic cell cycle were closely related to the development of ARDS. Genes including CCNB1, CCNB2 and TOP2A, as well as transcription factors like FOXM1, might be used as novel gene therapy targets for sepsis-related ARDS.
Minimal gene selection for classification and diagnosis prediction based on gene expression profile

Directory of Open Access Journals (Sweden)

Alireza Mehridehnavi

2013-01-01

Conclusion: We have shown that using the two most significant genes, ranked by their S/N ratios, together with a suitable selection of training samples, can classify DLBCL patients with rather good results. With these methods we could compensate for the small number of patients, improve classification accuracy, and reduce computational complexity and hence running time.
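A ranking of genes by S/N ratio can be sketched as follows. The record does not give the formula, so this sketch assumes the commonly used signal-to-noise statistic (difference of class means divided by the sum of class standard deviations); the gene names and expression values are invented.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Assumed S/N statistic: (mean_a - mean_b) / (sd_a + sd_b).

    Needs at least two samples per class and a non-zero total spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def top_genes(expression, labels, k=2):
    """Return the k genes with the largest absolute S/N ratio."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = abs(signal_to_noise(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: three genes measured in six samples (two classes of three).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    "geneB": [2.0, 3.0, 1.0, 2.5, 1.5, 3.5],
    "geneC": [4.0, 4.5, 3.5, 1.0, 1.5, 0.5],
}
selected = top_genes(expression, labels, k=2)
```

Keeping only the top two genes, as the conclusion describes, reduces the classifier input to a two-dimensional problem, which is where the savings in computation come from.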

Cellular automata-based artificial life system of horizontal gene transfer

Directory of Open Access Journals (Sweden)

Ji-xin Liu

2016-02-01

Full Text Available Mutation and natural selection are the core of Darwin's idea of evolution, and many algorithms and models are based on this idea. However, a growing body of research on the evolution of prokaryotes indicates that horizontal gene transfer (HGT) may be much more important and universal than the authors had imagined. Owing to this mechanism, prokaryotes not only become adaptable to nearly any environment on Earth, but also form a global genetic bank and a super communication network with all the genes of the prokaryotic world. Against this background, they present a novel cellular automata model, general gene transfer, to simulate and study vertical gene transfer and HGT in prokaryotes. At the same time, they use Schrödinger's life theory to formulate some evaluation indices and to discuss the intelligence and cognition of prokaryotes that are derived from HGT.
A fisheye viewer for microarray-based gene expression data

Directory of Open Access Journals (Sweden)

Munson Ethan V

2006-10-01

Full Text Available Abstract Background: Microarrays have been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface: an electronic table (E-table) that uses fisheye distortion technology. Results: The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. Conclusion: This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table.
Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data.

Directory of Open Access Journals (Sweden)

Enrico Glaab

Full Text Available Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.
Development of an ELA-DRA gene typing method based on pyrosequencing technology.

Science.gov (United States)

Díaz, S; Echeverría, M G; It, V; Posik, D M; Rogberg-Muñoz, A; Pena, N L; Peral-García, P; Vega-Pla, J L; Giovambattista, G

2008-11-01

The polymorphism of the equine lymphocyte antigen (ELA) class II DRA gene had been detected by polymerase chain reaction-single-strand conformational polymorphism (PCR-SSCP) and reference strand-mediated conformation analysis. These methodologies allowed the identification of 11 ELA-DRA exon 2 sequences, three of which are widely distributed among domestic horse breeds. Herein, we describe the development of a pyrosequencing-based method applicable to ELA-DRA typing, by screening samples from eight different horse breeds previously typed by PCR-SSCP. This sequence-based method would be useful in high-throughput genotyping of major histocompatibility complex genes in horses and other animal species, making this system interesting as a rapid screening method for animal genotyping of immune-related genes.
Environmental Application of Reporter-Genes Based Biosensors for Chemical Contamination Screening

Directory of Open Access Journals (Sweden)

Matejczyk Marzena

2014-12-01

Full Text Available The paper presents results of research on possible applications of reporter-gene-based microorganisms, including a selective presentation of the advantages and drawbacks of recent methodological solutions for constructing the genetic systems of biosensing elements for environmental research. The most robust and popular genetic fusions and new trends in reporter gene technology are described, such as LacZ (β-galactosidase), xylE (catechol 2,3-dioxygenase), gfp (green fluorescent proteins and its mutated forms), lux (prokaryotic luciferase), luc (eukaryotic luciferase), phoA (alkaline phosphatase), gusA and gurA (β-glucuronidase), and antibiotic and heavy metal resistance. Reporter-gene-based biosensors using genetically modified bacteria and yeast work successfully for genotoxicity, bioavailability and oxidative stress assessment, and for the detection and monitoring of toxic compounds in drinking water and various environmental samples (surface water, soil, sediments).
Sequence-based model of gap gene regulatory network.

Science.gov (United States)

Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

2014-01-01

The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
Probability-based collaborative filtering model for predicting gene-disease associations.

Science.gov (United States)

Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

2017-12-28

Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.
An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

Science.gov (United States)

Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

2014-06-01

In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
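Of the algorithms named in this record, random walk with restart is easy to illustrate. The sketch below is a generic textbook version on a toy four-gene path network, not the authors' implementation; the network, the seed gene and the restart probability of 0.3 are assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Steady-state visiting probabilities of a walker that returns to the
    seed genes with probability `restart` at every step.

    `adjacency` is a symmetric 0/1 matrix given as a list of lists.
    """
    n = len(adjacency)
    degree = [sum(adjacency[j][i] for j in range(n)) for i in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            # Probability mass flowing into node i from its neighbours.
            inflow = sum(
                adjacency[i][j] / degree[j] * p[j]
                for j in range(n)
                if degree[j] > 0
            )
            p_next.append((1 - restart) * inflow + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Toy network: a path 0 - 1 - 2 - 3, with gene 0 as the known disease gene.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adjacency, seeds=[0])
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

Genes closer to the seed in the network receive higher scores, which is the guilt-by-association intuition that the prioritization methods in the abstract build on.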
Blood-based gene-expression predictors of PTSD risk and resilience among deployed marines: a pilot study.

Science.gov (United States)

Glatt, Stephen J; Tylee, Daniel S; Chandler, Sharon D; Pazol, Joel; Nievergelt, Caroline M; Woelk, Christopher H; Baker, Dewleen G; Lohr, James B; Kremen, William S; Litz, Brett T; Tsuang, Ming T

2013-06-01

Susceptibility to PTSD is determined by both genes and environment. Similarly, gene-expression levels in peripheral blood are influenced by both genes and environment, and expression levels of many genes show good correspondence between peripheral blood and brain. Therefore, our objectives were to test the following hypotheses: (1) pre-trauma expression levels of a gene subset (particularly immune-system genes) in peripheral blood would differ between trauma-exposed Marines who later developed PTSD and those who did not; (2) a predictive biomarker panel of the eventual emergence of PTSD among high-risk individuals could be developed based on gene expression in readily assessable peripheral blood cells; and (3) a predictive panel based on expression of individual exons would surpass the accuracy of a model based on expression of full-length gene transcripts. Gene-expression levels were assayed in peripheral blood samples from 50 U.S. Marines (25 eventual PTSD cases and 25 non-PTSD comparison subjects) prior to their deployment overseas to war-zones in Iraq or Afghanistan. The panel of biomarkers dysregulated in peripheral blood cells of eventual PTSD cases prior to deployment was significantly enriched for immune genes, achieved 70% prediction accuracy in an independent sample based on the expression of 23 full-length transcripts, and attained 80% accuracy in an independent sample based on the expression of one exon from each of five genes. If the observed profiles of pre-deployment mRNA-expression in eventual PTSD cases can be further refined and replicated, they could suggest avenues for early intervention and prevention among individuals at high risk for trauma exposure. Copyright © 2013 Wiley Periodicals, Inc.
Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

OpenAIRE

Podsakoff, G; Wong, K K; Chatterjee, S

1994-01-01

Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthe...
Frequency-based time-series gene expression recomposition using PRIISM

Directory of Open Access Journals (Sweden)

Rosa Bruce A

2012-06-01

Full Text Available Abstract Background: Circadian rhythm pathways influence the expression patterns of as much as 31% of the Arabidopsis genome through complicated interaction pathways, and have been found to be significantly disrupted by biotic and abiotic stress treatments, complicating treatment-response gene discovery methods due to clock pattern mismatches in the fold change-based statistics. The PRIISM (Pattern Recomposition for the Isolation of Independent Signals in Microarray data) algorithm outlined in this paper is designed to separate pattern changes induced by different forces, including treatment-response pathways and circadian clock rhythm disruptions. Results: Using the Fourier transform, high-resolution time-series microarray data is projected to the frequency domain. By identifying the clock frequency range from the core circadian clock genes, we separate the frequency spectrum into different sections containing treatment-frequency (representing up- or down-regulation by an adaptive treatment response), clock-frequency (representing the circadian clock-disruption response) and noise-frequency components. Then, we project the components' spectra back to the expression domain to reconstruct isolated, independent gene expression patterns representing the effects of the different influences. By applying PRIISM on a high-resolution time-series Arabidopsis microarray dataset under a cold treatment, we systematically evaluated our method using maximum fold change and principal component analyses. The results of this study showed that the ranked treatment-frequency fold change results produce fewer false positives than the original methodology, and the 26-hour timepoint in our dataset was the best statistic for distinguishing the most known cold-response genes. In addition, six novel cold-response genes were discovered. PRIISM also provides gene expression data which represents only circadian clock influences, and may be useful for circadian clock studies
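The frequency-domain separation described in this record can be sketched on a synthetic time series. This is not the PRIISM code: the band boundaries, the 48 hourly samples and the signal itself are assumptions, and a naive discrete Fourier transform is used only to keep the sketch free of dependencies (a library FFT would be the usual choice).

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform, O(n^2)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def band_component(signal, keep):
    """Reconstruct the part of `signal` whose frequency index passes `keep`.

    A bin k and its mirror n - k are treated as the same frequency so that
    the reconstruction stays real-valued.
    """
    n = len(signal)
    spectrum = dft(signal)
    masked = [c if keep(min(k, n - k)) else 0 for k, c in enumerate(spectrum)]
    return inverse_dft(masked)


# Synthetic expression profile over 48 hourly time points:
# a slow "treatment" trend plus a 24-hour "clock" oscillation.
n_points = 48
signal = [
    3 + 2 * math.cos(2 * math.pi * t / n_points)   # slow trend (bins 0 and 1)
    + math.sin(2 * math.pi * 2 * t / n_points)     # 24 h rhythm (bin 2)
    for t in range(n_points)
]

treatment = band_component(signal, lambda k: k <= 1)  # assumed treatment band
clock = band_component(signal, lambda k: k == 2)      # assumed clock band
```

In the method described above the clock band is identified from the core circadian clock genes; here it is simply fixed by hand.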
A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

Science.gov (United States)

Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

2016-01-01

Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.
Prediction of highly expressed genes in microbes based on chromatin accessibility

Directory of Open Access Journals (Sweden)

Ussery David W

2007-02-01

Full Text Available Abstract Background: It is well known that gene expression is dependent on chromatin structure in eukaryotes and it is likely that chromatin can play a role in bacterial gene expression as well. Here, we use a nucleosomal position preference measure of anisotropic DNA flexibility to predict highly expressed genes in microbial genomes. We compare these predictions with those based on codon adaptation index (CAI) values, and also with experimental data for 6 different microbial genomes, with a particular interest in experimental data from Escherichia coli. Moreover, position preference is examined further in 328 sequenced microbial genomes. Results: We find that absolute gene expression levels are correlated with the position preference in many microbial genomes. It is postulated that in these regions, the DNA may be more accessible to the transcriptional machinery. Moreover, ribosomal proteins and ribosomal RNA are encoded by DNA having significantly lower position preference values than other genes in fast-replicating microbes. Conclusion: This insight into DNA structure-dependent gene expression in microbes may be exploited for predicting the expression of non-translated genes such as non-coding RNAs that may not be predicted by any of the conventional codon usage bias approaches.
Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

Science.gov (United States)

Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

2017-10-02

Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
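The searching procedure in this record, which collects genes lying on shortest paths between the regeneration genes of two tissues, can be sketched with a breadth-first search. The toy network and gene labels are invented, the network is treated as unweighted, and only one shortest path per gene pair is followed, all of which are simplifying assumptions.

```python
from collections import deque


def shortest_path(graph, source, target):
    """One shortest path in an unweighted graph, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def candidate_genes(graph, genes_a, genes_b):
    """Interior nodes of shortest paths between two known gene sets."""
    known = set(genes_a) | set(genes_b)
    candidates = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(graph, a, b)
            if path:
                candidates.update(g for g in path[1:-1] if g not in known)
    return candidates


# Toy protein-protein interaction network (undirected adjacency sets).
graph = {
    "B1": {"X", "Y"},
    "X": {"B1", "N1"},
    "Y": {"B1", "Z"},
    "Z": {"Y", "N1"},
    "N1": {"X", "Z"},
}
bone_genes = ["B1"]    # hypothetical known bone regeneration gene
nerve_genes = ["N1"]   # hypothetical known nerve regeneration gene
candidates = candidate_genes(graph, bone_genes, nerve_genes)
```

In the described method, candidates found this way are then filtered by the permutation test and the two screening rules; those steps are not covered by the sketch.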
GeneRecon Users' Manual — A coalescent based tool for fine-scale association mapping

DEFF Research Database (Denmark)

Mailund, T

2006-01-01

GeneRecon is a software package for linkage disequilibrium mapping using coalescent theory. It is based on Bayesian Markov-chain Monte Carlo (MCMC) method for fine-scale linkage-disequilibrium gene mapping using high-density marker maps. GeneRecon explicitly models the genealogy of a sample of th...
Outreach Science Education: Evidence-Based Studies in a Gene Technology Lab

Science.gov (United States)

Scharfenberg, Franz-Josef; Bogner, Franz X.

2014-01-01

Nowadays, outreach labs are important informal learning environments in science education. After summarizing research on the goals that outreach labs focus on, we describe our evidence-based gene technology lab as a model of a research-driven outreach program. Evaluation-based optimizations of hands-on teaching based on cognitive load theory (additional…
Affinity-based biosensors as promising tools for gene doping detection.

Science.gov (United States)

Minunni, Maria; Scarano, Simona; Mascini, Marco

2008-05-01

Innovative bioanalytical approaches can be foreseen as interesting means for solving relevant emerging problems in anti-doping control. Sport authorities fear that the newer form of doping, so-called gene doping, based on a misuse of gene therapy, will be undetectable and thus much less preventable. The World Anti-Doping Agency has already asked scientists to assist in finding ways to prevent and detect this newest kind of doping. In this Opinion article we discuss the main aspects of gene doping, from the putative target analytes to suitable sampling strategies. Moreover, we discuss the potential application of affinity sensing in this field, which so far has been successfully applied to a variety of analytical problems, from clinical diagnostics to food and environmental analysis.
PR Interval Associated Genes, Atrial Remodeling and Rhythm Outcome of Catheter Ablation of Atrial Fibrillation—A Gene-Based Analysis of GWAS Data

Directory of Open Access Journals (Sweden)

Daniela Husser

2017-12-01

Full Text Available Background: PR interval prolongation has recently been shown to associate with advanced left atrial remodeling and atrial fibrillation (AF) recurrence after catheter ablation. While different genome-wide association studies (GWAS) have implicated 13 loci to associate with the PR interval as an AF endophenotype, their subsequent associations with AF remodeling and response to catheter ablation are unknown. Here, we perform a gene-based analysis of GWAS data to test the hypothesis that PR interval candidate genes also associate with left atrial remodeling and arrhythmia recurrence following AF catheter ablation. Methods and Results: Samples from 660 patients with paroxysmal (n = 370) or persistent AF (n = 290) undergoing AF catheter ablation were genotyped for ~1,000,000 SNPs. Gene-based association was investigated using VEGAS (versatile gene-based association study). Among the 13 candidate genes, SLC8A1, MEIS1, ITGA9, SCN5A, and SOX5 associated with the PR interval. Of those, ITGA9 and SOX5 were significantly associated with left atrial low voltage areas and left atrial diameter and subsequently with AF recurrence after radiofrequency catheter ablation. Conclusion: This study suggests contributions of ITGA9 and SOX5 to AF remodeling expressed as PR interval prolongation, low voltage areas and left atrial dilatation and subsequently to response to catheter ablation. Future and larger studies are necessary to replicate and apply these findings with the aim of designing AF pathophysiology-based multi-locus risk scores.
HMM-Based Gene Annotation Methods

Energy Technology Data Exchange (ETDEWEB)

Haussler, David; Hughey, Richard; Karplus, Keven

1999-09-20

Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.
Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

Science.gov (United States)

Jia, Bin; Wang, Xiaodong

2013-12-17

The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
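Point-based Gaussian approximation filters such as the UKF replace the linearization used by the EKF with a small set of deterministically chosen points that are pushed through the nonlinear function. The sketch below shows only that core step, the unscented transform, for a scalar state. It is a generic textbook form with the classic kappa parameter, not the filters developed in the paper, and the test function and numbers are assumptions.

```python
import math


def unscented_transform_1d(mean, variance, f, kappa=2.0):
    """Approximate the mean and variance of f(x) for scalar x ~ N(mean, variance).

    Uses three sigma points: the mean and two points placed symmetrically
    at a distance of sqrt((n + kappa) * variance), with n = 1.
    """
    n = 1
    spread = math.sqrt((n + kappa) * variance)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


# Example: propagate x ~ N(1, 0.25) through the nonlinearity f(x) = x^2.
y_mean, y_var = unscented_transform_1d(1.0, 0.25, lambda x: x * x)
```

For this quadratic function the exact Gaussian moments are m^2 + P = 1.25 and 4 m^2 P + 2 P^2 = 1.125, and the transform with kappa = 2 reproduces both, whereas a first-order linearization around the mean would give 1.0 for the mean.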

Gene-based Association Approach Identify Genes Across Stress Traits in Fruit Flies

DEFF Research Database (Denmark)

Rohde, Palle Duun; Edwards, Stefan McKinnon; Sarup, Pernille Merete

Identification of genes explaining variation in quantitative traits or genetic risk factors of human diseases requires both good phenotypic and genotypic data, but also efficient statistical methods. Genome-wide association studies may reveal association between phenotypic variation and variation...... approach grouping variants accordingly to gene position, thus lowering the number of statistical tests performed and increasing the probability of identifying genes with small to moderate effects. Using this approach we identify numerous genes associated with different types of stresses in Drosophila...... melanogaster, but also identify common genes that affect the stress traits....
Information dimension analysis of bacterial essential and nonessential genes based on chaos game representation

International Nuclear Information System (INIS)

Zhou, Qian; Yu, Yong-ming

2014-01-01

Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of the essential genes. Selecting features associated with gene essentiality is fundamental to predicting essential genes with computational techniques. We use fractal theory to make a comparative analysis of essential and nonessential genes in bacteria. The information dimensions of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations (CGRs). It is found that a weak positive linear correlation exists between information dimension and gene length. Moreover, for genes of similar length, the average information dimension of essential genes is larger than that of nonessential genes. This indicates that essential genes show less regularity and higher complexity than nonessential genes. Our results show that for bacteria with similar numbers of essential genes and nonessential genes, the CGR information dimension is helpful for the classification of essential genes and nonessential genes. Therefore, the gene CGR information dimension is very probably a useful gene feature for a genetic algorithm predicting essential genes.
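A chaos game representation and a box-counting estimate of its information dimension can be sketched as follows. The corner assignment for the four nucleotides, the grid levels and the least-squares slope estimate are assumptions made for illustration, and the random test sequence is synthetic rather than taken from the DEG database.

```python
import math
import random
from collections import Counter

# Assumed corner layout of the unit square for the four nucleotides.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Chaos game: each point is the midpoint between the previous point
    and the corner of the current nucleotide, starting from the centre."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def box_entropy(points, level):
    """Shannon entropy (in nats) of box occupancy on a 2^level x 2^level grid."""
    size = 2 ** level
    counts = Counter(
        (min(int(x * size), size - 1), min(int(y * size), size - 1))
        for x, y in points
    )
    total = len(points)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def information_dimension(points, levels=(1, 2, 3, 4)):
    """Slope of box entropy against log(1 / box size), by least squares."""
    xs = [level * math.log(2) for level in levels]
    ys = [box_entropy(points, level) for level in levels]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    denominator = sum((a - mean_x) ** 2 for a in xs)
    return numerator / denominator


# Synthetic example: a uniformly random sequence should fill the square
# almost evenly, giving an estimate close to 2.
rng = random.Random(0)
random_sequence = "".join(rng.choice("ACGT") for _ in range(4000))
dimension = information_dimension(cgr_points(random_sequence))
```

Sequences with strong regularities concentrate their points in fewer boxes and yield a lower estimate, which matches the abstract's reading of a higher information dimension as lower regularity.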
ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

Directory of Open Access Journals (Sweden)

Saurav Mallik

2017-12-01

Full Text Available For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. It is therefore well suited to finding cause-effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which includes both singular and complex markers, depends on the corresponding condensed gene sets in either the antecedent or the consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis.

Science.gov (United States)

Mallik, Saurav; Zhao, Zhongming

2017-12-28

For transcriptomic analysis, there are numerous microarray-based genomic data sets, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it is well suited to finding causal relationships between transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which consists of both singular and complex markers, depends on the corresponding condensed gene sets in either the antecedent or the consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we first identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then used to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule pair, and the resultant scores were used for clustering to identify the co-expressed rule modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
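The abstract above names weighted rank-based Jaccard and Cosine measures between rule pairs but does not give their formulas. The following is only a minimal sketch of the general idea, assuming each rule is reduced to its set of genes and each gene carries a weight derived from its rank. The function names, the linear rank-to-weight mapping, and the gene labels are hypothetical and are not taken from the ConGEMs paper.

```python
from math import sqrt


def rank_weights(ranked_genes):
    """Assumed weighting: the top-ranked gene gets 1.0, the last gets 1/n."""
    n = len(ranked_genes)
    return {gene: (n - i) / n for i, gene in enumerate(ranked_genes)}


def weighted_jaccard(rule_a, rule_b, weight):
    """Summed weight of shared genes divided by summed weight of all genes."""
    a, b = set(rule_a), set(rule_b)
    union = a | b
    if not union:
        return 0.0
    shared = sum(weight[g] for g in a & b)
    total = sum(weight[g] for g in union)
    return shared / total


def weighted_cosine(rule_a, rule_b, weight):
    """Cosine between the two rules viewed as sparse gene-weight vectors."""
    a, b = set(rule_a), set(rule_b)
    dot = sum(weight[g] ** 2 for g in a & b)
    norm_a = sqrt(sum(weight[g] ** 2 for g in a))
    norm_b = sqrt(sum(weight[g] ** 2 for g in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical genes, ranked from most to least differentially expressed.
weights = rank_weights(["G1", "G2", "G3", "G4"])
rule_1 = {"G1", "G2"}
rule_2 = {"G2", "G3"}
print(weighted_jaccard(rule_1, rule_2, weights))
print(weighted_cosine(rule_1, rule_2, weights))
```

A pairwise score matrix built this way could then be passed to any standard clustering routine to group similar rules into modules, which is the role the integrated similarity scores play in the abstract.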
rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

Science.gov (United States)

Guo, Liyuan; Wang, Jing

2018-01-04

Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of the analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) The web functions were modified according to the data content, and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data queries for detailed information (related elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (iii) The types of regulatory elements were modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Construction of coffee transcriptome networks based on gene annotation semantics

Directory of Open Access Journals (Sweden)

Castillo Luis F.

2012-12-01

Full Text Available Gene annotation is a process that encompasses multiple approaches to the analysis of nucleic acid or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism's genome, the construction and visualization of gene networks pose novel challenges for understanding complex expression patterns and generating new knowledge in genomics research. In order to take advantage of the text data accumulated after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by the Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered by coverage, identity, query length, and e-value. Meanwhile, term descriptors for molecular biology and biochemistry were obtained with the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created, in which detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information to generate gene connections and for data mining purposes, this tool opens the possibility of visualizing complex and abundant transcriptome data, and prompts the formulation of new hypotheses in metabolic pathway analysis.
Area-Specific Cell Stimulation via Surface-Mediated Gene Transfer Using Apatite-Based Composite Layers

Directory of Open Access Journals (Sweden)

Yushin Yazaki

2015-04-01

Full Text Available Surface-mediated gene transfer systems using biocompatible calcium phosphate (CaP)-based composite layers have attracted attention as a tool for controlling cell behaviors. In the present study we aimed to demonstrate the potential of CaP-based composite layers to mediate area-specific dual gene transfer and to stimulate cells on an area-by-area basis in the same well. For this purpose we prepared two pairs of DNA–fibronectin–apatite composite (DF-Ap) layers using a pair of reporter genes and a pair of differentiation factor genes. The results of the area-specific dual gene transfer demonstrated that cells cultured on a pair of DF-Ap layers placed adjacently in the same well showed specific gene expression patterns depending on the gene that was immobilized in the underlying layer. Moreover, preliminary real-time PCR results indicated that multipotential C3H10T1/2 cells may have the potential to change into different types of cells depending on the differentiation factor gene that was immobilized in the underlying layer, even in the same well. Because DF-Ap layers have the potential to mediate area-specific cell stimulation on their surfaces, they could be useful in tissue engineering applications.
Tumor Suppressor Gene-Based Nanotherapy: From Test Tube to the Clinic

Directory of Open Access Journals (Sweden)

Manish Shanker

2011-01-01

Full Text Available Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.
Tumor suppressor gene-based nanotherapy: from test tube to the clinic.

Science.gov (United States)

Shanker, Manish; Jin, Jiankang; Branch, Cynthia D; Miyamoto, Shinya; Grimm, Elizabeth A; Roth, Jack A; Ramesh, Rajagopal

2011-01-01

Cancer is a major health problem in the world. Advances made in cancer therapy have improved the survival of patients in certain types of cancer. However, the overall five-year survival has not significantly improved in the majority of cancer types. Major challenges encountered in having effective cancer therapy are development of drug resistance by the tumor cells, nonspecific cytotoxicity, and inability to affect metastatic tumors by the chemodrugs. Overcoming these challenges requires development and testing of novel therapies. One attractive cancer therapeutic approach is cancer gene therapy. Several laboratories including the authors' laboratory have been investigating nonviral formulations for delivering therapeutic genes as a mode for effective cancer therapy. In this paper the authors will summarize their experience in the development and testing of a cationic lipid-based nanocarrier formulation and the results from their preclinical studies leading to a Phase I clinical trial for nonsmall cell lung cancer. Their nanocarrier formulation containing therapeutic genes such as tumor suppressor genes when administered intravenously effectively controls metastatic tumor growth. Additional Phase I clinical trials based on the results of their nanocarrier formulation have been initiated or proposed for treatment of cancer of the breast, ovary, pancreas, and metastatic melanoma, and will be discussed.
Integration of gene-based markers in a pearl millet genetic map for identification of candidate genes underlying drought tolerance quantitative trait loci

Directory of Open Access Journals (Sweden)

Sehgal Deepmala

2012-01-01

Full Text Available Background: Identification of genes underlying drought tolerance (DT) quantitative trait loci (QTLs) will facilitate understanding of the molecular mechanisms of drought tolerance, and will also accelerate genetic improvement of pearl millet through marker-assisted selection. We report a map based on genes with assigned functional roles in plant adaptation to drought and other abiotic stresses and demonstrate its use in identifying candidate genes underlying a major DT-QTL. Results: Seventy-five single nucleotide polymorphism (SNP) and conserved intron spanning primer (CISP) markers were developed from available expressed sequence tags (ESTs) using four genotypes, H 77/833-2, PRLT 2/89-33, ICMR 01029 and ICMR 01004, representing parents of two mapping populations. A total of 228 SNPs were obtained from a 30.5 kb sequenced region, resulting in a SNP frequency of 1/134 bp. The positions of major pearl millet linkage group (LG) 2 DT-QTLs (reported from crosses H 77/833-2 × PRLT 2/89-33 and 841B × 863B) were added to the present consensus function map, which identified 18 genes, coding for PSI reaction center subunit III, PHYC, actin, alanine glyoxylate aminotransferase, uridylate kinase, acyl-CoA oxidase, dipeptidyl peptidase IV, MADS-box, serine/threonine protein kinase, ubiquitin conjugating enzyme, zinc finger C-x8-C-x5-C-x3-H type, Hd3, acetyl CoA carboxylase, chlorophyll a/b binding protein, photolyase, protein phosphatase 1 regulatory subunit SDS22 and two hypothetical proteins, co-mapping in this DT-QTL interval. Many of these candidate genes were found to have significant association with QTLs of grain yield, flowering time and leaf rolling under drought stress conditions. Conclusions: We have exploited available pearl millet EST sequences to generate a mapped resource of seventy-five new gene-based markers for pearl millet and demonstrated its use in identifying candidate genes underlying a major DT-QTL in this species. The reported gene-based
Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

Directory of Open Access Journals (Sweden)

Olszewski Kellen L

2007-07-01

Full Text Available Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation.

Science.gov (United States)

Li, Min; Zhang, Jiayi; Liu, Qing; Wang, Jianxin; Wu, Fang-Xiang

2014-01-01

Predicting disease-related genes is one of the most important tasks in bioinformatics and systems biology. With the advances in high-throughput techniques, a large number of protein-protein interactions are available, which makes it possible to identify disease-related genes at the network level. However, network-based identification of disease-related genes remains a challenge because a considerable number of false positives still exist in the currently available protein interaction networks (PIN). Considering the fact that the majority of genetic disorders tend to manifest only in a single tissue or a few tissues, we constructed tissue-specific networks (TSN) by integrating PIN and tissue-specific data. We further weighted the constructed tissue-specific network (WTSN) using DNA methylation, as it plays an irreplaceable role in the development of complex diseases. A PageRank-based method was developed to identify disease-related genes from the constructed networks. To validate the effectiveness of the proposed method, we constructed the PIN, weighted PIN (WPIN), TSN, and WTSN for colon cancer and leukemia, respectively. The experimental results on colon cancer and leukemia show that the combination of tissue-specific data and DNA methylation can help to identify disease-related genes more accurately. Moreover, the PageRank-based method was effective in predicting disease-related genes in the case studies of colon cancer and leukemia. Tissue-specific data and DNA methylation are two important factors in the study of human diseases. The same method applied to the WTSN achieves better results than when it is applied to the original PIN, WPIN, or TSN. The PageRank-based method outperforms the degree centrality-based method for identifying disease-related genes from the WTSN.
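The abstract above describes a PageRank-based ranking on a weighted network without giving its details. As a rough illustration only, the sketch below runs the standard PageRank power iteration on a small undirected weighted graph. The edge weights stand in for methylation-derived weights, and the node names, damping factor, and function name are assumptions for illustration, not the authors' implementation.

```python
def weighted_pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank by power iteration on an undirected weighted graph.

    edges maps (u, v) pairs to a positive weight; each edge is treated as
    bidirectional, so every node has at least one outgoing edge.
    """
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    nodes = sorted(adj)
    n = len(nodes)
    # Total edge weight per node, used to normalise outgoing transitions.
    strength = {u: sum(adj[u].values()) for u in nodes}
    score = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        new = {}
        for v in nodes:
            incoming = sum(score[u] * adj[u][v] / strength[u] for u in adj[v])
            new[v] = (1.0 - damping) / n + damping * incoming
        delta = sum(abs(new[u] - score[u]) for u in nodes)
        score = new
        if delta < tol:
            break
    return score


# Hypothetical three-gene network; the heavier A-C edge favours C over B.
scores = weighted_pagerank({("A", "B"): 1.0, ("A", "C"): 3.0})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

Genes would then be ranked by score, with the top of the list treated as candidate disease-related genes. A degree-centrality baseline, by contrast, would rank nodes only by their number of neighbours.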
Cell based-gene delivery approaches for the treatment of spinal cord injury and neurodegenerative disorders.

Science.gov (United States)

Taha, Masoumeh Fakhr

2010-03-01

Cell-based gene delivery has provided an important therapeutic strategy for different disorders in recent years. This strategy is based on the transplantation of cells genetically modified to express specific genes and to target the delivery of therapeutic factors, especially for the treatment of cancers and neurological, immunological, cardiovascular and hematopoietic disorders. Although preliminary reports are encouraging, and experimental studies indicate functional and structural improvements in animal models of different disorders, universal application of this strategy to human diseases requires more evidence. There are a number of parameters that need to be evaluated, including the optimal cell source, the most effective gene or genes to be delivered, the optimal vector and method of gene delivery into the cells, and the most efficient route for the delivery of genetically modified cells into the patient. Also, some obstacles have to be overcome, including the safety and usefulness of the approaches and the stability of the improvements. Here, recent studies concerning cell-based gene delivery for spinal cord injury and some neurodegenerative disorders, such as amyotrophic lateral sclerosis, Parkinson's disease and Alzheimer's disease, are briefly reviewed, and their exciting consequences are discussed.
A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

Science.gov (United States)

Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

2016-09-02

Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal
Efficient gene transfer into nondividing cells by adeno-associated virus-based vectors.

Science.gov (United States)

Podsakoff, G; Wong, K K; Chatterjee, S

1994-09-01

Gene transfer vectors based on adeno-associated virus (AAV) are emerging as highly promising for use in human gene therapy by virtue of their characteristics of wide host range, high transduction efficiencies, and lack of cytopathogenicity. To better define the biology of AAV-mediated gene transfer, we tested the ability of an AAV vector to efficiently introduce transgenes into nonproliferating cell populations. Cells were induced into a nonproliferative state by treatment with the DNA synthesis inhibitors fluorodeoxyuridine and aphidicolin or by contact inhibition induced by confluence and serum starvation. Cells in logarithmic growth or DNA synthesis arrest were transduced with vCWR:beta gal, an AAV-based vector encoding beta-galactosidase under Rous sarcoma virus long terminal repeat promoter control. Under each condition tested, vCWR:beta Gal expression in nondividing cells was at least equivalent to that in actively proliferating cells, suggesting that mechanisms for virus attachment, nuclear transport, virion uncoating, and perhaps some limited second-strand synthesis of AAV vectors were present in nondividing cells. Southern hybridization analysis of vector sequences from cells transduced while in DNA synthetic arrest and expanded after release of the block confirmed ultimate integration of the vector genome into cellular chromosomal DNA. These findings may provide the basis for the use of AAV-based vectors for gene transfer into quiescent cell populations such as totipotent hematopoietic stem cells.
Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

Directory of Open Access Journals (Sweden)

Zhai Chengxiang

2010-05-01

Full Text Available Background: Large-scale genomic studies often identify large gene lists, for example, genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular Gene Ontology (GO). However, the annotation of genes is a labor-intensive process, and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered. Results: We propose a statistical method that uses the primary literature, i.e. free text, as the source to perform overrepresentation analysis. The method is based on a statistical mixture-model framework and addresses the methodological flaws in several existing programs. We implemented this method within a literature mining system, BeeSpace, taking advantage of its analysis environment and added features that facilitate the interactive analysis of gene sets. Through experimentation with several datasets, we showed that our program can effectively summarize the important conceptual themes of large gene sets, even when traditional GO-based analysis does not yield informative results. Conclusions: We conclude that the current work will provide biologists with a tool that effectively complements the existing ones for overrepresentation analysis from genomic experiments. Our program, Genelist Analyzer, is freely available at: http://workerbee.igb.uiuc.edu:8080/BeeSpace/Search.jsp
First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

KAUST Repository

Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

2013-01-01

This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes, of which 47.7% were shared. Homologues of the barnacle cement proteins CP-19K, CP-52K, and CP-100K were found, and all were dominantly expressed at the basis, where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.
First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

KAUST Repository

Lin, Hsiu Chin

2013-12-12

This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes, of which 47.7% were shared. Homologues of the barnacle cement proteins CP-19K, CP-52K, and CP-100K were found, and all were dominantly expressed at the basis, where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.
RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

Science.gov (United States)

Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

2015-01-01

Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules over items (or genes) produced by association rule mining (ARM) algorithms is confusing for the decision maker. In this article, we propose a weighted rule-mining technique, RANWAR (rank-based weighted association rule mining), to rank the rules using two novel rule-interestingness measures, namely rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to bypass the problem. These measures depend on the rank of the items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than state-of-the-art association rule mining algorithms, and thus reduces the execution time of the algorithm. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant for the related diseases. Finally, the top rules produced by RANWAR that are not found by Apriori are reported.
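The abstract above does not state the formulas for wcs and wcc. For orientation, the sketch below computes the classical support and confidence of a rule and then, purely as an assumed illustration, scales support by the mean weight of the items in the itemset. The `weighted_support` definition, the weights, and the gene labels are hypothetical and should not be read as RANWAR's actual measures.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Classical confidence: support(A and C) / support(A)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


def weighted_support(itemset, transactions, weight):
    """Assumed variant: classical support scaled by the mean item weight."""
    items = list(itemset)
    mean_weight = sum(weight[i] for i in items) / len(items)
    return mean_weight * support(items, transactions)


# Hypothetical samples, each listing the genes called "up" in that sample.
samples = [
    {"g1", "g2", "g3"},
    {"g1", "g2"},
    {"g1", "g3"},
    {"g2", "g3"},
]
gene_weight = {"g1": 1.0, "g2": 0.5, "g3": 0.25}  # assumed rank-derived weights

print(support({"g1", "g2"}, samples))
print(confidence({"g1"}, {"g2"}, samples))
print(weighted_support({"g1", "g2"}, samples, gene_weight))
```

Under a weighting of this kind, itemsets made of low-ranked genes fall below a support threshold sooner, which is one plausible reason a weighted miner would generate fewer frequent itemsets than an unweighted one.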
Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

Directory of Open Access Journals (Sweden)

Xing Li

2014-01-01

Full Text Available Background. Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prediction and prioritization of genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are used to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method yields a reliable gene rank list, with an AUC (area under the curve) of 0.616 in classification. Some novel genes, such as CALCA, ESR1, and MTHFR, were predicted to be associated with headache symptoms; they are not recorded in the benchmark data set but have been reported in recently published literature. Conclusions. Our study demonstrated that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms.
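The AUC figure reported above can be read as the probability that a randomly chosen known symptom gene is scored above a randomly chosen non-associated gene, with 0.5 corresponding to chance. The sketch below computes AUC directly from that pairwise definition. The scores and gene labels are made up for illustration and are not data from the study.

```python
def auc_from_scores(scores, positives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores maps each gene to its prioritisation score; positives is the set
    of genes known to be associated. Ties count as half a correct pair.
    """
    pos = [s for gene, s in scores.items() if gene in positives]
    neg = [s for gene, s in scores.items() if gene not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical scores for four genes; "a" and "c" are the known positives.
example_scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.1}
print(auc_from_scores(example_scores, {"a", "c"}))
```

The pairwise loop is quadratic in the number of genes, which is fine for a sketch; a rank-sum formulation gives the same value more efficiently on large gene lists.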

Gene therapy prospects--intranasal delivery of therapeutic genes.

Science.gov (United States)

Podolska, Karolina; Stachurska, Anna; Hajdukiewicz, Karolina; Małecki, Maciej

2012-01-01

Gene therapy is recognized to be a novel method for the treatment of various disorders. Gene therapy strategies involve gene manipulation on broad biological processes responsible for the spreading of diseases. Cancer, monogenic diseases, vascular and infectious diseases are the main targets of gene therapy. In order to obtain valuable experimental and clinical results, sufficient gene transfer methods are required. Therapeutic genes can be administered into target tissues via gene carriers commonly defined as vectors. The retroviral, adenoviral and adeno-associated virus based vectors are most frequently used in the clinic. So far, gene preparations may be administered directly into target organs or by intravenous, intramuscular, intratumor or intranasal injections. It is common knowledge that the number of gene therapy clinical trials has rapidly increased. However, some limitations such as transfection efficiency and stable and long-term gene expression are still not resolved. Consequently, great effort is focused on the evaluation of new strategies of gene delivery. There are many expectations associated with intranasal delivery of gene preparations for the treatment of diseases. Intranasal delivery of therapeutic genes is regarded as one of the most promising forms of pulmonary gene therapy research. Gene therapy based on inhalation of gene preparations offers an alternative way for the treatment of patients suffering from such lung diseases as cystic fibrosis, alpha-1-antitrypsin defect, or cancer. Experimental and first clinical trials based on plasmid vectors or recombinant viruses have revealed that gene preparations can effectively deliver therapeutic or marker genes to the cells of the respiratory tract. The noninvasive intranasal delivery of gene preparations or conventional drugs seems to be very encouraging, although basic scientific research still has to continue.
[Smart therapeutics based on synthetic gene circuits].

Science.gov (United States)

Peng, Shuguang; Xie, Zhen

2017-03-25

Synthetic biology has had an important impact on biological research since its inception. By applying ideas and methods borrowed from electrical engineering, synthetic biology has uncovered many regulatory mechanisms of living systems and has transformed and expanded a range of biological components. It has therefore brought a wide range of biomedical applications, including new ideas for disease diagnosis and treatment. This review describes the latest advances in disease diagnosis and therapy based on mammalian cell or bacterial synthetic gene circuits, and provides new ideas for future smart therapy design.
Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

Science.gov (United States)

Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

2015-01-01

In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
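The random walk with restart (RWR) step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted, undirected network stored as an adjacency dictionary; the function name, the toy graph and the restart probability of 0.3 are illustrative assumptions, not values taken from the study.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adj:   dict mapping each node to a list of its neighbours (hypothetical input format)
    seeds: set of known disease genes; the walker restarts uniformly among them
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed node.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if not adj[u]:
                continue  # isolated node: its probability mass is dropped in this sketch
            # Otherwise it moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network: gene A is the only seed; D is reachable only through C.
toy = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes are then ranked by score; integrating two networks, as the abstract describes, would amount to running the same iteration on a merged adjacency structure.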
Design-Based Learning for Biology: Genetic Engineering Experience Improves Understanding of Gene Expression

Science.gov (United States)

Ellefson, Michelle R.; Brinker, Rebecca A.; Vernacchio, Vincent J.; Schunn, Christian D.

2008-01-01

Gene expression is a difficult topic for students to learn and comprehend, at least partially because it involves various biochemical structures and processes occurring at the microscopic level. Designer Bacteria, a design-based learning (DBL) unit for high-school students, applies principles of DBL to the teaching of gene expression. Throughout…
Research on the Bionics Design of Automobile Styling Based on the Form Gene

Science.gov (United States)

Aili, Zhao; Long, Jiang

2017-09-01

From the point of view of form gene heritage, this thesis analyzes the gene make-up, cultural inheritance and aesthetic features in the evolution and development of the forms of brand automobiles, and proposes a bionic design concept and methods for automobile styling design. This innovative method must be based on the form gene, and the consistency and combination of form elements must be maintained during the design. Taking the design of Maserati as an example, the thesis shows the design method and philosophy in terms of form gene expression and bionic design innovation for future automobile styling.
A novel mutual information-based Boolean network inference method from time-series gene expression data.

Directory of Open Access Journals (Sweden)

Shohag Barman

Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
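The first stage described above, ranking candidate regulators by mutual information, can be sketched as follows. This is only an illustration under stated assumptions: expression is already binarized, a regulator at time t is compared with the target at time t+1, and the function names and toy data are hypothetical. It does not reproduce the swapping stage or the authors' implementation.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_regulators(series, target):
    """Rank genes by MI between their state at t and the target's state at t+1.

    series: dict mapping gene name to a list of 0/1 states over time (assumed format)
    """
    future = series[target][1:]
    scores = {g: mutual_information(states[:-1], future) for g, states in series.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: gene C copies gene A with a delay of one time step.
series = {
    "A": [0, 1, 1, 0, 1, 0, 0, 1],
    "B": [1, 1, 0, 0, 1, 1, 0, 0],
    "C": [0, 0, 1, 1, 0, 1, 0, 0],
}
ranking = rank_regulators(series, "C")
```

In this toy example gene A fully determines the next state of C, so it receives the highest score; real data would need many more time points for the estimate to be reliable.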
PGMA-Based Cationic Nanoparticles with Polyhydric Iodine Units for Advanced Gene Vectors.

Science.gov (United States)

Sun, Yue; Hu, Hao; Yu, Bingran; Xu, Fu-Jian

2016-11-16

It is crucial for successful gene delivery to develop safe, effective, and multifunctional polycations. Iodine-based small molecules are widely used as contrast agents for CT imaging. Herein, a series of star-like poly(glycidyl methacrylate) (PGMA)-based cationic vectors (II-PGEA/II) with abundant flanking polyhydric iodine units are prepared for multifunctional gene delivery systems. The proposed II-PGEA/II star vector is composed of one iohexol intermediate (II) core and five ethanolamine (EA) and II-difunctionalized PGMA arms. The amphipathic II-PGEA/II vectors readily self-assemble into well-defined cationic nanoparticles, where massive hydroxyl groups can establish a hydration shell to stabilize the nanoparticles. The II introduction improves cell viabilities of polycations. Moreover, by controlling the suitable amount of introduced II units, the resultant II-PGEA/II nanoparticles can produce fairly good transfection performances in different cell lines. Particularly, the II-PGEA/II nanoparticles induce much better in vitro CT imaging abilities in tumor cells than iohexol (one commonly used commercial CT contrast agent). The present design of amphipathic PGMA-based nanoparticles with CT contrast agents would provide useful information for the development of new multifunctional gene delivery systems.
A phylogenetic analysis of the genus Psathyrostachys (Poaceae) based on one nuclear gene, three plastid genes, and morphology

DEFF Research Database (Denmark)

Petersen, Gitte; Seberg, Ole; Baden, Claus

2004-01-01

A phylogenetic analysis of the small, Central Asian genus Psathyrostachys Nevski is presented. The analysis is based on morphological characters and nucleotide sequence data from one nuclear gene, DMC1, and three plastid genes, rbcL, rpoA, and rpoC2. Separate analyses of the three data partitions...... (morphology, nuclear sequences, and plastid sequences) result in mostly congruent trees. The plastid and nuclear sequences produce completely congruent trees, and only the trees based on plastid sequences and morphological characters are incongruent. Combined analysis of all data results in a fairly well......-resolved strict consensus tree: Ps. rupestris is the sister to the remaining species, which are divided into two clades: one including Ps. fragilis and Ps. caduca, the other including Ps. juncea, Ps. huashanica, Ps. lanuginosa, Ps. stoloniformis, and Ps. kronenburgii. Pubescent culms and more than 20 mm long...
Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens.

Science.gov (United States)

de Groot, Reinoud; Lüthi, Joel; Lindsay, Helen; Holtackers, René; Pelkmans, Lucas

2018-01-23

High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CRISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9. © 2018 The Authors. Published under the terms of the CC BY 4.0 license.
A potential disruptive technology in vaccine development: gene-based vaccines and their application to infectious diseases.

Science.gov (United States)

Kaslow, David C

2004-10-01

Vaccine development requires an amalgamation of disparate disciplines and has unique economic and regulatory drivers. Non-viral gene-based delivery systems, such as formulated plasmid DNA, are new and potentially disruptive technologies capable of providing 'cheaper, simpler, and more convenient-to-use' vaccines. Typically and somewhat ironically, disruptive technologies have poorer product performance, at least in the near-term, compared with the existing conventional technologies. Because successful product development requires that the product's performance must meet or exceed the efficacy threshold for a desired application, the appropriate selection of the initial product applications for a disruptive technology is critical for its successful evolution. In this regard, the near-term successes of gene-based vaccines will likely be for protection against bacterial toxins and acute viral and bacterial infections. Recent breakthroughs, however, herald increasing rather than languishing performance improvements in the efficacy of gene-based vaccines. Whether gene-based vaccines ultimately succeed in eliciting protective immunity in humans to persistent intracellular pathogens, such as HIV, malaria and tuberculosis, for which the conventional vaccine technologies have failed, remains to be determined. A success against any one of the persistent intracellular pathogens would be sufficient proof that gene-based vaccines represent a disruptive technology against which future vaccine technologies will be measured.
Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test.

Science.gov (United States)

Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph

2013-11-07

The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the
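The idea of scaling marker Z-statistics by their correlation can be illustrated with a generic LD-adjusted sum statistic. This is a textbook-style sketch under the assumption that the Z-statistics are jointly normal with correlation matrix R under the null hypothesis; it is not claimed to be the authors' summary-statistic test, and the function name is hypothetical.

```python
from math import erfc, sqrt


def ld_scaled_sum_test(z, corr):
    """Combine marker Z-statistics into one region-level statistic.

    Under the null, sum(z) has variance equal to the sum of all entries of the
    marker correlation matrix, so dividing by its square root gives a
    standard normal statistic regardless of the LD structure.
    """
    m = len(z)
    variance = sum(corr[i][j] for i in range(m) for j in range(m))
    stat = sum(z) / sqrt(variance)
    p_value = erfc(abs(stat) / sqrt(2))  # two-sided normal p-value
    return stat, p_value


z = [2.0, 2.0, 2.0, 2.0]
independent = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
perfect_ld = [[1.0] * 4 for _ in range(4)]

stat_ind, p_ind = ld_scaled_sum_test(z, independent)  # four independent signals
stat_ld, p_ld = ld_scaled_sum_test(z, perfect_ld)     # one signal counted four times
```

With independent markers the four Z-statistics reinforce each other, whereas under perfect LD they collapse to a single marker's evidence; only the Z-statistics and a correlation estimate are needed, not the original genotypes.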
Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

Science.gov (United States)

2011-01-01

Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were
A resampling-based meta-analysis for detection of differential gene expression in breast cancer

International Nuclear Information System (INIS)

Gur-Dedeoglu, Bala; Konu, Ozlen; Kir, Serkan; Ozturk, Ahmet Rasit; Bozkurt, Betul; Ergul, Gulusan; Yulug, Isik G

2008-01-01

Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real-time qRT-PCR supported the meta-analysis results. The
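As a rough illustration of what resampling contributes to a differential expression analysis, the sketch below runs a two-group permutation test on one gene. It is a generic example, not the meta-analysis strategy of the paper; the function name, the expression values, the number of permutations and the seed are all assumptions made for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign class labels at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical log-expression values for one gene in normal and tumor samples.
normal = [1.0, 1.2, 0.9, 1.1, 1.0]
tumor = [2.0, 2.3, 1.9, 2.1, 2.2]
p = permutation_p_value(normal, tumor)
```

Repeating such a test per gene and per dataset, then keeping genes that are consistently significant, is one simple way to look for a stable signature across studies.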
A resampling-based meta-analysis for detection of differential gene expression in breast cancer

Directory of Open Access Journals (Sweden)

Ergul Gulusan

2008-12-01

Full Text Available Abstract Background Accuracy in the diagnosis of breast cancer and classification of cancer subtypes has improved over the years with the development of well-established immunohistopathological criteria. More recently, diagnostic gene-sets at the mRNA expression level have been tested as better predictors of disease state. However, breast cancer is heterogeneous in nature; thus extraction of differentially expressed gene-sets that stably distinguish normal tissue from various pathologies poses challenges. Meta-analysis of high-throughput expression data using a collection of statistical methodologies leads to the identification of robust tumor gene expression signatures. Methods A resampling-based meta-analysis strategy, which involves the use of resampling and application of distribution statistics in combination to assess the degree of significance in differential expression between sample classes, was developed. Two independent microarray datasets that contain normal breast, invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC) samples were used for the meta-analysis. Expression of the genes, selected from the gene list for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes were tested on 10 independent primary IDC samples and matched non-tumor controls by real-time qRT-PCR. Other existing breast cancer microarray datasets were used in support of the resampling-based meta-analysis. Results The two independent microarray studies were found to be comparable, although differing in their experimental methodologies (Pearson correlation coefficient, R = 0.9389 and R = 0.8465 for ductal and lobular samples, respectively). The resampling-based meta-analysis has led to the identification of a highly stable set of genes for classification of normal breast samples and breast tumors encompassing both the ILC and IDC subtypes. The expression results of the selected genes obtained through real
Sensitive detection of novel Indian isolate of BTV 21 using ns1 gene based real-time PCR assay

Directory of Open Access Journals (Sweden)

Gaya Prasad

2013-06-01

Full Text Available Aim: The study was conducted to develop an ns1 gene based sensitive real-time RT-PCR assay for diagnosis of Indian isolates of bluetongue virus (BTV). Materials and Methods: The BTV serotype 21 isolate (KMNO7) was isolated from Andhra Pradesh and propagated in the BHK-21 cell line in our laboratory. The nucleic acid (dsRNA) of the virus was extracted using the Trizol method and cDNA was prepared using a standard protocol. The cDNA was subjected to ns1 gene based group specific PCR to confirm the isolate as BTV. The viral RNA was diluted tenfold and the detection limit of ns1 gene based RT-PCR was determined. Finally, the tenfold diluted viral RNA was subjected to real-time RT-PCR using ns1 gene primer and TaqMan probe to standardize the reaction and determine the detection limit. Results: The ns1 gene based group specific PCR showed a single 366 bp amplicon in agarose gel electrophoresis, confirming the sample as BTV. The ns1 gene RT-PCR using tenfold diluted viral RNA showed a detection limit of 70.0 fg in 1% agarose gel electrophoresis. The ns1 gene based real-time RT-PCR was successfully standardized and the detection limit was found to be 7.0 fg. Conclusion: The ns1 gene based real-time RT-PCR was successfully standardized and was found to be 10 times more sensitive than conventional RT-PCR. Key words: bluetongue, BTV21, RT-PCR, Real time RT-PCR, ns1 gene [Vet World 2013; 6(8): 554-557]
A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

Science.gov (United States)

He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

2017-03-01

comprehensive gene data set of sex pheromone biosynthesis and degradation enzyme related genes in DBM created by genome- and transcriptome-wide identification, characterization and expression profiling. Our findings provide a basis to better understand the function of genes with tissue enriched expression. The results also provide information on the genes involved in sex pheromone biosynthesis and degradation, and may be useful to identify potential gene targets for pest control strategies by disrupting the insect-insect communication using pheromone-based behavioral antagonists.
Screening key genes for abdominal aortic aneurysm based on gene expression omnibus dataset.

Science.gov (United States)

Wan, Li; Huang, Jingyong; Ni, Haizhen; Yu, Guanfeng

2018-02-13

Abdominal aortic aneurysm (AAA) is a common cardiovascular system disease with high mortality. The aim of this study was to identify potential genes for diagnosis and therapy in AAA. We searched and downloaded mRNA expression data from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) between AAA and normal individuals. Then, Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, a transcription factor (TF) network and a protein-protein interaction (PPI) network were used to explore the function of the genes. Additionally, immunohistochemical (IHC) staining was used to validate the expression of identified genes. Finally, the diagnostic value of identified genes was assessed by receiver operating characteristic (ROC) analysis in the GEO database. A total of 1199 DEGs (188 up-regulated and 1011 down-regulated) were identified between AAA and normal individuals. KEGG pathway analysis showed that vascular smooth muscle contraction and pathways in cancer were significantly enriched signaling pathways. The top 10 up-regulated and top 10 down-regulated DEGs were used to construct TF and PPI networks. Some genes with high degrees such as NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16 and FOXO1 were identified to be related to AAA. The results of IHC staining showed that CCR7 and PDGFA were up-regulated in tissue samples of AAA. ROC analysis showed that NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA had potential diagnostic value for AAA. The identified genes including NELL2, CCR7, MGAM, HBB, CSNK2A2, ZBTB16, FOXO1 and PDGFA might be involved in the pathology of AAA.
A gene-based analysis of variants in the serum/glucocorticoid regulated kinase (SGK) genes with blood pressure responses to sodium intake: the GenSalt Study.

Directory of Open Access Journals (Sweden)

Changwei Li

Full Text Available Serum and glucocorticoid regulated kinase (SGK) plays a critical role in the regulation of renal sodium transport. We examined the association between SGK genes and salt sensitivity of blood pressure (BP) using single-marker and gene-based association analysis. A 7-day low-sodium (51.3 mmol sodium/day) followed by a 7-day high-sodium intervention (307.8 mmol sodium/day) was conducted among 1,906 Chinese participants. BP measurements were obtained at baseline and each intervention using a random-zero sphygmomanometer. Additive associations between each SNP and salt-sensitivity phenotypes were assessed using a mixed linear regression model to account for family dependencies. Gene-based analyses were conducted using the truncated p-value method. The Bonferroni method was used to adjust for multiple testing in all analyses. In single-marker association analyses, SGK1 marker rs2758151 was significantly associated with diastolic BP (DBP) response to high-sodium intervention (P = 0.0010). DBP responses (95% confidence interval) to high-sodium intervention for genotypes C/C, C/T, and T/T were 2.04 (1.57 to 2.52), 1.79 (1.42 to 2.16), and 0.85 (0.30 to 1.41) mmHg, respectively. Similar trends were observed for SBP and MAP responses although not significant (P = 0.15 and 0.0026, respectively). In addition, gene-based analyses demonstrated significant associations between SGK1 and SBP, DBP and MAP responses to high sodium intervention (P = 0.0002, 0.0076, and 0.00001, respectively). Neither SGK2 nor SGK3 were associated with the salt-sensitivity phenotypes in single-marker or gene-based analyses. The current study identified association of the SGK1 gene and BP salt-sensitivity in the Han Chinese population. Further studies are warranted to identify causal SGK1 gene variants.
Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

Directory of Open Access Journals (Sweden)

Sujan Mamidi

2016-07-01

Full Text Available White mold, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major disease of common bean (Phaseolus vulgaris L.). WM7.1 and WM8.3 are two quantitative trait loci (QTL) with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI) population segregating for both QTL were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs) relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.
Design of Knowledge Bases for Plant Gene Regulatory Networks.

Science.gov (United States)

Mukundi, Eric; Gomez-Cano, Fabio; Ouma, Wilberforce Zachary; Grotewold, Erich

2017-01-01

Developing a knowledge base that contains all the information necessary for the researcher studying gene regulation in a particular organism can be accomplished in four stages. This begins with defining the data scope. We describe here the necessary information and resources, and outline the methods for obtaining data. The second stage consists of designing the schema, which involves defining the entire arrangement of the database in a systematic plan. The third stage is the implementation, defined by actualization of the database by using software according to a predefined schema. The final stage is development, where the database is made available to users in a web-accessible system. The result is a knowledgebase that integrates all the information pertaining to gene regulation, and which is easily expandable and transferable.

A non-inheritable maternal Cas9-based multiple-gene editing system in mice

OpenAIRE

Takayuki Sakurai; Akiko Kamiyoshi; Hisaka Kawate; Chie Mori; Satoshi Watanabe; Megumu Tanaka; Ryuichi Uetake; Masahiro Sato; Takayuki Shindo

2016-01-01

The CRISPR/Cas9 system is capable of editing multiple genes through one-step zygote injection. The preexisting method is largely based on the co-injection of Cas9 DNA (or mRNA) and guide RNAs (gRNAs); however, it is unclear how many genes can be simultaneously edited by this method, and a reliable means to generate transgenic (Tg) animals with multiple gene editing has yet to be developed. Here, we employed non-inheritable maternal Cas9 (maCas9) protein derived from Tg mice with systemic Cas9...
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

Directory of Open Access Journals (Sweden)

Guo Zheng

2006-01-01

Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
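The notion of a delayed correlation, that is, comparing one gene's profile with another gene's profile shifted in time, can be sketched as below. The sketch assumes equally spaced time points and uses a plain Pearson correlation; the function names and toy profiles are hypothetical, and the code is not the TdGRN toolbox itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_delay(regulator, target, max_lag):
    """Return the lag (in time steps) with the strongest |correlation| and its value."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair regulator[t] with target[t + lag] over the overlapping window.
        x = regulator[: len(regulator) - lag]
        y = target[lag:]
        scores[lag] = pearson(x, y)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Toy profiles: the target repeats the regulator two time steps later.
regulator = [1, 3, 2, 5, 4, 6, 3, 1, 2, 4]
target = [0, 0] + regulator[:-2]
lag, r = best_delay(regulator, target, max_lag=3)
```

Here the strongest correlation appears at a lag of two steps, which is the kind of pairwise evidence a time-delayed network reconstruction could build on; short overlapping windows at large lags make the estimate less reliable.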
Partial least squares based gene expression analysis in estrogen receptor positive and negative breast tumors.

Science.gov (United States)

Ma, W; Zhang, T-F; Lu, P; Lu, S H

2014-01-01

Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-) groups. A previous study proposed that under trastuzumab-based neoadjuvant chemotherapy, tumor initiating cell (TIC) featured ER- tumors respond better than ER+ tumors. Exploration of the molecular difference between these two groups may help in developing new therapeutic strategies, especially for ER- patients. With gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed partial least squares (PLS) based analysis, which is more sensitive than common variance/regression analysis. We acquired 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving immune system, metabolism and genetic information processing. Network analysis identified five hub genes with degrees higher than 10, including APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide new understanding of the molecular difference between TIC featured ER- and ER+ breast tumors, with the hope of offering support for therapeutic studies.
DNA base sequence changes induced by ultraviolet light mutagenesis of a gene on a chromosome in Chinese hamster ovary cells

Energy Technology Data Exchange (ETDEWEB)

Romac, S; Leong, P; Sockett, H; Hutchinson, F [Yale Univ., New Haven, CT (USA). Dept. of Molecular Biophysics and Biochemistry

1989-09-20

The DNA base sequence changes induced by mutagenesis with ultraviolet light have been determined in a gene on a chromosome of cultured Chinese hamster ovary (CHO) cells. The gene was the Escherichia coli gpt gene, of which a single copy was stably incorporated and expressed in the CHO cell genome. The cells were irradiated with ultraviolet light and gpt⁻ colonies were selected by resistance to 6-thioguanine. The gpt gene was amplified from chromosomal DNA by use of the polymerase chain reaction (PCR) and the amplified DNA sequenced directly by the dideoxy method. Of the 58 sequenced mutants of independent origin 53 were base change mutations. Forty-one base substitutions were single base changes, ten had two adjacent (or tandem) base changes, and one had two base changes separated by a single base-pair. Only one mutant had a multiple base change mutation with two or more well separated base changes. In contrast much higher levels of such mutations were reported in ultraviolet mutagenesis of genes on a shuttle vector in primate cells. Two deletions of a single base-pair were observed and three deletions ranging from 6 to 37 base-pairs. The mutation spectrum in the gpt gene had similarities to the ultraviolet mutation spectra for several genes in prokaryotes, which suggests similarities in mutational mechanisms in prokaryotes and eukaryotes. (author).
Partial Least Squares Based Gene Expression Analysis in EBV- Positive and EBV-Negative Posttransplant Lymphoproliferative Disorders.

Science.gov (United States)

Wu, Sa; Zhang, Xin; Li, Zhi-Ming; Shi, Yan-Xia; Huang, Jia-Jia; Xia, Yi; Yang, Hang; Jiang, Wen-Qi

2013-01-01

Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profiling facilitates the identification of biological differences between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array specific factors. The aim of this study is to investigate the gene expression difference between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. With a microarray data set from the Gene Expression Omnibus database, we performed PLS based analysis. We acquired 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significant over-representation of dysregulated genes in immune response and cancer related biological processes. Network analysis identified three hub genes with degrees higher than 15, including CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have been reported to interact with EBV before. Our findings shed light on the expression distinction between EBV positive and negative PTLDs, with the hope of offering theoretical support for future therapeutic study.
Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups.

Science.gov (United States)

Shiao, S Pamela K; Grayson, James; Yu, Chong Ho; Wasek, Brandi; Bottiglieri, Teodoro

2018-02-16

For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene-environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls ( p relation to gene-environment interactions in the prevention of CRC.
Algorithms for MDC-based multi-locus phylogeny inference: beyond rooted binary gene trees on single alleles.

Science.gov (United States)

Yu, Yun; Warnow, Tandy; Nakhleh, Luay

2011-11-01

One of the criteria for inferring a species tree from a collection of gene trees, when gene tree incongruence is assumed to be due to incomplete lineage sorting (ILS), is Minimize Deep Coalescence (MDC). Exact algorithms for inferring the species tree from rooted, binary trees under MDC were recently introduced. Nevertheless, in phylogenetic analyses of biological data sets, estimated gene trees may differ from true gene trees, be incompletely resolved, and not necessarily rooted. In this article, we propose new MDC formulations for the cases where the gene trees are unrooted/binary, rooted/non-binary, and unrooted/non-binary. Further, we prove structural theorems that allow us to extend the algorithms for the rooted/binary gene tree case to these cases in a straightforward manner. In addition, we devise MDC-based algorithms for cases when multiple alleles per species may be sampled. We study the performance of these methods in coalescent-based computer simulations.
Towards precise classification of cancers based on robust gene functional expression profiles

Directory of Open Access Journals (Sweden)

Zhu Jing

2005-03-01

Full Text Available Abstract Background Development of robust and efficient methods for analyzing and interpreting high dimension gene expression profiles continues to be a focus in computational biology. The accumulated experimental evidence supports the assumption that genes express and perform their functions in modular fashions in cells. Therefore, there is an open space for development of the timely and relevant computational algorithms that use robust functional expression profiles towards precise classification of complex human diseases at the modular level. Results Inspired by the insight that genes act as a module to carry out a highly integrated cellular function, we thus define a low dimension functional expression profile for data reduction. After annotating each individual gene to functional categories defined in a proper gene function classification system such as Gene Ontology applied in this study, we identify those functional categories enriched with differentially expressed genes. For each functional category or functional module, we compute a summary measure(s) for the raw expression values of the annotated genes to capture the overall activity level of the module. In this way, we can treat the gene expressions within a functional module as an integrative data point to replace the multiple values of individual genes. We compare the classification performance of decision trees based on functional expression profiles with the conventional gene expression profiles using four publicly available datasets, which indicates that precise classification of tumour types and improved interpretation can be achieved with the reduced functional expression profiles. Conclusion This modular approach is demonstrated to be a powerful alternative approach to analyzing high dimension microarray data and is robust to high measurement noise and intrinsic biological variance inherent in microarray data. Furthermore, efficient integration with current biological knowledge
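The data reduction step described in the abstract above, in which the expression values of all genes annotated to a functional module are replaced by one summary value, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which summary measure was used, so the arithmetic mean is an assumed default, and the function name `functional_profile`, the gene identifiers, and the module names are hypothetical.

```python
from statistics import mean, median


def functional_profile(expression, annotation, summary=mean):
    """Collapse per-gene expression values into one value per functional module.

    expression: dict mapping gene id -> expression value for a single sample
    annotation: dict mapping module name -> list of gene ids annotated to it
    summary:    function reducing a list of values to one number (assumed: mean)
    """
    profile = {}
    for module, genes in annotation.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes in this sample
            profile[module] = summary(values)
    return profile


# Toy sample with made-up identifiers and values.
sample = {"g1": 2.0, "g2": 4.0, "g3": 10.0}
modules = {
    "cell cycle": ["g1", "g2"],
    "apoptosis": ["g3", "g_unmeasured"],
    "transport": ["g_absent"],
}
mean_profile = functional_profile(sample, modules)
median_profile = functional_profile(sample, modules, summary=median)
```

Passing the summary function as a parameter makes it easy to compare, for example, mean against median; the resulting per-module values can then be fed to any classifier in place of the per-gene values.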
An Improved Fuzzy Based Missing Value Estimation in DNA Microarray Validated by Gene Ranking

Directory of Open Access Journals (Sweden)

Sujay Saha

2016-01-01

Full Text Available Most of the gene expression data analysis algorithms require the entire gene expression matrix without any missing values. Hence, it is necessary to devise methods which would impute missing data values accurately. There exist a number of imputation algorithms to estimate those missing values. This work starts with a microarray dataset containing multiple missing values. We first apply the modified version of the fuzzy theory based existing method LRFDVImpute to impute multiple missing values of time series gene expression data and then validate the result of imputation by a genetic algorithm (GA) based gene ranking methodology along with some regular statistical validation techniques, like the RMSE method. Gene ranking, to the best of our knowledge, has not yet been used to validate the result of missing value estimation. Firstly, the proposed method has been tested on the very popular Spellman dataset and results show that error margins have been drastically reduced compared to some previous works, which indirectly validates the statistical significance of the proposed method. Then it has been applied on four other 2-class benchmark datasets, like the Colorectal Cancer tumours dataset (GDS4382), Breast Cancer dataset (GSE349-350), Prostate Cancer dataset, and DLBCL-FL (Leukaemia), for both missing value estimation and ranking the genes, and the results show that the proposed method can reach 100% classification accuracy with very few dominant genes, which indirectly validates the biological significance of the proposed method.
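The RMSE validation mentioned in the abstract above is commonly done by hiding known entries, imputing them, and comparing the imputed values with the originals. The sketch below assumes that setup and a plain (unnormalized) RMSE; the abstract does not state whether a normalized variant was used, and the function name `imputation_rmse` and the toy matrices are hypothetical.

```python
from math import sqrt


def imputation_rmse(original, imputed, hidden_positions):
    """Root mean squared error over the entries that were hidden and imputed.

    original:         complete matrix (list of rows) holding the true values
    imputed:          matrix of the same shape after imputation
    hidden_positions: list of (row, column) indices that were masked
    """
    if not hidden_positions:
        raise ValueError("at least one hidden position is required")
    squared = [(original[i][j] - imputed[i][j]) ** 2
               for i, j in hidden_positions]
    return sqrt(sum(squared) / len(squared))


# Toy matrices with made-up values: two entries were masked and re-estimated.
truth = [[1.0, 2.0], [3.0, 4.0]]
estimate = [[1.0, 2.5], [2.5, 4.0]]
rmse = imputation_rmse(truth, estimate, [(0, 1), (1, 0)])
```

Restricting the error to the masked positions matters: averaging over the whole matrix would dilute the error with entries that were never imputed.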
Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

Directory of Open Access Journals (Sweden)

Pugalendhi Ganesh Kumar

Full Text Available This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, in this study, we introduced two levels of gene selection such as filtering and embedding for selection of potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure is implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA showed the smallest number of 16 most relevant genes associated with cancer using a minimal number of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively
Tipping the Proteome with Gene-Based Vaccines: Weighing in on the Role of Nano materials

International Nuclear Information System (INIS)

Flores, K.J.; Craig, M.; Smith, J.J.; DeLong, R.K.; Wanekaya, A.; Dong, L.

2012-01-01

Since the first generation of DNA vaccines was introduced in 1988, remarkable improvements have been made to improve their efficacy and immunogenicity. Although human clinical trials have shown that delivery of DNA vaccines is well tolerated and safe, the potency of these vaccines in humans is somewhat less than optimal. The development of a gene-based vaccine that was effective enough to be approved for clinical use in humans would be one of, if not the most important, advance in vaccines to date. This paper highlights the literature relating to gene-based vaccines, specifically DNA vaccines, and suggests possible approaches to boost their performance. In addition, we explore the idea that combining RNA and nano materials may hold the key to successful gene-based vaccines for prevention and treatment of disease
Using FlyBase, a Database of Drosophila Genes and Genomes.

Science.gov (United States)

Marygold, Steven J; Crosby, Madeline A; Goodman, Joshua L

2016-01-01

For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic, and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets, or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization, and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries.
Zn(II)-dipicolylamine-based metallo-lipids as novel non-viral gene vectors.

Science.gov (United States)

Su, Rong-Chuan; Liu, Qiang; Yi, Wen-Jing; Zhao, Zhi-Gang

2017-08-01

In this study, a series of Zn(II)-dipicolylamine (Zn-DPA) based cationic lipids bearing different hydrophobic tails (long chains, α-tocopherol, cholesterol or diosgenin) were synthesized. The structure-activity relationship (SAR) of these lipids was studied in detail by investigating the effects of several structural aspects including the type of hydrophobic tails, the chain length and saturation degree. In addition, several assays were used to study their interactions with plasmid DNA, and the results reveal that these lipids could condense DNA into nanosized particles with appropriate size and zeta-potentials. MTT-based cell viability assays showed that lipoplexes 5 had low cytotoxicity. The in vitro gene transfection studies showed that the hydrophobic tails clearly affected the transfection efficiency (TE), and hexadecanol-containing lipid 5b gave the best TE, which was 2.2 times higher than that of bPEI 25k in the presence of 10% serum. The results not only demonstrate that these lipids might be promising non-viral gene vectors, but also afford us clues for further optimization of lipidic gene delivery materials.
A Morpholino-based screen to identify novel genes involved in craniofacial morphogenesis

Science.gov (United States)

Melvin, Vida Senkus; Feng, Weiguo; Hernandez-Lagunas, Laura; Artinger, Kristin Bruk; Williams, Trevor

2014-01-01

BACKGROUND The regulatory mechanisms underpinning facial development are conserved between diverse species. Therefore, results from model systems provide insight into the genetic causes of human craniofacial defects. Previously, we generated a comprehensive dataset examining gene expression during development and fusion of the mouse facial prominences. Here, we used this resource to identify genes that have dynamic expression patterns in the facial prominences, but for which only limited information exists concerning developmental function. RESULTS This set of ~80 genes was used for a high throughput functional analysis in the zebrafish system using Morpholino gene knockdown technology. This screen revealed three classes of cranial cartilage phenotypes depending upon whether knockdown of the gene affected the neurocranium, viscerocranium, or both. The targeted genes that produced consistent phenotypes encoded proteins linked to transcription (meis1, meis2a, tshz2, vgll4l), signaling (pkdcc, vlk, macc1, wu:fb16h09), and extracellular matrix function (smoc2). The majority of these phenotypes were not altered by reduction of p53 levels, demonstrating that both p53 dependent and independent mechanisms were involved in the craniofacial abnormalities. CONCLUSIONS This Morpholino-based screen highlights new genes involved in development of the zebrafish craniofacial skeleton with wider relevance to formation of the face in other species, particularly mouse and human. PMID:23559552
Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

Directory of Open Access Journals (Sweden)

Kelsey E. Grinde

2017-09-01

Full Text Available To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10⁻⁶) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures.
SVMRFE based approach for prediction of most discriminatory gene target for type II diabetes

Directory of Open Access Journals (Sweden)

Atul Kumar

2017-06-01

Full Text Available Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, its important source of fuel, and it is becoming a widespread chronic disease all over the world. It is now necessary to identify new potential drug targets that can not only control the disease but also treat it. Support vector machines (SVMs) are classifiers with the potential to distinguish discriminatory genes from non-discriminatory genes. SVMRFE, a modification of SVM, ranks the genes based on their discriminatory power and eliminates the genes that are not involved in causing the disease. A gene regulatory network has been formed with the top ranked coding genes to identify their role in causing diabetes. To further validate the results, a pathway study was performed to identify the involvement of the coding genes in type II diabetes. The genes obtained from this study showed a significant involvement in causing the disease and may be used as potential drug targets.
Identification of IGF1, SLC4A4, WWOX, and SFMBT1 as hypertension susceptibility genes in Han Chinese with a genome-wide gene-based association study.

Directory of Open Access Journals (Sweden)

Hsin-Chou Yang

Full Text Available Hypertension is a complex disorder with high prevalence rates all over the world. We conducted the first genome-wide gene-based association scan for hypertension in a Han Chinese population. By analyzing genome-wide single-nucleotide-polymorphism data of 400 matched pairs of young-onset hypertensive patients and normotensive controls genotyped with the Illumina HumanHap550-Duo BeadChip, 100 susceptibility genes for hypertension were identified and also validated with permutation tests. Seventeen of the 100 genes exhibited differential allelic and expression distributions between patient and control groups. These genes provided a good molecular signature for classifying hypertensive patients and normotensive controls. Among the 17 genes, IGF1, SLC4A4, WWOX, and SFMBT1 were not only identified by our gene-based association scan and gene expression analysis but were also replicated by a gene-based association analysis of the Hong Kong Hypertension Study. Moreover, cis-acting expression quantitative trait loci associated with the differentially expressed genes were found and linked to hypertension. IGF1, which encodes insulin-like growth factor 1, is associated with cardiovascular disorders, metabolic syndrome, decreased body weight/size, and changes of insulin levels in mice. SLC4A4, which encodes the electrogenic sodium bicarbonate cotransporter 1, is associated with decreased body weight/size and abnormal ion homeostasis in mice. WWOX, which encodes the WW domain-containing protein, is related to hypoglycemia and hyperphosphatemia. SFMBT1, which encodes the scm-like with four MBT domains protein 1, is a novel hypertension gene. GRB14, TMEM56 and KIAA1797 exhibited highly significant differential allelic and expressed distributions between hypertensive patients and normotensive controls. GRB14 was also found relevant to blood pressure in a previous genetic association study in East Asian populations. TMEM56 and KIAA1797 may be specific to
Potential mechanisms for cell-based gene therapy to treat HIV/AIDS.

Science.gov (United States)

Herrera-Carrillo, Elena; Berkhout, Ben

2015-02-01

An estimated 35 million people are infected with HIV worldwide. Anti-retroviral therapy (ART) has reduced the morbidity and mortality of HIV-infected patients but efficacy requires strict adherence and the treatment is not curative. Most importantly, the emergence of drug-resistant virus strains and drug toxicity can restrict the long-term therapeutic efficacy in some patients. Therefore, novel treatment strategies that permanently control or eliminate the virus and restore the damaged immune system are required. Gene therapy against HIV infection has been the topic of intense investigations for the last two decades because it can theoretically provide such a durable anti-HIV control. In this review we discuss two major gene therapy strategies to combat HIV. One approach aims to kill HIV-infected cells and the other is based on the protection of cells from HIV infection. We discuss the underlying molecular mechanisms for candidate approaches to permanently block HIV infection, including the latest strategies and future therapeutic applications. Hematopoietic stem cell-based gene therapy for HIV/AIDS may eventually become an alternative for standard ART and should ideally provide a functional cure in which the virus is durably controlled without medication. Recent results from preclinical research and early-stage clinical trials support the feasibility and safety of this novel strategy.
Cytomegalovirus replicon-based regulation of gene expression in vitro and in vivo.

Directory of Open Access Journals (Sweden)

Hermine Mohr

Full Text Available There is increasing evidence for a connection between DNA replication and the expression of adjacent genes. Therefore, this study addressed the question of whether a herpesvirus origin of replication can be used to activate or increase the expression of adjacent genes. Cell lines carrying an episomal vector, in which reporter genes are linked to the murine cytomegalovirus (MCMV) origin of lytic replication (oriLyt), were constructed. Reporter gene expression was silenced by a histone-deacetylase-dependent mechanism, but was resolved upon lytic infection with MCMV. Replication of the episome was observed subsequent to infection, leading to the induction of gene expression by more than 1000-fold. oriLyt-based regulation thus provided a unique opportunity for virus-induced conditional gene expression without the need for an additional induction mechanism. This principle was exploited to show effective late trans-complementation of the toxic viral protein M50 and the glycoprotein gO of MCMV. Moreover, the application of this principle for intracellular immunization against herpesvirus infection was demonstrated. The results of the present study show that viral infection specifically activated the expression of a dominant-negative transgene, which inhibited viral growth. This conditional system was operative in explant cultures of transgenic mice, but not in vivo. Several applications are discussed.
The Integrative Method Based on the Module-Network for Identifying Driver Genes in Cancer Subtypes

Directory of Open Access Journals (Sweden)

Xinguo Lu

2018-01-01

Full Text Available With advances in next-generation sequencing (NGS) technologies, a large amount of high-throughput genomics data of multiple types is available. A great challenge in exploring cancer progression is to identify the driver genes from the variant genes by analyzing and integrating multiple types of genomics data. Breast cancer is known as a heterogeneous disease. The identification of subtype-specific driver genes is critical to guide the diagnosis, assessment of prognosis and treatment of breast cancer. We developed an integrated framework based on gene expression profiles and copy number variation (CNV) data to identify breast cancer subtype-specific driver genes. In this framework, we employed a statistical machine-learning method to select gene subsets and utilized a module-network analysis method to identify potential candidate driver genes. The final subtype-specific driver genes were acquired by pairwise comparison between subtypes. To validate the specificity of the driver genes, the gene expression data of these genes were applied to classify the patient samples with 10-fold cross validation, and enrichment analysis was also conducted on the identified driver genes. The experimental results show that the proposed integrative method can identify the potential driver genes and that the classifier with these genes achieved better performance than with genes identified by other methods.

A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

Directory of Open Access Journals (Sweden)

Xingli Guo

Full Text Available The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results obtained with this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
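The weighted-sum score described in the abstract above can be sketched for a single step as follows. This is an illustration only: the abstract does not give the weights, the way topological correlation enters the score, or the iteration scheme, so the sketch assumes the weight of each term is the product of a disease similarity and a gene-gene edge weight, and that the initial association score is 1 for known disease-gene pairs and 0 otherwise. The function name `association_score` and all identifiers in the toy data are hypothetical.

```python
def association_score(disease, gene, disease_sim, gene_adj, known):
    """One-step weighted sum over similar diseases and neighbouring genes.

    disease_sim: dict disease -> {similar disease: similarity weight}
    gene_adj:    dict gene -> {neighbouring gene: interaction weight}
    known:       dict (disease, gene) -> prior association score
    The product weighting is an assumption made for this sketch.
    """
    total = 0.0
    for other_disease, similarity in disease_sim.get(disease, {}).items():
        for neighbour, weight in gene_adj.get(gene, {}).items():
            total += similarity * weight * known.get((other_disease, neighbour), 0.0)
    return total


# Toy networks with made-up identifiers and weights.
disease_sim = {"dA": {"dB": 0.8, "dC": 0.1}}
gene_adj = {"g1": {"g2": 0.5, "g3": 1.0}, "g4": {"g5": 1.0}}
known = {("dB", "g2"): 1.0, ("dC", "g3"): 1.0}

score_g1 = association_score("dA", "g1", disease_sim, gene_adj, known)
score_g4 = association_score("dA", "g4", disease_sim, gene_adj, known)
ranking = sorted(["g1", "g4"],
                 key=lambda g: association_score("dA", g, disease_sim, gene_adj, known),
                 reverse=True)
```

An iterative version would feed the computed scores back in as the new `known` values and repeat until the ranking stabilizes; that loop is omitted here because the abstract does not specify its update rule.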
A contribution to the study of plant development evolution based on gene co-expression networks

Directory of Open Access Journals (Sweden)

Francisco J. Romero-Campero

2013-08-01

Full Text Available Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms.
Clustering gene expression data based on predicted differential effects of GV interaction.

Science.gov (United States)

Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

2005-02-01

Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce biased interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.
Satellite DNA-based artificial chromosomes for use in gene therapy.

Science.gov (United States)

Hadlaczky, G

2001-04-01

Satellite DNA-based artificial chromosomes (SATACs) can be made by induced de novo chromosome formation in cells of different mammalian species. These artificially generated accessory chromosomes are composed of predictable DNA sequences and they contain defined genetic information. Prototype human SATACs have been successfully constructed in different cell types from 'neutral' endogenous DNA sequences from the short arm of the human chromosome 15. SATACs have already passed a number of hurdles crucial to their further development as gene therapy vectors, including: large-scale purification; transfer of purified artificial chromosomes into different cells and embryos; generation of transgenic animals and germline transmission with purified SATACs; and the tissue-specific expression of a therapeutic gene from an artificial chromosome in the milk of transgenic animals.
Machine learning approaches to supporting the identification of photoreceptor-enriched genes based on expression data

Directory of Open Access Journals (Sweden)

Simpson David

2006-03-01

Background: Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision. Many retinal diseases occur as a result of inherited dysfunction of the rod and cone photoreceptor cells. Development and maintenance of photoreceptors require appropriate regulation of the many genes specifically or highly expressed in these cells. Over the last decades, different experimental approaches have been developed to identify photoreceptor enriched genes. Recent progress in RNA analysis technology has generated large amounts of gene expression data relevant to retinal development. This paper assesses a machine learning methodology for supporting the identification of photoreceptor enriched genes based on expression data. Results: Based on the analysis of publicly-available gene expression data from the developing mouse retina generated by serial analysis of gene expression (SAGE), this paper presents a predictive methodology comprising several in silico models for detecting key complex features and relationships encoded in the data, which may be useful to distinguish genes in terms of their functional roles. In order to understand temporal patterns of photoreceptor gene expression during retinal development, a two-way cluster analysis was first performed. By clustering SAGE libraries, a hierarchical tree reflecting relationships between developmental stages was obtained. By clustering SAGE tags, a more comprehensive expression profile for photoreceptor cells was revealed. To demonstrate the usefulness of machine learning-based models in predicting functional associations from the SAGE data, three supervised classification models were compared. The results indicated that a relatively simple instance-based model (KStar model) performed significantly better than relatively more complex algorithms, e.g. neural networks. To deal with the problem of functional class imbalance occurring in the dataset, two data re
GENECODIS-Grid: An online grid-based tool to predict functional information in gene lists

International Nuclear Information System (INIS)

Nogales, R.; Mejia, E.; Vicente, C.; Montes, E.; Delgado, A.; Perez Griffo, F. J.; Tirado, F.; Pascual-Montano, A.

2007-01-01

In this work we introduce GeneCodis-Grid, a grid-based alternative to a bioinformatics tool named Genecodis that integrates different sources of biological information to search for biological features (annotations) that frequently co-occur in a set of genes and rank them by statistical significance. GeneCodis-Grid is a web-based application that takes advantage of two independent grid networks and a computer cluster managed by a meta-scheduler and a web server that hosts the application. The mining of concurrent biological annotations provides significant information for the functional analysis of gene lists obtained by high throughput experiments in biology. Due to the large popularity of this tool, which has registered more than 13000 visits since its publication in January 2007, there is a strong need to allow users from different sites to access the system simultaneously. In addition, the complexity of some of the statistical tests used in this approach has made this technique a good candidate for implementation in a Grid opportunistic environment. (Author)
Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

Science.gov (United States)

Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

2009-01-01

Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid such misleading results, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932
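To make the notion of class-selective rejection in the abstract above concrete, the sketch below returns a set of candidate classes for one patient rather than a single label. It assumes that per-class posterior probabilities are already available and uses a simple cumulative-probability rule; this is a swapped-in illustration, not the ν-1-SVM classifier or the loss function of the paper. The function name `select_classes`, the `confidence` parameter and the class labels are hypothetical.

```python
def select_classes(posteriors, confidence=0.9):
    """Return the smallest set of classes whose posteriors sum to >= confidence.

    posteriors maps a class label to its estimated probability for one sample.
    A single returned label is an ordinary decision; several labels mean the
    sample was rejected from the remaining classes only; all labels mean that
    no class could be excluded.
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen = []
    total = 0.0
    for label, probability in ranked:
        chosen.append(label)
        total += probability
        if total >= confidence:
            break
    return chosen


# Hypothetical posterior estimates for three patients.
print(select_classes({"ALL": 0.95, "AML": 0.03, "MLL": 0.02}))
print(select_classes({"ALL": 0.55, "AML": 0.40, "MLL": 0.05}))
print(select_classes({"ALL": 0.40, "AML": 0.35, "MLL": 0.25}))
```

Asymmetric penalties of the kind mentioned in the abstract would replace the single `confidence` threshold with class-specific costs; that extension is left out here.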
Stoichiometric Representation of Gene–Protein–Reaction Associations Leverages Constraint-Based Analysis from Reaction to Gene-Level Phenotype Prediction

DEFF Research Database (Denmark)

Machado, Daniel; Herrgard, Markus; Rocha, Isabel

2016-01-01

only describe the metabolic phenotype at the reaction level, understanding the mechanistic link between genotype and phenotype is still hampered by the complexity of gene-protein-reaction associations. We implement a model transformation that enables constraint-based methods to be applied at the gene...... design methods are not actually feasible, and show how our approach allows using the same methods to obtain feasible gene-based designs. We also show, by extensive comparison with experimental 13C-flux data, how simple reformulations of different simulation methods with gene-wise objective functions...
Gene expression-based molecular diagnostic system for malignant gliomas is superior to histological diagnosis.

Science.gov (United States)

Shirahata, Mitsuaki; Iwao-Koizumi, Kyoko; Saito, Sakae; Ueno, Noriko; Oda, Masashi; Hashimoto, Nobuo; Takahashi, Jun A; Kato, Kikuya

2007-12-15

Current morphology-based glioma classification methods do not adequately reflect the complex biology of gliomas, thus limiting their prognostic ability. In this study, we focused on anaplastic oligodendroglioma and glioblastoma, which typically follow distinct clinical courses. Our goal was to construct a clinically useful molecular diagnostic system based on gene expression profiling. The expression of 3,456 genes in 32 patients, 12 and 20 of whom had prognostically distinct anaplastic oligodendroglioma and glioblastoma, respectively, was measured by PCR array. Next to unsupervised methods, we did supervised analysis using a weighted voting algorithm to construct a diagnostic system discriminating anaplastic oligodendroglioma from glioblastoma. The diagnostic accuracy of this system was evaluated by leave-one-out cross-validation. The clinical utility was tested on a microarray-based data set of 50 malignant gliomas from a previous study. Unsupervised analysis showed divergent global gene expression patterns between the two tumor classes. A supervised binary classification model showed 100% (95% confidence interval, 89.4-100%) diagnostic accuracy by leave-one-out cross-validation using 168 diagnostic genes. Applied to a gene expression data set from a previous study, our model correlated better with outcome than histologic diagnosis, and also displayed 96.6% (28 of 29) consistency with the molecular classification scheme used for these histologically controversial gliomas in the original article. Furthermore, we observed that histologically diagnosed glioblastoma samples that shared anaplastic oligodendroglioma molecular characteristics tended to be associated with longer survival. Our molecular diagnostic system showed reproducible clinical utility and prognostic ability superior to traditional histopathologic diagnosis for malignant glioma.
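The combination of weighted voting and leave-one-out cross-validation described in the abstract above can be sketched as follows. The abstract does not spell out the voting formula, so this example assumes the common signal-to-noise formulation, in which each gene votes with weight (mean_A - mean_B) / (sd_A + sd_B) relative to the midpoint of the two class means. All function names and the toy expression values are hypothetical, and gene selection is omitted.

```python
from statistics import mean, stdev


def train(samples, labels):
    """Compute a (weight, boundary) pair per gene for a two-class problem."""
    classes = sorted(set(labels))
    group_a = [s for s, y in zip(samples, labels) if y == classes[0]]
    group_b = [s for s, y in zip(samples, labels) if y == classes[1]]
    model = []
    for gene in range(len(samples[0])):
        xa = [s[gene] for s in group_a]
        xb = [s[gene] for s in group_b]
        spread = stdev(xa) + stdev(xb)
        # Signal-to-noise weight; a gene with no spread casts no vote.
        weight = (mean(xa) - mean(xb)) / spread if spread > 0 else 0.0
        boundary = (mean(xa) + mean(xb)) / 2
        model.append((weight, boundary))
    return classes, model


def predict(trained, sample):
    """Sum the weighted votes of all genes; a positive total favours class A."""
    classes, model = trained
    votes = sum(w * (x - b) for (w, b), x in zip(model, sample))
    return classes[0] if votes > 0 else classes[1]


def loocv_accuracy(samples, labels):
    """Hold out each sample once, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(samples)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        hits += predict(train(rest_x, rest_y), samples[i]) == labels[i]
    return hits / len(samples)


# Hypothetical toy data: 6 tumours x 2 genes, constructed to be well separated.
samples = [[5.0, 1.0], [6.0, 1.5], [5.5, 0.5], [1.0, 4.0], [1.5, 5.0], [0.5, 4.5]]
labels = ["AO", "AO", "AO", "GBM", "GBM", "GBM"]
print(loocv_accuracy(samples, labels))
```

The toy data are separable by construction, so the accuracy printed here says nothing about the study's results. In a real analysis, any gene selection step would need to be repeated inside each cross-validation fold to avoid an optimistic estimate.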
SoFoCles: feature filtering for microarray classification based on gene ontology.

Science.gov (United States)

Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

2010-02-01

Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.
An Intelligent Method of Product Scheme Design Based on Product Gene

Directory of Open Access Journals (Sweden)

Qing Song Ai

2013-01-01

Nowadays, in order to have some featured products, many customers tend to buy customized products instead of buying common ones in supermarkets. The manufacturing enterprises, with the purpose of improving their competitiveness, are focusing on providing customized products with high quality and low cost as well. At present, how to produce customized products rapidly and cheaply has been the key challenge to manufacturing enterprises. In this paper, an intelligent modeling approach applied to supporting the modeling of customized products is proposed, which may improve the efficiency during the product design process. Specifically, the product gene (PG) method, which is an analogy of biological evolution in the engineering area, is employed to model products in a new way. Based on product gene, we focus on the intelligent modeling method to generate product schemes rapidly and automatically. The process of our research includes three steps: (1) develop a product gene model for customized products; (2) find the obtainment and storage method for product gene; and (3) propose a specific genetic algorithm used for calculating the solution of customized product and generating new product schemes. Finally, a case study is applied to test the usefulness of our study.
A new measure for functional similarity of gene products based on Gene Ontology

Directory of Open Access Journals (Sweden)

Lengauer Thomas

2006-06-01

Background: Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results: We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion: The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.
Global Regulatory Differences for Gene- and Cell-Based Therapies

DEFF Research Database (Denmark)

Coppens, Delphi G M; De Bruin, Marie L; Leufkens, Hubert G M

2017-01-01

Gene- and cell-based therapies (GCTs) offer potential new treatment options for unmet medical needs. However, the use of conventional regulatory requirements for medicinal products to approve GCTs may impede patient access and therapeutic innovation. Furthermore, requirements differ between...... jurisdictions, complicating the global regulatory landscape. We provide a comparative overview of regulatory requirements for GCT approval in five jurisdictions and hypothesize on the consequences of the observed global differences on patient access and therapeutic innovation....
Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

Directory of Open Access Journals (Sweden)

D. Ramyachitra

2015-09-01

Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.
Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

Science.gov (United States)

Ramyachitra, D; Sofia, M; Manikandan, P

2015-09-01

Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies within the ability to conduct parallel surveys of gene expression using microarray data. The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. Thus the difficulty is that the data are of high dimensionality while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.
Genome-wide screening for genes whose deletions confer sensitivity to mutagenic purine base analogs in yeast

Directory of Open Access Journals (Sweden)

Kozmin Stanislav G

2005-06-01

Background: N-hydroxylated base analogs, such as 6-hydroxylaminopurine (HAP) and 2-amino-6-hydroxylaminopurine (AHA), are strong mutagens in various organisms due to their ambiguous base-pairing properties. The systems protecting cells from HAP and related noncanonical purines in Escherichia coli include specialized deoxyribonucleoside triphosphatase RdgB, DNA repair endonuclease V, and a molybdenum cofactor-dependent system. Fewer HAP-detoxification systems have been identified in yeast Saccharomyces cerevisiae and other eukaryotes. Cellular systems protecting from AHA are unknown. In the present study, we performed a genome-wide search for genes whose deletions confer sensitivity to HAP and AHA in yeast. Results: We screened the library of yeast deletion mutants for sensitivity to the toxic and mutagenic action of HAP and AHA. We identified novel genes involved in the genetic control of base analog sensitivity, including genes controlling purine metabolism, cytoskeleton organization, and amino acid metabolism. Conclusion: We developed a method for screening the yeast deletion library for sensitivity to the mutagenic and toxic action of base analogs and identified 16 novel genes controlling pathways of protection from HAP. Three of them also protect from AHA.
A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

Directory of Open Access Journals (Sweden)

Solis Julio

2010-10-01

Background: Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results: Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions: The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.
Network-based differential gene expression analysis suggests cell cycle related genes regulated by E2F1 underlie the molecular difference between smoker and non-smoker lung adenocarcinoma

Science.gov (United States)

2013-01-01

Background: Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype, then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results: In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions: In this paper, we develop nDGE to prioritize
Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

Science.gov (United States)

Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan

2017-01-01

To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274
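The selection bias described in the abstract above can be seen in a very small simulation. The sketch assumes a deliberately simplified setting: one variant with a fixed true effect, a normally distributed estimate, and selection on that estimate's own z-score, rather than selection on a gene-based test as in the paper. It only illustrates the bias and does not implement the authors' bootstrap correction. The function name and all parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_winners_curse(true_effect=0.2, se=0.1, n_studies=10_000,
                           z_crit=1.96, seed=0):
    """Compare the mean effect estimate overall and among significant results."""
    rng = random.Random(seed)
    # One noisy effect estimate per hypothetical study.
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    # Keep only estimates that pass a one-sided significance threshold.
    significant = [b for b in estimates if b / se > z_crit]
    return mean(estimates), mean(significant)


overall_mean, selected_mean = simulate_winners_curse()
print(f"true effect:            {0.2:.3f}")
print(f"mean of all estimates:  {overall_mean:.3f}")
print(f"mean of significant:    {selected_mean:.3f}")
```

Under these assumptions the mean over all simulated studies stays close to the true effect, while the mean over the significant subset is inflated, which is the pattern usually called winner's curse.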
Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

OpenAIRE

Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.; Bruns, Christiane J.

2012-01-01

Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associa...

Ecdysone Receptor-based Singular Gene Switches for Regulated Transgene Expression in Cells and Adult Rodent Tissues

Directory of Open Access Journals (Sweden)

Seoghyun Lee

2016-01-01

Controlled gene expression is an indispensable technique in biomedical research. Here, we report a convenient, straightforward, and reliable way to induce expression of a gene of interest with negligible background expression compared to the most widely used tetracycline (Tet)-regulated system. Exploiting a Drosophila ecdysone receptor (EcR)-based gene regulatory system, we generated nonviral and adenoviral singular vectors designated as pEUI(+) and pENTR-EUI, respectively, which contain all the required elements to guarantee regulated transgene expression (GAL4-miniVP16-EcR, termed GvEcR hereafter, and 10 tandem repeats of an upstream activation sequence promoter followed by a multiple cloning site). Through the transient and stable transfection of mammalian cell lines with reporter genes, we validated that tebufenozide, an ecdysone agonist, reversibly induced gene expression, in a dose- and time-dependent manner, with negligible background expression. In addition, we created an adenovirus derived from the pENTR-EUI vector that readily infected not only cultured cells but also rodent tissues and was sensitive to tebufenozide treatment for regulated transgene expression. These results suggest that EcR-based singular gene regulatory switches would be convenient tools for the induction of gene expression in cells and tissues in a tightly controlled fashion.
Mesenchymal Stem Cell-Based Tumor-Targeted Gene Therapy in Gastrointestinal Cancer

Science.gov (United States)

Bao, Qi; Zhao, Yue; Niess, Hanno; Conrad, Claudius; Schwarz, Bettina; Jauch, Karl-Walter; Huss, Ralf; Nelson, Peter J.

2012-01-01

Mesenchymal stem (or stromal) cells (MSCs) are nonhematopoietic progenitor cells that can be obtained from bone marrow aspirates or adipose tissue, expanded and genetically modified in vitro, and then used for cancer therapeutic strategies in vivo. Here, we review available data regarding the application of MSC-based tumor-targeted therapy in gastrointestinal cancer, provide an overview of the general history of MSC-based gene therapy in cancer research, and discuss potential problems associated with the utility of MSC-based therapy such as biosafety, immunoprivilege, transfection methods, and distribution in the host. PMID:22530882
Consequences of population topology for studying gene flow using link-based landscape genetic methods.

Science.gov (United States)

van Strien, Maarten J

2017-07-01

Many landscape genetic studies aim to determine the effect of landscape on gene flow between populations. These studies frequently employ link-based methods that relate pairwise measures of historical gene flow to measures of the landscape and the geographical distance between populations. However, apart from landscape and distance, there is a third important factor that can influence historical gene flow, that is, population topology (i.e., the arrangement of populations throughout a landscape). As the population topology is determined in part by the landscape configuration, I argue that it should play a more prominent role in landscape genetics. Making use of existing literature and theoretical examples, I discuss how population topology can influence results in landscape genetic studies and how it can be taken into account to improve the accuracy of these results. In support of my arguments, I have performed a literature review of landscape genetic studies published during the first half of 2015 as well as several computer simulations of gene flow between populations. First, I argue why one should carefully consider which population pairs should be included in link-based analyses. Second, I discuss several ways in which the population topology can be incorporated in response and explanatory variables. Third, I outline why it is important to sample populations in such a way that a good representation of the population topology is obtained. Fourth, I discuss how statistical testing for link-based approaches could be influenced by the population topology. I conclude the article with six recommendations geared toward better incorporating population topology in link-based landscape genetic studies.
The UDP glucuronosyltransferase gene superfamily: suggested nomenclature based on evolutionary divergence

NARCIS (Netherlands)

Burchell, B.; Nebert, D. W.; Nelson, D. R.; Bock, K. W.; Iyanagi, T.; Jansen, P. L.; Lancet, D.; Mulder, G. J.; Chowdhury, J. R.; Siest, G.

1991-01-01

A nomenclature system for the UDP glucuronosyltransferase superfamily is proposed, based on divergent evolution of the genes. A total of 26 distinct cDNAs in five mammalian species have been sequenced to date. Comparison of the deduced amino acid sequences leads to the definition of two families and
Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

Science.gov (United States)

Nitta, Sachiko Kaihara; Numata, Keiji

2013-01-01

There has been great interest in the application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified into three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation processes have been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed. PMID:23344060
Biopolymer-Based Nanoparticles for Drug/Gene Delivery and Tissue Engineering

Directory of Open Access Journals (Sweden)

Keiji Numata

2013-01-01

There has been great interest in the application of nanoparticles as biomaterials for delivery of therapeutic molecules such as drugs and genes, and for tissue engineering. In particular, biopolymers are suitable materials as nanoparticles for clinical application due to their versatile traits, including biocompatibility, biodegradability and low immunogenicity. Biopolymers are polymers that are produced from living organisms, which are classified into three groups: polysaccharides, proteins and nucleic acids. It is important to control particle size, charge, morphology of surface and release rate of loaded molecules to use biopolymer-based nanoparticles as drug/gene delivery carriers. To obtain a nano-carrier for therapeutic purposes, a variety of materials and preparation processes have been attempted. This review focuses on fabrication of biocompatible nanoparticles consisting of biopolymers such as protein (silk, collagen, gelatin, β-casein, zein and albumin), protein-mimicked polypeptides and polysaccharides (chitosan, alginate, pullulan, starch and heparin). The effects of the nature of the materials and the fabrication process on the characteristics of the nanoparticles are described. In addition, their application as delivery carriers of therapeutic drugs and genes and biomaterials for tissue engineering are also reviewed.
Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

Science.gov (United States)

Ortega, Maya

2010-01-01

My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.
Single-gene prognostic signatures for advanced stage serous ovarian cancer based on 1257 patient samples.

Science.gov (United States)

Zhang, Fan; Yang, Kai; Deng, Kui; Zhang, Yuanyuan; Zhao, Weiwei; Xu, Huan; Rong, Zhiwei; Li, Kang

2018-04-16

We sought to identify stable single-gene prognostic signatures based on a large collection of advanced stage serous ovarian cancer (AS-OvCa) gene expression data and explore their functions. The empirical Bayes (EB) method was used to remove the batch effect and integrate 8 ovarian cancer datasets. Univariate Cox regression was used to evaluate the association between gene and overall survival (OS). The Database for Annotation, Visualization and Integrated Discovery (DAVID) tool was used for the functional annotation of genes for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The batch effect was removed by the EB method, and 1257 patient samples were used for further analysis. We selected 341 single-gene prognostic signatures with FDR matrix organization, focal adhesion and DNA replication which are closely associated with cancer. We used the EB method to remove the batch effect of 8 datasets, integrated these datasets and identified stable prognosis signatures for AS-OvCa.
Learning gene regulatory networks from gene expression data using weighted consensus

KAUST Repository

Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

2016-01-01

An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
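The contrast between the unweighted and weighted consensus can be sketched in a few lines. This is a minimal illustration, not the authors' method: the abstract states that the weights come from a linear program, whereas here they are simply passed in, and the function names, edge labels and confidence scores are all hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (regulator, target); hypothetical edge labels


def ranks(scores: Dict[Edge, float]) -> Dict[Edge, float]:
    """Rank edges by descending confidence; rank 1 is the most confident edge."""
    ordered = sorted(scores, key=lambda e: scores[e], reverse=True)
    return {edge: float(i + 1) for i, edge in enumerate(ordered)}


def consensus(method_scores: List[Dict[Edge, float]],
              weights: List[float]) -> List[Edge]:
    """Order edges by weighted mean rank (lower mean rank = stronger consensus).

    Equal weights reproduce an average-rank consensus. The weights are taken
    as given here; learning them (e.g. by linear programming) is out of scope.
    """
    total = sum(weights)
    per_method = [ranks(s) for s in method_scores]
    edges = list(per_method[0])
    mean_rank = {
        e: sum(w * r[e] for w, r in zip(weights, per_method)) / total
        for e in edges
    }
    return sorted(edges, key=lambda e: mean_rank[e])


# Made-up confidence scores from three hypothetical inference methods.
methods = [
    {("A", "B"): 0.9, ("A", "C"): 0.5, ("B", "C"): 0.1},
    {("A", "B"): 0.2, ("A", "C"): 0.8, ("B", "C"): 0.6},
    {("A", "B"): 0.3, ("A", "C"): 0.4, ("B", "C"): 0.7},
]

unweighted = consensus(methods, [1.0, 1.0, 1.0])
weighted = consensus(methods, [0.8, 0.1, 0.1])  # trust the first method most
```

With equal weights the edge A to C comes first; when the first method is weighted heavily, its top edge A to B moves to the front, which is the kind of shift a credibility-based weighting is meant to produce.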
Treatment planning of electroporation-based medical interventions: electrochemotherapy, gene electrotransfer and irreversible electroporation

International Nuclear Information System (INIS)

Zupanic, Anze; Kos, Bor; Miklavcic, Damijan

2012-01-01

In recent years, cancer electrochemotherapy (ECT), gene electrotransfer for gene therapy and DNA vaccination (GET) and tissue ablation with irreversible electroporation (IRE) have all entered clinical practice. We present a method for a personalized treatment planning procedure for ECT, GET and IRE, based on medical image analysis, numerical modelling of electroporation and optimization with the genetic algorithm, and several visualization tools for treatment plan assessment. Each treatment plan provides the attending physician with optimal positions of electrodes in the body and electric pulse parameters for optimal electroporation of the target tissues. For the studied case of a deep-seated tumour, the optimal treatment plans for ECT and IRE require at least two electrodes to be inserted into the target tissue, thus lowering the necessary voltage for electroporation and limiting damage to the surrounding healthy tissue. In GET, it is necessary to place the electrodes outside the target tissue to prevent damage to target cells intended to express the transfected genes. The presented treatment planning procedure is a valuable tool for clinical and experimental use and evaluation of electroporation-based treatments. (paper)
New tools for Mendelian disease gene identification: PhenoDB variant analysis module; and GeneMatcher, a web-based tool for linking investigators with an interest in the same gene.

Science.gov (United States)

Sobreira, Nara; Schiettecatte, François; Boehm, Corinne; Valle, David; Hamosh, Ada

2015-04-01

Identifying the causative variant from among the thousands identified by whole-exome sequencing or whole-genome sequencing is a formidable challenge. To make this process as efficient and flexible as possible, we have developed a Variant Analysis Module coupled to our previously described Web-based phenotype intake tool, PhenoDB (http://researchphenodb.net and http://phenodb.org). When a small number of candidate-causative variants have been identified in a study of a particular patient or family, a second, more difficult challenge becomes proof of causality for any given variant. One approach to this problem is to find other cases with a similar phenotype and mutations in the same candidate gene. Alternatively, it may be possible to develop biological evidence for causality, an approach that is assisted by making connections to basic scientists studying the gene of interest, often in the setting of a model organism. Both of these strategies benefit from an open access, online site where individual clinicians and investigators could post genes of interest. To this end, we developed GeneMatcher (http://genematcher.org), a freely accessible Website that enables connections between clinicians and researchers across the world who share an interest in the same gene(s). © 2015 WILEY PERIODICALS, INC.
Intellectual property rights and gene-based technologies for animal production and health. Issues for developing countries

International Nuclear Information System (INIS)

Dutfield, G.

2005-01-01

Intellectual property rights (IPR) are legal and institutional devices to protect creations of the mind. With respect to gene-based innovation, the most significant IPR is patents. Appropriate patent regimes have the potential to foster innovation in animal biotechnology and the transfer of gene-based technologies. Inappropriate patent systems may be counter-productive. Indeed, many critics are doubtful that the current international patent standards, based as they are on a combination of the United States of America's and European regimes, can help countries that lack the capacity to do much life science and biotechnology research to become more innovative or contribute to the acquisition, absorption and, where desirable, the adaptation of new gene-based technologies from outside. Present legislation in Europe, North America and internationally is considered, together with the controversies and important policy questions for developing countries, and the choices facing countries seeking to enhance their scientific and technological capacities in these areas. (author)
RNAi-based therapeutic nanostrategy: IL-8 gene silencing in pancreatic cancer cells using gold nanorods delivery vehicles

International Nuclear Information System (INIS)

Panwar, Nishtha; Yang, Chengbin; Yin, Feng; Chuan, Tjin Swee; Yong, Ken-Tye; Yoon, Ho Sup

2015-01-01

RNA interference (RNAi)-based gene silencing possesses great ability for therapeutic intervention in pancreatic cancer. Among various oncogene mutations, Interleukin-8 (IL-8) gene mutations are found to be overexpressed in many pancreatic cell lines. In this work, we demonstrate IL-8 gene silencing by employing an RNAi-based gene therapy approach and this is achieved by using gold nanorods (AuNRs) for efficient delivery of IL-8 small interfering RNA (siRNA) to the pancreatic cell lines of MiaPaCa-2 and Panc-1. Upon comparing to Panc-1 cells, we found that the dominant expression of the IL-8 gene in MiaPaCa-2 cells resulted in an aggressive behavior towards the processes of cell invasion and metastasis. We have hence investigated the suitability of using AuNRs as novel non-viral nanocarriers for the efficient uptake and delivery of IL-8 siRNA in realizing gene knockdown of both MiaPaCa-2 and Panc-1 cells. Flow cytometry and fluorescence imaging techniques have been applied to confirm transfection and release of IL-8 siRNA. The ratio of AuNRs and siRNA has been optimized and transfection efficiencies as high as 88.40 ± 2.14% have been achieved. Upon successful delivery of IL-8 siRNA into cancer cells, the effects of IL-8 gene knockdown are quantified in terms of gene expression, cell invasion, cell migration and cell apoptosis assays. Statistical comparative studies for both MiaPaCa-2 and Panc-1 cells are presented in this work. IL-8 gene silencing has been demonstrated with knockdown efficiencies of 81.02 ± 10.14% and 75.73 ± 6.41% in MiaPaCa-2 and Panc-1 cells, respectively. Our results are then compared with a commercial transfection reagent, Oligofectamine, serving as positive control. The gene knockdown results illustrate the potential role of AuNRs as non-viral gene delivery vehicles for RNAi-based targeted cancer therapy applications. (paper)
Gene expression and gene therapy imaging

International Nuclear Information System (INIS)

Rome, Claire; Couillaud, Franck; Moonen, Chrit T.W.

2007-01-01

The fast growing field of molecular imaging has achieved major advances in imaging gene expression, an important element of gene therapy. Gene expression imaging is based on specific probes or contrast agents that allow either direct or indirect spatio-temporal evaluation of gene expression. Direct evaluation is possible with, for example, contrast agents that bind directly to a specific target (e.g., receptor). Indirect evaluation may be achieved by using specific substrate probes for a target enzyme. The use of marker genes, also called reporter genes, is an essential element of MI approaches for gene expression in gene therapy. The marker gene may not have a therapeutic role itself, but by coupling the marker gene to a therapeutic gene, expression of the marker gene reports on the expression of the therapeutic gene. Nuclear medicine and optical approaches are highly sensitive (detection of probes in the picomolar range), whereas MRI and ultrasound imaging are less sensitive and require amplification techniques and/or accumulation of contrast agents in enlarged contrast particles. Recently developed MI techniques are particularly relevant for gene therapy. Amongst these are the possibility to track gene therapy vectors such as stem cells, and the techniques that allow spatiotemporal control of gene expression by non-invasive heating (with MRI guided focused ultrasound) and the use of temperature sensitive promoters. (orig.)
Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

Science.gov (United States)

de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

2016-08-01

Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment wide significance (corrected pneratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.
Multiple-endpoints gene alteration-based (MEGA) assay: A toxicogenomics approach for water quality assessment of wastewater effluents.

Science.gov (United States)

Fukushima, Toshikazu; Hara-Yamamura, Hiroe; Nakashima, Koji; Tan, Lea Chua; Okabe, Satoshi

2017-12-01

Wastewater effluents contain a significant number of toxic contaminants, which, even at low concentrations, display a wide variety of toxic actions. In this study, we developed a multiple-endpoints gene alteration-based (MEGA) assay, a real-time PCR-based transcriptomic analysis, to assess the water quality of wastewater effluents for human health risk assessment and management. Twenty-one genes from the human hepatoblastoma cell line (HepG2), covering the basic health-relevant stress responses such as response to xenobiotics, genotoxicity, and cytotoxicity, were selected and incorporated into the MEGA assay. The genes related to the p53-mediated DNA damage response and cytochrome P450 were selected as markers for genotoxicity and response to xenobiotics, respectively. Additionally, the genes that were dose-dependently regulated by exposure to the wastewater effluents were chosen as markers for cytotoxicity. The alterations in the expression of an individual gene, induced by exposure to the wastewater effluents, were evaluated by real-time PCR and the results were validated by genotoxicity (e.g., comet assay) and cell-based cytotoxicity tests. In summary, the MEGA assay is a real-time PCR-based assay that targets cellular responses to contaminants present in wastewater effluents at the transcriptional level; it is rapid, cost-effective, and high-throughput and can thus complement any chemical analysis for water quality assessment and management. Copyright © 2017 Elsevier Ltd. All rights reserved.
Graph-based semi-supervised learning with genomic data integration using condition-responsive genes applied to phenotype classification.

Science.gov (United States)

Doostparast Torshizi, Abolfazl; Petzold, Linda R

2018-01-01

Data integration methods that combine data from different molecular levels such as genome, epigenome, transcriptome, etc., have received a great deal of interest in the past few years. It has been demonstrated that the synergistic effects of different biological data types can boost learning capabilities and lead to a better understanding of the underlying interactions among molecular levels. In this paper we present a graph-based semi-supervised classification algorithm that incorporates latent biological knowledge in the form of biological pathways with gene expression and DNA methylation data. The process of graph construction from biological pathways is based on detecting condition-responsive genes, where 3 sets of genes are finally extracted: all condition responsive genes, high-frequency condition-responsive genes, and P-value-filtered genes. The proposed approach is applied to ovarian cancer data downloaded from the Human Genome Atlas. Extensive numerical experiments demonstrate superior performance of the proposed approach compared to other state-of-the-art algorithms, including the latest graph-based classification techniques. Simulation results demonstrate that integrating various data types enhances classification performance and leads to a better understanding of interrelations between diverse omics data types. The proposed approach outperforms many of the state-of-the-art data integration algorithms. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

Science.gov (United States)

Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

2016-03-01

Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a work flow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when it is using RNA-Seq as sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/. Contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Cloning of low dose radiation induced gene RIG1 by RACE based on non-cloned cDNA library

International Nuclear Information System (INIS)

Luo Ying; Sui Jianli; Tie Yi; Zhang Yuanping; Zhou Pingkun; Sun Zhixian

2001-01-01

Objective: To obtain the full-length cDNA of the novel radiation-induced gene RIG1 based on its EST fragment. Methods: Based on a non-cloned cDNA library, enhanced nested RACE PCR and a biotin-avidin labelled probe for magnetic bead purification were used to obtain the full-length cDNA of RIG1. Results: About 1 kb of the 3' end of the RIG1 gene was successfully cloned by this set of methods, and cloning of the RIG1 5' end is proceeding well. Conclusion: The result is consistent with the experimental design. This set of protocols is useful for cloning full-length genes based on EST fragments.

Study of hepatitis B virus gene mutations with enzymatic colorimetry-based DNA microarray.

Science.gov (United States)

Mao, Hailei; Wang, Huimin; Zhang, Donglei; Mao, Hongju; Zhao, Jianlong; Shi, Jian; Cui, Zhichu

2006-01-01

To establish a modified microarray method for detecting HBV gene mutations in the clinic. Site-specific oligonucleotide probes were immobilized to microarray slides and hybridized to biotin-labeled HBV gene fragments amplified from two-step PCR. Hybridized targets were transferred to nitrocellulose membranes, followed by intensity measurement using BCIP/NBT colorimetry. HBV genes from 99 Hepatitis B patients and 40 healthy blood donors were analyzed. Mutation frequencies of HBV pre-core/core and basic core promoter (BCP) regions were found to be significantly higher in the patient group (42%, 40% versus 2.5%, 5%, P colorimetry method exhibited the same level of sensitivity and reproducibility. An enzymatic colorimetry-based DNA microarray assay was successfully established to monitor HBV mutations. Pre-core/core and BCP mutations of HBV genes could be major causes of HBV infection in HBeAg-negative patients and could also be relevant to chronicity and aggravation of hepatitis B.
RNAi-Based Identification of Gene-Specific Nuclear Cofactor Networks Regulating Interleukin-1 Target Genes

Directory of Open Access Journals (Sweden)

Johanna Meier-Soelch

2018-04-01

The potent proinflammatory cytokine interleukin (IL)-1 triggers gene expression through the NF-κB signaling pathway. Here, we investigated the cofactor requirements of strongly regulated IL-1 target genes whose expression is impaired in p65 NF-κB-deficient murine embryonic fibroblasts. By two independent small-hairpin (sh)RNA screens, we examined 170 genes annotated to encode nuclear cofactors for their role in Cxcl2 mRNA expression and identified 22 factors that modulated basal or IL-1-inducible Cxcl2 levels. The functions of 16 of these factors were validated for Cxcl2 and further analyzed for their role in regulation of 10 additional IL-1 target genes by RT-qPCR. These data reveal that each inducible gene has its own (quantitative) requirement of cofactors to maintain basal levels and to respond to IL-1. Twelve factors (Epc1, H2afz, Kdm2b, Kdm6a, Mbd3, Mta2, Phf21a, Ruvbl1, Sin3b, Suv420h1, Taf1, and Ube3a) have not been previously implicated in inflammatory cytokine functions. Bioinformatics analysis indicates that they are components of complex nuclear protein networks that regulate chromatin functions and gene transcription. Collectively, these data suggest that downstream from the essential NF-κB signal each cytokine-inducible target gene has further subtle requirements for individual sets of nuclear cofactors that shape its transcriptional activation profile.
iSyTE 2.0: a database for expression-based gene discovery in the eye

Science.gov (United States)

Kakrana, Atul; Yang, Andrian; Anand, Deepti; Djordjevic, Djordje; Ramachandruni, Deepti; Singh, Abhyudai; Huang, Hongzhan

2018-01-01

Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.bioinformatics.udel.edu/iSyTE) based on well-curated, comprehensive genome-level lens expression data as a one-stop portal for the effective visualization and analysis of candidate genes in lens development and disease. iSyTE 2.0 includes all publicly available lens Affymetrix and Illumina microarray datasets representing a broad range of embryonic and postnatal stages from wild-type and specific gene-perturbation mouse mutants with eye defects. Further, we developed a new user-friendly web interface for direct access and cogent visualization of the curated expression data, which supports convenient searches and a range of downstream analyses. The utility of these new iSyTE 2.0 features is illustrated through examples of established genes associated with lens development and pathobiology, which serve as tutorials for its application by the end-user. iSyTE 2.0 will facilitate the prioritization of eye development and disease-linked candidate genes in studies involving transcriptomics or next-generation sequencing data, linkage analysis and GWAS approaches. PMID:29036527
A dual selection based, targeted gene replacement tool for Magnaporthe grisea and Fusarium oxysporum.

Science.gov (United States)

Khang, Chang Hyun; Park, Sook-Young; Lee, Yong-Hwan; Kang, Seogchan

2005-06-01

Rapid progress in fungal genome sequencing presents many new opportunities for functional genomic analysis of fungal biology through the systematic mutagenesis of the genes identified through sequencing. However, the lack of efficient tools for targeted gene replacement is a limiting factor for fungal functional genomics, as it often necessitates the screening of a large number of transformants to identify the desired mutant. We developed an efficient method of gene replacement and evaluated factors affecting the efficiency of this method using two plant pathogenic fungi, Magnaporthe grisea and Fusarium oxysporum. This method is based on Agrobacterium tumefaciens-mediated transformation with a mutant allele of the target gene flanked by the herpes simplex virus thymidine kinase (HSVtk) gene as a conditional negative selection marker against ectopic transformants. The HSVtk gene product converts 5-fluoro-2'-deoxyuridine to a compound toxic to diverse fungi. Because ectopic transformants express HSVtk, while gene replacement mutants lack HSVtk, growing transformants on a medium amended with 5-fluoro-2'-deoxyuridine facilitates the identification of targeted mutants by counter-selecting against ectopic transformants. In addition to M. grisea and F. oxysporum, the method and associated vectors are likely to be applicable to manipulating genes in a broad spectrum of fungi, thus potentially serving as an efficient, universal functional genomic tool for harnessing the growing body of fungal genome sequence data to study fungal biology.
A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

Directory of Open Access Journals (Sweden)

Borui Pi

Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.
A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

Science.gov (United States)

Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

2015-01-01

Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180
Suicide genes or p53 gene and p53 target genes as targets for cancer gene therapy by ionizing radiation

International Nuclear Information System (INIS)

Liu Bing; Chinese Academy of Sciences, Beijing; Zhang Hong

2005-01-01

Radiotherapy has some disadvantages due to the severe side effects on normal tissues at a curative dose of ionizing radiation (IR). Similarly, gene therapy, as a newly developing approach, also has disadvantages, such as lack of specificity for tumors, limited expression of the therapeutic gene, and potential biological risk. To a certain extent, these problems could be solved by therapies in which suicide genes, or the p53 gene and its target genes, are targeted by ionizing radiation. This strategy not only compensates for the disadvantages of radiotherapy or gene therapy alone, but also improves the success rate at a lower dose. To date, several vectors have advanced toward clinical trials. This review focuses on the development of cancer gene therapy through suicide genes or p53 and its target genes mediated by IR. (authors)
Reranking candidate gene models with cross-species comparison for improved gene prediction

Directory of Open Access Journals (Sweden)

Pereira Fernando CN

2008-10-01

Full Text Available Abstract Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc.). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models.
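The reranking idea in this record can be illustrated with a minimal sketch. It is not the authors' scoring rule: the candidate names, the two scores, and the `weight` parameter are all hypothetical, and the sketch simply mixes the original gene finder score with a similarity score to a putative ortholog.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    finder_score: float         # score from the original gene finder (higher is better)
    ortholog_similarity: float  # hypothetical 0..1 similarity to a putative ortholog


def rerank(candidates, weight=0.5):
    """Sort candidates by a weighted mix of finder score and ortholog similarity.

    weight=0 reproduces the gene finder's own ranking; weight=1 ranks by
    cross-species similarity alone. Both scores are assumed to share a scale.
    """
    def combined(c):
        return (1.0 - weight) * c.finder_score + weight * c.ortholog_similarity

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical K-best list for one locus (K = 3).
k_best = [
    Candidate("model_A", 0.90, 0.20),
    Candidate("model_B", 0.85, 0.95),
    Candidate("model_C", 0.60, 0.40),
]
reranked = rerank(k_best)
```

In this toy example the second-best model from the gene finder moves to the top because it agrees much better with the ortholog, which is the behaviour the abstract describes.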
Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data

Directory of Open Access Journals (Sweden)

de los Reyes Benildo G

2008-04-01

Full Text Available Abstract Background Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published data. Integrating these complementary datasets helps infer a mutually consistent transcriptional regulatory network (TRN) with strong similarity to the structure of the underlying genetic regulatory modules. Decomposing the TRN into a small set of recurring regulatory patterns, called network motifs (NM), facilitates the inference. Identifying NMs defined by specific transcription factors (TF) establishes the framework structure of a TRN and allows the inference of TF-target gene relationship. This paper introduces a computational framework for utilizing data from multiple sources to infer TF-target gene relationships on the basis of NMs. The data include time course gene expression profiles, genome-wide location analysis data, binding sequence data, and gene ontology (GO) information. Results The proposed computational framework was tested using gene expression data associated with cell cycle progression in yeast. Among 800 cell cycle related genes, 85 were identified as candidate TFs and classified into four previously defined NMs. The NMs for a subset of TFs are obtained from literature. Support vector machine (SVM) classifiers were used to estimate NMs for the remaining TFs. The potential downstream target genes for the TFs were clustered into 34 biologically significant groups. The relationships between TFs and potential target gene clusters were examined by training recurrent neural networks whose topologies mimic the NMs to which the TFs are classified. The identified relationships between TFs and gene clusters were evaluated using the following biological validation and statistical analyses: (1) Gene set enrichment
Radiopharmaceuticals to monitor the expression of transferred genes in gene transfer therapy

International Nuclear Information System (INIS)

Wiebe, L. I.

1997-01-01

The development and application of radiopharmaceuticals has, in many instances, been based on the pharmacological properties of therapeutic agents. The molecular biology-biotechnology revolution has had an important impact on treatment of diseases, in part through the reduced toxicity of 'biologicals', in part because of their specificity for interaction at unique molecular sites and in part because of their selective delivery to the target site. Immunotherapeutic approaches include the use of monoclonal antibodies (MABs), MAB-fragments and chemotactic peptides. Such agents currently form the basis of both diagnostic and immunotherapeutic radiopharmaceuticals. More recently, gene transfer techniques have been advanced to the point that a new molecular approach, gene therapy, has become a reality. Gene therapy offers an opportunity to attack disease at its most fundamental level. The therapeutic mechanism is based on the expression of a specific gene or genes, the product of which will invoke immunological, receptor-based or enzyme-based therapeutic modalities. Several approaches to gene therapy of cancer have been envisioned, the most clinically-advanced concepts involving the introduction of genes that will encode for molecular targets not normally found in healthy mammalian cells. A number of gene therapy clinical trials are based on the introduction of the Herpes simplex virus type-1 (HSV-1) gene that encodes for viral thymidine kinase (tk+). Once HSV-1 tk+ is expressed in the target (cancer) cell, therapy can be effected by the administration of a highly molecularly-targeted and systemically non-toxic antiviral drug such as ganciclovir. The development of radiodiagnostic imaging in gene therapy will be reviewed, using HSV-1 tk+ and radioiodinated IVFRU as a basis for development of the theme. Molecular targets that could be exploited in gene therapy, other than tk+, will be identified
Radiopharmaceuticals to monitor the expression of transferred genes in gene transfer therapy

Energy Technology Data Exchange (ETDEWEB)

Wiebe, L I [University of Alberta, Edmonton (Canada). Noujaim Institute for Pharmaceutical Oncology Research

1997-10-01

The development and application of radiopharmaceuticals has, in many instances, been based on the pharmacological properties of therapeutic agents. The molecular biology-biotechnology revolution has had an important impact on treatment of diseases, in part through the reduced toxicity of 'biologicals', in part because of their specificity for interaction at unique molecular sites and in part because of their selective delivery to the target site. Immunotherapeutic approaches include the use of monoclonal antibodies (MABs), MAB-fragments and chemotactic peptides. Such agents currently form the basis of both diagnostic and immunotherapeutic radiopharmaceuticals. More recently, gene transfer techniques have been advanced to the point that a new molecular approach, gene therapy, has become a reality. Gene therapy offers an opportunity to attack disease at its most fundamental level. The therapeutic mechanism is based on the expression of a specific gene or genes, the product of which will invoke immunological, receptor-based or enzyme-based therapeutic modalities. Several approaches to gene therapy of cancer have been envisioned, the most clinically-advanced concepts involving the introduction of genes that will encode for molecular targets not normally found in healthy mammalian cells. A number of gene therapy clinical trials are based on the introduction of the Herpes simplex virus type-1 (HSV-1) gene that encodes for viral thymidine kinase (tk+). Once HSV-1 tk+ is expressed in the target (cancer) cell, therapy can be effected by the administration of a highly molecularly-targeted and systemically non-toxic antiviral drug such as ganciclovir. The development of radiodiagnostic imaging in gene therapy will be reviewed, using HSV-1 tk+ and radioiodinated IVFRU as a basis for development of the theme. Molecular targets that could be exploited in gene therapy, other than tk+, will be identified
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

Science.gov (United States)

Mazandu, Gaston K; Mulder, Nicola J

2013-09-25

The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
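To make "term information content" concrete, the sketch below computes an annotation-based IC, IC(t) = -log p(t), on a toy ontology and uses it in Resnik's similarity (the IC of the most informative common ancestor), which is one well-known IC-based measure. The ontology, term names and annotation counts are hypothetical, and nothing here is claimed to reflect how DaGO-Fun itself is implemented.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "lipid_metabolism": {"metabolism"},
    "carb_metabolism": {"metabolism"},
    "signaling": {"root"},
}

# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "lipid_metabolism": 3,
    "carb_metabolism": 3,
    "signaling": 8,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t); p(t) counts annotations to t or any of its descendants."""
    totals = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for ancestor in ancestors(term):
            totals[ancestor] += count
    grand_total = totals["root"]
    return {term: -math.log(totals[term] / grand_total) for term in PARENTS}


def resnik(term_a, term_b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[term] for term in common)


ic = information_content()  # precomputed once, then reused for every query
sim_related = resnik("lipid_metabolism", "carb_metabolism", ic)
sim_unrelated = resnik("lipid_metabolism", "signaling", ic)
```

Precomputing the `ic` table once and reusing it for each query mirrors the design choice the abstract mentions for fast responses.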
Shikonin enhances efficacy of a gene-based cancer vaccine via induction of RANTES

Directory of Open Access Journals (Sweden)

Chen Hui-Ming

2012-04-01

Full Text Available Abstract Background Shikonin, a phytochemical purified from Lithospermum erythrorhizon, has been shown to confer diverse pharmacological activities, including accelerating granuloma formation, wound healing, anti-inflammation and others, and is explored for immune-modifier activities for vaccination in this study. Transdermal gene-based vaccine is an attractive approach for delivery of DNA transgenes encoding specific tumor antigens to host skin tissues. Skin dendritic cells (DCs), a potent antigen-presenting cell type, are known to play a critical role in transmitting and orchestrating tumor antigen-specific immunities against cancers. The present study hence employs these various components for experimentation. Method The mRNA and protein expression of RANTES were detected by RT-PCR and ELISA, respectively. The regional expression of RANTES and tissue damage in test skin were evaluated via immunohistochemistry assay. Fluorescein isothiocyanate sensitization assay was performed to trace the trafficking of DCs from the skin vaccination site to draining lymph nodes. The adjuvant effect of shikonin on gene gun-delivered human gp100 (hgp100) DNA cancer vaccine was studied in a human gp100-transfected B16 (B16/hgp100) tumor model. Results Among various phytochemicals tested, shikonin induced the highest level of expression of RANTES in normal skin tissues. In comparison, mouse RANTES cDNA gene transfection induced a higher level of mRANTES expression for a longer period, but caused more extensive skin damage. Topical application of shikonin onto the immunization site before gene gun-mediated vaccination augmented the population of skin DCs migrating into the draining lymph nodes. A hgp100 cDNA gene vaccination regimen with shikonin pretreatment as an adjuvant in a B16/hgp100 tumor model increased cytotoxic T lymphocyte activities in splenocytes and lymph node cells on target tumor cells. Conclusion Together, our findings suggest that shikonin can
Effective generation of transgenic pigs and mice by linker based sperm-mediated gene transfer.

OpenAIRE

Chang, Keejong; Qian, Jin; Jiang, MeiSheng; Liu, Yi-Hsin; Wu, Ming-Che; Chen, Chi-Dar; Lai, Chao-Kuen; Lo, Hsin-Lung; Hsiao, Chin-Ton; Brown, Lucy; Bolen, James; Huang, Hsiao-I; Ho, Pei-Yu; Shih, Ping Yao; Yao, Chen-Wen

2002-01-01

Abstract Background Transgenic animals have become valuable tools for both research and applied purposes. The current method of gene transfer, microinjection, which is widely used in transgenic mouse production, has only had limited success in producing transgenic animals of larger or higher species. Here, we report a linker based sperm-mediated gene transfer method (LB-SMGT) that greatly improves the production efficiency of large transgenic animals. Results The linker protein, a monoclonal ...
Systematics of Plant-Pathogenic and Related Streptomyces Species Based on Phylogenetic Analyses of Multiple Gene Loci

Science.gov (United States)

The 10 species of Streptomyces implicated as the etiological agents in scab disease of potatoes or soft rot disease of sweet potatoes are distributed among 7 different phylogenetic clades in analyses based on 16S rRNA gene sequences, but high sequence similarity of this gene among Streptomyces speci...
Form gene clustering method about pan-ethnic-group products based on emotional semantic

Science.gov (United States)

Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

2016-09-01

The use of form knowledge for pan-ethnic-group products depends primarily on a designer's subjective experience, without user participation. The majority of studies focus on detecting the perceptual demands of consumers for the target product category. A form gene clustering method for pan-ethnic-group products based on emotional semantics is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer-aided product form clustering technology. A case study of form gene clustering for typical pan-ethnic-group products indicates that the method is feasible. This paper opens up a new direction for the future development of product form design, which improves the agility of the product design process in the era of Industry 4.0.
Improving sensitivity of linear regression-based cell type-specific differential expression deconvolution with per-gene vs. global significance threshold.

Science.gov (United States)

Glass, Edmund R; Dozmorov, Mikhail G

2016-10-06

The goal of many human disease-oriented studies is to detect molecular mechanisms different between healthy controls and patients. Yet, commonly used gene expression measurements from blood samples suffer from variability of cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. Combined with cell counts, heterogeneous gene expression may provide deeper insights into the gene expression differences on the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression, and a global cutoff to judge significance, such as False Discovery Rate (FDR). Yet, they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect linear regression. In this paper we quantify the parameter space affecting the performance of linear regression (sensitivity of cell type-specific differential expression detection) on a per-gene basis. We evaluated the effect of sample sizes, cell type-specific proportion variability, and mean squared error on sensitivity of cell type-specific differential expression detection using linear regression. Each parameter affected variability of cell type-specific expression estimates and, subsequently, the sensitivity of differential expression detection. We provide the R package, LRCDE, which performs linear regression-based cell type-specific differential expression (deconvolution) detection on a gene-by-gene basis. Accounting for variability around cell type-specific gene expression estimates, it computes per-gene t-statistics of differential detection, p-values, t-statistic-based sensitivity, group-specific mean squared error, and several gene-specific diagnostic metrics. The sensitivity of linear regression-based cell type-specific differential expression detection differed for each gene as a function of mean squared error, per group sample sizes, and variability of the proportions
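The regression step described in this record can be sketched for a single gene and two cell types. This is a minimal illustration, not the LRCDE package: the function name, the sample values and the restriction to two cell types are assumptions. Each sample's measured expression is modelled as the sum of cell type-specific expression levels weighted by that sample's cell proportions, and the levels are estimated by ordinary least squares.

```python
def cell_type_expression(proportions, expression):
    """Least-squares estimate of two cell type-specific expression levels.

    proportions: list of (p1, p2) cell fractions, one pair per sample.
    expression:  measured (mixed) expression of one gene, one value per sample.
    Solves the 2x2 normal equations (P^T P) x = P^T y.
    """
    a = sum(p[0] * p[0] for p in proportions)
    b = sum(p[0] * p[1] for p in proportions)
    d = sum(p[1] * p[1] for p in proportions)
    r0 = sum(p[0] * y for p, y in zip(proportions, expression))
    r1 = sum(p[1] * y for p, y in zip(proportions, expression))
    det = a * d - b * b
    if abs(det) < 1e-12:
        # Without variability in cell proportions the two levels are not identifiable.
        raise ValueError("cell proportions do not vary across samples")
    return ((d * r0 - b * r1) / det, (a * r1 - b * r0) / det)


# Hypothetical cell proportions, shared by both groups for simplicity.
props = [(0.7, 0.3), (0.5, 0.5), (0.2, 0.8)]
# Hypothetical noise-free measurements: controls generated from levels (10, 4),
# cases from levels (10, 8), so only the second cell type differs.
controls = [8.2, 7.0, 5.2]
cases = [9.4, 9.0, 8.4]

control_levels = cell_type_expression(props, controls)
case_levels = cell_type_expression(props, cases)
differences = tuple(c - k for c, k in zip(case_levels, control_levels))
```

With real data the estimates carry error that depends on sample size, proportion variability and mean squared error, which is the per-gene sensitivity question the abstract addresses.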
A comprehensive family-based replication study of schizophrenia genes

DEFF Research Database (Denmark)

Aberg, Karolina A; Liu, Youfang; Bukszár, Jozsef

2013-01-01

768 control subjects from 6 databases and, after quality control 6298 individuals (including 3286 cases) from 1811 nuclear families. MAIN OUTCOMES AND MEASURES Case-control status for SCZ. RESULTS Replication results showed a highly significant enrichment of SNPs with small P values. Of the SNPs...... in an independent family-based replication study that, after quality control, consisted of 8107 SNPs. SETTING Linkage meta-analysis, brain transcriptome meta-analysis, candidate gene database, OMIM, relevant mouse studies, and expression quantitative trait locus databases. PATIENTS We included 11 185 cases and 10...
Identification of Gene Modules Associated with Low Temperatures Response in Bambara Groundnut by Network-Based Analysis.

Directory of Open Access Journals (Sweden)

Venkata Suresh Bonthala

Full Text Available Bambara groundnut (Vigna subterranea (L.) Verdc.) is an African legume and is a promising underutilized crop with good seed nutritional values. Low temperature stress at night in a number of African countries, such as Botswana, can affect the growth and development of bambara groundnut, leading to losses in potential crop yield. Therefore, in this study we developed a computational pipeline to identify and analyze the genes and gene modules associated with low temperature stress responses in bambara groundnut using the cross-species microarray technique (as bambara groundnut has no microarray chip) coupled with network-based analysis. Analyses of the bambara groundnut transcriptome using cross-species gene expression data resulted in the identification of 375 and 659 differentially expressed genes (p<0.01) under the sub-optimal (23°C) and very sub-optimal (18°C) temperatures, respectively, of which 110 genes are commonly shared between the two stress conditions. The construction of a Highest Reciprocal Rank-based gene co-expression network, followed by its partition using a Heuristic Cluster Chiseling Algorithm resulted in 6 and 7 gene modules in sub-optimal and very sub-optimal temperature stresses being identified, respectively. Modules of sub-optimal temperature stress are principally enriched with carbohydrate and lipid metabolic processes, while most of the modules of very sub-optimal temperature stress are significantly enriched with responses to stimuli and various metabolic processes. Several transcription factors (from MYB, NAC, WRKY, WHIRLY & GATA classes) that may regulate the downstream genes involved in response to stimulus in order for the plant to withstand very sub-optimal temperature stress were highlighted. The identified gene modules could be useful in breeding for low-temperature stress tolerant bambara groundnut varieties.
Identifying novel fruit-related genes in Arabidopsis thaliana based on the random walk with restart algorithm.

Science.gov (United States)

Zhang, Yunhua; Dai, Li; Liu, Ying; Zhang, YuHang; Wang, ShaoPeng

2017-01-01

Fruit is essential for plant reproduction and is responsible for protection and dispersal of seeds. The development and maturation of fruit is tightly regulated by numerous genetic factors that respond to environmental and internal stimulation. In this study, we attempted to identify novel fruit-related genes in a model organism, Arabidopsis thaliana, using a computational method. Based on validated fruit-related genes, the random walk with restart (RWR) algorithm was applied to a protein-protein interaction (PPI) network using these genes as seeds. The identified genes with high probabilities were filtered by the permutation test and linkage tests. In the permutation test, the genes that were selected due to the structure of the PPI network were discarded. In the linkage tests, the importance of each candidate gene was measured from two aspects: (1) its functional associations with validated genes and (2) its similarity with validated genes on gene ontology (GO) terms and KEGG pathways. Finally, 255 inferred genes were obtained. Subsequent extensive analysis of important genes revealed that they mainly contribute to ubiquitination (UBQ9, UBQ8, UBQ11, UBQ10), serine hydroxymethyl transfer (SHM7, SHM5, SHM6) or glycol-metabolism (HXKL2_ARATH, CSY5, GAPCP1), suggesting essential roles during the development and maturation of fruit in Arabidopsis thaliana.
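The random walk with restart step can be sketched as the standard iteration p(t+1) = (1 - r) W p(t) + r p(0), where W spreads each node's probability evenly over its neighbours and p(0) puts equal mass on the seed genes. The toy network, node names and restart probability r = 0.5 below are hypothetical and are not taken from the study.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    neighbors: dict mapping each node to a non-empty list of adjacent nodes
               (undirected network, so adjacency is listed in both directions).
    seeds:     set of seed nodes that receive the restart probability.
    """
    nodes = sorted(neighbors)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node passes (1 - restart) of its mass evenly to its neighbours.
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy PPI network with two validated (seed) genes.
ppi = {
    "seed1": ["seed2", "cand1"],
    "seed2": ["seed1", "cand1"],
    "cand1": ["seed1", "seed2", "cand2"],
    "cand2": ["cand1", "far"],
    "far": ["cand2"],
}
scores = random_walk_with_restart(ppi, {"seed1", "seed2"})
ranked = sorted((n for n in ppi if not n.startswith("seed")),
                key=scores.get, reverse=True)
```

Non-seed nodes closer to the seeds receive higher probabilities, which is why the abstract follows this step with permutation and linkage tests to remove genes favoured only by network structure.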

fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

Science.gov (United States)

Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

2015-01-07

Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there is a set of eight candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pathway-based analysis of a melanoma genome-wide association study: analysis of genes related to tumour-immunosuppression.

Directory of Open Access Journals (Sweden)

Nils Schoof

Full Text Available Systemic immunosuppression is a risk factor for melanoma, and sunburn-induced immunosuppression is thought to be causal. Genes in immunosuppression pathways are therefore candidate melanoma-susceptibility genes. If variants within these genes individually have a small effect on disease risk, the association may be undetected in genome-wide association (GWA) studies due to low power to reach a high significance level. Pathway-based approaches have been suggested as a method of incorporating a priori knowledge into the analysis of GWA studies. In this study, the association of 1113 single nucleotide polymorphisms (SNPs) in 43 genes (39 genomic regions) related to immunosuppression has been analysed using a gene-set approach in 1539 melanoma cases and 3917 controls from the GenoMEL consortium GWA study. The association between melanoma susceptibility and the whole set of tumour-immunosuppression genes, and also predefined functional subgroups of genes, was considered. The analysis was based on a measure formed by summing the evidence from the most significant SNP in each gene, and significance was evaluated empirically by case-control label permutation. An association was found between melanoma and the complete set of genes (p(emp)=0.002), as well as the subgroups related to the generation of tolerogenic dendritic cells (p(emp)=0.006) and secretion of suppressive factors (p(emp)=0.0004), thus providing preliminary evidence of involvement of tumour-immunosuppression gene polymorphisms in melanoma susceptibility. The analysis was repeated on a second phase of the GenoMEL study, which showed no evidence of an association. As one of the first attempts to replicate a pathway-level association, our results suggest that low power and heterogeneity may present challenges.
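The gene-set measure and its empirical significance can be sketched as follows. The per-SNP evidence used here (absolute case-control difference in mean allele count) is a simple stand-in chosen for brevity, not the association statistic used in the study, and the genotypes, gene names and permutation count are hypothetical.

```python
import random


def snp_evidence(genotypes, labels):
    """Stand-in evidence: absolute case-control difference in mean allele count."""
    cases = [g for g, lab in zip(genotypes, labels) if lab == 1]
    controls = [g for g, lab in zip(genotypes, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def gene_set_statistic(gene_to_snps, labels):
    """Sum, over genes, of the evidence from the strongest SNP in each gene."""
    return sum(max(snp_evidence(snp, labels) for snp in snps)
               for snps in gene_to_snps.values())


def empirical_p(gene_to_snps, labels, n_perm=999, seed=0):
    """Empirical p-value from case-control label permutation."""
    rng = random.Random(seed)
    observed = gene_set_statistic(gene_to_snps, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # same number of cases and controls, random assignment
        if gene_set_statistic(gene_to_snps, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 4 cases then 4 controls, allele counts 0/1/2 per person.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
gene_set = {
    "geneA": [[2, 2, 1, 2, 0, 0, 1, 0], [1, 0, 1, 0, 1, 0, 1, 0]],
    "geneB": [[2, 1, 2, 2, 0, 1, 0, 0]],
}
observed = gene_set_statistic(gene_set, labels)
p_emp = empirical_p(gene_set, labels)
```

Permuting labels for whole individuals keeps the correlation between SNPs intact, which is one reason a permutation-based p-value suits a statistic built from the best SNP per gene.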
Imaging gene expression in gene therapy

International Nuclear Information System (INIS)

Wiebe, Leonard I.

1997-01-01

Full text. Gene therapy can be used to introduce new genes, or to supplement the function of indigenous genes. At the present time, however, there is no non-invasive test to demonstrate efficacy of the gene transfer and expression processes. It has been postulated that scintigraphic imaging can offer unique information on both the site at which the transferred gene is expressed, and the degree of expression, both of which are critical issues for safety and clinical efficacy. Many current studies are based on 'suicide gene therapy' of cancer. Cells modified to express these genes commit metabolic suicide in the presence of an enzyme encoded by the transferred gene and a specifically-convertible prodrug. Prodrug metabolism can lead to selective metabolic trapping, required for scintigraphy. Herpes simplex virus type-1 thymidine kinase (HSV-1 tk+) has been used for 'suicide' in vivo tumor gene therapy. It has been proposed that radiolabelled nucleosides can be used as radiopharmaceuticals to detect HSV-1 tk+ gene expression where the HSV-1 tk+ gene serves a reporter or therapeutic function. Animal gene therapy models have been studied using purine- ([18F]FHPG; [18F]-ACV), and pyrimidine- ([123/131I]IVFRU; [124/131I]) antiviral nucleosides. Principles of gene therapy and gene therapy imaging will be reviewed and experimental data for [123/131I]IVFRU imaging with the HSV-1 tk+ reporter gene will be presented
PCR-based isolation and identification of full-length low-molecular-weight glutenin subunit genes in bread wheat (Triticum aestivum L.).

Science.gov (United States)

Zhang, Xiaofei; Liu, Dongcheng; Jiang, Wei; Guo, Xiaoli; Yang, Wenlong; Sun, Jiazhu; Ling, Hongqing; Zhang, Aimin

2011-12-01

Low-molecular-weight glutenin subunits (LMW-GSs) are encoded by a multi-gene family and are essential for determining the quality of wheat flour products, such as bread and noodles. However, the exact role or contribution of individual LMW-GS genes to wheat quality remains unclear. This is, at least in part, due to the difficulty in characterizing complete sequences of all LMW-GS gene family members in bread wheat. To identify full-length LMW-GS genes, a polymerase chain reaction (PCR)-based method was established, consisting of newly designed conserved primers and the previously developed LMW-GS gene molecular marker system. Using the PCR-based method, 17 LMW-GS genes were identified and characterized in Xiaoyan 54, of which 12 contained full-length sequences. Sequence alignments showed that 13 LMW-GS genes were identical to those found in Xiaoyan 54 using the genomic DNA library screening, and the other four full-length LMW-GS genes were first isolated from Xiaoyan 54. In Chinese Spring, 16 unique LMW-GS genes were isolated, and 13 of them contained full-length coding sequences. Additionally, 16 and 17 LMW-GS genes in Dongnong 101 and Lvhan 328 (chosen from the micro-core collections of Chinese germplasm), respectively, were also identified. Sequence alignments revealed that at least 15 LMW-GS genes were common in the four wheat varieties, and allelic variants of each gene shared high sequence identities (>95%) but exhibited length polymorphism in repetitive regions. This study provides a PCR-based method for efficiently identifying LMW-GS genes in bread wheat, which will improve the characterization of complex members of the LMW-GS gene family and facilitate the understanding of their contributions to wheat quality.
Time warping of evolutionary distant temporal gene expression data based on noise suppression

Directory of Open Access Journals (Sweden)

Papatsenko Dmitri

2009-10-01

Full Text Available Abstract Background Comparative analysis of genome-wide temporal gene expression data has a broad potential area of application, including evolutionary biology, developmental biology, and medicine. However, at large evolutionary distances, the construction of global alignments and the consequent comparison of the time-series data are difficult. The main reason is the accumulation of variability in expression profiles of orthologous genes, in the course of evolution. Results We applied Pearson distance matrices, in combination with other noise-suppression techniques and data filtering to improve alignments. This novel framework enhanced the capacity to capture the similarities between the temporal gene expression datasets separated by large evolutionary distances. We aligned and compared the temporal gene expression data in budding (Saccharomyces cerevisiae) and fission (Schizosaccharomyces pombe) yeast, which are separated by more than ~400 Myr of evolution. We found that the global alignment (time warping) properly matched the duration of cell cycle phases in these distant organisms, which was measured in prior studies. At the same time, when applied to individual ortholog pairs, this alignment procedure revealed groups of genes with distinct alignments, different from the global alignment. Conclusion Our alignment-based predictions of differences in the cell cycle phases between the two yeast species were in good agreement with the existing data, thus supporting the computational strategy adopted in this study. We propose that the existence of the alternative alignments, specific to distinct groups of genes, suggests the presence of different synchronization modes between the two organisms and possible functional decoupling of particular physiological gene networks in the course of evolution.
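A global time-warping alignment over a Pearson distance matrix can be sketched with classic dynamic time warping. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: each time point is taken to be a vector of expression values over the same ortholog pairs, the distance between two time points is assumed to be 1 minus their Pearson correlation, and the toy series are invented. The noise-suppression and filtering steps mentioned in the abstract are omitted.

```python
import math


def pearson_distance(x, y):
    """1 - Pearson correlation between two expression vectors of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def time_warp(series_a, series_b):
    """Dynamic time warping; returns (total cost, list of aligned index pairs)."""
    n, m = len(series_a), len(series_b)
    dist = [[pearson_distance(a, b) for b in series_b] for a in series_a]
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist[i - 1][j - 1] + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical data: three orthologs measured at 3 time points in species A and
# 4 time points in species B, where B's middle phase lasts twice as long.
species_a = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
species_b = [[2, 4, 6], [6, 2, 4], [6, 2, 4], [4, 6, 2]]
total_cost, path = time_warp(species_a, species_b)
```

The recovered path maps the single middle time point of species A onto both middle time points of species B, which is how a warping path expresses a difference in phase duration.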
Double-Bottom Chaotic Map Particle Swarm Optimization Based on Chi-Square Test to Determine Gene-Gene Interactions

Science.gov (United States)

Yang, Cheng-Hong; Chang, Hsueh-Wei

2014-01-01

Gene-gene interaction studies focus on the investigation of the association between the single nucleotide polymorphisms (SNPs) of genes for disease susceptibility. Statistical methods are widely used to search for a good model of gene-gene interaction for disease analysis, and the previously determined models have successfully explained the effects between SNPs and diseases. However, the huge numbers of potential combinations of SNP genotypes limit the use of statistical methods for analysing high-order interaction, and finding an available high-order model of gene-gene interaction remains a challenge. In this study, an improved particle swarm optimization with double-bottom chaotic maps (DBM-PSO) was applied to assist statistical methods in the analysis of associated variations to disease susceptibility. A big data set was simulated using the published genotype frequencies of 26 SNPs amongst eight genes for breast cancer. Results showed that the proposed DBM-PSO successfully determined two- to six-order models of gene-gene interaction for the risk association with breast cancer (odds ratio > 1.0; P value <0.05). Analysis results supported that the proposed DBM-PSO can identify good models and provide higher chi-square values than conventional PSO. This study indicates that DBM-PSO is a robust and precise algorithm for determination of gene-gene interaction models for breast cancer. PMID:24895547
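The chi-square criterion that scores a candidate interaction model can be sketched as below. This shows only the scoring of one SNP-genotype combination against case-control status; the particle swarm search and the double-bottom chaotic map are not reproduced, and the helper names, genotype encoding and toy data are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.

    Assumes every row and column total is positive, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def combination_table(genotypes, labels, pattern):
    """2x2 table: rows = case/control, columns = carries the pattern / does not."""
    table = [[0, 0], [0, 0]]
    for person, label in zip(genotypes, labels):
        matches = all(person[snp] == g for snp, g in pattern.items())
        table[0 if label == 1 else 1][0 if matches else 1] += 1
    return table


# Hypothetical two-SNP model evaluated on four individuals (1 = case, 0 = control).
people = [
    {"snp1": "AA", "snp2": "CT"},
    {"snp1": "AA", "snp2": "CC"},
    {"snp1": "AG", "snp2": "CT"},
    {"snp1": "AA", "snp2": "CT"},
]
status = [1, 1, 0, 0]
toy_table = combination_table(people, status, {"snp1": "AA", "snp2": "CT"})
toy_score = chi_square(toy_table)

# A larger hypothetical table: 30 of 100 cases and 50 of 100 controls carry the pattern.
larger_score = chi_square([[30, 70], [50, 50]])
```

In a search such as PSO, a function like `chi_square` would serve as the fitness that each candidate combination tries to maximise.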
The progress of PET based reporter gene imaging

International Nuclear Information System (INIS)

Zhao Wei; Zhang Xiuli

2005-01-01

More than two decades of intense research have allowed gene therapy to move from the laboratory to the clinical setting, where its use for the treatment of human pathologies has been considerably increased in the last years. However, many crucial questions remain to be solved in this challenging field. In vivo imaging with positron emission tomography (PET) by combination of the appropriate PET reporter gene and PET reporter probe could provide invaluable qualitative and quantitative information to answer multiple unsolved questions about gene therapy. PET imaging could be used to define parameters not available by other techniques that are of substantial interest not only for the proper understanding of the gene therapy process, but also for its future development and clinical application in humans. (authors)
Entropy-based gene ranking without selection bias for the predictive classification of microarray data

Directory of Open Access Journals (Sweden)

Serafini Maria

2003-11-01

Full Text Available Abstract Background We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
A cell-based in vitro alternative to identify skin sensitizers by gene expression

International Nuclear Information System (INIS)

Hooyberghs, Jef; Schoeters, Elke; Lambrechts, Nathalie; Nelissen, Inge; Witters, Hilda; Schoeters, Greet; Heuvel, Rosette van den

2008-01-01

The ethical and economic burden associated with animal testing for assessment of skin sensitization has triggered intensive research effort towards development and validation of alternative methods. In addition, new legislation on the registration and use of cosmetics and chemicals promotes the use of suitable alternatives for hazard assessment. Our previous studies demonstrated that human CD34+ progenitor-derived dendritic cells from cord blood express specific gene profiles upon exposure to low molecular weight sensitizing chemicals. This paper presents a classification model based on this cell type which is successful in discriminating sensitizing chemicals from non-sensitizing chemicals based on transcriptome analysis of 13 genes. Expression profiles of a set of 10 sensitizers and 11 non-sensitizers were analyzed by RT-PCR using 9 different exposure conditions and a total of 73 donor samples. Based on these data a predictive dichotomous classifier for skin sensitizers has been constructed, which is referred to as . In a first step the dimensionality of the input data was reduced by selectively rejecting a number of exposure conditions and genes. Next, the generalization of a linear classifier was evaluated by a cross-validation which resulted in a prediction performance with a concordance of 89%, a specificity of 97% and a sensitivity of 82%. These results show that the present model may be a useful human in vitro alternative for further use in a test strategy towards the reduction of animal use for skin sensitization.
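The three performance figures quoted in this abstract follow from a confusion matrix. The sketch below shows the usual definitions, assuming labels of 1 for sensitizer and 0 for non-sensitizer; the function name and the example labels are hypothetical and are not the study's data.

```python
def classifier_metrics(true_labels, predicted_labels):
    """Return concordance, sensitivity and specificity for binary labels.

    Labels are 1 for sensitizer and 0 for non-sensitizer.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sensitizers found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-sensitizers found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "concordance": (tp + tn) / len(pairs),  # overall agreement
        "sensitivity": tp / (tp + fn),          # fraction of sensitizers detected
        "specificity": tn / (tn + fp),          # fraction of non-sensitizers cleared
    }
```

In a cross-validation setting, the predicted labels would be those obtained on held-out samples only.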
Tensor decomposition-based unsupervised feature extraction identifies candidate genes that induce post-traumatic stress disorder-mediated heart diseases.

Science.gov (United States)

Taguchi, Y-H

2017-12-21

Although post-traumatic stress disorder (PTSD) is primarily a mental disorder, it can cause additional symptoms that do not seem to be directly related to the central nervous system, which PTSD is assumed to directly affect. PTSD-mediated heart diseases are some of such secondary disorders. In spite of the significant correlations between PTSD and heart diseases, spatial separation between the heart and brain (where PTSD is primarily active) prevents researchers from elucidating the mechanisms that bridge the two disorders. Our purpose was to identify genes linking PTSD and heart diseases. In this study, gene expression profiles of various murine tissues observed under various types of stress or without stress were analyzed in an integrated manner using tensor decomposition (TD). Based upon the obtained features, ∼400 genes were identified as candidate genes that may mediate heart diseases associated with PTSD. Various gene enrichment analyses supported biological reliability of the identified genes. Ten genes encoding protein-, DNA-, or mRNA-interacting proteins (ILF2, ILF3, ESR1, ESR2, RAD21, HTT, ATF2, NR3C1, TP53, and TP63) were found to be likely to regulate expression of most of these ∼400 genes and therefore are candidate primary genes that cause PTSD-mediated heart diseases. Approximately 400 genes in the heart were also found to be strongly affected by various drugs whose known adverse effects are related to heart diseases and/or fear memory conditioning; these data support the reliability of our findings. TD-based unsupervised feature extraction turned out to be a useful method for gene selection and successfully identified possible genes causing PTSD-mediated heart diseases.
Advances in Viral Vector-Based TRAIL Gene Therapy for Cancer

International Nuclear Information System (INIS)

Norian, Lyse A.; James, Britnie R.; Griffith, Thomas S.

2011-01-01

Numerous biologic approaches are being investigated as anti-cancer therapies in an attempt to induce tumor regression while circumventing the toxic side effects associated with standard chemo- or radiotherapies. Among these, tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) has shown particular promise in pre-clinical and early clinical trials, due to its preferential ability to induce apoptotic cell death in cancer cells and its minimal toxicity. One limitation of TRAIL use is the fact that many tumor types display an inherent resistance to TRAIL-induced apoptosis. To circumvent this problem, researchers have explored a number of strategies to optimize TRAIL delivery and to improve its efficacy via co-administration with other anti-cancer agents. In this review, we will focus on TRAIL-based gene therapy approaches for the treatment of malignancies. We will discuss the main viral vectors that are being used for TRAIL gene therapy and the strategies that are currently being attempted to improve the efficacy of TRAIL as an anti-cancer therapeutic
A comparison of 100 human genes using an Alu element-based instability model.

Science.gov (United States)

Cook, George W; Konkel, Miriam K; Walker, Jerilyn A; Bourgeois, Matthew G; Fullerton, Mitchell L; Fussell, John T; Herbold, Heath D; Batzer, Mark A

2013-01-01

The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
Real-time PCR based on SYBR-Green I fluorescence: An alternative to the TaqMan assay for a relative quantification of gene rearrangements, gene amplifications and micro gene deletions

Directory of Open Access Journals (Sweden)

Puisieux Alain

2003-10-01

Full Text Available Abstract Background Real-time PCR is increasingly being adopted for RNA quantification and genetic analysis. At present the most popular real-time PCR assay is based on the hybridisation of a dual-labelled probe to the PCR product, and the development of a signal by loss of fluorescence quenching as PCR degrades the probe. Though this so-called 'TaqMan' approach has proved easy to optimise in practice, the dual-labelled probes are relatively expensive. Results We have designed a new assay based on SYBR-Green I binding that is quick, reliable, easily optimised and compares well with the published assay. Here we demonstrate its general applicability by measuring copy number in three different genetic contexts: the quantification of a gene rearrangement (T-cell receptor excision circles (TREC)) in peripheral blood mononuclear cells; the detection and quantification of GLI, MYC-C and MYC-N gene amplification in cell lines and cancer biopsies; and detection of deletions in the OPA1 gene in dominant optic atrophy. Conclusion Our assay has important clinical applications, providing accurate diagnostic results in less time, from less biopsy material and at less cost than assays currently employed such as FISH or Southern blotting.
Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

Directory of Open Access Journals (Sweden)

Fei Xiao

Full Text Available Combining path consistency (PC) algorithms with conditional mutual information (CMI) is widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
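To make the role of CMI concrete, the sketch below estimates conditional mutual information under a Gaussian assumption, where CMI(X;Y|Z) = 0.5 ln(|C(X,Z)| |C(Y,Z)| / (|C(Z)| |C(X,Y,Z)|)) and C denotes a covariance matrix. The Gaussian assumption is ours and is not stated in the abstract, and the regulation patterns RPNI uses to choose the conditional genes are not reproduced. All function names are hypothetical.

```python
import math


def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length variables."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    return [
        [
            sum((a - mi) * (b - mj) for a, b in zip(ci, cj)) / (n - 1)
            for cj, mj in zip(columns, means)
        ]
        for ci, mi in zip(columns, means)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def gaussian_cmi(x, y, conditioning=()):
    """CMI of x and y given the conditioning variables (Gaussian case).

    With no conditioning variables this reduces to mutual information.
    """
    z = list(conditioning)
    det_xz = determinant(covariance_matrix([x] + z))
    det_yz = determinant(covariance_matrix([y] + z))
    det_xyz = determinant(covariance_matrix([x, y] + z))
    det_z = determinant(covariance_matrix(z)) if z else 1.0
    return 0.5 * math.log(det_xz * det_yz / (det_z * det_xyz))
```

In a PC-style procedure, an edge between two genes would be removed when the CMI given some set of conditional genes falls below a chosen threshold; a value near zero suggests the dependence is indirect.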
Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium.

Directory of Open Access Journals (Sweden)

Fengxi Yang

Full Text Available Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data are available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutants corresponding to different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis. This dataset generated in our study provides new insights into the molecular mechanisms
Beyond the Central Dogma: Model-Based Learning of How Genes Determine Phenotypes

Science.gov (United States)

Reinagel, Adam; Speth, Elena Bray

2016-01-01

In an introductory biology course, we implemented a learner-centered, model-based pedagogy that frequently engaged students in building conceptual models to explain how genes determine phenotypes. Model-building tasks were incorporated within case studies and aimed at eliciting students' understanding of 1) the origin of variation in a population…
PINTA: a web server for network-based gene prioritization from expression data

DEFF Research Database (Denmark)

Nitsch, Daniela; Tranchevent, Léon-Charles; Goncalves, Joana P.

2011-01-01

PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction...
Gene-ontology enrichment analysis in two independent family-based samples highlights biologically plausible processes for autism spectrum disorders.

LENUS (Irish Health Repository)

Anney, Richard J L

2012-02-01

Recent genome-wide association studies (GWAS) have implicated a range of genes from discrete biological pathways in the aetiology of autism. However, despite the strong influence of genetic factors, association studies have yet to identify statistically robust, replicated major effect genes or SNPs. We apply the principle of the SNP ratio test methodology described by O'Dushlaine et al. to over 2100 families from the Autism Genome Project (AGP). Using a two-stage design we examine association enrichment in 5955 unique gene-ontology classifications across four groupings based on two phenotypic and two ancestral classifications. Based on estimates from simulation we identify an excess of association enrichment across all analyses. We observe enrichment in association for sets of genes involved in diverse biological processes, including pyruvate metabolism, transcription factor activation, cell-signalling and cell-cycle regulation. Both genes and processes that show enrichment have previously been examined in autistic disorders and offer biological plausibility to these findings.
Imaging gene expression in gene therapy

Energy Technology Data Exchange (ETDEWEB)

Wiebe, Leonard I. [Alberta Univ., Edmonton (Canada). Noujaim Institute for Pharmaceutical Oncology Research

1997-12-31

Full text. Gene therapy can be used to introduce new genes, or to supplement the function of indigenous genes. At the present time, however, there is no non-invasive test to demonstrate efficacy of the gene transfer and expression processes. It has been postulated that scintigraphic imaging can offer unique information on both the site at which the transferred gene is expressed, and the degree of expression, both of which are critical issues for safety and clinical efficacy. Many current studies are based on 'suicide gene therapy' of cancer. Cells modified to express these genes commit metabolic suicide in the presence of an enzyme encoded by the transferred gene and a specifically-convertible prodrug. Prodrug metabolism can lead to selective metabolic trapping, required for scintigraphy. Herpes simplex virus type-1 thymidine kinase (HSV-1 tk+) has been used for 'suicide' in vivo tumor gene therapy. It has been proposed that radiolabelled nucleosides can be used as radiopharmaceuticals to detect HSV-1 tk+ gene expression where the HSV-1 tk+ gene serves a reporter or therapeutic function. Animal gene therapy models have been studied using purine ([18F]FHPG; [18F]-ACV) and pyrimidine ([123/131I]IVRFU; [124/131I]) antiviral nucleosides. Principles of gene therapy and gene therapy imaging will be reviewed and experimental data for [123/131I]IVRFU imaging with the HSV-1 tk+ reporter gene will be presented.
Constructing an integrated gene similarity network for the identification of disease genes.

Science.gov (United States)

Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

2017-09-20

Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
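The random walk with restart step can be sketched on a single network. This is a simplification: RWRB walks on a phenotype-gene bilayer network, which is collapsed here into one adjacency matrix, and the restart probability of 0.7 is an assumed value. The function name is hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seeds.

    adjacency: square list-of-lists of non-negative edge weights.
    seeds: indices of known disease genes (the restart set).
    Iterates p <- (1 - restart) * W p + restart * p0 until convergence,
    where W is the column-normalised adjacency matrix.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    # Column-normalise so each non-empty column of W sums to 1.
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p
```

Candidate genes would then be ranked by their steady-state score, with higher scores indicating closer network proximity to the seed genes.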

PHYLOGENETIC RELATIONSHIPS AMONGST 10 Durio SPECIES BASED ON PCR-RFLP ANALYSIS OF TWO CHLOROPLAST GENES

Directory of Open Access Journals (Sweden)

Panca J. Santoso

2013-07-01

Full Text Available Twenty-seven species of Durio have been identified in Sabah and Sarawak, Malaysia, but their relationships have not been studied. This study was conducted to analyse phylogenetic relationships amongst 10 Durio species in Malaysia using PCR-RFLP on two chloroplast DNA genes, i.e. ndhC-trnV and rbcL. DNAs were extracted from young leaves of 11 accessions from 10 Durio species collected from the Tenom Agriculture Research Station, Sabah, and University Agriculture Park, Universiti Putra Malaysia. Two pairs of oligonucleotide primers, N1-N2 and rbcL1-rbcL2, were used to flank the target regions ndhC-trnV and rbcL. Eight restriction enzymes, HindIII, BsuRI, PstI, TaqI, MspI, SmaI, BshNI, and EcoR130I, were used to digest the amplicons. Based on the results of PCR-RFLP on the ndhC-trnV gene, the 10 Durio species were grouped into five distinct clusters, and the accessions generally showed high variation. However, based on the results of PCR-RFLP on the rbcL gene, the species were grouped into three distinct clusters, and generally showed low variation. This means that the ndhC-trnV gene is more reliable for phylogenetic analysis at lower taxonomic levels of Durio species or for diversity analysis, while the rbcL gene is a reliable marker for phylogenetic analysis at higher taxonomic levels. PCR-RFLP on the ndhC-trnV and rbcL genes could therefore be considered a useful marker system for phylogenetic analysis amongst Durio species. These findings might be used for further molecular marker-assisted Durio breeding programs.
Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

Directory of Open Access Journals (Sweden)

Ujjwal Maulik

Full Text Available Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the biological datasets. At first, a novel statistical strategy has been utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time, and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the David database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to know how accurately the evolved rules are able to describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests are also performed for verifying the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers is also starting with the same post
Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

Science.gov (United States)

Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

2015-01-01

Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the biological datasets. At first, a novel statistical strategy has been utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time, and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the David database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to know how accurately the evolved rules are able to describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests are also performed for verifying the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers is also starting with the same post-discretized data
Alteration of gene conversion patterns in Sordaria fimicola by supplementation with DNA bases.

Science.gov (United States)

Kitani, Y; Olive, L S

1970-08-01

Supplementation with DNA bases in crosses of Sordaria fimicola heterozygous for spore color markers (g(1), h(2)) within the gray-spore (g) locus has been found to cause significant alterations in patterns of gene conversion at the two mutant sites. Each base had its own characteristic effect in altering the conversion pattern, and responses of the two mutant sites to the four bases were different in several ways. Also, the responses of the two involved chromatids of the meiotic bivalent were different.
Empirical study of supervised gene screening

Directory of Open Access Journals (Sweden)

Ma Shuangge

2006-12-01

Full Text Available Abstract Background Microarray studies provide a way of linking variations of phenotypes with their genetic causations. Constructing predictive models using high dimensional microarray measurements usually consists of three steps: (1) unsupervised gene screening; (2) supervised gene screening; and (3) statistical model building. Supervised gene screening based on marginal gene ranking is commonly used to reduce the number of genes in the model building. Various simple statistics, such as the t-statistic or signal to noise ratio, have been used to rank genes in the supervised screening. Despite its extensive usage, statistical study of supervised gene screening remains scarce. Our study is partly motivated by the differences in gene discovery results caused by using different supervised gene screening methods. Results We investigate concordance and reproducibility of supervised gene screening based on eight commonly used marginal statistics. Concordance is assessed by the relative fractions of overlaps between top ranked genes screened using different marginal statistics. We propose a Bootstrap Reproducibility Index, which measures reproducibility of individual genes under the supervised screening. Empirical studies are based on four public microarray data sets. We consider the cases where the top 20%, 40% and 60% of genes are screened. Conclusion From a gene discovery point of view, the effect of supervised gene screening based on different marginal statistics cannot be ignored. Empirical studies show that (1) genes that pass different supervised screenings may be considerably different; (2) concordance may vary, depending on the underlying data structure and percentage of selected genes; (3) evaluated with the Bootstrap Reproducibility Index, genes that pass supervised screenings are only moderately reproducible; and (4) concordance cannot be improved by supervised screening based on reproducibility.
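The concordance measure described here can be sketched as the fraction of top-ranked genes shared by two screenings. The normalization (dividing by the size of the top set) and ranking by absolute statistic are assumptions; the abstract only says "relative fractions of overlaps". Function names and gene scores are hypothetical.

```python
def top_fraction(scores, fraction):
    """Return the set of genes in the top `fraction` by absolute statistic.

    scores: dict mapping gene name to a marginal statistic (e.g. a t-statistic).
    """
    k = max(1, int(round(fraction * len(scores))))
    ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
    return set(ranked[:k])


def concordance(scores_a, scores_b, fraction):
    """Fraction of top-ranked genes shared by two marginal statistics."""
    top_a = top_fraction(scores_a, fraction)
    top_b = top_fraction(scores_b, fraction)
    return len(top_a & top_b) / len(top_a)
```

Computing `concordance` at fractions of 0.2, 0.4 and 0.6 for each pair of statistics would mirror the three screening levels mentioned in the abstract.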
Gene-based interaction analysis shows GABAergic genes interacting with parenting in adolescent depressive symptoms

NARCIS (Netherlands)

Van Assche, Evelien; Moons, Tim; Cinar, Ozan; Viechtbauer, Wolfgang; Oldehinkel, Albertine J.; Van Leeuwen, Karla; Verschueren, Karine; Colpin, Hilde; Lambrechts, Diether; Van den Noortgate, Wim; Goossens, Luc; Claes, Stephan; van Winkel, Ruud

2017-01-01

BACKGROUND: Most gene-environment interaction studies (G × E) have focused on single candidate genes. This approach is criticized for its expectations of large effect sizes and occurrence of spurious results. We describe an approach that accounts for the polygenic nature of most psychiatric
Haplotype-based case-control study between human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 gene and cerebral infarction.

Science.gov (United States)

Naganuma, Takahiro; Nakayama, Tomohiro; Sato, Naoyuki; Fu, Zhenyan; Yamaguchi, Mai; Soma, Masayoshi; Aoi, Noriko; Usami, Ron; Doba, Nobutaka; Hinohara, Shigeaki

2009-10-01

The aim of this study was to investigate the relationship between cerebral infarction (CI) and the human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 (APE1/REF-1) gene using single-nucleotide polymorphisms (SNPs) and a haplotype-based case-control study. We selected 5 SNPs in the human APE1/REF-1 gene (rs1760944, rs3136814, rs17111967, rs3136817 and rs1130409), and performed case-control studies in 177 CI patients and 309 control subjects. rs17111967 was found to have no heterogeneity in Japanese. The overall distribution of the haplotypes constructed from rs1760944, rs3136814 and rs1130409 showed a significant difference between cases and controls. The frequency of the G-C-T haplotype was significantly higher in the CI group than in the control group (2.5% vs. 0.0%, p<0.001). Based on the results of the haplotype-based case-control study, the G-C-T haplotype may be a genetic marker of CI, and the APE1/REF-1 gene may be a CI susceptibility gene.
A postprocessing method in the HMC framework for predicting gene function based on biological instrumental data

Science.gov (United States)

Feng, Shou; Fu, Ping; Zheng, Wenbin

2018-03-01

Predicting gene function based on biological instrumental data is a complicated and challenging hierarchical multi-label classification (HMC) problem. When using local approach methods to solve this problem, a preliminary results processing method is usually needed. This paper proposed a novel preliminary results processing method called the nodes interaction method. The nodes interaction method revises the preliminary results and guarantees that the predictions are consistent with the hierarchy constraint. This method exploits the label dependency and considers the hierarchical interaction between nodes when making decisions based on the Bayesian network in its first phase. In the second phase, this method further adjusts the results according to the hierarchy constraint. Implementing the nodes interaction method in the HMC framework also enhances the HMC performance for solving the gene function prediction problem based on the Gene Ontology (GO), the hierarchy of which is a directed acyclic graph that is more difficult to tackle. The experimental results validate the promising performance of the proposed method compared to state-of-the-art methods on eight benchmark yeast data sets annotated by the GO.
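The hierarchy constraint mentioned in this abstract (a gene predicted to have a term should also be predicted to have every ancestor of that term) can be illustrated with a minimal Python sketch. This is not the nodes interaction method described by the authors; it is a simple, assumed post-processing rule that caps each node's score at the lowest corrected score among its parents in a directed acyclic graph. The term names and scores are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def enforce_hierarchy(scores, parents):
    """Cap each node's score at the minimum corrected score of its parents.

    scores:  dict mapping term -> predicted probability
    parents: dict mapping term -> list of parent terms (a DAG; roots map to [])
    """
    corrected = {}
    # static_order() yields every node only after all of its parents.
    for node in TopologicalSorter(parents).static_order():
        cap = min((corrected[p] for p in parents.get(node, [])), default=1.0)
        corrected[node] = min(scores[node], cap)
    return corrected


# Hypothetical four-term hierarchy in which term C has two parents.
parents = {"root": [], "A": ["root"], "B": ["root"], "C": ["A", "B"]}
scores = {"root": 0.9, "A": 0.4, "B": 0.8, "C": 0.7}
corrected = enforce_hierarchy(scores, parents)
```

In this toy case the score of C is lowered to that of its weakest parent, A, so no child is ever scored above an ancestor.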
HSD3B and gene-gene interactions in a pathway-based analysis of genetic susceptibility to bladder cancer.

Directory of Open Access Journals (Sweden)

Angeline S Andrew

Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case-control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene-gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3'UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85, 95% CI 1.31-2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls and 497 cases (adjusted OR 3.66, 95% CI 1.06-12.63). The effect of this prevalent SNP was stronger among males (OR 2.13, 95% CI 1.40-3.25) than females (OR 1.56, 95% CI 0.83-2.95; SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3-4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

Science.gov (United States)

Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

2016-01-01

Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genomes data and high-throughput experimental data, many computational methods for identifying essential proteins are proposed, which are useful complements to experimental methods. In this review, we show the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079
Association between NINJ2 gene polymorphisms and ischemic stroke: a family-based case-control study.

Science.gov (United States)

Zhu, Yanping; Liu, Kuo; Tang, Xun; Wang, Jinwei; Yu, Zhiping; Wu, Yiqun; Chen, Dafang; Wang, Xueyin; Fang, Kai; Li, Na; Huang, Shaoping; Hu, Yonghua

2014-11-01

Novel susceptibility genes related to ischemic stroke (IS) have been proposed in the recent literature. Population-based replication studies can produce false-positive results because of population stratification. To avoid population stratification, this study used 229 recruited IS patients and their 229 non-IS siblings. The family-based study was conducted in Beijing from June 2005 to June 2012. Association between SNPs and IS was found in the sibship discordant tests, and conditional logistic regression was performed to identify effect size and explore gene-environment interactions. Significant allelic association was identified between NINJ2 gene rs11833579 (P = 0.008), protein kinase C η gene rs2230501 (P = 0.039) and IS. The AA genotype of rs11833579 was associated with a 1.51-fold increased risk of IS (95% CI 1.04-3.46; P = 0.043), and it conferred susceptibility to IS only in a dominant model (OR 2.69; 95% CI 1.06-6.78; P = 0.036). Risk of IS was higher (HR 3.58; 95% CI 1.54-8.31; P = 0.003) especially when the carriers of the rs11833579 AA genotype were smokers. The present study suggests that the A allele of rs11833579 may play a role in mediating susceptibility to IS and may increase the risk of IS together with smoking.
SWPhylo - A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees.

Science.gov (United States)

Yu, Xiaoyu; Reva, Oleg N

2018-01-01

Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, the unsuitability of standard phylogenetic tools for processing genome-wide data, and the lack of reliable substitution models that correlate with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA.
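The idea of comparing genomes through tetranucleotide frequencies can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: it uses plain relative frequencies of overlapping 4-mers and a Euclidean distance, whereas the OUP statistics and distance measure actually used by SWPhylo are not given in the abstract and may differ. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides over the unambiguous DNA alphabet.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(seq):
    """Relative frequencies of overlapping 4-mers; ambiguous windows are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[k] / total for k in KMERS]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences standing in for genomes (assumed data, far too short for real use).
freq_a = tetra_frequencies("ACGTACGTACGT")
freq_b = tetra_frequencies("AAAACCCCGGGGTTTT")
dist_ab = euclidean_distance(freq_a, freq_b)
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; that step is omitted here.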
Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool.

Science.gov (United States)

Zaldivar, Andrew; Krichmar, Jeffrey L

2014-01-01

The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.
Allen Brain Atlas-Driven Visualizations: A Web-Based Gene Expression Energy Visualization Tool

Directory of Open Access Journals (Sweden)

Andrew Zaldivar

2014-05-01

The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.
Detection of Fusarium verticillioides by PCR-ELISA based on FUM21 gene.

Science.gov (United States)

Omori, Aline Myuki; Ono, Elisabete Yurie Sataque; Bordini, Jaqueline Gozzi; Hirozawa, Melissa Tiemi; Fungaro, Maria Helena Pelegrinelli; Ono, Mario Augusto

2018-08-01

Fusarium verticillioides is a primary corn pathogen and fumonisin producer which is associated with toxic effects in humans and animals. The traditional methods for detection of fungal contamination based on morphological characteristics are time-consuming and show low sensitivity and specificity. Therefore, the objective of this study was to develop a PCR-ELISA based on the FUM21 gene for F. verticillioides detection. The DNA of the F. verticillioides, Fusarium sp., Aspergillus sp. and Penicillium sp. isolates was analyzed by conventional PCR and PCR-ELISA to determine the specificity. The PCR-ELISA was specific to F. verticillioides isolates, showed a 2.5 pg detection limit and was 100-fold more sensitive than conventional PCR. In corn samples inoculated with F. verticillioides conidia, the detection limit of the PCR-ELISA was 1 × 10^4 conidia/g and was also 100-fold more sensitive than conventional PCR. Naturally contaminated corn samples were analyzed by PCR-ELISA based on the FUM21 gene and PCR-ELISA absorbance values correlated positively (p […] PCR-ELISA developed in this study can be useful for F. verticillioides detection in corn samples. Copyright © 2018 Elsevier Ltd. All rights reserved.
Pea Marker Database (PMD) - A new online database combining known pea (Pisum sativum L.) gene-based markers.

Science.gov (United States)

Kulaeva, Olga A; Zhernakov, Aleksandr I; Afonin, Alexey M; Boikov, Sergei S; Sulima, Anton S; Tikhonovich, Igor A; Zhukov, Vladimir A

2017-01-01

Pea (Pisum sativum L.) is the oldest model object of plant genetics and one of the most agriculturally important legumes in the world. Since the pea genome has not been sequenced yet, identification of genes responsible for mutant phenotypes or desirable agricultural traits is usually performed via genetic mapping followed by candidate gene search. Such mapping is best carried out using gene-based molecular markers, as it opens the possibility of exploiting genome synteny between pea and its close relative Medicago truncatula Gaertn., which has a sequenced and annotated genome. In the last 5 years, a large number of pea gene-based molecular markers have been designed and mapped owing to the rapid evolution of "next-generation sequencing" technologies. However, access to the complete set of markers designed worldwide is limited because the data are not uniform and are therefore hard to use. The Pea Marker Database was designed to combine the information about pea markers in the form of a user-friendly and practical online tool. Version 1 (PMD1) comprises information about 2484 genic markers, including their locations in linkage groups, the sequences of corresponding pea transcripts and the names of related genes in M. truncatula. Version 2 (PMD2) is an updated version comprising 15944 pea markers in the same format with several advanced features. To test the performance of the PMD, fine mapping of pea symbiotic genes Sym13 and Sym27 in linkage groups VII and V, respectively, was carried out. The results of mapping allowed us to propose the Sen1 gene (a homologue of the SEN1 gene of Lotus japonicus (Regel) K. Larsen) as the best candidate gene for Sym13, and to narrow the list of possible candidate genes for Sym27 to ten, thus proving PMD to be useful for pea gene mapping and cloning. All information contained in PMD1 and PMD2 is available at www.peamarker.arriam.ru.
Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

Energy Technology Data Exchange (ETDEWEB)

Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

2014-03-15

Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.
Lineage relationship of prostate cancer cell types based on gene expression

Directory of Open Access Journals (Sweden)

Ware Carol B

2011-05-01

Background: Prostate tumor heterogeneity is a major factor in disease management. Heterogeneity could be due to multiple cancer cell types with distinct gene expression. Of clinical importance is the so-called cancer stem cell type. Cell type-specific transcriptomes are used to examine lineage relationship among cancer cell types and their expression similarity to normal cell types including stem/progenitor cells. Methods: Transcriptomes were determined by Affymetrix DNA array analysis for the following cell types. Putative prostate progenitor cell populations were characterized and isolated by expression of the membrane transporter ABCG2. Stem cells were represented by embryonic stem and embryonal carcinoma cells. The cancer cell types were Gleason pattern 3 (glandular histomorphology) and pattern 4 (aglandular) sorted from primary tumors, cultured prostate cancer cell lines originally established from metastatic lesions, and xenografts LuCaP 35 (adenocarcinoma phenotype) and LuCaP 49 (neuroendocrine/small cell carcinoma) grown in mice. No gene expression differences were detected among serial passages of the LuCaP xenografts. Results: Based on transcriptomes, the different cancer cell types could be clustered into a luminal-like grouping and a non-luminal-like (also not basal-like) grouping. The non-luminal-like types showed expression more similar to that of stem/progenitor cells than the luminal-like types. However, none showed expression of stem cell genes known to maintain stemness. Conclusions: Non-luminal-like types are all representatives of aggressive disease, and this could be attributed to the similarity in overall gene expression to stem and progenitor cell types.
Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

Science.gov (United States)

Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

2006-11-01

To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 ± 0.03 [favorable; n = 115] v 0.52 ± 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 ± 0.01 v 0.84 ± 0.05; both P […] model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
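The "10-times-repeated 10-fold cross validation" named in this abstract can be illustrated by the way the sample indices are partitioned. The Python sketch below only generates the splits; it makes no assumption about PAM itself. In a complete cross validation, every model-building step (including gene selection) would be repeated on each training split, which is left out here. The function name and the seed are hypothetical.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold cross validation."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        folds = [order[i::k] for i in range(k)]  # k folds of nearly equal size
        for f, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield r, f, train_idx, test_idx


# 77 samples, matching the size of the first patient set in the abstract.
splits = list(repeated_kfold(77))
```

With these settings there are 100 train/test splits, and every sample is held out exactly once per repeat.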
Genome-wide and gene-based association studies of anxiety disorders in European and African American samples.

Directory of Open Access Journals (Sweden)

Takeshi Otowa

Anxiety disorders (ADs) are common mental disorders caused by a combination of genetic and environmental factors. Since ADs are highly comorbid with each other, partially due to shared genetic basis, studying AD phenotypes in a coordinated manner may be a powerful strategy for identifying potential genetic loci for ADs. To detect these loci, we performed genome-wide association studies (GWAS) of ADs. In addition, as a complementary approach to single-locus analysis, we also conducted gene- and pathway-based analyses. GWAS data were derived from the control sample of the Molecular Genetics of Schizophrenia (MGS) project (2,540 European American and 849 African American subjects) genotyped on the Affymetrix GeneChip 6.0 array. We applied two phenotypic approaches: (1) categorical case-control comparisons (CC) based upon psychiatric diagnoses, and (2) quantitative phenotypic factor scores (FS) derived from a multivariate analysis combining information across the clinical phenotypes. Linear and logistic models were used to analyse the association with ADs using FS and CC traits, respectively. At the single locus level, no genome-wide significant association was found. A trans-population gene-based meta-analysis across both ethnic subsamples using FS identified three genes (MFAP3L on 4q32.3, NDUFAB1 and PALB2 on 16p12) with genome-wide significance (false discovery rate [FDR] <5%). At the pathway level, several terms such as transcription regulation, cytokine binding, and developmental process were significantly enriched in ADs (FDR <5%). Our approaches studying ADs as quantitative traits and utilizing the full GWAS data may be useful in identifying susceptibility genes and pathways for ADs.

TXTGate: profiling gene groups with text-based information

DEFF Research Database (Denmark)

Glenisson, P.; Coessens, B.; Van Vooren, S.

2004-01-01

We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed towards the analysis of groups of genes. By means of tailored vocabularies, term-as well as gene-centric views are offered on selected textual...
Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood.

Science.gov (United States)

Wu, Yufeng

2012-03-01

Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
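The notion of "the probability of a gene tree topology given a species tree" can be made concrete for the simplest case of three species, where a standard closed-form result from coalescent theory applies: with one lineage sampled per species and an internal branch of length t (in coalescent units), the gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The Python sketch below encodes only this textbook case; it is not the STELLS algorithm, and the function name is hypothetical.

```python
from math import exp


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the length of the internal branch in coalescent units, assuming
    one sampled lineage per species.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    discordant = exp(-t) / 3.0  # probability of each mismatching topology
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,  # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


probs_short = three_taxon_topology_probs(0.0)  # no internal branch
probs_unit = three_taxon_topology_probs(1.0)   # one coalescent unit
```

With t = 0 all three topologies are equally likely, and the concordant topology becomes dominant as the internal branch lengthens, which is why short internal branches make incomplete lineage sorting a practical problem.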
The YJR127C/ZMS1 gene product is involved in glycerol-based respiratory growth of the yeast Saccharomyces cerevisiae.

Science.gov (United States)

Lu, Lin; Roberts, George G; Oszust, Cynthia; Hudson, Alan P

2005-10-01

A putative yeast mitochondrial upstream activating sequence (UAS) was used in a one-hybrid screening procedure that identified the YJR127C ORF on chromosome X. This gene was previously designated ZMS1 and is listed as a transcription factor on the SGD website. Real time RT-PCR assays showed that expression of YJR127C/ZMS1 was glucose-repressible, and a deletion mutant for the gene showed a growth defect on glycerol-based but not on glucose- or ethanol-based medium. Real time RT-PCR analyses identified severely attenuated transcript levels from GUT1 and GUT2 to be the source of that growth defect; the products of GUT1 and GUT2 are required for glycerol utilization. mRNA levels from a large group of mitochondria- and respiration-related nuclear genes also were shown to be attenuated in the deletion mutant. Importantly, transcript levels from the mitochondrial OLI1 gene, which has an associated organellar UAS, were attenuated in the ΔYJR127C mutant during glycerol-based growth, but those from COX3 (OXI2), which lacks an associated mitochondrial UAS, were not. Transcriptome analysis of the glycerol-grown deletion mutant showed that genes in several metabolic and other categories are affected by loss of this gene product, including protein transport, signal transduction, and others. Thus, the product of YJR127C/ZMS1 is involved in transcriptional control for genes in both cellular genetic compartments, many of which specify products required for glycerol-based growth, respiration, and other functions.
Radionuclide reporter gene imaging for cardiac gene therapy

International Nuclear Information System (INIS)

Inubushi, Masayuki; Tamaki, Nagara

2007-01-01

In the field of cardiac gene therapy, angiogenic gene therapy has been most extensively investigated. The first clinical trial of cardiac angiogenic gene therapy was reported in 1998, and at the peak, more than 20 clinical trial protocols were under evaluation. However, most trials have ceased owing to the lack of decisive proof of therapeutic effects and the potential risks of viral vectors. In order to further advance cardiac angiogenic gene therapy, remaining open issues need to be resolved: there needs to be improvement of gene transfer methods, regulation of gene expression, development of much safer vectors and optimisation of therapeutic genes. For these purposes, imaging of gene expression in living organisms is of great importance. In radionuclide reporter gene imaging, "reporter genes" transferred into cell nuclei encode for a protein that retains a complementary "reporter probe" of a positron or single-photon emitter; thus expression of the reporter genes can be imaged with positron emission tomography or single-photon emission computed tomography. Accordingly, in the setting of gene therapy, the location, magnitude and duration of the therapeutic gene co-expression with the reporter genes can be monitored non-invasively. In the near future, gene therapy may evolve into combination therapy with stem/progenitor cell transplantation, so-called cell-based gene therapy or gene-modified cell therapy. Radionuclide reporter gene imaging is now expected to contribute in providing evidence on the usefulness of this novel therapeutic approach, as well as in investigating the molecular mechanisms underlying neovascularisation and safety issues relevant to further progress in conventional gene therapy. (orig.)
Genome-Wide Constitutively Expressed Gene Analysis and New Reference Gene Selection Based on Transcriptome Data: A Case Study from Poplar/Canker Disease Interaction

Directory of Open Access Journals (Sweden)

Jiaping Zhao

2017-10-01

A number of transcriptome datasets for differential expression (DE) genes have been widely used for understanding organismal biology, but these datasets also contain untapped information that can be used to develop more precise analytical tools. With the use of transcriptome data generated from the poplar/canker disease interaction system, we describe a methodology to identify candidate reference genes from high-throughput sequencing data. This methodology will improve the accuracy of RT-qPCR and will lead to better standards for the normalization of expression data. Expression stability analysis from xylem and phloem of Populus beijingensis inoculated with the fungal canker pathogen Botryosphaeria dothidea revealed that 729 poplar transcripts (1.11%) were stably expressed, at a threshold level of coefficient of variation (CV) of FPKM < 20% and maximum fold change (MFC) of FPKM < 2.0. Expression stability and bioinformatics analysis suggested that commonly used house-keeping (HK) genes were not the most appropriate internal controls: 70 of the 72 commonly used HK genes were not stably expressed, 45 of the 72 produced multiple isoform transcripts, and some of their reported primers produced unspecific amplicons in PCR amplification. RT-qPCR analysis to compare and evaluate the expression stability of 10 commonly used poplar HK genes and 20 of the 729 newly-identified stably expressed transcripts showed that some of the newly-identified genes (such as SSU_S8e, LSU_L5e, and 20S_PSU) had higher stability ranking than most of the commonly used HK genes. Based on these results, we recommend a pipeline for deriving reference genes from transcriptome data. An appropriate candidate gene should have a unique transcript, constitutive expression, CV value of expression < 20% (or possibly 30%) and MFC value of expression <2, and an expression level of 50–1,000 units. Lastly, when four of the newly identified HK genes were used in the normalization of expression data for 20
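The numeric thresholds quoted in this abstract (CV of FPKM below 20%, maximum fold change below 2.0, expression level of 50 to 1,000 units) can be turned into a small filter. The Python sketch below is an illustration under assumptions the abstract does not settle: CV is taken as the sample standard deviation divided by the mean, MFC as the largest FPKM divided by the smallest across samples, and the expression level as the mean FPKM. The function names and the FPKM values are invented for illustration.

```python
from statistics import mean, stdev


def stability_metrics(fpkm):
    """Return (mean, CV, MFC) for one transcript's FPKM values across samples."""
    m = mean(fpkm)
    if m <= 0 or min(fpkm) <= 0:
        # An unexpressed sample makes the fold change undefined; treat as unstable.
        return m, float("inf"), float("inf")
    cv = stdev(fpkm) / m           # coefficient of variation (assumed: sample SD)
    mfc = max(fpkm) / min(fpkm)    # maximum fold change (assumed: max / min)
    return m, cv, mfc


def is_candidate_reference(fpkm, cv_max=0.20, mfc_max=2.0, level=(50.0, 1000.0)):
    """Apply the CV, MFC and expression-level thresholds quoted in the abstract."""
    m, cv, mfc = stability_metrics(fpkm)
    return cv < cv_max and mfc < mfc_max and level[0] <= m <= level[1]


# Hypothetical FPKM values for three transcripts across four samples.
stable = is_candidate_reference([100.0, 110.0, 95.0, 105.0])
variable = is_candidate_reference([100.0, 300.0, 80.0, 150.0])
too_low = is_candidate_reference([5.0, 5.5, 5.2, 4.9])
```

Only the first transcript passes all three criteria; the second fails on fold change and the third on expression level. The uniqueness-of-transcript criterion from the abstract needs annotation data and is not covered by this sketch.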
Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased.

Science.gov (United States)

Xi, Zhenxiang; Liu, Liang; Davis, Charles C

2015-11-01

The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.
Microarray-based analysis of IncA/C plasmid-associated genes from multidrug-resistant Salmonella enterica.

Science.gov (United States)

Lindsey, Rebecca L; Frye, Jonathan G; Fedorka-Cray, Paula J; Meinersmann, Richard J

2011-10-01

In the family Enterobacteriaceae, plasmids have been classified according to 27 incompatibility (Inc) or replicon types that are based on the inability of different plasmids with the same replication mechanism to coexist in the same cell. Certain replicon types such as IncA/C are associated with multidrug resistance (MDR). We developed a microarray that contains 286 unique 70-mer oligonucleotide probes based on sequences from five IncA/C plasmids: pYR1 (Yersinia ruckeri), pPIP1202 (Yersinia pestis), pP99-018 (Photobacterium damselae), pSN254 (Salmonella enterica serovar Newport), and pP91278 (Photobacterium damselae). DNA from 59 Salmonella enterica isolates was hybridized to the microarray and analyzed for the presence or absence of genes. These isolates represented 17 serovars from 14 different animal hosts and from different geographical regions in the United States. Qualitative cluster analysis was performed using CLUSTER 3.0 to group microarray hybridization results. We found that IncA/C plasmids occurred in two lineages distinguished by a major insertion-deletion (indel) region that contains genes encoding mostly hypothetical proteins. The most variable genes were represented by transposon-associated genes as well as four antimicrobial resistance genes (aphA, merP, merA, and aadA). Sixteen mercury resistance genes were identified and highly conserved, suggesting that mercury ion-related exposure is a stronger pressure than anticipated. We used these data to construct a core IncA/C genome and an accessory genome. The results of our studies suggest that the transfer of antimicrobial resistance determinants by transfer of IncA/C plasmids is somewhat less common than exchange within the plasmids orchestrated by transposable elements, such as transposons, integrating and conjugative elements (ICEs), and insertion sequence common regions (ISCRs), and thus pose less opportunity for exchange of antimicrobial resistance.
A comparison of 100 human genes using an alu element-based instability model.

Directory of Open Access Journals (Sweden)

George W Cook

Full Text Available The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both (1) the two-hit double-strand break potential of Alu elements and (2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
AUDIOME: a tiered exome sequencing-based comprehensive gene panel for the diagnosis of heterogeneous nonsyndromic sensorineural hearing loss.

Science.gov (United States)

Guan, Qiaoning; Balciuniene, Jorune; Cao, Kajia; Fan, Zhiqian; Biswas, Sawona; Wilkens, Alisha; Gallo, Daniel J; Bedoukian, Emma; Tarpinian, Jennifer; Jayaraman, Pushkala; Sarmady, Mahdi; Dulik, Matthew; Santani, Avni; Spinner, Nancy; Abou Tayoun, Ahmad N; Krantz, Ian D; Conlin, Laura K; Luo, Minjie

2018-03-29

Purpose: Hereditary hearing loss is highly heterogeneous. To keep up with rapidly emerging disease-causing genes, we developed the AUDIOME test for nonsyndromic hearing loss (NSHL) using an exome sequencing (ES) platform and targeted analysis for the curated genes. Methods: A tiered strategy was implemented for this test. Tier 1 includes combined Sanger and targeted deletion analyses of the two most common NSHL genes and two mitochondrial genes. Nondiagnostic tier 1 cases are subjected to ES and array followed by targeted analysis of the remaining AUDIOME genes. Results: ES resulted in good coverage of the selected genes with 98.24% of targeted bases at >15×. A fill-in strategy was developed for the poorly covered regions, which generally fell within GC-rich or highly homologous regions. Prospective testing of 33 patients with NSHL revealed a diagnosis in 11 (33%) and a possible diagnosis in 8 cases (24.2%). Among those, 10 individuals had variants in tier 1 genes. The ES data in the remaining nondiagnostic cases are readily available for further analysis. Conclusion: The tiered and ES-based test provides an efficient and cost-effective diagnostic strategy for NSHL, with the potential to reflex to full exome to identify causal changes outside of the AUDIOME test. Genetics in Medicine advance online publication, 29 March 2018; doi:10.1038/gim.2018.48.
A Shortest-Path-Based Method for the Analysis and Prediction of Fruit-Related Genes in Arabidopsis thaliana.

Science.gov (United States)

Zhu, Liucun; Zhang, Yu-Hang; Su, Fangchu; Chen, Lei; Huang, Tao; Cai, Yu-Dong

2016-01-01

Biologically, fruits are defined as seed-bearing reproductive structures in angiosperms that develop from the ovary. The fertilization, development and maturation of fruits are crucial for plant reproduction and are precisely regulated by intrinsic genetic regulatory factors. In this study, we used Arabidopsis thaliana as a model organism and attempted to identify novel genes related to fruit-associated biological processes. Specifically, using validated genes, we applied a shortest-path-based method to identify several novel genes in a large network constructed using the protein-protein interactions observed in Arabidopsis thaliana. The described analyses indicate that several of the discovered genes are associated with fruit fertilization, development and maturation in Arabidopsis thaliana.
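The shortest-path idea in the abstract above can be illustrated with a minimal sketch. It assumes an undirected, unweighted interaction network and counts how often each non-validated gene lies on a shortest path between two validated genes. The gene names (K1, K2, K3, X, Y, D), the helper functions, and the counting rule are hypothetical illustrations, not the authors' implementation.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) interaction pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest path (list of nodes) found by BFS, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def candidate_genes(graph, known):
    """Count how often each non-validated gene sits inside a shortest path
    connecting a pair of validated genes."""
    counts = {}
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for gene in path[1:-1]:  # interior nodes only
            if gene not in known:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


# Toy network: K1, K2, K3 are "validated" genes; X, Y, D are unlabeled.
toy = build_graph([("K1", "X"), ("X", "K2"), ("K2", "Y"), ("Y", "K3"), ("X", "D")])
print(candidate_genes(toy, {"K1", "K2", "K3"}))
```

In this toy network X and Y each lie on two shortest paths between validated genes, while the dead-end node D lies on none, so only X and Y would be reported as candidates.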
Gene Environment Interactions and Predictors of Colorectal Cancer in Family-Based, Multi-Ethnic Groups

Directory of Open Access Journals (Sweden)

S. Pamela K. Shiao

2018-02-01

Full Text Available For the personalization of polygenic/omics-based health care, the purpose of this study was to examine the gene–environment interactions and predictors of colorectal cancer (CRC) by including five key genes in the one-carbon metabolism pathways. In this proof-of-concept study, we included a total of 54 families and 108 participants, 54 CRC cases and 54 matched family friends representing four major racial ethnic groups in southern California (White, Asian, Hispanics, and Black). We used three phases of data analytics, including exploratory, family-based analyses adjusting for the dependence within the family for sharing genetic heritage, the ensemble method, and generalized regression models for predictive modeling with a machine learning validation procedure to validate the results for enhanced prediction and reproducibility. The results revealed that despite the family members sharing genetic heritage, the CRC group had greater combined gene polymorphism rates than the family controls (p < 0.05), on MTHFR C677T, MTR A2756G, MTRR A66G, and DHFR 19 bp except MTHFR A1298C. Four racial groups presented different polymorphism rates for four genes (all p < 0.05) except MTHFR A1298C. Following the ensemble method, the most influential factors were identified, and the best predictive models were generated by using the generalized regression models, with Akaike’s information criterion and leave-one-out cross validation methods. Body mass index (BMI) and gender were consistent predictors of CRC for both models when individual genes versus total polymorphism counts were used, and alcohol use was interactive with BMI status. Body mass index status was also interactive with both gender and MTHFR C677T gene polymorphism, and the exposure to environmental pollutants was an additional predictor. These results point to the important roles of environmental and modifiable factors in relation to gene–environment interactions in the prevention of CRC.
Preparation and Characterization of Gelatin-Based Mucoadhesive Nanocomposites as Intravesical Gene Delivery Scaffolds

Directory of Open Access Journals (Sweden)

Ching-Wen Liu

2014-01-01

Full Text Available This study aimed to develop optimal gelatin-based mucoadhesive nanocomposites as scaffolds for intravesical gene delivery to the urothelium. Hydrogels were prepared by chemically crosslinking gelatin A or B with glutaraldehyde. Physicochemical and delivery properties including hydration ratio, viscosity, size, yield, thermosensitivity, and enzymatic degradation were studied, and scanning electron microscopy (SEM) was carried out. The optimal hydrogels (H), composed of 15% gelatin A175, displayed an 81.5% yield rate, 87.1% hydration ratio, 42.9 Pa·s viscosity, and 125.8 nm particle size. The crosslinking density of the hydrogels was determined by performing pronase degradation and ninhydrin assays. In vitro lentivirus (LV) release studies involving p24 capsid protein analysis in 293T cells revealed that hydrogels containing lentivirus (H-LV) had a higher cumulative release than that observed for LV alone (3.7-, 2.3-, and 2.3-fold at days 1, 3, and 5, respectively). Lentivirus from lentivector constructed green fluorescent protein (GFP) was then entrapped in hydrogels (H-LV-GFP). H-LV-GFP showed enhanced gene delivery in AY-27 cells in vitro and to rat urothelium by intravesical instillation in vivo. Cystometrogram showed mucoadhesive H-LV reduced peak micturition and threshold pressure and increased bladder compliance. In this study, we successfully developed the first optimal gelatin-based mucoadhesive nanocomposites as intravesical gene delivery scaffolds.
Comparison of different cationized proteins as biomaterials for nanoparticle-based ocular gene delivery.

Science.gov (United States)

Zorzi, Giovanni K; Párraga, Jenny E; Seijo, Begoña; Sanchez, Alejandro

2015-11-01

Cationized polymers have been proposed as transfection agents for gene therapy. The present work aims to improve the understanding of the potential use of different cationized proteins (atelocollagen, albumin and gelatin) as nanoparticle components and to investigate the possibility of modulating the physicochemical properties of the resulting nanoparticle carriers by selecting specific protein characteristics in an attempt to improve current ocular gene-delivery approaches. The toxicity profiles, as well as internalization and transfection efficiency, of the developed nanoparticles can be modulated by modifying the molecular weight of the selected protein and the amine used for cationization. The most promising systems are nanoparticles based on intermediate molecular weight gelatin cationized with the endogenous amine spermine, which exhibit an adequate toxicological profile, as well as effective association and protection of pDNA or siRNA molecules, thereby resulting in higher transfection efficiency and gene silencing than the other studied formulations. Copyright © 2015 Elsevier B.V. All rights reserved.
Detection of 22 common leukemic fusion genes using a single-step multiplex qRT-PCR-based assay.

Science.gov (United States)

Lyu, Xiaodong; Wang, Xianwei; Zhang, Lina; Chen, Zhenzhu; Zhao, Yu; Hu, Jieying; Fan, Ruihua; Song, Yongping

2017-07-25

Fusion genes generated from chromosomal translocation play an important role in hematological malignancies. Detection of fusion genes currently employs either conventional RT-PCR methods or fluorescent in situ hybridization (FISH); both methods involve tedious methodologies and require prior characterization of chromosomal translocation events as determined by cytogenetic analysis. In this study, we describe a real-time quantitative reverse transcription PCR (qRT-PCR)-based multi-fusion gene screening method with the capacity to detect 22 fusion genes commonly found in leukemia. This method does not require pre-characterization of gene translocation events, thereby facilitating immediate diagnosis and therapeutic management. We performed fluorescent qRT-PCR (F-qRT-PCR) using a commercially-available multi-fusion gene detection kit on a patient cohort of 345 individuals comprising 108 cases diagnosed with acute myeloid leukemia (AML) for initial evaluation; remaining patients within the cohort were assayed for confirmatory diagnosis. Results obtained by F-qRT-PCR were compared alongside patient analysis by cytogenetic characterization. Gene translocations detected by F-qRT-PCR in AML cases were diagnosed in 69.4% of the patient cohort, which was comparatively similar to 68.5% as diagnosed by cytogenetic analysis, thereby demonstrating 99.1% concordance. Overall gene fusion was detected in 53.7% of the overall patient population by F-qRT-PCR, 52.9% by cytogenetic prediction in leukemia, and 9.1% in non-leukemia patients by both methods. The overall concordance rate was calculated to be 99.0%. Fusion genes were detected by F-qRT-PCR in 97.3% of patients with CML, followed by 69.4% with AML, 33.3% with acute lymphoblastic leukemia (ALL), 9.1% with myelodysplastic syndromes (MDS), and 0% with chronic lymphocytic leukemia (CLL). We describe the use of a F-qRT-PCR-based multi-fusion gene screening method as an efficient one-step diagnostic procedure as an
Tumor targeted gene therapy

International Nuclear Information System (INIS)

Kang, Joo Hyun

2006-01-01

Knowledge of molecular mechanisms governing malignant transformation brings new opportunities for therapeutic intervention against cancer using novel approaches. One of them is gene therapy based on the transfer of genetic material to an organism with the aim of correcting a disease. The application of gene therapy to cancer treatment has led to the development of new experimental approaches such as suicidal gene therapy, inhibition of oncogenes and restoration of tumor-suppressor genes. Suicidal gene therapy is based on the expression in tumor cells of a gene encoding an enzyme that converts a prodrug into a toxic product. Representative suicidal genes are Herpes simplex virus type 1 thymidine kinase (HSV1-tk) and cytosine deaminase (CD). Physicians and scientists in the nuclear medicine field take a particular interest in suicidal gene therapy because they can monitor the location, magnitude, and duration of expression of HSV1-tk and CD by PET scanner.
A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

Science.gov (United States)

Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

2016-06-01

Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, given that the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene system and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.
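Point (iv) of the abstract above, that averaging the stochastic model recovers the classical rate equation, can be illustrated with a toy example. The sketch below is not the authors' RDS construction. It assumes a single gene whose promoter flips between off and on as a stationary two-state process, with expression following dx/dt = production * state - decay * x. All function names, parameter names, and values are hypothetical choices for illustration.

```python
import random


def simulate_switching_gene(k_on, k_off, production, decay, dt, steps, seed=0):
    """Euler simulation of dx/dt = production * state - decay * x, where
    state is a two-state (0/1) switching process. Returns the time-averaged x."""
    rng = random.Random(seed)
    state = 0
    x = 0.0
    total = 0.0
    for _ in range(steps):
        x += dt * (production * state - decay * x)
        total += x
        # Flip the promoter with probability rate * dt in each small step.
        if state == 0:
            if rng.random() < k_on * dt:
                state = 1
        elif rng.random() < k_off * dt:
            state = 0
    return total / steps


def rate_equation_steady_state(k_on, k_off, production, decay):
    """Steady state of the averaged equation dx/dt = production * p_on - decay * x,
    with p_on the stationary probability that the promoter is on."""
    p_on = k_on / (k_on + k_off)
    return production * p_on / decay


mean_x = simulate_switching_gene(1.0, 1.0, 2.0, 1.0, dt=0.01, steps=200_000, seed=1)
print(mean_x, rate_equation_steady_state(1.0, 1.0, 2.0, 1.0))
```

Because the expression equation is linear in the state variable, the long-run average of the simulated trajectory should lie close to the rate-equation steady state; only stationarity of the switching process is used, in line with assumption (ii) of the abstract.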
Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.

Science.gov (United States)

Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju

2017-04-27

means of network and pathway-based methodology, we explored the pathogenetic mechanism underlying AD at a systems biology level. Results from our work could provide valuable clues for understanding the molecular mechanism underlying AD. In addition, the framework proposed in this study could be used to investigate the pathological molecular network and genes relevant to other complex diseases or phenotypes.
Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

Science.gov (United States)

Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

2018-04-01

The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways, and its prognostic function was tested using the training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. The DEGs may be used to effectively distinguish Luminal A samples with different prognoses, as verified by hierarchical clustering analysis. There were 9 pathways screened as significant pathways and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity for both the training and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1, may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.
Application of microarray and functional-based screening methods for the detection of antimicrobial resistance genes in the microbiomes of healthy humans.

Directory of Open Access Journals (Sweden)

Roderick M Card

Full Text Available The aim of this study was to screen for the presence of antimicrobial resistance genes within the saliva and faecal microbiomes of healthy adult human volunteers from five European countries. Two non-culture based approaches were employed to obviate potential bias associated with difficult to culture members of the microbiota. In a gene target-based approach, a microarray was employed to screen for the presence of over 70 clinically important resistance genes in the saliva and faecal microbiomes. A total of 14 different resistance genes were detected encoding resistances to six antibiotic classes (aminoglycosides, β-lactams, macrolides, sulphonamides, tetracyclines and trimethoprim). The most commonly detected genes were erm(B), blaTEM, and sul2. In a functional-based approach, DNA prepared from pooled saliva samples was cloned into Escherichia coli and screened for expression of resistance to ampicillin or sulphonamide, two of the most common resistances found by array. The functional ampicillin resistance screen recovered genes encoding components of a predicted AcrRAB efflux pump. In the functional sulphonamide resistance screen, folP genes were recovered encoding mutant dihydropteroate synthase, the target of sulphonamide action. The genes recovered from the functional screens were from the chromosomes of commensal species that are opportunistically pathogenic and capable of exchanging DNA with related pathogenic species. Genes identified by microarray were not recovered in the activity-based screen, indicating that these two methods can be complementary in facilitating the identification of a range of resistance mechanisms present within the human microbiome. It also provides further evidence of the diverse reservoir of resistance mechanisms present in bacterial populations in the human gut and saliva. In future the methods described in this study can be used to monitor changes in the resistome in response to antibiotic therapy.
Training ANFIS structure using genetic algorithm for liver cancer classification based on microarray gene expression data

Directory of Open Access Journals (Sweden)

Bülent Haznedar

2017-02-01

Full Text Available Classification is an important data mining technique used in many fields, such as medicine, genetics and biomedical engineering. The number of studies on the classification of DNA microarray gene expression data has increased in recent years. However, because of the large number of genes in microarray gene expression data and the mostly nonlinear relations among those data, the success of conventional classification algorithms can be limited. For these reasons, interest in classification methods based on artificial intelligence has gradually increased in recent times. In this study, a hybrid approach based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Genetic Algorithm (GA) is suggested in order to classify a liver cancer microarray data set. Simulation results are compared with the results of other methods. According to the results obtained, the recommended method performs better than the other methods.

Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk

DEFF Research Database (Denmark)

Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan

2015-01-01

BACKGROUND: Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified...... in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). RESULTS: Gene set enrichment analysis...
Genealogy-based methods for inference of historical recombination and gene flow and their application in Saccharomyces cerevisiae.

Science.gov (United States)

Jenkins, Paul A; Song, Yun S; Brem, Rachel B

2012-01-01

Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.
SWPhylo – A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees

Science.gov (United States)

Yu, Xiaoyu; Reva, Oleg N

2018-01-01

Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment not mentioning the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, eg, GyrA. PMID:29511354
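The alignment-free comparison described in the abstract above rests on genome-wide tetranucleotide frequencies. A minimal sketch of that idea follows. It assumes plain relative frequencies of overlapping 4-mers and a Euclidean distance between profiles; the normalization and distance actually used by SWPhylo are not stated in the abstract, and the function names and toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so profiles are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Relative frequencies of overlapping 4-mers; windows containing
    characters other than A, C, G, T (e.g. N) are skipped."""
    seq = sequence.upper()
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid tetranucleotide")
    return [counts[k] / total for k in KMERS]


def euclidean_distance(p, q):
    """Distance between two frequency profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences standing in for genomes.
profile_a = tetranucleotide_frequencies("ACGTACGTACGTTTGA")
profile_b = tetranucleotide_frequencies("ACGTACGAACGTTTGA")
print(euclidean_distance(profile_a, profile_b))
```

A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method; that step is omitted here.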
Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

International Nuclear Information System (INIS)

Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

2010-01-01

Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring
A Partial Least Square Approach for Modeling Gene-gene and Gene-environment Interactions When Multiple Markers Are Genotyped

Science.gov (United States)

Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C.

2008-01-01

Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense SNPs in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey’s 1-df model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women’s Health Initiative (WHI), this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with BMI. PMID:18615621
A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped.

Science.gov (United States)

Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C

2009-01-01

Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.
In search of functional association from time-series microarray data based on the change trend and level of gene expression

Directory of Open Access Journals (Sweden)

Zeng An-Ping

2006-02-01

Full Text Available Abstract Background: The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. Present methods used to infer functional linkages between genes from expression data are mainly based on a point-to-point comparison. Change trends between consecutive time points in time-series data have so far not been well explored. Results: In this work we present a new method based on extracting main features of the change trend and level of gene expression between consecutive time points. The method, termed trend correlation (TC), includes two major steps: (1) calculating a maximal local alignment of change trend score by dynamic programming and a change trend correlation coefficient between the maximal matched change levels of each gene pair; (2) inferring relationships of gene pairs based on two statistical extraction procedures. The new method considers time shifts and inverted relationships in a similar way to the local clustering (LC) method, but the latter is based merely on a point-to-point comparison. The TC method is demonstrated with data from the yeast cell cycle and compared with the LC method and the widely used Pearson correlation coefficient (PCC)-based clustering method. The biological significance of the gene pairs is examined with several large-scale yeast databases. Although the TC method predicts an overall lower number of gene pairs than the other two methods at the same p-value threshold, the additional number of gene pairs inferred by the TC method is considerable: e.g. 20.5% compared with the LC method and 49.6% with the PCC method for a p-value threshold of 2.7E-3. Moreover, the percentage of the inferred gene pairs consistent with databases by our method is generally higher than that of the LC method and similar to that of the PCC method. A significant number of the gene pairs only inferred by the TC method are process-identity or function-similarity pairs or have well-documented biological
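The first TC step (aligning change trends with a possible time shift) can be illustrated with a minimal sketch. This is not the authors' scoring scheme: it assumes each change between consecutive time points is reduced to a sign (+1, 0, -1) and finds the longest ungapped run of matching signs by dynamic programming. The function names and the example series are hypothetical.

```python
def trend_signs(series):
    """Reduce a time series to the sign of each consecutive change: +1, 0 or -1."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def best_local_trend_match(x, y):
    """Find the longest ungapped run of matching trend signs between x and y.

    Returns (run_length, shift), where shift = start index in x's trend
    sequence minus start index in y's trend sequence.
    Simplified stand-in for a local alignment score (assumption).
    """
    a, b = trend_signs(x), trend_signs(y)
    best_len, best_end_a, best_end_b = 0, 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end_a, best_end_b = curr[j], i, j
        prev = curr
    return best_len, best_end_a - best_end_b


# Hypothetical expression profiles: gene_y shows the same pattern one step later.
gene_x = [1, 2, 3, 2, 1, 2]
gene_y = [0, 1, 2, 3, 2, 1]
print(best_local_trend_match(gene_x, gene_y))
```

Inverted relationships could be probed in the same sketch by negating one trend sequence before matching; the statistical extraction procedures of the second step are not covered here.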
Systems Pharmacology-Based Approach of Connecting Disease Genes in Genome-Wide Association Studies with Traditional Chinese Medicine.

Science.gov (United States)

Kim, Jihye; Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kang, Jaewoo; Tan, Aik Choon

2018-01-01

Traditional Chinese medicine (TCM), which originated in ancient China, has been practiced for thousands of years to treat various symptoms and diseases. However, the molecular mechanisms of TCM in treating these diseases remain unknown. In this study, we employ a systems pharmacology-based approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. We studied 102 TCM components and their target genes by analyzing microarray gene expression experiments. We constructed disease-gene networks from 2558 GWAS studies. We applied a systems pharmacology approach to prioritize disease-target genes. Using this bioinformatics approach, we analyzed 14,713 GWAS disease-TCM-target gene pairs and identified 115 disease-gene pairs with q value < 0.2. We validated several of these GWAS disease-TCM-target gene pairs with literature evidence, demonstrating that this computational approach could reveal novel indications for TCM. We also developed the TCM-Disease web application to facilitate TCM drug repurposing efforts. Systems pharmacology is a promising approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. The computational approaches described in this study could be easily extended to other disease-gene network analyses.
Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge

Directory of Open Access Journals (Sweden)

Wang Shu-Qiang

2012-07-01

Full Text Available Abstract Background: A key challenge in the post-genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse-grained qualitative models and cannot provide a quantitative view of regulatory systems. Results: A binding affinity-based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of the transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that only employ microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. Conclusions: We test the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of TF. Moreover, the kinetic parameters introduced in the proposed model can provide more biological insight than previous models.
Molecular characterisation of lumpy skin disease virus and sheeppox virus based on P32 gene

Directory of Open Access Journals (Sweden)

P.M.A.Rashid

2017-06-01

Full Text Available Lumpy skin disease virus (LSDV) and sheeppox virus (SPV) have a considerable economic impact on the cattle and small ruminant industry. They are listed in group A of contagious diseases by the World Organization for Animal Health (OIE). This study addressed molecular characterisation of the first LSDV outbreak and an endemic SPV in the Kurdistan region of Iraq based on the P32 gene. The results indicated that the P32 gene can be successfully used for diagnosis of LSDV. The phylogenetic and molecular analysis showed that there may be a new LSDV isolate circulating in Kurdistan which uniquely shared the same characteristic amino acid sequence with SPV and GPV, leucine at amino acid position 51 in the P32 gene, as well as a few genetically distinct SPV causing pox disease in Kurdistan sheep. This study provided sequence information of the P32 gene for several LSDV isolates, which positively affects the epidemiological study of Capripoxvirus.
Genomic DNA-based absolute quantification of gene expression in Vitis.

Science.gov (United States)

Gambetta, Gregory A; McElrone, Andrew J; Matthews, Mark A

2013-07-01

Many studies in which gene expression is quantified by polymerase chain reaction represent the expression of a gene of interest (GOI) relative to that of a reference gene (RG). Relative expression is founded on the assumptions that RG expression is stable across samples, treatments, organs, etc., and that reaction efficiencies of the GOI and RG are equal; assumptions which are often faulty. The true variability in RG expression and actual reaction efficiencies are seldom determined experimentally. Here we present a rapid and robust method for absolute quantification of expression in Vitis where varying concentrations of genomic DNA were used to construct GOI standard curves. This methodology was utilized to absolutely quantify and determine the variability of the previously validated RG ubiquitin (VvUbi) across three test studies in three different tissues (roots, leaves and berries). In addition, in each study a GOI was absolutely quantified. Data sets resulting from relative and absolute methods of quantification were compared and the differences were striking. VvUbi expression was significantly different in magnitude between test studies and variable among individual samples. Absolute quantification consistently reduced the coefficients of variation of the GOIs by more than half, often resulting in differences in statistical significance and in some cases even changing the fundamental nature of the result. Utilizing genomic DNA-based absolute quantification is fast and efficient. Through eliminating error introduced by assuming RG stability and equal reaction efficiencies between the RG and GOI this methodology produces less variation, increased accuracy and greater statistical power. © 2012 Scandinavian Plant Physiology Society.
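The standard-curve idea behind absolute quantification can be sketched as follows. The abstract does not give the fitting details, so this assumes the conventional log-linear qPCR model, Ct = slope × log10(copies) + intercept, fitted by ordinary least squares. The dilution series, Ct values, and function names are hypothetical illustration data, not results from the study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Reaction efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate absolute template copies."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical genomic DNA dilution series (template copies) and synthetic Ct values.
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_ct = [31.00, 27.68, 24.36, 21.04]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("efficiency:", round(amplification_efficiency(slope), 3))
print("copies at Ct 26.02:", round(copies_from_ct(26.02, slope, intercept)))
```

Because the efficiency is estimated from the curve itself, this kind of calculation does not rely on the assumption of equal GOI and RG efficiencies that the abstract criticises.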
Gene therapy for ocular diseases.

Science.gov (United States)

Liu, Melissa M; Tuo, Jingsheng; Chan, Chi-Chao

2011-05-01

The eye is an easily accessible, highly compartmentalised and immune-privileged organ that offers unique advantages as a gene therapy target. Significant advancements have been made in understanding the genetic pathogenesis of ocular diseases, and gene replacement and gene silencing have been implicated as potentially efficacious therapies. Recent improvements have been made in the safety and specificity of vector-based ocular gene transfer methods. Proof-of-concept for vector-based gene therapies has also been established in several experimental models of human ocular diseases. After nearly two decades of ocular gene therapy research, preliminary successes are now being reported in phase 1 clinical trials for the treatment of Leber congenital amaurosis. This review describes current developments and future prospects for ocular gene therapy. Novel methods are being developed to enhance the performance and regulation of recombinant adeno-associated virus- and lentivirus-mediated ocular gene transfer. Gene therapy prospects have advanced for a variety of retinal disorders, including retinitis pigmentosa, retinoschisis, Stargardt disease and age-related macular degeneration. Advances have also been made using experimental models for non-retinal diseases, such as uveitis and glaucoma. These methodological advancements are critical for the implementation of additional gene-based therapies for human ocular diseases in the near future.
Gene silencing activity of siRNA polyplexes based on thiolated N,N,N-trimethylated chitosan.

Science.gov (United States)

Varkouhi, Amir K; Verheul, Rolf J; Schiffelers, Raymond M; Lammers, Twan; Storm, Gert; Hennink, Wim E

2010-12-15

N,N,N-Trimethylated chitosan (TMC) is a biodegradable polymer emerging as a promising nonviral vector for nucleic acid and protein delivery. In the present study, we investigated whether the introduction of thiol groups in TMC enhances the extracellular stability of the complexes based on this polymer and promotes the intracellular release of siRNA. The gene silencing activity and the cellular cytotoxicity of polyplexes based on thiolated TMC were compared with those based on the nonthiolated counterpart and the regularly used lipidic transfection agent Lipofectamine. Incubation of H1299 human lung cancer cells expressing firefly luciferase with siRNA/thiolated TMC polyplexes resulted in 60-80% gene silencing activity, whereas complexes based on nonthiolated TMC showed less silencing (40%). The silencing activity of the complexes based on Lipofectamine 2000 was about 60-70%. Importantly, the TMC-SH polyplexes retained their silencing activity in the presence of hyaluronic acid, while nonthiolated TMC polyplexes hardly showed any silencing activity, demonstrating their stability against competing anionic macromolecules. Under the experimental conditions tested, the cytotoxicity of the thiolated and nonthiolated siRNA complexes was lower than those based on Lipofectamine. Given the good extracellular stability and good silencing activity, it is concluded that polyplexes based on TMC-SH are attractive systems for further in vivo evaluations.
Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

Science.gov (United States)

Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

2009-01-01

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494
Inducible, tunable and multiplex human gene regulation using CRISPR-Cpf1-based transcription factors | Office of Cancer Genomics

Science.gov (United States)

Targeted and inducible regulation of mammalian gene expression is a broadly important research capability that may also enable development of novel therapeutics for treating human diseases. Here we demonstrate that a catalytically inactive RNA-guided CRISPR-Cpf1 nuclease fused to transcriptional activation domains can up-regulate endogenous human gene expression. We engineered drug-inducible Cpf1-based activators and show how this system can be used to tune the regulation of endogenous gene transcription in human cells.
Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

Science.gov (United States)

Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

2018-06-01

Circadian genes are expressed periodically with an approximately 24-h period, and the identification and study of these genes can provide a deep understanding of the circadian control that plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Compared with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological
Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

Science.gov (United States)

Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

2014-01-01

Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Extensive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific
Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

Directory of Open Access Journals (Sweden)

Ma'ayan Avi

2007-10-01

Full Text Available Abstract Background: In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results: Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion: Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
Optimization of conditions for gene delivery system based on PEI

Directory of Open Access Journals (Sweden)

Roya Cheraghi

2017-01-01

Full Text Available Objective(s: PEI based nanoparticle (NP due to dual capabilities of proton sponge and DNA binding is known as powerful tool for nucleic acid delivery to cells. However, serious cytotoxicity and complicated conditions, which govern NPs properties and its interactions with cells practically, hindered achievement to high transfection efficiency. Here, we have tried to optimize the properties of PEI/ firefly luciferase plasmid complexes and cellular condition to improve transfection efficiency. Materials and Methods: For this purpose, firefly luciferase, as a robust gene reporter, was complexed with PEI to prepare NPs with different size and charge. The physicochemical properties of nanoparticles were evaluated using agarose gel retardation and dynamic light scattering. MCF7 and BT474 cells at different confluency were also transfected with prepared nanoparticles at various concentrations for short and long times. Results: The branched PEI can instantaneously bind to DNA and form cationic NPs. The results demonstrated the production of nanoparticles with size about 100-500 nm dependent on N/P ratio. Moreover, increase of nanoparticles concentration on the cell surface drastically improved the transfection rate, so at a concentration of 30 ng/ìl, the highest transfection efficiency was achieved. On the other side, at confluency between 40-60%, the maximum efficiency was obtained. The result demonstrated that N/P ratio of 12 could establish an optimized ratio between transfection efficiency and cytotoxicity of PEI/plasmid nanoparticles. The increase of NPs N/P ratio led to significant cytotoxicity. Conclusion: Obtained results verified the optimum conditions for PEI based gene delivery in different cell lines.
DHPLC-based mutation analysis of ENG and ALK-1 genes in HHT Italian population.

Science.gov (United States)

Lenato, Gennaro M; Lastella, Patrizia; Di Giacomo, Marilena C; Resta, Nicoletta; Suppressa, Patrizia; Pasculli, Giovanna; Sabbà, Carlo; Guanti, Ginevra

2006-02-01

Hereditary haemorrhagic telangiectasia (HHT or Rendu-Osler-Weber syndrome) is an autosomal dominant disorder characterized by localized angiodysplasia due to mutations in endoglin, ALK-1 gene, and a still unidentified locus. The lack of highly recurrent mutations, locus heterogeneity, and the presence of mutations in almost all coding exons of the two genes make the screening for mutations time-consuming and costly. In the present study, we developed a DHPLC-based protocol for mutation detection in ALK1 and ENG genes through retrospective analysis of known sequence variants, 20 causative mutations and 11 polymorphisms, and a prospective analysis on 47 probands with unknown mutation. Overall DHPLC analysis identified the causative mutation in 61 out of 66 DNA samples (92.4%). We found 31 different mutations in the ALK1 gene, of which 15 are novel, and 20, of which 12 are novel, in the ENG gene, thus providing for the first time the mutational spectrum in a cohort of Italian HHT patients. In addition, we characterized the splicing pattern of ALK1 gene in lymphoblastoid cells, both in normal controls and in two individuals carrying a mutation in the non-invariant -3 position of the acceptor splice site upstream of exon 6 (c.626-3C>G). Functional assay demonstrated the existence, also in normal individuals, of a small proportion of ALK1 alternative splicing, due to exon 5 skipping, and the presence of further aberrant splicing isoforms in the individuals carrying the c.626-3C>G mutation. 2006 Wiley-Liss, Inc.

Twin target self-amplification-based DNA machine for highly sensitive detection of cancer-related gene.

Science.gov (United States)

Xu, Huo; Jiang, Yifan; Liu, Dengyou; Liu, Kai; Zhang, Yafeng; Yu, Suhong; Shen, Zhifa; Wu, Zai-Sheng

2018-06-29

The sensitive detection of cancer-related genes is of great significance for early diagnosis and treatment of human cancers, and previous isothermal amplification sensing systems were often based on the reuse of target DNA, the amplification of enzymatic products and the accumulation of reporting probes. However, no reporting probes are able to be transformed into target species and in turn initiate the signal of other probes. Herein we reported a simple, isothermal and highly sensitive homogeneous assay system for tumor suppressor p53 gene detection based on a new autonomous DNA machine, where the signaling probe, molecular beacon (MB), was able to execute the function similar to target DNA besides providing the common signal. In the presence of target p53 gene, the operation of DNA machine can be initiated, and cyclical nucleic acid strand-displacement polymerization (CNDP) and nicking/polymerization cyclical amplification (NPCA) occur, during which the MB was opened by target species and cleaved by restriction endonuclease. In turn, the cleaved fragments could activate the next signaling process as target DNA did. According to the functional similarity, the cleaved fragment was called twin target, and the corresponding fashion to amplify the signal was named twin target self-amplification. Utilizing this newly-proposed DNA machine, the target DNA could be detected down to 0.1 pM with a wide dynamic range (6 orders of magnitude) and single-base mismatched targets were discriminated, indicating a very high assay sensitivity and good specificity. In addition, the DNA machine was not only used to screen the p53 gene in complex biological matrix but also was capable of practically detecting genomic DNA p53 extracted from A549 cell line. This indicates that the proposed DNA machine holds the potential application in biomedical research and early clinical diagnosis. Copyright © 2018 Elsevier B.V. All rights reserved.
An albumin-mediated cholesterol design-based strategy for tuning siRNA pharmacokinetics and gene silencing.

Science.gov (United States)

Bienk, Konrad; Hvam, Michael Lykke; Pakula, Malgorzata Maria; Dagnæs-Hansen, Frederik; Wengel, Jesper; Malle, Birgitte Mølholm; Kragh-Hansen, Ulrich; Cameron, Jason; Bukrinski, Jens Thostrup; Howard, Kenneth A

2016-06-28

Major challenges for the clinical translation of small interfering RNA (siRNA) include overcoming the poor plasma half-life, site-specific delivery and modulation of gene silencing. In this work, we exploit the intrinsic transport properties of human serum albumin to tune the blood circulatory half-life, hepatic accumulation and gene silencing based on the number of siRNA cholesteryl modifications. We demonstrate by a gel shift assay a strong and specific affinity of recombinant human serum albumin (rHSA) towards cholesteryl-modified siRNA (Kd > 1×10^-7 M) dependent on number of modifications. The rHSA/siRNA complex exhibited reduced nuclease degradation and reduced induction of TNF-α production by human peripheral blood mononuclear cells. The increased solubility of heavily cholesteryl modified siRNA in the presence of rHSA facilitated duplex annealing and consequent interaction that allowed in vivo studies using multiple cholesteryl modifications. A structural-activity-based screen of in vitro EGFP-silencing was used to select optimal siRNA designs containing cholesteryl modifications within the sense strand that were used for in vivo studies. We demonstrate plasma half-life extension in NMRI mice from t1/2 12 min (naked) to t1/2 45 min (single cholesteryl) and t1/2 71 min (double cholesteryl) using fluorescent live bioimaging. The biodistribution showed increased accumulation in the liver for the double cholesteryl modified siRNA that correlated with an increase in hepatic Factor VII gene silencing of 28% (rHSA/siRNA) compared to 4% (naked siRNA) 6 days post-injection. This work presents a novel albumin-mediated cholesteryl design-based strategy for tuning pharmacokinetics and systemic gene silencing. Copyright © 2016 Elsevier B.V. All rights reserved.
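To give a feel for what the reported half-lives imply, the sketch below computes the fraction of siRNA still circulating at a given time. It assumes simple first-order (single-exponential) elimination, which the abstract does not state; only the three half-life values come from the abstract, and the function and variable names are hypothetical.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of the initial plasma level left after `minutes`,
    assuming first-order elimination (an assumption, not from the abstract)."""
    return 0.5 ** (minutes / half_life_min)


# Half-lives in minutes as reported in the abstract.
REPORTED_HALF_LIVES_MIN = {
    "naked siRNA": 12.0,
    "single cholesteryl": 45.0,
    "double cholesteryl": 71.0,
}

for label, t_half in REPORTED_HALF_LIVES_MIN.items():
    remaining = fraction_remaining(60.0, t_half)
    print(f"{label}: {remaining:.1%} remaining after 60 min")
```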
Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

Science.gov (United States)

Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

2018-01-01

Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based
Recurrent neural network based hybrid model for reconstructing gene regulatory network.

Science.gov (United States)

Raza, Khalid; Alam, Mansaf

2016-10-01

One of the exciting problems in systems biology research is to decipher how the genome controls the development of complex biological systems. Gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer useful information about the functional role of individual genes in a cellular system. Discovering GRNs leads to a wide range of applications: identifying disease-related pathways that provide novel tentative drug targets, helping to predict disease response, and assisting in the diagnosis of various diseases, including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with a generalized extended Kalman filter for weight update in the backpropagation through time training algorithm. The RNN is a complex neural network that offers a good compromise between biological closeness and mathematical flexibility for modelling GRNs, and it is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy, and the Kalman filter performs well on estimation problems even with noisy data. Hence, we applied a non-linear version of the Kalman filter, known as the generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks: the DNA SOS repair network, the IRMA network, and two synthetic networks from the DREAM Challenge. A comparison of our results with other state-of-the-art techniques shows the superiority of our proposed model. Further, 5% Gaussian noise was introduced into the dataset, and the results of the proposed model show a negligible effect of noise, demonstrating the noise tolerance of the model. Copyright © 2016 Elsevier Ltd. All rights reserved.
PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs

Directory of Open Access Journals (Sweden)

Xue Fuzhong

2010-01-01

Full Text Available Abstract Background Genetic association study is currently the primary vehicle for identification and characterization of disease-predisposing variant(s), which usually involves multiple single-nucleotide polymorphisms (SNPs) available. However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of being able to account for correlations between neighbouring SNPs, yet assuming Hardy-Weinberg equilibrium (HWE) and a potentially large number of degrees of freedom can harm their statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard, but their performance varies with the method of extracting principal components (PCs). Results A PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses the PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs, i.e., cases only (CAES), controls only (COES) and cases and controls combined (CES). Extraction of PCs with COES is preferred to that with CAES and CES. Performance of the test was examined via simulations as well as analyses of data on rheumatoid arthritis and heroin addiction; the test maintained the nominal level under the null hypothesis and showed performance comparable to that of the permutation test. Conclusions PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs.
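The abstract above does not spell out the test statistic, so the following is only a minimal sketch of the general idea of a bootstrap confidence interval on PC scores. It assumes the scores on one principal component have already been computed for cases and controls, and it substitutes a percentile interval for the difference in mean score; the function name `bootstrap_ci`, its parameters and the toy scores are hypothetical and are not taken from the paper.

```python
import random

def bootstrap_ci(case_scores, control_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(case scores) - mean(control scores).

    Assumption: the inputs are scores on a single principal component.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes.
        cases = [rng.choice(case_scores) for _ in case_scores]
        controls = [rng.choice(control_scores) for _ in control_scores]
        diffs.append(sum(cases) / len(cases) - sum(controls) / len(controls))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up PC scores, for illustration only.
case_scores = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
control_scores = [-0.2, 0.1, -0.4, 0.0, 0.3, -0.1]
lower, upper = bootstrap_ci(case_scores, control_scores)
# In this sketch an interval that excludes zero is read as evidence of association.
print(lower, upper, lower > 0 or upper < 0)
```

A percentile interval is used here because it needs no distributional assumption; whether the published test compares one interval against zero or two group-wise intervals against each other is not stated in the abstract.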
Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes.

Directory of Open Access Journals (Sweden)

Christof Winter

Full Text Available Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice.
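The ranking idea described above can be illustrated with a personalised PageRank-style iteration. This is a sketch under stated assumptions, not the authors' implementation: the gene network is taken to be undirected with at least one neighbour per gene, the per-gene relevance values stand in for an expression-based association with outcome, and the names `rank_genes`, `damping` and the toy network are hypothetical.

```python
def rank_genes(adjacency, relevance, damping=0.3, iterations=100):
    """Blend each gene's own relevance with the scores of its network neighbours.

    adjacency: dict gene -> list of neighbouring genes (assumed symmetric).
    relevance: dict gene -> non-negative expression-based relevance.
    """
    genes = sorted(adjacency)
    total = sum(relevance[g] for g in genes)
    prior = {g: relevance[g] / total for g in genes}  # normalised to sum to 1
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for g in genes:
            # Each neighbour shares its current score equally among its own neighbours.
            inflow = sum(rank[n] / len(adjacency[n]) for n in adjacency[g])
            new_rank[g] = (1 - damping) * prior[g] + damping * inflow
        rank = new_rank
    return rank

# Toy star-shaped network: gene A is connected to B, C and D.
adjacency = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
relevance = {"A": 0.1, "B": 0.5, "C": 0.2, "D": 0.2}
scores = rank_genes(adjacency, relevance)
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

In the toy network the hub gene A gains score from its neighbours even though its own relevance is low, which is the behaviour a network-aware ranking is meant to capture.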
Learning Gene Regulatory Networks Computationally from Gene Expression Data Using Weighted Consensus

KAUST Repository

Fujii, Chisato

2015-04-16

Gene regulatory networks analyze the relationships between genes allowing us to understand the gene regulatory interactions in systems biology. Gene expression data from the microarray experiments is used to obtain the gene regulatory networks. However, the microarray data is discrete, noisy and non-linear which makes learning the networks a challenging problem and existing gene network inference methods do not give consistent results. Current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However each individual method has an equal contribution to the consensus prediction. We have developed a linear programming-based consensus approach which uses learned weights from linear programming among individual methods such that the methods have different weights depending on their performance. Our result reveals that assigning different weights to individual methods rather than giving them equal weights improves the performance of the consensus. The linear programming-based consensus method is evaluated and it had the best performance on in silico and Saccharomyces cerevisiae networks, and the second best on the Escherichia coli network outperformed by Inferelator Pipeline method which gives inconsistent results across a wide range of microarray data sets.
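The difference between an equal-weight and a weighted rank consensus can be sketched as follows. The weights are simply supplied by hand here; learning them by linear programming, as the abstract describes, is not shown. The function name `consensus_ranking`, the method names and the edge lists are hypothetical.

```python
def consensus_ranking(rankings, weights=None):
    """Order edges by their weighted average rank across inference methods.

    rankings: dict method -> list of edges, most confident first.
    weights:  dict method -> non-negative weight (equal weights if omitted).
    """
    if weights is None:
        weights = {method: 1.0 for method in rankings}
    total_weight = sum(weights.values())
    edges = set().union(*rankings.values())
    scores = {}
    for edge in edges:
        weighted_sum = 0.0
        for method, ranked in rankings.items():
            # An edge a method did not report gets one place below its last rank.
            rank = ranked.index(edge) + 1 if edge in ranked else len(ranked) + 1
            weighted_sum += weights[method] * rank
        scores[edge] = weighted_sum / total_weight
    # Lower average rank is better; ties are broken by edge name.
    return sorted(scores, key=lambda e: (scores[e], e))

e1, e2, e3 = ("G1", "G2"), ("G2", "G3"), ("G1", "G3")
rankings = {
    "method_a": [e1, e2, e3],
    "method_b": [e2, e1, e3],
    "method_c": [e3, e2, e1],
}
print(consensus_ranking(rankings))  # equal weights
print(consensus_ranking(rankings, {"method_a": 0.1, "method_b": 0.1, "method_c": 0.8}))
```

With equal weights the edge ranked well by most methods comes first; giving one method most of the weight moves its top edge to the front, which is the effect the weighted consensus relies on.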
Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

Science.gov (United States)

Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

2009-02-01

Abscisic acid (ABA), the popular plant stress hormone, plays a key role in the regulation of a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered the ABA Responsive Element (ABRE) core (ACGT)-containing CGMCACGTGB motif as an over-represented motif among the promoters of ABA-responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein-coding genes potentially regulated by an ABA-dependent molecular genetic network. RT-PCR analysis of 45 arbitrarily chosen genes from the predicted 402 genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA-responsive genes showed enrichment of signal transduction and stress-related genes among diverse functional categories.
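The motif CGMCACGTGB uses IUPAC ambiguity codes (M is A or C, B is C, G or T), so scanning a promoter for it amounts to a regular-expression search. The sketch below searches the given strand only; the helper names and the promoter fragment are hypothetical and made up for illustration, and the authors' actual prediction pipeline is not described in the abstract.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif into a plain regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(promoter, motif="CGMCACGTGB"):
    """Return 0-based start positions of all matches, overlapping ones included."""
    # The lookahead lets the search report matches that overlap each other.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(promoter.upper())]

# Hypothetical promoter fragment containing two instances of the motif.
promoter = "TTAGCGACACGTGCAATTCGCCACGTGTAA"
print(iupac_to_regex("CGMCACGTGB"))  # CG[AC]CACGTG[CGT]
print(find_motif(promoter))          # [4, 18]
```

A fuller scan would also search the reverse complement of each promoter, since a cis-element can sit on either strand.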
Effect of Chemical Prevention Drugs-based MicroRNAs and Their Target Genes on Tumor Inhibition

Directory of Open Access Journals (Sweden)

Yanhui JIANG

2015-04-01

Full Text Available Chemopreventive drugs, both natural and synthetic, can not only prevent cancer but also play a role in tumor treatment. MicroRNAs (miRNAs) are short non-coding RNAs that regulate the expression of many genes by degrading mRNA or inhibiting mRNA translation. In recent years, a growing number of studies have shown that chemopreventive drugs contribute to the prevention and treatment of a variety of tumors by influencing the expression of miRNAs and their target genes, and experimental studies of chemopreventive drugs acting on miRNAs and their target genes in tumors have demonstrated good safety and efficacy. The effect of chemopreventive drugs on miRNAs and their target genes in cancer cells is expected to become a new starting point for cancer research. This article describes and analyzes research progress on the relationship between natural and synthetic chemopreventive drugs and miRNAs and their target genes in tumors.
Down-Regulation of Gene Expression by RNA-Induced Gene Silencing

Science.gov (United States)

Travella, Silvia; Keller, Beat

Down-regulation of endogenous genes via post-transcriptional gene silencing (PTGS) is a key to the characterization of gene function in plants. Many RNA-based silencing mechanisms such as post-transcriptional gene silencing, co-suppression, quelling, and RNA interference (RNAi) have been discovered among species of different kingdoms (plants, fungi, and animals). One of the most interesting discoveries was RNAi, a sequence-specific gene-silencing mechanism initiated by the introduction of double-stranded RNA (dsRNA), homologous in sequence to the silenced gene, which triggers degradation of mRNA. Infection of plants with modified viruses can also induce RNA silencing and is referred to as virus-induced gene silencing (VIGS). In contrast to insertional mutagenesis, these emerging new reverse genetic approaches represent a powerful tool for exploring gene function and for manipulating gene expression experimentally in cereal species such as barley and wheat. We examined how RNAi and VIGS have been used to assess gene function in barley and wheat, including molecular mechanisms involved in the process and available methodological elements, such as vectors, inoculation procedures, and analysis of silenced phenotypes.
An Improved Single-Step Cloning Strategy Simplifies the Agrobacterium tumefaciens-Mediated Transformation (ATMT)-Based Gene-Disruption Method for Verticillium dahliae.

Science.gov (United States)

Wang, Sheng; Xing, Haiying; Hua, Chenlei; Guo, Hui-Shan; Zhang, Jie

2016-06-01

The soilborne fungal pathogen Verticillium dahliae infects a broad range of plant species to cause severe diseases. The availability of Verticillium genome sequences has provided opportunities for large-scale investigations of individual gene function in Verticillium strains using Agrobacterium tumefaciens-mediated transformation (ATMT)-based gene-disruption strategies. Traditional ATMT vectors require multiple cloning steps and elaborate characterization procedures to achieve successful gene replacement; thus, these vectors are not suitable for high-throughput ATMT-based gene deletion. Several advancements have been made that either involve simplification of the steps required for gene-deletion vector construction or increase the efficiency of the technique for rapid recombinant characterization. However, an ATMT binary vector that is both simple and efficient is still lacking. Here, we generated a USER-ATMT dual-selection (DS) binary vector, which combines both the advantages of the USER single-step cloning technique and the efficiency of the herpes simplex virus thymidine kinase negative-selection marker. Highly efficient deletion of three different genes in V. dahliae using the USER-ATMT-DS vector enabled verification that this newly-generated vector not only facilitates the cloning process but also simplifies the subsequent identification of fungal homologous recombinants. The results suggest that the USER-ATMT-DS vector is applicable for efficient gene deletion and suitable for large-scale gene deletion in V. dahliae.
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

Science.gov (United States)

Hiscock, D; Upton, C

2000-05-01

The Viral Genome DataBase (VGDB) contains detailed information of the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The data that is stored includes DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html .
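Several of the per-gene statistics listed above are simple sequence computations. The sketch below shows A + T%, nucleotide frequency and dinucleotide frequency for a DNA string; it is illustrative only, the function names and the example sequence are hypothetical, and it says nothing about how VGDB itself computes or stores these values.

```python
from collections import Counter

def at_percent(seq):
    """Percentage of bases in the sequence that are A or T."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)

def nucleotide_frequency(seq):
    """Relative frequency of each of the four bases."""
    seq = seq.upper()
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}

def dinucleotide_frequency(seq):
    """Relative frequency of each overlapping pair of adjacent bases."""
    seq = seq.upper()
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()}

seq = "ATGCGATATA"  # made-up sequence
print(at_percent(seq))
print(nucleotide_frequency(seq))
print(dinucleotide_frequency(seq))
```

Dinucleotides are counted with a sliding window of step one, so a sequence of length n contributes n - 1 overlapping pairs.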
GGDonto ontology as a knowledge-base for genetic diseases and disorders of glycan metabolism and their causative genes.

Science.gov (United States)

Solovieva, Elena; Shikanai, Toshihide; Fujita, Noriaki; Narimatsu, Hisashi

2018-04-18

Inherited mutations in glyco-related genes can affect the biosynthesis and degradation of glycans and result in severe genetic diseases and disorders. The Glyco-Disease Genes Database (GDGDB), which provides information about these diseases and disorders as well as their causative genes, was developed by the Research Center for Medical Glycoscience (RCMG) and released in April 2010. GDGDB currently provides information on about 80 genetic diseases and disorders caused by single-gene mutations in glyco-related genes. Many biomedical resources provide information about genetic disorders and genes involved in their pathogenesis, but resources focused on genetic disorders known to be related to glycan metabolism are lacking. With the aim of providing more comprehensive knowledge on genetic diseases and disorders of glycan biosynthesis and degradation, we enriched the content of the GDGDB database and improved the methods for data representation. We developed the Genetic Glyco-Diseases Ontology (GGDonto) and an RDF/SPARQL-based user interface using Semantic Web technologies. In particular, we represented the GGDonto content using Semantic Web languages, such as RDF, RDFS, SKOS, and OWL, and created an interactive user interface based on SPARQL queries. This user interface provides features to browse the hierarchy of the ontology, view detailed information on diseases and related genes, and find relevant background information. Moreover, it provides the ability to filter and search information by faceted and keyword searches. Focused on the molecular etiology, pathogenesis, and clinical manifestations of genetic diseases and disorders of glycan metabolism and developed as a knowledge-base for this scientific field, GGDonto provides comprehensive information on various topics, including links to aid the integration with other scientific resources. The availability and accessibility of this knowledge will help users better understand how genetic defects impact the
Analysis of mammalian gene function through broad based phenotypic screens across a consortium of mouse clinics

Science.gov (United States)

Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl MJ; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie

2015-01-01

The function of the majority of genes in the mouse and human genomes remains unknown. The mouse ES cell knockout resource provides a basis for characterisation of relationships between gene and phenotype. The EUMODIC consortium developed and validated robust methodologies for broad-based phenotyping of knockouts through a pipeline comprising 20 disease-orientated platforms. We developed novel statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no prior functional annotation. We captured data from over 27,000 mice finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. Novel phenotypes were uncovered for many genes with unknown function providing a powerful basis for hypothesis generation and further investigation in diverse systems. PMID:26214591
Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

Science.gov (United States)

Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

2017-10-01

Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 ( www.comparativego.com ). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
Discovering genes underlying QTL

Energy Technology Data Exchange (ETDEWEB)

Vanavichit, Apichart [Kasetsart University, Kamphaengsaen, Nakorn Pathom (Thailand)

2002-02-01

A map-based approach has allowed scientists to discover few genes at a time. In addition, the reproductive barrier between cultivated rice and wild relatives has prevented us from utilizing the germ plasm by a map-based approach. Most genetic traits important to agriculture or human diseases are manifested as observable, quantitative phenotypes called Quantitative Trait Loci (QTL). In many instances, the complexity of the phenotype/genotype interaction and the general lack of clearly identifiable gene products render the direct molecular cloning approach ineffective, thus additional strategies like genome mapping are required to identify the QTL in question. Genome mapping requires no prior knowledge of the gene function, but utilizes statistical methods to identify the most likely gene location. To completely characterize genes of interest, the initially mapped region of a gene location will have to be narrowed down to a size that is suitable for cloning and sequencing. Strategies for gene identification within the critical region have to be applied after the sequencing of a potentially large clone or set of clones that contains this gene(s). Tremendous success of positional cloning has been shown for cloning many genes responsible for human diseases, including cystic fibrosis and muscular dystrophy as well as plant disease resistance genes. Genome and QTL mapping, positional cloning: the pre-genomics era, comparative approaches to gene identification, and positional cloning: the genomics era are discussed in the report. (M. Suetake)
PlantTribes: a gene and gene family resource for comparative genomics in plants

OpenAIRE

Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.

2007-01-01

The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, ca...
Tsw gene-based resistance is triggered by a functional RNA silencing suppressor protein of the Tomato spotted wilt virus

NARCIS (Netherlands)

Ronde, de D.; Butterbach, P.B.E.; Lohuis, H.; Hedil, M.; Lent, van J.W.M.; Kormelink, R.J.M.

2013-01-01

As a result of contradictory reports, the avirulence (Avr) determinant that triggers Tsw gene-based resistance in Capsicum annuum against the Tomato spotted wilt virus (TSWV) is still unresolved. Here, the N and NSs genes of resistance-inducing (RI) and resistance-breaking (RB) isolates were cloned
Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

Science.gov (United States)

Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

2017-10-06

Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profile learning, namely the construction of the search space. Then the redundant regulations are removed using the recursive optimization (RO) algorithm, thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and to search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

Science.gov (United States)

Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

2016-02-01

Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
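As an illustration of what an information content-based semantic similarity measure computes, the sketch below implements Resnik's measure on a toy ontology: the information content (IC) of a term is the negative log of its annotation frequency, and the similarity of two terms is the highest IC among their common ancestors. This does not use or reproduce the A-DaGO-Fun API; the ontology, the annotation counts and all function names are hypothetical.

```python
import math

# Hypothetical toy ontology (child -> parents) and direct annotation counts.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}
DIRECT_COUNTS = {
    "root": 0, "metabolism": 2, "signaling": 4,
    "lipid_metabolism": 1, "sugar_metabolism": 1,
}

def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def information_content(term):
    """IC = -log(p), where p counts annotations to the term or any descendant."""
    count = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(count / sum(DIRECT_COUNTS.values()))

def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)

print(resnik("lipid_metabolism", "sugar_metabolism"))
print(resnik("lipid_metabolism", "signaling"))
```

Two terms that share only the root get a similarity of zero, because the root annotates everything and therefore carries no information.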

Embryo quality predictive models based on cumulus cells gene expression

Directory of Open Access Journals (Sweden)

Devjak R

2016-06-01

Full Text Available Since the introduction of in vitro fertilization (IVF) in clinical practice of infertility treatment, the indicators for high quality embryos have been investigated. Cumulus cells (CC) have a specific gene expression profile according to the developmental potential of the oocyte they are surrounding, and therefore, specific gene expression could be used as a biomarker. The aim of our study was to combine more than one biomarker to observe improvement in the predictive value for embryo development. In this study, 58 CC samples from 17 IVF patients were analyzed. This study was approved by the Republic of Slovenia National Medical Ethics Committee. Gene expression analysis [quantitative real time polymerase chain reaction (qPCR)] for five genes, analyzed according to embryo quality level, was performed. Two prediction models were tested for embryo quality prediction: a binary logistic and a decision tree model. As the main outcome, gene expression levels for five genes were taken and the area under the curve (AUC) for the two prediction models was calculated. Among tested genes, AMHR2 and LIF showed a significant expression difference between high quality and low quality embryos. These two genes were used for the construction of two prediction models: the binary logistic model yielded an AUC of 0.72 ± 0.08 and the decision tree model yielded an AUC of 0.73 ± 0.03. Two different prediction models yielded similar predictive power to differentiate high and low quality embryos. In terms of eventual clinical decision making, the decision tree model resulted in easy-to-interpret rules that are highly applicable in clinical practice.
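The AUC reported for both models has a simple interpretation: it is the probability that a randomly chosen high quality embryo receives a higher predicted score than a randomly chosen low quality one. The sketch below computes it directly from that definition; the function name `auc`, the scores and the labels are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted scores, higher meaning more likely positive.
    labels: 1 for positive (e.g. high quality), 0 for negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Made-up model outputs for six samples.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))  # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples, which is fine for a dataset of a few dozen samples; a rank-based formula gives the same value faster on large data.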
Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

KAUST Repository

Abusamra, Heba

2016-07-20

The inherent high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases the computational time and cost, but also improves the classification performance. Many feature selection approaches exist; however, most of them suffer from several problems, such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes that belong to the cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with a more stringent condition (higher positive and lower negative thresholds) reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between the two classes of leukemia.
Expression analysis of some genes regulated by retinoic acid in controls and triadimefon-exposed embryos: is the amphibian Xenopus laevis a suitable model for gene-based comparative teratology?

Science.gov (United States)

Di Renzo, Francesca; Rossi, Federica; Bacchetta, Renato; Prati, Mariangela; Giavini, Erminio; Menegola, Elena

2011-06-01

The use of nonmammal models in teratological studies is a matter of debate and seems to be justified if the embryotoxic mechanism involves conserved processes. Published data on mammals and Xenopus laevis suggest that azoles are teratogenic by altering the endogenous concentration of retinoic acid (RA). The expression of some genes (Shh, Ptch-1, Gsc, and Msx2) controlled by retinoic acid is downregulated in rat embryos exposed at the phylotypic stage to the triazole triadimefon (FON). In order to propose X. laevis as a model for gene-based comparative teratology, this work evaluates the expression of Shh, Ptch-1, Gsc, and Msx2 in FON-exposed X. laevis embryos. Embryos, exposed to a high concentration level (500 µM) of FON from stage 13 till 17, were examined at stages 17, 27, and 47. Stage 17 and 27 embryos were processed to perform quantitative RT-PCR. The developmental rate was never affected by FON at any considered stage. FON-exposed stage 47 larvae showed the typical craniofacial malformations. A significant downregulation of Gsc was observed in FON-exposed stage 17 embryos. Shh, Ptch-1, Msx2 showed a high fluctuation of expression both in control and in FON-exposed samples both at stages 17 and 27. The downregulation of Gsc mimics the effects of FON on rat embryos, showing for this gene a common effect of FON in the two vertebrate classes. The high fluctuation observed in the gene expression of the other genes, however, suggests that X. laevis at this stage has limited utility for gene-based comparative teratology. © 2011 Wiley-Liss, Inc.
A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli

Directory of Open Access Journals (Sweden)

Li Mingji

2012-02-01

Full Text Available Abstract Background For metabolic engineering, many rate-limiting steps may exist in the pathways of accumulating the target metabolites. Increasing the copy number of the desired genes in these pathways is a general method to solve the problem, for example, by employing a multi-copy plasmid-based expression system. However, this method may bring genetic instability, structural instability and metabolic burden to the host, while integrating the desired gene into the chromosome may cause inadequate transcription or expression. In this study, we developed a strategy for obtaining gene overexpression by engineering promoter clusters consisting of multiple core-tac-promoters (MCPtacs) in tandem. Results Through a uniquely designed in vitro assembling process, a series of promoter clusters were constructed. The transcription strength of these promoter clusters showed a stepwise enhancement with the increase of tandem repeats number until it reached the critical value of five. Application of the MCPtacs promoter clusters in polyhydroxybutyrate (PHB) production proved that it was efficient. Integration of the phaCAB genes with the 5CPtacs promoter cluster resulted in an engineered E. coli that can accumulate 23.7% PHB of the cell dry weight in batch cultivation. Conclusions The transcription strength of the MCPtacs promoter cluster can be greatly improved by increasing the tandem repeats number of the core-tac-promoter. By integrating the desired gene together with the MCPtacs promoter cluster into the chromosome of E. coli, we can achieve high and stable overexpression with only a small size. This strategy has an application potential in many fields and can be extended to other bacteria.
Molecular characterisation of the nucleocapsid protein gene, glycoprotein gene and gene junctions of rhabdovirus 903/87, a novel fish pathogenic rhabdovirus

DEFF Research Database (Denmark)

Johansson, Tove; Nylund, S.; Olesen, Niels Jørgen

2001-01-01

, M, G and L genes it was determined that transcription start and stop codons were conserved between virus 903/87 and the vesiculo viruses. Virus 903/87 has no open reading frame coding for a non-virion gene between the glycoprotein and the polymerase gene. Phylogenetic studies based on rhabdovirus...
Multicenter validation of cancer gene panel-based next-generation sequencing for translational research and molecular diagnostics.

Science.gov (United States)

Hirsch, B; Endris, V; Lassmann, S; Weichert, W; Pfarr, N; Schirmacher, P; Kovaleva, V; Werner, M; Bonzheim, I; Fend, F; Sperveslage, J; Kaulich, K; Zacher, A; Reifenberger, G; Köhrer, K; Stepanow, S; Lerke, S; Mayr, T; Aust, D E; Baretton, G; Weidner, S; Jung, A; Kirchner, T; Hansmann, M L; Burbat, L; von der Wall, E; Dietel, M; Hummel, M

2018-04-01

The simultaneous detection of multiple somatic mutations in the context of molecular diagnostics of cancer is frequently performed by means of amplicon-based targeted next-generation sequencing (NGS). However, only a few studies are available comparing multicenter testing of different NGS platforms and gene panels. Therefore, seven partner sites of the German Cancer Consortium (DKTK) performed a multicenter interlaboratory trial for targeted NGS using the same formalin-fixed, paraffin-embedded (FFPE) specimen of molecularly pre-characterized tumors (n = 15; each n = 5 cases of Breast, Lung, and Colon carcinoma) and a colorectal cancer cell line DNA dilution series. Detailed information regarding pre-characterized mutations was not disclosed to the partners. Commercially available and custom-designed cancer gene panels were used for library preparation and subsequent sequencing on several devices of two different NGS platforms. For every case, centrally extracted DNA and FFPE tissue sections for local processing were delivered to each partner site to be sequenced with the commercial gene panel and local bioinformatics. For cancer-specific panel-based sequencing, only centrally extracted DNA was analyzed at seven sequencing sites. Subsequently, local data were compiled and bioinformatics was performed centrally. We were able to demonstrate that all pre-characterized mutations were re-identified correctly, irrespective of NGS platform or gene panel used. However, locally processed FFPE tissue sections disclosed that the DNA extraction method can affect the detection of mutations with a trend in favor of magnetic bead-based DNA extraction methods. In conclusion, targeted NGS is a very robust method for simultaneous detection of various mutations in FFPE tissue specimens if certain pre-analytical conditions are carefully considered.
Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

Science.gov (United States)

Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

2017-08-01

This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using the Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
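The definition of module significance given in the abstract above (the average gene significance over a module's genes) can be sketched in a few lines. This is only an illustration under stated assumptions: gene significance is taken here to be the absolute Pearson correlation between a gene's expression and a 0/1 disease status, which is a common choice but is not confirmed by the abstract, and the gene names and expression values are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def gene_significance(expression, trait):
    # Assumption: GS = |correlation between expression and trait status|.
    return abs(pearson(expression, trait))

def module_significance(module_expr, trait):
    # Average gene significance over all genes assigned to the module.
    scores = [gene_significance(e, trait) for e in module_expr.values()]
    return sum(scores) / len(scores)

# Hypothetical toy data: three RA samples (1) and three controls (0).
trait = [1, 1, 1, 0, 0, 0]
module = {
    "geneA": [9.0, 8.5, 9.2, 4.1, 3.9, 4.4],  # higher in RA samples
    "geneB": [2.0, 2.2, 1.9, 6.0, 6.3, 5.8],  # lower in RA samples
}
ms = module_significance(module, trait)
print(f"module significance: {ms:.3f}")
```

Because the absolute value is taken per gene, up- and down-regulated genes both raise the module score; a signed variant would let them cancel.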
Transcriptome characterization and sequencing-based identification of salt-responsive genes in Millettia pinnata, a semi-mangrove plant.

Science.gov (United States)

Huang, Jianzi; Lu, Xiang; Yan, Hao; Chen, Shouyi; Zhang, Wanke; Huang, Rongfeng; Zheng, Yizhi

2012-04-01

Semi-mangroves form a group of transitional species between glycophytes and halophytes, and hold unique potential for learning molecular mechanisms underlying plant salt tolerance. Millettia pinnata is a semi-mangrove plant that can survive a wide range of saline conditions in the absence of specialized morphological and physiological traits. By employing the Illumina sequencing platform, we generated ~192 million short reads from four cDNA libraries of M. pinnata and processed them into 108,598 unisequences with a high depth of coverage. The mean length and total length of these unisequences were 606 bp and 65.8 Mb, respectively. A total of 54,596 (50.3%) unisequences were assigned Nr annotations. Functional classification revealed the involvement of unisequences in various biological processes related to metabolism and environmental adaptation. We identified 23,815 candidate salt-responsive genes with significantly differential expression under seawater and freshwater treatments. Based on the reverse transcription-polymerase chain reaction (RT-PCR) and real-time PCR analyses, we verified the changes in expression levels for a number of candidate genes. The functional enrichment analyses for the candidate genes showed tissue-specific patterns of transcriptome remodelling upon salt stress in the roots and the leaves. The transcriptome of M. pinnata will provide valuable gene resources for future application in crop improvement. In addition, this study sets a good example for large-scale identification of salt-responsive genes in non-model organisms using the sequencing-based approach.
[Gene method for inconsistent hydrological frequency calculation. 2: Diagnosis system of hydrological genes and method of hydrological moment genes with inconsistent characters].

Science.gov (United States)

Xie, Ping; Zhao, Jiang Yan; Wu, Zi Yi; Sang, Yan Fang; Chen, Jie; Li, Bin Bin; Gu, Hai Ting

2018-04-01

The analysis of inconsistent hydrological series is one of the major problems that should be solved for engineering hydrological calculation in changing environment. In this study, the differences of non-consistency and non-stationarity were analyzed from the perspective of composition of hydrological series. The inconsistent hydrological phenomena were generalized into hydrological processes with inheritance, variability and evolution characteristics or regulations. Furthermore, the hydrological genes were identified following the theory of biological genes, while their inheritance bases and variability bases were determined based on composition of hydrological series under different time scales. To identify and test the components of hydrological genes, we constructed a diagnosis system of hydrological genes. With the P-3 distribution as an example, we described the process of construction and expression of the moment genes to illustrate the inheritance, variability and evolution principles of hydrological genes. With the annual minimum 1-month runoff series of Yunjinghong station in Lancangjiang River basin as an example, we verified the feasibility and practicability of hydrological gene theory for the calculation of inconsistent hydrological frequency. The results showed that the method could be used to reveal the evolution of inconsistent hydrological series. Therefore, it provided a new research pathway for engineering hydrological calculation in changing environment and an essential reference for the assessment of water security.
Genome Wide Association Study of SNP-, Gene-, and Pathway-based Approaches to Identify Genes Influencing Susceptibility to Staphylococcus aureus Infections

Directory of Open Access Journals (Sweden)

Zhan Ye

2014-05-01

Full Text Available Background: We conducted a genome-wide association study (GWAS) to identify specific genetic variants that underlie susceptibility to disease caused by Staphylococcus aureus in humans. Methods: Cases (n=309) and controls (n=2,925) were genotyped at 508,921 single nucleotide polymorphisms (SNPs). Cases had at least one laboratory and clinician confirmed disease caused by S. aureus whereas controls did not. R-package (for SNP association), EIGENSOFT (to estimate and adjust for population stratification), and gene- (VEGAS) and pathway-based (DAVID, PANTHER, and Ingenuity Pathway Analysis) analyses were performed. Results: No SNP reached genome-wide significance. Four SNPs exceeded the p... Conclusion: We identified potential susceptibility genes for S. aureus diseases in this preliminary study but confirmation by other studies is needed. The observed associations could be relevant given the complexity of S. aureus as a pathogen and its ability to exploit multiple biological pathways to cause infections in humans.
Optimization of Critical Hairpin Features Allows miRNA-based Gene Knockdown Upon Single-copy Transduction

Directory of Open Access Journals (Sweden)

Renier Myburgh

2014-01-01

Full Text Available Gene knockdown using microRNA (miRNA)-based vector constructs is likely to become a prominent gene therapy approach. It was the aim of this study to improve the efficiency of gene knockdown through optimizing the structure of miRNA mimics. Knockdown of two target genes was analyzed: CCR5 and green fluorescent protein. We describe here a novel and optimized miRNA mimic design called mirGE comprising a lower stem length of 13 base pairs (bp), positioning of the targeting strand on the 5′ side of the miRNA, together with nucleotide mismatches in upper stem positions 1 and 12 placed on the passenger strand. Our mirGE proved superior to miR-30 in four aspects: yield of targeting strand incorporation into RNA-induced silencing complex (RISC); incorporation into RISC of correct targeting strand; precision of cleavage by Drosha; and ratio of targeting strand over passenger strand. A triple mirGE hairpin cassette targeting CCR5 was constructed. It allowed CCR5 knockdown with an efficiency of over 90% upon single-copy transduction. Importantly, single-copy expression of this construct rendered transduced target cells, including primary human macrophages, resistant to infection with a CCR5-tropic strain of HIV. Our results provide new insights for a better knockdown efficiency of constructs containing miRNA. Our results also provide the proof-of-principle that cells can be rendered HIV resistant through single-copy vector transduction, rendering this approach more compatible with clinical applications.
Sphingolipid base modifying enzymes in sunflower (Helianthus annuus): cloning and characterization of a C4-hydroxylase gene and a new paralogous Δ8-desaturase gene.

Science.gov (United States)

Moreno-Pérez, Antonio J; Martínez-Force, Enrique; Garcés, Rafael; Salas, Joaquín J

2011-05-15

Sphingolipids are components of plant cell membranes that participate in the regulation of important physiological processes. Unlike their animal counterparts, plant sphingolipids are characterized by high levels of base C4-hydroxylation. Moreover, desaturation at the Δ8 position predominates over the Δ4 desaturation typically found in animal sphingolipids. These modifications are due to the action of C4-hydroxylases and Δ8-long chain base desaturases, and they are important for complex sphingolipids finally becoming functional. The long chain bases of sunflower sphingolipids have high levels of hydroxylated and unsaturated moieties. Here, a C4-long chain base hydroxylase was functionally characterized in sunflower, an enzyme that could complement the sur2Δ mutation when heterologously expressed in this yeast mutant deficient in hydroxylation. This hydroxylase was ubiquitously expressed in sunflower, with the highest levels found in the developing cotyledons. In addition, we identified a new Δ8-long chain base desaturase gene that displays strong homology to a previously reported desaturase gene. This desaturase was also expressed in yeast and was able to change the long chain base composition of the transformed host. We studied the expression of this desaturase and compared it with that of the other isoform described in sunflower. The desaturase form studied in this paper displayed higher expression levels in developing seeds. Copyright © 2010 Elsevier GmbH. All rights reserved.
Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics.

Science.gov (United States)

de Angelis, Martin Hrabě; Nicholson, George; Selloum, Mohammed; White, Jacqui; Morgan, Hugh; Ramirez-Solis, Ramiro; Sorg, Tania; Wells, Sara; Fuchs, Helmut; Fray, Martin; Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl Mj; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; 
Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie; Holmes, Chris; Steel, Karen P; Herault, Yann; Gailus-Durner, Valérie; Mallon, Ann-Marie; Brown, Steve Dm

2015-09-01

The function of the majority of genes in the mouse and human genomes remains unknown. The mouse embryonic stem cell knockout resource provides a basis for the characterization of relationships between genes and phenotypes. The EUMODIC consortium developed and validated robust methodologies for the broad-based phenotyping of knockouts through a pipeline comprising 20 disease-oriented platforms. We developed new statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no previous functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. New phenotypes were uncovered for many genes with previously unknown function, providing a powerful basis for hypothesis generation and further investigation in diverse systems.
A polymerase chain reaction-based methodology to detect gene doping.

Science.gov (United States)

Carter, Adam; Flueck, Martin

2012-04-01

The non-therapeutic use of genes to enhance athletic performance (gene doping) is a novel threat to the world of sports. Skeletal muscle is a prime target of gene therapy and we asked whether we can develop a test system to produce and detect gene doping. Towards this end, we introduced a plasmid (pCMV-FAK, 3.8 kb, 50 μg) for constitutive expression of the chicken homologue for the regulator of muscle growth, focal adhesion kinase (FAK), via gene electro transfer in the anti-gravitational muscle, m. soleus, or gastrocnemius medialis of rats. Activation of hypertrophy signalling was monitored by assessing the ribosomal kinase p70S6K and muscle fibre cross section. Detectability of the introduced plasmid was monitored with polymerase chain reaction in deoxyribonucleic acids (DNA) from transfected muscle and serum. Muscle transfection with pCMV-FAK elevated FAK expression 7- and 73-fold, respectively, and increased mean cross section by 52 and 16% in targeted muscle fibres of soleus and gastrocnemius muscle 7 days after gene electro transfer. Concomitantly p70S6K content was increased in transfected soleus muscle (+110%). Detection of the exogenous plasmid sequence was possible in DNA and cDNA of muscle until 7 days after transfection, but not in serum except close to the site of plasmid deposition, 1 h after injection and surgery. The findings suggest that the reliable detection of gene doping in the immoral athlete is not possible unless a change in the current practice of tissue sampling is applied involving the collection of muscle biopsy close to the site of gene injection.
A kernel regression approach to gene-gene interaction detection for case-control studies.

Science.gov (United States)

Larson, Nicholas B; Schaid, Daniel J

2013-11-01

Gene-gene interactions are increasingly being addressed as a potentially important contributor to the variability of complex traits. Consequently, attentions have moved beyond single locus analysis of association to more complex genetic models. Although several single-marker approaches toward interaction analysis have been developed, such methods suffer from very high testing dimensionality and do not take advantage of existing information, notably the definition of genes as functional units. Here, we propose a comprehensive family of gene-level score tests for identifying genetic elements of disease risk, in particular pairwise gene-gene interactions. Using kernel machine methods, we devise score-based variance component tests under a generalized linear mixed model framework. We conducted simulations based upon coalescent genetic models to evaluate the performance of our approach under a variety of disease models. These simulations indicate that our methods are generally higher powered than alternative gene-level approaches and at worst competitive with exhaustive SNP-level (where SNP is single-nucleotide polymorphism) analyses. Furthermore, we observe that simulated epistatic effects resulted in significant marginal testing results for the involved genes regardless of whether or not true main effects were present. We detail the benefits of our methods and discuss potential genome-wide analysis strategies for gene-gene interaction analysis in a case-control study design. © 2013 WILEY PERIODICALS, INC.
Genome-wide specificity of DNA binding, gene regulation, and chromatin remodeling by TALE- and CRISPR/Cas9-based transcriptional activators.

Science.gov (United States)

Polstein, Lauren R; Perez-Pinera, Pablo; Kocak, D Dewran; Vockley, Christopher M; Bledsoe, Peggy; Song, Lingyun; Safi, Alexias; Crawford, Gregory E; Reddy, Timothy E; Gersbach, Charles A

2015-08-01

Genome engineering technologies based on the CRISPR/Cas9 and TALE systems are enabling new approaches in science and biotechnology. However, the specificity of these tools in complex genomes and the role of chromatin structure in determining DNA binding are not well understood. We analyzed the genome-wide effects of TALE- and CRISPR-based transcriptional activators in human cells using ChIP-seq to assess DNA-binding specificity and RNA-seq to measure the specificity of perturbing the transcriptome. Additionally, DNase-seq was used to assess genome-wide chromatin remodeling that occurs as a result of their action. Our results show that these transcription factors are highly specific in both DNA binding and gene regulation and are able to open targeted regions of closed chromatin independent of gene activation. Collectively, these results underscore the potential for these technologies to make precise changes to gene expression for gene and cell therapies or fundamental studies of gene function. © 2015 Polstein et al.; Published by Cold Spring Harbor Laboratory Press.
A new measure for gene expression biclustering based on non-parametric correlation.

Science.gov (United States)

Flores, Jose L; Inza, Iñaki; Larrañaga, Pedro; Calvo, Borja

2013-12-01

One of the emerging techniques for analyzing DNA microarray data is biclustering, the search for subsets of genes and conditions that are coherently expressed. These subgroups provide clues about the main biological processes. Until now, different approaches to this problem have been proposed. Most of them use the mean squared residue as a quality measure, but it cannot detect relevant and interesting patterns such as shifting or scaling patterns. Furthermore, recent papers show that there exist new coherence patterns involved in different kinds of cancer and tumors, such as inverse relationships between genes, which cannot be captured. The proposed measure, called Spearman's biclustering measure (SBM), estimates the quality of a bicluster based on the non-linear correlation among genes and conditions simultaneously. The search for biclusters is performed using an evolutionary technique called estimation of distribution algorithms, which uses the SBM measure as the fitness function. This approach has been examined from different points of view using artificial and real microarrays. The assessment process has involved the use of quality indexes, a set of reference bicluster patterns including new patterns, and a set of statistical tests. Performance has also been examined using real microarrays and compared with different algorithmic approaches such as Bimax, CC, OPSM, Plaid and xMotifs. SBM shows several advantages, such as the ability to recognize more complex coherence patterns such as shifting, scaling and inversion, and the capability to selectively marginalize genes and conditions depending on the statistical significance. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
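Why a rank-based correlation can recognize shifting, scaling and inversion patterns may be easier to see with a small sketch. The code below is not the published SBM formula, which the abstract does not give; it is a simplified stand-in that scores a candidate bicluster by the mean absolute Spearman correlation over all pairs of gene rows, and the `coherence` helper and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def coherence(rows):
    # Simplified stand-in for a bicluster quality score (not the real SBM):
    # mean absolute Spearman correlation over all pairs of gene rows.
    pairs = list(combinations(rows, 2))
    return sum(abs(spearman(a, b)) for a, b in pairs) / len(pairs)

base = [1.0, 3.0, 2.0, 5.0, 4.0]
bicluster = [
    base,
    [v + 10 for v in base],  # shifting pattern
    [v * 3 for v in base],   # scaling pattern
    [-v for v in base],      # inversion pattern
]
score = coherence(bicluster)
print(f"coherence: {score:.3f}")
```

Shifted, scaled and inverted copies of the same profile all have identical or exactly reversed ranks, so the score stays close to 1, whereas a residue-based measure would penalize the scaled and inverted rows.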
dictyExpress: a Dictyostelium discoideum gene expression database with an explorative data analysis web-based interface

Science.gov (United States)

Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz

2009-01-01

Background Bioinformatics often leverages recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. Results We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. Conclusion dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble those of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms. PMID:19706156
A Peptide-based Vector for Efficient Gene Transfer In Vitro and In Vivo

Science.gov (United States)

Lehto, Taavi; Simonson, Oscar E; Mäger, Imre; Ezzat, Kariem; Sork, Helena; Copolovici, Dana-Maria; Viola, Joana R; Zaghloul, Eman M; Lundin, Per; Moreno, Pedro MD; Mäe, Maarja; Oskolkov, Nikita; Suhorutšenko, Julia; Smith, CI Edvard; Andaloussi, Samir EL

2011-01-01

Finding suitable nonviral delivery vehicles for nucleic acid–based therapeutics is a landmark goal in gene therapy. Cell-penetrating peptides (CPPs) are one class of delivery vectors that has been exploited for this purpose. However, since CPPs use endocytosis to enter cells, a large fraction of peptides remain trapped in endosomes. We have previously reported that stearylation of amphipathic CPPs, such as transportan 10 (TP10), dramatically increases transfection of oligonucleotides in vitro, partially by promoting endosomal escape. Therefore, we aimed to evaluate whether stearyl-TP10 could be used for the delivery of plasmids as well. Our results demonstrate that stearyl-TP10 forms stable nanoparticles with plasmids that efficiently enter different cell types in a ubiquitous manner, including primary cells, resulting in significantly higher gene expression levels than when using stearyl-Arg9 or unmodified CPPs. In fact, the transfection efficacy of stearyl-TP10 almost reached the levels of Lipofectamine 2000 (LF2000), but without any of the observed lipofection-associated toxicities. Most importantly, stearyl-TP10/plasmid nanoparticles are nonimmunogenic and mediate efficient gene delivery in vivo when administered intramuscularly (i.m.) or intradermally (i.d.), without any associated toxicity in mice. PMID:21343913
Unveiling network-based functional features through integration of gene expression into protein networks.

Science.gov (United States)

Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

2018-06-01

Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.

Gene mutation-based and specific therapies in precision medicine.

Science.gov (United States)

Wang, Xiangdong

2016-04-01

Precision medicine has been initiated and is gaining more and more attention from preclinical and clinical scientists. A number of key elements or critical parts in precision medicine have been described and emphasized to establish a systems understanding of precision medicine. The principle of precision medicine is to treat patients on the basis of genetic alterations after gene mutations are identified, although questions and challenges still remain before clinical application. Therapeutic strategies of precision medicine should be considered according to gene mutation, after biological and functional mechanisms of mutated gene expression or epigenetics, or the correspondent protein, are clearly validated. It is time to explore and develop a strategy to target and correct mutated genes by direct elimination, restoration, correction or repair of mutated sequences/genes. Nevertheless, there are still numerous challenges to integrating widespread genomic testing into individual cancer therapies and into decision making for one or another treatment. There are wide-ranging and complex issues to be solved before precision medicine becomes clinical reality. Thus, precision medicine can be considered an extension and part of clinical and translational medicine and a new alternative among clinical therapies and strategies, and it can have an important impact on disease cures and patient prognoses. © 2015 The Author. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
PRIMAL: Page Rank-Based Indoor Mapping and Localization Using Gene-Sequenced Unlabeled WLAN Received Signal Strength

Directory of Open Access Journals (Sweden)

Mu Zhou

2015-09-01

Full Text Available Due to the wide deployment of wireless local area networks (WLAN), received signal strength (RSS)-based indoor WLAN localization has attracted considerable attention in both academia and industry. In this paper, we propose a novel page rank-based indoor mapping and localization (PRIMAL) by using the gene-sequenced unlabeled WLAN RSS for simultaneous localization and mapping (SLAM). Specifically, first of all, based on the observation of the motion patterns of the people in the target environment, we use the Allen logic to construct the mobility graph to characterize the connectivity among different areas of interest. Second, the concept of gene sequencing is utilized to assemble the sporadically-collected RSS sequences into a signal graph based on the transition relations among different RSS sequences. Third, we apply the graph drawing approach to exhibit both the mobility graph and signal graph in a more readable manner. Finally, the page rank (PR) algorithm is proposed to construct the mapping from the signal graph into the mobility graph. The experimental results show that the proposed approach achieves satisfactory localization accuracy and meanwhile avoids the intensive time and labor cost involved in the conventional location fingerprinting-based indoor WLAN localization.
PRIMAL: Page Rank-Based Indoor Mapping and Localization Using Gene-Sequenced Unlabeled WLAN Received Signal Strength.

Science.gov (United States)

Zhou, Mu; Zhang, Qiao; Xu, Kunjie; Tian, Zengshan; Wang, Yanmeng; He, Wei

2015-09-25

Due to the wide deployment of wireless local area networks (WLAN), received signal strength (RSS)-based indoor WLAN localization has attracted considerable attention in both academia and industry. In this paper, we propose a novel page rank-based indoor mapping and localization (PRIMAL) by using the gene-sequenced unlabeled WLAN RSS for simultaneous localization and mapping (SLAM). Specifically, first of all, based on the observation of the motion patterns of the people in the target environment, we use the Allen logic to construct the mobility graph to characterize the connectivity among different areas of interest. Second, the concept of gene sequencing is utilized to assemble the sporadically-collected RSS sequences into a signal graph based on the transition relations among different RSS sequences. Third, we apply the graph drawing approach to exhibit both the mobility graph and signal graph in a more readable manner. Finally, the page rank (PR) algorithm is proposed to construct the mapping from the signal graph into the mobility graph. The experimental results show that the proposed approach achieves satisfactory localization accuracy and meanwhile avoids the intensive time and labor cost involved in the conventional location fingerprinting-based indoor WLAN localization.
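The page rank step named in this abstract can be illustrated with the standard PageRank power iteration on a small directed graph. This is only a sketch of the generic scoring algorithm; it does not reproduce how PRIMAL maps the signal graph onto the mobility graph, which the abstract does not detail. The function name `pagerank`, the damping value 0.85, and the toy graph are assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iters=100):
    """Generic PageRank by power iteration (illustrative, not PRIMAL's mapping).

    graph: dict mapping a node to the list of nodes it links to.
    """
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)
        rank = new
    return rank


# Hypothetical toy graph: areas "a" and "b" both lead to "c".
toy = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy)
```

In the toy graph, node "c" receives links from both other nodes and therefore ends up with the largest score.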
GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

Science.gov (United States)

Cumbie, Jason S; Kimbrel, Jeffrey A; Di, Yanming; Schafer, Daniel W; Wilhelm, Larry J; Fox, Samuel E; Sullivan, Christopher M; Curzon, Aron D; Carrington, James C; Mockler, Todd C; Chang, Jeff H

2011-01-01

GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

Directory of Open Access Journals (Sweden)

Jason S Cumbie

Full Text Available GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
Microarray-based genomic surveying of gene polymorphisms in Chlamydia trachomatis

OpenAIRE

Brunelle, Brian W; Nicholson, Tracy L; Stephens, Richard S

2004-01-01

By comparing two fully sequenced genomes of Chlamydia trachomatis using competitive hybridization on DNA microarrays, a logarithmic correlation was demonstrated between the signal ratio of the arrays and the 75-99% range of nucleotide identities of the genes. Variable genes within 14 uncharacterized strains of C. trachomatis were identified by array analysis and verified by DNA sequencing. These genes may be crucial for understanding chlamydial virulence and pathogenesis.
Image improvement using radial basis functions (Mejoramiento de imágenes usando funciones de base radial)

Directory of Open Access Journals (Sweden)

Jaime Alberto Echeverri Arias

2009-07-01

Full Text Available Removing impulsive noise is a classic problem in nonlinear processing for image enhancement, and radial basis functions with global support are useful for addressing it. This paper presents an interpolation technique that efficiently reduces impulsive noise in images by means of an interpolant obtained from radial basis functions, developed within a research project on a retrieval system for images of Amazonian aquatic resources. The technique first labels the noisy pixels of the image and then, by interpolation, generates a reconstruction value for each such pixel from its neighbors. The results are comparable to, and often better than, those of other published and recognized techniques. According to the analysis of results, the technique can be applied to images with high noise rates while keeping a low reconstruction error for the noisy pixels and preserving visual quality.
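The two steps described above (label the noisy pixels, then reconstruct each one by interpolating its neighbors) can be sketched as follows. Everything specific here is an assumption rather than the authors' method: the detector simply flags pixels at the extreme values 0 or 255, the neighborhood is the 3x3 window, and the interpolant is a Gaussian radial basis function with an added constant term, solved with a small hand-written Gaussian elimination.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rbf_value(points, values, target, eps=1.0):
    """Evaluate a Gaussian RBF interpolant (plus constant term) at target."""
    def phi(p, q):
        return math.exp(-eps * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2))

    n = len(points)
    # Augmented system: kernel matrix bordered by ones for the constant term.
    a = [[phi(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    coef = solve(a, list(values) + [0.0])
    return sum(w * phi(p, target) for w, p in zip(coef[:n], points)) + coef[n]


def denoise(img, lo=0, hi=255):
    """Replace pixels flagged as impulsive noise by an RBF estimate from clean neighbors."""
    h, w = len(img), len(img[0])
    # Assumed detector: a pixel is noisy if it sits at an extreme value.
    noisy = [[img[y][x] in (lo, hi) for x in range(w)] for y in range(h)]
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if not noisy[y][x]:
                continue
            pts = [(yy, xx)
                   for yy in range(max(0, y - 1), min(h, y + 2))
                   for xx in range(max(0, x - 1), min(w, x + 2))
                   if not noisy[yy][xx]]
            if pts:  # leave the pixel untouched if no clean neighbor exists
                vals = [img[yy][xx] for yy, xx in pts]
                out[y][x] = rbf_value(pts, vals, (y, x))
    return out


image = [[100, 100, 100], [100, 255, 100], [100, 100, 100]]
restored = denoise(image)
```

The constant term is a design choice for the sketch: it makes the interpolant reproduce flat regions exactly, so the corrupted center pixel of the toy image is restored to the value of its neighbors.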
Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

Science.gov (United States)

Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A

2015-01-01

Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.
Congruent Deep Relationships in the Grape Family (Vitaceae) Based on Sequences of Chloroplast Genomes and Mitochondrial Genes via Genome Skimming.

Directory of Open Access Journals (Sweden)

Ning Zhang

Full Text Available Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.
NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

Science.gov (United States)

Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun

2017-09-21

High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen achieved superior performance compared with the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases according to the curated literature. NetGen has been implemented in the R package CopTea publicly available at GitHub ( http://github.com/wulingyun/CopTea/ ). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.
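For context, the individual term-based enrichment that this abstract contrasts NetGen with is commonly computed as a one-sided hypergeometric test per GO term. The sketch below shows only that conventional baseline, not NetGen's generative model or the CopTea API; the function name `enrichment_pvalue` and its argument names are hypothetical.

```python
from math import comb


def enrichment_pvalue(population, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value for a single term.

    population: number of background genes
    annotated:  background genes annotated with the term
    selected:   size of the gene list under study
    overlap:    genes in the list that carry the term
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 5 of 10 background genes carry the term,
# and all 5 genes in the study list carry it.
p = enrichment_pvalue(10, 5, 5, 5)
```

Because each term is tested on its own, a parent term and its descendants can all come out significant together, which is the redundancy the abstract describes.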
Discovering implicit entity relation with the gene-citation-gene network.

Directory of Open Access Journals (Sweden)

Min Song

Full Text Available In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner.
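The difference between the two networks can be made concrete with a toy corpus: GG pairs come from genes mentioned in the same article, while GCG pairs link a gene in a citing article to a gene in the article it cites. This is a minimal sketch under assumed inputs; the data layout, function names, article identifiers and gene assignments are illustrative and not taken from the study.

```python
from itertools import combinations


def cooccurrence_pairs(genes_by_article):
    """GG-style pairs: two genes mentioned in the same article."""
    pairs = set()
    for genes in genes_by_article.values():
        for g1, g2 in combinations(sorted(set(genes)), 2):
            pairs.add((g1, g2))
    return pairs


def citation_pairs(genes_by_article, cites):
    """GCG-style pairs: a gene in a citing article and a gene in the cited article."""
    pairs = set()
    for citing, cited_list in cites.items():
        for cited in cited_list:
            for g1 in genes_by_article.get(citing, ()):
                for g2 in genes_by_article.get(cited, ()):
                    if g1 != g2:
                        pairs.add(tuple(sorted((g1, g2))))
    return pairs


# Hypothetical corpus: articles A and C both cite article B.
genes_by_article = {"A": ["TP53", "BRCA1"], "B": ["EGFR"], "C": ["TP53"]}
cites = {"A": ["B"], "C": ["B"]}

gg = cooccurrence_pairs(genes_by_article)
gcg = citation_pairs(genes_by_article, cites)
```

In this toy corpus the citation links yield pairs that never co-occur in a single article, which is the kind of implicit relation the abstract refers to.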
Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

Science.gov (United States)

Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

2014-01-01

Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
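Dynamic Time Warping, one of the three ingredients named above, aligns two time series that may be shifted or stretched relative to each other. The sketch below is the classic dynamic-programming DTW distance only; it does not include the KNN imputation or the Gene Ontology component of the authors' DTW-GO measure, and the function name and toy profiles are assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical expression profiles: the second is a time-stretched copy of the first.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.0, 2.0, 2.0, 3.0]
stretched = dtw_distance(profile_a, profile_b)
```

Because warping lets one time point match several, the stretched copy has distance zero from the original, whereas a pointwise comparison would not even be defined for sequences of different length.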
The integration of weighted gene association networks based on information entropy.

Science.gov (United States)

Yang, Fan; Wu, Duzhi; Lin, Limei; Yang, Jian; Yang, Tinghong; Zhao, Jing

2017-01-01

Constructing genome-scale weighted gene association networks (WGAN) from multiple data sources is one of the research hot spots in systems biology. In this paper, we employ information entropy to describe the degree of uncertainty of gene-gene links and propose a strategy for data integration of weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biological information while still keeping high functional relevance between linked gene pairs. The new WGAN shows satisfactory performance in disease gene prediction, which suggests the reliability of our integration strategy. Compared with existing integration methods, our method takes advantage of the inherent characteristics of the component networks and pays less attention to the biological background of the data. It can make full use of existing biological networks with low computational effort.
Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

Science.gov (United States)

Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

2014-12-01

Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).
Taxonomic resolutions based on 18S rRNA genes: a case study of subclass copepoda.

Directory of Open Access Journals (Sweden)

Shu Wu

Full Text Available Biodiversity studies are commonly conducted using 18S rRNA genes. In this study, we compared the inter-species divergence of variable regions (V1-9) within the copepod 18S rRNA gene, and tested their taxonomic resolutions at different taxonomic levels. Our results indicate that the 18S rRNA gene is a good molecular marker for the study of copepod biodiversity, and our conclusions are as follows: 1) 18S rRNA genes are highly conserved intra-species (intra-species similarities are close to 100%); and could aid in species-level analyses, but with some limitations; 2) nearly-whole-length sequences and some partial regions (around V2, V4, and V9) of the 18S rRNA gene can be used to discriminate between samples at both the family and order levels (with a success rate of about 80%); 3) compared with other regions, V9 has a higher resolution at the genus level (with an identification success rate of about 80%); and 4) V7 is most divergent in length, and would be a good candidate marker for the phylogenetic study of Acartia species. This study also evaluated the correlation between similarity thresholds and the accuracy of using nuclear 18S rRNA genes for the classification of organisms in the subclass Copepoda. We suggest that sample identification accuracy should be considered when a molecular sequence divergence threshold is used for taxonomic identification, and that the lowest similarity threshold should be determined based on a pre-designated level of acceptable accuracy.
Mining gene expression data of multiple sclerosis.

Directory of Open Access Journals (Sweden)

Pi Guo

Full Text Available Microarray produces a large amount of gene expression data, containing various biological implications. The challenge is to detect a panel of discriminative genes associated with disease. This study proposed a robust classification model for gene selection using gene expression data, and performed an analysis to identify disease-related genes using multiple sclerosis as an example. Gene expression profiles based on the transcriptome of peripheral blood mononuclear cells from a total of 44 samples from 26 multiple sclerosis patients and 18 individuals with other neurological diseases (control) were analyzed. Feature selection algorithms including Support Vector Machine based on Recursive Feature Elimination, Receiver Operating Characteristic Curve, and Boruta algorithms were jointly performed to select candidate genes associated with multiple sclerosis. Multiple classification models categorized samples into two different groups based on the identified genes. Models' performance was evaluated using cross-validation methods, and an optimal classifier for gene selection was determined. An overlapping feature set was identified consisting of 8 genes that were differentially expressed between the two phenotype groups. The genes were significantly associated with the pathways of apoptosis and cytokine-cytokine receptor interaction. TNFSF10 was significantly associated with multiple sclerosis. A Support Vector Machine model was established based on the featured genes and gave a practical accuracy of ∼86%. This binary classification model also outperformed the other models in terms of Sensitivity, Specificity and F1 score. The combined analytical framework integrating feature ranking algorithms and Support Vector Machine model could be used for selecting genes for other diseases.
Genes2FANs: connecting genes through functional association networks

Science.gov (United States)

2012-01-01

Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting diseases genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in
A gene-based linkage map for Bicyclus anynana butterflies allows for a comprehensive analysis of synteny with the lepidopteran reference genome.

Directory of Open Access Journals (Sweden)

Patrícia Beldade

2009-02-01

Full Text Available Lepidopterans (butterflies and moths) are a rich and diverse order of insects, which, despite their economic impact and unusual biological properties, are relatively underrepresented in terms of genomic resources. The genome of the silkworm Bombyx mori has been fully sequenced, but comparative lepidopteran genomics has been hampered by the scarcity of information for other species. This is especially striking for butterflies, even though they have diverse and derived phenotypes (such as color vision and wing color patterns) and are considered prime models for the evolutionary and developmental analysis of ecologically relevant, complex traits. We focus on Bicyclus anynana butterflies, a laboratory system for studying the diversification of novelties and serially repeated traits. With a panel of 12 small families and a biphasic mapping approach, we first assigned 508 expressed genes to segregation groups and then ordered 297 of them within individual linkage groups. We also coarsely mapped seven color pattern loci. This is the richest gene-based map available for any butterfly species and allowed for a broad-coverage analysis of synteny with the lepidopteran reference genome. Based on 462 pairs of mapped orthologous markers in Bi. anynana and Bo. mori, we observed strong conservation of gene assignment to chromosomes, but also evidence for numerous large- and small-scale chromosomal rearrangements. With gene collections growing for a variety of target organisms, the ability to place those genes in their proper genomic context is paramount. Methods to map expressed genes and to compare maps with relevant model systems are crucial to extend genomic-level analysis outside classical model species. Maps with gene-based markers are useful for comparative genomics and to resolve mapped genomic regions to a tractable number of candidate genes, especially if there is synteny with related model species. This is discussed in relation to the identification of
Human reporter genes: potential use in clinical studies

Energy Technology Data Exchange (ETDEWEB)

Serganova, Inna [Department of Neurology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Ponomarev, Vladimir [Department of Radiology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Blasberg, Ronald [Department of Neurology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Department of Radiology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States)], E-mail: blasberg@neuro1.mskcc.org

2007-10-15

The clinical application of positron-emission-tomography-based reporter gene imaging will expand over the next several years. The translation of reporter gene imaging technology into clinical applications is the focus of this review, with emphasis on the development and use of human reporter genes. Human reporter genes will play an increasingly important role in this development, and it is likely that one or more reporter systems (human gene and complementary radiopharmaceutical) will take leading roles. Three classes of human reporter genes are discussed and compared: receptors, transporters and enzymes. Examples of highly expressed cell membrane receptors include specific membrane somatostatin receptors (hSSTrs). The transporter group includes the sodium iodide symporter (hNIS) and the norepinephrine transporter (hNET). The endogenous enzyme classification includes human mitochondrial thymidine kinase 2 (hTK2). In addition, we also discuss the nonhuman dopamine 2 receptor and two viral reporter genes, the wild-type herpes simplex virus 1 thymidine kinase (HSV1-tk) gene and the HSV1-tk mutant (HSV1-sr39tk). Initial applications of reporter gene imaging in patients will be developed within two different clinical disciplines: (a) gene therapy and (b) adoptive cell-based therapies. These studies will benefit from the availability of efficient human reporter systems that can provide critical monitoring information for adenoviral-based, retroviral-based and lentiviral-based gene therapies, oncolytic bacterial and viral therapies, and adoptive cell-based therapies. Translational applications of noninvasive in vivo reporter gene imaging are likely to include: (a) quantitative monitoring of gene therapy vectors for targeting and transduction efficacy in clinical protocols by imaging the location, extent and duration of transgene expression; (b) monitoring of cell trafficking, targeting, replication and activation in adoptive T-cell and stem/progenitor cell therapies
Human reporter genes: potential use in clinical studies

International Nuclear Information System (INIS)

Serganova, Inna; Ponomarev, Vladimir; Blasberg, Ronald

2007-01-01

The clinical application of positron-emission-tomography-based reporter gene imaging will expand over the next several years. The translation of reporter gene imaging technology into clinical applications is the focus of this review, with emphasis on the development and use of human reporter genes. Human reporter genes will play an increasingly important role in this development, and it is likely that one or more reporter systems (human gene and complementary radiopharmaceutical) will take leading roles. Three classes of human reporter genes are discussed and compared: receptors, transporters and enzymes. Examples of highly expressed cell membrane receptors include specific membrane somatostatin receptors (hSSTrs). The transporter group includes the sodium iodide symporter (hNIS) and the norepinephrine transporter (hNET). The endogenous enzyme classification includes human mitochondrial thymidine kinase 2 (hTK2). In addition, we also discuss the nonhuman dopamine 2 receptor and two viral reporter genes, the wild-type herpes simplex virus 1 thymidine kinase (HSV1-tk) gene and the HSV1-tk mutant (HSV1-sr39tk). Initial applications of reporter gene imaging in patients will be developed within two different clinical disciplines: (a) gene therapy and (b) adoptive cell-based therapies. These studies will benefit from the availability of efficient human reporter systems that can provide critical monitoring information for adenoviral-based, retroviral-based and lentiviral-based gene therapies, oncolytic bacterial and viral therapies, and adoptive cell-based therapies. Translational applications of noninvasive in vivo reporter gene imaging are likely to include: (a) quantitative monitoring of gene therapy vectors for targeting and transduction efficacy in clinical protocols by imaging the location, extent and duration of transgene expression; (b) monitoring of cell trafficking, targeting, replication and activation in adoptive T-cell and stem/progenitor cell therapies

Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

Directory of Open Access Journals (Sweden)

Merchant Sabeeha S

2011-07-01

Full Text Available Abstract Background Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. Description The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of
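The enrichment of a functional term in a gene list, as mentioned in the description above, is commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic the Algal Functional Annotation Tool uses, so the sketch below is only an illustration under that assumption; the function name and its parameters are hypothetical.

```python
from math import comb

def enrichment_p_value(total_genes, annotated_genes, list_size, list_hits):
    """One-sided hypergeometric p-value P(X >= list_hits).

    total_genes     -- genes in the background genome
    annotated_genes -- background genes carrying the functional term
    list_size       -- genes in the user's list
    list_hits       -- genes in the list carrying the term
    (All names are illustrative, not taken from the tool.)
    """
    upper = min(annotated_genes, list_size)
    denominator = comb(total_genes, list_size)
    numerator = sum(
        comb(annotated_genes, i) * comb(total_genes - annotated_genes, list_size - i)
        for i in range(list_hits, upper + 1)
    )
    return numerator / denominator

# Toy numbers: 10 genes in total, 5 annotated, a list of 2 genes, both annotated.
p_toy = enrichment_p_value(10, 5, 2, 2)
```

A small p-value suggests the term appears in the list more often than expected by chance; in practice a multiple-testing correction would be applied across all terms tested.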
GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

Science.gov (United States)

Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

2016-03-01

Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards--the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analysis and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics
Phylogenetic analysis of Fusobacterium prausnitzii based upon the 16S rRNA gene sequence and PCR confirmation.

Science.gov (United States)

Wang, R F; Cao, W W; Cerniglia, C E

1996-01-01

In order to develop a PCR method to detect Fusobacterium prausnitzii in human feces and to clarify the phylogenetic position of this species, its 16S rRNA gene sequence was determined. The sequence described in this paper is different from the original 16S rRNA gene sequence in GenBank. A PCR assay based on the new sequence is specific for F. prausnitzii, and the results of this assay confirmed that F. prausnitzii is the most common species in human feces. However, a PCR assay based on the original GenBank sequence was negative when it was performed with two strains of F. prausnitzii obtained from the American Type Culture Collection. A phylogenetic tree based on the new 16S rRNA gene sequence was constructed. On this tree F. prausnitzii was not a member of the Fusobacterium group but was closer to some Eubacterium spp. and located between Clostridium "clusters III and IV" (M.D. Collins, P.A. Lawson, A. Willems, J.J. Cordoba, J. Fernandez-Garayzabal, P. Garcia, J. Cai, H. Hippe, and J.A.E. Farrow, Int. J. Syst. Bacteriol. 44:812-826, 1994).
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

Science.gov (United States)

Powell, Bradford C; Hutchison, Clyde A

2006-01-19

Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
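The first step described above, collecting all ORFs of at least 30 codons, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes an ORF runs from an ATG to the next in-frame stop codon and scans only the three forward frames (a full scan would repeat this on the reverse complement). The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs of >= min_codons codons.

    Assumption: an ORF starts at ATG and ends at the first in-frame stop;
    the stop codon is not counted towards the codon total.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Synthetic sequences: ATG plus 29 (or 28) alanine codons, then a stop.
long_enough = "ATG" + "GCT" * 29 + "TAA"
too_short = "ATG" + "GCT" * 28 + "TAA"
```

The peptide translations of the ORFs found this way would then be clustered (for example into COGs), which is outside the scope of this sketch.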
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

Directory of Open Access Journals (Sweden)

Hutchison Clyde A

2006-01-01

Full Text Available Abstract Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
Methodological issues in detecting gene-gene interactions in breast cancer susceptibility: a population-based study in Ontario

Directory of Open Access Journals (Sweden)

Onay Venus

2007-08-01

Full Text Available Abstract Background There is growing evidence that gene-gene interactions are ubiquitous in determining the susceptibility to common human diseases. The investigation of such gene-gene interactions presents new statistical challenges for studies with relatively small sample sizes as the number of potential interactions in the genome can be large. Breast cancer provides a useful paradigm to study genetically complex diseases because commonly occurring single nucleotide polymorphisms (SNPs) may additively or synergistically disturb the system-wide communication of the cellular processes leading to cancer development. Methods In this study, we systematically studied SNP-SNP interactions among 19 SNPs from 18 key genes involved in major cancer pathways in a sample of 398 breast cancer cases and 372 controls from Ontario. We discuss the methodological issues associated with the detection of SNP-SNP interactions in this dataset by applying and comparing three commonly used methods: the logistic regression model, classification and regression trees (CART), and the multifactor dimensionality reduction (MDR) method. Results Our analyses show evidence for several simple (two-way) and complex (multi-way) SNP-SNP interactions associated with breast cancer. For example, all three methods identified XPD-[Lys751Gln]*IL10-[G(-1082A)] as the most significant two-way interaction. CART and MDR identified the same critical SNPs participating in complex interactions. Our results suggest that the use of multiple statistical approaches (or an integrated approach) rather than a single methodology could be the best strategy to elucidate complex gene interactions that have generally very different patterns. Conclusion The strategy used here has the potential to identify complex biological relationships among breast cancer genes and processes. This will lead to the discovery of novel biological information, which will improve breast cancer risk management.
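Of the three methods compared above, MDR is the least familiar, so a stripped-down sketch of its core step for a single SNP pair may help. It pools each two-locus genotype cell into "high risk" or "low risk" by comparing the cell's case:control ratio with the overall ratio, then scores how well that labelling separates cases from controls. This omits the cross-validation and exhaustive search over SNP combinations used in a real MDR analysis; the function name and the toy data are hypothetical and are not taken from the Ontario study.

```python
from collections import Counter

def mdr_two_locus(snp_a, snp_b, is_case):
    """Label genotype cells as high risk and return (high_risk_cells, accuracy)."""
    cases, controls = Counter(), Counter()
    for a, b, y in zip(snp_a, snp_b, is_case):
        if y:
            cases[(a, b)] += 1
        else:
            controls[(a, b)] += 1
    # Overall case:control ratio is the threshold for calling a cell high risk.
    threshold = sum(cases.values()) / sum(controls.values())
    cells = set(cases) | set(controls)
    high_risk = {cell for cell in cells if cases[cell] > threshold * controls[cell]}
    correct = sum(
        ((a, b) in high_risk) == bool(y)
        for a, b, y in zip(snp_a, snp_b, is_case)
    )
    return high_risk, correct / len(is_case)

# Invented toy data with a pure interaction (XOR) pattern: neither SNP alone
# is associated with status, but the pair separates cases from controls.
toy_a = [0, 0, 1, 1, 0, 0, 1, 1]
toy_b = [0, 1, 0, 1, 0, 1, 0, 1]
toy_status = [0, 1, 1, 0, 0, 1, 1, 0]
toy_high_risk, toy_accuracy = mdr_two_locus(toy_a, toy_b, toy_status)
```

The toy pattern is the kind of interaction that a main-effects-only logistic regression would miss, which is one reason the abstract recommends combining approaches.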
Comprehensive Protocols for CRISPR/Cas9-based Gene Editing in Human Pluripotent Stem Cells.

Science.gov (United States)

Santos, David P; Kiskinis, Evangelos; Eggan, Kevin; Merkle, Florian T

2016-08-17

Genome editing of human pluripotent stem cells (hPSCs) with the CRISPR/Cas9 system has the potential to revolutionize hPSC-based disease modeling, drug screening, and transplantation therapy. Here, we aim to provide a single resource to enable groups, even those with limited experience with hPSC culture or the CRISPR/Cas9 system, to successfully perform genome editing. The methods are presented in detail and are supported by a theoretical framework to allow for the incorporation of inevitable improvements in the rapidly evolving gene-editing field. We describe protocols to generate hPSC lines with gene-specific knock-outs, small targeted mutations, or knock-in reporters. © 2016 by John Wiley & Sons, Inc.
Single-base resolution maps of cultivated and wild rice methylomes and regulatory roles of DNA methylation in plant gene expression

Directory of Open Access Journals (Sweden)

Li Xin

2012-07-01

Full Text Available Abstract Background DNA methylation plays important biological roles in plants and animals. To examine the rice genomic methylation landscape and assess its functional significance, we generated single-base resolution DNA methylome maps for Asian cultivated rice Oryza sativa ssp. japonica, indica and their wild relatives, Oryza rufipogon and Oryza nivara. Results The overall methylation level of rice genomes is four times higher than that of Arabidopsis. Consistent with the results reported for Arabidopsis, methylation in promoters represses gene expression while gene-body methylation generally appears to be positively associated with gene expression. Interestingly, we discovered that methylation in gene transcriptional termination regions (TTRs) can significantly repress gene expression, and the effect is even stronger than that of promoter methylation. Through integrated analysis of genomic, DNA methylomic and transcriptomic differences between cultivated and wild rice, we found that primary DNA sequence divergence is the major determinant of methylational differences at the whole genome level, but DNA methylational difference alone can only account for limited gene expression variation between the cultivated and wild rice. Furthermore, we identified a number of genes with significant difference in methylation level between the wild and cultivated rice. Conclusions The single-base resolution methylomes of rice obtained in this study have not only broadened our understanding of the mechanism and function of DNA methylation in plant genomes, but also provided valuable data for future studies of rice epigenetics and the epigenetic differentiation between wild and cultivated rice.
RNA-based, transient modulation of gene expression in human haematopoietic stem and progenitor cells

Science.gov (United States)

Diener, Yvonne; Jurk, Marion; Kandil, Britta; Choi, Yeong-Hoon; Wild, Stefan; Bissels, Ute; Bosio, Andreas

2015-01-01

Modulation of gene expression is a useful tool to study the biology of haematopoietic stem and progenitor cells (HSPCs) and might also be instrumental to expand these cells for therapeutic approaches. Most of the studies so far have employed stable gene modification by viral vectors that are burdensome when translating protocols into clinical settings. Our study aimed at exploring new ways to transiently modify HSPC gene expression using non-integrating, RNA-based molecules. First, we tested different methods to deliver these molecules into HSPCs. The delivery of siRNAs with chemical transfection methods such as lipofection or cationic polymers did not lead to target knockdown, although we observed more than 90% fluorescent cells using a fluorochrome-coupled siRNA. Confocal microscopic analysis revealed that despite extensive washing, siRNA stuck to or in the cell surface, thereby mimicking a transfection event. In contrast, electroporation resulted in efficient, siRNA-mediated protein knockdown. For transient overexpression of proteins, we used optimised mRNA molecules with modified 5′- and 3′-UTRs. Electroporation of mRNA encoding GFP resulted in fast, efficient and persistent protein expression for at least seven days. Our data provide a broad-ranging comparison of transfection methods for hard-to-transfect cells and offer new opportunities for DNA-free, non-integrating gene modulation in HSPCs. PMID:26599627
Patenting human genes: Chinese academic articles' portrayal of gene patents.

Science.gov (United States)

Du, Li

2018-04-24

The patenting of human genes has been the subject of debate for decades. While China has gradually come to play an important role in the global genomics-based testing and treatment market, little is known about Chinese scholars' perspectives on patent protection for human genes. A content analysis of academic literature was conducted to identify Chinese scholars' concerns regarding gene patents, including benefits and risks of patenting human genes, attitudes that researchers hold towards gene patenting, and any legal and policy recommendations offered for the gene patent regime in China. 57.2% of articles were written by law professors, but scholars from health sciences, liberal arts, and ethics also participated in discussions on gene patent issues. While discussions of benefits and risks were relatively balanced in the articles, 63.5% of the articles favored gene patenting in general and, of the articles (n = 41) that explored gene patents in the Chinese context, 90.2% supported patent protections for human genes in China. The patentability of human genes was discussed in 33 articles, and 75.8% of these articles reached the conclusion that human genes are patentable. Chinese scholars view the patent regime as an important legal tool to protect the interests of inventors and inventions as well as the genetic resources of China. As such, many scholars support a gene patent system in China. These attitudes towards gene patents remain unchanged following the court ruling in the Myriad case in 2013, but arguments have been raised about the scope of gene patents, in particular that the increasing numbers of gene patents may negatively impact public health in China.
Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus).

Science.gov (United States)

Wei, Ling; Yang, Chao; Tao, Wenjing; Wang, Deshou

2016-02-23

The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.
Genome-Wide Identification and Transcriptome-Based Expression Profiling of the Sox Gene Family in the Nile Tilapia (Oreochromis niloticus)

Directory of Open Access Journals (Sweden)

Ling Wei

2016-02-01

Full Text Available The Sox transcription factor family is characterized by the presence of a Sry-related high-mobility group (HMG) box and plays important roles in various biological processes in animals, including sex determination and differentiation, and the development of multiple organs. In this study, 27 Sox genes were identified in the genome of the Nile tilapia (Oreochromis niloticus), and were classified into seven groups. The members of each group of the tilapia Sox genes exhibited a relatively conserved exon-intron structure. Comparative analysis showed that the Sox gene family has undergone an expansion in tilapia and other teleost fishes following their whole genome duplication, and group K only exists in teleosts. Transcriptome-based analysis demonstrated that most of the tilapia Sox genes presented stage-specific and/or sex-dimorphic expressions during gonadal development, and six of the group B Sox genes were specifically expressed in the adult brain. Our results provide a better understanding of gene structure and spatio-temporal expression of the Sox gene family in tilapia, and will be useful for further deciphering the roles of the Sox genes during sex determination and gonadal development in teleosts.
Sieve-based relation extraction of gene regulatory networks from biological literature.

Science.gov (United States)

Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

2015-01-01

Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice of transforming
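The slot error rate quoted above (0.73 reduced to 0.68) is conventionally the number of substitutions, deletions and insertions divided by the number of slots in the reference. The sketch below applies that general definition to networks given as (source, relation, target) triples. How the shared task matched edges in detail is not stated in the abstract, so the matching rule here (one relation per ordered gene pair, a substitution being the same pair with a different relation type) is an assumption, and the gene names are placeholders.

```python
def slot_error_rate(reference, predicted):
    """SER = (substitutions + deletions + insertions) / number of reference edges.

    Each network is an iterable of (source, relation, target) triples.
    Assumption: at most one relation per ordered (source, target) pair.
    """
    ref = {(s, t): r for s, r, t in reference}
    pred = {(s, t): r for s, r, t in predicted}
    substitutions = sum(1 for pair in ref if pair in pred and pred[pair] != ref[pair])
    deletions = sum(1 for pair in ref if pair not in pred)
    insertions = sum(1 for pair in pred if pair not in ref)
    return (substitutions + deletions + insertions) / len(ref)

# Invented example networks (placeholder gene names, not B. subtilis data).
reference_net = [
    ("geneA", "activation", "geneB"),
    ("geneC", "inhibition", "geneB"),
    ("geneD", "activation", "geneA"),
    ("geneB", "activation", "geneE"),
]
predicted_net = [
    ("geneA", "activation", "geneB"),   # correct
    ("geneC", "activation", "geneB"),   # substitution: wrong relation type
    ("geneF", "activation", "geneD"),   # insertion: not in the reference
    ("geneB", "activation", "geneE"),   # correct
]
ser_example = slot_error_rate(reference_net, predicted_net)
```

Because insertions count as errors, the rate can exceed 1 when a system predicts many spurious edges, so lower values are better.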
Towards gene therapy based on femtosecond optical transfection

Science.gov (United States)

Antkowiak, M.; Torres-Mapa, M. L.; McGinty, J.; Chahine, M.; Bugeon, L.; Rose, A.; Finn, A.; Moleirinho, S.; Okuse, K.; Dallman, M.; French, P.; Harding, S. E.; Reynolds, P.; Gunn-Moore, F.; Dholakia, K.

2012-06-01

Gene therapy holds great promise in the treatment and prevention of a variety of diseases. However, crucial to the study and development of this therapeutic approach is a reliable and efficient technique of gene and drug delivery into primary cell types. These cells, freshly derived from an organ or tissue, mimic more closely the in vivo state and present more physiologically relevant information compared to cultured cell lines. However, primary cells are known to be difficult to transfect and are typically transfected using viral methods, which are not only questionable in the context of an in vivo application but rely on time consuming vector construction and may also result in cell de-differentiation and loss of functionality. At the same time, well established non-viral methods do not guarantee satisfactory efficiency and viability. Recently, optical laser mediated poration of cell membrane has received interest as a viable gene and drug delivery technique. It has been shown to deliver a variety of biomolecules and genes into cultured mammalian cells; however, its applicability to primary cells remains to be proven. We demonstrate how optical transfection can be an enabling technique in research areas, such as neuropathic pain, neurodegenerative diseases, heart failure and immune or inflammatory-related diseases. Several primary cell types are used in this study, namely cardiomyocytes, dendritic cells, and neurons. We present our recent progress in optimizing this technique's efficiency and post-treatment cell viability for these types of cells and discuss future directions towards in vivo applications.
Genome-wide siRNA-based functional genomics of pigmentation identifies novel genes and pathways that impact melanogenesis in human cells.

Directory of Open Access Journals (Sweden)

Anand K Ganesan

2008-12-01

Full Text Available Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo), neurologic disorders (Parkinson's disease), auditory disorders (Waardenburg's syndrome), and ophthalmologic disorders (age related macular degeneration). Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi) provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy. In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype
Separate base usages of genes located on the leading and lagging strands in Chlamydia muridarum revealed by the Z curve method

Directory of Open Access Journals (Sweden)

Yu Xiu-Juan

2007-10-01

Full Text Available Abstract Background The nucleotide compositional asymmetry between the leading and lagging strands in bacterial genomes has been the subject of intensive study in the past few years. It is interesting to mention that almost all bacterial genomes exhibit the same kind of base asymmetry. This work aims to investigate the strand biases in Chlamydia muridarum genome and show the potential of the Z curve method for quantitatively differentiating genes on the leading and lagging strands. Results The occurrence frequencies of bases of protein-coding genes in C. muridarum genome were analyzed by the Z curve method. It was found that genes located on the two strands of replication have distinct base usages in C. muridarum genome. According to their positions in the 9-D space spanned by the variables u1 – u9 of the Z curve method, K-means clustering algorithm can assign about 94% of genes to the correct strands, which is a few percent higher than those correctly classified by K-means based on the RSCU. The base usage and codon usage analyses show that genes on the leading strand have more G than C and more T than A, particularly at the third codon position. For genes on the lagging strand the bias is reversed. The y component of the Z curves for the complete chromosome sequences show that the excess of G over C and T over A are more remarkable in C. muridarum genome than in other bacterial genomes without separating base and/or codon usages. Furthermore, for the genomes of Borrelia burgdorferi, Treponema pallidum, Chlamydia muridarum and Chlamydia trachomatis, in which distinct base and/or codon usages have been observed, closer phylogenetic distance is found compared with other bacterial genomes. Conclusion The nature of the strand biases of base composition in C. muridarum is similar to that in most other bacterial genomes. However, the base composition asymmetry between the leading and lagging strands in C. muridarum is more significant than that in
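The nine variables u1 to u9 mentioned above are usually obtained by applying the Z curve transform separately to each of the three codon positions: for position i with base frequencies a, c, g, t, the components are x = (a + g) - (c + t), y = (a + c) - (g + t) and z = (a + t) - (g + c). The sketch below follows that standard definition; the ordering of the nine values as (x1, y1, z1, x2, ..., z3) and the function name are assumptions, and the K-means clustering step is not shown.

```python
def z_curve_9d(cds):
    """Return the nine phase-specific Z curve variables for a coding sequence.

    For each codon position (1, 2, 3), compute from base frequencies:
      x = (a + g) - (c + t)   purine vs pyrimidine
      y = (a + c) - (g + t)   amino vs keto
      z = (a + t) - (g + c)   weak vs strong hydrogen bonding
    Assumed output order: x1, y1, z1, x2, y2, z2, x3, y3, z3.
    """
    cds = cds.upper()
    u = []
    for pos in range(3):
        bases = cds[pos::3]
        n = len(bases)
        a, c, g, t = (bases.count(b) / n for b in "ACGT")
        u += [(a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)]
    return u

# Tiny synthetic input (two codons) purely to show the calculation.
u_example = z_curve_9d("ATGGCT")
```

A gene with more G than C and more T than A at the third codon position, as reported for the leading strand, has a negative y value at that position (the last-but-one entry of the vector).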
Immune Modulation of NYVAC-Based HIV Vaccines by Combined Deletion of Viral Genes that Act on Several Signalling Pathways

Directory of Open Access Journals (Sweden)

Carmen Elena Gómez

2017-12-01

Full Text Available An HIV-1 vaccine continues to be a major target to halt the AIDS pandemic. The limited efficacy of the RV144 phase III clinical trial with the canarypox virus-based vector ALVAC and a gp120 protein component led to the conclusion that improved immune responses to HIV antigens are needed for a more effective vaccine. In non-human primates, the New York vaccinia virus (NYVAC) poxvirus vector has a broader immunogenicity profile than ALVAC and has been tested in clinical trials. We therefore analysed the HIV immune advantage of NYVAC after removing viral genes that act on several signalling pathways (Toll-like receptors—TLR—interferon, cytokines/chemokines), as well as genes of unknown immune function. We generated a series of NYVAC deletion mutants and studied immune behaviour (T and B cell) to HIV antigens and to the NYVAC vector in mice. Our results showed that combined deletion of selected vaccinia virus (VACV) genes is a valuable strategy for improving the immunogenicity of NYVAC-based vaccine candidates. These immune responses were differentially modulated, positive or negative, depending on the combination of gene deletions. The deletions also led to enhanced antigen- or vector-specific cellular and humoral responses. These findings will facilitate the development of optimal NYVAC-based vaccines for HIV and other diseases.
Next generation sequencing based transcriptome analysis of septic-injury responsive genes in the beetle Tribolium castaneum.

Directory of Open Access Journals (Sweden)

Boran Altincicek

Full Text Available Beetles (Coleoptera) are the most diverse animal group on earth and interact with numerous symbiotic or pathogenic microbes in their environments. The red flour beetle Tribolium castaneum is a genetically tractable model beetle species and its whole genome sequence has recently been determined. To advance our understanding of the molecular basis of beetle immunity here we analyzed the whole transcriptome of T. castaneum by high-throughput next generation sequencing technology. Here, we demonstrate that the Illumina/Solexa sequencing approach of cDNA samples from T. castaneum including over 9.7 million reads with 72 base pairs (bp) length (approximately 700 million bp sequence information with about 30× transcriptome coverage) confirms the expression of most predicted genes and enabled subsequent qualitative and quantitative transcriptome analysis. This approach recapitulates our recent quantitative real-time PCR studies of immune-challenged and naïve T. castaneum beetles, validating our approach. Furthermore, this sequencing analysis resulted in the identification of 73 differentially expressed genes upon immune-challenge with statistical significance by comparing expression data to calculated values derived by fitting to generalized linear models. We identified upregulation of diverse immune-related genes (e.g. Toll receptor, serine proteinases, DOPA decarboxylase and thaumatin) and of numerous genes encoding proteins with yet unknown functions. Of note, septic-injury resulted also in the elevated expression of genes encoding heat-shock proteins or cytochrome P450s supporting the view that there is crosstalk between immune and stress responses in T. castaneum. The present study provides a first comprehensive overview of septic-injury responsive genes in T. castaneum beetles. Identified genes advance our understanding of T. castaneum specific gene expression alteration upon immune-challenge in particular and may help to understand beetle immunity
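The sequencing figures quoted above can be checked with a few lines of arithmetic. The inputs are the abstract's own numbers (9.7 million reads is stated as a lower bound); the implied transcriptome size is a back-calculation from the stated 30× coverage, not a value given in the source, and the variable names are illustrative.

```python
# Figures taken from the abstract.
reads = 9_700_000          # "over 9.7 million reads" (lower bound)
read_length_bp = 72        # read length in base pairs
stated_coverage = 30       # "about 30x transcriptome coverage"

# Total sequence output: reads multiplied by read length.
total_bp = reads * read_length_bp

# Back-calculated transcriptome size implied by the stated coverage
# (an inference for illustration, not a number reported in the abstract).
implied_transcriptome_bp = total_bp / stated_coverage

print(f"total sequence: {total_bp / 1e6:.1f} million bp")
print(f"implied transcriptome size: {implied_transcriptome_bp / 1e6:.1f} million bp")
```

The product, 698.4 million bp, matches the "approximately 700 million bp" in the abstract.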
UniGene Tabulator: a full parser for the UniGene format.

Science.gov (United States)

Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

2006-10-15

UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
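
To illustrate what parsing a UniGene-style flat file into a table-like structure involves, here is a minimal Python sketch. It is not the UniGene Tabulator implementation (which the abstract describes as FileMaker-based). It assumes a layout in which each line starts with a field tag followed by whitespace and a value, sub-fields are written as `KEY=value` pairs separated by semicolons, and each cluster record ends with a `//` line. The function names and the sample records are hypothetical.

```python
from collections import defaultdict

def parse_unigene(lines):
    """Yield one dict per cluster record; each tag maps to a list of values."""
    record = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        if line.strip() == "//":          # assumed record terminator
            if record:
                yield dict(record)
            record = defaultdict(list)
            continue
        tag, _, value = line.partition(" ")
        record[tag].append(value.strip())
    if record:                            # file without a final terminator
        yield dict(record)

def split_subfields(value):
    """Split 'ACC=X; NID=Y' into {'ACC': 'X', 'NID': 'Y'}."""
    out = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:
            out[key] = val
    return out

# Made-up records for illustration only; not real UniGene data.
SAMPLE = """\
ID          Xx.1
TITLE       hypothetical example cluster
GENE        EXMPL1
SEQUENCE    ACC=AB000001.1; NID=g0000001; SEQTYPE=mRNA
SEQUENCE    ACC=AB000002.1; NID=g0000002; SEQTYPE=EST
//
ID          Xx.2
TITLE       second example cluster
//
"""

clusters = list(parse_unigene(SAMPLE.splitlines()))
# One summary row per cluster, plus one related row per sequence.
summary = [(c["ID"][0], c["TITLE"][0], len(c.get("SEQUENCE", []))) for c in clusters]
sequences = [split_subfields(s) for s in clusters[0]["SEQUENCE"]]
print(summary)
print(sequences)
```

Keeping repeated tags as lists mirrors the idea of a summary table with related tables: one row per cluster, and one row per sequence linked back to it.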
Monoterpenoid-based preparations in beehives affect learning, memory, and gene expression in the bee brain.

Science.gov (United States)

Bonnafé, Elsa; Alayrangues, Julie; Hotier, Lucie; Massou, Isabelle; Renom, Allan; Souesme, Guillaume; Marty, Pierre; Allaoua, Marion; Treilhou, Michel; Armengaud, Catherine

2017-02-01

Bees are exposed in their environment to contaminants that can weaken the colony and contribute to bee declines. Monoterpenoid-based preparations can be introduced into hives to control the parasitic mite Varroa destructor. The long-term effects of monoterpenoids are poorly investigated. Olfactory conditioning of the proboscis extension reflex (PER) has been used to evaluate the impact of stressors on cognitive functions of the honeybee such as learning and memory. The authors tested the PER to odorants on bees after exposure to monoterpenoids in hives. Octopamine receptors, transient receptor potential-like (TRPL), and γ-aminobutyric acid channels are thought to play a critical role in the memory of food experience. Gene expression levels of Amoa1, Rdl, and trpl were evaluated in parallel in the bee brain because these genes code for the cellular targets of monoterpenoids and some pesticides and neural circuits of memory require their expression. The miticide impaired the PER to odors in the 3 wk following treatment. Short-term and long-term olfactory memories were improved months after introduction of the monoterpenoids into the beehives. Chronic exposure to the miticide had significant effects on Amoa1, Rdl, and trpl gene expressions and modified seasonal changes in the expression of these genes in the brain. The decrease of expression of these genes in winter could partly explain the improvement of memory. The present study has led to new insights into alternative treatments, especially on their effects on memory and expression of selected genes involved in this cognitive function. Environ Toxicol Chem 2017;36:337-345. © 2016 SETAC.

Human gene therapy: novel approaches to improve the current gene delivery systems.

Science.gov (United States)

Cucchiarini, Magali

2016-06-01

Even though gene therapy made its way through the clinics to treat a number of human pathologies since the early years of experimental research and despite the recent approval of the first gene-based product (Glybera) in Europe, the safe and effective use of gene transfer vectors remains a challenge in human gene therapy due to the existence of barriers in the host organism. While work is under active investigation to improve the gene transfer systems themselves, the use of controlled release approaches may offer alternative, convenient tools of vector delivery to achieve a performant gene transfer in vivo while overcoming the various physiological barriers that preclude its wide use in patients. This article provides an overview of the most significant contributions showing how the principles of controlled release strategies may be adapted for human gene therapy.
Improved in vivo gene transfer into tumor tissue by stabilization of pseudodendritic oligoethylenimine-based polyplexes.

Science.gov (United States)

Russ, Verena; Fröhlich, Thomas; Li, Yunqiu; Halama, Anna; Ogris, Manfred; Wagner, Ernst

2010-02-01

HD O is a low molecular weight pseudodendrimer containing oligoethylenimine and degradable hexanediol diacrylate diesters. DNA polyplexes display encouraging gene transfer efficiency in vitro and in vivo but also a limited stability under physiological conditions. This limitation must be overcome for further development into more sophisticated formulations. HD O polyplexes were laterally stabilized by crosslinking surface amines via bifunctional crosslinkers, bioreducible dithiobis(succinimidyl propionate) (DSP) or the nonreducible analog disuccinimidyl suberate (DSS). Optionally, in a subsequent step, the targeting ligand transferrin (Tf) was attached to DSP-linked HD O polyplexes via Schiff base formation between HD O amino groups and Tf aldehyde groups, which were introduced into Tf by periodate oxidation of the glycosylation sites. Crosslinked DNA polyplexes showed an increased stability against exchange reaction by salt or heparin. Disulfide bond containing DSP-linked polyplexes were susceptible to reducing conditions. These polyplexes displayed the highest gene expression levels in vitro and in vivo (upon intratumoral application in mice), and these were significantly elevated and prolonged over standard or DSS-stabilized HD O formulations. DSP-stabilized HD O polyplexes with or without Tf coating were well-tolerated after intravenous application. High gene expression levels were found in tumor tissue, with negligible gene expression in any other organ. Lateral stabilization of HD O polyplexes with DSP crosslinker enhanced gene transfer efficacy and was essential for the incorporation of a ligand (Tf) into a stable particle formulation.
Intracellular delivery of potential therapeutic genes: prospects in cancer gene therapy.

Science.gov (United States)

Bakhtiar, Athirah; Sayyad, Mustak; Rosli, Rozita; Maruyama, Atsushi; Chowdhury, Ezharul H

2014-01-01

Conventional therapies for malignant cancer such as chemotherapy and radiotherapy are associated with poor survival rates owing to the development of cellular resistance to cancer drugs and the lack of targetability, resulting in unwanted adverse effects on healthy cells and necessitating the lowering of therapeutic dose with consequential lower efficacy of the treatment. Gene therapy employing different types of viral and non-viral carriers to transport gene(s) of interest and facilitating production of the desirable therapeutic protein(s) has tremendous prospects in cancer treatments due to the high-level of specificity in therapeutic action of the expressed protein(s) with diminished off-target effects, although cancer cell-specific delivery of transgene(s) still poses some challenges to be addressed. Depending on the potential therapeutic target genes, cancer gene therapy could be categorized into tumor suppressor gene replacement therapy, immune gene therapy and enzyme- or prodrug-based therapy. This review would shed light on the current progress of delivery of potentially therapeutic genes into various cancer cells in vitro and animal models utilizing a variety of viral and non-viral vectors.
Objectives, capabilities and dangers in the role of international organizations and funding agencies in promoting gene-based technologies for livestock in developing countries

International Nuclear Information System (INIS)

Hodges, J.

2005-01-01

Gene-based technologies offer the world unprecedented opportunities for improving quality of life, or for reducing it in irreversible ways. The basic question addressed in this paper is the position and response of international bodies and donors on whether or not to provide gene-based technologies to developing countries. It will not be easy to attain a responsible and coherent answer to this challenging question. Gaining an objective understanding of the essential issues is hard when controversy rages across the supposedly neutral scientific facts. Nevertheless, the outcome of the discussion is of prime importance at a global level. This paper seeks to bring light into this arena. After the Introduction, three principal concerns are examined which should be at the top of the agenda of these international institutions. Following this, short reviews of the critical issues are presented covering: the scientific characteristics and uncertainties associated with gene-based technologies; the nature of target areas in which they may be applied; and the considerable disquiet in society generally. These short outlines highlight the possible benefits and dangers associated with the critical issues. It is concluded that the objectives, capabilities, opportunities and dangers cannot be evaluated at the scientific level alone; they must be evaluated as matters of high policy by all stakeholders before gene-based technologies are implemented on the ground. In view of these perspectives, at the end of the paper it is proposed that scientists should place a moratorium on the development of gene-based technologies for the development of transgenic animals. It is also proposed that, during the moratorium, the United Nations should carry out a global referendum on the desirability of gene-based technologies being applied to the food chain. Meanwhile it is recommended that international organizations and funding bodies should not promote these techniques. (author)
Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions.

Science.gov (United States)

Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

2015-01-01

Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these
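
The abstract does not spell out how the Fisher's exact test was modified, so the following is only a sketch of the standard one-sided version, which tests whether an interaction term is over- or under-represented in a subset of gene pairs via the hypergeometric distribution. The function names, argument names and the counts in the usage line are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(exactly k tagged items in a draw of n, from N items of which K are tagged)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def enrichment_p_values(k, N, K, n):
    """One-sided Fisher's exact p-values for over- and under-representation.

    N: all gene pairs in the background
    K: background pairs annotated with the interaction term
    n: gene pairs in the subset of interest (e.g. vaccine-associated)
    k: subset pairs annotated with the term
    """
    lo = max(0, n - (N - K))   # smallest feasible k
    hi = min(K, n)             # largest feasible k
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    return p_over, p_under

# Hypothetical counts: 40 of 200 subset pairs carry the term,
# against 100 of 2000 pairs in the background.
p_over, p_under = enrichment_p_values(k=40, N=2000, K=100, n=200)
print(f"over-represented p = {p_over:.3g}, under-represented p = {p_under:.3g}")
```

A small `p_over` flags a term used more often than expected in the subset, a small `p_under` flags one used less often; with many terms tested, a multiple-testing correction would normally follow.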
Statistical approach for selection of biologically informative genes.

Science.gov (United States)

Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N

2018-05-20

Selection of informative genes from high dimensional gene expression data has emerged as an important research area in genomics. Many gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has been adjudged through post selection classification accuracy computed through a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biological sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique with 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique is also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R package, i.e. BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide to select statistical techniques for selecting informative genes
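
The general idea of combining maximum relevance, minimum redundancy and bootstrapping can be sketched as follows. This is not the BootMRMR package's algorithm or API: it assumes absolute Pearson correlation as both the relevance and the redundancy measure, a greedy difference-form selection, and selection frequency across bootstrap resamples as the final score. All function names and the toy data are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def mrmr(expr, labels, k):
    """Greedy mRMR: pick the gene maximising relevance minus mean redundancy."""
    relevance = {g: abs(pearson(v, labels)) for g, v in expr.items()}
    selected = []
    while len(selected) < min(k, len(expr)):
        best, best_score = None, None
        for g in expr:
            if g in selected:
                continue
            if selected:
                redundancy = sum(abs(pearson(expr[g], expr[s])) for s in selected) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[g] - redundancy
            if best_score is None or score > best_score:
                best, best_score = g, score
        selected.append(best)
    return selected

def bootstrap_mrmr(expr, labels, k, n_boot=200, seed=0):
    """Fraction of bootstrap resamples (over subjects) in which each gene is selected."""
    rng = random.Random(seed)
    n = len(labels)
    counts = {g: 0 for g in expr}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        boot_expr = {g: [v[i] for i in idx] for g, v in expr.items()}
        boot_labels = [labels[i] for i in idx]
        for g in mrmr(boot_expr, boot_labels, k):
            counts[g] += 1
    return {g: c / n_boot for g, c in counts.items()}

# Toy data: geneA and geneB track the class label, geneC is mostly noise.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expr = {
    "geneA": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
    "geneB": [1.0, 1.3, 0.8, 1.2, 2.8, 3.3, 2.7, 3.2],
    "geneC": [2.0, 1.0, 2.5, 1.5, 1.8, 2.2, 1.2, 2.6],
}
selected = mrmr(expr, labels, k=2)
frequency = bootstrap_mrmr(expr, labels, k=2)
print(selected)
print(frequency)
```

Ranking genes by selection frequency rather than by a single mRMR run is one way to make the result less sensitive to which subjects happen to be in the sample.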
Citrus plastid-related gene profiling based on expressed sequence tag analyses

Directory of Open Access Journals (Sweden)

Tercilio Calsa Jr.

2007-01-01

Plastid-related sequences, derived from putative nuclear or plastome genes, were searched in a large collection of expressed sequence tags (ESTs) and genomic sequences from the Citrus Biotechnology initiative in Brazil. The identified putative Citrus chloroplast gene sequences were compared to those from Arabidopsis, Eucalyptus and Pinus. Differential expression profiling for plastid-directed nuclear-encoded proteins and photosynthesis-related gene expression variation between Citrus sinensis and Citrus reticulata, when inoculated or not with Xylella fastidiosa, were also analyzed. Presumed Citrus plastome regions were more similar to Eucalyptus. Some putative genes appeared to be preferentially expressed in vegetative tissues (leaves and bark) or in reproductive organs (flowers and fruits). Genes preferentially expressed in fruit and flower may be associated with hypothetical physiological functions. Expression pattern clustering analysis suggested that photosynthesis- and carbon fixation-related genes appeared to be up- or down-regulated in a resistant or susceptible Citrus species after Xylella inoculation in comparison to non-infected controls, generating novel information which may be helpful to develop novel genetic manipulation strategies to control Citrus variegated chlorosis (CVC).
Analyses of the influencing factors of soil microbial functional gene diversity in tropical rainforest based on GeoChip 5.0.

Science.gov (United States)

Cong, Jing; Liu, Xueduan; Lu, Hui; Xu, Han; Li, Yide; Deng, Ye; Li, Diqiang; Zhang, Yuguang

2015-09-01

To examine soil microbial functional gene diversity and causative factors in tropical rainforests, we profiled it with a microarray-based metagenomic tool named GeoChip 5.0. We found high microbial functional gene diversity and differing soil microbial metabolic potential for biogeochemical processes in the tropical rainforest. Soil available nitrogen was the factor most strongly associated with soil microbial functional gene structure. Here, we mainly describe in detail the experimental design, the data processing, and the soil biogeochemical analyses attached to the study, which was published in BMC Microbiology in 2015 and whose raw data have been deposited in NCBI's Gene Expression Omnibus (accession number GSE69171).
A candidate gene-based association study of tocopherol content and composition in rapeseed (Brassica napus)

Directory of Open Access Journals (Sweden)

Steffi Fritsche

2012-06-01

Rapeseed (Brassica napus L.) is the most important oil crop of temperate climates. Rapeseed oil contains tocopherols, also known as vitamin E, which is an indispensable nutrient for humans and animals due to its antioxidant and radical scavenging abilities. Moreover, tocopherols are also important for the oxidative stability of vegetable oils. Therefore, seed oil with increased tocopherol content or altered tocopherol composition is a target for breeding. We investigated the role of nucleotide variations within candidate genes from the tocopherol biosynthesis pathway. Field trials were carried out with 229 accessions from a worldwide B. napus collection which was divided into two panels of 96 and 133 accessions. Seed tocopherol content and composition were measured by HPLC. High heritabilities were found for both traits, ranging from 0.62 to 0.94. We identified polymorphisms by sequencing selected regions of the tocopherol genes from the 96 accession panel. Subsequently, we determined the population structure (Q) and relative kinship (K) as detected by genotyping with genome-wide distributed SSR markers. Association studies were performed using two models, the structure-based GLM+Q and the PK mixed model. Between 26 and 12 polymorphisms within two genes (BnaX.VTE3.a, BnaA.PDS1.c) were significantly associated with tocopherol traits. The SNPs explained up to 16.93 % of the genetic variance for tocopherol composition and up to 10.48 % for total tocopherol content. Based on the sequence information we designed CAPS markers for genotyping the 133 accessions from the 2nd panel. Significant associations with various tocopherol traits confirmed the results from the first experiment. We demonstrate that the polymorphisms within the tocopherol genes clearly impact tocopherol content and composition in B. napus seeds. We suggest that these nucleotide variations may be used as selectable markers for breeding rapeseed with enhanced tocopherol quality.
Analyzing Plasmodium falciparum erythrocyte membrane protein 1 gene expression by a next generation sequencing based method

DEFF Research Database (Denmark)

Jespersen, Jakob S.; Petersen, Bent; Seguin-Orlando, Andaine

2013-01-01

at identifying PfEMP1 features associated with high virulence. Here we present the first effective method for sequence analysis of var genes expressed in field samples: a sequential PCR and next generation sequencing based technique applied on expressed var sequence tags and subsequently on long range PCR......, encoded by ~60 highly variable 'var' genes per haploid genome. PfEMP1 is exported to the surface of infected erythrocytes and is thought to be fundamental to immune evasion by adhesion to host and parasite factors. The highly variable nature has constituted a roadblock in var expression studies aimed...
Integration of Genome Scale Metabolic Networks and Gene Regulation of Metabolic Enzymes With Physiologically Based Pharmacokinetics.

Science.gov (United States)

Maldonado, Elaina M; Leoncikas, Vytautas; Fisher, Ciarán P; Moore, J Bernadette; Plant, Nick J; Kierzek, Andrzej M

2017-11-01

The scope of physiologically based pharmacokinetic (PBPK) modeling can be expanded by assimilation of the mechanistic models of intracellular processes from systems biology field. The genome scale metabolic networks (GSMNs) represent a whole set of metabolic enzymes expressed in human tissues. Dynamic models of the gene regulation of key drug metabolism enzymes are available. Here, we introduce GSMNs and review ongoing work on integration of PBPK, GSMNs, and metabolic gene regulation. We demonstrate example models. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
Liposome-based DNA carriers may induce cellular stress response and change gene expression pattern in transfected cells

Science.gov (United States)

2011-01-01

Background During functional studies on the rat stress-inducible Hspa1b (hsp70.1) gene we noticed that some liposome-based DNA carriers, which are used for transfection, induce its promoter activity. This observation concerned commercial liposome formulations (LA), Lipofectin and Lipofectamine 2000. This work was aimed to understand better the mechanism of this phenomenon and its potential biological and practical consequences. Results We found that a reporter gene driven by Hspa1b promoter is activated both in the case of transient transfections and in the stably transfected cells treated with LA. Using several deletion clones containing different fragments of Hspa1b promoter, we found that the regulatory elements responsible for most efficient LA-driven inducibility were located between nucleotides -269 and +85, relative to the transcription start site. Further studies showed that the induction mechanism was independent of the classical HSE-HSF interaction that is responsible for gene activation during heat stress. Using DNA microarrays we also detected significant activation of the endogenous Hspa1b gene in cells treated with Lipofectamine 2000. Several other stress genes were also induced, along with numerous genes involved in cellular metabolism, cell cycle control and pro-apoptotic pathways. Conclusions Our observations suggest that i) some cationic liposomes may not be suitable for functional studies on hsp promoters, ii) lipofection may cause unintended changes in global gene expression in the transfected cells. PMID:21663599
Liposome-based DNA carriers may induce cellular stress response and change gene expression pattern in transfected cells

Directory of Open Access Journals (Sweden)

Lisowska Katarzyna Marta

2011-06-01

Background During functional studies on the rat stress-inducible Hspa1b (hsp70.1) gene we noticed that some liposome-based DNA carriers, which are used for transfection, induce its promoter activity. This observation concerned commercial liposome formulations (LA), Lipofectin and Lipofectamine 2000. This work was aimed to understand better the mechanism of this phenomenon and its potential biological and practical consequences. Results We found that a reporter gene driven by Hspa1b promoter is activated both in the case of transient transfections and in the stably transfected cells treated with LA. Using several deletion clones containing different fragments of Hspa1b promoter, we found that the regulatory elements responsible for most efficient LA-driven inducibility were located between nucleotides -269 and +85, relative to the transcription start site. Further studies showed that the induction mechanism was independent of the classical HSE-HSF interaction that is responsible for gene activation during heat stress. Using DNA microarrays we also detected significant activation of the endogenous Hspa1b gene in cells treated with Lipofectamine 2000. Several other stress genes were also induced, along with numerous genes involved in cellular metabolism, cell cycle control and pro-apoptotic pathways. Conclusions Our observations suggest that i) some cationic liposomes may not be suitable for functional studies on hsp promoters, ii) lipofection may cause unintended changes in global gene expression in the transfected cells.
Quantitative multiplex quantum dot in-situ hybridisation based gene expression profiling in tissue microarrays identifies prognostic genes in acute myeloid leukaemia

Energy Technology Data Exchange (ETDEWEB)

Tholouli, Eleni [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); MacDermott, Sarah [The Medical School, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Hoyland, Judith [School of Biomedicine, Faculty of Medical and Human Sciences, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Yin, John Liu [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); Byers, Richard, E-mail: richard.byers@cmft.nhs.uk [School of Cancer and Enabling Sciences, Faculty of Medical and Human Sciences, The University of Manchester, Stopford Building, Oxford Road, M13 9PT Manchester (United Kingdom)

2012-08-24

Highlights: ► Development of a quantitative high throughput in situ expression profiling method. ► Application to a tissue microarray of 242 AML bone marrow samples. ► Identification of HOXA4, HOXA9, Meis1 and DNMT3A as prognostic markers in AML. -- Abstract: Measurement and validation of microarray gene signatures in routine clinical samples is problematic and a rate limiting step in translational research. In order to facilitate measurement of microarray identified gene signatures in routine clinical tissue a novel method combining quantum dot based oligonucleotide in situ hybridisation (QD-ISH) and post-hybridisation spectral image analysis was used for multiplex in-situ transcript detection in archival bone marrow trephine samples from patients with acute myeloid leukaemia (AML). Tissue-microarrays were prepared into which white cell pellets were spiked as a standard. Tissue microarrays were made using routinely processed bone marrow trephines from 242 patients with AML. QD-ISH was performed for six candidate prognostic genes using triplex QD-ISH for DNMT1, DNMT3A, DNMT3B, and for HOXA4, HOXA9, Meis1. Scrambled oligonucleotides were used to correct for background staining followed by normalisation of expression against the expression values for the white cell pellet standard. Survival analysis demonstrated that low expression of HOXA4 was associated with poorer overall survival (p = 0.009), whilst high expression of HOXA9 (p < 0.0001), Meis1 (p = 0.005) and DNMT3A (p = 0.04) were associated with early treatment failure. These results demonstrate application of a standardised, quantitative multiplex QD-ISH method for identification of prognostic markers in formalin-fixed paraffin-embedded clinical samples, facilitating measurement of gene expression signatures in routine clinical samples.
Sex-based differences in gene expression in hippocampus following postnatal lead exposure

International Nuclear Information System (INIS)

Schneider, J.S.; Anderson, D.W.; Sonnenahalli, H.; Vadigepalli, R.

2011-01-01

The influence of sex as an effect modifier of childhood lead poisoning has received little systematic attention. Considering the paucity of information available concerning the interactive effects of lead and sex on the brain, the current study examined the interactive effects of lead and sex on gene expression patterns in the hippocampus, a structure involved in learning and memory. Male or female rats were fed either 1500 ppm lead-containing chow or control chow for 30 days beginning at weaning. Blood lead levels were 26.7 ± 2.1 μg/dl and 27.1 ± 1.7 μg/dl for females and males, respectively. The expression of 175 unique genes was differentially regulated between control male and female rats. A total of 167 unique genes were differentially expressed in response to lead in either males or females. Lead exposure had a significant effect without a significant difference between male and female responses in 77 of these genes. In another set of 71 genes, there were significant differences in male vs. female response. A third set of 30 genes was differentially expressed in opposite directions in males vs. females, with the majority of genes expressed at a lower level in females than in males. Highly differentially expressed genes in males and females following lead exposure were associated with diverse biological pathways and functions. These results show that a brief exposure to lead produced significant changes in expression of a variety of genes in the hippocampus and that the response of the brain to a given lead exposure may vary depending on sex. - Highlights: → Postnatal lead exposure has a significant effect on hippocampal gene expression patterns. → At least one set of genes was affected in opposite directions in males and females. → Differentially expressed genes were associated with diverse biological pathways.
Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

International Nuclear Information System (INIS)

Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

2000-01-01

A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society
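
A toy simulation can make the duplication-and-divergence idea concrete. The sketch below is not the authors' model. It assumes that a genome grows from a set of precursor genes by duplicating one randomly chosen gene per step, and that each new copy either stays in its parent's family or, with a fixed probability standing in for accumulated point mutations, has diverged enough to found a new family. The parameter names and default values are hypothetical.

```python
import random
from collections import Counter

def simulate_families(n_precursors=50, target_size=2000, p_diverge=0.1, seed=1):
    """Return a histogram {family size: number of families} for a simulated genome."""
    rng = random.Random(seed)
    family_of = list(range(n_precursors))   # family_of[i] = family id of gene i
    next_family = n_precursors
    while len(family_of) < target_size:
        parent = rng.randrange(len(family_of))   # every gene equally likely to duplicate
        if rng.random() < p_diverge:
            # Assumption: the copy has mutated beyond recognisable homology.
            family_of.append(next_family)
            next_family += 1
        else:
            family_of.append(family_of[parent])
    sizes = Counter(family_of)               # family id -> number of genes
    return Counter(sizes.values())           # family size -> number of families

histogram = simulate_families()
for size in sorted(histogram)[:5]:
    print(f"families with {size} gene(s): {histogram[size]}")
```

Because large families are more likely to contain the next gene chosen for duplication, such a process tends to produce many small families and a few large ones; comparing the shape of that histogram with observed genomes is the kind of check the abstract describes.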
Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

Energy Technology Data Exchange (ETDEWEB)

Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

2000-09-18

A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society.
Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

Science.gov (United States)

Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; 
Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

2015-10-01

Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
Republished review: Gene therapy for ocular diseases.

Science.gov (United States)

Liu, Melissa M; Tuo, Jingsheng; Chan, Chi-Chao

2011-07-01

The eye is an easily accessible, highly compartmentalised and immune-privileged organ that offers unique advantages as a gene therapy target. Significant advancements have been made in understanding the genetic pathogenesis of ocular diseases, and gene replacement and gene silencing have been implicated as potentially efficacious therapies. Recent improvements have been made in the safety and specificity of vector-based ocular gene transfer methods. Proof-of-concept for vector-based gene therapies has also been established in several experimental models of human ocular diseases. After nearly two decades of ocular gene therapy research, preliminary successes are now being reported in phase 1 clinical trials for the treatment of Leber congenital amaurosis. This review describes current developments and future prospects for ocular gene therapy. Novel methods are being developed to enhance the performance and regulation of recombinant adeno-associated virus- and lentivirus-mediated ocular gene transfer. Gene therapy prospects have advanced for a variety of retinal disorders, including retinitis pigmentosa, retinoschisis, Stargardt disease and age-related macular degeneration. Advances have also been made using experimental models for non-retinal diseases, such as uveitis and glaucoma. These methodological advancements are critical for the implementation of additional gene-based therapies for human ocular diseases in the near future.
FurIOS: a web-based tool for identification of Vibrionaceae species using the fur gene

DEFF Research Database (Denmark)

Machado, Henrique; Cardoso, Joao; Giubergia, Sonia

2017-01-01

-sequence. The input is a DNA sequence that can be uploaded on the web service; the output is a table containing the strain identifier, e-value, and percentage of identity for each of the matches with rows colored in green for hits with high probability of being the same species. The service is available on the web at......: http://www.cbs.dtu.dk/services/furIOS-1.0/. The fur-sequences can be derived either from genome sequences or from PCR-amplification of the genomic region encoding the fur gene. We have used 191 strains identified as Vibrionaceae based on 16S rRNA gene sequence to test the PCR method and the web service...

Haplotype-based case-control study on human apurinic/apyrimidinic endonuclease 1/redox effector factor-1 gene and essential hypertension.

Science.gov (United States)

Naganuma, Takahiro; Nakayama, Tomohiro; Sato, Naoyuki; Fu, Zhenyan; Soma, Masayoshi; Yamaguchi, Mai; Shimodaira, Masanori; Aoi, Noriko; Usami, Ron

2010-02-01

Oxidative DNA damage is involved in the pathophysiology of essential hypertension (EH), which is a multifactorial disorder. Apurinic/apyrimidinic endonuclease 1/redox effector factor-1 (APE1/REF-1) is an essential endonuclease in the base excision repair pathway of oxidatively damaged DNA, in addition to having reducing properties that promote the binding of redox-sensitive transcription factors. Blood pressure in APE1/REF-1-knockout mice is reported to be significantly higher than in wild-type mice. The aim of this study was to investigate the relationship between EH and the human APE1/REF-1 gene through a haplotype-based case-control study using single-nucleotide polymorphisms (SNPs). We selected five SNPs in the human APE1/REF-1 gene (rs1760944, rs3136814, rs17111967, rs3136817, and rs1130409), and performed case-control studies in 265 EH patients and 266 age-matched normotensive (NT) subjects. rs17111967 was found to show nonheterogeneity among Japanese subjects. There were no significant differences in the overall distribution of genotypes or alleles for each SNP between EH and NT groups. In the overall distribution of the haplotype-based case-control study constructed based on rs1760944, rs3136817, and rs1130409, the frequency of the G-T-T haplotype was significantly higher in the EH group than in the NT group (2.1% vs. 0.0%, P = 0.001). Multiple logistic regression analysis also revealed significant differences for the G-T-T haplotype, even after adjustment for confounding factors (OR = 8.600, 95% CI: 1.073-68.951, P = 0.043). Based on the present results, the G-T-T haplotype appears to be a genetic marker of EH, and the APE1/REF-1 gene appears to be a susceptibility gene for EH.
Optimal consistency in microRNA expression analysis using reference-gene-based normalization.

Science.gov (United States)

Wang, Xi; Gardiner, Erin J; Cairns, Murray J

2015-05-01

Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
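The idea behind reference-gene-based normalization can be sketched in a few lines. The abstract does not give the exact procedure, so this minimal example assumes log2-scale intensities and simple subtraction of the mean reference-gene signal; the probe names and values are hypothetical.

```python
from statistics import mean

def rgb_normalize(sample, reference_genes):
    """Subtract the mean log2 signal of the reference genes from every probe.

    Assumption: values are log2 intensities, so subtraction corresponds to
    scaling on the raw intensity scale.
    """
    ref_level = mean(sample[g] for g in reference_genes)
    return {probe: value - ref_level for probe, value in sample.items()}

# Hypothetical arrays: array_b carries a global +1 shift in miRNA signal
# that the reference genes do not share.
array_a = {"miR-1": 8.0, "miR-2": 6.0, "ref-1": 10.0, "ref-2": 12.0}
array_b = {"miR-1": 9.0, "miR-2": 7.0, "ref-1": 10.0, "ref-2": 12.0}
refs = ["ref-1", "ref-2"]

norm_a = rgb_normalize(array_a, refs)
norm_b = rgb_normalize(array_b, refs)

# Difference in miR-1 between the two arrays after normalization.
shift = norm_b["miR-1"] - norm_a["miR-1"]
```

Because the reference genes are unchanged between the two arrays, the global shift in miRNA signal is still visible after normalization. A global method that forces both arrays onto the same distribution would tend to remove it, which is the flattening effect the abstract describes.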
Development of a blood-based gene expression algorithm for assessment of obstructive coronary artery disease in non-diabetic patients

Directory of Open Access Journals (Sweden)

Ellis Stephen G

2011-03-01

Full Text Available Abstract Background Alterations in gene expression in peripheral blood cells have been shown to be sensitive to the presence and extent of coronary artery disease (CAD). A non-invasive blood test that could reliably assess obstructive CAD likelihood would have diagnostic utility. Results Microarray analysis of RNA samples from a 195 patient Duke CATHGEN registry case:control cohort yielded 2,438 genes with significant CAD association (p RT-PCR analysis of these 113 genes in a PREDICT cohort of 640 non-diabetic subject samples was used for algorithm development. Gene expression correlations identified clusters of CAD classifier genes which were reduced to meta-genes using LASSO. The final classifier for assessment of obstructive CAD was derived by Ridge Regression and contained sex-specific age functions and 6 meta-gene terms, comprising 23 genes. This algorithm showed a cross-validated estimated AUC = 0.77 (95% CI 0.73-0.81) in ROC analysis. Conclusions We have developed a whole blood classifier based on gene expression, age and sex for the assessment of obstructive CAD in non-diabetic patients from a combination of microarray and RT-PCR data derived from studies of patients clinically indicated for invasive angiography. Clinical trial registration information PREDICT, Personalized Risk Evaluation and Diagnosis in the Coronary Tree, http://www.clinicaltrials.gov, NCT00500617
Rapid and tunable method to temporally control gene editing based on conditional Cas9 stabilization.

Science.gov (United States)

The CRISPR/Cas9 system is a powerful tool for studying gene function. Here, we describe a method that allows temporal control of CRISPR/Cas9 activity based on conditional Cas9 destabilization. We demonstrate that fusing an FKBP12-derived destabilizing domain to Cas9 (DD-Cas9) enables conditional Cas9 expression and temporal control of gene editing in the presence of an FKBP12 synthetic ligand. This system can be easily adapted to co-express, from the same promoter, DD-Cas9 with any other gene of interest without co-modulation of the latter.
Comparison of Current Regulatory Status for Gene-Based Vaccines in the U.S., Europe and Japan

Directory of Open Access Journals (Sweden)

Yoshikazu Nakayama

2015-03-01

Full Text Available Gene-based vaccines, as typified by plasmid DNA vaccines and recombinant viral-vectored vaccines, are expected to be promising solutions against infectious diseases for which no effective prophylactic vaccines exist, such as HIV, dengue virus, Ebola virus and malaria, and for which improved vaccines are needed, such as tuberculosis and influenza virus. Although many preclinical and clinical trials have been conducted to date, no DNA vaccines or recombinant viral-vectored vaccines expressing heterologous antigens for human use have yet been licensed in the U.S., Europe or Japan. In this research, we describe the current regulatory context for gene-based prophylactic vaccines against infectious disease in the U.S., Europe, and Japan. We identify the important considerations, in particular for the preclinical assessments that would allow these vaccines to proceed to clinical trials, and the differences in the regulatory pathway for marketing authorization in each region.
A family-based association study of the HTR1B gene in eating disorders

Directory of Open Access Journals (Sweden)

Sandra Hernández

Full Text Available Objective: To explore the association of three polymorphisms of the serotonin receptor 1Dβ gene (HTR1B) in the etiology of eating disorders and their relationship with clinical characteristics. Methods: We analyzed the G861C, A-161T, and A1180G polymorphisms of the HTR1B gene through a family-based association test (FBAT) in 245 nuclear families. The sample was stratified into anorexia nervosa (AN) spectrum and bulimia nervosa (BN) spectrum. In addition, we performed a quantitative FBAT analysis of anxiety severity, depression severity, and Yale-Brown-Cornell Eating Disorders Scale (YBC-EDS) in the AN and BN-spectrum groups. Results: FBAT analysis of the A-161T polymorphism found preferential transmission of allele A-161 in the overall sample. This association was stronger when the sample was stratified by spectrums, showing transmission disequilibrium between the A-161 allele and BN spectrum (z = 2.871, p = 0.004). Quantitative trait analysis showed an association between severity of anxiety symptoms and the C861 allele in AN-spectrum participants (z = 2.871, p = 0.004). We found no associations on analysis of depression severity or preoccupation and ritual scores in AN or BN-spectrum participants. Conclusions: Our preliminary findings suggest a role of the HTR1B gene in susceptibility to development of BN subtypes. Furthermore, this gene might have an impact on the severity of anxiety in AN-spectrum patients.
DGGE based whole-gene mutation scanning of the dystrophin gene in Duchenne and Becker muscular dystrophy patients

NARCIS (Netherlands)

Hofstra, RMW; Mulder, IM; Vossen, R; de Koning-Gans, PAM; Kraak, M; Ginjaar, IB; van der Hout, AH; Bakker, E; Buys, CHCM; van Essen, AJ; den Dunnen, JT

2004-01-01

Duchenne and Becker muscular dystrophy (DMD and BMD) are caused by mutations in the dystrophin gene. Large rearrangements in the gene are found in about two-thirds of DMD patients, with approximately 60% carrying deletions and 5-10% carrying duplications. Most of the remaining 30-35% of patients are
16S rRNA gene-based phylogenetic microarray for simultaneous identification of members of the genus Burkholderia.

Science.gov (United States)

Schönmann, Susan; Loy, Alexander; Wimmersberger, Céline; Sobek, Jens; Aquino, Catharine; Vandamme, Peter; Frey, Beat; Rehrauer, Hubert; Eberl, Leo

2009-04-01

For cultivation-independent and highly parallel analysis of members of the genus Burkholderia, an oligonucleotide microarray (phylochip) consisting of 131 hierarchically nested 16S rRNA gene-targeted oligonucleotide probes was developed. A novel primer pair was designed for selective amplification of a 1.3 kb 16S rRNA gene fragment of Burkholderia species prior to microarray analysis. The diagnostic performance of the microarray for identification and differentiation of Burkholderia species was tested with 44 reference strains of the genera Burkholderia, Pandoraea, Ralstonia and Limnobacter. Hybridization patterns based on presence/absence of probe signals were interpreted semi-automatically using the novel likelihood-based strategy of the web-tool Phylo-Detect. Eighty-eight per cent of the reference strains were correctly identified at the species level. The evaluated microarray was applied to investigate shifts in the Burkholderia community structure in acidic forest soil upon addition of cadmium, a condition that selected for Burkholderia species. The microarray results were in agreement with those obtained from phylogenetic analysis of Burkholderia 16S rRNA gene sequences recovered from the same cadmium-contaminated soil, demonstrating the value of the Burkholderia phylochip for determinative and environmental studies.
Transcription activator-like effector-mediated regulation of gene expression based on the inducible packaging and delivery via designed extracellular vesicles

International Nuclear Information System (INIS)

Lainšček, Duško; Lebar, Tina; Jerala, Roman

2017-01-01

Transcription activator-like effector (TALE) proteins present a powerful tool for genome editing and engineering, enabling introduction of site-specific mutations, gene knockouts or regulation of the transcription levels of selected genes. TALE nucleases or TALE-based transcription regulators are introduced into mammalian cells mainly via delivery of the coding genes. Here we report an extracellular vesicle-mediated delivery of TALE transcription regulators and their ability to upregulate the reporter gene in target cells. Designed transcriptional activator TALE-VP16 fused to the appropriate dimerization domain was enriched as a cargo protein within extracellular vesicles produced by mammalian HEK293 cells stimulated by Ca-ionophore and using blue light- or rapamycin-inducible dimerization systems. Blue light illumination or rapamycin increased the amount of the TALE-VP16 activator in extracellular vesicles, and addition of these vesicles to the target cells resulted in increased expression of the reporter gene. This technology therefore represents an efficient delivery method for TALE-based transcriptional regulators. - Highlights: • Inducible dimerization enriched cargo proteins within extracellular vesicles (EV). • Farnesylation surpassed LAMP-1 fusion proteins for the EV packing. • Extracellular vesicles were able to deliver TALE regulators to mammalian cells. • TALE mediated transcriptional activation was achieved by designed EV.
Association Study between BDNF Gene Polymorphisms and Autism by Three-Dimensional Gel-Based Microarray

Directory of Open Access Journals (Sweden)

Zuhong Lu

2009-06-01

Full Text Available Single nucleotide polymorphisms (SNPs) are important markers which can be used in association studies searching for susceptibility genes of complex diseases. High-throughput methods are needed for SNP genotyping in a large number of samples. In this study, we applied polyacrylamide gel-based microarray combined with dual-color hybridization for association study of four BDNF polymorphisms with autism. All the SNPs in both patients and controls could be analyzed quickly and correctly. Among the four SNPs, only the C270T polymorphism showed significant differences in the frequency of the allele (χ2 = 7.809, p = 0.005) and genotype (χ2 = 7.800, p = 0.020). In the haplotype association analysis, there was a significant difference in global haplotype distribution between the groups (χ2 = 28.19, p = 3.44e-005). We suggest that BDNF has a possible role in the pathogenesis of autism. The study also shows that the polyacrylamide gel-based microarray combined with dual-color hybridization is a rapid, simple and high-throughput method for SNP genotyping, and can be used for association studies of susceptibility genes with disorders in large samples.
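As a quick arithmetic check, the reported p-values can be recovered from the chi-square statistics. This sketch assumes the allele comparison is a 2×2 table (1 degree of freedom) and the genotype comparison a 2×3 table (2 degrees of freedom); the abstract does not state the degrees of freedom, so these are assumptions. For these two cases the chi-square survival function has a closed form, so only the standard library is needed.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Statistics taken from the abstract; degrees of freedom are assumed.
p_allele = chi2_sf_df1(7.809)    # allele frequency comparison
p_genotype = chi2_sf_df2(7.800)  # genotype frequency comparison
```

Under these assumptions both values round to the figures given in the abstract (0.005 and 0.020). The haplotype statistic is not checked here because its degrees of freedom depend on the number of haplotypes, which the abstract does not report.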
SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

Directory of Open Access Journals (Sweden)

Oelofse Dean

2010-04-01

Full Text Available Abstract Background Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp.). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped
Chitosan nanoparticle-based delivery of fused NKG2D–IL-21 gene suppresses colon cancer growth in mice

Directory of Open Access Journals (Sweden)

Tan L

2017-04-01

Full Text Available Lunmei Tan,1 Sen Han,2 Shizhen Ding,2 Weiming Xiao,3,4 Yanbing Ding,3 Li Qian,2,4 Chenming Wang,1,5 Weijuan Gong1–5 1Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, 2Department of Immunology, School of Medicine, 3Department of Gastroenterology, The Second Clinical Medical College, 4Department of Integrated Chinese and Western Medicine, School of Medicine, 5Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, People’s Republic of China Abstract: Nanoparticles can be loaded with exogenous DNA for the potential expression of cytokines with immune-stimulatory function. NKG2D identifies major histocompatibility complex class I chain-related protein in human and retinoic acid early induced transcript-1 in mouse, which act as tumor-associated antigens. Biologic agents based on interleukin 21 (IL-21) have displayed antitumor activities through lymphocyte activation. The NKG2D–IL-21 fusion protein theoretically identifies tumor cells through the NKG2D moiety and activates T cells through the IL-21 moiety. In this study, double-gene fragments that encode the extracellular domains of NKG2D and IL-21 genes were connected and then inserted into the pcDNA3.1(–) plasmid. PcDNA3.1–dsNKG2D–IL-21 plasmid nanoparticles based on chitosan were generated. Tumor cells pretransfected with dsNKG2D–IL-21 gene nanoparticles can activate natural killer (NK) and CD8+ T cells in vitro. Serum IL-21 levels were enhanced in mice intramuscularly injected with the gene nanoparticles. DsNKG2D–IL-21 gene nanoparticles accumulated in tumor tissues after being intravenously injected for ~4–24 h. Treatment with dsNKG2D–IL-21 gene nanoparticles also retarded tumor growth and extended the life span of tumor-bearing mice by activating NK and T cells in vivo. Thus, the dsNKG2D–IL-21 gene nanoparticles exerted efficient antitumor activities and could potentially be used for tumor therapy. Keywords: NKG2
Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA.

Science.gov (United States)

Yin, Li; Cai, Zhihui; Zhu, Baoan; Xu, Cunshuan

2018-02-14

Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA) which has been proved to be an effective method in many diseases to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules are selected to perform canonical pathway enrichment and upstream regulator analysis by using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel five high degree hub genes signature that was identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.
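The network-construction step that WGCNA is generally built on can be sketched as follows: pairwise correlations are raised to a soft-thresholding power to form a weighted adjacency, and each gene's connectivity is the sum of its adjacencies. This is a minimal illustration of an unsigned network, not the authors' pipeline; the power `beta=6`, the gene names, and the expression values are assumptions. In the study itself, hub genes were identified from a protein-protein interaction network.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }

def connectivity(adj):
    """Connectivity of each gene: sum of its adjacencies to all other genes."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}

# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # weakly (negatively) correlated with geneA
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
```

Raising correlations to a power suppresses weak correlations much more strongly than strong ones, so `geneA` and `geneB` end up tightly connected while `geneC` is nearly isolated. Modules are then typically found by clustering genes on a dissimilarity derived from this adjacency.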
Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

Science.gov (United States)

Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

2015-06-01

To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
Imaging of Herpes Simplex Virus Type 1 Thymidine Kinase Gene Expression with Radiolabeled 5-(2-iodovinyl)-2'-deoxyuridine (IVDU) in Liver by Hydrodynamic-based Procedure

Energy Technology Data Exchange (ETDEWEB)

Song, In Ho; Lee, Tae Sup; Kang, Joo Hyun; Lee, Yong Jin; Kim, Kwang Il; An, Gwang Il; Chung, Wee Sup; Cheon, Gi Jeong; Choi, Chang Woon; Lim, Sang Moo [Korea Institute of Radiological and Medical Sciences, Seoul (Korea, Republic of)

2009-10-15

The hydrodynamic-based procedure is a simple and effective gene delivery method that leads to high gene expression in liver tissue. Non-invasive imaging reporter gene systems based on herpes simplex virus type 1 thymidine kinase (HSV1-tk) and its various substrates have been widely used. In the present study, we investigated imaging of HSV1-tk gene expression with 5-(2-iodovinyl)-2'-deoxyuridine (IVDU) in mouse liver following gene delivery by the hydrodynamic-based procedure. HSV1-tk or enhanced green fluorescence protein (EGFP) encoded plasmid DNA was transferred into the mouse liver by hydrodynamic injection. At 24 h post-injection, RT-PCR, biodistribution, fluorescence imaging, nuclear imaging and digital whole-body autoradiography (DWBA) were performed to confirm transferred gene expression. In the RT-PCR assay using mRNA from the mouse liver, specific bands of the HSV1-tk and EGFP genes were observed in mice injected with the HSV1-tk and EGFP expressing plasmids, respectively. Higher uptake of radiolabeled IVDU was exhibited in the liver of HSV1-tk gene transferred mice in the biodistribution study. In fluorescence imaging, the liver showed a specific fluorescence signal in EGFP gene transferred mice. Gamma-camera images and DWBA results showed that radiolabeled IVDU accumulated in the liver of HSV1-tk gene transferred mice. In this study, the hydrodynamic-based procedure was effective in liver-specific gene delivery, and the resulting expression could be quantified with molecular imaging methods. Therefore, co-expression of the HSV1-tk reporter gene and a target gene by the hydrodynamic-based procedure is expected to be a useful method for evaluating the target gene expression level with radiolabeled IVDU.
Gene expression inference with deep learning.

Science.gov (United States)

Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

2016-06-15

Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX CONTACT: xhx@ics.uci.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
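The evaluation metric can be illustrated with a short sketch. It assumes that "relative improvement" means the reduction in mean absolute error divided by the baseline (LR) error, which the abstract does not spell out; all numbers are hypothetical and are not taken from the study.

```python
def mean_absolute_error(truth, predicted):
    """Average absolute difference between true and predicted expression values."""
    return sum(abs(t - p) for t, p in zip(truth, predicted)) / len(truth)

def relative_improvement(baseline_error, new_error):
    """Fractional error reduction relative to the baseline (assumed definition)."""
    return (baseline_error - new_error) / baseline_error

# Hypothetical expression values for one target gene across four samples.
truth = [1.0, 2.0, 3.0, 4.0]
lr_pred = [1.5, 2.5, 2.5, 4.5]    # stand-in for the linear regression baseline
dl_pred = [1.25, 2.0, 3.0, 4.75]  # stand-in for the deep learning model

mae_lr = mean_absolute_error(truth, lr_pred)
mae_dl = mean_absolute_error(truth, dl_pred)
improvement = relative_improvement(mae_lr, mae_dl)
```

In this toy case the baseline error is 0.5 and the second model's error is 0.25, giving a relative improvement of 0.5 (50%). The gene-wise comparison in the abstract corresponds to repeating this per target gene and counting how often the deep learning error is lower.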
Junk DNA enhances pEI-based non-viral gene delivery

NARCIS (Netherlands)

Gaal, E.V.B. van; Oosting, R.S.; Hennink, W.E.; Crommelin, D.J.A.; Mastrobattista, E.

Gene therapy aims at delivering exogenous DNA into the nuclei of target cells to establish expression of a therapeutic protein. Non-viral gene delivery is examined as a safer alternative to viral approaches, but is presently characterized by a low efficiency. In the past years several non-viral
Reference gene selection for quantitative gene expression studies during biological invasions: A test on multiple genes and tissues in a model ascidian Ciona savignyi.

Science.gov (United States)

Huang, Xuena; Gao, Yangchun; Jiang, Bei; Zhou, Zunchun; Zhan, Aibin

2016-01-15

As invasive species have successfully colonized a wide range of dramatically different local environments, they offer a good opportunity to study interactions between species and rapidly changing environments. Gene expression represents one of the primary and crucial mechanisms for rapid adaptation to local environments. Here, we aim to select reference genes for quantitative gene expression analysis based on quantitative Real-Time PCR (qRT-PCR) for a model invasive ascidian, Ciona savignyi. We analyzed the stability of ten candidate reference genes in three tissues (siphon, pharynx and intestine) under two key environmental stresses (temperature and salinity) in the marine realm based on three programs (geNorm, NormFinder and delta Ct method). Our results demonstrated only minor differences in stability rankings among the three methods. The use of different single reference genes might influence the data interpretation, while multiple reference genes could minimize possible errors. Therefore, reference gene combinations were recommended for different tissues - the optimal reference gene combination for siphon was RPS15 and RPL17 under temperature stress, and RPL17, UBQ and TubA under salinity treatment; for pharynx, TubB, TubA and RPL17 were the most stable genes under temperature stress, while TubB, TubA and UBQ were the best under salinity stress; for intestine, UBQ, RPS15 and RPL17 were the most reliable reference genes under both treatments. Our results suggest the necessity of selecting and testing reference genes for different tissues under varying environmental stresses. The results obtained here are expected to reveal mechanisms of gene expression-mediated invasion success using C. savignyi as a model species. Copyright © 2015 Elsevier B.V. All rights reserved.
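Of the three stability programs named above, the delta Ct method is the simplest to sketch. The version below assumes the usual comparative formulation: for every pair of candidate genes, take the per-sample Ct difference, compute its standard deviation across samples, and score each gene by its mean standard deviation against all other candidates (lower means more stable). Function names and the input layout are hypothetical, and this is not the authors' pipeline.

```python
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Score each gene by the mean SD of its Ct differences to the other genes.

    ct: dict mapping gene name -> list of Ct values, one per sample,
        with samples in the same order for every gene.
    """
    genes = list(ct)
    scores = {}
    for g in genes:
        sds = []
        for h in genes:
            if h == g:
                continue
            # Per-sample difference in Ct between the two candidate genes.
            diffs = [a - b for a, b in zip(ct[g], ct[h])]
            sds.append(stdev(diffs))
        scores[g] = mean(sds)
    return scores


def rank_genes(scores):
    """Genes ordered from most stable (lowest score) to least stable."""
    return sorted(scores, key=scores.get)
```

In practice the ranking would be computed separately for each tissue and stress condition, which is how tissue-specific reference gene combinations such as those listed above could arise.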
Analyses of the influencing factors of soil microbial functional gene diversity in tropical rainforest based on GeoChip 5.0

Directory of Open Access Journals (Sweden)

Jing Cong

2015-09-01

Full Text Available To examine soil microbial functional gene diversity and causative factors in tropical rainforests, we used a microarray-based metagenomic tool named GeoChip 5.0 to profile it. We found high microbial functional gene diversity and differing soil microbial metabolic potential for biogeochemical processes in tropical rainforest. Soil available nitrogen was the factor most strongly associated with soil microbial functional gene structure. Here, we mainly describe in detail the experimental design, the data processing, and the soil biogeochemical analyses attached to the study, which could be published in BMC Microbiology in 2015 and whose raw data have been deposited in NCBI's Gene Expression Omnibus (accession number GSE69171).
DTFP-Growth: Dynamic Threshold-Based FP-Growth Rule Mining Algorithm Through Integrating Gene Expression, Methylation, and Protein-Protein Interaction Profiles.

Science.gov (United States)

Mallik, Saurav; Bhadra, Tapas; Mukherji, Ayan

2018-04-01

Association rule mining is an important technique for identifying interesting relationships between gene pairs in a biological data set. Earlier methods basically work for a single biological data set, and, in most cases, a single minimum support cutoff can be applied globally, i.e., across all genesets/itemsets. To overcome this limitation, in this paper, we propose a dynamic threshold-based FP-growth rule mining algorithm that integrates gene expression, methylation and protein-protein interaction profiles based on weighted shortest distance to find novel associations among different pairs of genes in multi-view data sets. For this purpose, we introduce three new thresholds, namely, Distance-based Variable/Dynamic Supports (DVS), Distance-based Variable Confidences (DVC), and Distance-based Variable Lifts (DVL) for each rule by integrating the co-expression, co-methylation, and protein-protein interactions existing in the multi-omics data set. We develop the proposed algorithm utilizing these three novel multiple threshold measures. In the proposed algorithm, the values of DVS, DVC, and DVL are computed for each rule separately, and subsequently it is verified whether the support, confidence, and lift of each evolved rule are greater than or equal to the corresponding individual DVS, DVC, and DVL values, respectively. If all three conditions for a rule are found to be true, the rule is treated as a resultant rule. One of the major advantages of the proposed method compared with other related state-of-the-art methods is that it considers both the quantitative and interactive significance among all pairwise genes belonging to each rule. Moreover, the proposed method generates fewer rules, takes less running time, and provides greater biological significance for the resultant top-ranking rules compared to previous methods.
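The acceptance test described in this abstract (a rule is kept only if its support, confidence, and lift each meet a rule-specific cutoff) can be sketched as below. The abstract does not give the formulas for the distance-based thresholds, so the sketch takes them as inputs; the function names and the transaction layout are hypothetical, and only the standard definitions of support, confidence, and lift are used.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Standard support, confidence and lift of the rule antecedent -> consequent.

    transactions: list of sets of items (e.g. genes flagged in each sample).
    """
    n = len(transactions)
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n_a = sum(a <= t for t in transactions)         # transactions containing the antecedent
    n_c = sum(c <= t for t in transactions)         # transactions containing the consequent
    n_ac = sum((a | c) <= t for t in transactions)  # transactions containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def passes_dynamic_thresholds(metrics, thresholds):
    """Keep a rule only if each metric meets its own rule-specific cutoff.

    thresholds: (min_support, min_confidence, min_lift) for this rule; in the
    abstract these would be the per-rule DVS, DVC and DVL values, which are
    supplied here rather than derived.
    """
    return all(m >= t for m, t in zip(metrics, thresholds))
```

The contrast with a conventional miner is that `thresholds` varies from rule to rule instead of being one global minimum support applied to every itemset.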

Vector for IS element entrapment and functional characterization based on turning on expression of distal promoterless genes.

Science.gov (United States)

Szeverényi, I; Hodel, A; Arber, W; Olasz, F

1996-09-26

We constructed and characterized a novel trap vector for rapid isolation of insertion sequences. The strategy used for the isolation of IS elements is based on the ability of many IS elements to turn on the expression of otherwise silent genes distal to some sites of insertion. The simple transposition of an IS element can sometimes cause the constitutive expression of promoterless antibiotic resistance genes resulting in selectable phenotypes. The trap vector pAW1326 is based on a pBR322 replicon; it carries ampicillin and streptomycin resistance genes, and also silenced genes that confer chloramphenicol and kanamycin resistance once activated. The trap vector pAW1326 proved to be efficient and 85 percent of all isolated mutations were insertions. The majority of IS elements resident in the Escherichia coli strains tested became trapped, namely IS2, IS3, IS5, IS150, IS186 and Tn1000. We also encountered an insertion sequence, called IS10L/R-2, which is a hybrid of the two IS variants IS10L and IS10R. IS10L/R-2 is absent from most E. coli strains, but it is detectable in some strains such as JM109 which had been submitted to Tn10 mutagenesis. The distribution of the insertion sequences within the trap region was not random. Rather, the integration of chromosomal mobile genetic elements into the offered target sequence occurred in element-specific clusters. This is explained both by the target specificity and by the specific requirements for the activation of gene transcription by the DNA rearrangement. The employed trap vector pAW1326 proved to be useful for the isolation of mobile genetic elements, for a demonstration of their transposition activity as well as for the further characterization of some of the functional parameters of transposition.
Gene-based single nucleotide polymorphism markers for genetic and association mapping in common bean.

Science.gov (United States)

Galeano, Carlos H; Cortés, Andrés J; Fernández, Andrea C; Soler, Álvaro; Franco-Herrera, Natalia; Makunde, Godwill; Vanderleyden, Jos; Blair, Matthew W

2012-06-26

In common bean, expressed sequence tags (ESTs) are an underestimated source of gene-based markers such as insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). However, due to the nature of these conserved sequences, detection of markers is difficult and portrays low levels of polymorphism. Therefore, development of intron-spanning EST-SNP markers can be a valuable resource for genetic experiments such as genetic mapping and association studies. In this study, a total of 313 new gene-based markers were developed at target genes. Intronic variation was deeply explored in order to capture more polymorphism. Introns were putatively identified after comparing the common bean ESTs with the soybean genome, and the primers were designed over intron-flanking regions. The intronic regions were evaluated for parental polymorphisms using the single strand conformational polymorphism (SSCP) technique and Sequenom MassARRAY system. A total of 53 new marker loci were placed on an integrated molecular map in the DOR364 × G19833 recombinant inbred line (RIL) population. The new linkage map was used to build a consensus map, merging the linkage maps of the BAT93 × JALO EEP558 and DOR364 × BAT477 populations. A total of 1,060 markers were mapped, with a total map length of 2,041 cM across 11 linkage groups. As a second application of the generated resource, a diversity panel with 93 genotypes was evaluated with 173 SNP markers using the MassARRAY-platform and KASPar technology. These results were coupled with previous SSR evaluations and drought tolerance assays carried out on the same individuals. This agglomerative dataset was examined, in order to discover marker-trait associations, using general linear model (GLM) and mixed linear model (MLM). Some significant associations with yield components were identified, and were consistent with previous findings. In short, this study illustrates the power of intron-based markers for linkage and association mapping in
Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

Science.gov (United States)

Burt, Andrew J; William, H Manilal; Perry, Gregory; Khanal, Raja; Pauls, K Peter; Kelly, James D; Navabi, Alireza

2015-01-01

Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.
Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

Directory of Open Access Journals (Sweden)

Vipin Narang

Full Text Available Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, pagerank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network evaluated based on previously known roles of these genes in MCF-7 biology as well as in their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
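The layered hierarchy mentioned above comes from K-core decomposition, which repeatedly peels away the least-connected nodes. A minimal sketch of the standard peeling procedure follows; it treats the network as undirected, which is a simplifying assumption (a regulatory network is directed), and the function name and input layout are hypothetical rather than taken from the study's software.

```python
def core_numbers(adjacency):
    """Return the core number of every node in an undirected graph.

    adjacency: dict mapping node -> set of neighbouring nodes (symmetric).
    A node's core number is the largest k such that it belongs to a subgraph
    in which every node has at least k neighbours.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adjacency[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

Nodes with the highest core number form the innermost layer, which is where the abstract reports the more influential regulators accumulating. This simple version rescans for the minimum at each step; a bucket-based implementation would be preferable for networks with thousands of genes.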
Independent Gene Discovery and Testing

Science.gov (United States)

Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

2010-01-01

A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…
Evolution by Pervasive Gene Fusion in Antibiotic Resistance and Antibiotic Synthesizing Genes

Directory of Open Access Journals (Sweden)

Orla Coleman

2015-03-01

Full Text Available Phylogenetic (tree-based) approaches to understanding evolutionary history are unable to incorporate convergent evolutionary events where two genes merge into one. In this study, as exemplars of what can be achieved when a tree is not assumed a priori, we have analysed the evolutionary histories of polyketide synthase genes and antibiotic resistance genes and have shown that their history is replete with convergent events as well as divergent events. We demonstrate that the overall histories of these genes more closely resemble the remodelling that might be seen with the children’s toy Lego than the standard model of the phylogenetic tree. This work demonstrates further that genes can act as public goods, available for re-use and incorporation into other genetic goods.
A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

Directory of Open Access Journals (Sweden)

Ruzzo Walter L

2006-03-01

Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
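The core idea, a similarity metric built as a weighted combination of base similarity measures and then used to pick neighbours, can be sketched as follows. The weights are passed in directly here; in the abstract they are inferred by regression, a step this sketch does not reproduce. The confidence value is a plain vote fraction standing in for the paper's voting scheme, and all names are hypothetical.

```python
from collections import Counter


def combined_similarity(base_sims, weights):
    """Weighted sum of the base similarity scores for one pair of genes."""
    return sum(w * s for w, s in zip(weights, base_sims))


def knn_predict(query_sims, labels, weights, k=3):
    """Predict a functional class for a query gene from its k nearest neighbours.

    query_sims: dict mapping labelled gene -> tuple of base similarities to the
                query gene (e.g. one from expression data, one from sequence data).
    labels:     dict mapping labelled gene -> functional class.
    Returns (predicted class, fraction of the k neighbours voting for it).
    """
    ranked = sorted(query_sims,
                    key=lambda g: combined_similarity(query_sims[g], weights),
                    reverse=True)
    neighbours = ranked[:k]
    votes = Counter(labels[g] for g in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)
```

Changing the weights changes which genes count as neighbours, which is why the choice of metric matters so much when the data sources are heterogeneous.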
A genetic ensemble approach for gene-gene interaction identification

Directory of Open Access Journals (Sweden)

Ho Joshua WK

2010-10-01

Full Text Available Abstract Background It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though a considerable effort has been put into developing statistical models and algorithmic strategies for identifying such interactions, the accurate identification of those genetic interactions has proven to be very challenging. Methods In this paper, we propose a new approach for identifying such gene-gene and gene-environment interactions underlying complex diseases. This is a hybrid algorithm and it combines genetic algorithm (GA) and an ensemble of classifiers (called genetic ensemble). Using this approach, the original problem of SNP interaction identification is converted into a data mining problem of combinatorial feature selection. By collecting various single nucleotide polymorphism (SNP) subsets as well as environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. Also considered in this study is the idea of combining identification results obtained from multiple algorithms. A novel formula based on pairwise double fault is designed to quantify the degree of complementarity. Conclusions Our simulation study demonstrates that the proposed genetic ensemble algorithm has comparable identification power to Multifactor Dimensionality Reduction (MDR) and is slightly better than Polymorphism Interaction Analysis (PIA), which are the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by using our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR. Experimental results from our simulation studies and real world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of
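The complementarity formula in this abstract is described only as being based on pairwise double fault, and its exact form is not given. The sketch below therefore shows just the standard double-fault measure it builds on: the fraction of cases that two methods both get wrong. The function name is hypothetical.

```python
def double_fault(y_true, pred_a, pred_b):
    """Fraction of samples misclassified by both methods at once.

    A low value means the two methods rarely fail on the same cases,
    i.e. their results are more complementary.
    """
    both_wrong = sum(1 for t, a, b in zip(y_true, pred_a, pred_b)
                     if a != t and b != t)
    return both_wrong / len(y_true)
```

Two methods with the same individual accuracy can still differ widely in double fault, which is what makes the measure useful when deciding whether combining their results is worthwhile.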
Gene-based technologies for livestock industries in the 3rd millennium

International Nuclear Information System (INIS)

Cunningham, E.P.

2005-01-01

The first complete genome sequence of an organism was for yeast, in 1996. Since then, the much larger task of doing a complete human sequence has been completed. Those of major domestic animals are following rapidly. It will always be impossible to foresee the full potential of such an explosion in knowledge, but aspects of gene-based technologies are already beginning to have an impact in the livestock sector. The first and most obvious area of impact concerns feed supply, which constitutes 50-75 percent of total costs in many livestock systems. Production costs for maize and soybean are being reduced by genetic modification of the crop for herbicide and insect resistance. Maize has been modified to reduce phosphorus and nitrogen excretion in swine and poultry, and also to provide a more valuable amino acid balance. Genetic modification of the animal is also possible. Most dramatically, the insertion of a growth hormone in the DNA of fish accelerates growth. However, in this and all other cases, the genetic modification (GM) of animals has produced profound physiological disturbances. At the same time, the administration of GM-produced growth hormone to dairy cows is now routine in the United States of America and several other countries. This is not permitted in Europe, where the attitude to all GM technologies has been much more cautious. Conventional selection programmes continue to deliver steady genetic improvement in all animal populations. New molecular methods offer the prospect of enhancing genetic gains, particularly for traits that are difficult or expensive to measure, or which have low heritability. Gene technologies have much to contribute to the control of disease in animals. As pressure to reduce antibiotic and drug use increases, genetically modified vaccines with proven specificity and distinguishable from natural infections are already in use. DNA typing is helping with rapid and precise diagnosis. In addition, the interaction of some pathogens
Prediction of lymphatic metastasis based on gene expression profile analysis after brachytherapy for early-stage oral tongue carcinoma

International Nuclear Information System (INIS)

Watanabe, Hiroshi; Mogushi, Kaoru; Miura, Masahiko; Yoshimura, Ryo-ichi; Kurabayashi, Tohru; Shibuya, Hitoshi; Tanaka, Hiroshi; Noda, Shuhei; Iwakawa, Mayumi; Imai, Takashi

2008-01-01

Background and purpose: The management of lymphatic metastasis of early-stage oral tongue carcinoma patients is crucial for its prognosis. The purpose of this study was to evaluate the predictive ability of lymphatic metastasis after brachytherapy (BRT) for early-stage tongue carcinoma based on gene expression profiling. Patients and methods: Pre-therapeutic biopsies from 39 patients with T1 or T2 tongue cancer were analyzed for gene expression signatures using Codelink Uniset Human 20K Bioarray. All patients were treated with low dose-rate BRT for their primary lesions and underwent strict follow-up under a wait-and-see policy for cervical lymphatic metastasis. Candidate genes were selected for predicting lymph-node status in the reference group by the permutation test. Predictive accuracy was further evaluated by the prediction strength (PS) scoring system using an independent validation group. Results: We selected a set of 19 genes whose expression differed significantly between classes with or without lymphatic metastasis in the reference group. The lymph-node status in the validation group was predicted by the PS scoring system with an accuracy of 76%. Conclusions: Gene expression profiling using 19 genes in primary tumor tissues may allow prediction of lymphatic metastasis after BRT for early-stage oral tongue carcinoma
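The abstract states that candidate genes were selected by a permutation test but does not describe the test statistic. As a rough illustration only, the sketch below runs a two-sided permutation test on the difference in mean expression of a single gene between two groups (for example, with and without lymphatic metastasis). The statistic, the function name, and the default settings are all assumptions.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction to avoid a p-value of 0).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```

Applied gene by gene across an array, a procedure like this yields one p-value per gene, and the resulting list still needs a multiple-testing adjustment before a gene set is chosen.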
BEEF CATTLE MUSCULARITY CANDIDATE GENES

Directory of Open Access Journals (Sweden)

Irida Novianti

2010-04-01

Full Text Available Muscularity is a potential indicator for the selection of more productive cattle. Mapping quantitative trait loci (QTL) for traits related to muscularity is useful to identify the genomic regions where the genes affecting muscularity reside. QTL analysis from a Limousin-Jersey double backcross herd was conducted using QTL Express software with cohort and breed as the fixed effects. Nine QTL suggested to have an association with muscularity were identified on cattle chromosomes BTA 1, 2, 3, 4, 5, 8, 12, 14 and 17. The myostatin gene is located at the centromeric end of chromosome 2 and not surprisingly, the Limousin myostatin F94L variant accounted for the QTL on BTA2. However, when the myostatin F94L genotype was included as an additional fixed effect, the QTL on BTA17 was also no longer significant. This result suggests that there may be gene(s) that have epistatic effects with myostatin located on cattle chromosome 17. Based on the position of the QTL in base pairs, all the genes that reside in the region were determined using the Ensembl database (www.ensembl.org). Two potential candidate genes residing within these QTL regions were selected: Smad nuclear interacting protein 1 (SNIP1) and similar to follistatin-like 5 (FSTL5). (JIIPB 2010 Vol 20 No 1: 1-10)
[Phylogenetic analysis of closely related Leuconostoc citreum species based on partial housekeeping genes].

Science.gov (United States)

Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong

2013-07-04

Using the 16S rRNA, dnaA, murC and pyrG gene sequences, we identified the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR methods to amplify the dnaA, murC and pyrG gene sequences, which were determined to assess their suitability as phylogenetic markers. Then, we estimated the genetic distances and constructed phylogenetic trees for the 16S rRNA gene and the three above-mentioned housekeeping genes, in combination with the corresponding published sequences. Comparison of the phylogenetic trees showed that the topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene. The homology of the dnaA, murC, pyrG and 16S rRNA gene sequences among closely related Leu. citreum species differed, ranging from 75.5% to 97.2%, 50.2% to 99.7%, 65.0% to 99.8% and 98.5% to 100%, respectively. The phylogenetic relationships inferred from the three housekeeping gene sequences were highly consistent with those from the 16S rRNA gene sequence, while the genetic distances of these housekeeping genes were much higher than those of the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.
Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models.

Science.gov (United States)

Mahony, Shaun; McInerney, James O; Smith, Terry J; Golden, Aaron

2004-03-05

Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. While its raw accuracy rate can be less than other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
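Relative synonymous codon usage (RSCU), the indicator of protein-coding potential named above, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for one in-frame sequence under the standard genetic code; it is not the RescueNet implementation, the function name is hypothetical, and the Self-Organizing Map step is not shown.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequence):
    """RSCU for each sense codon of every amino acid observed in the sequence.

    sequence: DNA string read in frame from its first base.
    Codons containing characters other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    values = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            # Observed count divided by the count expected under equal usage.
            values[c] = counts[c] * len(family) / total
    return values
```

A vector of such values per candidate open reading frame is the kind of input a clustering method could use to separate several codon-usage patterns within one genome.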
Human DNA repair and recombination genes

International Nuclear Information System (INIS)

Thompson, L.H.; Weber, C.A.; Jones, N.J.

1988-09-01

Several genes involved in mammalian DNA repair pathways were identified by complementation analysis and chromosomal mapping based on hybrid cells. Eight complementation groups of rodent mutants defective in the repair of UV radiation damage are now identified. At least seven of these genes are probably essential for repair and at least six of them control the incision step. The many genes required for repair of DNA cross-linking damage show overlap with those involved in the repair of UV damage, but some of these genes appear to be unique for cross-link repair. Two genes residing on human chromosome 19 were cloned from genomic transformants using a cosmid vector, and near full-length cDNA clones of each gene were isolated and sequenced. Gene ERCC2 efficiently corrects the defect in CHO UV5, a nucleotide excision repair mutant. Gene XRCC1 normalizes repair of strand breaks and the excessive sister chromatid exchange in CHO mutant EM9. ERCC2 shows a remarkable approximately 52% overall homology at both the amino acid and nucleotide levels with the yeast RAD3 gene. Evidence based on mutation induction frequencies suggests that ERCC2, like RAD3, might also be an essential gene for viability. 100 refs., 4 tabs
A family-based association study identified CYP17 as a candidate gene for obesity susceptibility in Caucasians.

Science.gov (United States)

Yan, H; Guo, Y; Yang, T-L; Zhao, L-J; Deng, H-W

2012-08-06

The cytochrome P450c17α gene (CYP17) encodes a key biosynthesis enzyme of estrogen, which is critical in regulating adipogenesis and adipocyte development in humans. We therefore hypothesized that CYP17 is a candidate gene for predicting obesity. In order to test this hypothesis, we performed a family-based association test to investigate the relationship between the CYP17 gene and obesity phenotypes in a large sample comprising 1873 subjects from 405 Caucasian nuclear families of European origin recruited by the Osteoporosis Research Center of Creighton University, USA. Both single SNPs and haplotypes were tested for associations with obesity-related phenotypes, including body mass index (BMI) and fat mass. We identified three SNPs to be significantly associated with BMI, including rs3740397, rs6163, and rs619824. We further characterized the linkage disequilibrium structure for CYP17 and found that the whole CYP17 gene was located in a single-linkage disequilibrium block. This block was observed to be significantly associated with BMI. A major haplotype in this block was significantly associated with both BMI and fat mass. In conclusion, we suggest that the CYP17 gene has an effect on obesity in the Caucasian population. Further independent studies will be needed to confirm our findings.
Analysis of gene expression profiles of soft tissue sarcoma using a combination of knowledge-based filtering with integration of multiple statistics.

Directory of Open Access Journals (Sweden)

Anna Takahashi

Full Text Available The diagnosis and treatment of soft tissue sarcomas (STS) have been difficult. Of the diverse histological subtypes, undifferentiated pleomorphic sarcoma (UPS) is particularly difficult to diagnose accurately, and its classification per se is still controversial. Recent advances in genomic technologies provide an excellent way to address such problems. However, it is often difficult, if not impossible, to identify definitive disease-associated genes using genome-wide analysis alone, primarily because of multiple testing problems. In the present study, we analyzed microarray data from 88 STS patients using a combination method that used knowledge-based filtering and a simulation based on the integration of multiple statistics to reduce multiple testing problems. We identified 25 genes, including hypoxia-related genes (e.g., MIF, SCD1, P4HA1, ENO1, and STAT1) and cell cycle- and DNA repair-related genes (e.g., TACC3, PRDX1, PRKDC, and H2AFY). These genes showed significant differential expression among histological subtypes, including UPS, and showed associations with overall survival. STAT1 showed a strong association with overall survival in UPS patients (logrank p = 1.84 × 10⁻⁶ and adjusted p value 2.99 × 10⁻³ after the permutation test). According to the literature, the 25 genes selected are useful not only as markers of differential diagnosis but also as prognostic/predictive markers and/or therapeutic targets for STS. Our combination method can identify genes that are potential prognostic/predictive factors and/or therapeutic targets in STS and possibly in other cancers. These disease-associated genes deserve further preclinical and clinical validation.
Establishment of a Cre recombinase based mutagenesis protocol for markerless gene deletion in Streptococcus suis.

Science.gov (United States)

Koczula, A; Willenborg, J; Bertram, R; Takamatsu, D; Valentin-Weigand, P; Goethe, R

2014-12-01

The lack of knowledge about pathogenicity mechanisms of Streptococcus (S.) suis is, at least partially, attributed to limited methods for its genetic manipulation. Here, we established a Cre-lox based recombination system for markerless gene deletions in S. suis serotype 2 with high selective pressure and without undesired side effects. Copyright © 2014 Elsevier B.V. All rights reserved.
Dual delivery systems based on polyamine analog BENSpm as prodrug and gene delivery vectors

Science.gov (United States)

Zhu, Yu

Combination drug and gene therapy shows promise in cancer treatment. However, the success of such a strategy requires careful selection of the therapeutic agents, as well as development of efficient delivery vectors. BENSpm (N1, N11-bisethylnorspermine), a polyamine analogue targeting the intracellular polyamine pathway, draws our special attention for the following reasons: (1) the polyamine pathway is frequently dysregulated in cancer; (2) BENSpm exhibits multiple functions to interfere with the polyamine pathway, such as up-regulating polyamine metabolism enzymes and down-regulating polyamine biosynthesis enzymes. BENSpm therefore depletes all natural polyamines and leads to apoptosis and cell growth inhibition in a wide range of cancers; (3) preclinical studies showed that BENSpm can act synergistically with various chemotherapy agents, making it a promising candidate in combination therapy; (4) multiple positive charges in BENSpm make it a suitable building block for cationic polymers, which can be further applied to gene delivery. In this dissertation, our goal was to design a dual-function delivery vector based on BENSpm that can function as a gene delivery vector and, after intracellular degradation, as an active anticancer agent targeting dysregulated polyamine metabolism. We first demonstrated strong synergism between BENSpm and a potential therapeutic gene product, TRAIL. Strong synergism was obtained in both estrogen-dependent MCF-7 breast cancer cells and triple-negative MDA-MB-231 breast cancer cells. Significant dose reduction of TRAIL in combination with BENSpm in MDA-MB-231 cells, together with the fact that BENSpm rendered MCF-7 cells more sensitive to TRAIL treatment, verified our rationale for designing a BENSpm-based delivery platform. This was expected to be beneficial for overcoming drug resistance in chemotherapy, as well as boosting the therapeutic effect of therapeutic genes. We first designed a lipid-based BENSpm dual vector (Lipo
Transgenic Sugarcane Resistant to Sorghum mosaic virus Based on Coat Protein Gene Silencing by RNA Interference

Directory of Open Access Journals (Sweden)

Jinlong Guo

2015-01-01

Full Text Available As one of the critical diseases of sugarcane, sugarcane mosaic disease can lead to a serious decline in stalk yield and sucrose content. It is mainly caused by Potyvirus sugarcane mosaic virus (SCMV) and/or Sorghum mosaic virus (SrMV), with additional differences in viral strains. RNA interference (RNAi) is a novel strategy for producing virus-resistant plants. In this study, based on multiple sequence alignment conducted on genomic sequences of different strains and isolates of SrMV, the conserved region of coat protein (CP) genes was selected as the target gene and an interference sequence 423 bp in length was obtained through PCR amplification. The RNAi vector pGII00-HACP with an expression cassette containing both the hairpin interference sequence and the cp4-epsps herbicide-tolerant gene was transferred to sugarcane cultivar ROC22 via Agrobacterium-mediated transformation. After herbicide screening, PCR molecular identification, and artificial inoculation challenge, anti-SrMV positive transgenic lines were successfully obtained. The SrMV resistance rate of the transgenic lines with the interference sequence was 87.5% based on SrMV challenge by artificial inoculation. The genetically modified SrMV-resistant lines of cultivar ROC22 provide resistant germplasm for breeding lines and can also serve as resistant lines having the same genetic background for study of resistance mechanisms.
Differential gene expression in an elite hybrid rice cultivar (Oryza sativa L.) and its parental lines based on SAGE data

Directory of Open Access Journals (Sweden)

Chen Chen

2007-09-01

Full Text Available Abstract Background It was proposed that differentially-expressed genes, aside from genetic variations affecting protein processing and functioning, between a hybrid and its parents provide essential candidates for studying heterosis or hybrid vigor. Based on our serial analysis of gene expression (SAGE) data from an elite Chinese super-hybrid rice (LYP9) and its parental cultivars (93-11 and PA64s) in three major tissue types (leaves, roots and panicles) at different developmental stages, we analyzed the transcriptome and looked for candidate genes related to rice heterosis. Results By using an improved strategy of tag-to-gene mapping and two recently annotated genome assemblies (93-11 and PA64s), we identified 10,268 additional high-quality tags, reaching a grand total of 20,595 together with our previous result. We further detected 8.5% and 5.9% physically-mapped genes that are differentially-expressed among the triad (in at least one of the three stages) with P-values less than 0.05 and 0.01, respectively. These genes were distributed in 12 major gene expression patterns; among them, 406 up-regulated and 469 down-regulated genes (P Conclusion We improved the tag-to-gene mapping strategy by combining information from transcript sequences and rice genome annotation, and obtained a more comprehensive view of genes related to rice heterosis. The candidates for heterosis-related genes among different genotypes provide a new avenue for exploring the molecular mechanism underlying heterosis.

Combining many interaction networks to predict gene function and analyze gene lists.

Science.gov (United States)

Mostafavi, Sara; Morris, Quaid

2012-05-01

In this article, we review how interaction networks can be used alone or in combination in an automated fashion to provide insight into gene and protein function. We describe the concept of a "gene-recommender system" that can be applied to any large collection of interaction networks to make predictions about gene or protein function based on a query list of proteins that share a function of interest. We discuss these systems in general and focus on one specific system, GeneMANIA, that has unique features and uses different algorithms from the majority of other systems. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
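The "gene-recommender" idea described above can be illustrated with a minimal sketch. It assumes a very simple scheme: networks are merged as a weighted sum of edge strengths, and each candidate gene is scored by its total connection strength to the query list (guilt by association). The fixed weights, the toy networks, and the function names `combine_networks` and `recommend` are hypothetical; this is not GeneMANIA's algorithm, which the review describes as using different methods.

```python
def combine_networks(networks, weights):
    """Merge several networks into one by a weighted sum of edge strengths.

    Each network is a dict mapping an undirected edge (gene_a, gene_b)
    to a non-negative strength. The weights are assumed to be given.
    """
    combined = {}
    for network, weight in zip(networks, weights):
        for (a, b), strength in network.items():
            edge = tuple(sorted((a, b)))  # treat edges as undirected
            combined[edge] = combined.get(edge, 0.0) + weight * strength
    return combined


def recommend(combined, query_genes):
    """Rank non-query genes by total edge strength to the query genes."""
    query = set(query_genes)
    scores = {}
    for (a, b), strength in combined.items():
        if a in query and b not in query:
            scores[b] = scores.get(b, 0.0) + strength
        elif b in query and a not in query:
            scores[a] = scores.get(a, 0.0) + strength
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data (hypothetical): two small networks over genes g1..g4.
coexpression = {("g1", "g2"): 0.9, ("g2", "g3"): 0.4, ("g3", "g4"): 0.8}
physical = {("g1", "g3"): 0.5, ("g2", "g4"): 0.2}

combined = combine_networks([coexpression, physical], [0.7, 0.3])
ranking = recommend(combined, ["g1", "g2"])
```

Here `ranking` lists the genes outside the query in decreasing order of association with it. In a real system the network weights would be learned from the query rather than fixed by hand.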
Whole genome homology-based identification of candidate genes ...

African Journals Online (AJOL)

Josephine Erhiakporeh

2016-07-06

Jul 6, 2016 ... candidate genes for drought tolerance in sesame. (Sesamum ... Our results provided genomic resources for further functional analysis and genetic engineering .... reverse transcribed using the Reverse Transcription System.
A gene-based radiation hybrid map of the gilthead sea bream Sparus aurata refines and exploits conserved synteny with Tetraodon nigroviridis

Directory of Open Access Journals (Sweden)

Tsalavouta Matina

2007-02-01

Full Text Available Abstract Background Comparative teleost studies are of great interest since they are important in aquaculture and in evolutionary issues. Comparing genomes of fully sequenced model fish species with those of farmed fish species through comparative mapping offers shortcuts for quantitative trait loci (QTL) detections and for studying genome evolution through the identification of regions of conserved synteny in teleosts. Here a comparative mapping study is presented by radiation hybrid (RH) mapping genes of the gilthead sea bream Sparus aurata, a non-model teleost fish of commercial and evolutionary interest, as it represents the worldwide distributed species-rich family of Sparidae. Results An additional 74 microsatellite markers and 428 gene-based markers appropriate for comparative mapping studies were mapped on the existing RH map of Sparus aurata. The anchoring of the RH map to the genetic linkage map resulted in 24 groups matching the karyotype of Sparus aurata. Homologous sequences to Tetraodon were identified for 301 of the gene-based markers positioned on the RH map of Sparus aurata. Comparison between Sparus aurata RH groups and Tetraodon chromosomes (karyotype of Tetraodon consists of 21 chromosomes) in this study reveals an unambiguous one-to-one relationship suggesting that three Tetraodon chromosomes correspond to six Sparus aurata radiation hybrid groups. The exploitation of this conserved synteny relationship is furthermore demonstrated by in silico mapping of gilthead sea bream expressed sequence tags (EST) that give a significant similarity hit to Tetraodon. Conclusion The addition of primarily gene-based markers increased substantially the density of the existing RH map and facilitated comparative analysis. The anchoring of this gene-based radiation hybrid map to the genome maps of model species broadened the pool of candidate genes that mainly control growth, disease resistance, sex determination and reversal, reproduction as well
Gene doping in modern sport.

OpenAIRE

MAREK SAWCZUK; AGNIESZKA MACIEJEWSKA; PAWEL CIESZCZYK

2009-01-01

Background: The subject of this paper is gene doping, which should be understood as "the non-therapeutic use of cells, genes, genetic elements, or of the modulation of gene expression, having the capacity to improve athletic performance". Based on a review of the literature and previous research, the authors attempt a broader characterization of gene doping and a discussion of the related potential threats. Methods: This is a comprehensive survey of literature on the latest app...
A Generalized Approach for Measuring Relationships Among Genes.

Science.gov (United States)

Wang, Lijun; Ahsan, Md Asif; Chen, Ming

2017-07-21

Several methods for identifying relationships among pairs of genes have been developed. In this article, we present a generalized approach for measuring relationships between any pair of genes, which is based on statistical prediction. We derive two particular versions of the generalized approach, least squares estimation (LSE) and nearest neighbors prediction (NNP). According to mathematical proof, LSE is equivalent to the methods based on correlation, and NNP approximates one popular method, the maximal information coefficient (MIC), according to its performance in simulations and on a real dataset. Moreover, the approach based on statistical prediction can be extended from two-gene relationships to multi-gene relationships. This extension would help to identify relationships among multiple genes.
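The stated equivalence between LSE and correlation-based methods can be checked numerically with a minimal sketch. It assumes the prediction-based score is the fraction of variance in one gene explained by a straight-line least-squares fit on the other; that scoring choice, the toy expression values, and the function names are assumptions for illustration, not details taken from the article. Under that assumption the score equals the squared Pearson correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lse_r_squared(x, y):
    """Variance in y explained by the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical expression levels of two genes across five samples.
gene_a = [1.0, 2.1, 2.9, 4.2, 5.1]
gene_b = [2.0, 2.7, 4.1, 4.9, 6.3]

prediction_score = lse_r_squared(gene_a, gene_b)
correlation_score = pearson_r(gene_a, gene_b) ** 2
```

The two scores agree up to floating-point error, which is the sense in which a linear least-squares predictor carries the same information as correlation. A nonlinear predictor such as nearest neighbors would not reduce to correlation in this way.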
Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

Directory of Open Access Journals (Sweden)

Andrew J Burt

Full Text Available Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08), where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases.
Prediction of drug efficacy for cancer treatment based on comparative analysis of chemosensitivity and gene expression data

DEFF Research Database (Denmark)

Wan, Peng; Li, Qiyuan; Larsen, Jens Erik Pontoppidan

2012-01-01

The NCI60 database is the largest available collection of compounds with measured anti-cancer activity. The strengths and limitations for using the NCI60 database as a source of new anti-cancer agents are explored and discussed in relation to previous studies. We selected a sub-set of 2333...... and in a data set of expression profiles of 1901 genes for the corresponding tumor cell lines. Five clusters were identified based on the gene expression data using self-organizing maps (SOM), comprising leukemia, melanoma, ovarian and prostate, basal breast, and luminal breast cancer cells, respectively....... The strong difference in gene expression between basal and luminal breast cancer cells was reflected clearly in the chemosensitivity data. Although most compounds in the data set were of low potency, high efficacy compounds that showed specificity with respect to tissue of origin could be found. Furthermore...
KMgene: a unified R package for gene-based association analysis for complex traits.

Science.gov (United States)

Yan, Qi; Fang, Zhou; Chen, Wei; Stegle, Oliver

2018-02-09

In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model (GLMM) framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene. Availability: http://cran.r-project.org/web/packages/KMgene. Contact: qi.yan@chp.edu or wei.chen@chp.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.
Mesenchymal stem cell-based NK4 gene therapy in nude mice bearing gastric cancer xenografts

Directory of Open Access Journals (Sweden)

Zhu Y

2014-12-01

tissues after systemic injection. The microvessel density of tumor xenografts was decreased, and tumor cellular apoptosis was significantly induced in the mice treated with MSCs-NK4 compared to control mice. These findings demonstrate that MSC-based NK4 gene therapy can obviously inhibit the growth of gastric cancer xenografts, and MSCs are a better vehicle for NK4 gene therapy than lentiviral vectors. Further studies are warranted to explore the efficacy and safety of the MSC-based NK4 gene therapy in animals and cancer patients. Keywords: gastric cancer, gene therapy, tumor xenograft, hepatocyte growth factor, lentivirus, angiogenesis, apoptosis
Amino acid-substituted gemini surfactant-based nanoparticles as safe and versatile gene delivery agents.

Science.gov (United States)

Singh, Jagbir; Yang, Peng; Michel, Deborah; Verrall, Ronald E; Foldvari, Marianna; Badea, Ildiko

2011-05-01

Gene based therapy represents an important advance in the treatment of diseases that heretofore have had either no treatment or cure. To capitalize on the true potential of gene therapy, there is a need to develop better delivery systems that can protect these therapeutic biomolecules and deliver them safely to the target sites. Recently, we have designed and developed a series of novel amino acid-substituted gemini surfactants with the general chemical formula C₁₂H₂₅(CH₃)₂N⁺-(CH₂)₃-N(AA)-(CH₂)₃-N⁺(CH₃)₂-C₁₂H₂₅ (AA = glycine, lysine, glycyl-lysine and lysyl-lysine). These compounds were synthesized and tested in rabbit epithelial cells using a model plasmid and a helper lipid. Plasmid/gemini/lipid (P/G/L) nanoparticles formulated using these novel compounds achieved higher gene expression than the nanoparticles containing the parent unsubstituted compound. In this study, we evaluated the cytotoxicity of P/G/L nanoparticles and explored the relationship between transfection efficiency/toxicity and their physicochemical characteristics (such as size, binding properties, etc.). An overall low toxicity is observed for all complexes with no significant difference among substituted and unsubstituted compounds. An interesting result revealed by the dye exclusion assay suggests a more balanced protection of the DNA by the glycine and glycyl-lysine substituted compounds. Thus, the higher transfection efficiency is attributed to the greater biocompatibility and flexibility of the amino acid/peptide-substituted gemini surfactants and demonstrates the feasibility of using amino acid-substituted gemini surfactants as gene carriers for the treatment of diseases affecting epithelial tissue.
Bayesian assignment of gene ontology terms to gene expression experiments

Science.gov (United States)

Sykacek, P.

2012-01-01

Motivation: Gene expression assays allow for genome scale analyses of molecular biological mechanisms. State-of-the-art data analysis provides lists of involved genes, either by calculating significance levels of mRNA abundance or by Bayesian assessments of gene activity. A common problem of such approaches is the difficulty of interpreting the biological implication of the resulting gene lists. This led to an increased interest in methods for inferring high-level biological information. A common approach for representing high level information is by inferring gene ontology (GO) terms which may be attributed to the expression data experiment. Results: This article proposes a probabilistic model for GO term inference. Modelling assumes that gene annotations to GO terms are available and gene involvement in an experiment is represented by posterior probabilities over gene-specific indicator variables. Such probability measures result from many Bayesian approaches for expression data analysis. The proposed model combines these indicator probabilities in a probabilistic fashion and provides a probabilistic GO term assignment as a result. Experiments on synthetic and microarray data suggest that advantages of the proposed probabilistic GO term inference over statistical test-based approaches are in particular evident for sparsely annotated GO terms and in situations of large uncertainty about gene activity. Provided that appropriate annotations exist, the proposed approach is easily applied to inferring other high level assignments like pathways. Availability: Source code under GPL license is available from the author. Contact: peter.sykacek@boku.ac.at PMID:22962488
Bayesian assignment of gene ontology terms to gene expression experiments.

Science.gov (United States)

Sykacek, P

2012-09-15

Gene expression assays allow for genome scale analyses of molecular biological mechanisms. State-of-the-art data analysis provides lists of involved genes, either by calculating significance levels of mRNA abundance or by Bayesian assessments of gene activity. A common problem of such approaches is the difficulty of interpreting the biological implication of the resulting gene lists. This led to an increased interest in methods for inferring high-level biological information. A common approach for representing high level information is by inferring gene ontology (GO) terms which may be attributed to the expression data experiment. This article proposes a probabilistic model for GO term inference. Modelling assumes that gene annotations to GO terms are available and gene involvement in an experiment is represented by posterior probabilities over gene-specific indicator variables. Such probability measures result from many Bayesian approaches for expression data analysis. The proposed model combines these indicator probabilities in a probabilistic fashion and provides a probabilistic GO term assignment as a result. Experiments on synthetic and microarray data suggest that advantages of the proposed probabilistic GO term inference over statistical test-based approaches are in particular evident for sparsely annotated GO terms and in situations of large uncertainty about gene activity. Provided that appropriate annotations exist, the proposed approach is easily applied to inferring other high level assignments like pathways. Source code under GPL license is available from the author. peter.sykacek@boku.ac.at.
Discovering gene annotations in biomedical text databases

Directory of Open Access Journals (Sweden)

Ozsoyoglu Gultekin

2008-03-01

Full Text Available Abstract Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate
Variability in Dopamine Genes Dissociates Model-Based and Model-Free Reinforcement Learning.

Science.gov (United States)

Doll, Bradley B; Bath, Kevin G; Daw, Nathaniel D; Frank, Michael J

2016-01-27

Considerable evidence suggests that multiple learning systems can drive behavior. Choice can proceed reflexively from previous actions and their associated outcomes, as captured by "model-free" learning algorithms, or flexibly from prospective consideration of outcomes that might occur, as captured by "model-based" learning algorithms. However, differential contributions of dopamine to these systems are poorly understood. Dopamine is widely thought to support model-free learning by modulating plasticity in striatum. Model-based learning may also be affected by these striatal effects, or by other dopaminergic effects elsewhere, notably on prefrontal working memory function. Indeed, prominent demonstrations linking striatal dopamine to putatively model-free learning did not rule out model-based effects, whereas other studies have reported dopaminergic modulation of verifiably model-based learning, but without distinguishing a prefrontal versus striatal locus. To clarify the relationships between dopamine, neural systems, and learning strategies, we combine a genetic association approach in humans with two well-studied reinforcement learning tasks: one isolating model-based from model-free behavior and the other sensitive to key aspects of striatal plasticity. Prefrontal function was indexed by a polymorphism in the COMT gene, differences of which reflect dopamine levels in the prefrontal cortex. This polymorphism has been associated with differences in prefrontal activity and working memory. Striatal function was indexed by a gene coding for DARPP-32, which is densely expressed in the striatum where it is necessary for synaptic plasticity. We found evidence for our hypothesis that variations in prefrontal dopamine relate to model-based learning, whereas variations in striatal dopamine function relate to model-free learning. Decisions can stem reflexively from their previously associated outcomes or flexibly from deliberative consideration of potential choice outcomes
AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.

Science.gov (United States)

Zhao, X G; Dai, W; Li, Y; Tian, L

2011-11-01

The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely due to the fact that the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We have proposed novel AUC-based statistical ensemble methods for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function by a convex surrogate loss function, whose minimizer can be efficiently identified. With the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new methods to the existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores to differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset have been satisfactory. Aiming at directly maximizing AUC, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful in high-dimensional settings. Contact: lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
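The replacement of the AUC objective by a convex surrogate can be made concrete with a minimal sketch. The abstract does not say which surrogate the authors use, so the pairwise hinge loss below is an assumption, as are the toy biomarker values, the weights, and the function names. The empirical AUC is the fraction of case/control pairs ranked correctly, a step function of the weights; the hinge loss is a convex upper bound on the fraction of misranked pairs.

```python
def linear_score(weights, biomarkers):
    """Combine one subject's biomarkers into a single score."""
    return sum(w * b for w, b in zip(weights, biomarkers))


def empirical_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


def pairwise_hinge_loss(case_scores, control_scores):
    """Mean of max(0, 1 - (case - control)) over all pairs.

    Convex in the weights of a linear score, and never smaller than
    1 - empirical_auc, so lowering it pushes the AUC upward.
    """
    total = 0.0
    for case in case_scores:
        for control in control_scores:
            total += max(0.0, 1.0 - (case - control))
    return total / (len(case_scores) * len(control_scores))


# Hypothetical data: two biomarkers per subject, fixed illustrative weights.
cases = [[2.0, 1.0], [1.5, 0.5], [0.2, 2.0]]
controls = [[0.5, 0.2], [1.0, 1.5], [0.1, 0.1]]
weights = [1.0, 0.5]

case_scores = [linear_score(weights, subject) for subject in cases]
control_scores = [linear_score(weights, subject) for subject in controls]

auc = empirical_auc(case_scores, control_scores)
loss = pairwise_hinge_loss(case_scores, control_scores)
```

In practice the weights would be chosen by minimizing the surrogate, optionally with a lasso penalty for feature selection as the abstract mentions; the sketch only evaluates both quantities at fixed weights.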
Dynamic association rules for gene expression data analysis.

Science.gov (United States)

Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung

2015-10-14

The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes corresponds to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve the gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm), which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, the Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis. Four gene expression datasets showed that the proposed
Heterologous Reconstitution of the Intact Geodin Gene Cluster in Aspergillus nidulans through a Simple and Versatile PCR Based Approach

DEFF Research Database (Denmark)

Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere

2013-01-01

was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were...... of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus...... successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor...
High rate of translocation-based gene birth on the Drosophila Y chromosome.

Science.gov (United States)

Tobler, Ray; Nolte, Viola; Schlötterer, Christian

2017-10-31

The Y chromosome is a unique genetic environment defined by a lack of recombination and male-limited inheritance. The Drosophila Y chromosome has been gradually acquiring genes from the rest of the genome, with only seven Y-linked genes being gained over the past 63 million years (0.12 gene gains per million years). Using a next-generation sequencing (NGS)-powered genomic scan, we show that gene transfers to the Y chromosome are much more common than previously suspected: at least 25 have arisen across three Drosophila species over the past 5.4 million years (1.67 per million years for each lineage). The gene transfer rate is significantly lower in Drosophila melanogaster than in the Drosophila simulans clade, primarily due to Y-linked retrotranspositions being significantly more common in the latter. Despite all Y-linked gene transfers being evolutionarily recent (Drosophila Y chromosome to be more dynamic than previously appreciated. Our analytical method provides a powerful means to identify Y-linked gene transfers and will help illuminate the evolutionary dynamics of the Y chromosome in Drosophila and other species. Copyright © 2017 the Author(s). Published by PNAS.
Mannosylated Chitosan Nanoparticles Based Macrophage-Targeting Gene Delivery System Enhanced Cellular Uptake and Improved Transfection Efficiency.

Science.gov (United States)

Peng, Yixing; Yao, Wenjun; Wang, Bo; Zong, Li

2015-04-01

Gene transfer mediated by mannosylated chitosan (MCS) is a safe and promising approach for gene and vaccine delivery. An MCS nanoparticle-based gene delivery system showed high in vivo delivery efficiency and elicited strong immune responses in mice. However, little was known about the cell binding, transfection efficiency and intracellular trafficking of MCS nanoparticles. In this study, using gastrin-releasing peptide as a model plasmid (pGRP), the binding of MCS/pGRP nanoparticles to macrophages and the intracellular trafficking of MCS/pGRP nanoparticles in macrophages were investigated. MCS-mediated transfection efficiency in macrophages was also evaluated using pGL-3 as a reporter gene. The results showed that the binding and transfection efficiency of MCS nanoparticles in macrophages was higher than that of CS, which was attributed to the interaction between mannose ligands in MCS and mannose receptors on the surface of macrophages. Observation with a confocal laser scanning microscope indicated that the cellular uptake of MCS/pGRP nanoparticles was greater than that of CS/pGRP nanoparticles in macrophages. MCS/pGRP nanoparticles were taken up by macrophages and most of them were entrapped in endosomal/lysosomal compartments. After the nanoparticles escaped from endosomal/lysosomal compartments, naked pGRP entered the nucleus, and a few MCS might enter the nucleus in the form of nanoparticles. Overall, MCS has the potential to be an excellent macrophage-targeting gene delivery carrier.
Models of gene gain and gene loss for probabilistic reconstruction of gene content in the last universal common ancestor of life.

Science.gov (United States)

Kannan, Lavanya; Li, Hua; Rubinstein, Boris; Mushegian, Arcady

2013-12-19

The problem of probabilistic inference of gene content in the last common ancestor of several extant species with completely sequenced genomes is: for each gene that is conserved in all or some of the genomes, assign the probability that its ancestral gene was present in the genome of their last common ancestor. We have developed a family of models of gene gain and gene loss in evolution, and applied the maximum-likelihood approach that uses phylogenetic tree of prokaryotes and the record of orthologous relationships between their genes to infer the gene content of LUCA, the Last Universal Common Ancestor of all currently living cellular organisms. The crucial parameter, the ratio of gene losses and gene gains, was estimated from the data and was higher in models that take account of the number of in-paralogs in genomes than in models that treat gene presences and absences as a binary trait. While the numbers of genes that are placed confidently into LUCA are similar in the ML methods and in previously published methods that use various parsimony-based approaches, the identities of genes themselves are different. Most of the models of either kind treat the genes found in many existing genomes in a similar way, assigning to them high probabilities of being ancestral ("high ancestrality"). The ML models are more likely than others to assign high ancestrality to the genes that are relatively rare in the present-day genomes.
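The kind of calculation described in this abstract can be illustrated with a minimal sketch. It assumes the simplest case mentioned above, where gene presence/absence is a binary trait that evolves along a known tree with constant gain and loss rates, and it computes the probability that the gene was present at the root. The tree encoding, the rate values and the function names are hypothetical choices made for illustration; they are not taken from the paper, and the paper's own models (including those that count in-paralogs) are not reproduced here.

```python
import math

def transition_matrix(gain, loss, t):
    """2x2 transition probabilities for a two-state (0=absent, 1=present)
    continuous-time Markov chain run for branch length t."""
    total = gain + loss
    decay = math.exp(-total * t)
    p01 = gain / total * (1.0 - decay)   # absent -> present
    p10 = loss / total * (1.0 - decay)   # present -> absent
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def conditional_likelihood(node, gain, loss):
    """Felsenstein pruning. A node is either an int (observed leaf state)
    or a list of (child, branch_length) pairs (hypothetical encoding)."""
    if isinstance(node, int):
        return [1.0 if node == s else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, t in node:
        p = transition_matrix(gain, loss, t)
        child_like = conditional_likelihood(child, gain, loss)
        for s in (0, 1):
            like[s] *= p[s][0] * child_like[0] + p[s][1] * child_like[1]
    return like

def root_presence_probability(tree, gain, loss):
    """Posterior probability that the gene was present at the root,
    using the stationary distribution as the root prior (an assumption)."""
    like = conditional_likelihood(tree, gain, loss)
    prior_present = gain / (gain + loss)
    num = prior_present * like[1]
    den = num + (1.0 - prior_present) * like[0]
    return num / den

# Toy three-leaf tree: the gene is present in two sister genomes, absent in the third.
toy_tree = [([(1, 1.0), (1, 1.0)], 0.5), (0, 1.5)]
print(root_presence_probability(toy_tree, gain=0.2, loss=0.6))
```

In a maximum-likelihood setting the gain and loss rates would themselves be estimated by maximizing the summed log-likelihood over all gene families; the sketch fixes them by hand to keep the example short.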

GeneTrailExpress: a web-based pipeline for the statistical evaluation of microarray experiments

Directory of Open Access Journals (Sweden)

Kohlbacher Oliver

2008-12-01

Full Text Available Background: High-throughput methods that allow for measuring the expression of thousands of genes or proteins simultaneously have opened new avenues for studying biochemical processes. While the noisiness of the data necessitates an extensive pre-processing of the raw data, the high dimensionality requires effective statistical analysis methods that facilitate the identification of crucial biological features and relations. For these reasons, the evaluation and interpretation of expression data is a complex, labor-intensive multi-step process. While a variety of tools for normalizing, analysing, or visualizing expression profiles has been developed in recent years, most of these tools offer only functionality for accomplishing certain steps of the evaluation pipeline. Results: Here, we present a web-based toolbox that provides rich functionality for all steps of the evaluation pipeline. Our tool GeneTrailExpress offers, besides standard normalization procedures, powerful statistical analysis methods for studying a large variety of biological categories and pathways. Furthermore, an integrated graph visualization tool, BiNA, enables the user to draw the relevant biological pathways applying cutting-edge graph-layout algorithms. Conclusion: Our gene expression toolbox with its interactive visualization of the pathways and the expression values projected onto the nodes will simplify the analysis and interpretation of biochemical pathways considerably.
Use of reporter-gene based bacteria to quantify phenanthrene biodegradation and toxicity in soil

Energy Technology Data Exchange (ETDEWEB)

Shin, Doyun [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Moon, Hee Sun [School of Earth and Environmental Science, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of); Lin, Chu-Ching; Barkay, Tamar [Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Drive, New Brunswick, NJ 08901 (United States); Nam, Kyoungphile, E-mail: kpnam@snu.ac.k [Department of Civil and Environmental Engineering, Seoul National University, Gwanakno 599, Seoul 151-742 (Korea, Republic of)

2011-02-15

A phenanthrene-degrading bacterium, Sphingomonas paucimobilis EPA505 was used to construct two fluorescence-based reporter strains. Strain D harboring gfp gene was constructed to generate green fluorescence when the strain started to biodegrade phenanthrene. Strain S possessing gef gene was designed to die once phenanthrene biodegradation was initiated and thus to lose green fluorescence when visualized by a live/dead cell staining. Confocal laser scanning microscopic observation followed by image analysis demonstrates that the fluorescence intensity generated by strain D increased and the intensity by strain S decreased linearly at the phenanthrene concentration of up to 200 mg/L. Such quantitative increase and decrease of fluorescence intensity in strain D (i.e., from 1 to 11.90 ± 0.72) and strain S (from 1 to 0.40 ± 0.07) were also evident in the presence of Ottawa sand spiked with the phenanthrene up to 1000 mg/kg. The potential use of the reporter strains in quantitatively determining biodegradable or toxic phenanthrene was discussed. - Research highlights: A novel reporter bacterial strain has been developed. The bacterium can quantitatively determine the change in fluorescence intensity. The intensity can represent the bioavailable phenanthrene in solid matrix. - A cell-killing gene harboring reporter bacterium shows phenanthrene toxicity.
A Cas9-based toolkit to program gene expression in Saccharomyces cerevisiae

DEFF Research Database (Denmark)

Apel, Amanda Reider; d'Espaux, Leo; Wehrs, Maren

2017-01-01

... of these parts via a web-based tool that automates the generation of DNA fragments for integration. Our system builds upon existing gene editing methods in the thoroughness with which the parts are standardized and characterized, the types and number of parts available and the ease with which our methodology can be used to perform genetic edits in yeast. We demonstrated the applicability of this toolkit by optimizing the expression of a challenging but industrially important enzyme, taxadiene synthase (TXS). This approach enabled us to diagnose an issue with TXS solubility, the resolution of which yielded...
Apoptosis Gene Information System--AGIS.

Science.gov (United States)

Sakharkar, Kishore R; Clement, Marie V; Chow, Vincent T K; Pervaiz, Shazib

2006-05-01

Genes implicated in apoptosis have great relevance to biology, medicine and oncology. Here, we describe a unique resource, Apoptosis Gene Information System (AGIS) that provides data for over 2400 genes involved directly or indirectly, in apoptotic pathways of more than 350 different organisms. The organization of this information system is based on the principle of one-gene, one record. AGIS will be updated on a six monthly basis as new information becomes available. AGIS can be accessed at: http://www.cellfate.org/AGIS/.
Reporter gene imaging: potential impact on therapy

International Nuclear Information System (INIS)

Serganova, Inna; Blasberg, Ronald

2005-01-01

Positron emission tomography (PET)-based molecular-genetic imaging in living organisms has enjoyed exceptional growth over the past 5 years; this is particularly striking since it has been identified as a new discipline only within the past decade. Positron emission tomography is one of three imaging technologies (nuclear, magnetic resonance and optical) that have begun to incorporate methods that are established in molecular and cell biology research. The convergence of these disciplines and the wider application of multi-modality imaging are at the heart of this success story. Most current molecular-genetic imaging strategies are 'indirect,' coupling a 'reporter gene' with a complementary 'reporter probe.' Reporter gene constructs can be driven by constitutive promoter elements and used to monitor gene therapy vectors and the efficacy of transgene targeting and transduction, as well as to monitor adoptive cell-based therapies. Inducible promoters can be used as 'sensors' to regulate the magnitude of reporter gene expression and can be used to provide information about endogenous cell processes. Reporter systems can also be constructed to monitor mRNA stabilization and specific protein-protein interactions. Promoters can be cell specific and restrict transgene expression to certain tissue and organs. The translation of reporter gene imaging to specific clinical applications is discussed. Several examples that have potential for patient imaging studies in the near future include monitoring adenoviral-based gene therapy, oncolytic herpes virus therapy, adoptive cell-based therapies and Salmonella-based tumor-targeted cancer therapy and imaging. The primary translational applications of noninvasive in vivo reporter gene imaging are likely to be (a) quantitative monitoring of the gene therapy vector and the efficacy of transduction in clinical protocols, by imaging the location, extent and duration of transgene expression; (b) monitoring cell trafficking, targeting
Gene therapy in periodontics.

Science.gov (United States)

Chatterjee, Anirban; Singh, Nidhi; Saluja, Mini

2013-03-01

Genes are made of DNA, the code of life. DNA is built from two types of base pair, A-T and G-C, which differ in their number of hydrogen bonds and whose sequence can be read as instructions. Everyone inherits genes from their parents and passes them on in turn to their children. Every person's genes are different, and the changes in sequence determine the inherited differences between each of us. Some changes, usually in a single gene, may cause serious diseases. Gene therapy is 'the use of genes as medicine'. It involves the transfer of a therapeutic or working gene copy into specific cells of an individual in order to repair a faulty gene copy. Thus it may be used to replace a faulty gene, or to introduce a new gene whose function is to cure or to favorably modify the clinical course of a condition. It holds promise in the field of periodontics, where gene therapy has been used as a mode of tissue engineering. The tissue engineering approach reconstructs the natural target tissue by combining four elements, namely scaffold, signaling molecules, cells and blood supply, and thus can help in the reconstruction of damaged periodontium including cementum, gingiva, periodontal ligament and bone.
The Gene-Lifestyle Interaction on Leptin Sensitivity and Lipid Metabolism in Adults: A Population Based Study.

Science.gov (United States)

Luglio, Harry Freitag; Sulistyoningrum, Dian Caturini; Huriyati, Emy; Lee, Yi Yi; Wan Muda, Wan Abdul Manan

2017-07-07

Obesity has been associated with leptin resistance and this might be caused by genetic factors. The aim of this study was to investigate the gene-lifestyle interaction between -866G/A UCP2 (uncoupling protein 2) gene polymorphism, dietary intake and leptin in a population based study. This is a cross sectional study conducted in adults living in an urban area of Yogyakarta, Indonesia. Data on adiposity, lifestyle, triglyceride, high density lipoprotein (HDL) cholesterol, leptin and UCP2 gene polymorphism were obtained in 380 male and female adults. UCP2 gene polymorphism was not significantly associated with adiposity, leptin, triglyceride, HDL cholesterol, dietary intake and physical activity (all p > 0.05). Leptin was lower in overweight subjects with AA + GA genotypes than in their GG genotype counterparts (p = 0.029). In subjects with AA + GA genotypes there was a negative correlation between leptin concentration (r = -0.324; p [...] correlation was not seen in GG genotype (r = -0.111; p = 0.188). In summary, we showed how genetic variation in -866G/A UCP2 affected individual response to leptin production. AA + GA genotype had a better leptin sensitivity shown by its response in dietary intake and body mass index (BMI) and this explained the protective effect of A allele to obesity.
Accurate, model-based tuning of synthetic gene expression using introns in S. cerevisiae.

Directory of Open Access Journals (Sweden)

Ido Yofe

2014-06-01

Full Text Available Introns are key regulators of eukaryotic gene expression and present a potentially powerful tool for the design of synthetic eukaryotic gene expression systems. However, intronic control over gene expression is governed by a multitude of complex, incompletely understood regulatory mechanisms. Despite this lack of detailed mechanistic understanding, here we show how a relatively simple model enables accurate and predictable tuning of synthetic gene expression systems in yeast using several predictive intron features such as transcript folding and sequence motifs. Using only natural Saccharomyces cerevisiae introns as regulators, we demonstrate fine and accurate control over gene expression spanning a 100-fold expression range. These results broaden the engineering toolbox of synthetic gene expression systems and provide a framework in which precise and robust tuning of gene expression is accomplished.
Gene doping.

Science.gov (United States)

Haisma, H J; de Hon, O

2006-04-01

Together with the rapidly increasing knowledge of genetic therapies as a promising new branch of regular medicine, the issue has arisen whether these techniques might be abused in the field of sports. Previous experiences have shown that drugs that are still in the experimental phases of research may find their way into the athletic world. Both the World Anti-Doping Agency (WADA) and the International Olympic Committee (IOC) have expressed concerns about this possibility. As a result, the method of gene doping has been included in the list of prohibited classes of substances and prohibited methods. This review addresses the possible ways in which knowledge gained in the field of genetic therapies may be misused in elite sports. Many genes are readily available which may potentially have an effect on athletic performance. The sporting world will eventually be faced with the phenomenon of gene doping to improve athletic performance. A combination of developing detection methods based on gene arrays or proteomics and a clear education program on the associated risks seems to be the most promising preventive method to counteract the possible application of gene doping.
Recent Trends of Polymer Mediated Liposomal Gene Delivery System

Directory of Open Access Journals (Sweden)

Shyamal Kumar Kundu

2014-01-01

Full Text Available Advances in gene delivery systems have resulted in clinical successes in gene therapy for patients with several genetic diseases, such as immunodeficiency diseases, X-linked adrenoleukodystrophy (X-ALD), blindness, thalassemia, and many more. Among various delivery systems, the liposome-mediated gene delivery route offers great promise for gene therapy. This review attempts to give an overview of polymer-based liposomal gene delivery systems and their future applications. Herein, we discuss in detail the characteristics of liposomes, the importance of polymers for liposome formulation, gene delivery, and the future direction of liposome-based gene delivery as a whole.
Development of novel recombinant biomimetic chimeric MPG-based peptide as nanocarriers for gene delivery: Imitation of a real cargo.

Science.gov (United States)

Majidi, Asia; Nikkhah, Maryam; Sadeghian, Faranak; Hosseinkhani, Saman

2016-10-01

In recent decades, great effort has been devoted to developing recombinant peptide-based vectors that consist of biological motifs with potential applications in gene therapy. Recombinant Biomimetic Chimeric Vectors (rBCVs) are biopolymeric nanocarriers that are designed to mimic viral features to overcome the cellular obstacles in the gene transfer pathway into the cell nucleus. In this research, we designed and genetically engineered three novel rBCVs with similar sequences that differed in motif arrangement and motif abundance: MPG-2H1, 2TMPG-2H1 and 2RMPG-2H1. MPG, a well-known amphipathic cell penetrating peptide, is the main segment of these constructs and was studied for the first time in association with a truncated histone H1 DNA condensing motif. Through several physicochemical and biological assays, the rBCVs were extensively examined regarding transfection efficiency. The main objectives of this study were to examine the importance of motif design in the transfection efficiency of rBCVs, and to assess the correlation between structural features and functionality of motifs. The results revealed that all three kinds of rBCVs/pDNA nanoparticles, with average sizes of 200 nm, could overcome the cellular obstacles associated with gene transfer and lead to efficient gene delivery. Furthermore, no significant toxicity was observed and efficient endosome disruptive activity was obtained. Notably, among the three constructs, 2RMPG-2H1 showed the highest transfection efficiency. Overall, the peptide-based vectors hold great promise as nontoxic and effective gene carriers in vitro and in vivo, with the possibility of rational design as their most important advantage over other non-viral gene delivery vectors. Copyright © 2016 Elsevier B.V. All rights reserved.
Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

Science.gov (United States)

Golden, Joseph W; Josleyn, Matthew; Mucker, Eric M; Hung, Chien-Fu; Loudon, Peter T; Wu, T C; Hooper, Jay W

2012-01-01

Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.
Side-by-side comparison of gene-based smallpox vaccine with MVA in nonhuman primates.

Directory of Open Access Journals (Sweden)

Joseph W Golden

Full Text Available Orthopoxviruses remain a threat as biological weapons and zoonoses. The licensed live-virus vaccine is associated with serious health risks, making its general usage unacceptable. Attenuated vaccines are being developed as alternatives, the most advanced of which is modified-vaccinia virus Ankara (MVA). We previously developed a gene-based vaccine, termed 4pox, which targets four orthopoxvirus antigens, A33, B5, A27 and L1. This vaccine protects mice and non-human primates from lethal orthopoxvirus disease. Here, we investigated the capacity of the molecular adjuvants GM-CSF and Escherichia coli heat-labile enterotoxin (LT) to enhance the efficacy of the 4pox gene-based vaccine. Both adjuvants significantly increased protective antibody responses in mice. We directly compared the 4pox plus LT vaccine against MVA in a monkeypox virus (MPXV) nonhuman primate (NHP) challenge model. NHPs were vaccinated twice with MVA by intramuscular injection or the 4pox/LT vaccine delivered using a disposable gene gun device. As a positive control, one NHP was vaccinated with ACAM2000. NHPs vaccinated with each vaccine developed anti-orthopoxvirus antibody responses, including those against the 4pox antigens. After MPXV intravenous challenge, all control NHPs developed severe disease, while the ACAM2000 vaccinated animal was well protected. All NHPs vaccinated with MVA were protected from lethality, but three of five developed severe disease and all animals shed virus. All five NHPs vaccinated with 4pox/LT survived and only one developed severe disease. None of the 4pox/LT-vaccinated animals shed virus. Our findings show, for the first time, that a subunit orthopoxvirus vaccine delivered by the same schedule can provide a degree of protection at least as high as that of MVA.
Gene-based vaccine development for improving animal production in developing countries. Possibilities and constraints

International Nuclear Information System (INIS)

Egerton, J.R.

2005-01-01

For vaccine production, recombinant antigens must be protective. Identifying protective antigens or candidate antigens is an essential precursor to vaccine development. Even when a protective antigen has been identified, cloning of its gene does not lead directly to vaccine development. The fimbrial protein of Dichelobacter nodosus, the agent of foot-rot in ruminants, was known to be protective. Recombinant vaccines against this infection are ineffective if expressed protein subunits are not assembled as mature fimbriae. Antigenic competition between different, but closely related, recombinant antigens limited the use of multivalent vaccines based on this technology. Recombinant antigens may need adjuvants to enhance response. DNA vaccines, potentiated with genes for different cytokines, may replace the need for aggressive adjuvants, and especially where cellular immunity is essential for protection. The expression of antigens from animal pathogens in plants and the demonstration of some immunity to a disease like rinderpest after ingestion of these, suggests an alternative approach to vaccination by injection. Research on disease pathogenesis and the identification of candidate antigens is specific to the disease agent. The definition of expression systems and the formulation of a vaccine for each disease must be followed by research to establish safety and efficacy. Where vaccines are based on unique gene sequences, the intellectual property is likely to be protected by patent. Organizations, licensed to produce recombinant vaccines, expect to recover their costs and to make a profit. The consequence is that genetically-derived vaccines are expensive. The capacity of vaccines to help animal owners of poorer countries depends not only on quality and cost but also on the veterinary infrastructure where they are used. Ensuring the existence of an effective animal health infrastructure in developing countries is as great a challenge for the developed world as
Homeobox genes and melatonin synthesis

DEFF Research Database (Denmark)

Rohde, Kristian; Møller, Morten; Rath, Martin Fredensborg

2014-01-01

Nocturnal synthesis of melatonin in the pineal gland is controlled by a circadian rhythm in arylalkylamine N-acetyltransferase (AANAT) enzyme activity. In the rodent, Aanat gene expression displays a marked circadian rhythm; release of norepinephrine in the gland at night causes a cAMP-based induction of Aanat transcription. However, additional transcriptional control mechanisms exist. Homeobox genes, which are generally known to encode transcription factors controlling developmental processes, are also expressed in the mature rodent pineal gland. Among these, the cone-rod homeobox (CRX) transcription factor is believed to control pineal-specific Aanat expression. Based on recent advances in our understanding of Crx in the rodent pineal gland, we here suggest that homeobox genes play a role in adult pineal physiology both by ensuring pineal-specific Aanat expression and by facilitating c...
Gene coexpression network analysis as a source of functional annotation for rice genes.

Directory of Open Access Journals (Sweden)

Kevin L Childs

Full Text Available With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional
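As a rough illustration of the weighted coexpression idea named in the abstract above, the sketch below builds an unsigned soft-thresholded adjacency matrix from pairwise Pearson correlations and sums it into a per-gene connectivity score. This is only the first step of a weighted coexpression analysis; module detection (topological overlap, clustering, tree cutting) is omitted. The function names, the power beta = 6 and the toy expression values are assumptions made for illustration and do not come from the study.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """expr: genes x samples array.
    Returns the unsigned weighted adjacency |cor(i, j)| ** beta."""
    cor = np.corrcoef(expr)          # gene-by-gene Pearson correlation
    adj = np.abs(cor) ** beta        # soft threshold: weak links shrink towards 0
    np.fill_diagonal(adj, 0.0)       # ignore self-connections
    return adj

def connectivity(adj):
    """Sum of connection weights for each gene."""
    return adj.sum(axis=1)

# Toy data (made up): genes 0 and 1 are perfectly correlated, gene 2 is not.
base = np.linspace(0.0, 1.0, 8)
toy_expr = np.vstack([
    base,
    2.0 * base + 1.0,
    np.array([1.0, -1.0] * 4),
])
toy_adj = soft_threshold_adjacency(toy_expr)
print(np.round(toy_adj, 4))
print(np.round(connectivity(toy_adj), 4))
```

Raising the correlation to a power keeps every pair in the network but down-weights weak correlations, which is what distinguishes a weighted network from one built with a hard correlation cutoff.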
Gene Expression Commons: an open platform for absolute gene expression profiling.

Directory of Open Access Journals (Sweden)

Jun Seita

Full Text Available Gene expression profiling using microarrays has been limited to comparisons of gene expression between small numbers of samples within individual experiments. However, the unknown and variable sensitivities of each probeset have rendered the absolute expression of any given gene nearly impossible to estimate. We have overcome this limitation by using a very large number (>10,000) of varied microarray data as a common reference, so that statistical attributes of each probeset, such as the dynamic range and threshold between low and high expression, can be reliably discovered through meta-analysis. This strategy is implemented in a web-based platform named "Gene Expression Commons" (https://gexc.stanford.edu/), which contains data of 39 distinct highly purified mouse hematopoietic stem/progenitor/differentiated cell populations covering almost the entire hematopoietic system. Since the Gene Expression Commons is designed as an open platform, investigators can explore the expression level of any gene, search by expression patterns of interest, submit their own microarray data, and design their own working models representing biological relationship among samples.
DNA sequence of 15 base pairs is sufficient to mediate both glucocorticoid and progesterone induction of gene expression

International Nuclear Information System (INIS)

Straehle, U.; Klock, G.; Schuetz, G.

1987-01-01

To define the recognition sequence of the glucocorticoid receptor and its relationship with that of the progesterone receptor, oligonucleotides derived from the glucocorticoid response element of the tyrosine aminotransferase gene were tested upstream of a heterologous promoter for their capacity to mediate effects of these two steroids. The authors show that a 15-base-pair sequence with partial symmetry is sufficient to confer glucocorticoid inducibility on the promoter of the herpes simplex virus thymidine kinase gene. The same 15-base-pair sequence mediates induction by progesterone. Point mutations in the recognition sequence affect inducibility by glucocorticoids and progesterone similarly. Together with the strong conservation of the sequence of the DNA-binding domain of the two receptors, these data suggest that both proteins recognize a sequence that is similar, if not the same.
Novel gene sets improve set-level classification of prokaryotic gene expression data.

Science.gov (United States)

Holec, Matěj; Kuželka, Ondřej; Železný, Filip

2015-10-28

Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
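The conversion from gene-level to set-level features described in this abstract can be sketched as follows. The example assumes the simplest possible aggregation, the mean expression of a set's member genes in each sample; the abstract does not say which aggregation the authors used, and the gene names, set names and values below are hypothetical.

```python
import numpy as np

def to_set_level(expr, gene_names, gene_sets):
    """Convert a samples x genes matrix into a samples x sets matrix.

    Each set-level feature is the mean expression of the set's member
    genes (mean aggregation is an assumption, not the paper's method).
    Columns follow the alphabetical order of the set names.
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    columns = []
    for set_name in sorted(gene_sets):
        members = [index[g] for g in gene_sets[set_name] if g in index]
        if not members:
            raise ValueError(f"gene set {set_name!r} has no measured genes")
        columns.append(expr[:, members].mean(axis=1))
    return np.column_stack(columns)

# Toy data (made up): 2 samples, 3 genes, 2 gene sets.
toy_expr = np.array([[1.0, 3.0, 5.0],
                     [2.0, 4.0, 6.0]])
toy_genes = ["g1", "g2", "g3"]
toy_sets = {"setA": ["g1", "g2"], "setB": ["g3"]}
toy_features = to_set_level(toy_expr, toy_genes, toy_sets)
print(toy_features)
```

The resulting matrix has one column per gene set and can be passed to any standard classifier in place of the gene-level matrix.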
A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses

OpenAIRE

He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

2017-01-01

Background: Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on a precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesi...
