Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K
Network has been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Module as a fundamental property of many biological networks has been widely studied and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large amount of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. Our method outperforms the existing methods, especially when the underlying modules in multiple networks are different in simulation studies. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from the morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.
Kevin L Childs
Full Text Available With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found that gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional
Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo
In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
Full Text Available Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3, the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA by: i introducing quality control of co-expression similarities, ii parallelizing embedded network construction, and iii developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs. We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA. MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Song, Won-Min; Zhang, Bin
Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro
Phenotypic changes at different organization levels from cell to entire organism are associated to changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed a light on unexpected behaviour of gene regulation system while maintaining a more naturalistic view on the studied system.In this chapter we applied an unsupervised method to discriminate DMD patient and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
Zheng, Guangyong; Huang, Tao
In post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. For this end, gene regulatory networks (GRNs) are constructed to show relationship between biological molecules, in which the vertices of network denote biological molecules and the edges of network present connection between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). Biologists can understand not only the function of biological molecules but also the organization of components of living cells through interpreting the GRNs, since a gene regulatory network is a comprehensively physiological map of living cells and reflects influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we will review the inference methods of GRN reconstruction and analysis approaches of network structure. As a powerful tool for studying complex diseases and biological processes, the applications of the network method in pathway analysis and disease gene identification will be introduced.
Shlykova, Irina; Ponosov, Arcady
There are different ways of how to model gene regulatory networks. Differential equations allow for a detailed description of the network's dynamics and provide an explicit model of the gene concentration changes over time. Production and relative degradation rate functions used in such models depend on the vector of steeply sloped threshold functions which characterize the activity of genes. The most popular example of the threshold functions comes from the Boolean network approach, where the threshold functions are given by step functions. The system of differential equations becomes then piecewise linear. The dynamics of this system can be described very easily between the thresholds, but not in the switching domains. For instance this approach fails to analyze stationary points of the system and to define continuous solutions in the switching domains. These problems were studied in , , but the proposed model did not take into account a time delay in cellular systems. However, analysis of real gene expression data shows a considerable number of time-delayed interactions suggesting that time delay is essential in gene regulation. Therefore, delays may have a great effect on the dynamics of the system presenting one of the critical factors that should be considered in reconstruction of gene regulatory networks. The goal of this work is to apply the singular perturbation analysis to certain systems with delay and to obtain an analog of Tikhonov's theorem, which provides sufficient conditions for constracting the limit system in the delay case.
Pavlogiannis, Andreas; Mozhayskiy, Vadim; Tagkopoulos, Ilias
Biological networks tend to have high interconnectivity, complex topologies and multiple types of interactions. This renders difficult the identification of sub-networks that are involved in condition- specific responses. In addition, we generally lack scalable methods that can reveal the information flow in gene regulatory and biochemical pathways. Doing so will help us to identify key participants and paths under specific environmental and cellular context. This paper introduces the theory of network flooding, which aims to address the problem of network minimization and regulatory information flow in gene regulatory networks. Given a regulatory biological network, a set of source (input) nodes and optionally a set of sink (output) nodes, our task is to find (a) the minimal sub-network that encodes the regulatory program involving all input and output nodes and (b) the information flow from the source to the sink nodes of the network. Here, we describe a novel, scalable, network traversal algorithm and we assess its potential to achieve significant network size reduction in both synthetic and E. coli networks. Scalability and sensitivity analysis show that the proposed method scales well with the size of the network, and is robust to noise and missing data. The method of network flooding proves to be a useful, practical approach towards information flow analysis in gene regulatory networks. Further extension of the proposed theory has the potential to lead in a unifying framework for the simultaneous network minimization and information flow analysis across various "omics" levels.
Lipner, Ettie M.; Garcia, Benjamin J.; Strong, Michael
Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks, involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects. PMID:26751573
Liu, Wei; Li, Li; Ye, Hua; Tu, Wei
High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-system and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. Weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
Haibe-Kains, Benjamin; Olsen, Catharina; Djebbari, Amira; Bontempi, Gianluca; Correll, Mick; Bouton, Christopher; Quackenbush, John
Genomics provided us with an unprecedented quantity of data on the genes that are activated or repressed in a wide range of phenotypes. We have increasingly come to recognize that defining the networks and pathways underlying these phenotypes requires both the integration of multiple data types and the development of advanced computational methods to infer relationships between the genes and to estimate the predictive power of the networks through which they interact. To address these issues we have developed Predictive Networks (PN), a flexible, open-source, web-based application and data services framework that enables the integration, navigation, visualization and analysis of gene interaction networks. The primary goal of PN is to allow biomedical researchers to evaluate experimentally derived gene lists in the context of large-scale gene interaction networks. The PN analytical pipeline involves two key steps. The first is the collection of a comprehensive set of known gene interactions derived from a variety of publicly available sources. The second is to use these 'known' interactions together with gene expression data to infer robust gene networks. The PN web application is accessible from http://predictivenetworks.org. The PN code base is freely available at https://sourceforge.net/projects/predictivenets/.
Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin
This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls) were downloaded from Gene Expression Omnibus database. Characteristic genes were identified using metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in totally enriched in 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs sanguinarine and papaverine were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia
To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung
Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.
Akberdin, Ilya R; Omelyanchuk, Nadezda A; Fadeev, Stanislav I; Leskova, Natalya E; Oschepkova, Evgeniya A; Kazantsev, Fedor V; Matushkin, Yury G; Afonnikov, Dmitry A; Kolchanov, Nikolay A
Multiple experimental data demonstrated that the core gene network orchestrating self-renewal and differentiation of mouse embryonic stem cells involves activity of Oct4, Sox2 and Nanog genes by means of a number of positive feedback loops among them. However, recent studies indicated that the architecture of the core gene network should also incorporate negative Nanog autoregulation and might not include positive feedbacks from Nanog to Oct4 and Sox2. Thorough parametric analysis of the mathematical model based on this revisited core regulatory circuit identified that there are substantial changes in model dynamics occurred depending on the strength of Oct4 and Sox2 activation and molecular complexity of Nanog autorepression. The analysis showed the existence of four dynamical domains with different numbers of stable and unstable steady states. We hypothesize that these domains can constitute the checkpoints in a developmental progression from naïve to primed pluripotency and vice versa. During this transition, parametric conditions exist, which generate an oscillatory behavior of the system explaining heterogeneity in expression of pluripotent and differentiation factors in serum ESC cultures. Eventually, simulations showed that addition of positive feedbacks from Nanog to Oct4 and Sox2 leads mainly to increase of the parametric space for the naïve ESC state, in which pluripotency factors are strongly expressed while differentiation ones are repressed.
Kim, Dong Sub; Kim, Jinbaek; Kim, Sang Hoon
In this project, we irradiated Arabidopsis plants with various doses of gamma-rays at the vegetative and reproductive stages to assess their radiation sensitivity. After the gene expression profiles and an analysis of the antioxidant response, we selected several Arabidopsis genes for uses of 'Radio marker genes (RMG)' and conducted over-expression and knock-down experiments to confirm the radio sensitivity. Based on these results, we applied two patents for the detection of two RMG (At3g28210 and At4g37990) and development of transgenic plants. Also, we developed a Genechip for use of high-throughput screening of Arabidopsis genes responding only to ionizing radiation and identified RMG to detect radiation leaks. Based on these results, we applied two patents associated with the use of Genechip for different types of radiation and different growth stages. Also, we conducted co-expression network study of specific expressed probes against gamma-ray stress and identified expressed patterns of duplicated genes formed by whole/500kb segmental genome duplication
Singh, Pramesh; Chen, Tianlong; Arendsee, Zebulun; Wurtele, Eve S.; Bassler, Kevin E.
Orphan genes, which are genes unique to each particular species, have recently drawn significant attention for their potential usefulness for organismal robustness. Their origin and regulatory interaction patterns remain largely undiscovered. Recently, methods that use the context likelihood of relatedness to infer a network followed by modularity maximizing community detection algorithms on the inferred network to find the functional structure of regulatory networks were shown to be effective. We apply improved versions of these methods to gene expression data from Arabidopsis thaliana, identify groups (clusters) of interacting genes with related patterns of expression and analyze the structure within those groups. Focusing on clusters that contain orphan genes, we compare the identified clusters to gene ontology (GO) terms, regulons, and pathway designations and analyze their hierarchical structure. We predict new regulatory interactions and unravel the structure of the regulatory interaction patterns of orphan genes. Work supported by the NSF through Grants DMR-1507371 and IOS-1546858.
Ahsen, Mehmet Eren; Niculescu, Silviu-Iulian
This brief examines a deterministic, ODE-based model for gene regulatory networks (GRN) that incorporates nonlinearities and time-delayed feedback. An introductory chapter provides some insights into molecular biology and GRNs. The mathematical tools necessary for studying the GRN model are then reviewed, in particular Hill functions and Schwarzian derivatives. One chapter is devoted to the analysis of GRNs under negative feedback with time delays and a special case of a homogenous GRN is considered. Asymptotic stability analysis of GRNs under positive feedback is then considered in a separate chapter, in which conditions leading to bi-stability are derived. Graduate and advanced undergraduate students and researchers in control engineering, applied mathematics, systems biology and synthetic biology will find this brief to be a clear and concise introduction to the modeling and analysis of GRNs.
Full Text Available Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs. For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR is one of the powerful and efficient methods for detecting high-order gene-gene (GxG interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI. Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori
Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo
In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo
Objective In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further
Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph
During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5 , and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.
Full Text Available Abstract Background The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. Results We have developed MINER (Microarray Interactive Network Exploration and Representation, an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Conclusion Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Olsen, Catharina; Bontempi, Gianluca; Castiglioni, Isabella
Modern high-throughput genomic technologies represent a comprehensive hallmark of molecular changes in pan-cancer studies. Although different cancer gene signatures have been revealed, the mechanism of tumourigenesis has yet to be completely understood. Pathways and networks are important tools to explain the role of genes in functional genomic studies. However, few methods consider the functional non-equal roles of genes in pathways and the complex gene-gene interactions in a network. We present a novel method in pan-cancer analysis that identifies de-regulated genes with a functional role by integrating pathway and network data. A pan-cancer analysis of 7158 tumour/normal samples from 16 cancer types identified 895 genes with a central role in pathways and de-regulated in cancer. Comparing our approach with 15 current tools that identify cancer driver genes, we found that 35.6% of the 895 genes identified by our method have been found as cancer driver genes with at least 2/15 tools. Finally, we applied a machine learning algorithm on 16 independent GEO cancer datasets to validate the diagnostic role of cancer driver genes for each cancer. We obtained a list of the top-ten cancer driver genes for each cancer considered in this study. Our analysis 1) confirmed that there are several known cancer driver genes in common among different types of cancer, 2) highlighted that cancer driver genes are able to regulate crucial pathways.
Bergholdt, R.; Størling, Zenia, Marian; Hansen, Kasper Lage
We have developed an integrative analysis method combining genetic interactions, identified using type 1 diabetes genome scan data, and a high-confidence human protein interaction network. Resulting networks were ranked by the significance of the enrichment of proteins from interacting regions. We...... identified a number of new protein network modules and novel candidate genes/proteins for type 1 diabetes. We propose this type of integrative analysis as a general method for the elucidation of genes and networks involved in diabetes and other complex diseases....
Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar
Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in-toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
van Leeuwen, D. M.; Pedersen, Marie; Knudsen, Lisbeth E.
checkpoint and aneuploidy. The MN-related gene network was tested against a transcriptomics case study associated with MN measurements. In this case study, transcriptomic data from children and adults differentially exposed to ambient air pollution in the Czech Republic were analysed and visualised......Mechanistically relevant information on responses of humans to xenobiotic exposure in relation to chemically induced biological effects, such as micronuclei (MN) formation can be obtained through large-scale transcriptomics studies. Network analysis may enhance the analysis and visualisation...... of such data. Therefore, this study aimed to develop a 'MN formation' network based on a priori knowledge, by using the pathway tool MetaCore. The gene network contained 27 genes and three gene complexes that are related to processes involved in MN formation, e.g. spindle assembly checkpoint, cell cycle...
Full Text Available Context: Paracoccidioides brasiliensis, a dimorphic fungus is the causative agent of paracoccidioidomycosis, a disease globally affecting millions of people. The haloacid dehalogenase (HAD superfamily hydrolases enzyme in the fungi, in particular, is known to be responsible in the pathogenesis by adhering to the tissue. Hence, identification of novel drug targets is essential. Aims: In-silico based identification of co-expressed genes along with HAD superfamily hydrolase in P. brasiliensis during the morphogenesis from mycelium to yeast to identify possible genes as drug targets. Materials and Methods: In total, four datasets were retrieved from the NCBI-gene expression omnibus (GEO database, each containing 4340 genes, followed by gene filtration expression of the data set. Further co-expression (CE study was performed individually and then a combination these genes were visualized in the Cytoscape 2. 8.3. Statistical Analysis Used: Mean and standard deviation value of the HAD superfamily hydrolase gene was obtained from the expression data and this value was subsequently used for the CE calculation purpose by selecting specific correlation power and filtering threshold. Results: The 23 genes that were thus obtained are common with respect to the HAD superfamily hydrolase gene. A significant network was selected from the Cytoscape network visualization that contains total 7 genes out of which 5 genes, which do not have significant protein hits, obtained from gene annotation of the expressed sequence tags by BLAST X. For all the protein PSI-BLAST was performed against human genome to find the homology. Conclusions: The gene co-expression network was obtained with respect to HAD superfamily dehalogenase gene in P. Brasiliensis.
Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena
Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Full Text Available Network analysis is a novel method to understand the complex pathogenesis of inflammation-driven atherosclerosis. Using this approach, we attempted to identify key inflammatory genes and their core transcriptional regulators in coronary artery disease (CAD. Initially, we obtained 124 candidate genes associated with inflammation and CAD using Polysearch and CADgene database for which protein-protein interaction network was generated using STRING 9.0 (Search Tool for the Retrieval of Interacting Genes and visualized using Cytoscape v 2.8.3. Based on betweenness centrality (BC and node degree as key topological parameters, we identified interleukin-6 (IL-6, vascular endothelial growth factor A (VEGFA, interleukin-1 beta (IL-1B, tumor necrosis factor (TNF and prostaglandin-endoperoxide synthase 2 (PTGS2 as hub nodes. The backbone network constructed with these five hub genes showed 111 nodes connected via 348 edges, with IL-6 having the largest degree and highest BC. Nuclear factor kappa B1 (NFKB1, signal transducer and activator of transcription 3 (STAT3 and JUN were identified as the three core transcription factors from the regulatory network derived using MatInspector. For the purpose of validation of the hub genes, 97 test networks were constructed, which revealed the accuracy of the backbone network to be 0.7763 while the frequency of the hub nodes remained largely unaltered. Pathway enrichment analysis with ClueGO, KEGG and REACTOME showed significant enrichment of six validated CAD pathways - smooth muscle cell proliferation, acute-phase response, calcidiol 1-monooxygenase activity, toll-like receptor signaling, NOD-like receptor signaling and adipocytokine signaling pathways. Experimental verification of the above findings in 64 cases and 64 controls showed increased expression of the five candidate genes and the three transcription factors in the cases relative to the controls (p<0.05. Thus, analysis of complex networks aid in the
Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen
Sea cucumbers, one main class of Echinoderms, have a very fast and drastic metamorphosis process during their development. However, the molecular basis under this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.
Dec 9, 2013 ... early diagnosis of complex diseases or cancer without obvious symptoms. [Gong J., Diao B., Yao G. J., ... expression levels of thousands of genes in a specific cell or tissue. Previous ..... base of the brain. It mainly controls the ...
Full Text Available MYB transcription factor (TF is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological process as well as identification of new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by top down and guide gene approaches. More than 50% of OsMYBs were strongly correlated under fifty experimental conditions with 51 hub genes via top down approach. Further, clusters were identified using Markov Clustering (MCL. To maximize the clustering performance, parameter evaluation of the MCL inflation score (I was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed cluster and clads analyzed from phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized compendium of known interaction and biological role with Gene Ontology enrichment analysis to hypothesize function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by guide gene approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The putative targets with MYB-binding cis-elements enrichment in their promoter region, functional co-occurrence as well as nuclear localization supports our finding. Specially, enrichment of MYB binding regions involved in drought-inducibility implying their regulatory role in drought
Full Text Available Background. Symptoms and signs (symptoms in brief are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM. To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prediction of the prioritizing genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method gets reliable gene rank list with AUC (area under curve 0.616 in classification. Some novel genes like CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms, which are not recorded in the benchmark data set, but have been reported in recent published literatures. Conclusions. Our study demonstrated that by integrating phenotype-genotype relationships into a complex network framework it provides an effective approach to identify candidate genes of symptoms.
Zhu Xiaodong; Guo Ya; Qu Song; Li Ling; Huang Shiting; Li Danrong; Zhang Wei
Objective: To discover radioresistance associated molecular biomarkers and its mechanism in nasopharyngeal carcinoma by protein-protein interaction network analysis. Methods: Whole genome expression microarray was applied to screen out differentially expressed genes in two cell lines CNE-2R and CNE-2 with different radiosensitivity. Four differentially expressed genes were randomly selected for further verification by the semi-quantitative RT-PCR analysis with self-designed primers. The common differentially expressed genes from two experiments were analyzed with the SNOW online database in order to find out the central node related to the biomarkers of nasopharyngeal carcinoma radioresistance. The expression of STAT1 in CNE-2R and CNE-2 cells was measured by Western blot. Results: Compared with CNE-2 cells, 374 genes in CNE-2R cells were differentially expressed while 197 genes showed significant differences. Four randomly selected differentially expressed genes were verified by RT-PCR and had same change trend in consistent with the results of chip assay. Analysis with the SNOW database demonstrated that those 197 genes could form a complicated interaction network where STAT1 and JUN might be two key nodes. Indeed, the STAT1-α expression in CNE-2R was higher than that in CNE-2 (t=4.96, P<0.05). Conclusions: The key nodes of STAT1 and JUN may be the molecular biomarkers leading to radioresistance in nasopharyngeal carcinoma, and STAT1-α might have close relationship with radioresistance. (authors)
Borlawsky Tara B
Full Text Available Abstract Background Chronic lymphocytic leukemia (CLL is the most common adult leukemia. It is a highly heterogeneous disease, and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobin heavy chain variable region (IgVH mutational status was found to be associated with patient survival outcome, and biomarkers linked to the IgVH status has been a focus in the CLL prognosis research field. However, biomarkers highly correlated with IgVH mutational status which can accurately predict the survival outcome are yet to be discovered. Results In this paper, we investigate the use of gene co-expression network analysis to identify potential biomarkers for CLL. Specifically we focused on the co-expression network involving ZAP70, a well characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO and used the frequent network mining algorithm CODENSE to identify highly connected gene co-expression networks spanning the entire genome, then evaluated the genes in the co-expression network in which ZAP70 is involved. We then applied a set of feature selection methods to further select genes which are capable of predicting IgVH mutation status from the ZAP70 co-expression network. Conclusions We have identified a set of genes that are potential CLL prognostic biomarkers IL2RB, CD8A, CD247, LAG3 and KLRK1, which can predict CLL patient IgVH mutational status with high accuracies. Their prognostic capabilities were cross-validated by applying these biomarker candidates to classify patients into different outcome groups using a CLL microarray datasets with clinical information.
Blenk, Steffen; Engelmann, Julia C; Pinkert, Stefan; Weniger, Markus; Schultz, Jörg; Rosenwald, Andreas; Müller-Hermelink, Hans K; Müller, Tobias; Dandekar, Thomas
Mantle cell lymphoma (MCL) is an incurable B cell lymphoma and accounts for 6% of all non-Hodgkin's lymphomas. On the genetic level, MCL is characterized by the hallmark translocation t(11;14) that is present in most cases with few exceptions. Both gene expression and comparative genomic hybridization (CGH) data vary considerably between patients with implications for their prognosis. We compare patients over and below the median of survival. Exploratory principal component analysis of gene expression data showed that the second principal component correlates well with patient survival. Explorative analysis of CGH data shows the same correlation. On chromosome 7 and 9 specific genes and bands are delineated which improve prognosis prediction independent of the previously described proliferation signature. We identify a compact survival predictor of seven genes for MCL patients. After extensive re-annotation using GEPAT, we established protein networks correlating with prognosis. Well known genes (CDC2, CCND1) and further proliferation markers (WEE1, CDC25, aurora kinases, BUB1, PCNA, E2F1) form a tight interaction network, but also non-proliferative genes (SOCS1, TUBA1B CEBPB) are shown to be associated with prognosis. Furthermore we show that aggressive MCL implicates a gene network shift to higher expressed genes in late cell cycle states and refine the set of non-proliferative genes implicated with bad prognosis in MCL. The results from explorative data analysis of gene expression and CGH data are complementary to each other. Including further tests such as Wilcoxon rank test we point both to proliferative and non-proliferative gene networks implicated in inferior prognosis of MCL and identify suitable markers both in gene expression and CGH data
Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic
Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Full Text Available Abstract Background The availability of large-scale high-throughput data possesses considerable challenges toward their functional analysis. For this reason gene network inference methods gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. Results In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET - a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide an user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. Conclusions The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research with auspicious prospects. The availability of our easy to use software package c3net may contribute to the popularization of such methods. Reviewers This article was reviewed by Lev Klebanov, Joel Bader and Yuriy Gusev.
Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M
Improvements in eating satisfaction will benefit consumers and should increase beef demand which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits making interpretation of genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms
Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J
Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred-B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks - maize1 and maize2 exhibit similar levels of intra subgenome correlations. Intriguingly, the level of inter subgenome co-expression was similar to the level of intra subgenome correlations, and genes from specific subgenomes were not likely to be the enriched in co-expression network modules and the hub genes were not predominantly from any specific subgenomes in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types
Full Text Available Abstract Background Amyotrophic Lateral Sclerosis (ALS is a lethal disorder characterized by progressive degeneration of motor neurons in the brain and spinal cord. Diagnosis is mainly based on clinical symptoms, and there is currently no therapy to stop the disease or slow its progression. Since access to spinal cord tissue is not possible at disease onset, we investigated changes in gene expression profiles in whole blood of ALS patients. Results Our transcriptional study showed dramatic changes in blood of ALS patients; 2,300 probes (9.4% showed significant differential expression in a discovery dataset consisting of 30 ALS patients and 30 healthy controls. Weighted gene co-expression network analysis (WGCNA was used to find disease-related networks (modules and disease related hub genes. Two large co-expression modules were found to be associated with ALS. Our findings were replicated in a second (30 patients and 30 controls and third dataset (63 patients and 63 controls, thereby demonstrating a highly significant and consistent association of two large co-expression modules with ALS disease status. Ingenuity Pathway Analysis of the ALS related module genes implicates enrichment of functional categories related to genetic disorders, neurodegeneration of the nervous system and inflammatory disease. The ALS related modules contain a number of candidate genes possibly involved in pathogenesis of ALS. Conclusion This first large-scale blood gene expression study in ALS observed distinct patterns between cases and controls which may provide opportunities for biomarker development as well as new insights into the molecular mechanisms of the disease.
Full Text Available Duchenne Muscular Dystrophy (DMD is an important pathology associated with the human skeletal muscle and has been studied extensively. Gene expression measurements on skeletal muscle of patients afflicted with DMD provides the opportunity to understand the underlying mechanisms that lead to the pathology. Community structure analysis is a useful computational technique for understanding and modeling genetic interaction networks. In this paper, we leverage this technique in combination with gene expression measurements from normal and DMD patient skeletal muscle tissue to study the structure of genetic interactions in the context of DMD. We define a novel framework for transforming a raw dataset of gene expression measurements into an interaction network, and subsequently apply algorithms for community structure analysis for the extraction of topological communities. The emergent communities are analyzed from a biological standpoint in terms of their constituent biological pathways, and an interpretation that draws correlations between functional and structural organization of the genetic interactions is presented. We also compare these communities and associated functions in pathology against those in normal human skeletal muscle. In particular, differential enhancements are observed in the following pathways between pathological and normal cases: Metabolic, Focal adhesion, Regulation of actin cytoskeleton and Cell adhesion, and implication of these mechanisms are supported by prior work. Furthermore, our study also includes a gene-level analysis to identify genes that are involved in the coupling between the pathways of interest. We believe that our results serve to highlight important distinguishing features in the structural/functional organization of constituent biological pathways, as it relates to normal and DMD cases, and provide the mechanistic basis for further biological investigations into specific pathways differently regulated
Zhang, Yu; Huang, Zhen; Zhu, Zhiqiang; Liu, Jianwei; Zheng, Xin; Zhang, Yuhai
Prostate cancer (PC) is the second most common cancer among men in the United States, and it imposes a considerable threat to human health. A deep understanding of its underlying molecular mechanisms is the premise for developing effective targeted therapies. Recently, deep transcriptional sequencing has been used as an effective genomic assay to obtain insights into diseases and may be helpful in the study of PC. In present study, ChIP-Seq data for PC and normal samples were compared, and differential peaks identified, based upon fold changes (with P-values calculated with t-tests). Annotations of these peaks were performed. Protein-protein interaction (PPI) network analysis was performed with BioGRID and constructed with Cytoscape, following which the highly connected genes were screened. We obtained a total of 5,570 differential peaks, including 3,726 differentially enriched peaks in tumor samples and 1,844 differentially enriched peaks in normal samples. There were eight significant regions of the peaks. The intergenic region possessed the highest score (51%), followed by intronic (31%) and exonic (11%) regions. The analysis revealed the top 35 highly connected genes, which comprised 33 differential genes (such as YWHAQ, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein and θ polypeptide) from ChIP-Seq data and 2 differential genes retrieved from the PPI network: UBA52 (ubiquitin A-52 residue ribosomal protein fusion product (1) and SUMO2 (SMT3 suppressor of mif two 3 homolog (2) . Our findings regarding potential PC-related genes increase the understanding of PC and provides direction for future research.
Full Text Available Abstract Background Phylogenetic methods are well-established bioinformatic tools for sequence analysis, allowing to describe the non-independencies of sequences because of their common ancestor. However, the evolutionary profiles of bacterial genes are often complicated by hidden paralogy and extensive and/or (multiple horizontal gene transfer (HGT events which make bifurcating trees often inappropriate. In this context, plasmid sequences are paradigms of network-like relationships characterizing the evolution of prokaryotes. Actually, they can be transferred among different organisms allowing the dissemination of novel functions, thus playing a pivotal role in prokaryotic evolution. However, the study of their evolutionary dynamics is complicated by the absence of universally shared genes, a prerequisite for phylogenetic analyses. Results To overcome such limitations we developed a bioinformatic package, named Blast2Network (B2N, allowing the automatic phylogenetic profiling and the visualization of homology relationships in a large number of plasmid sequences. The software was applied to the study of 47 completely sequenced plasmids coming from Escherichia, Salmonella and Shigella spps. Conclusion The tools implemented by B2N allow to describe and visualize in a new way some of the evolutionary features of plasmid molecules of Enterobacteriaceae; in particular it helped to shed some light on the complex history of Escherichia, Salmonella and Shigella plasmids and to focus on possible roles of unannotated proteins. The proposed methodology is general enough to be used for comparative genomic analyses of bacteria.
Reddy, Anupama; Huang, C Chris; Liu, Huiqing; Delisi, Charles; Nevalainen, Marja T; Szalma, Sandor; Bhanot, Gyan
We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low low (cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-lα is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and marks disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer related genes, known to be involved in functional pathways associated with tumorigenesis. Our method is general and can easily
Luo, Jie; Xu, Pei; Cao, Peijian; Wan, Hongjian; Lv, Xiaonan; Xu, Shengchun; Wang, Gangjun; Cook, Melloni N; Jones, Byron C; Lu, Lu; Wang, Xusheng
Although the link between stress and alcohol is well recognized, the underlying mechanisms of how they interplay at the molecular level remain unclear. The purpose of this study is to identify molecular networks underlying the effects of alcohol and stress responses, as well as their interaction on anxiety behaviors in the hippocampus of mice using a systems genetics approach. Here, we applied a gene co-expression network approach to transcriptomes of 41 BXD mouse strains under four conditions: stress, alcohol, stress-induced alcohol and control. The co-expression analysis identified 14 modules and characterized four expression patterns across the four conditions. The four expression patterns include up-regulation in no restraint stress and given an ethanol injection (NOE) but restoration in restraint stress followed by an ethanol injection (RSE; pattern 1), down-regulation in NOE but rescue in RSE (pattern 2), up-regulation in both restraint stress followed by a saline injection (RSS) and NOE, and further amplification in RSE (pattern 3), and up-regulation in RSS but reduction in both NOE and RSE (pattern 4). We further identified four functional subnetworks by superimposing protein-protein interactions (PPIs) to the 14 co-expression modules, including γ-aminobutyric acid receptor (GABA) signaling, glutamate signaling, neuropeptide signaling, cAMP-dependent signaling. We further performed module specificity analysis to identify modules that are specific to stress, alcohol, or stress-induced alcohol responses. Finally, we conducted causality analysis to link genetic variation to these identified modules, and anxiety behaviors after stress and alcohol treatments. This study underscores the importance of integrative analysis and offers new insights into the molecular networks underlying stress and alcohol responses.
Full Text Available Although the link between stress and alcohol is well recognized, the underlying mechanisms of how they interplay at the molecular level remain unclear. The purpose of this study is to identify molecular networks underlying the effects of alcohol and stress responses, as well as their interaction on anxiety behaviors in the hippocampus of mice using a systems genetics approach. Here, we applied a gene co-expression network approach to transcriptomes of 41 BXD mouse strains under four conditions: stress, alcohol, stress-induced alcohol and control. The co-expression analysis identified 14 modules and characterized four expression patterns across the four conditions. The four expression patterns include up-regulation in no restraint stress and given an ethanol injection (NOE but restoration in restraint stress followed by an ethanol injection (RSE; pattern 1, down-regulation in NOE but rescue in RSE (pattern 2, up-regulation in both restraint stress followed by a saline injection (RSS and NOE, and further amplification in RSE (pattern 3, and up-regulation in RSS but reduction in both NOE and RSE (pattern 4. We further identified four functional subnetworks by superimposing protein-protein interactions (PPIs to the 14 co-expression modules, including γ-aminobutyric acid receptor (GABA signaling, glutamate signaling, neuropeptide signaling, cAMP-dependent signaling. We further performed module specificity analysis to identify modules that are specific to stress, alcohol, or stress-induced alcohol responses. Finally, we conducted causality analysis to link genetic variation to these identified modules, and anxiety behaviors after stress and alcohol treatments. This study underscores the importance of integrative analysis and offers new insights into the molecular networks underlying stress and alcohol responses.
Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were
Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria
The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor bindings sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.
Brian W Kunkle
Full Text Available In this study we have identified key genes that are critical in development of astrocytic tumors. Meta-analysis of microarray studies which compared normal tissue to astrocytoma revealed a set of 646 differentially expressed genes in the majority of astrocytoma. Reverse engineering of these 646 genes using Bayesian network analysis produced a gene network for each grade of astrocytoma (Grade I-IV, and 'key genes' within each grade were identified. Genes found to be most influential to development of the highest grade of astrocytoma, Glioblastoma multiforme were: COL4A1, EGFR, BTF3, MPP2, RAB31, CDK4, CD99, ANXA2, TOP2A, and SERBP1. All of these genes were up-regulated, except MPP2 (down regulated. These 10 genes were able to predict tumor status with 96-100% confidence when using logistic regression, cross validation, and the support vector machine analysis. Markov genes interact with NFkβ, ERK, MAPK, VEGF, growth hormone and collagen to produce a network whose top biological functions are cancer, neurological disease, and cellular movement. Three of the 10 genes - EGFR, COL4A1, and CDK4, in particular, seemed to be potential 'hubs of activity'. Modified expression of these 10 Markov Blanket genes increases lifetime risk of developing glioblastoma compared to the normal population. The glioblastoma risk estimates were dramatically increased with joint effects of 4 or more than 4 Markov Blanket genes. Joint interaction effects of 4, 5, 6, 7, 8, 9 or 10 Markov Blanket genes produced 9, 13, 20.9, 26.7, 52.8, 53.2, 78.1 or 85.9%, respectively, increase in lifetime risk of developing glioblastoma compared to normal population. In summary, it appears that modified expression of several 'key genes' may be required for the development of glioblastoma. Further studies are needed to validate these 'key genes' as useful tools for early detection and novel therapeutic options for these tumors.
Sun, Mengyang; Cheng, Xianrui; Socolar, Joshua E S
A common approach to the modeling of gene regulatory networks is to represent activating or repressing interactions using ordinary differential equations for target gene concentrations that include Hill function dependences on regulator gene concentrations. An alternative formulation represents the same interactions using Boolean logic with time delays associated with each network link. We consider the attractors that emerge from the two types of models in the case of a simple but nontrivial network: a figure-8 network with one positive and one negative feedback loop. We show that the different modeling approaches give rise to the same qualitative set of attractors with the exception of a possible fixed point in the ordinary differential equation model in which concentrations sit at intermediate values. The properties of the attractors are most easily understood from the Boolean perspective, suggesting that time-delay Boolean modeling is a useful tool for understanding the logic of regulatory networks.
Farber, Charles R
Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression. and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10(-11)). Using publically available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD. © 2010 American Society for Bone and Mineral Research.
Li, Jianqiang; Zhou, Doudou; Qiu, Weiliang; Shi, Yuliang; Yang, Ji-Jiang; Chen, Shi; Wang, Qing; Pan, Hui
Investigating how genes jointly affect complex human diseases is important, yet challenging. The network approach (e.g., weighted gene co-expression network analysis (WGCNA)) is a powerful tool. However, genomic data usually contain substantial batch effects, which could mask true genomic signals. Paired design is a powerful tool that can reduce batch effects. However, it is currently unclear how to appropriately apply WGCNA to genomic data from paired design. In this paper, we modified the current WGCNA pipeline to analyse high-throughput genomic data from paired design. We illustrated the modified WGCNA pipeline by analysing the miRNA dataset provided by Shiah et al. (2014), which contains forty oral squamous cell carcinoma (OSCC) specimens and their matched non-tumourous epithelial counterparts. OSCC is the sixth most common cancer worldwide. The modified WGCNA pipeline identified two sets of novel miRNAs associated with OSCC, in addition to the existing miRNAs reported by Shiah et al. (2014). Thus, this work will be of great interest to readers of various scientific disciplines, in particular, genetic and genomic scientists as well as medical scientists working on cancer.
Waaijenborg, S.; Zwinderman, A.H.
ABSTRACT: BACKGROUND: We generalized penalized canonical correlation analysis for analyzing microarray gene-expression measurements for checking completeness of known metabolic pathways and identifying candidate genes for incorporation in the pathway. We used Wold's method for calculation of the
Full Text Available Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, parkinson disease and obesity to a greater extend compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO analysis highlighted the role of functional diversity for such diseases.
Full Text Available Genetic studies (in particular linkage and association studies identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed-up for biological validation. Recently, computational methods to identify (prioritize the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved. We seek to overcome this limitation by replacing knowledge about the biological process by experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by considering the level of differential expression in its neighborhood under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood where each gene is given a contributing weight, which decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease causing genes.
Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng
Complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease caused by many reasons including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. By the threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.
Fondi, Marco; Karkman, Antti; Tamminen, Manu V; Bosi, Emanuele; Virta, Marko; Fani, Renato; Alm, Eric; McInerney, James O
The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as "everything is everywhere but the environment selects." While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are not influenced nearly as much by geography as ecology, thus extending the Bass Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate by which genes colonize new habitats when new selective forces emerge. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Rund, Samuel S C; Yoo, Boyoung; Alam, Camille; Green, Taryn; Stephens, Melissa T; Zeng, Erliang; George, Gary F; Sheppard, Aaron D; Duffield, Giles E; Milenković, Tijana; Pfrender, Michael E
Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel
Henry D Priest
Full Text Available Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules which were up-regulated by each abiotic stress fell into diverse and unique gene ontology GO categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
Mallik, Saurav; Maulik, Ujjwal
Gene ranking is an important problem in bioinformatics. Here, we propose a new framework for ranking biomolecules (viz., miRNAs, transcription-factors/TFs and genes) in a multi-informative uterine leiomyoma dataset having both gene expression and methylation data using (statistical) eigenvector centrality based approach. At first, genes that are both differentially expressed and methylated, are identified using Limma statistical test. A network, comprising these genes, corresponding TFs from TRANSFAC and ITFP databases, and targeter miRNAs from miRWalk database, is then built. The biomolecules are then ranked based on eigenvector centrality. Our proposed method provides better average accuracy in hub gene and non-hub gene classifications than other methods. Furthermore, pre-ranked Gene set enrichment analysis is applied on the pathway database as well as GO-term databases of Molecular Signatures Database with providing a pre-ranked gene-list based on different centrality values for comparing among the ranking methods. Finally, top novel potential gene-markers for the uterine leiomyoma are provided. Copyright © 2015 Elsevier Inc. All rights reserved.
Kunkle, Brian W.; Yoo, Changwon; Roy, Deodutta
In this study we have identified key genes that are critical in development of astrocytic tumors. Meta-analysis of microarray studies which compared normal tissue to astrocytoma revealed a set of 646 differentially expressed genes in the majority of astrocytoma. Reverse engineering of these 646 genes using Bayesian network analysis produced a gene network for each grade of astrocytoma (Grade I–IV), and ‘key genes’ within each grade were identified. Genes found to be most influential to development of the highest grade of astrocytoma, Glioblastoma multiforme were: COL4A1, EGFR, BTF3, MPP2, RAB31, CDK4, CD99, ANXA2, TOP2A, and SERBP1. All of these genes were up-regulated, except MPP2 (down regulated). These 10 genes were able to predict tumor status with 96–100% confidence when using logistic regression, cross validation, and the support vector machine analysis. Markov genes interact with NFkβ, ERK, MAPK, VEGF, growth hormone and collagen to produce a network whose top biological functions are cancer, neurological disease, and cellular movement. Three of the 10 genes - EGFR, COL4A1, and CDK4, in particular, seemed to be potential ‘hubs of activity’. Modified expression of these 10 Markov Blanket genes increases lifetime risk of developing glioblastoma compared to the normal population. The glioblastoma risk estimates were dramatically increased with joint effects of 4 or more than 4 Markov Blanket genes. Joint interaction effects of 4, 5, 6, 7, 8, 9 or 10 Markov Blanket genes produced 9, 13, 20.9, 26.7, 52.8, 53.2, 78.1 or 85.9%, respectively, increase in lifetime risk of developing glioblastoma compared to normal population. In summary, it appears that modified expression of several ‘key genes’ may be required for the development of glioblastoma. Further studies are needed to validate these ‘key genes’ as useful tools for early detection and novel therapeutic options for these tumors. PMID:23737970
Constructing, evaluating, and interpreting gene networks generally sits within the broader field of systems biology, which continues to emerge rapidly, particular with respect to its application to understanding the complexity of signaling in the context of cancer biology. For the purposes of this volume, we take a broad definition of systems biology. Considering an organism or disease within an organism as a system, systems biology is the study of the integrated and coordinated interactions of the network(s) of genes, their variants both natural and mutated (e.g., polymorphisms, rearrangements, alternate splicing, mutations), their proteins and isoforms, and the organic and inorganic molecules with which they interact, to execute the biochemical reactions (e.g., as enzymes, substrates, products) that reflect the function of that system. Central to systems biology, and perhaps the only approach that can effectively manage the complexity of such systems, is the building of quantitative multiscale predictive models. The predictions of the models can vary substantially depending on the nature of the model and its inputoutput relationships. For example, a model may predict the outcome of a specific molecular reaction(s), a cellular phenotype (e.g., alive, dead, growth arrest, proliferation, and motility), a change in the respective prevalence of cell or subpopulations, a patient or patient subgroup outcome(s). Such models necessarily require computers. Computational modeling can be thought of as using machine learning and related tools to integrate the very high dimensional data generated from modern, high throughput omics technologies including genomics (next generation sequencing), transcriptomics (gene expression microarrays; RNAseq), metabolomics and proteomics (ultra high performance liquid chromatography, mass spectrometry), and "subomic" technologies to study the kinome, methylome, and others. Mathematical modeling can be thought of as the use of ordinary
Zhang, L.; Mallick, B. K.
graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which
Full Text Available Muscle, a multinucleate syncytium formed by the fusion of mononuclear myoblasts, arises from quiescent progenitors (satellite cells via activation of muscle-specific transcription factors (MyoD, Myf5, myogenin: MYOG, and MRF4. Subsequent to a decline in Pax7, induction in the expression of MYOG is a hallmark of myoblasts that have entered the differentiation phase following cell cycle withdrawal. It is evident that MYOG function cannot be compensated by any other myogenic regulatory factors (MRFs. Despite a plethora of information available regarding MYOG, the mechanism by which MYOG regulates muscle cell differentiation has not yet been identified. Using an RNA-Seq approach, analysis of MYOG knock-down muscle satellite cells (MSCs have shown that genes associated with cell cycle and division, DNA replication, and phosphate metabolism are differentially expressed. By constructing an interaction network of differentially expressed genes (DEGs using GeneMANIA, cadherin-associated protein (CTNNA2 was identified as the main hub gene in the network with highest node degree. Four functional clusters (modules or communities were identified in the network and the functional enrichment analysis revealed that genes included in these clusters significantly contribute to skeletal muscle development. To confirm this finding, in vitro studies revealed increased expression of CTNNA2 in MSCs on day 12 compared to day 10. Expression of CTNNA2 was decreased in MYOG knock-down cells. However, knocking down CTNNA2, which leads to increased expression of extracellular matrix (ECM genes (type I collagen α1 and type I collagen α2 along with myostatin (MSTN, was not found significantly affecting the expression of MYOG in C2C12 cells. We therefore propose that MYOG exerts its regulatory effects by acting upstream of CTNNA2, which in turn regulates the differentiation of C2C12 cells via interaction with ECM genes. Taken together, these findings highlight a new
Poorebrahim, Mansour; Salarian, Ali; Najafi, Saeideh; Abazari, Mohammad Foad; Aleagha, Maryam Nouri; Dadras, Mohammad Nasr; Jazayeri, Seyed Mohammad; Ataei, Atousa; Poortahmasebi, Vahdat
Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis (IM) and establishes lifetime infection associated with a variety of cancers and autoimmune diseases. The aim of this study was to develop an integrative gene regulatory network (GRN) approach and overlying gene expression data to identify the representative subnetworks for IM and EBV latent infection (LI). After identifying differentially expressed genes (DEGs) in both IM and LI gene expression profiles, functional annotations were applied using gene ontology (GO) and BiNGO tools, and construction of GRNs, topological analysis and identification of modules were carried out using several plugins of Cytoscape. In parallel, a human-EBV GRN was generated using the Hu-Vir database for further analyses. Our analysis revealed that the majority of DEGs in both IM and LI were involved in cell-cycle and DNA repair processes. However, these genes showed a significant negative correlation in the IM and LI states. Furthermore, cyclin-dependent kinase 2 (CDK2) - a hub gene with the highest centrality score - appeared to be the key player in cell cycle regulation in IM disease. The most significant functional modules in the IM and LI states were involved in the regulation of the cell cycle and apoptosis, respectively. Human-EBV network analysis revealed several direct targets of EBV proteins during IM disease. Our study provides an important first report on the response to IM/LI EBV infection in humans. An important aspect of our data was the upregulation of genes associated with cell cycle progression and proliferation.
Zinati, Zahra; Shamloo-Dashtpagerdi, Roohollah; Behpouri, Ali
As an aromatic and colorful plant of substantive taste, saffron (Crocus sativus L.) owes such properties of matter to growing class of the secondary metabolites derived from the carotenoids, apocarotenoids. Regarding the critical role of microRNAs in secondary metabolic synthesis and the limited number of identified miRNAs in C. sativus, on the other hand, one may see the point how the characterization of miRNAs along with the corresponding target genes in C. sativus might expand our perspectives on the roles of miRNAs in carotenoid/apocarotenoid biosynthetic pathway. A computational analysis was used to identify miRNAs and their targets using EST (Expressed Sequence Tag) library from mature saffron stigmas. Then, a gene co- expression network was constructed to identify genes which are potentially involved in carotenoid/apocarotenoid biosynthetic pathways. EST analysis led to the identification of two putative miRNAs (miR414 and miR837-5p) along with the corresponding stem- looped precursors. To our knowledge, this is the first report on miR414 and miR837-5p in C. sativus. Co-expression network analysis indicated that miR414 and miR837-5p may play roles in C. sativus metabolic pathways and led to identification of candidate genes including six transcription factors and one protein kinase probably involved in carotenoid/apocarotenoid biosynthetic pathway. Presence of transcription factors, miRNAs and protein kinase in the network indicated multiple layers of regulation in saffron stigma. The candidate genes from this study may help unraveling regulatory networks underlying the carotenoid/apocarotenoid biosynthesis in saffron and designing metabolic engineering for enhanced secondary metabolites. PMID:28261627
Full Text Available Dioscorea contains critically important species which can be used as staple foods or sources of bioactive substances, including Dioscorea nipponica, which has been used to develop highly successful drugs to treat cardiovascular disease. Its major active ingredients are thought to be sterol compounds such as diosgenin, which has been called “medicinal gold” because of its valuable properties. However, reliance on naturally growing plants as a production system limits the potential use of D. nipponica, raising interest in engineering metabolic pathways to enhance the production of secondary metabolites. However, the biosynthetic pathway of diosgenin is still poorly understood, and D. nipponica is poorly characterized at a molecular level, hindering in-depth investigation. In the present work, the RNAs from five organs and seven methyl jasmonate treated D. nipponica rhizomes were sequenced using the Illumina high-throughput sequencing platform, yielding 52 gigabases of data, which were pooled and assembled into a reference transcriptome. Four hundred and eighty two genes were found to be highly expressed in the rhizomes, and these genes are mainly involved in stress response and transcriptional regulation. Based on their expression patterns, 36 genes were selected for further investigation as candidate genes involved in dioscin biosynthesis. Constructing co-expression networks based on significant changes in gene expression revealed 15 gene modules. Of these, four modules with properties correlating to dioscin regulation and biosynthesis, consisting of 4,665 genes in total, were selected for further functional investigation. These results improve our understanding of dioscin biosynthesis in this important medicinal plant and will help guide more intensive investigations.
Venkata Suresh Bonthala
Full Text Available Bambara groundnut (Vigna subterranea (L. Verdc. is an African legume and is a promising underutilized crop with good seed nutritional values. Low temperature stress in a number of African countries at night, such as Botswana, can effect the growth and development of bambara groundnut, leading to losses in potential crop yield. Therefore, in this study we developed a computational pipeline to identify and analyze the genes and gene modules associated with low temperature stress responses in bambara groundnut using the cross-species microarray technique (as bambara groundnut has no microarray chip coupled with network-based analysis. Analyses of the bambara groundnut transcriptome using cross-species gene expression data resulted in the identification of 375 and 659 differentially expressed genes (p<0.01 under the sub-optimal (23°C and very sub-optimal (18°C temperatures, respectively, of which 110 genes are commonly shared between the two stress conditions. The construction of a Highest Reciprocal Rank-based gene co-expression network, followed by its partition using a Heuristic Cluster Chiseling Algorithm resulted in 6 and 7 gene modules in sub-optimal and very sub-optimal temperature stresses being identified, respectively. Modules of sub-optimal temperature stress are principally enriched with carbohydrate and lipid metabolic processes, while most of the modules of very sub-optimal temperature stress are significantly enriched with responses to stimuli and various metabolic processes. Several transcription factors (from MYB, NAC, WRKY, WHIRLY & GATA classes that may regulate the downstream genes involved in response to stimulus in order for the plant to withstand very sub-optimal temperature stress were highlighted. The identified gene modules could be useful in breeding for low-temperature stress tolerant bambara groundnut varieties.
Hollander, Markus; Hamed, Mohamed; Helms, Volkhard; Neininger, Kerstin
Mutations in genomic key elements can influence gene expression and function in various ways, and hence greatly contribute to the phenotype. We developed MutaNET to score the impact of individual mutations on gene regulation and function of a given genome. MutaNET performs statistical analyses of mutations in different genomic regions. The tool also incorporates the mutations in a provided gene regulatory network to estimate their global impact. The integration of a next-generation sequencing pipeline enables calling mutations prior to the analyses. As application example, we used MutaNET to analyze the impact of mutations in antibiotic resistance (AR) genes and their potential effect on AR of bacterial strains. MutaNET is freely available at https://sourceforge.net/projects/mutanet/. It is implemented in Python and supported on Mac OS X, Linux and MS Windows. Step-by-step instructions are available at http://service.bioinformatik.uni-saarland.de/mutanet/. email@example.com. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org
Siva K. Panguluri
Full Text Available Conditioned taste aversion (CTA is an adaptive behavior that benefits survival of animals including humans and also serves as a powerful model to study the neural mechanisms of learning. Memory formation is a necessary component of CTA learning and involves neural processing and regulation of gene expression in the amygdala. Many studies have been focused on the identification of intracellular signaling cascades involved in CTA, but not late responsive genes underlying the long-lasting behavioral plasticity. In this study, we explored in silico experiments to identify persistent changes in gene expression associated with CTA in rats. We used oligonucleotide microarrays to identify 248 genes in the amygdala regulated by CTA. Pathway Studio and IPA software analyses showed that the differentially expressed genes in the amygdala fall in diverse functional categories such as behavior, psychological disorders, nervous system development and function, and cell-to-cell signaling. Conditioned taste aversion is a complex behavioral trait which involves association of visceral and taste inputs, consolidation of taste and visceral information, memory formation, retrieval of stored information, and extinction phase. In silico analysis of differentially expressed genes is therefore necessary to manipulate specific phase/stage of CTA to understand the molecular insight.
Wang, Weijing; Jiang, Wenjie; Hou, Lin; Duan, Haiping; Wu, Yili; Xu, Chunsheng; Tan, Qihua; Li, Shuxia; Zhang, Dongfeng
The therapeutic management of obesity is challenging, hence further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis and weighted gene co-expression network analysis (WGCNA) to identify significant genes and specific modules related to BMI based on gene expression profile data of 7 discordant monozygotic twins. In the differential gene expression analysis, it appeared that 32 differentially expressed genes (DEGs) were with a trend of up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within GO database and NF-kappa B signaling pathway within KEGG database. DEGs of NAMPT, TLR9, PTGS2, HBD, and PCSK1N might be associated with obesity. In the WGCNA, among the total 20 distinct co-expression modules identified, coral1 module (68 genes) had the strongest positive correlation with BMI (r = 0.56, P = 0.04) and disease status (r = 0.56, P = 0.04). Categories of positive regulation of phospholipase activity, high-density lipoprotein particle clearance, chylomicron remnant clearance, reverse cholesterol transport, intermediate-density lipoprotein particle, chylomicron, low-density lipoprotein particle, very-low-density lipoprotein particle, voltage-gated potassium channel complex, cholesterol transporter activity, and neuropeptide hormone activity were significantly enriched within GO database for this module. And alcoholism and cell adhesion molecules pathways were significantly enriched within KEGG database. Several hub genes, such as GAL, ASB9, NPPB, TBX2, IL17C, APOE, ABCG4, and APOC2 were also identified. The module eigengene of saddlebrown module (212 genes) was also significantly
The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNAtranscripts in cell samples.We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution.We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.
Full Text Available Alzheimer’s disease (AD is a neurodegenerative disorder contributing to rapid decline in cognitive function and ultimately dementia. Most cases of AD occur in elderly and later years. There is a growing need for understanding the relationship between aging and AD to identify shared and unique hallmarks associated with the disease in a region and cell-type specific manner. Although genomic studies on AD have been performed extensively, the molecular mechanism of disease progression is still not clear. The major objective of our study is to obtain a higher-order network-level understanding of aging and AD, and their relationship using the hippocampal gene expression profiles of young (20–50 years, aging (70–99 years, and AD (70–99 years. The hippocampus is vulnerable to damage at early stages of AD and altered neurogenesis in the hippocampus is linked to the onset of AD. We combined the weighted gene co-expression network and weighted protein–protein interaction network-level approaches to study the transition from young to aging to AD. The network analysis revealed the organization of co-expression network into functional modules that are cell-type specific in aging and AD. We found that modules associated with astrocytes, endothelial cells and microglial cells are upregulated and significantly correlate with both aging and AD. The modules associated with neurons, mitochondria and endoplasmic reticulum are downregulated and significantly correlate with AD than aging. The oligodendrocytes module does not show significant correlation with neither aging nor disease. Further, we identified aging- and AD-specific interactions/subnetworks by integrating the gene expression with a human protein–protein interaction network. We found dysregulation of genes encoding protein kinases (FYN, SYK, SRC, PKC, MAPK1, ephrin receptors and transcription factors (FOS, STAT3, CEBPB, MYC, NFKβ, and EGR1 in AD. Further, we found genes that encode proteins
Full Text Available Complex diseases result from molecular changes induced by multiple genetic factors and the environment. To derive a systems view of how genetic loci interact in the context of tissue-specific molecular networks, we constructed an F2 intercross comprised of >500 mice from diabetes-resistant (B6 and diabetes-susceptible (BTBR mouse strains made genetically obese by the Leptin(ob/ob mutation (Lep(ob. High-density genotypes, diabetes-related clinical traits, and whole-transcriptome expression profiling in five tissues (white adipose, liver, pancreatic islets, hypothalamus, and gastrocnemius muscle were determined for all mice. We performed an integrative analysis to investigate the inter-relationship among genetic factors, expression traits, and plasma insulin, a hallmark diabetes trait. Among five tissues under study, there are extensive protein-protein interactions between genes responding to different loci in adipose and pancreatic islets that potentially jointly participated in the regulation of plasma insulin. We developed a novel ranking scheme based on cross-loci protein-protein network topology and gene expression to assess each gene's potential to regulate plasma insulin. Unique candidate genes were identified in adipose tissue and islets. In islets, the Alzheimer's gene App was identified as a top candidate regulator. Islets from 17-week-old, but not 10-week-old, App knockout mice showed increased insulin secretion in response to glucose or a membrane-permeant cAMP analog, in agreement with the predictions of the network model. Our result provides a novel hypothesis on the mechanism for the connection between two aging-related diseases: Alzheimer's disease and type 2 diabetes.
Chen, Juan; Xu, Juan; Li, Yongsheng; Zhang, Jinwen; Chen, Hong; Lu, Jianping; Wang, Zishan; Zhao, Xueying; Xu, Kang; Li, Yixue; Li, Xia; Zhang, Yan
Although competing endogenous RNAs (ceRNAs) have been implicated in many solid tumors, their roles in breast cancer subtypes are not well understood. We therefore generated a ceRNA network for each subtype based on the significance of both, positive co-expression and the shared miRNAs, in the corresponding subtype miRNA dys-regulatory network, which was constructed based on negative regulations between differentially expressed miRNAs and targets. All four subtype ceRNA networks exhibited scale-free architecture and showed that the common ceRNAs were at the core of the networks. Furthermore, the common ceRNA hubs had greater connectivity than the subtype-specific hubs. Functional analysis of the common subtype ceRNA hubs highlighted factors involved in proliferation, MAPK signaling pathways and tube morphogenesis. Subtype-specific ceRNA hubs highlighted unique subtype-specific pathways, like the estrogen response and inflammatory pathways in the luminal subtypes or the factors involved in the coagulation process that participates in the basal-like subtype. Ultimately, we identified 29 critical subtype-specific ceRNA hubs that characterized the different breast cancer subtypes. Our study thus provides new insight into the common and specific subtype ceRNA interactions that define the different categories of breast cancer and enhances our understanding of the pathology underlying the different breast cancer subtypes, which can have prognostic and therapeutic implications in the future.
Ray, Sumanta; Hossain, Sk Md Mosaddek; Khatun, Lutfunnesa; Mukhopadhyay, Anirban
Alzheimer's disease (AD) is a chronic neuro-degenerative disruption of the brain which involves in large scale transcriptomic variation. The disease does not impact every regions of the brain at the same time, instead it progresses slowly involving somewhat sequential interaction with different regions. Analysis of the expression patterns of the genes in different regions of the brain influenced in AD surely contribute for a enhanced comprehension of AD pathogenesis and shed light on the early characterization of the disease. Here, we have proposed a framework to identify perturbation and preservation characteristics of gene expression patterns across six distinct regions of the brain ("EC", "HIP", "PC", "MTG", "SFG", and "VCX") affected in AD. Co-expression modules were discovered considering a couple of regions at once. These are then analyzed to know the preservation and perturbation characteristics. Different module preservation statistics and a rank aggregation mechanism have been adopted to detect the changes of expression patterns across brain regions. Gene ontology (GO) and pathway based analysis were also carried out to know the biological meaning of preserved and perturbed modules. In this article, we have extensively studied the preservation patterns of co-expressed modules in six distinct brain regions affected in AD. Some modules are emerged as the most preserved while some others are detected as perturbed between a pair of brain regions. Further investigation on the topological properties of preserved and non-preserved modules reveals a substantial association amongst "betweenness centrality" and "degree" of the involved genes. Our findings may render a deeper realization of the preservation characteristics of gene expression patterns in discrete brain regions affected by AD.
Qiao, Liang; Cao, Minghao; Zheng, Jian; Zhao, Yihong; Zheng, Zhi-Liang
The ratio of sugars to organic acids, two of the major metabolites in fleshy fruits, has been considered the most important contributor to fruit sweetness. Although accumulation of sugars and acids have been extensively studied, whether plants evolve a mechanism to maintain, sense or respond to the fruit sugar/acid ratio remains a mystery. In a prior study, we used an integrated systems biology tool to identify a group of 39 acid-associated genes from the fruit transcriptomes in four sweet orange varieties (Citrus sinensis L. Osbeck) with varying fruit acidity, Succari (acidless), Bingtang (low acid), and Newhall and Xinhui (normal acid). We reanalyzed the prior sweet orange fruit transcriptome data, leading to the identification of 72 genes highly correlated with the fruit sugar/acid ratio. The majority of these sugar/acid ratio-related genes are predicted to be involved in regulatory functions such as transport, signaling and transcription or encode enzymes involved in metabolism. Surprisingly, only three of these sugar/acid ratio-correlated genes are weakly correlated with sugar level and none of them overlaps with the acid-associated genes. Weighted Gene Coexpression Network Analysis (WGCNA) has revealed that these genes belong to four modules, Blue, Grey, Brown and Turquoise, with the former two modules being unique to the sugar/acid ratio control. Our results indicate that orange fruits contain a possible mechanistically distinct class of genes that may potentially be involved in maintaining fruit sugar/acid ratios and/or responding to the cellular sugar/acid ratio status. Therefore, our analysis of orange transcriptomes provides an intriguing insight into the potentially novel genetic or molecular mechanisms controlling the sugar/acid ratio in fruits.
Reyes-Gibby, Cielito C; Melkonian, Stephanie C; Wang, Jian; Yu, Robert K; Shelburne, Samuel A; Lu, Charles; Gunn, Gary Brandon; Chambers, Mark S; Hanna, Ehab Y; Yeung, Sai-Ching J; Shete, Sanjay
Mucositis is a complex, dose-limiting toxicity of chemotherapy or radiotherapy that leads to painful mouth ulcers, difficulty eating or swallowing, gastrointestinal distress, and reduced quality of life for patients with cancer. Mucositis is most common for those undergoing high-dose chemotherapy and hematopoietic stem cell transplantation and for those being treated for malignancies of the head and neck. Treatment and management of mucositis remain challenging. It is expected that multiple genes are involved in the formation, severity, and persistence of mucositis. We used Ingenuity Pathway Analysis (IPA), a novel network-based approach that integrates complex intracellular and intercellular interactions involved in diseases, to systematically explore the molecular complexity of mucositis. As a first step, we searched the literature to identify genes that harbor or are close to the genetic variants significantly associated with mucositis. Our literature review identified 27 candidate genes, of which ERCC1, XRCC1, and MTHFR were the most frequently studied for mucositis. On the basis of this 27-gene list, we used IPA to generate gene networks for mucositis. The most biologically significant novel molecules identified through IPA analyses included TP53, CTNNB1, MYC, RB1, P38 MAPK, and EP300. Additionally, uracil degradation II (reductive) and thymine degradation pathways (p = 1.06-08) were most significant. Finally, utilizing 66 SNPs within the 8 most connected IPA-derived candidate molecules, we conducted a genetic association study for oral mucositis in the head and neck cancer patients who were treated using chemotherapy and/or radiation therapy (186 head and neck cancer patients with oral mucositis vs. 699 head and neck cancer patients without oral mucositis). The top ranked gene identified through this association analysis was RB1 (rs2227311, p-value = 0.034, odds ratio = 0.67). In conclusion, gene network analysis identified novel molecules and biological
Cielito C Reyes-Gibby
Full Text Available Mucositis is a complex, dose-limiting toxicity of chemotherapy or radiotherapy that leads to painful mouth ulcers, difficulty eating or swallowing, gastrointestinal distress, and reduced quality of life for patients with cancer. Mucositis is most common for those undergoing high-dose chemotherapy and hematopoietic stem cell transplantation and for those being treated for malignancies of the head and neck. Treatment and management of mucositis remain challenging. It is expected that multiple genes are involved in the formation, severity, and persistence of mucositis. We used Ingenuity Pathway Analysis (IPA, a novel network-based approach that integrates complex intracellular and intercellular interactions involved in diseases, to systematically explore the molecular complexity of mucositis. As a first step, we searched the literature to identify genes that harbor or are close to the genetic variants significantly associated with mucositis. Our literature review identified 27 candidate genes, of which ERCC1, XRCC1, and MTHFR were the most frequently studied for mucositis. On the basis of this 27-gene list, we used IPA to generate gene networks for mucositis. The most biologically significant novel molecules identified through IPA analyses included TP53, CTNNB1, MYC, RB1, P38 MAPK, and EP300. Additionally, uracil degradation II (reductive and thymine degradation pathways (p = 1.06-08 were most significant. Finally, utilizing 66 SNPs within the 8 most connected IPA-derived candidate molecules, we conducted a genetic association study for oral mucositis in the head and neck cancer patients who were treated using chemotherapy and/or radiation therapy (186 head and neck cancer patients with oral mucositis vs. 699 head and neck cancer patients without oral mucositis. The top ranked gene identified through this association analysis was RB1 (rs2227311, p-value = 0.034, odds ratio = 0.67. In conclusion, gene network analysis identified novel molecules and
Chen, Wen; Zhang, Xuan; Li, Jing; Huang, Shulan; Xiang, Shuanglin; Hu, Xiang; Liu, Changning
Zebrafish is a full-developed model system for studying development processes and human disease. Recent studies of deep sequencing had discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish. However, only few of them had been functionally characterized. Therefore, how to take advantage of the mature zebrafish system to deeply investigate the lncRNAs' function and conservation is really intriguing. We systematically collected and analyzed a series of zebrafish RNA-seq data, then combined them with resources from known database and literatures. As a result, we obtained by far the most complete dataset of zebrafish lncRNAs, containing 13,604 lncRNA genes (21,128 transcripts) in total. Based on that, a co-expression network upon zebrafish coding and lncRNA genes was constructed and analyzed, and used to predict the Gene Ontology (GO) and the KEGG annotation of lncRNA. Meanwhile, we made a conservation analysis on zebrafish lncRNA, identifying 1828 conserved zebrafish lncRNA genes (1890 transcripts) that have their putative mammalian orthologs. We also found that zebrafish lncRNAs play important roles in regulation of the development and function of nervous system; these conserved lncRNAs present a significant sequential and functional conservation, with their mammalian counterparts. By integrative data analysis and construction of coding-lncRNA gene co-expression network, we gained the most comprehensive dataset of zebrafish lncRNAs up to present, as well as their systematic annotations and comprehensive analyses on function and conservation. Our study provides a reliable zebrafish-based platform to deeply explore lncRNA function and mechanism, as well as the lncRNA commonality between zebrafish and human.
Full Text Available Xian-guo Zhou,1,2,* Xiao-liang Huang,1,2,* Si-yuan Liang,1–3 Shao-mei Tang,1,2 Si-kao Wu,1,2 Tong-tong Huang,1,2 Zeng-nan Mo,1,2,4 Qiu-yan Wang1,2,5 1Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, People’s Republic of China; 2Guangxi Key Laboratory for Genomic and Personalized Medicine, Guangxi Collaborative Innovation Center for Genomic and Personalized Medicine, Nanning, Guangxi Zhuang Autonomous Region, People’s Republic of China; 3Department of Colorectal Surgery, First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, People’s Republic of China; 4Department of Urology and Nephrology, The First Affiliated Hospital of Guangxi, Medical University, Nanning, Guangxi Zhuang Autonomous Region, People’s Republic of China; 5Guangxi Colleges and Universities Key Laboratory of Biological Molecular Medicine Research, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, People’s Republic of China *These authors contributed equally to this work Introduction: Colorectal cancer (CRC is the fourth most common cause of cancer-related mortality worldwide. The tumor, node, metastasis (TNM stage remains the standard for CRC prognostication. Identification of meaningful microRNA (miRNA and gene modules or representative biomarkers related to the pathological stage of colon cancer helps to predict prognosis and reveal the mechanisms behind cancer progression.Materials and methods: We applied a systems biology approach by combining differential expression analysis and weighted gene co-expression network analysis (WGCNA to detect the pathological stage-related miRNA and gene modules and construct a miRNA–gene network. The Cancer Genome Atlas (TCGA colon adenocarcinoma (CAC RNA-sequencing data and miRNA-sequencing data were subjected to WGCNA analysis, and the GSE29623, GSE35602 and GSE39396 were utilized to validate and
Garrett, K A; Andersen, K F; Asche, F; Bowden, R L; Forbes, G A; Kulakow, P A; Zhou, B
Resistance genes are a major tool for managing crop diseases. The networks of crop breeders who exchange resistance genes and deploy them in varieties help to determine the global landscape of resistance and epidemics, an important system for maintaining food security. These networks function as a complex adaptive system, with associated strengths and vulnerabilities, and implications for policies to support resistance gene deployment strategies. Extensions of epidemic network analysis can be used to evaluate the multilayer agricultural networks that support and influence crop breeding networks. Here, we evaluate the general structure of crop breeding networks for cassava, potato, rice, and wheat. All four are clustered due to phytosanitary and intellectual property regulations, and linked through CGIAR hubs. Cassava networks primarily include public breeding groups, whereas others are more mixed. These systems must adapt to global change in climate and land use, the emergence of new diseases, and disruptive breeding technologies. Research priorities to support policy include how best to maintain both diversity and redundancy in the roles played by individual crop breeding groups (public versus private and global versus local), and how best to manage connectivity to optimize resistance gene deployment while avoiding risks to the useful life of resistance genes. [Formula: see text] Copyright © 2017 The Author(s). This is an open access article distributed under the CC BY 4.0 International license .
Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun
High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in the functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make users difficult to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen achieved a superior performance than the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when accessed to the curated literatures. NetGen has been implemented in the R package CopTea publicly available at GitHub ( http://github.com/wulingyun/CopTea/ ). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.
Comparative transcriptome and gene co-expression network analysis reveal genes and signaling pathways adaptively responsive to varied adverse stresses in the insect fungal pathogen, Beauveria bassiana.
He, Zhangjiang; Zhao, Xin; Lu, Zhuoyue; Wang, Huifang; Liu, Pengfei; Zeng, Fanqin; Zhang, Yongjun
Sensing, responding, and adapting to the surrounding environment are crucial for all living organisms to survive, proliferate, and differentiate in their biological niches. Beauveria bassiana is an economically important insect-pathogenic fungus which is widely used as a biocontrol agent to control a variety of insect pests. The fungal pathogen unavoidably encounters a variety of adverse environmental stresses and defense response from the host insects during application of the fungal agents. However, few are known about the transcription response of the fungus to respond or adapt varied adverse stresses. Here, we comparatively analyzed the transcriptome of B. bassiana in globe genome under the varied stationary-phase stresses including osmotic agent (0.8 M NaCl), high temperature (32 °C), cell wall-perturbing agent (Congo red), and oxidative agents (H 2 O 2 or menadione). Total of 12,412 reads were obtained, and mapped to the 6767 genes of the B. bassiana. All of these stresses caused transcription responses involved in basal metabolism, cell wall construction, stress response or cell rescue/detoxification, signaling transduction and gene transcription regulation, and likely other cellular processes. An array of genes displayed similar transcription patterns in response to at least two of the five stresses, suggesting a shared transcription response to varied adverse stresses. Gene co-expression network analysis revealed that mTOR signaling pathway, but not HOG1 MAP kinase pathway, played a central role in regulation the varied adverse stress responses, which was verified by RNAi-mediated knockdown of TOR1. Our findings provided an insight of transcription response and gene co-expression network of B. bassiana in adaptation to varied environments. Copyright © 2017 Elsevier Inc. All rights reserved.
Edwards, R.; Glass, L.
The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such
Baur, Brittany; Bozdag, Serdar
One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
The gene regulation network (GRN) evaluates the interactions between genes and look for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRN. In this article, we propose a nonparametric differential equation (ODE) to model the nonlinear dynamic GRN. Specially, we address following questions simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application examples.
Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R
The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to ( i ) discover novel candidate associations between different genes or proteins in the network, and ( ii ) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
Jia, Peilin; Zhao, Zhongming
A major challenge in interpreting the large volume of mutation data identified by next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations to facilitate the identification of targetable genes and new drugs. Current approaches are primarily based on mutation frequencies of single-genes, which lack the power to detect infrequently mutated driver genes and ignore functional interconnection and regulation among cancer genes. We propose a novel mutation network method, VarWalker, to prioritize driver genes in large scale cancer mutation data. VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutation genes and their close interactors. These interactors are selected and optimized using the Random Walk with Restart algorithm in a protein-protein interaction network. We applied the method in >300 tumor genomes in two large-scale NGS benchmark datasets: 183 lung adenocarcinoma samples and 121 melanoma samples. In each cancer, we derived a consensus mutation subnetwork containing significantly enriched consensus cancer genes and cancer-related functional pathways. These cancer-specific mutation networks were then validated using independent datasets for each cancer. Importantly, VarWalker prioritizes well-known, infrequently mutated genes, which are shown to interact with highly recurrently mutated genes yet have been ignored by conventional single-gene-based approaches. Utilizing VarWalker, we demonstrated that network-assisted approaches can be effectively adapted to facilitate the detection of cancer driver genes in NGS data.
Zhou Qing; Plath Kathrin; Fan Guoping; Mason Mike J; Horvath Steve
Abstract Background Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we...
Aalt D J van Dijk
Full Text Available Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor-target gene interactions but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive. In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.
Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand
Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. Published by Elsevier Ltd.
Bordeaux John M
Full Text Available Abstract Background Global transcriptional analysis of loblolly pine (Pinus taeda L. is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes. Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01. Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs including those with significant homology (E-values ≤ 2 × 10-30 to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in
Background Global transcriptional analysis of loblolly pine (Pinus taeda L.) is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes). Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01). Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs) including those with significant homology (E-values ≤ 2 × 10-30) to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in roots. Many of the
Reyes-Gibby, Cielito C; Yuan, Christine; Wang, Jian; Yeung, Sai-Ching J; Shete, Sanjay
Addictions to alcohol and tobacco, known risk factors for cancer, are complex heritable disorders. Addictive behaviors have a bidirectional relationship with pain. We hypothesize that the associations between alcohol, smoking, and opioid addiction observed in cancer patients have a genetic basis. Therefore, using bioinformatics tools, we explored the underlying genetic basis and identified new candidate genes and common biological pathways for smoking, alcohol, and opioid addiction. Literature search showed 56 genes associated with alcohol, smoking and opioid addiction. Using Core Analysis function in Ingenuity Pathway Analysis software, we found that ERK1/2 was strongly interconnected across all three addiction networks. Genes involved in immune signaling pathways were shown across all three networks. Connect function from IPA My Pathway toolbox showed that DRD2 is the gene common to both the list of genetic variations associated with all three addiction phenotypes and the components of the brain neuronal signaling network involved in substance addiction. The top canonical pathways associated with the 56 genes were: 1) calcium signaling, 2) GPCR signaling, 3) cAMP-mediated signaling, 4) GABA receptor signaling, and 5) G-alpha i signaling. Cancer patients are often prescribed opioids for cancer pain thus increasing their risk for opioid abuse and addiction. Our findings provide candidate genes and biological pathways underlying addiction phenotypes, which may be future targets for treatment of addiction. Further study of the variations of the candidate genes could allow physicians to make more informed decisions when treating cancer pain with opioid analgesics.
Full Text Available Mammalian genomes contain several dozens of large (>0.5 Mbp lineage-specific gene loci harbouring functionally related genes. However, spatial chromatin folding, organization of the enhancer-promoter networks and their relevance to Topologically Associating Domains (TADs in these loci remain poorly understood. TADs are principle units of the genome folding and represents the DNA regions within which DNA interacts more frequently and less frequently across the TAD boundary. Here, we used Chromatin Conformation Capture Carbon Copy (5C technology to characterize spatial chromatin interaction network in the 3.1 Mb Epidermal Differentiation Complex (EDC locus harbouring 61 functionally related genes that show lineage-specific activation during terminal keratinocyte differentiation in the epidermis. 5C data validated by 3D-FISH demonstrate that the EDC locus is organized into several TADs showing distinct lineage-specific chromatin interaction networks based on their transcription activity and the gene-rich or gene-poor status. Correlation of the 5C results with genome-wide studies for enhancer-specific histone modifications (H3K4me1 and H3K27ac revealed that the majority of spatial chromatin interactions that involves the gene-rich TADs at the EDC locus in keratinocytes include both intra- and inter-TAD interaction networks, connecting gene promoters and enhancers. Compared to thymocytes in which the EDC locus is mostly transcriptionally inactive, these interactions were found to be keratinocyte-specific. In keratinocytes, the promoter-enhancer anchoring regions in the gene-rich transcriptionally active TADs are enriched for the binding of chromatin architectural proteins CTCF, Rad21 and chromatin remodeler Brg1. In contrast to gene-rich TADs, gene-poor TADs show preferential spatial contacts with each other, do not contain active enhancers and show decreased binding of CTCF, Rad21 and Brg1 in keratinocytes. Thus, spatial interactions between gene
Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. It is warrant a method that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions In this paper, we develop nDGE to prioritize
Full Text Available Rapeseed (Brassica napus is an important oil seed crop, providing more than 13% of the world’s supply of edible oils. An in-depth knowledge of the gene network involved in biosynthesis and accumulation of seed oil is critical for the improvement of B. napus. Using available genomic and transcriptomic resources, we identified 1,750 acyl lipid metabolism (ALM genes that are distributed over 19 chromosomes in the B. napus genome. B. rapa and B. oleracea, two diploid progenitors of B. napus, contributed almost equally to the ALM genes. Genome collinearity analysis demonstrated that the majority of the ALM genes have arisen due to genome duplication or segmental duplication events. In addition, we profiled the expression patterns of the ALM genes in four different developmental stages. Furthermore, we developed two B. napus near isogenic lines (NILs. The high oil NIL, YC13-559, accumulates more than 10% of seed oil compared to the other, YC13-554. Comparative gene expression analysis revealed upregulation of lipid biosynthesis-related regulatory genes in YC13-559, including SHOOTMERISTEMLESS, LEAFY COTYLEDON 1 (LEC1, LEC2, FUSCA3, ABSCISIC ACID INSENSITIVE 3 (ABI3, ABI4, ABI5, and WRINKLED1, as well as structural genes, such as ACETYL-CoA CARBOXYLASE, ACYL-CoA DIACYLGLYCEROL ACYLTRANSFERASE, and LONG-CHAIN ACYL-CoA SYNTHETASES. We observed that several genes related to the phytohormones, gibberellins, jasmonate, and indole acetic acid, were differentially expressed in the NILs. Our findings provide a broad account of the numbers, distribution, and expression profiles of acyl lipid metabolism genes, as well as gene networks that potentially control oil accumulation in B. napus seeds. The upregulation of key regulatory and structural genes related to lipid biosynthesis likely plays a major role for the increased seed oil in YC13-559.
Fath, B.D.; Scharler, U.M.; Ulanowicz, R.E.; Hannon, B.
Ecological network analysis (ENA) is a systems-oriented methodology to analyze within system interactions used to identify holistic properties that are otherwise not evident from the direct observations. Like any analysis technique, the accuracy of the results is as good as the data available, but
Gu, Yunyan; Wang, Hongwei; Qin, Yao; Zhang, Yujing; Zhao, Wenyuan; Qi, Lishuang; Zhang, Yuannv; Wang, Chenguang; Guo, Zheng
The heterogeneity of genetic alterations in human cancer genomes presents a major challenge to advancing our understanding of cancer mechanisms and identifying cancer driver genes. To tackle this heterogeneity problem, many approaches have been proposed to investigate genetic alterations and predict driver genes at the individual pathway level. However, most of these approaches ignore the correlation of alteration events between pathways and miss many genes with rare alterations collectively contributing to carcinogenesis. Here, we devise a network-based approach to capture the cooperative functional modules hidden in genome-wide somatic mutation and copy number alteration profiles of glioblastoma (GBM) from The Cancer Genome Atlas (TCGA), where a module is a set of altered genes with dense interactions in the protein interaction network. We identify 7 pairs of significantly co-altered modules that involve the main pathways known to be altered in GBM (TP53, RB and RTK signaling pathways) and highlight the striking co-occurring alterations among these GBM pathways. By taking into account the non-random correlation of gene alterations, the property of co-alteration could distinguish oncogenic modules that contain driver genes involved in the progression of GBM. The collaboration among cancer pathways suggests that the redundant models and aggravating models could shed new light on the potential mechanisms during carcinogenesis and provide new indications for the design of cancer therapeutic strategies.
Full Text Available Abstract Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.
Li, Chengxian; Liu, Haihong; Zhang, Tonghua; Yan, Fang
In this paper, a gene regulatory network mediated by small noncoding RNA involving two time delays and diffusion under the Neumann boundary conditions is studied. Choosing the sum of delays as the bifurcation parameter, the stability of the positive equilibrium and the existence of spatially homogeneous and spatially inhomogeneous periodic solutions are investigated by analyzing the corresponding characteristic equation. It is shown that the sum of delays can induce Hopf bifurcation and the diffusion incorporated into the system can effect the amplitude of periodic solutions. Furthermore, the spatially homogeneous periodic solution always exists and the spatially inhomogeneous periodic solution will arise when the diffusion coefficients of protein and mRNA are suitably small. Particularly, the small RNA diffusion coefficient is more robust and its effect on model is much less than protein and mRNA. Finally, the explicit formulae for determining the direction of Hopf bifurcation and the stability of the bifurcating periodic solutions are derived by employing the normal form theory and center manifold theorem for partial functional differential equations. Finally, numerical simulations are carried out to illustrate our theoretical analysis.
Sugimoto, Manabu; Gusev, Oleg; Wheeler, Raymond; Levinskikh, Margarita; Sychev, Vladimir; Bingham, Gail; Hummerick, Mary; Oono, Youko; Matsumoto, Takashi; Yazawa, Takayuki
We have developed a plant growth system, namely Lada, which was installed in ISS to study and grow plants, including vegetables in a spaceflight environment. We have succeeded in cultivating Mizuna, tomato, pea, radish, wheat, rice, and barley in long-term spaceflight. Transcription levels of superoxide dismutase, glutamyl transferase, catalase, and ascorbate peroxidase were increased in the barley germinated and grown for 26 days in Lada, though the whole-plant growth and development of the barley in spaceflight were the same as in the ground control barley. In this study, we investigated the response of the ROS gene network in Mizuna, Brassica rapa var. nipposinica, cultivated under spaceflight condition. Seeds of Mizuna were sown in the root module of LADA aboard the Zvezda module of ISS and the seedlings were grown under 24h lighting in the leaf chamber. After 27 days of cultivation, the plants were harvested and stored at -80(°) C in MELFI aboard the Destiny module, and were transported to the ground at < -20(°) C in GLACIER aboard Space Shuttle. Ground control cultivation was carried out under the same conditions in LADA. Total RNA isolated from leaves was subjected to mRNA-Seq using next generation sequencing (NGS) technology. A total of 20 in 32 ROS oxidative marker genes were up-regulated, including high expression of four hallmarks, and preferentially expressed genes associated with ROS-scavenging including thioredoxin, glutaredoxin, and alternative oxidase genes. In the transcription factors of the ROS gene network, MEKK1-MKK4-MPK3, OXI1-MKK4-MPK3, and OXI1-MPK3 of MAP cascades, induction of WRKY22 by MEKK1-MKK4-MPK3 cascade, induction of WRKY25 and repression of Zat7 by Zat12 were suggested. These results revealed that the spaceflight environment induced oxidative stress and the ROS gene network activation in the space-grown Mizuna.
Lopes, Miguel; Kutlu, Burak; Miani, Michela
Type 1 Diabetes (T1D) is an autoimmune disease where local release of cytokines such as IL-1β and IFN-γ contributes to β-cell apoptosis. To identify relevant genes regulating this process we performed a meta-analysis of 8 datasets of β-cell gene expression after exposure to IL-1β and IFN-γ. Two...
Fondi, Marco; Karkman, Antti; Tamminen, Manu V.; Bosi, Emanuele; Virta, Marko; Fani, Renato; Alm, Eric; McInerney, James O.
The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as “everything is everywhere but the environment selects.” While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are not influenced nearly as much by geography as ecology, thus extending the Bass Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate by which genes colonize new habitats when new selective forces emerge. PMID:27190206
Fan, Shengjun; Pan, Zhenyu; Geng, Qiang; Li, Xin; Wang, Yefan; An, Yu; Xu, Yan; Tie, Lu; Pan, Yan; Li, Xuejun
Ulcerative colitis (UC) was the most frequently diagnosed inflammatory bowel disease (IBD) and closely linked to colorectal carcinogenesis. By far, the underlying mechanisms associated with the disease are still unclear. With the increasing accumulation of microarray gene expression profiles, it is profitable to gain a systematic perspective based on gene regulatory networks to better elucidate the roles of genes associated with disorders. However, a major challenge for microarray data analysis is the integration of multiple-studies generated by different groups. In this study, firstly, we modeled a signaling regulatory network associated with colorectal cancer (CRC) initiation via integration of cross-study microarray expression data sets using Empirical Bayes (EB) algorithm. Secondly, a manually curated human cancer signaling map was established via comprehensive retrieval of the publicly available repositories. Finally, the co-differently-expressed genes were manually curated to portray the layered signaling regulatory networks. Overall, the remodeled signaling regulatory networks were separated into four major layers including extracellular, membrane, cytoplasm and nucleus, which led to the identification of five core biological processes and four signaling pathways associated with colorectal carcinogenesis. As a result, our biological interpretation highlighted the importance of EGF/EGFR signaling pathway, EPO signaling pathway, T cell signal transduction and members of the BCR signaling pathway, which were responsible for the malignant transition of CRC from the benign UC to the aggressive one. The present study illustrated a standardized normalization approach for cross-study microarray expression data sets. Our model for signaling networks construction was based on the experimentally-supported interaction and microarray co-expression modeling. Pathway-based signaling regulatory networks analysis sketched a directive insight into colorectal carcinogenesis
Full Text Available Vein graft failure occurs between 1 and 6 months after implantation due to obstructive intimal hyperplasia, related in part to implantation injury. The cell-specific and temporal response of the transcriptome to vein graft implantation injury was determined by transcriptional profiling of laser capture microdissected endothelial cells (EC and medial smooth muscle cells (SMC from canine vein grafts, 2 hours (H to 30 days (D following surgery. Our results demonstrate a robust genomic response beginning at 2 H, peaking at 12-24 H, declining by 7 D, and resolving by 30 D. Gene ontology and pathway analyses of differentially expressed genes indicated that implantation injury affects inflammatory and immune responses, apoptosis, mitosis, and extracellular matrix reorganization in both cell types. Through backpropagation an integrated network was built, starting with genes differentially expressed at 30 D, followed by adding upstream interactive genes from each prior time-point. This identified significant enrichment of IL-6, IL-8, NF-κB, dendritic cell maturation, glucocorticoid receptor, and Triggering Receptor Expressed on Myeloid Cells (TREM-1 signaling, as well as PPARα activation pathways in graft EC and SMC. Interactive network-based analyses identified IL-6, IL-8, IL-1α, and Insulin Receptor (INSR as focus hub genes within these pathways. Real-time PCR was used for the validation of two of these genes: IL-6 and IL-8, in addition to Collagen 11A1 (COL11A1, a cornerstone of the backpropagation. In conclusion, these results establish causality relationships clarifying the pathogenesis of vein graft implantation injury, and identifying novel targets for its prevention.
Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin
Discovering novel genes that are involved human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and only cover less half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of interested diseases. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on similarity network fusion (SNF) method. Finally, we employee the random walk with restart algorithm on the phenotype-gene bilayer network, which combines phenotype similarity network, IGSN as well as phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB is benefited from IGSN which has a wider coverage and higher reliability comparing with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that supported by literature. RWRB is an effective and reliable algorithm in prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
Yunoki, Tatsuya; Tabuchi, Yoshiaki; Hayashi, Atsushi; Kondo, Takashi
BCL2-associated athanogene 3 (BAG3), a co-chaperone of the heat shock 70 kDa protein (HSPA) family of proteins, is a cytoprotective protein that acts against various stresses, including heat stress. The aim of the present study was to identify gene networks involved in the enhancement of hyperthermia (HT) sensitivity by the knockdown (KD) of BAG3 in human oral squamous cell carcinoma (OSCC) cells. Although a marked elevation in the protein expression of BAG3 was detected in human the OSCC HSC-3 cells exposed to HT at 44˚C for 90 min, its expression was almost completely suppressed in the cells transfected with small interfering RNA against BAG3 (siBAG) under normal and HT conditions. The silencing of BAG3 also enhanced the cell death that was increased in the HSC-3 cells by exposure to HT. Global gene expression analysis revealed many genes that were differentially expressed by >2-fold in the cells exposed to HT and transfected with siBAG. Moreover, Ingenuity® pathways analysis demonstrated two unique gene networks, designated as Pro-cell death and Anti-cell death, which were obtained from upregulated genes and were mainly associated with the biological functions of induction and the prevention of cell death, respectively. Of note, the expression levels of genes in the Pro-cell death and Anti-cell death gene networks were significantly elevated and reduced in the HT + BAG3-KD group compared to those in the HT control group, respectively. These results provide further insight into the molecular mechanisms involved in the enhancement of HT sensitivity by the silencing of BAG3 in human OSCC cells.
Full Text Available Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus liberibacter asiaticus (CaLas, especially the immune disregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein - protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation, sucrose metabolism (upregulation, and starch biosynthesis (upregulation. In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70 was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur.
Chen, Yulong; Su, Zhiguang
Establishing a systematic network is aimed at finding essential human gene-gene/gene-disease pathway by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes including HAHDA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. We observed that the highly efficient-interlinked genes connected with ACADS are HAHDA, HADHB, ECHS1 and ACAT1. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HAHDA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1 together with ACADS indulges in ketone metabolism. Our computational gene web analysis also predicts potential candidate disease recognition, thus indicating the involvement of ACADS, HAHDA, HADHB, ECHS1 and ACAT1 not only with lipid metabolism but also with infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of ACADS network, its functional strategies and candidate disease approach associated with ACADS network. Copyright © 2015 Elsevier B.V. All rights reserved.
Zhang, Hui; Li, Tangxin; Zheng, Linqing
Oral squamous cell carcinoma is one of the most malignant tumors with high mortality rate worldwide. Biomarker discovery is critical for early diagnosis and precision treatment of this disease. MicroRNAs are small noncoding RNA molecules which often regulate essential biological processes and are good candidates for biomarkers. By integrative analysis of both the cancer-associated gene expression data and microRNA-mRNA network, miR-148b-3p, miR-629-3p, miR-27a-3p, and miR-142-3p were screened as novel diagnostic biomarkers for oral squamous cell carcinoma based on their unique regulatory abilities in the network structure of the conditional microRNA-mRNA network and their important functions. These findings were confirmed by literature verification and functional enrichment analysis. Future experimental validation is expected for the further investigation of their molecular mechanisms. PMID:29098014
Full Text Available Neurological disorders are known to show similar phenotypic manifestations like anxiety, depression, and cognitive impairment. There is a need to identify shared genetic markers and molecular pathways in these diseases, which lead to such comorbid conditions. Our study aims to prioritize novel genetic markers that might increase the susceptibility of patients affected with one neurological disorder to other diseases with similar manifestations. Identification of pathways involving common candidate markers will help in the development of improved diagnosis and treatments strategies for patients affected with neurological disorders. This systems biology study for the first time integratively uses 3D-structural protein interface descriptors and network topological properties that characterize proteins in a neurological protein interaction network, to aid the identification of genes that are previously not known to be shared between these diseases. Results of protein prioritization by machine learning have identified known as well as new genetic markers which might have direct or indirect involvement in several neurological disorders. Important gene hubs have also been identified that provide an evidence for shared molecular pathways in the neurological disease network.
Kaur, Surleen; Archer, Kellie J; Devi, M Gouri; Kriplani, Alka; Strauss, Jerome F; Singh, Rita
Polycystic ovary syndrome (PCOS) is a heterogeneous, genetically complex, endocrine disorder of uncertain etiology in women. Our aim was to compare the gene expression profiles in stimulated granulosa cells of PCOS women with and without insulin resistance vs. matched controls. This study included 12 normal ovulatory women (controls), 12 women with PCOS without evidence for insulin resistance (PCOS non-IR), and 16 women with insulin resistance (PCOS-IR) undergoing in vitro fertilization. Granulosa cell gene expression profiling was accomplished using Affymetrix Human Genome-U133 arrays. Differentially expressed genes were classified according to gene ontology using ingenuity pathway analysis tools. Microarray results for selected genes were confirmed by real-time quantitative PCR. A total of 211 genes were differentially expressed in PCOS non-IR and PCOS-IR granulosa cells (fold change≥1.5; P≤0.001) vs. matched controls. Diabetes mellitus and inflammation genes were significantly increased in PCOS-IR patients. Real-time quantitative PCR confirmed higher expression of NCF2 (2.13-fold), TCF7L2 (1.92-fold), and SERPINA1 (5.35-fold). Increased expression of inflammation genes ITGAX (3.68-fold) and TAB2 (1.86-fold) was confirmed in PCOS non-IR. Different cardiometabolic disease genes were differentially expressed in the two groups. Decreased expression of CAV1 (-3.58-fold) in PCOS non-IR and SPARC (-1.88-fold) in PCOS-IR was confirmed. Differential expression of genes involved in TGF-β signaling (IGF2R, increased; and HAS2, decreased), and oxidative stress (TXNIP, increased) was confirmed in both groups. Microarray analysis demonstrated differential expression of genes linked to diabetes mellitus, inflammation, cardiovascular diseases, and infertility in the granulosa cells of PCOS women with and without insulin resistance. Because these dysregulated genes are also involved in oxidative stress, lipid metabolism, and insulin signaling, we hypothesize that these
Subramanian, Devika; Natarajan, Jeyakumar
Staphylococcus aureus is a major human pathogen and ramoplanin is an antimicrobial attributed for effective treatment. The goal of this study was to examine the transcriptomic profiles of ramoplanin sensitive and resistant S. aureus to identify putative modules responsible for virulence and resistance-mechanisms and its characteristic novel genes. The dysregulated genes were used to reconstruct protein functional association networks for virulence-factors and resistance-mechanisms individually. Strong link between metabolic-pathways and development of virulence/resistance is suggested. We identified 15 putative modules of virulence factors. Six hypothetical genes were annotated with novel virulence activity among which SACOL0281 was discovered to be an essential virulence factor EsaD. The roles of MazEF toxin-antitoxin system, SACOL0202/SACOL0201 two-component system and that of amino-sugar and nucleotide-sugar metabolism in virulence are also suggested. In addition, 14 putative modules of resistance mechanisms including modules of ribosomal protein-coding genes and metabolic pathways such as biotin-synthesis, TCA-cycle, riboflavin-biosynthesis, peptidoglycan-biosynthesis etc. are also indicated. Copyright © 2015 Elsevier B.V. All rights reserved.
Reppe, Sjur; Datta, Harish K; Gautvik, Kaare M
The skeleton is a metabolically active organ throughout life where specific bone cell activity and paracrine/endocrine factors regulate its morphogenesis and remodeling. In recent years, an increasing number of reports have used multi-omics technologies to characterize subsets of bone biological molecular networks. The skeleton is affected by primary and secondary disease, lifestyle and many drugs. Therefore, to obtain relevant and reliable data from well characterized patient and control cohorts are vital. Here we provide a brief overview of omics studies performed on human bone, of which our own studies performed on trans-iliacal bone biopsies from postmenopausal women with osteoporosis (OP) and healthy controls are among the first and largest. Most other studies have been performed on smaller groups of patients, undergoing hip replacement for osteoarthritis (OA) or fracture, and without healthy controls. The major findings emerging from the combined studies are: 1. Unstressed and stressed bone show profoundly different gene expression reflecting differences in bone turnover and remodeling and 2. Omics analyses comparing healthy/OP and control/OA cohorts reveal characteristic changes in transcriptomics, epigenomics (DNA methylation), proteomics and metabolomics. These studies, together with genome-wide association studies, in vitro observations and transgenic animal models have identified a number of genes and gene products that act via Wnt and other signaling systems and are highly associated to bone density and fracture. Future challenge is to understand the functional interactions between bone-related molecular networks and their significance in OP and OA pathogenesis, and also how the genomic architecture is affected in health and disease. Copyright © 2017 Elsevier Inc. All rights reserved.
Joshi, Tejal; Elias, Daniel; Stenvang, Jan
and 14-3-3 family genes. Integrating the inferred miRNA-target relationships, we investigated the functional importance of 2 central genes, SNAI2 and FYN, which showed increased expression in TamR cells, while their corresponding regulatory miRNA were downregulated. Using specific chemical inhibitors......Tamoxifen is an effective anti-estrogen treatment for patients with estrogen receptor-positive (ER+) breast cancer, however, tamoxifen resistance is frequently observed. To elucidate the underlying molecular mechanisms of tamoxifen resistance, we performed a systematic analysis of miRNA......-mediated gene regulation in three clinically-relevant tamoxifen-resistant breast cancer cell lines (TamRs) compared to their parental tamoxifen-sensitive cell line. Alterations in the expression of 131 miRNAs in tamoxifen-resistant vs. parental cell lines were identified, 22 of which were common to all Tam...
El-Sharkawy, Islam; Liang, Dong; Xu, Kenong
Using RNA-seq, this study analysed an apple (Malus×domestica) anthocyanin-deficient yellow-skin somatic mutant 'Blondee' (BLO) and its red-skin parent 'Kidd's D-8' (KID), the original name of 'Gala', to understand the molecular mechanisms underlying the mutation. A total of 3299 differentially expressed genes (DEGs) were identified between BLO and KID at four developmental stages and/or between two adjacent stages within BLO and/or KID. A weighted gene co-expression network analysis (WGCNA) of the DEGs uncovered a network module of 34 genes highly correlated (r=0.95, P=9.0×10(-13)) with anthocyanin contents. Although 12 of the 34 genes in the WGCNA module were characterized and known of roles in anthocyanin, the remainder 22 appear to be novel. Examining the expression of ten representative genes in the module in 14 diverse apples revealed that at least eight were significantly correlated with anthocyanin variation. MdMYB10 (MDP0000259614) and MdGST (MDP0000252292) were among the most suppressed module member genes in BLO despite being undistinguishable in their corresponding sequences between BLO and KID. Methylation assay of MdMYB10 and MdGST in fruit skin revealed that two regions (MR3 and MR7) in the MdMYB10 promoter exhibited remarkable differences between BLO and KID. In particular, methylation was high and progressively increased alongside fruit development in BLO while was correspondingly low and constant in KID. The methylation levels in both MR3 and MR7 were negatively correlated with anthocyanin content as well as the expression of MdMYB10 and MdGST. Clearly, the collective repression of the 34 genes explains the loss-of-colour in BLO while the methylation in MdMYB10 promoter is likely causal for the mutation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting diseases genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in
Zhao, Luyang; Gu, Chenglei; Ye, Mingxia; Zhang, Zhe; Li, Li'an; Fan, Wensheng; Meng, Yuanguang
The etiology and pathophysiology of endometriosis remain unclear. Accumulating evidence suggests that aberrant microRNA (miRNA) and transcription factor (TF) expression may be involved in the pathogenesis and development of endometriosis. This study therefore aims to survey the key miRNAs, TFs and genes and further understand the mechanism of endometriosis. Paired expression profiling of miRNA and mRNA in ectopic endometria compared with eutopic endometria were determined by high-throughput sequencing techniques in eight patients with ovarian endometriosis. Binary interactions and circuits among the miRNAs, TFs, and corresponding genes were identified by the Pearson correlation coefficients. miRNA-TF-gene regulatory networks were constructed using bioinformatic methods. Eleven selected miRNAs and TFs were validated by quantitative reverse transcription-polymerase chain reaction in 22 patients. Overall, 107 differentially expressed miRNAs and 6112 differentially expressed mRNAs were identified by comparing the sequencing of the ectopic endometrium group and the eutopic endometrium group. The miRNA-TF-gene regulatory network consists of 22 miRNAs, 12 TFs and 430 corresponding genes. Specifically, some key regulators from the miR-449 and miR-34b/c cluster, miR-200 family, miR-106a-363 cluster, miR-182/183, FOX family, GATA family, and E2F family as well as CEBPA, SOX9 and HNF4A were suggested to play vital regulatory roles in the pathogenesis of endometriosis. Integration analysis of the miRNA and mRNA expression profiles presents a unique insight into the regulatory network of this enigmatic disorder and possibly provides clues regarding replacement therapy for endometriosis.
Peng, Hui; Lan, Chaowang; Zheng, Yi; Hutvagner, Gyorgy; Tao, Dacheng; Li, Jinyan
MicroRNAs always function cooperatively in their regulation of gene expression. Dysfunctions of these co-functional microRNAs can play significant roles in disease development. We are interested in those multi-disease associated co-functional microRNAs that regulate their common dysfunctional target genes cooperatively in the development of multiple diseases. The research is potentially useful for human disease studies at the transcriptional level and for the study of multi-purpose microRNA therapeutics. We designed a computational method to detect multi-disease associated co-functional microRNA pairs and conducted cross disease analysis on a reconstructed disease-gene-microRNA (DGR) tripartite network. The construction of the DGR tripartite network is by the integration of newly predicted disease-microRNA associations with those relationships of diseases, microRNAs and genes maintained by existing databases. The prediction method uses a set of reliable negative samples of disease-microRNA association and a pre-computed kernel matrix instead of kernel functions. From this reconstructed DGR tripartite network, multi-disease associated co-functional microRNA pairs are detected together with their common dysfunctional target genes and ranked by a novel scoring method. We also conducted proof-of-concept case studies on cancer-related co-functional microRNA pairs as well as on non-cancer disease-related microRNA pairs. With the prioritization of the co-functional microRNAs that relate to a series of diseases, we found that the co-function phenomenon is not unusual. We also confirmed that the regulation of the microRNAs for the development of cancers is more complex and have more unique properties than those of non-cancer diseases.
We propose a gene regulatory network model which incorporates the microscopic interactions between genes and transcription factors. In particular the gene's expression level is determined by deterministic synchronous dynamics with contribution from excitatory interactions. We study the structure of networks that have a particular '' function '' and are subject to the natural selection pressure. The question of network robustness against point mutations is addressed, and we conclude that only a small part of connections defined as '' essential '' for cell's existence is fragile. Additionally, the obtained networks are sparse with narrow in-degree and broad out-degree, properties well known from experimental study of biological regulatory networks. Furthermore, during sampling procedure we observe that significantly different genotypes can emerge under mutation-selection balance. All the preceding features hold for the model parameters which lay in the experimentally relevant range. (author)
Li, Yupeng; Jackson, Scott A
Farace, Richard V.; Mabee, Timothy
This paper reviews a variety of analytic procedures that can be applied to network data, discussing the assumptions and usefulness of each procedure when applied to the complexity of human communication. Special attention is paid to the network properties measured or implied by each procedure. Factor analysis and multidimensional scaling are among…
Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.
Li, Mo; Belmonte, Juan Carlos Izpisua
Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.
Eric C Wooten
Full Text Available Bicuspid Aortic Valve (BAV is a highly heritable congenital heart defect. The low frequency of BAV (1% of general population limits our ability to perform genome-wide association studies. We present the application of four a priori SNP selection techniques, reducing the multiple-testing penalty by restricting analysis to SNPs relevant to BAV in a genome-wide SNP dataset from a cohort of 68 BAV probands and 830 control subjects. Two knowledge-based approaches, CANDID and STRING, were used to systematically identify BAV genes, and their SNPs, from the published literature, microarray expression studies and a genome scan. We additionally tested Functionally Interpolating SNPs (fitSNPs present on the array; the fourth consisted of SNPs selected by Random Forests, a machine learning approach. These approaches reduced the multiple testing penalty by lowering the fraction of the genome probed to 0.19% of the total, while increasing the likelihood of studying SNPs within relevant BAV genes and pathways. Three loci were identified by CANDID, STRING, and fitSNPS. A haplotype within the AXIN1-PDIA2 locus (p-value of 2.926x10(-06 and a haplotype within the Endoglin gene (p-value of 5.881x10(-04 were found to be strongly associated with BAV. The Random Forests approach identified a SNP on chromosome 3 in association with BAV (p-value 5.061x10(-06. The results presented here support an important role for genetic variants in BAV and provide support for additional studies in well-powered cohorts. Further, these studies demonstrate that leveraging existing expression and genomic data in the context of GWAS studies can identify biologically relevant genes and pathways associated with a congenital heart defect.
Full Text Available We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data.
Wang, Weijing; Jiang, Wenjie; Hou, Lin
BACKGROUND: The therapeutic management of obesity is challenging, hence further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis......) were with a trend of up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within GO database...
Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes
The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378
Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A
Hypertension and bronchial asthma are a major issue for people's health. As of 2014, approximately one billion adults, or ~ 22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of the effective treatment therapies against these diseases is complicated by their comorbidity features. This is often a major problem in diagnosis and their treatment. Hence, in this study the bioinformatical methodology for the analysis of the comorbidity of these two diseases have been developed. As such, the search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension - 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases. Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in
... Matters NIH Research Matters August 12, 2013 Mutated Genes in Schizophrenia Map to Brain Networks Schizophrenia networks ... have a high number of spontaneous mutations in genes that form a network in the front region ...
Full Text Available Abstract Background In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP, generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
Full Text Available Echinoderms, which are phylogenetically related to vertebrates and produce large numbers of transparent embryos that can be experimentally manipulated, offer many advantages for the analysis of the gene regulatory networks (GRN regulating germ layer formation. During development of the sea urchin embryo, the ectoderm is the source of signals that pattern all three germ layers along the dorsal-ventral axis. How this signaling center controls patterning and morphogenesis of the embryo is not understood. Here, we report a large-scale analysis of the GRN deployed in response to the activity of this signaling center in the embryos of the Mediterranean sea urchin Paracentrotus lividus, in which studies with high spatial resolution are possible. By using a combination of in situ hybridization screening, overexpression of mRNA, recombinant ligand treatments, and morpholino-based loss-of-function studies, we identified a cohort of transcription factors and signaling molecules expressed in the ventral ectoderm, dorsal ectoderm, and interposed neurogenic ("ciliary band" region in response to the known key signaling molecules Nodal and BMP2/4 and defined the epistatic relationships between the most important genes. The resultant GRN showed a number of striking features. First, Nodal was found to be essential for the expression of all ventral and dorsal marker genes, and BMP2/4 for all dorsal genes. Second, goosecoid was identified as a central player in a regulatory sub-circuit controlling mouth formation, while tbx2/3 emerged as a critical factor for differentiation of the dorsal ectoderm. Finally, and unexpectedly, a neurogenic ectoderm regulatory circuit characterized by expression of "ciliary band" genes was triggered in the absence of TGF beta signaling. We propose a novel model for ectoderm regionalization, in which neural ectoderm is the default fate in the absence of TGF beta signaling, and suggest that the stomodeal and neural subcircuits that we
Joshi, Tejal; Elias, Daniel; Stenvang, Jan
Tamoxifen is an effective anti-estrogen treatment for patients with estrogen receptor-positive (ER+) breast cancer, however, tamoxifen resistance is frequently observed. To elucidate the underlying molecular mechanisms of tamoxifen resistance, we performed a systematic analysis of mi......+ breast cancer patients receiving adjuvant tamoxifen mono-therapy. Our results provide new insight into the molecular mechanisms of tamoxifen resistance and may form the basis for future medical intervention for the large number of women with tamoxifen-resistant ER+ breast cancer.......RNA-mediated gene regulation in three clinically-relevant tamoxifen-resistant breast cancer cell lines (TamRs) compared to their parental tamoxifen-sensitive cell line. Alterations in the expression of 131 miRNAs in tamoxifen-resistant vs. parental cell lines were identified, 22 of which were common to all Tam...
Alexey Anatolievich Morozov
Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary, sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm. Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
Daniel Victor Guebel
Full Text Available Motivation: In the brain of elderly-healthy individuals, the effects of sexual dimorphism and those due to normal ageing appear overlapped. Discrimination of these two dimensions would powerfully contribute to a better understanding of the aetiology of some neurodegenerative diseases, such as sporadic Alzheimer. Methods: Following a system biology approach, top-down and bottom-up strategies were combined. First, public transcriptome data corresponding to the transition from adulthood to the ageing stage in normal, human hippocampus were analysed through an optimized microarray post-processing (Q-GDEMAR method together with a proper experimental design (full factorial analysis. Second, the identified genes were placed in context by building compatible networks. The subsequent ontology analyses carried out on these networks clarify the main functionalities involved. Results: Noticeably we could identify large sets of genes according to three groups: those that exclusively depend on the sex, those that exclusively depend on the age, and those that depend on the particular combinations of sex and age (interaction. The genes identified were validated against three independent sources (a proteomic study of ageing, a senescence database, and a mitochondrial genetic database. We arrived to several new inferences about the biological functions compromised during ageing in two ways: by taking into account the sex-independent effects of ageing, and considering the interaction between age and sex where pertinent. In particular, we discuss the impact of our findings on the functions of mitochondria, autophagy, mitophagia, and microRNAs.Conclusions: The evidence obtained herein supports the occurrence of significant neurobiological differences in the hippocampus, not only between adult and elderly individuals, but between old-healthy women and old-healthy men. Hence, to obtain realistic results in further analysis of the transition from the normal ageing to
Gupta, Chinmaya; López, José Manuel; Ott, William; Josić, Krešimir; Bennett, Matthew R
Transcriptional delay can significantly impact the dynamics of gene networks. Here we examine how such delay affects bistable systems. We investigate several stochastic models of bistable gene networks and find that increasing delay dramatically increases the mean residence times near stable states. To explain this, we introduce a non-Markovian, analytically tractable reduced model. The model shows that stabilization is the consequence of an increased number of failed transitions between stable states. Each of the bistable systems that we simulate behaves in this manner.
Liao James C
Full Text Available Abstract Background Network Component Analysis (NCA has been used to deduce the activities of transcription factors (TFs from gene expression data and the TF-gene binding relationship. However, the TF-gene interaction varies in different environmental conditions and tissues, but such information is rarely available and cannot be predicted simply by motif analysis. Thus, it is beneficial to identify key TF-gene interactions under the experimental condition based on transcriptome data. Such information would be useful in identifying key regulatory pathways and gene markers of TFs in further studies. Results We developed an algorithm to trim network connectivity such that the important regulatory interactions between the TFs and the genes were retained and the regulatory signals were deduced. Theoretical studies demonstrated that the regulatory signals were accurately reconstructed even in the case where only three independent transcriptome datasets were available. At least 80% of the main target genes were correctly predicted in the extreme condition of high noise level and small number of datasets. Our algorithm was tested with transcriptome data taken from mice under rapamycin treatment. The initial network topology from the literature contains 70 TFs, 778 genes, and 1423 edges between the TFs and genes. Our method retained 1074 edges (i.e. 75% of the original edge number and identified 17 TFs as being significantly perturbed under the experimental condition. Twelve of these TFs are involved in MAPK signaling or myeloid leukemia pathways defined in the KEGG database, or are known to physically interact with each other. Additionally, four of these TFs, which are Hif1a, Cebpb, Nfkb1, and Atf1, are known targets of rapamycin. Furthermore, the trimmed network was able to predict Eno1 as an important target of Hif1a; this key interaction could not be detected without trimming the regulatory network. Conclusions The advantage of our new algorithm
Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin
Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Westenberg, M.A.; Hijum, van S.A.F.T.; Lulko, A.T.; Kuipers, O.P.; Roerdink, J.B.T.M.; Linsen, L.; Hagen, H.; Hamann, B.
We present GENeVis, an application to visualize gene expression time series data in a gene regulatory network context. This is a network of regulator proteins that regulate the expression of their respective target genes. The networks are represented as graphs, in which the nodes represent genes,
Full Text Available Combining path consistency (PC algorithms with conditional mutual information (CMI are widely used in reconstruction of gene regulatory networks. CMI has many advantages over Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate the direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computation complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference, to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
Full Text Available Marbling is an important trait in characterization beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate gene associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7 using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60 and dihydropyrimidine dehydrogenase (DPYD are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.
The book presents some key mathematical tools for the performance analysis of communication networks and computer systems.Communication networks and computer systems have become extremely complex. The statistical resource sharing induced by the random behavior of users and the underlying protocols and algorithms may affect Quality of Service.This book introduces the main results of queuing theory that are useful for analyzing the performance of these systems. These mathematical tools are key to the development of robust dimensioning rules and engineering methods. A number of examples i
Li, Zhiyuan; Bianco, Simone; Zhang, Zhaoyang; Tang, Chao
Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although there has been much work focusing on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear typically how attractors partition among the three types of orbits: steady state, periodic and chaotic, and how the dynamical properties change with network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivity, fraction of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedding motifs of small sizes, and therefore can in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.
Network Systems Security Analysis has utmost importance in today's world. Many companies, like banks which give priority to data management, test their own data security systems with "Penetration Tests" by time to time. In this context, companies must also test their own network/server systems and take precautions, as the data security draws attention. Based on this idea, the study cyber-attacks are researched throughoutly and Penetration Test technics are examined. With these information on, classification is made for the cyber-attacks and later network systems' security is tested systematically. After the testing period, all data is reported and filed for future reference. Consequently, it is found out that human beings are the weakest circle of the chain and simple mistakes may unintentionally cause huge problems. Thus, it is clear that some precautions must be taken to avoid such threats like updating the security software.
Full Text Available Abstract Background Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. Results We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion 123 which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Conclusion Hybrid simplifications can be used for onion-like (multi-layered approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach.
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
Mark D Schroeder
Full Text Available The segmentation gene network of Drosophila consists of maternal and zygotic factors that generate, by transcriptional (cross- regulation, expression patterns of increasing complexity along the anterior-posterior axis of the embryo. Using known binding site information for maternal and zygotic gap transcription factors, the computer algorithm Ahab recovers known segmentation control elements (modules with excellent success and predicts many novel modules within the network and genome-wide. We show that novel module predictions are highly enriched in the network and typically clustered proximal to the promoter, not only upstream, but also in intronic space and downstream. When placed upstream of a reporter gene, they consistently drive patterned blastoderm expression, in most cases faithfully producing one or more pattern elements of the endogenous gene. Moreover, we demonstrate for the entire set of known and newly validated modules that Ahab's prediction of binding sites correlates well with the expression patterns produced by the modules, revealing basic rules governing their composition. Specifically, we show that maternal factors consistently act as activators and that gap factors act as repressors, except for the bimodal factor Hunchback. Our data suggest a simple context-dependent rule for its switch from repressive to activating function. Overall, the composition of modules appears well fitted to the spatiotemporal distribution of their positive and negative input factors. Finally, by comparing Ahab predictions with different categories of transcription factor input, we confirm the global regulatory structure of the segmentation gene network, but find odd skipped behaving like a primary pair-rule gene. The study expands our knowledge of the segmentation gene network by increasing the number of experimentally tested modules by 50%. For the first time, the entire set of validated modules is analyzed for binding site composition under a
This textbook presents the mathematical theory and techniques necessary for analyzing and modeling high-performance global networks, such as the Internet. The three main building blocks of high-performance networks are links, switching equipment connecting the links together, and software employed at the end nodes and intermediate switches. This book provides the basic techniques for modeling and analyzing these last two components. Topics covered include, but are not limited to: Markov chains and queuing analysis, traffic modeling, interconnection networks and switch architectures and buffering strategies. · Provides techniques for modeling and analysis of network software and switching equipment; · Discusses design options used to build efficient switching equipment; · Includes many worked examples of the application of discrete-time Markov chains to communication systems; · Covers the mathematical theory and techniques necessary for ana...
Castillo Luis F.
Full Text Available Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST and Interproscan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained along the Wordnet dictionary in order to construct a Resource Description Framework (RDF using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.
Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier; Lecompte, Odile
The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user's specific interests and provides an efficient way to share information with collaborators. Furthermore, the user's behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. ©Alexis Allot, Kirsley Chennen, Yannis
Full Text Available The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulations of the differentially-expressed genes. As an alternative, genetical genomics study has been proposed to treat naturally-occurring genetic variants as potential perturbants of gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite many advantages of genetical genomics data analysis, the computational challenge that the effects of multifactorial genetic perturbations should be decoded simultaneously from data has prevented a widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs that perturb this network. While our statistical model captures direct genetic perturbations of gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network
Full Text Available Abstract Background Recent years have seen a dramatic increase in the use of mathematical modeling to gain insight into gene regulatory network behavior across many different organisms. In particular, there has been considerable interest in using mathematical tools to understand how multistable regulatory networks may contribute to developmental processes such as cell fate determination. Indeed, such a network may subserve the formation of unicellular leaf hairs (trichomes in the model plant Arabidopsis thaliana. Results In order to investigate the capacity of small gene regulatory networks to generate multiple equilibria, we present a chemical reaction network (CRN-based modeling formalism and describe a number of methods for CRN analysis in a parameter-free context. These methods are compared and applied to a full set of one-component subnetworks, as well as a large random sample from 40,680 similarly constructed two-component subnetworks. We find that positive feedback and cooperativity mediated by transcription factor (TF dimerization is a requirement for one-component subnetwork bistability. For subnetworks with two components, the presence of these processes increases the probability that a randomly sampled subnetwork will exhibit multiple equilibria, although we find several examples of bistable two-component subnetworks that do not involve cooperative TF-promoter binding. In the specific case of epidermal differentiation in Arabidopsis, dimerization of the GL3-GL1 complex and cooperative sequential binding of GL3-GL1 to the CPC promoter are each independently sufficient for bistability. Conclusion Computational methods utilizing CRN-specific theorems to rule out bistability in small gene regulatory networks are far superior to techniques generally applicable to deterministic ODE systems. Using these methods to conduct an unbiased survey of parameter-free deterministic models of small networks, and the Arabidopsis epidermal cell
Chiu, Yu-Chiao; Wang, Li-Ju; Hsiao, Tzu-Hung; Chuang, Eric Y; Chen, Yidong
With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulate gene interaction, is composed of gene pairs of which interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed the functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the encoding gene of ER, appeared in the list, and abundant novel modulators were illuminated. For instance, a prognostic predictor of breast cancer, SFRP1, was found the second modulator. Functional annotation analysis of the 973 modulators revealed involvements in ER-related cellular processes as well as immune- and tumor-associated functions. Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of filtering parameters as well as the conservativity of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.
Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina
Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed base on COPD disease genes and functional information. Candidate genes in these weighted COPD-related networks were prioritized by making use of a gene prioritization method, respectively. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested the correlation of COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network after assessing using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote the better understanding about the molecular mechanism of this disease from different perspectives. The top 100 genes in COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.
Tsuchiya, Masa; Selvarajoo, Kumar; Piras, Vincent; Tomita, Masaru; Giuliani, Alessandro
An exacerbated sensitivity to apparently minor stimuli and a general resilience of the entire system stay together side-by-side in biological systems. This apparent paradox can be explained by the consideration of biological systems as very strongly interconnected network systems. Some nodes of these networks, thanks to their peculiar location in the network architecture, are responsible for the sensitivity aspects, while the large degree of interconnection is at the basis of the resilience properties of the system. One relevant feature of the high degree of connectivity of gene regulation networks is the emergence of collective ordered phenomena influencing the entire genome and not only a specific portion of transcripts. The great majority of existing gene regulation models give the impression of purely local ‘hard-wired’ mechanisms disregarding the emergence of global ordered behavior encompassing thousands of genes while the general, genome wide, aspects are less known. Here we address, on a data analysis perspective, the discrimination between local and global scale regulations, this goal was achieved by means of the examination of two biological systems: innate immune response in macrophages and oscillating growth dynamics in yeast. Our aim was to reconcile the ‘hard-wired’ local view of gene regulation with a global continuous and scalable one borrowed from statistical physics. This reconciliation is based on the network paradigm in which the local ‘hard-wired’ activities correspond to the activation of specific crucial nodes in the regulation network, while the scalable continuous responses can be equated to the collective oscillations of the network after a perturbation.
Shin, Yong-Jun; Bleris, Leonidas
Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain) and linear state-space (time domain) can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.
Kadarmideen, Haja; Watson-Haigh, Nathan S.
Gene co-expression networks (GCN), built using high-throughput gene expression data are fundamental aspects of systems biology. The main aims of this study were to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four......) is connected within a network. The two GCN construction methods used were, Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT) methods. Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT...... (with > 20000 genes) access to large computer clusters, particularly those with larger amounts of shared memory is recommended....
Full Text Available We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX model with a group smoothly clipped absolute deviation (SCAD method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factors encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Huang, Yen-Tsung; Lin, Xihong
Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
Uddin, Raihan; Singh, Shiva M
As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in "learning and memory" related functions and pathways. Subsequent differential network analysis of this "learning and memory" module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. Taken together, they
Ramachandran, Maya; Fogle, Homer; Costes, Sylvain
Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibalius anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.
Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A
Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes.
Živković, J.; Tadić, B.; Wick, N.; Thurner, S.
We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J
Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.
Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.
Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167
Gong, Ping; Madak-Erdogan, Zeynep; Li, Jilong; Cheng, Jianlin; Greenlief, C. Michael; Helferich, William G.; Katzenellenbogen, John A.
The estrogen receptors (ERs) ERα and ERβ mediate the actions of endogenous estrogens as well as those of botanical estrogens (BEs) present in plants. BEs are ingested in the diet and also widely consumed by postmenopausal women as dietary supplements, often as a substitute for the loss of endogenous estrogens at menopause. However, their activities and efficacies, and similarities and differences in gene expression programs with respect to endogenous estrogens such as estradiol (E2) are not fully understood. Because gene expression patterns underlie and control the broad physiological effects of estrogens, we have investigated and compared the gene networks that are regulated by different BEs and by E2. Our aim was to determine if the soy and licorice BEs control similar or different gene expression programs and to compare their gene regulations with that of E2. Gene expression was examined by RNA-Seq in human breast cancer (MCF7) cells treated with control vehicle, BE or E2. These cells contained three different complements of ERs, ERα only, ERα+ERβ, or ERβ only, reflecting the different ratios of these two receptors in different human breast cancers and in different estrogen target cells. Using principal component, hierarchical clustering, and gene ontology and interactome analyses, we found that BEs regulated many of the same genes as did E2. The genes regulated by each BE, however, were somewhat different from one another, with some genes being regulated uniquely by each compound. The overlap with E2 in regulated genes was greatest for the soy isoflavones genistein and S-equol, while the greatest difference from E2 in gene expression pattern was observed for the licorice root BE liquiritigenin. The gene expression pattern of each ligand depended greatly on the cell background of ERs present. Despite similarities in gene expression pattern with E2, the BEs were generally less stimulatory of genes promoting proliferation and were more pro-apoptotic in their
Full Text Available Marine organisms possess a series of cellular strategies to counteract the negative effects of toxic compounds, including the massive reorganization of gene expression networks. Here we report the modulated dose-dependent response of activated genes by diatom polyunsaturated aldehydes (PUAs in the sea urchin Paracentrotus lividus. PUAs are secondary metabolites deriving from the oxidation of fatty acids, inducing deleterious effects on the reproduction and development of planktonic and benthic organisms that feed on these unicellular algae and with anti-cancer activity. Our previous results showed that PUAs target several genes, implicated in different functional processes in this sea urchin. Using interactomic Ingenuity Pathway Analysis we now show that the genes targeted by PUAs are correlated with four HUB genes, NF-κB, p53, δ-2-catenin and HIF1A, which have not been previously reported for P. lividus. We propose a working model describing hypothetical pathways potentially involved in toxic aldehyde stress response in sea urchins. This represents the first report on gene networks affected by PUAs, opening new perspectives in understanding the cellular mechanisms underlying the response of benthic organisms to diatom exposure.
Fung, Elizabeth-sharon [Los Alamos National Laboratory
Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notchdependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.l and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.l. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfil against Egr-2 and of TCF-l against PU.l as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.l activity based on Notch activity; the lack of direct opposition between PU.l and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.
ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
Full Text Available In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG network. Based on the premise there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveal that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner.
Cline, M.S.; Smoot, M.; Cerami, E.
of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules......Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context...... and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape....
Full Text Available Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards a better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle for making gene signatures a standard tool in clinical diagnosis is the typical low reproducibility of these signatures combined with the difficulty to achieve a clear biological interpretation. For that purpose in the last years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview about so-far proposed approaches.
Gu, Zuguang; Zhang, Chenyu; Wang, Jin
Hepatocellular carcinoma (HCC) is one of the most lethal cancers worldwide, and the mechanisms that lead to the disease are still relatively unclear. However, with the development of high-throughput technologies it is possible to gain a systematic view of biological systems to enhance the understanding of the roles of genes associated with HCC. Thus, analysis of the mechanism of molecule interactions in the context of gene regulatory networks can reveal specific sub-networks that lead to the development of HCC. In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our method for constructing gene regulatory network is based on predicted target interactions, experimentally-supported interactions, and co-expression model. Regulators in the network included both transcription factors and microRNAs to provide a complete view of gene regulation. Analysis of gene regulatory network revealed that gene regulation in HCC is highly modular, in which different sets of regulators take charge of specific biological processes. We found that microRNAs mainly control biological functions related to mitochondria and oxidative reduction, while transcription factors control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core network that organizes regulations between different modules and maintains the robustness of the whole network. There is direct experimental evidence for most of the regulators in the core gene regulatory network relating to HCC. We infer it is the central controller of gene regulation. Finally, we explored the influence of the core gene regulatory network on biological pathways. Our analysis provides insights into the mechanism of transcriptional and post-transcriptional control in HCC. In particular, we highlight the importance of the core gene regulatory network; we propose that it is highly related to HCC and we believe further
Wang Dan-Ling; Yu Zu-Guo; Anh V
Complex networks have recently attracted much attention in diverse areas of science and technology. Many networks such as the WWW and biological networks are known to display spatial heterogeneity which can be characterized by their fractal dimensions. Multifractal analysis is a useful way to systematically describe the spatial heterogeneity of both theoretical and experimental fractal patterns. In this paper, we introduce a new box-covering algorithm for multifractal analysis of complex networks. This algorithm is used to calculate the generalized fractal dimensions D q of some theoretical networks, namely scale-free networks, small world networks, and random networks, and one kind of real network, namely protein—protein interaction networks of different species. Our numerical results indicate the existence of multifractality in scale-free networks and protein—protein interaction networks, while the multifractal behavior is not clear-cut for small world networks and random networks. The possible variation of D q due to changes in the parameters of the theoretical network models is also discussed. (general)
Cui, Ying; Cai, Meng; Stanley, H. Eugene
Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.
Anne de Jong
Full Text Available In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks.
Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs. PMID:29470492
Munsky, Brian; Trinh, Brooke; Khammash, Mustafa
The cellular environment is abuzz with noise originating from the inherent random motion of reacting molecules in the living cell. In this noisy environment, clonal cell populations exhibit cell-to-cell variability that can manifest significant prototypical differences. Noise induced stochastic fluctuations in cellular constituents can be measured and their statistics quantified using flow cytometry, single molecule fluorescence in situ hybridization, time lapse fluorescence microscopy and other single cell and single molecule measurement techniques. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. We use theoretical investigations to establish experimental guidelines for the identification of gene regulatory networks, and we apply these guideline to experimentally identify predictive models for different regulatory mechanisms in bacteria and yeast.
Full Text Available Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain and linear state-space (time domain can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.
Full Text Available Abstract Background Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs. As a logical model, probabilistic Boolean networks (PBNs consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is O(nN22n or O(nN2n for a sparse matrix. Results This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN. An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of O(nL2n, where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a
Full Text Available Abstract Background Genome-scale metabolic network models have contributed to elucidating biological phenomena, and predicting gene targets to engineer for biotechnological applications. With their increasing importance, their precise network characterization has also been crucial for better understanding of the cellular physiology. Results We herein introduce a framework for network modularization and Bayesian network analysis (FMB to investigate organism’s metabolism under perturbation. FMB reveals direction of influences among metabolic modules, in which reactions with similar or positively correlated flux variation patterns are clustered, in response to specific perturbation using metabolic flux data. With metabolic flux data calculated by constraints-based flux analysis under both control and perturbation conditions, FMB, in essence, reveals the effects of specific perturbations on the biological system through network modularization and Bayesian network analysis at metabolic modular level. As a demonstration, this framework was applied to the genetically perturbed Escherichia coli metabolism, which is a lpdA gene knockout mutant, using its genome-scale metabolic network model. Conclusions After all, it provides alternative scenarios of metabolic flux distributions in response to the perturbation, which are complementary to the data obtained from conventionally available genome-wide high-throughput techniques or metabolic flux analysis.
Kim, Hyun Uk; Kim, Tae Yong; Lee, Sang Yup
Genome-scale metabolic network models have contributed to elucidating biological phenomena, and predicting gene targets to engineer for biotechnological applications. With their increasing importance, their precise network characterization has also been crucial for better understanding of the cellular physiology. We herein introduce a framework for network modularization and Bayesian network analysis (FMB) to investigate organism's metabolism under perturbation. FMB reveals direction of influences among metabolic modules, in which reactions with similar or positively correlated flux variation patterns are clustered, in response to specific perturbation using metabolic flux data. With metabolic flux data calculated by constraints-based flux analysis under both control and perturbation conditions, FMB, in essence, reveals the effects of specific perturbations on the biological system through network modularization and Bayesian network analysis at metabolic modular level. As a demonstration, this framework was applied to the genetically perturbed Escherichia coli metabolism, which is a lpdA gene knockout mutant, using its genome-scale metabolic network model. After all, it provides alternative scenarios of metabolic flux distributions in response to the perturbation, which are complementary to the data obtained from conventionally available genome-wide high-throughput techniques or metabolic flux analysis.
Langfelder, Peter; Horvath, Steve
Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria
The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researches studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used four-fold cross validation test and fitting to random dataset to validate the model and proof its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
McCabe, James D
Traditionally, networking has had little or no basis in analysis or architectural development, with designers relying on technologies they are most familiar with or being influenced by vendors or consultants. However, the landscape of networking has changed so that network services have now become one of the most important factors to the success of many third generation networks. It has become an important feature of the designer's job to define the problems that exist in his network, choose and analyze several optimization parameters during the analysis process, and then prioritize and evalua
Flowers, Jonathan M; Hanzawa, Yoshie; Hall, Megan C; Moore, Richard C; Purugganan, Michael D
The time to flowering is a key component of the life-history strategy of the model plant Arabidopsis thaliana that varies quantitatively among genotypes. A significant problem for evolutionary and ecological genetics is to understand how natural selection may operate on this ecologically significant trait. Here, we conduct a population genomic study of resequencing data from 52 genes in the flowering time network. McDonald-Kreitman tests of neutrality suggested a strong excess of amino acid polymorphism when pooling across loci. This excess of replacement polymorphism across the flowering time network and a skewed derived frequency spectrum toward rare alleles for both replacement and noncoding polymorphisms relative to synonymous changes is consistent with a large class of deleterious polymorphisms segregating in these genes. Assuming selective neutrality of synonymous changes, we estimate that approximately 30% of amino acid polymorphisms are deleterious. Evidence of adaptive substitution is less prominent in our analysis. The photoperiod regulatory gene, CO, and a gibberellic acid transcription factor, AtMYB33, show evidence of adaptive fixation of amino acid mutations. A test for extended haplotypes revealed no examples of flowering time alleles with haplotypes comparable in length to those associated with the null fri(Col) allele reported previously. This suggests that the FRI gene likely has a uniquely intense or recent history of selection among the flowering time genes considered here. Although there is some evidence for adaptive evolution in these life-history genes, it appears that slightly deleterious polymorphisms are a major component of natural molecular variation in the flowering time network of A. thaliana.
Olszewski Kellen L
Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes. Results We developed Nearest Neighbor Networks (NNN, a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
Kalb, Jeffrey L.; Lee, David S.
Emerging high-bandwidth, low-latency network technology has made network-based architectures both feasible and potentially desirable for use in satellite payload architectures. The selection of network topology is a critical component when developing these multi-node or multi-point architectures. This study examines network topologies and their effect on overall network performance. Numerous topologies were reviewed against a number of performance, reliability, and cost metrics. This document identifies a handful of good network topologies for satellite applications and the metrics used to justify them as such. Since often multiple topologies will meet the requirements of the satellite payload architecture under development, the choice of network topology is not easy, and in the end the choice of topology is influenced by both the design characteristics and requirements of the overall system and the experience of the developer.
Patrycja Vasilyev Missiuro
Full Text Available Recent studies of cellular networks have revealed modular organizations of genes and proteins. For example, in interactome networks, a module refers to a group of interacting proteins that form molecular complexes and/or biochemical pathways and together mediate a biological process. However, it is still poorly understood how biological information is transmitted between different modules. We have developed information flow analysis, a new computational approach that identifies proteins central to the transmission of biological information throughout the network. In the information flow analysis, we represent an interactome network as an electrical circuit, where interactions are modeled as resistors and proteins as interconnecting junctions. Construing the propagation of biological signals as flow of electrical current, our method calculates an information flow score for every protein. Unlike previous metrics of network centrality such as degree or betweenness that only consider topological features, our approach incorporates confidence scores of protein-protein interactions and automatically considers all possible paths in a network when evaluating the importance of each protein. We apply our method to the interactome networks of Saccharomyces cerevisiae and Caenorhabditis elegans. We find that the likelihood of observing lethality and pleiotropy when a protein is eliminated is positively correlated with the protein's information flow score. Even among proteins of low degree or low betweenness, high information scores serve as a strong predictor of loss-of-function lethality or pleiotropy. The correlation between information flow scores and phenotypes supports our hypothesis that the proteins of high information flow reside in central positions in interactome networks. We also show that the ranks of information flow scores are more consistent than that of betweenness when a large amount of noisy data is added to an interactome. Finally, we
Coronnello, C; Tumminello, M; Micciche, S; Mantegna, R.N.
Many biological systems can be described as networks where different elements interact, in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between each two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and we characterize its evolution over different releases of the Gene Ontology.
Genes under H NS control can be. (a) regulated by H NS. (b) regulated by H NS and StpA. Because backup by StpA is partial. Page 19. Gene expression level. H NS regulated xenogenes. Other genes. Page 20 ... recollect: H&NS silences highl transcribable genes. Gene expression level unilateral. Other genes epistatic ...
Bove, Jérôme; Lucas, Philippe; Godin, Béatrice; Ogé, Laurent; Jullien, Marc; Grappin, Philippe
Seed dormancy in Nicotiana plumbaginifolia is characterized by an abscisic acid accumulation linked to a pronounced germination delay. Dormancy can be released by 1 year after-ripening treatment. Using a cDNA-amplified fragment length polymorphism (cDNA-AFLP) approach we compared the gene expression patterns of dormant and after-ripened seeds, air-dry or during one day imbibition and analyzed 15,000 cDNA fragments. Among them 1020 were found to be differentially regulated by dormancy. Of 412 sequenced cDNA fragments, 83 were assigned to a known function by search similarities to public databases. The functional categories of the identified dormancy maintenance and breaking responsive genes, give evidence that after-ripening turns in the air-dry seed to a new developmental program that modulates, at the RNA level, components of translational control, signaling networks, transcriptional control and regulated proteolysis.
Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasing available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network to address the underlying regulations of genes that can span any unit(s of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
This thesis entitled "Function analysis of unknown genes" presents the use of proteome analysis for the characterisation of yeast (Saccharomyces cerevisiae) genes and their products (proteins especially those of unknown function). This study illustrates that proteome analysis can be used...... to describe different aspects of molecular biology of the cell, to study changes that occur in the cell due to overexpression or deletion of a gene and to identify various protein modifications. The biological questions and the results of the described studies show the diversity of the information that can...... genes and proteins. It reports the first global proteome database collecting 36 yeast single gene deletion mutants and selecting over 650 differences between analysed mutants and the wild type strain. The obtained results show that two-dimensional gel electrophoresis and mass spectrometry based proteome...
Contract No. DASG60-00-M-0201 Purchase request no.: Foot in the Door-01 Title Name: Artificial Neural Network Analysis System Company: Atlantic... Artificial Neural Network Analysis System 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Powell, Bruce C 5d. PROJECT NUMBER 5e. TASK NUMBER...34) 27-02-2001 Report Type N/A Dates Covered (from... to) ("DD MON YYYY") 28-10-2000 27-02-2001 Title and Subtitle Artificial Neural Network Analysis
Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to
Koonin Eugene V
Full Text Available Abstract Background A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. Results At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction ( Conclusion The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.
Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong
Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, underpinning which Ma1 that putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inferring using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P acidity. Overall, this study provides important insight into the Ma1-mediated gene network controlling acidity in mature apple fruit of diverse genetic background.
Full Text Available The potent proinflammatory cytokine interleukin (IL-1 triggers gene expression through the NF-κB signaling pathway. Here, we investigated the cofactor requirements of strongly regulated IL-1 target genes whose expression is impaired in p65 NF-κB-deficient murine embryonic fibroblasts. By two independent small-hairpin (shRNA screens, we examined 170 genes annotated to encode nuclear cofactors for their role in Cxcl2 mRNA expression and identified 22 factors that modulated basal or IL-1-inducible Cxcl2 levels. The functions of 16 of these factors were validated for Cxcl2 and further analyzed for their role in regulation of 10 additional IL-1 target genes by RT-qPCR. These data reveal that each inducible gene has its own (quantitative requirement of cofactors to maintain basal levels and to respond to IL-1. Twelve factors (Epc1, H2afz, Kdm2b, Kdm6a, Mbd3, Mta2, Phf21a, Ruvbl1, Sin3b, Suv420h1, Taf1, and Ube3a have not been previously implicated in inflammatory cytokine functions. Bioinformatics analysis indicates that they are components of complex nuclear protein networks that regulate chromatin functions and gene transcription. Collectively, these data suggest that downstream from the essential NF-κB signal each cytokine-inducible target gene has further subtle requirements for individual sets of nuclear cofactors that shape its transcriptional activation profile.
Debrabant, Birgit; Soerensen, Mette
Abstract We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. Especially, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the co...
Meta-Analysis of Transcriptome Data Related to Hippocampus Biopsies and iPSC-Derived Neuronal Cells from Alzheimer's Disease Patients Reveals an Association with FOXA1 and FOXA2 Gene Regulatory Networks.
Wruck, Wasco; Schröter, Friederike; Adjaye, James
Although the incidence of Alzheimer's disease (AD) is continuously increasing in the aging population worldwide, effective therapies are not available. The interplay between causative genetic and environmental factors is partially understood. Meta-analyses have been performed on aspects such as polymorphisms, cytokines, and cognitive training. Here, we propose a meta-analysis approach based on hierarchical clustering analysis of a reliable training set of hippocampus biopsies, which is condensed to a gene expression signature. This gene expression signature was applied to various test sets of brain biopsies and iPSC-derived neuronal cell models to demonstrate its ability to distinguish AD samples from control. Thus, our identified AD-gene signature may form the basis for determination of biomarkers that are urgently needed to overcome current diagnostic shortfalls. Intriguingly, the well-described AD-related genes APP and APOE are not within the signature because their gene expression profiles show a lower correlation to the disease phenotype than genes from the signature. This is in line with the differing characteristics of the disease as early-/late-onset or with/without genetic predisposition. To investigate the gene signature's systemic role(s), signaling pathways, gene ontologies, and transcription factors were analyzed which revealed over-representation of response to stress, regulation of cellular metabolic processes, and reactive oxygen species. Additionally, our results clearly point to an important role of FOXA1 and FOXA2 gene regulatory networks in the etiology of AD. This finding is in corroboration with the recently reported major role of the dopaminergic system in the development of AD and its regulation by FOXA1 and FOXA2.
Materna, Stefan C
The control processes that underlie the progression of development can be summarized in maps of gene regulatory networks (GRNs). A critical step in their assembly is the systematic perturbation of network candidates. In sea urchins the most important method for interfering with expression in a gene-specific way is application of morpholino antisense oligonucleotides (MOs). MOs act by binding to their sequence complement in transcripts resulting in a block in translation or a change in splicing and thus result in a loss of function. Despite the tremendous success of this technology, recent comparisons to mutants generated by genome editing have led to renewed criticism and challenged its reliability. As with all methods based on sequence recognition, MOs are prone to off-target binding that may result in phenotypes that are erroneously ascribed to the loss of the intended target. However, the slow progression of development in sea urchins has enabled extremely detailed studies of gene activity in the embryo. This wealth of knowledge paired with the simplicity of the sea urchin embryo enables careful analysis of MO phenotypes through a variety of methods that do not rely on terminal phenotypes. This article summarizes the use of MOs in probing GRNs and the steps that should be taken to assure their specificity.
Full Text Available Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions. Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Sîrbu, Alina; Crane, Martin; Ruskin, Heather J
Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Mustafin, Zakhar Sergeevich; Lashin, Sergey Alexandrovich; Matushkin, Yury Georgievich; Gunbin, Konstantin Vladimirovich; Afonnikov, Dmitry Arkadievich
There are many available software tools for visualization and analysis of biological networks. Among them, Cytoscape ( http://cytoscape.org/ ) is one of the most comprehensive packages, with many plugins and applications which extends its functionality by providing analysis of protein-protein interaction, gene regulatory and gene co-expression networks, metabolic, signaling, neural as well as ecological-type networks including food webs, communities networks etc. Nevertheless, only three plugins tagged 'network evolution' found in Cytoscape official app store and in literature. We have developed a new Cytoscape 3.0 application Orthoscape aimed to facilitate evolutionary analysis of gene networks and visualize the results. Orthoscape aids in analysis of evolutionary information available for gene sets and networks by highlighting: (1) the orthology relationships between genes; (2) the evolutionary origin of gene network components; (3) the evolutionary pressure mode (diversifying or stabilizing, negative or positive selection) of orthologous groups in general and/or branch-oriented mode. The distinctive feature of Orthoscape is the ability to control all data analysis steps via user-friendly interface. Orthoscape allows its users to analyze gene networks or separated gene sets in the context of evolution. At each step of data analysis, Orthoscape also provides for convenient visualization and data manipulation.
Presents insight into the social behaviour of animals (including the study of animal tracks and learning by members of the same species). This book provides web-based evidence of social interaction, perceptual learning, information granulation and the behaviour of humans and affinities between web-based social networks
Aaron R Wolen
Full Text Available Individual differences in initial sensitivity to ethanol are strongly related to the heritable risk of alcoholism in humans. To elucidate key molecular networks that modulate ethanol sensitivity we performed the first systems genetics analysis of ethanol-responsive gene expression in brain regions of the mesocorticolimbic reward circuit (prefrontal cortex, nucleus accumbens, and ventral midbrain across a highly diverse family of 27 isogenic mouse strains (BXD panel before and after treatment with ethanol.Acute ethanol altered the expression of ~2,750 genes in one or more regions and 400 transcripts were jointly modulated in all three. Ethanol-responsive gene networks were extracted with a powerful graph theoretical method that efficiently summarized ethanol's effects. These networks correlated with acute behavioral responses to ethanol and other drugs of abuse. As predicted, networks were heavily populated by genes controlling synaptic transmission and neuroplasticity. Several of the most densely interconnected network hubs, including Kcnma1 and Gsk3β, are known to influence behavioral or physiological responses to ethanol, validating our overall approach. Other major hub genes like Grm3, Pten and Nrg3 represent novel targets of ethanol effects. Networks were under strong genetic control by variants that we mapped to a small number of chromosomal loci. Using a novel combination of genetic, bioinformatic and network-based approaches, we identified high priority cis-regulatory candidate genes, including Scn1b, Gria1, Sncb and Nell2.The ethanol-responsive gene networks identified here represent a previously uncharacterized intermediate phenotype between DNA variation and ethanol sensitivity in mice. Networks involved in synaptic transmission were strongly regulated by ethanol and could contribute to behavioral plasticity seen with chronic ethanol. Our novel finding that hub genes and a small number of loci exert major influence over the ethanol
Applied network theory has seen pronounced expansion in recent years, in fields such as epidemiology, computer science, and sociology. Concurrent development of analytical methods and frameworks has increased possibilities and tools available to researchers seeking to apply network theory to a variety of problems. While water and nutrient fluxes through stream systems clearly demonstrate a directional network structure, the hydrological applications of network theory remain underexplored. This presentation covers a review of network applications in hydrology, followed by an overview of promising network analytical tools that potentially offer new insights into conceptual modeling of hydrologic systems, identifying behavioral transition zones in stream networks and thresholds of dynamical system response. Network applications were tested along an urbanization gradient in Atlanta, Georgia, USA. Peachtree Creek and Proctor Creek. Peachtree Creek contains a nest of five longterm USGS streamflow and water quality gages, allowing network application of longterm flow statistics. The watershed spans a range of suburban and heavily urbanized conditions. Summary flow statistics and water quality metrics were analyzed using a suite of network analysis techniques, to test the conceptual modeling and predictive potential of the methodologies. Storm events and low flow dynamics during Summer 2016 were analyzed using multiple network approaches, with an emphasis on tomogravity methods. Results indicate that network theory approaches offer novel perspectives for understanding long term and eventbased hydrological data. Key future directions for network applications include 1) optimizing data collection, 2) identifying "hotspots" of contaminant and overland flow influx to stream systems, 3) defining process domains, and 4) analyzing dynamic connectivity of various system components, including groundwatersurface water interactions.
Song, Mingzhou (Joe) [New Mexico State University, Las Cruces; Lewis, Chris K. [New Mexico State University, Las Cruces; Lance, Eric [New Mexico State University, Las Cruces; Chesler, Elissa J [ORNL; Kirova, Roumyana [Bristol-Myers Squibb Pharmaceutical Research & Development, NJ; Langston, Michael A [University of Tennessee, Knoxville (UTK); Bergeson, Susan [Texas Tech University, Lubbock
The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from high-throughput transcriptomic data is addressed. A network reconstruction algorithm was developed that uses the statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. Using temporal gene expression data collected from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol, this algorithm identified genes from a major neuronal pathway as putative components of the alcohol response mechanism. Three of these genes have known associations with alcohol in the literature. Several other potentially relevant genes, highlighted and agreeing with independent results from literature mining, may play a role in the response to alcohol. Additional, previously-unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.
Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey
Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.
Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas
Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
Bhargava, Anuprabha; Herzel, Hanspeter; Ananthasubramaniam, Bharath
Background Most physiological processes in mammals are temporally regulated by means of a master circadian clock in the brain and peripheral oscillators in most other tissues. A transcriptional-translation feedback network of clock genes produces near 24 h oscillations in clock gene and protein expression. Here, we aim to identify novel additions to the clock network using a meta-analysis of public chromatin immunoprecipitation sequencing (ChIP-seq), proteomics and protein-protein interaction...
Full Text Available The impact of gene silencing on cellular phenotypes is difficult to establish due to the complexity of interactions in the associated biological processes and pathways. A recent genome-wide RNA knock-down study both identified and phenotypically characterized a set of important genes for the cell cycle in HeLa cells. Here, we combine a molecular interaction network analysis, based on physical and functional protein interactions, in conjunction with evolutionary information, to elucidate the common biological and topological properties of these key genes. Our results show that these genes tend to be conserved with their corresponding protein interactions across several species and are key constituents of the evolutionary conserved molecular interaction network. Moreover, a group of bistable network motifs is found to be conserved within this network, which are likely to influence the network stability and therefore the robustness of cellular functioning. They form a cluster, which displays functional homogeneity and is significantly enriched in genes phenotypically relevant for mitosis. Additional results reveal a relationship between specific cellular processes and the phenotypic outcomes induced by gene silencing. This study introduces new ideas regarding the relationship between genotype and phenotype in the context of the cell cycle. We show that the analysis of molecular interaction networks can result in the identification of genes relevant to cellular processes, which is a promising avenue for future research.
Hur, Junguk; Özgür, Arzucan; He, Yongqun
Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, with extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To more rational development of effective and safe E. coli vaccine, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interactions types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this big network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of
Full Text Available Abstract Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows the inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto that networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user can be analyzed in the context of known
Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand
Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there are eight set of candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked up with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchal mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsID׳s (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
Full Text Available Abstract Background CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Results Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. Conclusions CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. Open peer review This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten
Brodt, Avital; Lurie-Weinberger, Mor N; Gophna, Uri
CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten).
Benjamin A Logsdon
Full Text Available Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL, which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1, and genes involved in endocytosis (RCY1, the spindle checkpoint (BUB2, sulfonate catabolism (JLP1, and cell-cell communication (PRM7. Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data.
Full Text Available The network-based approach has been used to describe the relationship among genes and various phenotypes, producing a network describing complex biological relationships. Such networks can be constructed by aggregating previously reported associations in the literature from various databases. In this work, we applied the network-based approach to investigate how different brain areas are associated to genetic disorders and genes. In particular, a tripartite network with genes, genetic diseases, and brain areas was constructed based on the associations among them reported in the literature through text mining. In the resulting network, a disproportionately large number of gene-disease and disease-brain associations were attributed to a small subset of genes, diseases, and brain areas. Furthermore, a small number of brain areas were found to be associated with a large number of the same genes and diseases. These core brain regions encompassed the areas identified by the previous genome-wide association studies, and suggest potential areas of focus in the future imaging genetics research. The approach outlined in this work demonstrates the utility of the network-based approach in studying genetic effects on the brain.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
The purpose of this work is a unified and general treatment of activity in neural networks from a mathematical pOint of view. Possible applications of the theory presented are indica ted throughout the text. However, they are not explored in de tail for two reasons : first, the universal character of n- ral activity in nearly all animals requires some type of a general approach~ secondly, the mathematical perspicuity would suffer if too many experimental details and empirical peculiarities were interspersed among the mathematical investigation. A guide to many applications is supplied by the references concerning a variety of specific issues. Of course the theory does not aim at covering all individual problems. Moreover there are other approaches to neural network theory (see e.g. Poggio-Torre, 1978) based on the different lev els at which the nervous system may be viewed. The theory is a deterministic one reflecting the average be havior of neurons or neuron pools. In this respect the essay is writt...
Rasmussen, Christian Jørgen
This thesis describes the development of a computer-based simulator for transmission analysis in optical wavelength division multiplexing networks. A great part of the work concerns fundamental optical network simulator issues. Among these issues are identification of the versatility and user...... the different component models are invoked during the simulation of a system. A simple set of rules which makes it possible to simulate any network architectures is laid down. The modelling of the nonlinear fibre and the optical receiver is also treated. The work on the fibre concerns the numerical solution...
Kaltenbach, Hans-Michael; Stelling, Jörg
The analysis of complex biological networks has traditionally relied on decomposition into smaller, semi-autonomous units such as individual signaling pathways. With the increased scope of systems biology (models), rational approaches to modularization have become an important topic. With increasing acceptance of de facto modularity in biology, widely different definitions of what constitutes a module have sparked controversies. Here, we therefore review prominent classes of modular approaches based on formal network representations. Despite some promising research directions, several important theoretical challenges remain open on the way to formal, function-centered modular decompositions for dynamic biological networks.
Full Text Available Chronic obstructive pulmonary disease (COPD is a multi-factor disease, in which metabolic disturbances played important roles. In this paper, functional information was integrated into a COPD-related metabolic network to assess similarity between genes. Then a gene prioritization method was applied to the COPD-related metabolic network to prioritize COPD candidate genes. The gene prioritization method was superior to ToppGene and ToppNet in both literature validation and functional enrichment analysis. Top-ranked genes prioritized from the metabolic perspective with functional information could promote the better understanding about the molecular mechanism of this disease. Top 100 genes might be potential markers for diagnostic and effective therapies.
Aronow Bruce J
Full Text Available Abstract Background Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-protein interaction network (PPIN analyses. Results For the first time, extended versions of the PageRank and HITS algorithms, and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds", and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and also evaluate and compare the performance of our approach. Under appropriate settings – for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and step size 6 for K-Step Markov method – the three methods achieved a comparable AUC value, suggesting a similar performance. Conclusion Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix–loop–helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks e...
Full Text Available In this study, we infer the breast cancer gene regulatory network from gene expression data. This network is obtained from the application of the BC3Net inference algorithm to a large-scale gene expression data set consisting of $351$ patient samples. In order to elucidate the functional relevance of the inferred network, we are performing a Gene Ontology (GO analysis for its structural components. Our analysis reveals that most significant GO-terms we find for the breast cancer network represent functional modules of biological processes that are described by known cancer hallmarks, including translation, immune response, cell cycle, organelle fission, mitosis, cell adhesion, RNA processing, RNA splicing and response to wounding. Furthermore, by using a curated list of census cancer genes, we find an enrichment in these functional modules. Finally, we study cooperative effects of chromosomes based on information of interacting genes in the beast cancer network. We find that chromosome $21$ is most coactive with other chromosomes. To our knowledge this is the first study investigating the genome-scale breast cancer network.
Smith, William T.
Conventional computing schemes have long been used to analyze problems in electromagnetics (EM). The vast majority of EM applications require computationally intensive algorithms involving numerical integration and solutions to large systems of equations. The feasibility of using neural network computing algorithms for antenna analysis is investigated. The ultimate goal is to use a trained neural network algorithm to reduce the computational demands of existing reflector surface error compensation techniques. Neural networks are computational algorithms based on neurobiological systems. Neural nets consist of massively parallel interconnected nonlinear computational elements. They are often employed in pattern recognition and image processing problems. Recently, neural network analysis has been applied in the electromagnetics area for the design of frequency selective surfaces and beam forming networks. The backpropagation training algorithm was employed to simulate classical antenna array synthesis techniques. The Woodward-Lawson (W-L) and Dolph-Chebyshev (D-C) array pattern synthesis techniques were used to train the neural network. The inputs to the network were samples of the desired synthesis pattern. The outputs are the array element excitations required to synthesize the desired pattern. Once trained, the network is used to simulate the W-L or D-C techniques. Various sector patterns and cosecant-type patterns (27 total) generated using W-L synthesis were used to train the network. Desired pattern samples were then fed to the neural network. The outputs of the network were the simulated W-L excitations. A 20 element linear array was used. There were 41 input pattern samples with 40 output excitations (20 real parts, 20 imaginary). A comparison between the simulated and actual W-L techniques is shown for a triangular-shaped pattern. Dolph-Chebyshev is a different class of synthesis technique in that D-C is used for side lobe control as opposed to pattern
Chavan, Shweta S; Bauer, Michael A; Scutari, Marco; Nagarajan, Radhakrishnan
There has been recent interest in capturing the functional relationships (FRs) from high-throughput assays using suitable computational techniques. FRs elucidate the working of genes in concert as a system as opposed to independent entities hence may provide preliminary insights into biological pathways and signalling mechanisms. Bayesian structure learning (BSL) techniques and its extensions have been used successfully for modelling FRs from expression profiles. Such techniques are especially useful in discovering undocumented FRs, investigating non-canonical signalling mechanisms and cross-talk between pathways. The objective of the present study is to develop a graphical user interface (GUI), NATbox: Network Analysis Toolbox in the language R that houses a battery of BSL algorithms in conjunction with suitable statistical tools for modelling FRs in the form of acyclic networks from gene expression profiles and their subsequent analysis. NATbox is a menu-driven open-source GUI implemented in the R statistical language for modelling and analysis of FRs from gene expression profiles. It provides options to (i) impute missing observations in the given data (ii) model FRs and network structure from gene expression profiles using a battery of BSL algorithms and identify robust dependencies using a bootstrap procedure, (iii) present the FRs in the form of acyclic graphs for visualization and investigate its topological properties using network analysis metrics, (iv) retrieve FRs of interest from published literature. Subsequently, use these FRs as structural priors in BSL (v) enhance scalability of BSL across high-dimensional data by parallelizing the bootstrap routines. NATbox provides a menu-driven GUI for modelling and analysis of FRs from gene expression profiles. By incorporating readily available functions from existing R-packages, it minimizes redundancy and improves reproducibility, transparency and sustainability, characteristic of open-source environments
Miryala, Sravan Kumar; Anbarasu, Anand; Ramaiah, Sudha
Computational analysis of biomolecular interaction networks is now gaining a lot of importance to understand the functions of novel genes/proteins. Gene interaction (GI) network analysis and protein-protein interaction (PPI) network analysis play a major role in predicting the functionality of interacting genes or proteins and gives an insight into the functional relationships and evolutionary conservation of interactions among the genes. An interaction network is a graphical representation of gene/protein interactome, where each gene/protein is a node, and interaction between gene/protein is an edge. In this review, we discuss the popular open source databases that serve as data repositories to search and collect protein/gene interaction data, and also tools available for the generation of interaction network, visualization and network analysis. Also, various network analysis approaches like topological approach and clustering approach to study the network properties and functional enrichment server which illustrates the functions and pathway of the genes and proteins has been discussed. Hence the distinctive attribute mentioned in this review is not only to provide an overview of tools and web servers for gene and protein-protein interaction (PPI) network analysis but also to extract useful and meaningful information from the interaction networks. Copyright © 2017 Elsevier B.V. All rights reserved.
David J. Burks
Full Text Available The Auxin Response Factor (ARF family of transcription factors is an important regulator of environmental response and symbiotic nodulation in the legume Medicago truncatula. While previous studies have identified members of this family, a recent spurt in gene expression data coupled with genome update and reannotation calls for a reassessment of the prevalence of ARF genes and their interaction networks in M. truncatula. We performed a comprehensive analysis of the M. truncatula genome and transcriptome that entailed search for novel ARF genes and the co-expression networks. Our investigation revealed 8 novel M. truncatula ARF (MtARF genes, of the total 22 identified, and uncovered novel gene co-expression networks as well. Furthermore, the topological clustering and single enrichment analysis of several network models revealed the roles of individual members of the MtARF family in nitrogen regulation, nodule initiation, and post-embryonic development through a specialized protein packaging and secretory pathway. In summary, this study not just shines new light on an important gene family, but also provides a guideline for identification of new members of gene families and their functional characterization through network analyses.
Carson, Matthew B; Lu, Hui
In recent years, high-throughput protein interaction identification methods have generated a large amount of data. When combined with the results from other in vivo and in vitro experiments, a complex set of relationships between biological molecules emerges. The growing popularity of network analysis and data mining has allowed researchers to recognize indirect connections between these molecules. Due to the interdependent nature of network entities, evaluating proteins in this context can reveal relationships that may not otherwise be evident. We examined the human protein interaction network as it relates to human illness using the Disease Ontology. After calculating several topological metrics, we trained an alternating decision tree (ADTree) classifier to identify disease-associated proteins. Using a bootstrapping method, we created a tree to highlight conserved characteristics shared by many of these proteins. Subsequently, we reviewed a set of non-disease-associated proteins that were misclassified by the algorithm with high confidence and searched for evidence of a disease relationship. Our classifier was able to predict disease-related genes with 79% area under the receiver operating characteristic (ROC) curve (AUC), which indicates the tradeoff between sensitivity and specificity and is a good predictor of how a classifier will perform on future data sets. We found that a combination of several network characteristics including degree centrality, disease neighbor ratio, eccentricity, and neighborhood connectivity help to distinguish between disease- and non-disease-related proteins. Furthermore, the ADTree allowed us to understand which combinations of strongly predictive attributes contributed most to protein-disease classification. In our post-processing evaluation, we found several examples of potential novel disease-related proteins and corresponding literature evidence. In addition, we showed that first- and second-order neighbors in the PPI network
Full Text Available Autism spectrum disorder (ASD is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
Full Text Available In biology, networks are used in different contexts as ways to represent relationships between entities, such as for instance interactions between genes, proteins or metabolites. Despite progress in the analysis of such networks and their potential to better understand the collective impact of genes on complex traits, one remaining challenge is to establish the biologic validity of gene co-expression networks and to determine what governs their organization. We used WGCNA to construct and analyze seven gene expression datasets from several tissues of mouse recombinant inbred strains (RIS. For six out of the 7 networks, we found that linkage to module QTLs (mQTLs could be established for 29.3% of gene co-expression modules detected in the several mouse RIS. For about 74.6% of such genetically-linked modules, the mQTL was on the same chromosome as the one contributing most genes to the module, with genes originating from that chromosome showing higher connectivity than other genes in the modules. Such modules (that we considered as genetically-driven had network statistic properties (density, centralization and heterogeneity that set them apart from other modules in the network. Altogether, a sizeable portion of gene co-expression modules detected in mouse RIS panels had genetic determinants as their main organizing principle. In addition to providing a biologic interpretation validation for these modules, these genetic determinants imparted on them particular properties that set them apart from other modules in the network, to the point that they can be predicted to a large extent on the basis of their network statistics.
Full Text Available Here, to understand the molecular mechanisms underlying cell death induced by sodium fluoride (NaF, we analyzed gene expression patterns in rat oral epithelial ROE2 cells exposed to NaF using global-scale microarrays and bioinformatics tools. A relatively high concentration of NaF (2 mM induced cell death concomitant with decreases in mitochondrial membrane potential, chromatin condensation and caspase-3 activation. Using 980 probe sets, we identified 432 up-regulated and 548 down-regulated genes, that were differentially expressed by >2.5-fold in the cells treated with 2 mM of NaF and categorized them into 4 groups by K-means clustering. Ingenuity® pathway analysis revealed several gene networks from gene clusters. The gene networks Up-I and Up-II included many up-regulated genes that were mainly associated with the biological function of induction or prevention of cell death, respectively, such as Atf3, Ddit3 and Fos (for Up-I and Atf4 and Hspa5 (for Up-II. Interestingly, knockdown of Ddit3 and Hspa5 significantly increased and decreased the number of viable cells, respectively. Moreover, several endoplasmic reticulum (ER stress-related genes including, Ddit3, Atf4 and Hapa5, were observed in these gene networks. These findings will provide further insight into the molecular mechanisms of NaF-induced cell death accompanying ER stress in oral epithelial cells.
Smita, Shuchi; Katiyar, Amit; Pandey, Dev Mani; Chinnusamy, Viswanathan; Archak, Sunil; Bansal, Kailash Chander
Identification of genes that are coexpressed across various tissues and environmental stresses is biologically interesting, since they may play coordinated role in similar biological processes. Genes with correlated expression patterns can be best identified by using coexpression network analysis of transcriptome data. In the present study, we analyzed the temporal-spatial coordination of gene expression in root, leaf and panicle of rice under drought stress and constructed network using WGCNA and Cytoscape. Total of 2199 differentially expressed genes (DEGs) were identified in at least three or more tissues, wherein 88 genes have coordinated expression profile among all the six tissues under drought stress. These 88 highly coordinated genes were further subjected to module identification in the coexpression network. Based on chief topological properties we identified 18 hub genes such as ABC transporter, ATP-binding protein, dehydrin, protein phosphatase 2C, LTPL153 - Protease inhibitor, phosphatidylethanolaminebinding protein, lactose permease-related, NADP-dependent malic enzyme, etc. Motif enrichment analysis showed the presence of ABRE cis-elements in the promoters of > 62% of the coordinately expressed genes. Our results suggest that drought stress mediated upregulated gene expression was coordinated through an ABA-dependent signaling pathway across tissues, at least for the subset of genes identified in this study, while down regulation appears to be regulated by tissue specific pathways in rice.
Kari Y Lam
Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.
The NET-2 Network Analysis Program is a general purpose digital computer program which solves the nonlinear time domain response and the linearized small signal frequency domain response of an arbitrary network of interconnected components. NET-2 is capable of handling a variety of components and has been applied to problems in several engineering fields, including electronic circuit design and analysis, missile flight simulation, control systems, heat flow, fluid flow, mechanical systems, structural dynamics, digital logic, communications network design, solid state device physics, fluidic systems, and nuclear vulnerability due to blast, thermal, gamma radiation, neutron damage, and EMP effects. Network components may be selected from a repertoire of built-in models or they may be constructed by the user through appropriate combinations of mathematical, empirical, and topological functions. Higher-level components may be defined by subnetworks composed of any combination of user-defined components and built-in models. The program provides a modeling capability to represent and intermix system components on many levels, e.g., from hole and electron spatial charge distributions in solid state devices through discrete and integrated electronic components to functional system blocks. NET-2 is capable of simultaneous computation in both the time and frequency domain, and has statistical and optimization capability. Network topology may be controlled as a function of the network solution. (U.S.)
Liu, Xiaoping; Tang, Wei-Hua; Zhao, Xing-Ming; Chen, Luonan
Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other pathogenic fungi, which
Full Text Available Thickening of tree stems is the result of secondary growth, accomplished by the meristematic activity of the vascular cambium. Secondary growth of the stem entails developmental cascades resulting in the formation of secondary phloem outwards and secondary xylem (i.e., wood inwards of the stem. Signaling and transcriptional reprogramming by the phytohormone ethylene modifies cambial growth and cell differentiation, but the molecular link between ethylene and secondary growth remains unknown. We addressed this shortcoming by analyzing expression profiles and co-expression networks of ethylene pathway genes using the AspWood transcriptome database which covers all stages of secondary growth in aspen (Populus tremula stems. ACC synthase expression suggests that the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC is synthesized during xylem expansion and xylem cell maturation. Ethylene-mediated transcriptional reprogramming occurs during all stages of secondary growth, as deduced from AspWood expression profiles of ethylene-responsive genes. A network centrality analysis of the AspWood dataset identified EIN3D and 11 ERFs as hubs. No overlap was found between the co-expressed genes of the EIN3 and ERF hubs, suggesting target diversification and hence independent roles for these transcription factor families during normal wood formation. The EIN3D hub was part of a large co-expression gene module, which contained 16 transcription factors, among them several new candidates that have not been earlier connected to wood formation and a VND-INTERACTING 2 (VNI2 homolog. We experimentally demonstrated Populus EIN3D function in ethylene signaling in Arabidopsis thaliana. The ERF hubs ERF118 and ERF119 were connected on the basis of their expression pattern and gene co-expression module composition to xylem cell expansion and secondary cell wall formation, respectively. We hereby establish data resources for ethylene-responsive genes and
Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut
Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. Among those, several transcriptional
Sep 28, 2015 ... Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging ... two types of methods differ primarily based on whether ..... negligible, allowing us to draw the qualitative conclusions .... research will be conducted to develop additional biologically.
Chen, Huajun; Ding, Li; Wu, Zhaohui; Yu, Tong; Dhanapalan, Lavanya; Chen, Jake Y
The Semantic Web technology enables integration of heterogeneous data on the World Wide Web by making the semantics of data explicit through formal ontologies. In this article, we survey the feasibility and state of the art of utilizing the Semantic Web technology to represent, integrate and analyze the knowledge in various biomedical networks. We introduce a new conceptual framework, semantic graph mining, to enable researchers to integrate graph mining with ontology reasoning in network data analysis. Through four case studies, we demonstrate how semantic graph mining can be applied to the analysis of disease-causal genes, Gene Ontology category cross-talks, drug efficacy analysis and herb-drug interactions analysis.
Tornow, Sabine; Mewes, H W
Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships.
Full Text Available With advances in next-generation sequencing(NGS technologies, a large number of multiple types of high-throughput genomics data are available. A great challenge in exploring cancer progression is to identify the driver genes from the variant genes by analyzing and integrating multi-types genomics data. Breast cancer is known as a heterogeneous disease. The identification of subtype-specific driver genes is critical to guide the diagnosis, assessment of prognosis and treatment of breast cancer. We developed an integrated frame based on gene expression profiles and copy number variation (CNV data to identify breast cancer subtype-specific driver genes. In this frame, we employed statistical machine-learning method to select gene subsets and utilized an module-network analysis method to identify potential candidate driver genes. The final subtype-specific driver genes were acquired by paired-wise comparison in subtypes. To validate specificity of the driver genes, the gene expression data of these genes were applied to classify the patient samples with 10-fold cross validation and the enrichment analysis were also conducted on the identified driver genes. The experimental results show that the proposed integrative method can identify the potential driver genes and the classifier with these genes acquired better performance than with genes identified by other methods.
Grammaticos, B; Carstea, A S; Ramani, A
We examine the dynamics of a network of genes focusing on a periodic chain of genes, of arbitrary length. We show that within a given class of sigmoids representing the equilibrium probability of the binding of the RNA polymerase to the core promoter, the system possesses a single stable fixed point. By slightly modifying the sigmoid, introducing 'stiffer' forms, we show that it is possible to find network configurations exhibiting bistable behaviour. Our results do not depend crucially on the length of the chain considered: calculations with finite chains lead to similar results. However, a realistic study of regulatory genetic networks would require the consideration of more complex topologies and interactions
Full Text Available The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23. In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We show that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that, in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution.
Pájaro, Manuel; Otero-Muras, Irene; Vázquez, Carlos; Alonso, Antonio A
Gene regulation is inherently stochastic. In many applications concerning Systems and Synthetic Biology such as the reverse engineering and the de novo design of genetic circuits, stochastic effects (yet potentially crucial) are often neglected due to the high computational cost of stochastic simulations. With advances in these fields there is an increasing need of tools providing accurate approximations of the stochastic dynamics of gene regulatory networks (GRNs) with reduced computational effort. This work presents SELANSI (SEmi-LAgrangian SImulation of GRNs), a software toolbox for the simulation of stochastic multidimensional gene regulatory networks. SELANSI exploits intrinsic structural properties of gene regulatory networks to accurately approximate the corresponding Chemical Master Equation with a partial integral differential equation that is solved by a semi-lagrangian method with high efficiency. Networks under consideration might involve multiple genes with self and cross regulations, in which genes can be regulated by different transcription factors. Moreover, the validity of the method is not restricted to a particular type of kinetics. The tool offers total flexibility regarding network topology, kinetics and parameterization, as well as simulation options. SELANSI runs under the MATLAB environment, and is available under GPLv3 license at https://sites.google.com/view/selansi. email@example.com. © The Author(s) 2017. Published by Oxford University Press.
Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques
Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.
Robins, Garry; Lewis, Jenny; Wang, Peng
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs......To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social......), and stochastic actor-oriented models. We focus most attention on ERGMs by providing an illustrative example of a model for a strategic information network within a local government. We draw inferences about the structural role played by individuals recognized as key innovators and conclude that such an approach...
Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and construction coordination networks of regulatory genes. We also examined the behaviors of different methods to microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some analyses we did clearly show Pearson and Distance Covariance methods have distinct behaviors as compared to all other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates the pre-analysis to identify the best performing method for gene association and coexpression network construction.
Kolaczyk, Eric D
Networks have permeated everyday life through everyday realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).
Mostafavi, Sara; Morris, Quaid
In this article, we review how interaction networks can be used alone or in combination in an automated fashion to provide insight into gene and protein function. We describe the concept of a "gene-recommender system" that can be applied to any large collection of interaction networks to make predictions about gene or protein function based on a query list of proteins that share a function of interest. We discuss these systems in general and focus on one specific system, GeneMANIA, that has unique features and uses different algorithms from the majority of other systems. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert
The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network coined as the gene orientation inversion network (GOIN). GOIN's complex network encodes short- and long-range patterns of inversion of the orientation of pairs of gene in the chromosome. We selected Plasmodium falciparum as a case of study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of gene with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdös-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding for RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and
Full Text Available Abstract Background Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.
Shiraishi, Tetsuya; Matsuyama, Shinako; Kitano, Hiroaki
Protein–protein interaction and gene regulatory networks are likely to be locked in a state corresponding to a disease by the behavior of one or more bistable circuits exhibiting switch-like behavior. Sets of genes could be over-expressed or repressed when anomalies due to disease appear, and the circuits responsible for this over- or under-expression might persist for as long as the disease state continues. This paper shows how a large-scale analysis of network bistability for various human cancers can identify genes that can potentially serve as drug targets or diagnosis biomarkers. PMID:20628618
Full Text Available The architecture of tomato inflorescence strongly affects flower production and subsequent crop yield. To understand the genetic activities involved, insight into the underlying network of genes that initiate and control the sympodial growth in the tomato is essential. In this paper, we show how the structure of this network can be derived from available data of the expressions of the involved genes. Our approach starts from employing biological expert knowledge to select the most probable gene candidates behind branching behavior. To find how these genes interact, we develop a stepwise procedure for computational inference of the network structure. Our data consists of expression levels from primary shoot meristems, measured at different developmental stages on three different genotypes of tomato. With the network inferred by our algorithm, we can explain the dynamics corresponding to all three genotypes simultaneously, despite their apparent dissimilarities. We also correctly predict the chronological order of expression peaks for the main hubs in the network. Based on the inferred network, using optimal experimental design criteria, we are able to suggest an informative set of experiments for further investigation of the mechanisms underlying branching behavior.
Jia, Peilin; Ewers, Jeffrey M; Zhao, Zhongming
Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be
Full Text Available Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs that are more likely to be associated with epilepsy. The responsible gene(s within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research.In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways.The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Full Text Available Abstract Background It has been long well known that genes do not act alone; rather groups of genes act in consort during a biological process. Consequently, the expression levels of genes are dependent on each other. Experimental techniques to detect such interacting pairs of genes have been in place for quite some time. With the advent of microarray technology, newer computational techniques to detect such interaction or association between gene expressions are being proposed which lead to an association network. While most microarray analyses look for genes that are differentially expressed, it is of potentially greater significance to identify how entire association network structures change between two or more biological settings, say normal versus diseased cell types. Results We provide a recipe for conducting a differential analysis of networks constructed from microarray data under two experimental settings. At the core of our approach lies a connectivity score that represents the strength of genetic association or interaction between two genes. We use this score to propose formal statistical tests for each of following queries: (i whether the overall modular structures of the two networks are different, (ii whether the connectivity of a particular set of "interesting genes" has changed between the two networks, and (iii whether the connectivity of a given single gene has changed between the two networks. A number of examples of this score is provided. We carried out our method on two types of simulated data: Gaussian networks and networks based on differential equations. We show that, for appropriate choices of the connectivity scores and tuning parameters, our method works well on simulated data. We also analyze a real data set involving normal versus heavy mice and identify an interesting set of genes that may play key roles in obesity. Conclusions Examining changes in network structure can provide valuable information about the
Full Text Available Abstract Background Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge is typically not available that a given pair of genes do not interact. Results A recent advance in research on data mining is a method capable of learning a classifier from only positive and unlabeled examples, that does not need labeled negative examples. Applied to the reconstruction of gene regulatory networks, we show that this method significantly outperforms the current state of the art of machine learning methods. We assess the new method using both simulated and experimental data, and obtain major performance improvement. Conclusions Compared to unsupervised methods for gene network inference, supervised methods are potentially more accurate, but for training they need a complete set of known regulatory connections. A supervised method that can be trained using only positive and unlabeled data, as presented in this paper, is especially beneficial for the task of inferring gene regulatory networks, because only an incomplete set of known regulatory connections is available in public databases such as RegulonDB, TRRD, KEGG, Transfac, and IPA.
Jia, Chen; Qian, Hong; Chen, Min; Zhang, Michael Q.
The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.
Wu, Guanming; Haw, Robin
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, therefore providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of data via using network modules and increasing the statistic analysis power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60 % of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures on how this functional interaction network is constructed by integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example on how to use ReactomeFIViz for performing network-based data analysis for a list of genes.
Gene regulatory networks analyze the relationships between genes allowing us to un- derstand the gene regulatory interactions in systems biology. Gene expression data from the microarray experiments is used to obtain the gene regulatory networks. How- ever, the microarray data is discrete, noisy and non-linear which makes learning the networks a challenging problem and existing gene network inference methods do not give consistent results. Current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However each individual method has an equal contribution to the consen- sus prediction. We have developed a linear programming-based consensus approach which uses learned weights from linear programming among individual methods such that the methods have di↵erent weights depending on their performance. Our result reveals that assigning di↵erent weights to individual methods rather than giving them equal weights improves the performance of the consensus. The linear programming- based consensus method is evaluated and it had the best performance on in silico and Saccharomyces cerevisiae networks, and the second best on the Escherichia coli network outperformed by Inferelator Pipeline method which gives inconsistent results across a wide range of microarray data sets.
Zhou, Xuezhong; Liu, Baoyan; Wu, Zhaohui; Feng, Yi
The amount of biomedical data in different disciplines is growing at an exponential rate. Integrating these significant knowledge sources to generate novel hypotheses for systems biology research is difficult. Traditional Chinese medicine (TCM) is a completely different discipline, and is a complementary knowledge system to modern biomedical science. This paper uses a significant TCM bibliographic literature database in China, together with MEDLINE, to help discover novel gene functional knowledge. We present an integrative mining approach to uncover the functional gene relationships from MEDLINE and TCM bibliographic literature. This paper introduces TCM literature (about 50,000 records) as one knowledge source for constructing literature-based gene networks. We use the TCM diagnosis, TCM syndrome, to automatically congregate the related genes. The syndrome-gene relationships are discovered based on the syndrome-disease relationships extracted from TCM literature and the disease-gene relationships in MEDLINE. Based on the bubble-bootstrapping and relation weight computing methods, we have developed a prototype system called MeDisco/3S, which has name entity and relation extraction, and online analytical processing (OLAP) capabilities, to perform the integrative mining process. We have got about 200,000 syndrome-gene relations, which could help generate syndrome-based gene networks, and help analyze the functional knowledge of genes from syndrome perspective. We take the gene network of Kidney-Yang Deficiency syndrome (KYD syndrome) and the functional analysis of some genes, such as CRH (corticotropin releasing hormone), PTH (parathyroid hormone), PRL (prolactin), BRCA1 (breast cancer 1, early onset) and BRCA2 (breast cancer 2, early onset), to demonstrate the preliminary results. The underlying hypothesis is that the related genes of the same syndrome will have some biological functional relationships, and will constitute a functional network. This paper presents
Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D
During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.
In this paper we perform a preliminary analysis of semantic networks to determine the most important terms that could be used to optimize a summarization task. In our experiments, we measure how the properties of a semantic network change, when the terms in the network are removed. Our preliminar...
Social networks have received much attention these days. Researchers have developed different methods to study the structure and characteristics of the network topology. Our focus is on spectral analysis of the adjacency matrix of the underlying network. Recent work showed good properties in the adjacency spectral space but there are few…
Gadaleta, Francesco; Bessonov, Kyrylo; Van Steen, Kristel
The vast amount of heterogeneous omics data, encompassing a broad range of biomolecular information, requires novel methods of analysis, including those that integrate the available levels of information. In this work, we describe Regression2Net, a computational approach that is able to integrate gene expression and genomic or methylation data in two steps. First, penalized regressions are used to build Expression-Expression (EEnet) and Expression-Genomic or Expression-Methylation (EMnet) networks. Second, network theory is used to highlight important communities of genes. When applying our approach, Regression2Net to gene expression and methylation profiles for individuals with glioblastoma multiforme, we identified, respectively, 284 and 447 potentially interesting genes in relation to glioblastoma pathology. These genes showed at least one connection in the integrated networks ANDnet and XORnet derived from aforementioned EEnet and EMnet networks. Although the edges in ANDnet occur in both EEnet and EMnet, the edges in XORnet occur in EMnet but not in EEnet. In-depth biological analysis of connected genes in ANDnet and XORnet revealed genes that are related to energy metabolism, cell cycle control (AATF), immune system response, and several cancer types. Importantly, we observed significant overrepresentation of cancer-related pathways including glioma, especially in the XORnet network, suggesting a nonignorable role of methylation in glioblastoma multiforma. In the ANDnet, we furthermore identified potential glioma suppressor genes ACCN3 and ACCN4 linked to the NBPF1 neuroblastoma breakpoint family, as well as numerous ABC transporter genes (ABCA1, ABCB1) suggesting drug resistance of glioblastoma tumors. © 2016 WILEY PERIODICALS, INC.
Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah
Evolution of computer technologies makes it possible to access a large amount and various kinds of biological data via internet such as DNA sequences, proteomics data and information discovered about them. It is expected that the combination of various data could help researchers find further knowledge about them. Roles of a visualization system are to invoke human abilities to integrate information and to recognize certain patterns in the data. Thus, when the various kinds of data are examined and analyzed manually, an effective visualization system is an essential part. One instance of these integrated visualizations can be combination of protein-protein interaction (PPI) data and Gene Ontology (GO) which could help enhance the analysis of PPI network. We introduce a simple but comprehensive visualization system that integrates GO and PPI data where GO and PPI graphs are visualized side-by-side and supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and GO directedacyclic- graph such as context-based browsing and common ancestors finding.
Full Text Available Abstract Background Starch serves as a temporal storage of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM. Results Time-series significant analysis was applied for Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF. A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090, which participates in starch degradation in chloroplast, is the most frequently connected gene (a hub gene. The robustness of gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070 and constans-like (COL: At2g21320, were identified as positive regulators of starch synthase 4 (SS4: At4g18240. The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplast and its contained starch granules. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. Conclusions In this study, we utilized a systematic approach of microarray
Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P
Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
Full Text Available Abstract Background Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, evidence of robust algorithms directly exploiting basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formulism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subjected to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters
Lee, Tak; Kim, Hyojin; Lee, Insuk
Although next-generation sequencing (NGS) technology has enabled the decoding of many crop species genomes, most of the underlying genetic components for economically important crop traits remain to be determined. Network approaches have proven useful for the study of the reference plant, Arabidopsis thaliana, and the success of network-based crop genetics will also require the availability of a genome-scale functional networks for crop species. In this review, we discuss how to construct functional networks and elucidate the holistic view of a crop system. The crop gene network then can be used for gene prioritization and the analysis of resequencing-based genome-wide association study (GWAS) data, the amount of which will rapidly grow in the field of crop science in the coming years. Copyright © 2015 Elsevier Ltd. All rights reserved.
Yasir Tariq Mohmand; Fahad Mehmood; Fahd Amjad; Nedim Makarevic
The structure and properties of public transportation networks can provide suggestions for urban planning and public policies. This study contributes a complex network analysis of the Guangzhou metro. The metro network has 236 kilometers of track and is the 6th busiest metro system of the world. In this paper topological properties of the network are explored. We observed that the network displays small world properties and is assortative in nature. The network possesses a high average degree...
Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner
Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η(2) (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Full Text Available Abstract Background The increasing number of available genotypes for genetic studies in humans requires more advanced techniques of analysis. We previously reported significant univariate associations between gene polymorphisms and antidepressant response in mood disorders. However the combined analysis of multiple gene polymorphisms and clinical variables requires the use of non linear methods. Methods In the present study we tested a neural network strategy for a combined analysis of two gene polymorphisms. A Multi Layer Perceptron model showed the best performance and was therefore selected over the other networks. One hundred and twenty one depressed inpatients treated with fluvoxamine in the context of previously reported pharmacogenetic studies were included. The polymorphism in the transcriptional control region upstream of the 5HTT coding sequence (SERTPR and in the Tryptophan Hydroxylase (TPH gene were analysed simultaneously. Results A multi layer perceptron network composed by 1 hidden layer with 7 nodes was chosen. 77.5 % of responders and 51.2% of non responders were correctly classified (ROC area = 0.731 – empirical p value = 0.0082. Finally, we performed a comparison with traditional techniques. A discriminant function analysis correctly classified 34.1 % of responders and 68.1 % of non responders (F = 8.16 p = 0.0005. Conclusions Overall, our findings suggest that neural networks may be a valid technique for the analysis of gene polymorphisms in pharmacogenetic studies. The complex interactions modelled through NN may be eventually applied at the clinical level for the individualized therapy.
Sie, R. L. L. (2012). COalitions in COOperation Networks (COCOON): Social Network Analysis and Game Theory to Enhance Cooperation Networks (Unpublished doctoral dissertation). September, 28, 2012, Open Universiteit in the Netherlands (CELSTEC), Heerlen, The Netherlands.
Linghu, Bolan; Franzosa, Eric A; Xia, Yu
Networks of functional associations between genes have recently been successfully used for gene function and disease-related research. A typical approach for constructing such functional linkage gene networks (FLNs) is based on the integration of diverse high-throughput functional genomics datasets. Data integration is a nontrivial task due to the heterogeneous nature of the different data sources and their variable accuracy and completeness. The presence of correlations between data sources also adds another layer of complexity to the integration process. In this chapter we discuss an approach for constructing a human FLN from data integration and a subsequent application of the FLN to novel disease gene discovery. Similar approaches can be applied to nonhuman species and other discovery tasks.
Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju
Our understanding of the molecular mechanisms underlying Alzheimer's disease (AD) remains incomplete. Previous studies have revealed that genetic factors provide a significant contribution to the pathogenesis and development of AD. In the past years, numerous genes implicated in this disease have been identified via genetic association studies on candidate genes or at the genome-wide level. However, in many cases, the roles of these genes and their interactions in AD are still unclear. A comprehensive and systematic analysis focusing on the biological function and interactions of these genes in the context of AD will therefore provide valuable insights to understand the molecular features of the disease. In this study, we collected genes potentially associated with AD by screening publications on genetic association studies deposited in PubMed. The major biological themes linked with these genes were then revealed by function and biochemical pathway enrichment analysis, and the relation between the pathways was explored by pathway crosstalk analysis. Furthermore, the network features of these AD-related genes were analyzed in the context of human interactome and an AD-specific network was inferred using the Steiner minimal tree algorithm. We compiled 430 human genes reported to be associated with AD from 823 publications. Biological theme analysis indicated that the biological processes and biochemical pathways related to neurodevelopment, metabolism, cell growth and/or survival, and immunology were enriched in these genes. Pathway crosstalk analysis then revealed that the significantly enriched pathways could be grouped into three interlinked modules-neuronal and metabolic module, cell growth/survival and neuroendocrine pathway module, and immune response-related module-indicating an AD-specific immune-endocrine-neuronal regulatory network. Furthermore, an AD-specific protein network was inferred and novel genes potentially associated with AD were identified. By
Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Altay, Gökmen; Emmert-Streib, Frank
Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Schaffter, Thomas; Marbach, Daniel; Floreano, Dario
Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. firstname.lastname@example.org.
Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T
Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni
In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
Jordan K. Boutilier
Full Text Available The pulmonary myocardium is a muscular coat surrounding the pulmonary and caval veins. Although its definitive physiological function is unknown, it may have a pathological role as the source of ectopic beats initiating atrial fibrillation. How the pulmonary myocardium gains pacemaker function is not clearly defined, although recent evidence indicates that changed transcriptional gene expression networks are at fault. The gene expression profile of this distinct cell type in situ was examined to investigate underlying molecular events that might contribute to atrial fibrillation. Via systems genetics, a whole-lung transcriptome data set from the BXD recombinant inbred mouse resource was analyzed, uncovering a pulmonary cardiomyocyte gene network of 24 transcripts, coordinately regulated by chromosome 1 and 2 loci. Promoter enrichment analysis and interrogation of publicly available ChIP-seq data suggested that transcription of this gene network may be regulated by the concerted activity of NKX2-5, serum response factor, myocyte enhancer factor 2, and also, at a post-transcriptional level, by RNA binding protein motif 20. Gene ontology terms indicate that this gene network overlaps with molecular markers of the stressed heart. Therefore, we propose that perturbed regulation of this gene network might lead to altered calcium handling, myocyte growth, and contractile force contributing to the aberrant electrophysiological properties observed in atrial fibrillation. We reveal novel molecular interactions and pathways representing possible therapeutic targets for atrial fibrillation. In addition, we highlight the utility of recombinant inbred mouse resources in detecting and characterizing gene expression networks of relatively small populations of cells that have a pathological significance.
Full Text Available Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB, which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other
Kordmahalleh, Mina Moradi; Sefidmazgi, Mohammad Gorji; Harrison, Scott H; Homaifar, Abdollah
The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network
Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.
Li, Mo; Belmonte, Juan Carlos Izpisua
Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.
Masys, Anthony J
Networks and Network Analysis for Defence and Security discusses relevant theoretical frameworks and applications of network analysis in support of the defence and security domains. This book details real world applications of network analysis to support defence and security. Shocks to regional, national and global systems stemming from natural hazards, acts of armed violence, terrorism and serious and organized crime have significant defence and security implications. Today, nations face an uncertain and complex security landscape in which threats impact/target the physical, social, economic
Kristopher J. L. Irizarry
Full Text Available Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1 that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.
Zhou, Xionghui; Liu, Juan
Although many methods have been proposed to reconstruct gene regulatory network, most of them, when applied in the sample-based data, can not reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while former researches either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely gene dependency network. By this way, we have constructed gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network scale free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specially related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
Takaku, Tomoiku; Ohyashiki, Junko H.; Zhang, Yu; Ohyashiki, Kazuma
The immune response to viral infection involves complex network of dynamic gene and protein interactions. We present here the dynamic gene network of the host immune response during human herpesvirus type 6 (HHV-6) infection in an adult T-cell leukemia cell line. Using a pathway-focused oligonucleotide DNA microarray, we found a possible association between chemokine genes regulating Th1/Th2 balance and genes regulating T-cell proliferation during HHV-6B infection. Gene network analysis using an integrated comprehensive workbench, VoyaGene, revealed that a gene encoding a TEC-family kinase, ITK, might be a putative modulator in the host immune response against HHV-6B infection. We conclude that Th2-dominated inflammatory reaction in host cells may play an important role in HHV-6B-infected T cells, thereby suggesting the possibility that ITK might be a therapeutic target in diseases related to dysregulation of Th1/Th2 balance. This study describes a novel approach to find genes related with the complex host-virus interaction using microarray data employing the Bayesian statistical framework
Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M
Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.
Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin
An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard
The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
The formation of the nervous system is a multistep process that yields a mature brain. Failure in any of the steps of this process may cause brain malfunction. In the early stages of embryonic development, neural progenitors quickly proliferate and then, at a specific moment, differentiate into neurons or glia. Once they become postmitotic neurons, they migrate to their final destinations and begin to extend their axons to connect with other neurons, sometimes located in quite distant regions, to establish different neural circuits. During the last decade, it has become evident that Zic genes, in addition to playing important roles in early development (e.g., gastrulation and neural tube closure), are involved in different processes of late brain development, such as neuronal migration, axon guidance, and refinement of axon terminals. ZIC proteins are therefore essential for the proper wiring and connectivity of the brain. In this chapter, we review our current knowledge of the role of Zic genes in the late stages of neural circuit formation.
He, Feng Q; Ollert, Markus
Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis...... have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data...
Full Text Available Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power since the number of variables far exceeds the number of the samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC method as an efficient and effective solution to regulatory network inference using these data. Specifically for genomic data, the proposed method is designed to deal with large-scale datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence, and the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a genes role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/.
Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko
Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a singular framework to enhance interpretation and hypotheses generation. We propose a workflow that merges network construction with gene expression data mining focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application software (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostrate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
Full Text Available Human gene regulatory networks (GRN can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs. Among these algorithms, we identified K-core decomposition, pagerank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network evaluated based on previously known roles of these genes in MCF-7 biology as well as in their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data accompanying this manuscript.
Smadar eBen-Tabou De-Leon
Full Text Available Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-βcatenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis controls directly the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations to their resistance to evolutionary change.
van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María
Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a show case, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As show cases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers. Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different
Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun
Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these
Di Camillo, Barbara; Toffolo, Gianna; Cobelli, Claudio
In the context of reverse engineering of biological networks, simulators are helpful to test and compare the accuracy of different reverse-engineering approaches in a variety of experimental conditions. A novel gene-network simulator is presented that resembles some of the main features of transcriptional regulatory networks related to topology, interaction among regulators of transcription, and expression dynamics. The simulator generates network topology according to the current knowledge of biological network organization, including scale-free distribution of the connectivity and clustering coefficient independent of the number of nodes in the network. It uses fuzzy logic to represent interactions among the regulators of each gene, integrated with differential equations to generate continuous data, comparable to real data for variety and dynamic complexity. Finally, the simulator accounts for saturation in the response to regulation and transcription activation thresholds and shows robustness to perturbations. It therefore provides a reliable and versatile test bed for reverse engineering algorithms applied to microarray data. Since the simulator describes regulatory interactions and expression dynamics as two distinct, although interconnected aspects of regulation, it can also be used to test reverse engineering approaches that use both microarray and protein-protein interaction data in the process of learning. A first software release is available at http://www.dei.unipd.it/~dicamill/software/netsim as an R programming language package.
Filling a gap in literature, this self-contained book presents theoretical and application-oriented results that allow for a structural exploration of complex networks. The work focuses not only on classical graph-theoretic methods, but also demonstrates the usefulness of structural graph theory as a tool for solving interdisciplinary problems. Applications to biology, chemistry, linguistics, and data analysis are emphasized. The book is suitable for a broad, interdisciplinary readership of researchers, practitioners, and graduate students in discrete mathematics, statistics, computer science,
Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher
The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.
The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474
Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O
The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
Daniel B Larremore
Full Text Available The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs, and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
Lim, Dajeong; Lee, Seung-Hwan; Kim, Nam-Kuk; Cho, Yong-Min; Chai, Han-Ha; Seong, Hwan-Hoo; Kim, Heebal
Marbling (intramuscular fat) is an important trait that affects meat quality and is a casual factor determining the price of beef in the Korean beef market. It is a complex trait and has many biological pathways related to muscle and fat. There is a need to identify functional modules or genes related to marbling traits and investigate their relationships through a weighted gene co-expression network analysis based on the system level. Therefore, we investigated the co-expression relationships of genes related to the 'marbling score' trait and systemically analyzed the network topology in Hanwoo (Korean cattle). As a result, we determined 3 modules (gene groups) that showed statistically significant results for marbling score. In particular, one module (denoted as red) has a statistically significant result for marbling score (p = 0.008) and intramuscular fat (p = 0.02) and water capacity (p = 0.006). From functional enrichment and relationship analysis of the red module, the pathway hub genes (IL6, CHRNE, RB1, INHBA and NPPA) have a direct interaction relationship and share the biological functions related to fat or muscle, such as adipogenesis or muscle growth. This is the first gene network study with m.logissimus in Hanwoo to observe co-expression patterns in divergent marbling phenotypes. It may provide insights into the functional mechanisms of the marbling trait.
Full Text Available Marbling (intramuscular fat is an important trait that affects meat quality and is a casual factor determining the price of beef in the Korean beef market. It is a complex trait and has many biological pathways related to muscle and fat. There is a need to identify functional modules or genes related to marbling traits and investigate their relationships through a weighted gene co-expression network analysis based on the system level. Therefore, we investigated the co-expression relationships of genes related to the ‘marbling score’ trait and systemically analyzed the network topology in Hanwoo (Korean cattle. As a result, we determined 3 modules (gene groups that showed statistically significant results for marbling score. In particular, one module (denoted as red has a statistically significant result for marbling score (p = 0.008 and intramuscular fat (p = 0.02 and water capacity (p = 0.006. From functional enrichment and relationship analysis of the red module, the pathway hub genes (IL6, CHRNE, RB1, INHBA and NPPA have a direct interaction relationship and share the biological functions related to fat or muscle, such as adipogenesis or muscle growth. This is the first gene network study with m.logissimus in Hanwoo to observe co-expression patterns in divergent marbling phenotypes. It may provide insights into the functional mechanisms of the marbling trait.
Soto-Girón, María Juliana; García-Vallejo, Felipe
One key step of human immunodeficiency virus type 1 (HIV-1) infection is the integration of its viral cDNA. This process is mediated through complex networks of host-virus interactions that alter several normal cell functions of the host. To study the complexity of disturbances in cell gene expression networks by HIV-1 integration, we constructed a network of human macrophage genes located close to chromatin regions rich in proviruses. To perform the network analysis, we selected 28 genes previously identified as the target of cDNA integration and their transcriptional profiles were obtained from GEO Profiles (NCBI). A total of 2770 interactions among the 28 genes located around the HIV-1 proviruses in human macrophages formed a highly dense main network connected to five sub-networks. The overall network was significantly enriched by genes associated with signal transduction, cellular communication and regulatory processes. To simulate the effects of HIV-1 integration in infected macrophages, five genes with the most number of interaction in the normal network were turned off by putting in zero the correspondent expression values. The HIV-1 infected network showed changes in its topology and alteration in the macrophage functions reflected in a re-programming of biosynthetic and general metabolic process. Understanding the complex virus-host interactions that occur during HIV-1 integration, may provided valuable genomic information to develop new antiviral treatments focusing on the management of some specific gene expression networks associated with viral integration. This is the first gene network which describes the human macrophages genes interactions related with HIV-1 integration. Copyright © 2011 Elsevier B.V. All rights reserved.
networks can be applied to better understand informal trade in developing countries, with a particular focus on Africa. The paper starts by discussing some of the fundamental concepts developed by social network analysis. Through a number of case studies, we show how social network analysis can...... illuminate the relevant causes of social patterns, the impact of social ties on economic performance, the diffusion of resources and information, and the exercise of power. The paper then examines some of the methodological challenges of social network analysis and how it can be combined with other...... approaches. The paper finally highlights some of the applications of social network analysis and their implications for trade policies....
Elizabeth A Osterndorff-Kahanek
Full Text Available Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY, nucleus accumbens (NAC, prefrontal cortex (PFC, and liver after four weekly cycles of chronic intermittent ethanol (CIE vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0-, 8-, and 120-hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000 at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600. Within each region, there was little gene overlap across time (~20%. All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global 'rewiring' of coexpression systems involving glial and immune signaling as well as neuronal genes.
Johnson, Michael R; Shkura, Kirill; Langley, Sarah R; Delahaye-Duriez, Andree; Srivastava, Prashant; Hill, W David; Rackham, Owen J L; Davies, Gail; Harris, Sarah E; Moreno-Moral, Aida; Rotival, Maxime; Speed, Doug; Petrovski, Slavé; Katz, Anaïs; Hayward, Caroline; Porteous, David J; Smith, Blair H; Padmanabhan, Sandosh; Hocking, Lynne J; Starr, John M; Liewald, David C; Visconti, Alessia; Falchi, Mario; Bottolo, Leonardo; Rossetti, Tiziana; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Grote, Alexander; Helmstaedter, Christoph; Becker, Albert J; Kaminski, Rafal M; Deary, Ian J; Petretto, Enrico
Genetic determinants of cognition are poorly characterized, and their relationship to genes that confer risk for neurodevelopmental disease is unclear. Here we performed a systems-level analysis of genome-wide gene expression data to infer gene-regulatory networks conserved across species and brain regions. Two of these networks, M1 and M3, showed replicable enrichment for common genetic variants underlying healthy human cognitive abilities, including memory. Using exome sequence data from 6,871 trios, we found that M3 genes were also enriched for mutations ascertained from patients with neurodevelopmental disease generally, and intellectual disability and epileptic encephalopathy in particular. M3 consists of 150 genes whose expression is tightly developmentally regulated, but which are collectively poorly annotated for known functional pathways. These results illustrate how systems-level analyses can reveal previously unappreciated relationships between neurodevelopmental disease-associated genes in the developed human brain, and provide empirical support for a convergent gene-regulatory network influencing cognition and neurodevelopmental disease.
Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC, is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases posses an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
Full Text Available Multimedia networks is an emerging subject that currently attracts the attention of research and industrial communities. This environment provides new entertainment services and business opportunities merged with all well-known network services like VoIP calls or file transfers. Such a heterogeneous system has to be able satisfy all network and end-user requirements which are increasing constantly. Therefore the simulation tools enabling deep analysis in order to find the key performance indicators and factors which influence the overall quality for specific network service the most are highly needed. This paper provides a study on the network parameters like communication technology, routing protocol, QoS mechanism, etc. and their effect on the performance of hybrid multimedia network. The analysis was performed in OPNET Modeler environment and the most interesting results are discussed at the end of this paper
Full Text Available For the purpose of improving the prediction of cancer prognosis in the clinical researches, various algorithms have been developed to construct the predictive models with the gene signatures detected by DNA microarrays. Due to the heterogeneity of the clinical samples, the list of differentially expressed genes (DEGs generated by the statistical methods or the machine learning algorithms often involves a number of false positive genes, which are not associated with the phenotypic differences between the compared clinical conditions, and subsequently impacts the reliability of the predictive models. In this study, we proposed a strategy, which combined the statistical algorithm with the gene-pathway bipartite networks, to generate the reliable lists of cancer-related DEGs and constructed the models by using support vector machine for predicting the prognosis of three types of cancers, namely, breast cancer, acute myeloma leukemia, and glioblastoma. Our results demonstrated that, combined with the gene-pathway bipartite networks, our proposed strategy can efficiently generate the reliable cancer-related DEG lists for constructing the predictive models. In addition, the model performance in the swap analysis was similar to that in the original analysis, indicating the robustness of the models in predicting the cancer outcomes.
Kamaraj, Balu; Gopalakrishnan, Chandrasekhar; Purohit, Rituraj
Albinism is an autosomal recessive genetic disorder due to low secretion of melanin. The oculocutaneous albinism (OCA) and ocular albinism (OA) genes are responsible for melanin production and also act as a potential targets for miRNAs. The role of miRNA is to inhibit the protein synthesis partially or completely by binding with the 3'UTR of the mRNA thus regulating gene expression. In this analysis, we predicted the genetic variation that occurred in 3'UTR of the transcript which can be a reason for low melanin production thus causing albinism. The single nucleotide polymorphisms (SNPs) in 3'UTR cause more new binding sites for miRNA which binds with mRNA which leads to inhibit the translation process either partially or completely. The SNPs in the mRNA of OCA and OA genes can create new binding sites for miRNA which may control the gene expression and lead to hypopigmentation. We have developed a computational procedure to determine the SNPs in the 3'UTR region of mRNA of OCA (TYR, OCA2, TYRP1 and SLC45A2) and OA (GPR143) genes which will be a potential cause for albinism. We identified 37 SNPs in five genes that are predicted to create 87 new binding sites on mRNA, which may lead to abrogation of the translation process. Expression analysis confirms that these genes are highly expressed in skin and eye regions. It is well supported by enrichment analysis that these genes are mainly involved in eye pigmentation and melanin biosynthesis process. The network analysis also shows how the genes are interacting and expressing in a complex network. This insight provides clue to wet-lab researches to understand the expression pattern of OCA and OA genes and binding phenomenon of mRNA and miRNA upon mutation, which is responsible for inhibition of translation process at genomic levels.
Full Text Available Retinitis pigmentosa (RP is a highly heterogeneous genetic visual disorder with more than 70 known causative genes, some of them shared with other non-syndromic retinal dystrophies (e.g. Leber congenital amaurosis, LCA. The identification of RP genes has increased steadily during the last decade, and the 30% of the cases that still remain unassigned will soon decrease after the advent of exome/genome sequencing. A considerable amount of genetic and functional data on single RD genes and mutations has been gathered, but a comprehensive view of the RP genes and their interacting partners is still very fragmentary. This is the main gap that needs to be filled in order to understand how mutations relate to progressive blinding disorders and devise effective therapies.We have built an RP-specific network (RPGeNet by merging data from different sources: high-throughput data from BioGRID and STRING databases, manually curated data for interactions retrieved from iHOP, as well as interactions filtered out by syntactical parsing from up-to-date abstracts and full-text papers related to the RP research field. The paths emerging when known RP genes were used as baits over the whole interactome have been analysed, and the minimal number of connections among the RP genes and their close neighbors were distilled in order to simplify the search space.In contrast to the analysis of single isolated genes, finding the networks linking disease genes renders powerful etiopathological insights. We here provide an interactive interface, RPGeNet, for the molecular biologist to explore the network centered on the non-syndromic and syndromic RP and LCA causative genes. By integrating tissue-specific expression levels and phenotypic data on top of that network, a more comprehensive biological view will highlight key molecular players of retinal degeneration and unveil new RP disease candidates.
Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin
High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.
Full Text Available Social network analysis was formed and established in the 1970s as a way of analyzing systems of social relations. In this review the theoretical-methodological standpoint of social network analysis ("structural analysis" is introduced and the different forms of social network analysis are presented. Structural analysis argues that social actors and social relations are embedded in social networks, meaning that action and perception of actors as well as the performance of social relations are influenced by the network structure. Since the 1990s structural analysis has integrated concepts such as agency, discourse and symbolic orientation and in this way structural analysis has opened itself. Since then there has been increasing use of qualitative methods in network analysis. They are used to include the perspective of the analyzed actors, to explore networks, and to understand network dynamics. In the reviewed book, edited by Betina HOLLSTEIN and Florian STRAUS, the twenty predominantly empirically orientated contributions demonstrate the possibilities of combining quantitative and qualitative methods in network analyses in different research fields. In this review we examine how the contributions succeed in applying and developing the structural analysis perspective, and the self-positioning of "qualitative network analysis" is evaluated. URN: urn:nbn:de:0114-fqs0701287
Garg, Abhishek; Mohanram, Kartik; Di Cara, Alessandro; De Micheli, Giovanni; Xenarios, Ioannis
Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to over representation of noise in GRNs and hence non-correspondence with biological observations. In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. Algorithms are made available under our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/ approximately garg/genysis.html.
Ermann, Leonardo; Frahm, Klaus M.; Shepelyansky, Dima L.
In the past decade modern societies have developed enormous communication and social networks. Their classification and information retrieval processing has become a formidable task for the society. Because of the rapid growth of the World Wide Web, and social and communication networks, new mathematical methods have been invented to characterize the properties of these networks in a more detailed and precise way. Various search engines extensively use such methods. It is highly important to develop new tools to classify and rank a massive amount of network information in a way that is adapted to internal network structures and characteristics. This review describes the Google matrix analysis of directed complex networks demonstrating its efficiency using various examples including the World Wide Web, Wikipedia, software architectures, world trade, social and citation networks, brain neural networks, DNA sequences, and Ulam networks. The analytical and numerical matrix methods used in this analysis originate from the fields of Markov chains, quantum chaos, and random matrix theory.
M. I. Gumel
Full Text Available The next generation wireless networks experienced a great development with emergence of wireless mesh networks (WMNs, which can be regarded as a realistic solution that provides wireless broadband access. The limited available bandwidth makes capacity analysis of the network very essential. While the network offers broadband wireless access to community and enterprise users, the problems that limit the network capacity must be addressed to exploit the optimum network performance. The wireless mesh network capacity analysis shows that the throughput of each mesh node degrades in order of l/n with increasing number of nodes (n in a linear topology. The degradation is found to be higher in a fully mesh network as a result of increase in interference and MAC layer contention in the network.
Theodosiou, Theodosios; Efstathiou, Georgios; Papanikolaou, Nikolas; Kyrpides, Nikos C; Bagos, Pantelis G; Iliopoulos, Ioannis; Pavlopoulos, Georgios A
Nowadays, due to the technological advances of high-throughput techniques, Systems Biology has seen a tremendous growth of data generation. With network analysis, looking at biological systems at a higher level in order to better understand a system, its topology and the relationships between its components is of a great importance. Gene expression, signal transduction, protein/chemical interactions, biomedical literature co-occurrences, are few of the examples captured in biological network representations where nodes represent certain bioentities and edges represent the connections between them. Today, many tools for network visualization and analysis are available. Nevertheless, most of them are standalone applications that often (i) burden users with computing and calculation time depending on the network's size and (ii) focus on handling, editing and exploring a network interactively. While such functionality is of great importance, limited efforts have been made towards the comparison of the topological analysis of multiple networks. Network Analysis Provider (NAP) is a comprehensive web tool to automate network profiling and intra/inter-network topology comparison. It is designed to bridge the gap between network analysis, statistics, graph theory and partially visualization in a user-friendly way. It is freely available and aims to become a very appealing tool for the broader community. It hosts a great plethora of topological analysis methods such as node and edge rankings. Few of its powerful characteristics are: its ability to enable easy profile comparisons across multiple networks, find their intersection and provide users with simplified, high quality plots of any of the offered topological characteristics against any other within the same network. It is written in R and Shiny, it is based on the igraph library and it is able to handle medium-scale weighted/unweighted, directed/undirected and bipartite graphs. NAP is available at http://bioinformatics.med.uoc.gr/NAP .
This book is devoted to recent progress in social network analysis with a high focus on community detection and evolution. The eleven chapters cover the identification of cohesive groups, core components and key players either in static or dynamic networks of different kinds and levels of heterogeneity. Other important topics in social network analysis such as influential detection and maximization, information propagation, user behavior analysis, as well as network modeling and visualization are also presented. Many studies are validated through real social networks such as Twitter. This edit
Full Text Available Common microarray and next-generation sequencing data analysis concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring the correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN based approaches have been employed in many large studies in order to scrutinize for dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematic description of these methods, followed with a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to new RNA regulation mechanism, competitive endogenous RNA (ceRNA, into a modulator framework, we provide two applications to illustrate the network construction, modulation effect, and the preliminary finding from these networks. Those methods we surveyed and developed are used to dissect the regulated network under different modulators. Not limit to these, the concept of “modulation” can adapt to various biological mechanisms to discover the novel gene regulation mechanisms.
Sean P Farris
Full Text Available Cocaine and alcohol are two substances of abuse that prominently affect the central nervous system (CNS. Repeated exposure to cocaine and alcohol leads to longstanding changes in gene expression, and subsequent functional CNS plasticity, throughout multiple brain regions. Epigenetic modifications of histones are one proposed mechanism guiding these enduring changes to the transcriptome. Characterizing the large number of available biological relationships as network models can reveal unexpected biochemical relationships. Clustering analysis of variation from whole-genome sequencing of gene expression (RNA-Seq and histone H3 lysine 4 trimethylation (H3K4me3 events (ChIP-Seq revealed the underlying structure of the transcriptional and epigenomic landscape within hippocampal postmortem brain tissue of drug abusers and control cases. Distinct sets of interrelated networks for cocaine and alcohol abuse were determined for each abusive substance. The network approach identified subsets of functionally related genes that are regulated in agreement with H3K4me3 changes, suggesting cause and effect relationships between this epigenetic mark and gene expression. Gene expression networks consisted of recognized substrates for addiction, such as the dopamine- and cAMP-regulated neuronal phosphoprotein PPP1R1B / DARPP-32 and the vesicular glutamate transporter SLC17A7 / VGLUT1 as well as potentially novel molecular targets for substance abuse. Through a systems biology based approach our results illustrate the utility of integrating epigenetic and transcript expression to establish relevant biological networks in the human brain for addiction. Future work with laboratory models may clarify the functional relevance of these gene networks for cocaine and alcohol, and provide a framework for the development of medications for the treatment of addiction.
Zweig, Katharina A
Network Analysis Literacy focuses on design principles for network analytics projects. The text enables readers to: pose a defined network analytic question; build a network to answer the question; choose or design the right network analytic methods for a particular purpose, and more.
Conclusions: Our study reveals the ability to assess time-dependent changes in gene expression patterns in pancreatic cancer cells associated with infection and susceptibility to vaccinia viruses. This suggests that molecular assays may be useful to develop safer and more efficacious oncolyticvirotherapies and support the idea that these treatments may target pathways implicated in pancreatic cancer resistance to conventional therapies.
Araújo, Daniela; Henriques, Mariana; Silva, Sónia
Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence
Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a
Full Text Available Identification of genes/proteins involved in pluripotency and their inter-relationships is important for understanding the induction/loss and maintenance of pluripotency. With the availability of large volume of data on interaction/regulation of pluripotency scattered across a large number of biological databases and hundreds of scientific journals, it is required a systematic integration of data which will create a complete view of pluripotency network. Describing and interpreting such a network of interaction and regulation (i.e., stimulation and inhibition links are essential tasks of computational biology, an important first step in systems-level understanding of the underlying mechanisms of pluripotency. To address this, we have assembled a network of 166 molecular interactions, stimulations and inhibitions, based on a collection of research data from 147 publications, involving 122 human genes/proteins, all in a standard electronic format, enabling analyses by readily available software such as Cytoscape and its Apps (formerly called "Plugins". The network includes the core circuit of OCT4 (POU5F1, SOX2 and NANOG, its periphery (such as STAT3, KLF4, UTF1, ZIC3, and c-MYC, connections to upstream signaling pathways (such as ACTIVIN, WNT, FGF, and BMP, and epigenetic regulators (such as L1TD1, LSD1 and PRC2. We describe the general properties of the network and compare it with other literature-based networks. Gene Ontology (GO analysis is being performed to find out the over-represented GO terms in the network. We use several expression datasets to condense the network to a set of network links that identify the key players (genes/proteins and the pathways involved in transition from one state of pluripotency to other state (i.e., native to primed state, primed to non-pluripotent state and pluripotent to non-pluripotent state.
Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella
Like other cancer diseases, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drives malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that also microRNAs have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer poorly overlap. The identification of co-expressed genes, that are functionally related, can identify a core network of genes associated with PC with a better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723
Full Text Available Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes.We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002. Similar results were obtained with reversed training and testing datasets (p = 0.007. Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006, but not beneficial to the low-risk patients (p = 0.491.The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network
Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S
Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network given the dynamics of each node is modeled by a RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations form the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, specially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene system and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.
Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E
Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs) (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
McCormack, Theodore; Frings, Oliver; Alexeyenko, Andrey; Sonnhammer, Erik L L
Analyzing groups of functionally coupled genes or proteins in the context of global interaction networks has become an important aspect of bioinformatic investigations. Assessing the statistical significance of crosstalk enrichment between or within groups of genes can be a valuable tool for functional annotation of experimental gene sets. Here we present CrossTalkZ, a statistical method and software to assess the significance of crosstalk enrichment between pairs of gene or protein groups in large biological networks. We demonstrate that the standard z-score is generally an appropriate and unbiased statistic. We further evaluate the ability of four different methods to reliably recover crosstalk within known biological pathways. We conclude that the methods preserving the second-order topological network properties perform best. Finally, we show how CrossTalkZ can be used to annotate experimental gene sets using known pathway annotations and that its performance at this task is superior to gene enrichment analysis (GEA). CrossTalkZ (available at http://sonnhammer.sbc.su.se/download/software/CrossTalkZ/) is implemented in C++, easy to use, fast, accepts various input file formats, and produces a number of statistics. These include z-score, p-value, false discovery rate, and a test of normality for the null distributions.
Full Text Available Abstract Background A wide range of techniques is now available for analyzing regulatory networks. Nonetheless, most of these techniques fail to interpret large-scale transcriptional data at the post-translational level. Results We address the question of using large-scale transcriptomic observation of a system perturbation to analyze a regulatory network which contained several types of interactions - transcriptional and post-translational. Our method consisted of post-processing the outputs of an open-source tool named BioQuali - an automatic constraint-based analysis mimicking biologist's local reasoning on a large scale. The post-processing relied on differences in the behavior of the transcriptional and post-translational levels in the network. As a case study, we analyzed a network representation of the genes and proteins controlled by an oncogene in the context of Ewing's sarcoma. The analysis allowed us to pinpoint active interactions specific to this cancer. We also identified the parts of the network which were incomplete and should be submitted for further investigation. Conclusions The proposed approach is effective for the qualitative analysis of cancer networks. It allows the integrative use of experimental data of various types in order to identify the specific information that should be considered a priority in the initial - and possibly very large - experimental dataset. Iteratively, new dataset can be introduced into the analysis to improve the network representation and make it more specific.
Full Text Available Long non-coding RNAs (lncRNAs have been reported to be involved in the development of maize plant. However, few focused on seed development of maize. Here, we identified 753 lncRNA candidates in maize genome from six seed samples. Similar to the mRNAs, lncRNAs showed tissue developmental stage specific and differential expression, indicating their putative role in seed development. Increasing evidence shows that crosstalk among RNAs mediated by shared microRNAs (miRNAs represents a novel layer of gene regulation, which plays important roles in plant development. Functional roles and regulatory mechanisms of lncRNAs as competing endogenous RNAs (ceRNA in plants, particularly in maize seed development, are unclear. We combined analyses of consistently altered 17 lncRNAs, 840 mRNAs and known miRNA to genome-wide investigate potential lncRNA-mediated ceRNA based on “ceRNA hypothesis”. The results uncovered seven novel lncRNAs as potential functional ceRNAs. Functional analyses based on their competitive coding-gene partners by Gene Ontology (GO and KEGG biological pathway demonstrated that combined effects of multiple ceRNAs can have major impacts on general developmental and metabolic processes in maize seed. These findings provided a useful platform for uncovering novel mechanisms of maize seed development and may provide opportunities for the functional characterization of individual lncRNA in future studies.
A duscussion of the fight between proponents of rationalistic policy analysis and more political interaction models for policy analysis. The latter group is the foundation for the many network models of policy analysis of today.......A duscussion of the fight between proponents of rationalistic policy analysis and more political interaction models for policy analysis. The latter group is the foundation for the many network models of policy analysis of today....
Nguyen, Cao D.; Gardiner, Katheleen J.; Cios, Krzysztof J.
We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precis...
Full Text Available In this paper, an original rigorous analysis of recurrent analog neural networks, which are built from opamp neurons, is presented. The analysis, which comes from the approximate model of the operational amplifier, reveals causes of possible non-stable states and enables to determine convergence properties of the network. Results of the analysis are discussed in order to enable development of original robust and fast analog networks. In the analysis, the special attention is turned to the examination of the influence of real circuit elements and of the statistical parameters of processed signals to the parameters of the network.
Raza, Khalid; Alam, Mansaf
One of the exciting problems in systems biology research is to decipher how genome controls the development of complex biological system. The gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer fruitful information related to functional role of individual gene in a cellular system. Discovering GRNs lead to a wide range of applications, including identification of disease related pathways providing novel tentative drug targets, helps to predict disease response, and also assists in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with generalized extended Kalman filter for weight update in backpropagation through time training algorithm. The RNN is a complex neural network that gives a better settlement between biological closeness and mathematical flexibility to model GRN; and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy and Kalman filter performs well for estimation problem even in noisy data. Hence, we applied non-linear version of Kalman filter, known as generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks such as DNA SOS repair network, IRMA network, and two synthetic networks from DREAM Challenge. We performed a comparison of our results with other state-of-the-art techniques which shows superiority of our proposed model. Further, 5% Gaussian noise has been induced in the dataset and result of the proposed model shows negligible effect of noise on results, demonstrating the noise tolerance capability of the model. Copyright © 2016 Elsevier Ltd. All rights reserved.
Vlaic, Sebastian; Hoffmann, Bianca; Kupfer, Peter; Weber, Michael; Dräger, Andreas
GRN2SBML automatically encodes gene regulatory networks derived from several inference tools in systems biology markup language. Providing a graphical user interface, the networks can be annotated via the simple object access protocol (SOAP)-based application programming interface of BioMart Central Portal and minimum information required in the annotation of models registry. Additionally, we provide an R-package, which processes the output of supported inference algorithms and automatically passes all required parameters to GRN2SBML. Therefore, GRN2SBML closes a gap in the processing pipeline between the inference of gene regulatory networks and their subsequent analysis, visualization and storage. GRN2SBML is freely available under the GNU Public License version 3 and can be downloaded from http://www.hki-jena.de/index.php/0/2/490. General information on GRN2SBML, examples and tutorials are available at the tool's web page.
Meisel, Matthew K.; Clifton, Allan D.; MacKillop, James; Miller, Joshua D.; Campbell, W. Keith; Goodie, Adam S.
Aims To apply social network analysis (SNA) to investigate whether frequency and severity of gambling problems were associated with different network characteristics among friends, family, and co-workers. is an innovative way to look at relationships among individuals; the current study was the first to our knowledge to apply SNA to gambling behaviors. Design Egocentric social network analysis was used to formally characterize the relationships between social network characteristics and gambling pathology. Setting Laboratory-based questionnaire and interview administration. Participants Forty frequent gamblers (22 non-pathological gamblers, 18 pathological gamblers) were recruited from the community. Findings The SNA revealed significant social network compositional differences between the two groups: pathological gamblers (PGs) had more gamblers, smokers, and drinkers in their social networks than did nonpathological gamblers (NPGs). PGs had more individuals in their network with whom they personally gambled, smoked, and drank with than those with who were NPG. Network ties were closer to individuals in their networks who gambled, smoked, and drank more frequently. Associations between gambling severity and structural network characteristics were not significant. Conclusions Pathological gambling is associated with compositional but not structural differences in social networks. Pathological gamblers differ from non-pathological gamblers in the number of gamblers, smokers, and drinkers in their social networks. Homophily within the networks also indicates that gamblers tend to be closer with other gamblers. This homophily may serve to reinforce addictive behaviors, and may suggest avenues for future study or intervention. PMID:23072641
Meisel, Matthew K; Clifton, Allan D; Mackillop, James; Miller, Joshua D; Campbell, W Keith; Goodie, Adam S
To apply social network analysis (SNA) to investigate whether frequency and severity of gambling problems were associated with different network characteristics among friends, family and co-workers is an innovative way to look at relationships among individuals; the current study was the first, to our knowledge, to apply SNA to gambling behaviors. Egocentric social network analysis was used to characterize formally the relationships between social network characteristics and gambling pathology. Laboratory-based questionnaire and interview administration. Forty frequent gamblers (22 non-pathological gamblers, 18 pathological gamblers) were recruited from the community. The SNA revealed significant social network compositional differences between the two groups: pathological gamblers (PGs) had more gamblers, smokers and drinkers in their social networks than did non-pathological gamblers (NPGs). PGs had more individuals in their network with whom they personally gambled, smoked and drank than those with who were NPG. Network ties were closer to individuals in their networks who gambled, smoked and drank more frequently. Associations between gambling severity and structural network characteristics were not significant. Pathological gambling is associated with compositional but not structural differences in social networks. Pathological gamblers differ from non-pathological gamblers in the number of gamblers, smokers and drinkers in their social networks. Homophily within the networks also indicates that gamblers tend to be closer with other gamblers. This homophily may serve to reinforce addictive behaviors, and may suggest avenues for future study or intervention. © 2012 The Authors, Addiction © 2012 Society for the Study of Addiction.
Pritykin, Yuri; Ghersi, Dario; Singh, Mona
Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
Damotte, V; Guillot-Noel, L; Patsopoulos, N A
adhesion molecule (CAMs) biological pathway using Cytoscape software. This network is a strong candidate, as it is involved in the crossing of the blood-brain barrier by the T cells, an early event in MS pathophysiology, and is used as an efficient therapeutic target. We drew up a list of 76 genes...... in interaction with other genes as a group. Pathway analysis is an alternative way to highlight such group of genes. Using SNP association P-values from eight multiple sclerosis (MS) GWAS data sets, we performed a candidate pathway analysis for MS susceptibility by considering genes interacting in the cell...... belonging to the CAM network. We highlighted 64 networks enriched with CAM genes with low P-values. Filtering by a percentage of CAM genes up to 50% and rejecting enriched signals mainly driven by transcription factors, we highlighted five networks associated with MS susceptibility. One of them, constituted...
Shaw, Harry C.
Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time expression and assay of gene expression products.
Shaw, Harry C.
Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT (Information Technology) organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time assays of gene expression products.
Full Text Available Abstract Background We have recently identified a number of Quantitative Trait Loci (QTL contributing to the 2-fold muscle weight difference between the LG/J and SM/J mouse strains and refined their confidence intervals. To facilitate nomination of the candidate genes responsible for these differences we examined the transcriptome of the tibialis anterior (TA muscle of each strain by RNA-Seq. Results 13,726 genes were expressed in mouse skeletal muscle. Intersection of a set of 1061 differentially expressed transcripts with a mouse muscle Bayesian Network identified a coherent set of differentially expressed genes that we term the LG/J and SM/J Regulatory Network (LSRN. The integration of the QTL, transcriptome and the network analyses identified eight key drivers of the LSRN (Kdr, Plbd1, Mgp, Fah, Prss23, 2310014F06Rik, Grtp1, Stk10 residing within five QTL regions, which were either polymorphic or differentially expressed between the two strains and are strong candidates for quantitative trait genes (QTGs underlying muscle mass. The insight gained from network analysis including the ability to make testable predictions is illustrated by annotating the LSRN with knowledge-based signatures and showing that the SM/J state of the network corresponds to a more oxidative state. We validated this prediction by NADH tetrazolium reductase staining in the TA muscle revealing higher oxidative potential of the SM/J compared to the LG/J strain (p Conclusion Thus, integration of fine resolution QTL mapping, RNA-Seq transcriptome information and mouse muscle Bayesian Network analysis provides a novel and unbiased strategy for nomination of muscle QTGs.
Li, Yi; Rao, Nini; Yang, Feng; Zhang, Ying; Yang, Yang; Liu, Han-ming; Guo, Fengbiao; Huang, Jian
Acid stress is one of the most serious threats that cyanobacteria have to face, and it has an impact at all levels from genome to phenotype. However, very little is known about the detailed response mechanism to acid stress in this species. We present here a general analysis of the gene regulatory network of Synechocystis sp. PCC 6803 in response to acid stress using comparative genome analysis and biocomputational prediction. In this study, we collected 85 genes and used them as an initial template to predict new genes through co-regulation, protein-protein interactions and the phylogenetic profile, and 179 new genes were obtained to form a complete template. In addition, we found that 11 enriched pathways such as glycolysis are closely related to the acid stress response. Finally, we constructed a regulatory network for the intricate relationship of these genes and summarize the key steps in response to acid stress. This is the first time a bioinformatic approach has been taken systematically to gene interactions in cyanobacteria and the elaboration of their cell metabolism and regulatory pathways under acid stress, which is more efficient than a traditional experimental study. The results also provide theoretical support for similar research into environmental stresses in cyanobacteria and possible industrial applications. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Che, Dongxue; Wang, Yang; Bai, Weiyang; Li, Leijie; Liu, Guiyou; Zhang, Liangcai; Zuo, Yongchun; Tao, Shiheng; Hua, Jinlian; Liao, Mingzhi
Gametogenesis is a complex process, which includes mitosis and meiosis and results in the production of ovum and sperm. The development of gametogenesis is dynamic and needs many different genes to work synergistically, but it is lack of global perspective research about this process. In this study, we detected the dynamic process of gametogenesis from the perspective of systems biology based on protein-protein interaction networks (PPINs) and functional analysis. Results showed that gametogenesis genes have strong synergistic effects in PPINs within and between different phases during the development. Addition to the synergistic effects on molecular networks, gametogenesis genes showed functional consistency within and between different phases, which provides the further evidence about the dynamic process during the development of gametogenesis. At last, we detected and provided the core molecular modules of different phases about gametogenesis. The gametogenesis genes and related modules can be obtained from our Web site Gametogenesis Molecule Online (GMO, http://gametsonline.nwsuaflmz.com/index.php), which is freely accessible. GMO may be helpful for the reference and application of these genes and modules in the future identification of key genes about gametogenesis. Summary, this work provided a computational perspective and frame to the analysis of the gametogenesis dynamics and modularity in both human and mouse. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: email@example.com.
Tegnér, Jesper N.
Mapping out cellular networks in general and transcriptional networks in particular has proved to be a bottle-neck hampering our understanding of biological processes. Integrative approaches fusing computational and experimental technologies for decoding transcriptional networks at a high level of resolution is therefore of uttermost importance. Yet, this is challenging since the control of gene expression in eukaryotes is a complex multi-level process influenced by several epigenetic factors and the fine interplay between regulatory proteins and the promoter structure governing the combinatorial regulation of gene expression. In this chapter we review how the CAGE data can be integrated with other measurements such as expression, physical interactions and computational prediction of regulatory motifs, which together can provide a genome-wide picture of eukaryotic transcriptional regulatory networks at a new level of resolution. © 2010 by Pan Stanford Publishing Pte. Ltd. All rights reserved.
Full Text Available Abstract Background The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN. Methods In this study, we proposed a method to identify CRGs based on Gene Ontology (GO and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable
Raúl Rodríguez Rodríguez
Full Text Available This paper deals with social network analysis and how it could be integrated within supply chain management from a decision-making point of view. Even though the benefits of using social analysis have are widely accepted at both academic and industry/services context, there is still a lack of solid frameworks that allow decision-makers to connect the usage and obtained results of social network analysis – mainly both information and knowledge flows and derived results- with supply chain management objectives and goals. This paper gives an overview of social network analysis, the main social network analysis metrics, supply chain performance and, finally, it identifies how future frameworks could close the gap and link the results of social network analysis with the supply chain management decision-making processes.
Network protocol analysis is a network sniffer to capture data for further analysis and understanding of the technical means necessary packets. Network sniffing is intercepted by packet assembly binary format of the original message content. In order to obtain the information contained. Required based on TCP / IP protocol stack protocol specification. Again to restore the data packets at protocol format and content in each protocol layer. Actual data transferred, as well as the application tier.
Fang, Delin; Chen, Bin
The notions of virtual water flows provide important indicators to manifest the water consumption and allocation between different sectors via product transactions. However, the configuration of virtual water network (VWN) still needs further investigation to identify the water interdependency among different sectors as well as the network efficiency and stability in a socio-economic system. Ecological network analysis is chosen as a useful tool to examine the structure and function of VWN and the interactions among its sectors. A balance analysis of efficiency and redundancy is also conducted to describe the robustness (RVWN) of VWN. Then, network control analysis and network utility analysis are performed to investigate the dominant sectors and pathways for virtual water circulation and the mutual relationships between pairwise sectors. A case study of the Heihe River Basin in China shows that the balance between efficiency and redundancy is situated on the left side of the robustness curve with less efficiency and higher redundancy. The forestation, herding and fishing sectors and industrial sectors are found to be the main controllers. The network tends to be more mutualistic and synergic, though some competitive relationships that weaken the virtual water circulation still exist.
Full Text Available This survey is concerned oneself with the study of those types of material networks which can be met both in civil engineering and also in electrotechnics, in mechanics, or in hydrotechnics, and of which behavior lead to linear problems, solvable by means of Finite Element Method and adequate algorithms. Here, it is presented a unitary theory of networks met in the domains mentioned above and this one is illustrated with examples for the structural networks in civil engineering, electric circuits, and water supply networks, but also planar or spatial mechanisms can be comprised in this theory. The attention is focused to make evident the essential proper- ties and concepts in the network analysis, which differentiate the networks under force from other types of material networks. To such a network a planar, connected, and directed or undirected graph is associated, and with some vector fields on the vertex set this graph is endowed. .
Dalege, Jonas; Borsboom, Denny; van Harreveld, Frenk; van der Maas, Han L J
In this article, we provide a brief tutorial on the estimation, analysis, and simulation on attitude networks using the programming language R. We first discuss what a network is and subsequently show how one can estimate a regularized network on typical attitude data. For this, we use open-access data on the attitudes toward Barack Obama during the 2012 American presidential election. Second, we show how one can calculate standard network measures such as community structure, centrality, and connectivity on this estimated attitude network. Third, we show how one can simulate from an estimated attitude network to derive predictions from attitude networks. By this, we highlight that network theory provides a framework for both testing and developing formalized hypotheses on attitudes and related core social psychological constructs.
An, Yang; Furber, Kendra L; Ji, Shaoping
The concept of competitive endogenous RNA (ceRNA) was first proposed by Salmena and colleagues. Evidence suggests that pseudogene RNAs can act as a 'sponge' through competitive binding of common miRNA, releasing or attenuating repression through sequestering miRNAs away from parental mRNA. In theory, ceRNAs refer to all transcripts such as mRNA, tRNA, rRNA, long non-coding RNA, pseudogene RNA and circular RNA, because all of them may become the targets of miRNA depending on spatiotemporal situation. As binding of miRNA to the target RNA is not 100% complementary, it is possible that one miRNA can bind to multiple target RNAs and vice versa. All RNAs crosstalk through competitively binding to miRNAvia miRNA response elements (MREs) contained within the RNA sequences, thus forming a complex regulatory network. The ratio of a subset of miRNAs to the corresponding number of MREs determines repression strength on a given mRNA translation or stability. An increase in pseudogene RNA level can sequester miRNA and release repression on the parental gene, leading to an increase in parental gene expression. A massive number of transcripts constitute a complicated network that regulates each other through this proposed mechanism, though some regulatory significance may be mild or even undetectable. It is possible that the regulation of gene and pseudogene expression occurring in this manor involves all RNAs bearing common MREs. In this review, we will primarily discuss how pseudogene transcripts regulate expression of parental genes via ceRNA network and biological significance of regulation. © 2016 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
Koldanov, Petr; Pardalos, Panos
The contributions in this volume cover a broad range of topics including maximum cliques, graph coloring, data mining, brain networks, Steiner forest, logistic and supply chain networks. Network algorithms and their applications to market graphs, manufacturing problems, internet networks and social networks are highlighted. The "Fourth International Conference in Network Analysis," held at the Higher School of Economics, Nizhny Novgorod in May 2014, initiated joint research between scientists, engineers and researchers from academia, industry and government; the major results of conference participants have been reviewed and collected in this Work. Researchers and students in mathematics, economics, statistics, computer science and engineering will find this collection a valuable resource filled with the latest research in network analysis.
Langfelder, Peter; Mischel, Paul S; Horvath, Steve
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) Finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as good as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques applied to
Langfelder, Peter; Mischel, Paul S.; Horvath, Steve
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) Finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as good as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques applied to
Full Text Available Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data. Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA in three comprehensive and unbiased empirical studies: (1 Finding genes predictive of lung cancer survival, (2 finding methylation markers related to age, and (3 finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1. However, standard meta-analysis methods perform as good as (if not better than a consensus network approach in terms of validation success (criterion 2. The article also reports a comparison of meta-analysis techniques
Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene
Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing
Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, it is found that these existing human gene association networks share only quite limited overlapped interactions, suggesting their incompleteness and noise. Here we proposed a workflow to construct a weighted human gene association network using information of six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied link prediction algorithm to predict possible missing links of the networks, cross-validation approach to refine each network and finally integrated the refined networks to get the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network owns much more links than most of the original networks, meanwhile its links still keep high functional relevance. Being used as background network in a case study of disease gene prediction, the final integrated network presents good performance, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.
This thesis is generally about network performance analysis. It contains two parts. The theory part summarizes what network performance is and inducts the methods of doing network performance analysis. To answer what network performance is, a study into what network services are is done. And based on the background research, there are two important network performance metrics: Network delay and Throughput should be included in network performance analysis. Among the methods of network a...
Ferguson, Laura B; Harris, R Adron; Mayfield, Roy Dayne
The alcohol research field has amassed an impressive number of gene expression datasets spanning key brain areas for addiction, species (humans as well as multiple animal models), and stages in the addiction cycle (binge/intoxication, withdrawal/negative effect, and preoccupation/anticipation). These data have improved our understanding of the molecular adaptations that eventually lead to dysregulation of brain function and the chronic, relapsing disorder of addiction. Identification of new medications to treat alcohol use disorder (AUD) will likely benefit from the integration of genetic, genomic, and behavioral information included in these important datasets. Systems pharmacology considers drug effects as the outcome of the complex network of interactions a drug has rather than a single drug-molecule interaction. Computational strategies based on this principle that integrate gene expression signatures of pharmaceuticals and disease states have shown promise for identifying treatments that ameliorate disease symptoms (called in silico gene mapping or connectivity mapping). In this review, we suggest that gene expression profiling for in silico mapping is critical to improve drug repurposing and discovery for AUD and other psychiatric illnesses. We highlight studies that successfully apply gene mapping computational approaches to identify or repurpose pharmaceutical treatments for psychiatric illnesses. Furthermore, we address important challenges that must be overcome to maximize the potential of these strategies to translate to the clinic and improve healthcare outcomes.
Full Text Available Background/Aims: Pediatric sepsis is a disease that threatens life of children. The incidence of pediatric sepsis is higher in developing countries due to various reasons, such as insufficient immunization and nutrition, water and air pollution, etc. Exploring the potential genes via different methods is of significance for the prevention and treatment of pediatric sepsis. This study aimed to identify potential genes associated with pediatric sepsis utilizing analysis of gene network and entropy. Methods: The mRNA expression in the blood samples collected from 20 septic children and 30 healthy controls was quantified by using Affymetrix HG-U133A microarray. Two condition-specific protein-protein interaction networks (PINs, one for the healthy control and the other one for the children with sepsis, were deduced by combining the fundamental human PINs with gene expression profiles in the two phenotypes. Subsequently, distinct modules from the two conditional networks were extracted by adopting a maximal clique-merging approach. Delta entropy (ΔS was calculated between sepsis and control modules. Results: Then, key genes displaying changes in gene composition were identified by matching the control and sepsis modules. Two objective modules were obtained, in which ribosomal protein RPL4 and RPL9 as well as TOP2A were probably considered as the key genes differentiating sepsis from healthy controls. Conclusion: According to previous reports and this work, TOP2A is the potential gene therapy target for pediatric sepsis. The relationship between pediatric sepsis and RPL4 and RPL9 needs further investigation.
Jung, Chol-Hee; Wong, Chui E.; Singh, Mohan B.; Bhalla, Prem L.
Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant, Arabidopsis. PMID:22679494
Curci, Ylenia; Mongeau Ospina, Christian A.
Biofuel policies are motivated by a plethora of political concerns related to energy security, environmental damages, and support of the agricultural sector. In response to this, much scientific work has chiefly focussed on analysing the biofuel domain and on giving policy advice and recommendations. Although innovation has been acknowledged as one of the key factors in sustainable and cost-effective biofuel development, there is an urgent need to investigate technological trajectories in the biofuel sector by starting from consistent data and appropriate methodological tools. To do so, this work proposes a procedure to select patent data unequivocally related to the investigated sector, it uses co-occurrence of technological terms to compute patent similarity and highlights content and interdependencies of biofuels technological trajectories by revealing hidden topics from unstructured patent text fields. The analysis suggests that there is a breaking trend towards modern generation biofuels and that innovators seem to focus increasingly on the ability of alternative energy sources to adapt to the transport/industrial sector. - Highlights: • Innovative effort is devoted to biofuels additives and modern biofuels technologies. • A breaking trend can be observed from the second half of the last decade. • A patent network is identified via text mining techniques that extract latent topics.
Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar
Correct inference of genetic regulations inside a cell from the biological database like time series microarray data is one of the greatest challenges in post genomic era for biologists and researchers. Recurrent Neural Network (RNN) is one of the most popular and simple approach to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic namely Elephant Swarm Water Search Algorithm (ESWSA) to infer Gene Regulatory Network (GRN). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing the different types of communication techniques. Initially, the algorithm is tested against benchmark small and medium scale artificial genetic networks without and with presence of different noise levels and the efficiency was observed in term of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of Escherichia Coli SOS Network and results were also compared with others state of the art optimization methods. The experimental results suggest that ESWSA is very efficient for GRN inference problem and performs better than other methods in many ways.
Full Text Available The neurological complications of AIDS (neuroAIDS during the infection of human immunodeficiency virus (HIV are symptomized by non-specific, multifaceted neurological conditions and therefore, defining a specific diagnosis/treatment mechanism(s for this neuro-complexity at the molecular level remains elusive. Using an in silico based integrated gene network analysis we discovered that HIV infection shares convergent gene networks with each of twelve neurological disorders selected in this study. Importantly, a common gene network was identified among HIV infection, Alzheimer's disease, Parkinson's disease, multiple sclerosis, and age macular degeneration. An mRNA microarray analysis in HIV-infected monocytes showed significant changes in the expression of several genes of this in silico derived common pathway which suggests the possible physiological relevance of this gene-circuit in driving neuroAIDS condition. Further, this unique gene network was compared with another in silico derived novel, convergent gene network which is shared by seven major neurological disorders (Alzheimer's disease, Parkinson's disease, Multiple Sclerosis, Age Macular Degeneration, Amyotrophic Lateral Sclerosis, Vascular Dementia, and Restless Leg Syndrome. These networks differed in their gene circuits; however, in large, they involved innate immunity signaling pathways, which suggests commonalities in the immunological basis of different neuropathogenesis. The common gene circuits reported here can provide a prospective platform to understand how gene-circuits belonging to other neuro-disorders may be convoluted during real-time neuroAIDS condition and it may elucidate the underlying-and so far unknown-genetic overlap between HIV infection and neuroAIDS risk. Also, it may lead to a new paradigm in understanding disease progression, identifying biomarkers, and developing therapies.
Khan, Abhinandan; Mandal, Sudip; Pal, Rajat Kumar; Saha, Goutam
We have proposed a methodology for the reverse engineering of biologically plausible gene regulatory networks from temporal genetic expression data. We have used established information and the fundamental mathematical theory for this purpose. We have employed the Recurrent Neural Network formalism to extract the underlying dynamics present in the time series expression data accurately. We have introduced a new hybrid swarm intelligence framework for the accurate training of the model parameters. The proposed methodology has been first applied to a small artificial network, and the results obtained suggest that it can produce the best results available in the contemporary literature, to the best of our knowledge. Subsequently, we have implemented our proposed framework on experimental (in vivo) datasets. Finally, we have investigated two medium sized genetic networks (in silico) extracted from GeneNetWeaver, to understand how the proposed algorithm scales up with network size. Additionally, we have implemented our proposed algorithm with half the number of time points. The results indicate that a reduction of 50% in the number of time points does not have an effect on the accuracy of the proposed methodology significantly, with a maximum of just over 15% deterioration in the worst case.
Full Text Available RegnANN is a novel method for reverse engineering gene networks based on an ensemble of multilayer perceptrons. The algorithm builds a regressor for each gene in the network, estimating its neighborhood independently. The overall network is obtained by joining all the neighborhoods. RegnANN makes no assumptions about the nature of the relationships between the variables, potentially capturing high-order and non linear dependencies between expression patterns. The evaluation focuses on synthetic data mimicking plausible submodules of larger networks and on biological data consisting of submodules of Escherichia coli. We consider Barabasi and Erdös-Rényi topologies together with two methods for data generation. We verify the effect of factors such as network size and amount of data to the accuracy of the inference algorithm. The accuracy scores obtained with RegnANN is methodically compared with the performance of three reference algorithms: ARACNE, CLR and KELLER. Our evaluation indicates that RegnANN compares favorably with the inference methods tested. The robustness of RegnANN, its ability to discover second order correlations and the agreement between results obtained with this new methods on both synthetic and biological data are promising and they stimulate its application to a wider range of problems.
Inoue, Tohru; Hirabayashi, Yoko
Authors explain that the radiation effect on biological system is stochastic along the law of physics, differing from chemical effect, using instances of Cs-137 gamma-ray (GR) and benzene (BZ) exposures to mice and of resultant comprehensive analyses of gene expression. Single GR irradiation is done with Gamma Cell 40 (CSR) to C57BL/6 or C3H/He mouse at 0, 0.6 and 3 Gy. BE is given orally at 150 mg/kg/day for 5 days x 2 weeks. Bone marrow cells are sampled 1 month after the exposure. Comprehensive gene expression is analyzed by Gene Chip Mouse Genome 430 2.0 Array (Affymetrix) and data are processed by programs like case normalization, statistics, network generation, functional analysis etc. GR irradiation brings about changes of gene expression, which are classifiable in common genes variable commonly on the dose change and stochastic genes variable stochastically within each dose: e.g., with Welch-t-test, significant differences are between 0/3 Gy (dose-specific difference, 455 pbs (probe set), in stochastic 2113 pbs), 0/0.6 Gy (267 in 1284 pbs) and 0.6/3 Gy (532 pbs); and with one-way analysis of variation (ANOVA) and hierarchial/dendrographic analyses, 520 pbs are shown to involve the dose-dependent 226 and dose-specific 294 pbs. It is also shown that at 3 Gy, expression of common genes are rather suppressed, including those related to the proliferation/apoptosis of B/T cells, and of stochastic genes, related to cell division/signaling. Ven diagram of the common genes of above 520 pbs, stochastic 2113 pbs at 3 Gy and 1284 pbs at 0.6 Gy shows the overlapping genes 29, 2 and 4, respectively, indicating only 35 pbs are overlapping in total. Network analysis of changes by GR shows the rather high expression of genes around hub of cAMP response element binding protein (CREB) at 0.6 Gy, and rather variable expression around CREB hub/suppressed expression of kinesin hub at 3 Gy; in the network by BZ exposure, unchanged or low expression around p53 hub and suppression
We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method will never grow resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.
Yasir Tariq Mohmand
Full Text Available The structure and properties of public transportation networks have great implications in urban planning, public policies, and infectious disease control. This study contributes a weighted complex network analysis of travel routes on the national highway network of Pakistan. The network is responsible for handling 75 percent of the road traffic yet is largely inadequate, poor, and unreliable. The highway network displays small world properties and is assortative in nature. Based on the betweenness centrality of the nodes, the most important cities are identified as this could help in identifying the potential congestion points in the network. Keeping in view the strategic location of Pakistan, such a study is of practical importance and could provide opportunities for policy makers to improve the performance of the highway network.
Seker, S.; Ciftcioglu, O.
Noise analysis studies with neural network are aimed. Stochastic signals at the input of the network are used to obtain an algorithmic multivariate stochastic signal modeling. To this end, lattice modeling of a stochastic signal is performed to obtain backward residual noise sources which are uncorrelated among themselves. There are applied together with an additional input to the network to obtain an algorithmic model which is used for signal detection for early failure in plant monitoring. The additional input provides the information to the network to minimize the difference between the signal and the network's one-step-ahead prediction. A stochastic algorithm is used for training where the errors reflecting the measurement error during the training are also modelled so that fast and consistent convergence of network's weights is obtained. The lattice structure coupled to neural network investigated with measured signals from an actual power plant. (authors)
Pollard Katherine S
Full Text Available Abstract Background The molecular events underlying mammary development during pregnancy, lactation, and involution are incompletely understood. Results Mammary gland microarray data, cellular localization data, protein-protein interactions, and literature-mined genes were integrated and analyzed using statistics, principal component analysis, gene ontology analysis, pathway analysis, and network analysis to identify global biological principles that govern molecular events during pregnancy, lactation, and involution. Conclusion Several key principles were derived: (1 nearly a third of the transcriptome fluctuates to build, run, and disassemble the lactation apparatus; (2 genes encoding the secretory machinery are transcribed prior to lactation; (3 the diversity of the endogenous portion of the milk proteome is derived from fewer than 100 transcripts; (4 while some genes are differentially transcribed near the onset of lactation, the lactation switch is primarily post-transcriptionally mediated; (5 the secretion of materials during lactation occurs not by up-regulation of novel genomic functions, but by widespread transcriptional suppression of functions such as protein degradation and cell-environment communication; (6 the involution switch is primarily transcriptionally mediated; and (7 during early involution, the transcriptional state is partially reverted to the pre-lactation state. A new hypothesis for secretory diminution is suggested – milk production gradually declines because the secretory machinery is not transcriptionally replenished. A comprehensive network of protein interactions during lactation is assembled and new regulatory gene targets are identified. Less than one fifth of the transcriptionally regulated nodes in this lactation network have been previously explored in the context of lactation. Implications for future research in mammary and cancer biology are discussed.
Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina
Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as times series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the subjacent relations among genes. The results obtained on artificial and real datasets confirm the method effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.
Nguyen, Cao D; Gardiner, Katheleen J; Cios, Krzysztof J
We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precision and 60% recall versus 45% and 26% for Majority and 24% and 61% for χ²-statistics, respectively. Copyright © 2011 Elsevier Inc. All rights reserved.
Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura
The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.
Traffic monitoring and analysis can be done for multiple different reasons: to investigate the usage of network resources, assess the performance of network applications, adjust Quality of Service (QoS) policies in the network, log the traffic to comply with the law, or create realistic models of traffic for academic purposes. We define the objective of this thesis as finding a way to evaluate the performance of various applications in a high-speed Internet infrastructure. To satisfy the obje...
Hemanta Kumar Kalita; Avijit Kar
The emergence of sensor networks as one of the dominant technology trends in the coming decades hasposed numerous unique challenges to researchers. These networks are likely to be composed of hundreds,and potentially thousands of tiny sensor nodes, functioning autonomously, and in many cases, withoutaccess to renewable energy resources. Cost constraints and the need for ubiquitous, invisibledeployments will result in small sized, resource-constrained sensor nodes. While the set of challenges ...
Full Text Available Abstract Background Inflammation is a hallmark of many human diseases. Elucidating the mechanisms underlying systemic inflammation has long been an important topic in basic and clinical research. When primary pathogenetic events remains unclear due to its immense complexity, construction and analysis of the gene regulatory network of inflammation at times becomes the best way to understand the detrimental effects of disease. However, it is difficult to recognize and evaluate relevant biological processes from the huge quantities of experimental data. It is hence appealing to find an algorithm which can generate a gene regulatory network of systemic inflammation from high-throughput genomic studies of human diseases. Such network will be essential for us to extract valuable information from the complex and chaotic network under diseased conditions. Results In this study, we construct a gene regulatory network of inflammation using data extracted from the Ensembl and JASPAR databases. We also integrate and apply a number of systematic algorithms like cross correlation threshold, maximum likelihood estimation method and Akaike Information Criterion (AIC on time-lapsed microarray data to refine the genome-wide transcriptional regulatory network in response to bacterial endotoxins in the context of dynamic activated genes, which are regulated by transcription factors (TFs such as NF-κB. This systematic approach is used to investigate the stochastic interaction represented by the dynamic leukocyte gene expression profiles of human subject exposed to an inflammatory stimulus (bacterial endotoxin. Based on the kinetic parameters of the dynamic gene regulatory network, we identify important properties (such as susceptibility to infection of the immune system, which may be useful for translational research. Finally, robustness of the inflammatory gene network is also inferred by analyzing the hubs and "weak ties" structures of the gene network
Medvedeva, M. A.; Davletbaev, R. H.; Berg, D. B.; Nazarova, J. J.; Parusheva, S. S.
Structure and functioning of two model industrial entrepreneurial networks are investigated in the present paper. One of these networks is forming when implementing an integrated project and consists of eight agents, which interact with each other and external environment. The other one is obtained from the municipal economy and is based on the set of the 12 real business entities. Analysis of the networks is carried out on the basis of the matrix of mutual payments aggregated over the certain time period. The matrix is created by the methods of experimental economics. Social Network Analysis (SNA) methods and instruments were used in the present research. The set of basic structural characteristics was investigated: set of quantitative parameters such as density, diameter, clustering coefficient, different kinds of centrality, and etc. They were compared with the random Bernoulli graphs of the corresponding size and density. Discovered variations of random and entrepreneurial networks structure are explained by the peculiarities of agents functioning in production network. Separately, were identified the closed exchange circuits (cyclically closed contours of graph) forming an autopoietic (self-replicating) network pattern. The purpose of the functional analysis was to identify the contribution of the autopoietic network pattern in its gross product. It was found that the magnitude of this contribution is more than 20%. Such value allows using of the complementary currency in order to stimulate economic activity of network agents.
Kalyagin, Valery; Pardalos, Panos
This volume compiles the major results of conference participants from the "Third International Conference in Network Analysis" held at the Higher School of Economics, Nizhny Novgorod in May 2013, with the aim to initiate further joint research among different groups. The contributions in this book cover a broad range of topics relevant to the theory and practice of network analysis, including the reliability of complex networks, software, theory, methodology, and applications. Network analysis has become a major research topic over the last several years. The broad range of applications that can be described and analyzed by means of a network has brought together researchers, practitioners from numerous fields such as operations research, computer science, transportation, energy, biomedicine, computational neuroscience and social sciences. In addition, new approaches and computer environments such as parallel computing, grid computing, cloud computing, and quantum computing have helped to solve large scale...
Zhang, Qiang; Andersen, Melvin E
To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia, etc. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misfolded proteins, and O2, is first detected by specialized sensor molecules, then the signal is transduced to specific transcription factors. Transcription factors can regulate the expression of a suite of anti-stress genes, many of which encode enzymes functioning to counteract the perturbed variables. The objective of this study was to explore, using control theory and computational approaches, the theoretical basis that underlies the steady-state dose response relationship between cellular stressors and intracellular biochemical species (controlled variables, transcription factors, and gene products) in these gene regulatory networks. Our work indicated that the shape of dose response curves (linear, superlinear, or sublinear) depends on changes in the specific values of local response coefficients (gains) distributed in the feedback loop. Multimerization of anti-stress enzymes and transcription factors into homodimers, homotrimers, or even higher-order multimers, play a significant role in maintaining robust homeostasis. Moreover, our simulation noted that dose response curves for the controlled variables can transition sequentially through four distinct phases as stressor level increases: initial superlinear with lesser control, superlinear more highly controlled, linear uncontrolled, and sublinear catastrophic. Each phase relies on specific gain-changing events that come into play as stressor level increases. The low-dose region is intrinsically nonlinear, and depending on
Full Text Available To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia, etc. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misfolded proteins, and O2, is first detected by specialized sensor molecules, then the signal is transduced to specific transcription factors. Transcription factors can regulate the expression of a suite of anti-stress genes, many of which encode enzymes functioning to counteract the perturbed variables. The objective of this study was to explore, using control theory and computational approaches, the theoretical basis that underlies the steady-state dose response relationship between cellular stressors and intracellular biochemical species (controlled variables, transcription factors, and gene products in these gene regulatory networks. Our work indicated that the shape of dose response curves (linear, superlinear, or sublinear depends on changes in the specific values of local response coefficients (gains distributed in the feedback loop. Multimerization of anti-stress enzymes and transcription factors into homodimers, homotrimers, or even higher-order multimers, play a significant role in maintaining robust homeostasis. Moreover, our simulation noted that dose response curves for the controlled variables can transition sequentially through four distinct phases as stressor level increases: initial superlinear with lesser control, superlinear more highly controlled, linear uncontrolled, and sublinear catastrophic. Each phase relies on specific gain-changing events that come into play as stressor level increases. The low-dose region is intrinsically nonlinear
Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali
Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.
for Expanded Network Analysis. In Visualising Network Information (pp. 6-1 – 6-10). Meeting Proceedings RTO-MP-IST-063, Paper 6. Neuilly-sur-Seine...Even to this day, current research groups are working to develop an approach that involves taking all available text, video, imagery and audio and
Colbaugh, Richard; Glass, Kristin.; Willard, Gerald
This paper presents a new methodology for analyzing complex networks in which the network of interest is first abstracted to a much simpler (but equivalent) representation, the required analysis is performed using the abstraction, and analytic conclusions are then mapped back to the original network and interpreted there. We begin by identifying a broad and important class of complex networks which admit abstractions that are simultaneously dramatically simplifying and property preserving we call these aggressive abstractions -- and which can therefore be analyzed using the proposed approach. We then introduce and develop two forms of aggressive abstraction: 1.) finite state abstraction, in which dynamical networks with uncountable state spaces are modeled using finite state systems, and 2.) onedimensional abstraction, whereby high dimensional network dynamics are captured in a meaningful way using a single scalar variable. In each case, the property preserving nature of the abstraction process is rigorously established and efficient algorithms are presented for computing the abstraction. The considerable potential of the proposed approach to complex networks analysis is illustrated through case studies involving vulnerability analysis of technological networks and predictive analysis for social processes.
Lastdrager, Elmer; Lastdrager, E.E.H.; Pras, Aiko
Traffic repositories with TCP/IP header information are very important for network analysis. Researchers often assume that such repositories reliably represent all traffic that has been flowing over the network; little thoughts are made regarding the consistency of these repositories. Still, for
van Ruissen, Fred; Baas, Frank
In 1995, serial analysis of gene expression (SAGE) was developed as a versatile tool for gene expression studies. SAGE technology does not require pre-existing knowledge of the genome that is being examined and therefore SAGE can be applied to many different model systems. In this chapter, the SAGE
Frolov, A. A.; Húsek, Dušan; Muraviev, I. P.; Polyakov, P.Y.
Roč. 18, č. 3 (2007), s. 698-707 ISSN 1045-9227 R&D Projects: GA AV ČR 1ET100300419; GA ČR GA201/05/0079 Institutional research plan: CEZ:AV0Z10300504 Keywords : recurrent neural network * Hopfield-like neural network * associative memory * unsupervised learning * neural network architecture * neural network application * statistics * Boolean factor analysis * dimensionality reduction * features clustering * concepts search * information retrieval Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 2.769, year: 2007
Networks are of significant importance in many application domains, such as World Wide Web and social networks, which often embed rich topological information. Since network topology captures the organization of network nodes and links, studying network topology is very important to network analysis. In this dissertation, we study networks by…
Alena, Richard; Evenson, Darin; Rundquist, Victor; Clancy, Daniel (Technical Monitor)
Wireless networks are being used to connect mobile computing elements in more applications as the technology matures. There are now many products (such as 802.11 and 802.11b) which ran in the ISM frequency band and comply with wireless network standards. They are being used increasingly to link mobile Intranet into Wired networks. Standard methods of analyzing and testing their performance and compatibility are needed to determine the limits of the technology. This paper presents analytical and experimental methods of determining network throughput, range and coverage, and interference sources. Both radio frequency (BE) domain and network domain analysis have been applied to determine wireless network throughput and range in the outdoor environment- Comparison of field test data taken under optimal conditions, with performance predicted from RF analysis, yielded quantitative results applicable to future designs. Layering multiple wireless network- sooners can increase performance. Wireless network components can be set to different radio frequency-hopping sequences or spreading functions, allowing more than one sooner to coexist. Therefore, we ran multiple 802.11-compliant systems concurrently in the same geographical area to determine interference effects and scalability, The results can be used to design of more robust networks which have multiple layers of wireless data communication paths and provide increased throughput overall.
Yasir Tariq Mohmand
Full Text Available The structure and properties of public transportation networks can provide suggestions for urban planning and public policies. This study contributes a complex network analysis of the Guangzhou metro. The metro network has 236 kilometers of track and is the 6th busiest metro system of the world. In this paper topological properties of the network are explored. We observed that the network displays small world properties and is assortative in nature. The network possesses a high average degree of 17.5 with a small diameter of 5. Furthermore, we also identified the most important metro stations based on betweenness and closeness centralities. These could help in identifying the probable congestion points in the metro system and provide policy makers with an opportunity to improve the performance of the metro system.
Full Text Available Loss is an important parameter of Quality of Service (QoS. Though stochastic network calculus is a very useful tool for performance evaluation of computer networks, existing studies on stochastic service guarantees mainly focused on the delay and backlog. Some efforts have been made to analyse loss by deterministic network calculus, but there are few results to extend stochastic network calculus for loss analysis. In this paper, we introduce a new parameter named loss factor into stochastic network calculus and then derive the loss bound through the existing arrival curve and service curve via this parameter. We then prove that our result is suitable for the networks with multiple input flows. Simulations show the impact of buffer size, arrival traffic, and service on the loss factor.
Yousefi, Mohammadmahdi R; Dougherty, Edward R
A basic issue for translational genomics is to model gene interaction via gene regulatory networks (GRNs) and thereby provide an informatics environment to study the effects of intervention (say, via drugs) and to derive effective intervention strategies. Taking the view that the phenotype is characterized by the long-run behavior (steady-state distribution) of the network, we desire interventions to optimally move the probability mass from undesirable to desirable states Heretofore, two external control approaches have been taken to shift the steady-state mass of a GRN: (i) use a user-defined cost function for which desirable shift of the steady-state mass is a by-product and (ii) use heuristics to design a greedy algorithm. Neither approach provides an optimal control policy relative to long-run behavior. We use a linear programming approach to optimally shift the steady-state mass from undesirable to desirable states, i.e. optimization is directly based on the amount of shift and therefore must outperform previously proposed methods. Moreover, the same basic linear programming structure is used for both unconstrained and constrained optimization, where in the latter case, constraints on the optimization limit the amount of mass that may be shifted to 'ambiguous' states, these being states that are not directly undesirable relative to the pathology of interest but which bear some perceived risk. We apply the method to probabilistic Boolean networks, but the theory applies to any Markovian GRN. Supplementary materials, including the simulation results, MATLAB source code and description of suboptimal methods are available at http://gsp.tamu.edu/Publications/supplementary/yousefi13b. firstname.lastname@example.org Supplementary data are available at Bioinformatics online.
Dalphin, John F.
The GSFC Computer Network Environment provides a broadband RF cable between campus buildings and ethernet spines in buildings for the interlinking of Local Area Networks (LANs). This system provides terminal and computer linkage among host and user systems thereby providing E-mail services, file exchange capability, and certain distributed computing opportunities. The Environment is designed to be transparent and supports multiple protocols. Networking at Goddard has a short history and has been under coordinated control of a Network Steering Committee for slightly more than two years; network growth has been rapid with more than 1500 nodes currently addressed and greater expansion expected. A new RF cable system with a different topology is being installed during summer 1989; consideration of a fiber optics system for the future will begin soon. Summmer study was directed toward Network Steering Committee operation and planning plus consideration of Center Network Environment analysis and modeling. Biweekly Steering Committee meetings were attended to learn the background of the network and the concerns of those managing it. Suggestions for historical data gathering have been made to support future planning and modeling. Data Systems Dynamic Simulator, a simulation package developed at NASA and maintained at GSFC was studied as a possible modeling tool for the network environment. A modeling concept based on a hierarchical model was hypothesized for further development. Such a model would allow input of newly updated parameters and would provide an estimation of the behavior of the network.
Yang, Liang; Li, Wensheng; Deng, Chunjian; Lv, Yi
This paper is to critically analyze the architecture of UMA which is one of Fix Mobile Convergence (FMC) solutions, and also included by the third generation partnership project(3GPP). In UMA/GAN network architecture, UMA Network Controller (UNC) is the key equipment which connects with cellular core network and mobile station (MS). UMA network could be easily integrated into the existing cellular networks without influencing mobile core network, and could provides high-quality mobile services with preferentially priced indoor voice and data usage. This helps to improve subscriber's experience. On the other hand, UMA/GAN architecture helps to integrate other radio technique into cellular network which includes WiFi, Bluetooth, and WiMax and so on. This offers the traditional mobile operators an opportunity to integrate WiMax technique into cellular network. In the end of this article, we also give an analysis of potential influence on the cellular core networks ,which is pulled by UMA network.