WorldWideScience

Sample records for gene association networks

  1. Genes2FANs: connecting genes through functional association networks

    Science.gov (United States)

    2012-01-01

    Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting diseases genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in

  2. Reveal genes functionally associated with ACADS by a network study.

    Science.gov (United States)

    Chen, Yulong; Su, Zhiguang

    2015-09-15

    Establishing a systematic network is aimed at finding essential human gene-gene/gene-disease pathway by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes including HAHDA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. We observed that the highly efficient-interlinked genes connected with ACADS are HAHDA, HADHB, ECHS1 and ACAT1. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HAHDA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1 together with ACADS indulges in ketone metabolism. Our computational gene web analysis also predicts potential candidate disease recognition, thus indicating the involvement of ACADS, HAHDA, HADHB, ECHS1 and ACAT1 not only with lipid metabolism but also with infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of ACADS network, its functional strategies and candidate disease approach associated with ACADS network. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Novel candidate genes important for asthma and hypertension comorbidity revealed from associative gene networks.

    Science.gov (United States)

    Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A

    2018-02-13

    Hypertension and bronchial asthma are a major issue for people's health. As of 2014, approximately one billion adults, or ~ 22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of the effective treatment therapies against these diseases is complicated by their comorbidity features. This is often a major problem in diagnosis and their treatment. Hence, in this study the bioinformatical methodology for the analysis of the comorbidity of these two diseases have been developed. As such, the search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension - 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases. Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in

  4. Interactive visualization of gene regulatory networks with associated gene expression time series data

    NARCIS (Netherlands)

    Westenberg, M.A.; Hijum, van S.A.F.T.; Lulko, A.T.; Kuipers, O.P.; Roerdink, J.B.T.M.; Linsen, L.; Hagen, H.; Hamann, B.

    2008-01-01

    We present GENeVis, an application to visualize gene expression time series data in a gene regulatory network context. This is a network of regulator proteins that regulate the expression of their respective target genes. The networks are represented as graphs, in which the nodes represent genes,

  5. Network graph analysis of gene-gene interactions in genome-wide association study data.

    Science.gov (United States)

    Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

    2012-12-01

    Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  6. The integration of weighted human gene association networks based on link prediction.

    Science.gov (United States)

    Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing

    2017-01-31

    Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, it is found that these existing human gene association networks share only quite limited overlapped interactions, suggesting their incompleteness and noise. Here we proposed a workflow to construct a weighted human gene association network using information of six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied link prediction algorithm to predict possible missing links of the networks, cross-validation approach to refine each network and finally integrated the refined networks to get the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network owns much more links than most of the original networks, meanwhile its links still keep high functional relevance. Being used as background network in a case study of disease gene prediction, the final integrated network presents good performance, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.

  7. Network Graph Analysis of Gene-Gene Interactions in Genome-Wide Association Study Data

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2012-12-01

    Full Text Available Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs. For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR is one of the powerful and efficient methods for detecting high-order gene-gene (GxG interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI. Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  8. Discovering disease-associated genes in weighted protein-protein interaction networks

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Stanley, H. Eugene

    2018-04-01

    Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.

  9. The integration of weighted gene association networks based on information entropy.

    Science.gov (United States)

    Yang, Fan; Wu, Duzhi; Lin, Limei; Yang, Jian; Yang, Tinghong; Zhao, Jing

    2017-01-01

    Constructing genome scale weighted gene association networks (WGAN) from multiple data sources is one of research hot spots in systems biology. In this paper, we employ information entropy to describe the uncertain degree of gene-gene links and propose a strategy for data integration of weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biology information while still keeps high functional relevance between linked gene pairs. The new WGAN shows satisfactory performance in disease gene prediction, which suggests the reliability of our integration strategy. Compared with existing integration methods, our method takes the advantage of the inherent characteristics of the component networks and pays less attention to the biology background of the data. It can make full use of existing biological networks with low computational effort.

  10. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both

  11. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo

    2014-01-01

    Objective In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further

  12. Text mining and network analysis to find functional associations of genes in high altitude diseases.

    Science.gov (United States)

    Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar

    2018-05-02

    Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in-toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    Science.gov (United States)

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, with extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To more rational development of effective and safe E. coli vaccine, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interactions types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this big network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of

  15. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and construction coordination networks of regulatory genes. We also examined the behaviors of different methods to microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some analyses we did clearly show Pearson and Distance Covariance methods have distinct behaviors as compared to all other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates the pre-analysis to identify the best performing method for gene association and coexpression network construction.

  16. fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

    Science.gov (United States)

    Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

    2015-01-07

    Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there are eight set of candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked up with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchal mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsID׳s (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment.

    Science.gov (United States)

    Uddin, Raihan; Singh, Shiva M

    2017-01-01

    As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in "learning and memory" related functions and pathways. Subsequent differential network analysis of this "learning and memory" module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. Taken together, they

  18. Identification of a gene module associated with BMD through the integration of network analysis and genome-wide association data.

    Science.gov (United States)

    Farber, Charles R

    2010-11-01

    Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression. and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10(-11)). Using publically available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD. © 2010 American Society for Bone and Mineral Research.

  19. Transcriptional dynamics of a conserved gene expression network associated with craniofacial divergence in Arctic charr.

    Science.gov (United States)

    Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut

    2014-01-01

    Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. Among those, several transcriptional

  20. Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain.

    Science.gov (United States)

    Krienen, Fenna M; Yeo, B T Thomas; Ge, Tian; Buckner, Randy L; Sherwood, Chet C

    2016-01-26

    The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute's human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections.

  1. Application of R to investigate common gene regulatory network pathway among bipolar disorder and associate diseases

    Directory of Open Access Journals (Sweden)

    Nahida Habib

    2016-12-01

    Full Text Available Depression, Major Depression or mental disorder creates severe diseases. Mental illness such as Unipolar Major Depression, Bipolar Disorder, Dysthymia, Schizophrenia, Cardiovascular Diseases (Hypertension, Coronary Heart Disease, Stroke etc., are known as Major Depression. Several studies have revealed the possibilities about the association among Bipolar Disorder, Schizophrenia, Coronary Heart Diseases and Stroke with each other. The current study aimed to investigate the relationships between genetic variants in the above four diseases and to create a common pathway or PPI network. The associated genes of each disease are collected from different gene database with verification using R. After performing some preprocessing, mining and operations using R on collected genes, seven (7 common associated genes are discovered on selected four diseases (SZ, BD, CHD and Stroke. In each of the iteration, the numbers of collected genes are reduced up to 51%, 36%, 10%, 2% and finally less than 1% respectively. Moreover, common pathway on selected diseases has been investigated in this research.

  2. Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2005-09-01

    Full Text Available Abstract Background Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain. Results We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and conversely how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use Receiver Operating Curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories. Conclusion We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing

  3. Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2015-01-01

    Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these

  4. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

    Directory of Open Access Journals (Sweden)

    Xingli Guo

    Full Text Available The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least a known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results brought about by this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.

  5. Network-based association of hypoxia-responsive genes with cardiovascular diseases

    International Nuclear Information System (INIS)

    Wang, Rui-Sheng; Oldham, William M; Loscalzo, Joseph

    2014-01-01

    Molecular oxygen is indispensable for cellular viability and function. Hypoxia is a stress condition in which oxygen demand exceeds supply. Low cellular oxygen content induces a number of molecular changes to activate regulatory pathways responsible for increasing the oxygen supply and optimizing cellular metabolism under limited oxygen conditions. Hypoxia plays critical roles in the pathobiology of many diseases, such as cancer, heart failure, myocardial ischemia, stroke, and chronic lung diseases. Although the complicated associations between hypoxia and cardiovascular (and cerebrovascular) diseases (CVD) have been recognized for some time, there are few studies that investigate their biological link from a systems biology perspective. In this study, we integrate hypoxia genes, CVD genes, and the human protein interactome in order to explore the relationship between hypoxia and cardiovascular diseases at a systems level. We show that hypoxia genes are much closer to CVD genes in the human protein interactome than that expected by chance. We also find that hypoxia genes play significant bridging roles in connecting different cardiovascular diseases. We construct a hypoxia-CVD bipartite network and find several interesting hypoxia-CVD modules with significant gene ontology similarity. Finally, we show that hypoxia genes tend to have more CVD interactors in the human interactome than in random networks of matching topology. Based on these observations, we can predict novel genes that may be associated with CVD. This network-based association study gives us a broad view of the relationships between hypoxia and cardiovascular diseases and provides new insights into the role of hypoxia in cardiovascular biology. (paper)

  6. Statistical identification of gene association by CID in application of constructing ER regulatory network

    Directory of Open Access Journals (Sweden)

    Lien Huang-Chun

    2009-03-01

    Full Text Available Abstract Background A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID, is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs (X and their downstream genes (Y based on clinical data. More specifically, we use estrogen receptor α (ERα as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A. Results The analytical utility of CID was evaluated in comparison with four commonly used statistical methods, Galton-Pearson's correlation coefficient (GPCC, Student's t-test (STT, coefficient of determination (CoD, and mutual information (MI. When being compared to GPCC, CoD, and MI, CID reveals its preferential ability to discover the regulatory association where distribution of the mRNA expression levels on X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y against a discrete variable (X, it shows similar performance as compared to STT, and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the

  7. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

    Science.gov (United States)

    Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

    2015-10-01

    Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.

  8. Posterior association networks and functional modules inferred from rich phenotypes of gene perturbations.

    Directory of Open Access Journals (Sweden)

    Xin Wang

    Full Text Available Combinatorial gene perturbations provide rich information for a systematic exploration of genetic interactions. Despite successful applications to bacteria and yeast, the scalability of this approach remains a major challenge for higher organisms such as humans. Here, we report a novel experimental and computational framework to efficiently address this challenge by limiting the 'search space' for important genetic interactions. We propose to integrate rich phenotypes of multiple single gene perturbations to robustly predict functional modules, which can subsequently be subjected to further experimental investigations such as combinatorial gene silencing. We present posterior association networks (PANs to predict functional interactions between genes estimated using a Bayesian mixture modelling approach. The major advantage of this approach over conventional hypothesis tests is that prior knowledge can be incorporated to enhance predictive power. We demonstrate in a simulation study and on biological data, that integrating complementary information greatly improves prediction accuracy. To search for significant modules, we perform hierarchical clustering with multiscale bootstrap resampling. We demonstrate the power of the proposed methodologies in applications to Ewing's sarcoma and human adult stem cells using publicly available and custom generated data, respectively. In the former application, we identify a gene module including many confirmed and highly promising therapeutic targets. Genes in the module are also significantly overrepresented in signalling pathways that are known to be critical for proliferation of Ewing's sarcoma cells. In the latter application, we predict a functional network of chromatin factors controlling epidermal stem cell fate. Further examinations using ChIP-seq, ChIP-qPCR and RT-qPCR reveal that the basis of their genetic interactions may arise from transcriptional cross regulation. A Bioconductor package

  9. Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Science.gov (United States)

    2011-01-01

    Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms were learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were

  10. A co-expression gene network associated with developmental regulation of apple fruit acidity.

    Science.gov (United States)

    Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Xu, Kenong

    2015-08-01

    Apple fruit acidity, which affects the fruit's overall taste and flavor to a large extent, is primarily determined by the concentration of malic acid. Previous studies demonstrated that the major QTL malic acid (Ma) on chromosome 16 is largely responsible for fruit acidity variations in apple. Recent advances suggested that a natural mutation that gives rise to a premature stop codon in one of the two aluminum-activated malate transporter (ALMT)-like genes (called Ma1) is the genetic causal element underlying Ma. However, the natural mutation does not explain the developmental changes of fruit malate levels in a given genotype. Using RNA-seq data from the fruit of 'Golden Delicious' taken at 14 developmental stages from 1 week after full-bloom (WAF01) to harvest (WAF20), we characterized their transcriptomes in groups of high (12.2 ± 1.6 mg/g fw, WAF03-WAF08), mid (7.4 ± 0.5 mg/g fw, WAF01-WAF02 and WAF10-WAF14) and low (5.4 ± 0.4 mg/g fw, WAF16-WAF20) malate concentrations. Detailed analyses showed that a set of 3,066 genes (including Ma1) were expressed not only differentially (P FDR < 0.05) between the high and low malate groups (or between the early and late developmental stages) but also in significant (P < 0.05) correlation with malate concentrations. The 3,066 genes fell in 648 MapMan (sub-) bins or functional classes, and 19 of them were significantly (P FDR < 0.05) co-enriched or co-suppressed in a malate dependent manner. Network inferring using the 363 genes encompassed in the 19 (sub-) bins, identified a major co-expression network of 239 genes. Since the 239 genes were also differentially expressed between the early (WAF03-WAF08) and late (WAF16-WAF20) developmental stages, the major network was considered to be associated with developmental regulation of apple fruit acidity in 'Golden Delicious'.

  11. A Genome-Wide Association Study and Complex Network Identify Four Core Hub Genes in Bipolar Disorder

    Directory of Open Access Journals (Sweden)

    Zengyan Xie

    2017-12-01

    Full Text Available Bipolar disorder is a common and severe mental illness with unsolved pathophysiology. A genome-wide association study (GWAS has been used to find a number of risk genes, but it is difficult for a GWAS to find genes indirectly associated with a disease. To find core hub genes, we introduce a network analysis after the GWAS was conducted. Six thousand four hundred fifty eight single nucleotide polymorphisms (SNPs with p < 0.01 were sifted out from Wellcome Trust Case Control Consortium (WTCCC dataset and mapped to 2045 genes, which are then compared with the protein–protein network. One hundred twelve genes with a degree >17 were chosen as hub genes from which five significant modules and four core hub genes (FBXL13, WDFY2, bFGF, and MTHFD1L were found. These core hub genes have not been reported to be directly associated with BD but may function by interacting with genes directly related to BD. Our method engenders new thoughts on finding genes indirectly associated with, but important for, complex diseases.

  12. A Social Network Approach Reveals Associations between Mouse Social Dominance and Brain Gene Expression

    Science.gov (United States)

    So, Nina; Franks, Becca; Lim, Sean; Curley, James P.

    2015-01-01

    Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David’s Scores and I& SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg’s Hub Centrality and Bonacich’s Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing and intensive

  13. A Social Network Approach Reveals Associations between Mouse Social Dominance and Brain Gene Expression.

    Directory of Open Access Journals (Sweden)

    Nina So

    Full Text Available Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David's Scores and I& SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg's Hub Centrality and Bonacich's Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing

  14. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining.

    Science.gov (United States)

    Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to ( i ) discover novel candidate associations between different genes or proteins in the network, and ( ii ) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.

  15. An Organismal Model for Gene Regulatory Networks in the Gut-Associated Immune Response

    Directory of Open Access Journals (Sweden)

    Katherine M. Buckley

    2017-10-01

    Full Text Available The gut epithelium is an ancient site of complex communication between the animal immune system and the microbial world. While elements of self-non-self receptors and effector mechanisms differ greatly among animal phyla, some aspects of recognition, regulation, and response are broadly conserved. A gene regulatory network (GRN approach provides a means to investigate the nature of this conservation and divergence even as more peripheral functional details remain incompletely understood. The sea urchin embryo is an unparalleled experimental model for detangling the GRNs that govern embryonic development. By applying this theoretical framework to the free swimming, feeding larval stage of the purple sea urchin, it is possible to delineate the conserved regulatory circuitry that regulates the gut-associated immune response. This model provides a morphologically simple system in which to efficiently unravel regulatory connections that are phylogenetically relevant to immunity in vertebrates. Here, we review the organism-wide cellular and transcriptional immune response of the sea urchin larva. A large set of transcription factors and signal systems, including epithelial expression of interleukin 17 (IL17, are important mediators in the activation of the early gut-associated response. Many of these have homologs that are active in vertebrate immunity, while others are ancient in animals but absent in vertebrates or specific to echinoderms. This larval model provides a means to experimentally characterize immune function encoded in the sea urchin genome and the regulatory interconnections that control immune response and resolution across the tissues of the organism.

  16. Simultaneous inference of phenotype-associated genes and relevant tissues from GWAS data via Bayesian integration of multiple tissue-specific gene networks.

    Science.gov (United States)

    Wu, Mengmeng; Lin, Zhixiang; Ma, Shining; Chen, Ting; Jiang, Rui; Wong, Wing Hung

    2017-12-01

    Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has been appealing for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis with the assumption that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks has recently attracted much attention, due to such advantages as straightforward interpretation, less multiple testing burdens, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering genetic basis and discovering biological insights of a phenotype. With this understanding, we expect to see SIGNET as a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.

  17. Multiscale Modeling of Gene-Behavior Associations in an Artificial Neural Network Model of Cognitive Development

    Science.gov (United States)

    Thomas, Michael S. C.; Forrester, Neil A.; Ronald, Angelica

    2016-01-01

    In the multidisciplinary field of developmental cognitive neuroscience, statistical associations between levels of description play an increasingly important role. One example of such associations is the observation of correlations between relatively common gene variants and individual differences in behavior. It is perhaps surprising that such…

  18. Genome-wide association study and annotating candidate gene networks affecting age at first calving in Nellore cattle.

    Science.gov (United States)

    Mota, R R; Guimarães, S E F; Fortes, M R S; Hayes, B; Silva, F F; Verardo, L L; Kelly, M J; de Campos, C F; Guimarães, J D; Wenceslau, R R; Penitente-Filho, J M; Garcia, J F; Moore, S

    2017-12-01

    We performed a genome-wide mapping for the age at first calving (AFC) with the goal of annotating candidate genes that regulate fertility in Nellore cattle. Phenotypic data from 762 cows and 777k SNP genotypes from 2,992 bulls and cows were used. Single nucleotide polymorphism (SNP) effects based on the single-step GBLUP methodology were blocked into adjacent windows of 1 Megabase (Mb) to explain the genetic variance. SNP windows explaining more than 0.40% of the AFC genetic variance were identified on chromosomes 2, 8, 9, 14, 16 and 17. From these windows, we identified 123 coding protein genes that were used to build gene networks. From the association study and derived gene networks, putative candidate genes (e.g., PAPPA, PREP, FER1L6, TPR, NMNAT1, ACAD10, PCMTD1, CRH, OPKR1, NPBWR1 and NCOA2) and transcription factors (TF) (STAT1, STAT3, RELA, E2F1 and EGR1) were strongly associated with female fertility (e.g., negative regulation of luteinizing hormone secretion, folliculogenesis and establishment of uterine receptivity). Evidence suggests that AFC inheritance is complex and controlled by multiple loci across the genome. As several windows explaining higher proportion of the genetic variance were identified on chromosome 14, further studies investigating the interaction across haplotypes to better understand the molecular architecture behind AFC in Nellore cattle should be undertaken. © 2017 Blackwell Verlag GmbH.

  19. Identification of Gene Modules Associated with Low Temperatures Response in Bambara Groundnut by Network-Based Analysis.

    Directory of Open Access Journals (Sweden)

    Venkata Suresh Bonthala

    Full Text Available Bambara groundnut (Vigna subterranea (L. Verdc. is an African legume and is a promising underutilized crop with good seed nutritional values. Low temperature stress in a number of African countries at night, such as Botswana, can effect the growth and development of bambara groundnut, leading to losses in potential crop yield. Therefore, in this study we developed a computational pipeline to identify and analyze the genes and gene modules associated with low temperature stress responses in bambara groundnut using the cross-species microarray technique (as bambara groundnut has no microarray chip coupled with network-based analysis. Analyses of the bambara groundnut transcriptome using cross-species gene expression data resulted in the identification of 375 and 659 differentially expressed genes (p<0.01 under the sub-optimal (23°C and very sub-optimal (18°C temperatures, respectively, of which 110 genes are commonly shared between the two stress conditions. The construction of a Highest Reciprocal Rank-based gene co-expression network, followed by its partition using a Heuristic Cluster Chiseling Algorithm resulted in 6 and 7 gene modules in sub-optimal and very sub-optimal temperature stresses being identified, respectively. Modules of sub-optimal temperature stress are principally enriched with carbohydrate and lipid metabolic processes, while most of the modules of very sub-optimal temperature stress are significantly enriched with responses to stimuli and various metabolic processes. Several transcription factors (from MYB, NAC, WRKY, WHIRLY & GATA classes that may regulate the downstream genes involved in response to stimulus in order for the plant to withstand very sub-optimal temperature stress were highlighted. The identified gene modules could be useful in breeding for low-temperature stress tolerant bambara groundnut varieties.

  20. Gene coexpression network analysis of fruit transcriptomes uncovers a possible mechanistically distinct class of sugar/acid ratio-associated genes in sweet orange.

    Science.gov (United States)

    Qiao, Liang; Cao, Minghao; Zheng, Jian; Zhao, Yihong; Zheng, Zhi-Liang

    2017-10-30

    The ratio of sugars to organic acids, two of the major metabolites in fleshy fruits, has been considered the most important contributor to fruit sweetness. Although accumulation of sugars and acids have been extensively studied, whether plants evolve a mechanism to maintain, sense or respond to the fruit sugar/acid ratio remains a mystery. In a prior study, we used an integrated systems biology tool to identify a group of 39 acid-associated genes from the fruit transcriptomes in four sweet orange varieties (Citrus sinensis L. Osbeck) with varying fruit acidity, Succari (acidless), Bingtang (low acid), and Newhall and Xinhui (normal acid). We reanalyzed the prior sweet orange fruit transcriptome data, leading to the identification of 72 genes highly correlated with the fruit sugar/acid ratio. The majority of these sugar/acid ratio-related genes are predicted to be involved in regulatory functions such as transport, signaling and transcription or encode enzymes involved in metabolism. Surprisingly, only three of these sugar/acid ratio-correlated genes are weakly correlated with sugar level and none of them overlaps with the acid-associated genes. Weighted Gene Coexpression Network Analysis (WGCNA) has revealed that these genes belong to four modules, Blue, Grey, Brown and Turquoise, with the former two modules being unique to the sugar/acid ratio control. Our results indicate that orange fruits contain a possible mechanistically distinct class of genes that may potentially be involved in maintaining fruit sugar/acid ratios and/or responding to the cellular sugar/acid ratio status. Therefore, our analysis of orange transcriptomes provides an intriguing insight into the potentially novel genetic or molecular mechanisms controlling the sugar/acid ratio in fruits.

  1. Identifying miRNA and gene modules of colon cancer associated with pathological stage by weighted gene co-expression network analysis

    Directory of Open Access Journals (Sweden)

    Zhou X

    2018-05-01

    characterize the results of WGCNA.Results: Two gene modules (Gmagenta and Ggreen and one miRNA module were associated with the pathological stage. Six hub genes (COL1A2, THBS2, BGN, COL1A1, TAGLN and DACT3 were related to prognosis and validated to be associated with the pathological stage. Five hub miRNAs were identified to be related to prognosis (hsa-miR-125b-5p, hsa-miR-145-5p, hsa-let-7c-5p, hsa-miR-218-5p and hsa-miR-125b-2-3p. A total of 18 hub genes and seven hub miRNAs were predominantly expressed in tumor stroma. Proteoglycans in cancer, focal adhesion, extracellular matrix (ECM–receptor interaction and so on were common pathways of the three modules. Hsa-let-7c-5p was located at the core of miRNA–gene network.Conclusion: These findings help to advance the understanding of tumor stroma in the progression of CAC and provide prognostic biomarkers as well as therapeutic targets. Keywords: colon adenocarcinoma, weighted gene co-expression network analysis, differentially expressed genes, differentially expressed miRNA, tumor stroma

  2. Introduction: Cancer Gene Networks.

    Science.gov (United States)

    Clarke, Robert

    2017-01-01

    Constructing, evaluating, and interpreting gene networks generally sits within the broader field of systems biology, which continues to emerge rapidly, particular with respect to its application to understanding the complexity of signaling in the context of cancer biology. For the purposes of this volume, we take a broad definition of systems biology. Considering an organism or disease within an organism as a system, systems biology is the study of the integrated and coordinated interactions of the network(s) of genes, their variants both natural and mutated (e.g., polymorphisms, rearrangements, alternate splicing, mutations), their proteins and isoforms, and the organic and inorganic molecules with which they interact, to execute the biochemical reactions (e.g., as enzymes, substrates, products) that reflect the function of that system. Central to systems biology, and perhaps the only approach that can effectively manage the complexity of such systems, is the building of quantitative multiscale predictive models. The predictions of the models can vary substantially depending on the nature of the model and its inputoutput relationships. For example, a model may predict the outcome of a specific molecular reaction(s), a cellular phenotype (e.g., alive, dead, growth arrest, proliferation, and motility), a change in the respective prevalence of cell or subpopulations, a patient or patient subgroup outcome(s). Such models necessarily require computers. Computational modeling can be thought of as using machine learning and related tools to integrate the very high dimensional data generated from modern, high throughput omics technologies including genomics (next generation sequencing), transcriptomics (gene expression microarrays; RNAseq), metabolomics and proteomics (ultra high performance liquid chromatography, mass spectrometry), and "subomic" technologies to study the kinome, methylome, and others. Mathematical modeling can be thought of as the use of ordinary

  3. SPDEF is required for mouse pulmonary goblet cell differentiation and regulates a network of genes associated with mucus production.

    Science.gov (United States)

    Chen, Gang; Korfhagen, Thomas R; Xu, Yan; Kitzmiller, Joseph; Wert, Susan E; Maeda, Yutaka; Gregorieff, Alexander; Clevers, Hans; Whitsett, Jeffrey A

    2009-10-01

    Various acute and chronic inflammatory stimuli increase the number and activity of pulmonary mucus-producing goblet cells, and goblet cell hyperplasia and excess mucus production are central to the pathogenesis of chronic pulmonary diseases. However, little is known about the transcriptional programs that regulate goblet cell differentiation. Here, we show that SAM-pointed domain-containing Ets-like factor (SPDEF) controls a transcriptional program critical for pulmonary goblet cell differentiation in mice. Initial cell-lineage-tracing analysis identified nonciliated secretory epithelial cells, known as Clara cells, as the progenitors of goblet cells induced by pulmonary allergen exposure in vivo. Furthermore, in vivo expression of SPDEF in Clara cells caused rapid and reversible goblet cell differentiation in the absence of cell proliferation. This was associated with enhanced expression of genes regulating goblet cell differentiation and protein glycosylation, including forkhead box A3 (Foxa3), anterior gradient 2 (Agr2), and glucosaminyl (N-acetyl) transferase 3, mucin type (Gcnt3). Consistent with these findings, levels of SPDEF and FOXA3 were increased in mouse goblet cells after sensitization with pulmonary allergen, and the proteins were colocalized in goblet cells lining the airways of patients with chronic lung diseases. Deletion of the mouse Spdef gene resulted in the absence of goblet cells in tracheal/laryngeal submucosal glands and in the conducting airway epithelium after pulmonary allergen exposure in vivo. These data show that SPDEF plays a critical role in regulating a transcriptional network mediating the goblet cell differentiation and mucus hyperproduction associated with chronic pulmonary disorders.

  4. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk

    DEFF Research Database (Denmark)

    Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan

    2015-01-01

    BACKGROUND: Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified...... in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). RESULTS: Gene set enrichment analysis...

  5. A novel method of predicting microRNA-disease associations based on microRNA, disease, gene and environment factor networks.

    Science.gov (United States)

    Peng, Wei; Lan, Wei; Zhong, Jiancheng; Wang, Jianxin; Pan, Yi

    2017-07-15

    MicroRNAs have been reported to have close relationship with diseases due to their deregulation of the expression of target mRNAs. Detecting disease-related microRNAs is helpful for disease therapies. With the development of high throughput experimental techniques, a large number of microRNAs have been sequenced. However, it is still a big challenge to identify which microRNAs are related to diseases. Recently, researchers are interesting in combining multiple-biological information to identify the associations between microRNAs and diseases. In this work, we have proposed a novel method to predict the microRNA-disease associations based on four biological properties. They are microRNA, disease, gene and environment factor. Compared with previous methods, our method makes predictions not only by using the prior knowledge of associations among microRNAs, disease, environment factors and genes, but also by using the internal relationship among these biological properties. We constructed four biological networks based on the similarity of microRNAs, diseases, environment factors and genes, respectively. Then random walking was implemented on the four networks unequally. In the walking course, the associations can be inferred from the neighbors in the same networks. Meanwhile the association information can be transferred from one network to another. The results of experiment showed that our method achieved better prediction performance than other existing state-of-the-art methods. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. The effect of alcohol on the differential expression of cluster of differentiation 14 gene, associated pathways, and genetic network.

    Directory of Open Access Journals (Sweden)

    Diana X Zhou

    Full Text Available Alcohol consumption affects human health in part by compromising the immune system. In this study, we examined the expression of the Cd14 (cluster of differentiation 14 gene, which is involved in the immune system through a proinflammatory cascade. Expression was evaluated in BXD mice treated with saline or acute 1.8 g/kg i.p. ethanol (12.5% v/v. Hippocampal gene expression data were generated to examine differential expression and to perform systems genetics analyses. The Cd14 gene expression showed significant changes among the BXD strains after ethanol treatment, and eQTL mapping revealed that Cd14 is a cis-regulated gene. We also identified eighteen ethanol-related phenotypes correlated with Cd14 expression related to either ethanol responses or ethanol consumption. Pathway analysis was performed to identify possible biological pathways involved in the response to ethanol and Cd14. We also constructed a genetic network for Cd14 using the top 20 correlated genes and present several genes possibly involved in Cd14 and ethanol responses based on differential gene expression. In conclusion, we found Cd14, along with several other genes and pathways, to be involved in ethanol responses in the hippocampus, such as increased susceptibility to lipopolysaccharides and neuroinflammation.

  7. Resistance Genes in Global Crop Breeding Networks.

    Science.gov (United States)

    Garrett, K A; Andersen, K F; Asche, F; Bowden, R L; Forbes, G A; Kulakow, P A; Zhou, B

    2017-10-01

    Resistance genes are a major tool for managing crop diseases. The networks of crop breeders who exchange resistance genes and deploy them in varieties help to determine the global landscape of resistance and epidemics, an important system for maintaining food security. These networks function as a complex adaptive system, with associated strengths and vulnerabilities, and implications for policies to support resistance gene deployment strategies. Extensions of epidemic network analysis can be used to evaluate the multilayer agricultural networks that support and influence crop breeding networks. Here, we evaluate the general structure of crop breeding networks for cassava, potato, rice, and wheat. All four are clustered due to phytosanitary and intellectual property regulations, and linked through CGIAR hubs. Cassava networks primarily include public breeding groups, whereas others are more mixed. These systems must adapt to global change in climate and land use, the emergence of new diseases, and disruptive breeding technologies. Research priorities to support policy include how best to maintain both diversity and redundancy in the roles played by individual crop breeding groups (public versus private and global versus local), and how best to manage connectivity to optimize resistance gene deployment while avoiding risks to the useful life of resistance genes. [Formula: see text] Copyright © 2017 The Author(s). This is an open access article distributed under the CC BY 4.0 International license .

  8. Genome-wide copy number variation study associates metabotropic glutamate receptor gene networks with attention deficit hyperactivity disorder

    Science.gov (United States)

    Elia, Josephine; Glessner, Joseph T; Wang, Kai; Takahashi, Nagahide; Shtir, Corina J; Hadley, Dexter; Sleiman, Patrick M A; Zhang, Haitao; Kim, Cecilia E; Robison, Reid; Lyon, Gholson J; Flory, James H; Bradfield, Jonathan P; Imielinski, Marcin; Hou, Cuiping; Frackelton, Edward C; Chiavacci, Rosetta M; Sakurai, Takeshi; Rabin, Cara; Middleton, Frank A; Thomas, Kelly A; Garris, Maria; Mentch, Frank; Freitag, Christine M; Steinhausen, Hans-Christoph; Todorov, Alexandre A; Reif, Andreas; Rothenberger, Aribert; Franke, Barbara; Mick, Eric O; Roeyers, Herbert; Buitelaar, Jan; Lesch, Klaus-Peter; Banaschewski, Tobias; Ebstein, Richard P; Mulas, Fernando; Oades, Robert D; Sergeant, Joseph; Sonuga-Barke, Edmund; Renner, Tobias J; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Meyer, Jobst; Pálmason, Haukur; Seitz, Christiane; Loo, Sandra K; Smalley, Susan L; Biederman, Joseph; Kent, Lindsey; Asherson, Philip; Anney, Richard J L; Gaynor, J William; Shaw, Philip; Devoto, Marcella; White, Peter S; Grant, Struan F A; Buxbaum, Joseph D; Rapoport, Judith L; Williams, Nigel M; Nelson, Stanley F; Faraone, Stephen V; Hakonarson, Hakon

    2014-01-01

    Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10−9). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10−6). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10−10) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts. PMID:22138692

  9. Discovering implicit entity relation with the gene-citation-gene network.

    Directory of Open Access Journals (Sweden)

    Min Song

    Full Text Available In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG network. Based on the premise there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveal that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner.

  10. Dynamics of associating networks

    Science.gov (United States)

    Tang, Shengchang; Habicht, Axel; Wang, Muzhou; Li, Shuaili; Seiffert, Sebastian; Olsen, Bradley

    Associating polymers offer important technological solutions to renewable and self-healing materials, conducting electrolytes for energy storage and transport, and vehicles for cell and protein deliveries. The interplay between polymer topologies and association chemistries warrants new interesting physics from associating networks, yet poses significant challenges to study these systems over a wide range of time and length scales. In a series of studies, we explored self-diffusion mechanisms of associating polymers above the percolation threshold, by combining experimental measurements using forced Rayleigh scattering and analytical insights from a two-state model. Despite the differences in molecular structures, a universal super-diffusion phenomenon is observed when diffusion of molecular species is hindered by dissociation kinetics. The molecular dissociation rate can be used to renormalize shear rheology data, which yields an unprecedented time-temperature-concentration superposition. The obtained shear rheology master curves provide experimental evidence of the relaxation hierarchy in associating networks.

  11. Application of gene network analysis techniques identifies AXIN1/PDIA2 and endoglin haplotypes associated with bicuspid aortic valve.

    Directory of Open Access Journals (Sweden)

    Eric C Wooten

    2010-01-01

    Full Text Available Bicuspid Aortic Valve (BAV is a highly heritable congenital heart defect. The low frequency of BAV (1% of general population limits our ability to perform genome-wide association studies. We present the application of four a priori SNP selection techniques, reducing the multiple-testing penalty by restricting analysis to SNPs relevant to BAV in a genome-wide SNP dataset from a cohort of 68 BAV probands and 830 control subjects. Two knowledge-based approaches, CANDID and STRING, were used to systematically identify BAV genes, and their SNPs, from the published literature, microarray expression studies and a genome scan. We additionally tested Functionally Interpolating SNPs (fitSNPs present on the array; the fourth consisted of SNPs selected by Random Forests, a machine learning approach. These approaches reduced the multiple testing penalty by lowering the fraction of the genome probed to 0.19% of the total, while increasing the likelihood of studying SNPs within relevant BAV genes and pathways. Three loci were identified by CANDID, STRING, and fitSNPS. A haplotype within the AXIN1-PDIA2 locus (p-value of 2.926x10(-06 and a haplotype within the Endoglin gene (p-value of 5.881x10(-04 were found to be strongly associated with BAV. The Random Forests approach identified a SNP on chromosome 3 in association with BAV (p-value 5.061x10(-06. The results presented here support an important role for genetic variants in BAV and provide support for additional studies in well-powered cohorts. Further, these studies demonstrate that leveraging existing expression and genomic data in the context of GWAS studies can identify biologically relevant genes and pathways associated with a congenital heart defect.

  12. Gene coexpression network analysis as a source of functional annotation for rice genes.

    Directory of Open Access Journals (Sweden)

    Kevin L Childs

    Full Text Available With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found that gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional

  13. Networks in biological systems: An investigation of the Gene Ontology as an evolving network

    International Nuclear Information System (INIS)

    Coronnello, C; Tumminello, M; Micciche, S; Mantegna, R.N.

    2009-01-01

    Many biological systems can be described as networks where different elements interact, in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between each two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and we characterize its evolution over different releases of the Gene Ontology.

  14. Current approaches to gene regulatory network modelling

    Directory of Open Access Journals (Sweden)

    Brazma Alvis

    2007-09-01

    Full Text Available Abstract Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.

  15. A network of genes, genetic disorders, and brain areas.

    Directory of Open Access Journals (Sweden)

    Satoru Hayasaka

    Full Text Available The network-based approach has been used to describe the relationship among genes and various phenotypes, producing a network describing complex biological relationships. Such networks can be constructed by aggregating previously reported associations in the literature from various databases. In this work, we applied the network-based approach to investigate how different brain areas are associated to genetic disorders and genes. In particular, a tripartite network with genes, genetic diseases, and brain areas was constructed based on the associations among them reported in the literature through text mining. In the resulting network, a disproportionately large number of gene-disease and disease-brain associations were attributed to a small subset of genes, diseases, and brain areas. Furthermore, a small number of brain areas were found to be associated with a large number of the same genes and diseases. These core brain regions encompassed the areas identified by the previous genome-wide association studies, and suggest potential areas of focus in the future imaging genetics research. The approach outlined in this work demonstrates the utility of the network-based approach in studying genetic effects on the brain.

  16. Constructing an integrated gene similarity network for the identification of disease genes.

    Science.gov (United States)

    Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

    2017-09-20

    Discovering novel genes that are involved human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and only cover less half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of interested diseases. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on similarity network fusion (SNF) method. Finally, we employee the random walk with restart algorithm on the phenotype-gene bilayer network, which combines phenotype similarity network, IGSN as well as phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB is benefited from IGSN which has a wider coverage and higher reliability comparing with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that supported by literature. RWRB is an effective and reliable algorithm in prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .

  17. Chemical-Gene Interactions from ToxCast Bioactivity Data Expands Universe of Literature Network-Based Associations (SOT)

    Science.gov (United States)

    Characterizing the effects of chemicals in biological systems is often summarized by chemical-gene interactions, which have sparse coverage in the literature. The ToxCast chemical screening program has produced bioactivity data for nearly 2000 chemicals and over 450 gene targets....

  18. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    2017-09-01

    Full Text Available Autism spectrum disorder (ASD is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.

  19. Prioritizing genes associated with prostate cancer development

    International Nuclear Information System (INIS)

    Gorlov, Ivan P; Logothetis, Christopher J; Sircar, Kanishka; Zhao, Hongya; Maity, Sankar N; Navone, Nora M; Gorlova, Olga Y; Troncoso, Patricia; Pettaway, Curtis A; Byun, Jin Young

    2010-01-01

    The genetic control of prostate cancer development is poorly understood. Large numbers of gene-expression datasets on different aspects of prostate tumorigenesis are available. We used these data to identify and prioritize candidate genes associated with the development of prostate cancer and bone metastases. Our working hypothesis was that combining meta-analyses on different but overlapping steps of prostate tumorigenesis will improve identification of genes associated with prostate cancer development. A Z score-based meta-analysis of gene-expression data was used to identify candidate genes associated with prostate cancer development. To put together different datasets, we conducted a meta-analysis on 3 levels that follow the natural history of prostate cancer development. For experimental verification of candidates, we used in silico validation as well as in-house gene-expression data. Genes with experimental evidence of an association with prostate cancer development were overrepresented among our top candidates. The meta-analysis also identified a considerable number of novel candidate genes with no published evidence of a role in prostate cancer development. Functional annotation identified cytoskeleton, cell adhesion, extracellular matrix, and cell motility as the top functions associated with prostate cancer development. We identified 10 genes--CDC2, CCNA2, IGF1, EGR1, SRF, CTGF, CCL2, CAV1, SMAD4, and AURKA--that form hubs of the interaction network and therefore are likely to be primary drivers of prostate cancer development. By using this large 3-level meta-analysis of the gene-expression data to identify candidate genes associated with prostate cancer development, we have generated a list of candidate genes that may be a useful resource for researchers studying the molecular mechanisms underlying prostate cancer development

  20. Inferring gene networks from discrete expression data

    KAUST Repository

    Zhang, L.

    2013-07-18

    The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNAtranscripts in cell samples.We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution.We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.

  1. Association of a Network of Interferon-Stimulated Genes with a Locus Encoding a Negative Regulator of Non-conventional IKK Kinases and IFNB1

    Directory of Open Access Journals (Sweden)

    Saloua Jeidane

    2016-10-01

    Full Text Available Functional genomic analysis of gene expression in mice allowed us to identify a quantitative trait locus (QTL linked in trans to the expression of 190 gene transcripts and in cis to the expression of only two genes, one of which was Ypel5. Most of the trans-expression QTL genes were interferon-stimulated genes (ISGs, and their expression in mouse macrophage cell lines was stimulated in an IFNB1-dependent manner by Ypel5 silencing. In human HEK293T cells, YPEL5 silencing enhanced the induction of IFNB1 by pattern recognition receptors and phosphorylation of TBK1/IKBKE kinases, whereas co-immunoprecipitation experiments revealed that YPEL5 interacted physically with IKBKE. We thus found that the Ypel5 gene (contained in a locus linked to a network of ISGs in mice is a negative regulator of IFNB1 production and innate immune responses that interacts functionally and physically with TBK1/IKBKE kinases.

  2. Inferring gene networks from discrete expression data

    KAUST Repository

    Zhang, L.; Mallick, B. K.

    2013-01-01

    graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which

  3. Sparsity in Model Gene Regulatory Networks

    International Nuclear Information System (INIS)

    Zagorski, M.

    2011-01-01

    We propose a gene regulatory network model which incorporates the microscopic interactions between genes and transcription factors. In particular the gene's expression level is determined by deterministic synchronous dynamics with contribution from excitatory interactions. We study the structure of networks that have a particular '' function '' and are subject to the natural selection pressure. The question of network robustness against point mutations is addressed, and we conclude that only a small part of connections defined as '' essential '' for cell's existence is fragile. Additionally, the obtained networks are sparse with narrow in-degree and broad out-degree, properties well known from experimental study of biological regulatory networks. Furthermore, during sampling procedure we observe that significantly different genotypes can emerge under mutation-selection balance. All the preceding features hold for the model parameters which lay in the experimentally relevant range. (author)

  4. Functional Module Analysis for Gene Coexpression Networks with Network Integration.

    Science.gov (United States)

    Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K

    2015-01-01

    Network has been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Module as a fundamental property of many biological networks has been widely studied and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large amount of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. Our method outperforms the existing methods, especially when the underlying modules in multiple networks are different in simulation studies. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from the morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.

  5. Crowdsourcing the nodulation gene network discovery environment.

    Science.gov (United States)

    Li, Yupeng; Jackson, Scott A

    2016-05-26

    The Legumes (Fabaceae) are an economically and ecologically important group of plant species with the conspicuous capacity for symbiotic nitrogen fixation in root nodules, specialized plant organs containing symbiotic microbes. With the aim of understanding the underlying molecular mechanisms leading to nodulation, many efforts are underway to identify nodulation-related genes and determine how these genes interact with each other. In order to accurately and efficiently reconstruct nodulation gene network, a crowdsourcing platform, CrowdNodNet, was created. The platform implements the jQuery and vis.js JavaScript libraries, so that users are able to interactively visualize and edit the gene network, and easily access the information about the network, e.g. gene lists, gene interactions and gene functional annotations. In addition, all the gene information is written on MediaWiki pages, enabling users to edit and contribute to the network curation. Utilizing the continuously updated, collaboratively written, and community-reviewed Wikipedia model, the platform could, in a short time, become a comprehensive knowledge base of nodulation-related pathways. The platform could also be used for other biological processes, and thus has great potential for integrating and advancing our understanding of the functional genomics and systems biology of any process for any species. The platform is available at http://crowd.bioops.info/ , and the source code can be openly accessed at https://github.com/bioops/crowdnodnet under MIT License.

  6. BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks.

    Science.gov (United States)

    Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D

    2015-06-12

    During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.

  7. Combinatorial explosion in model gene networks

    Science.gov (United States)

    Edwards, R.; Glass, L.

    2000-09-01

    The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such

  8. Deconstructing the pluripotency gene regulatory network

    KAUST Repository

    Li, Mo

    2018-04-04

    Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.

  9. Deconstructing the pluripotency gene regulatory network

    KAUST Repository

    Li, Mo; Belmonte, Juan Carlos Izpisua

    2018-01-01

    Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.

  10. Network Completion for Static Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Natsu Nakajima

    2014-01-01

    Full Text Available We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data.

  11. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    Science.gov (United States)

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls) were downloaded from Gene Expression Omnibus database. Characteristic genes were identified using metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in totally enriched in 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs sanguinarine and papaverine were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.

  12. Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2007-10-01

    Full Text Available Abstract Background In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP, generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.

  13. Mutated Genes in Schizophrenia Map to Brain Networks

    Science.gov (United States)

    ... Matters NIH Research Matters August 12, 2013 Mutated Genes in Schizophrenia Map to Brain Networks Schizophrenia networks ... have a high number of spontaneous mutations in genes that form a network in the front region ...

  14. Transcriptome analysis of an apple (Malus × domestica) yellow fruit somatic mutation identifies a gene network module highly associated with anthocyanin and epigenetic regulation.

    Science.gov (United States)

    El-Sharkawy, Islam; Liang, Dong; Xu, Kenong

    2015-12-01

    Using RNA-seq, this study analysed an apple (Malus×domestica) anthocyanin-deficient yellow-skin somatic mutant 'Blondee' (BLO) and its red-skin parent 'Kidd's D-8' (KID), the original name of 'Gala', to understand the molecular mechanisms underlying the mutation. A total of 3299 differentially expressed genes (DEGs) were identified between BLO and KID at four developmental stages and/or between two adjacent stages within BLO and/or KID. A weighted gene co-expression network analysis (WGCNA) of the DEGs uncovered a network module of 34 genes highly correlated (r=0.95, P=9.0×10(-13)) with anthocyanin contents. Although 12 of the 34 genes in the WGCNA module were characterized and known of roles in anthocyanin, the remainder 22 appear to be novel. Examining the expression of ten representative genes in the module in 14 diverse apples revealed that at least eight were significantly correlated with anthocyanin variation. MdMYB10 (MDP0000259614) and MdGST (MDP0000252292) were among the most suppressed module member genes in BLO despite being undistinguishable in their corresponding sequences between BLO and KID. Methylation assay of MdMYB10 and MdGST in fruit skin revealed that two regions (MR3 and MR7) in the MdMYB10 promoter exhibited remarkable differences between BLO and KID. In particular, methylation was high and progressively increased alongside fruit development in BLO while was correspondingly low and constant in KID. The methylation levels in both MR3 and MR7 were negatively correlated with anthocyanin content as well as the expression of MdMYB10 and MdGST. Clearly, the collective repression of the 34 genes explains the loss-of-colour in BLO while the methylation in MdMYB10 promoter is likely causal for the mutation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  15. System Biology Approach: Gene Network Analysis for Muscular Dystrophy.

    Science.gov (United States)

    Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro

    2018-01-01

    Phenotypic changes at different organization levels from cell to entire organism are associated to changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed a light on unexpected behaviour of gene regulation system while maintaining a more naturalistic view on the studied system.In this chapter we applied an unsupervised method to discriminate DMD patient and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.

  16. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary, sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm. Binary encoding can also be useful, but only when the methods mentioned above cannot be used.

  17. Mutational robustness of gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Full Text Available Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor-target gene interactions but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive. In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.

  18. Transcriptional delay stabilizes bistable gene networks.

    Science.gov (United States)

    Gupta, Chinmaya; López, José Manuel; Ott, William; Josić, Krešimir; Bennett, Matthew R

    2013-08-02

    Transcriptional delay can significantly impact the dynamics of gene networks. Here we examine how such delay affects bistable systems. We investigate several stochastic models of bistable gene networks and find that increasing delay dramatically increases the mean residence times near stable states. To explain this, we introduce a non-Markovian, analytically tractable reduced model. The model shows that stabilization is the consequence of an increased number of failed transitions between stable states. Each of the bistable systems that we simulate behaves in this manner.

  19. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Full Text Available Marbling is an important trait in characterization beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate gene associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7 using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60 and dihydropyrimidine dehydrogenase (DPYD are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  20. A hybrid network-based method for the detection of disease-related genes

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

    2018-02-01

    Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.

  1. Multiscale Embedded Gene Co-expression Network Analysis.

    Directory of Open Access Journals (Sweden)

    Won-Min Song

    2015-11-01

    Full Text Available Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3, the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA by: i introducing quality control of co-expression similarities, ii parallelizing embedded network construction, and iii developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs. We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA. MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  2. Multiscale Embedded Gene Co-expression Network Analysis.

    Science.gov (United States)

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  3. Ground rules of the pluripotency gene regulatory network.

    KAUST Repository

    Li, Mo

    2017-01-03

    Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.

  4. Ground rules of the pluripotency gene regulatory network.

    KAUST Repository

    Li, Mo; Belmonte, Juan Carlos Izpisua

    2017-01-01

    Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.

  5. Construction of functional linkage gene networks by data integration.

    Science.gov (United States)

    Linghu, Bolan; Franzosa, Eric A; Xia, Yu

    2013-01-01

    Networks of functional associations between genes have recently been successfully used for gene function and disease-related research. A typical approach for constructing such functional linkage gene networks (FLNs) is based on the integration of diverse high-throughput functional genomics datasets. Data integration is a nontrivial task due to the heterogeneous nature of the different data sources and their variable accuracy and completeness. The presence of correlations between data sources also adds another layer of complexity to the integration process. In this chapter we discuss an approach for constructing a human FLN from data integration and a subsequent application of the FLN to novel disease gene discovery. Similar approaches can be applied to nonhuman species and other discovery tasks.

  6. Molecular network, pathway, and functional analysis of time-dependent gene changes associated with pancreatic cancer susceptibility to oncolytic vaccinia virotherapy

    Directory of Open Access Journals (Sweden)

    Dana Haddad

    2016-01-01

    Conclusions: Our study reveals the ability to assess time-dependent changes in gene expression patterns in pancreatic cancer cells associated with infection and susceptibility to vaccinia viruses. This suggests that molecular assays may be useful to develop safer and more efficacious oncolyticvirotherapies and support the idea that these treatments may target pathways implicated in pancreatic cancer resistance to conventional therapies.

  7. Impulsive Internet Game Play Is Associated With Increased Functional Connectivity Between the Default Mode and Salience Networks in Depressed Patients With Short Allele of Serotonin Transporter Gene

    Directory of Open Access Journals (Sweden)

    Ji Sun Hong

    2018-04-01

    Full Text Available Problematic Internet game play is often accompanied by major depressive disorder (MDD. Depression seems to be closely related to altered functional connectivity (FC within (and between the default mode network (DMN and salience network. In addition, serotonergic neurotransmission may regulate the symptoms of depression, including impulsivity, potentially by modulating the DMN. We hypothesized that altered connectivity between the DMN and salience network could mediate an association between the 5HTTLPR genotype and impulsivity in patients with depression. A total of 54 participants with problematic Internet game play and MDD completed the research protocol. We genotyped for 5HTTLPR and assessed the DMN FC using resting-state functional magnetic resonance imaging. The severity of Internet game play, depressive symptoms, anxiety, attention and impulsivity, and behavioral inhibition and activation were assessed using the Young Internet Addiction Scale (YIAS, Beck Depressive Inventory, Beck Anxiety Inventory (BAI, Korean Attention Deficit Hyperactivity Disorder scale, and the Behavioral Inhibition and Activation Scales (BIS-BAS, respectively. The SS allele was associated with increased FC within the DMN, including the middle prefrontal cortex (MPFC to the posterior cingulate cortex, and within the salience network, including the right supramarginal gyrus (SMG to the right rostral prefrontal cortex (RPFC, right anterior insular (AInsular to right SMG, anterior cingulate cortex (ACC to left RPFC, and left AInsular to right RPFC, and between the DMN and salience network, including the MPFC to the ACC. In addition, the FC from the MPFC to ACC positively correlated with the BIS and YIAS scores in the SS allele group. The SS allele of 5HTTLPR might modulate the FC within and between the DMN and salience network, which may ultimately be a risk factor for impulsive Internet game play in patients with MDD.

  8. Impulsive Internet Game Play Is Associated With Increased Functional Connectivity Between the Default Mode and Salience Networks in Depressed Patients With Short Allele of Serotonin Transporter Gene.

    Science.gov (United States)

    Hong, Ji Sun; Kim, Sun Mi; Bae, Sujin; Han, Doug Hyun

    2018-01-01

    Problematic Internet game play is often accompanied by major depressive disorder (MDD). Depression seems to be closely related to altered functional connectivity (FC) within (and between) the default mode network (DMN) and salience network. In addition, serotonergic neurotransmission may regulate the symptoms of depression, including impulsivity, potentially by modulating the DMN. We hypothesized that altered connectivity between the DMN and salience network could mediate an association between the 5HTTLPR genotype and impulsivity in patients with depression. A total of 54 participants with problematic Internet game play and MDD completed the research protocol. We genotyped for 5HTTLPR and assessed the DMN FC using resting-state functional magnetic resonance imaging. The severity of Internet game play, depressive symptoms, anxiety, attention and impulsivity, and behavioral inhibition and activation were assessed using the Young Internet Addiction Scale (YIAS), Beck Depressive Inventory, Beck Anxiety Inventory (BAI), Korean Attention Deficit Hyperactivity Disorder scale, and the Behavioral Inhibition and Activation Scales (BIS-BAS), respectively. The SS allele was associated with increased FC within the DMN, including the middle prefrontal cortex (MPFC) to the posterior cingulate cortex, and within the salience network, including the right supramarginal gyrus (SMG) to the right rostral prefrontal cortex (RPFC), right anterior insular (AInsular) to right SMG, anterior cingulate cortex (ACC) to left RPFC, and left AInsular to right RPFC, and between the DMN and salience network, including the MPFC to the ACC. In addition, the FC from the MPFC to ACC positively correlated with the BIS and YIAS scores in the SS allele group. The SS allele of 5HTTLPR might modulate the FC within and between the DMN and salience network, which may ultimately be a risk factor for impulsive Internet game play in patients with MDD.

  9. Generic Properties of Random Gene Regulatory Networks.

    Science.gov (United States)

    Li, Zhiyuan; Bianco, Simone; Zhang, Zhaoyang; Tang, Chao

    2013-12-01

    Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although there has been much work focusing on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear typically how attractors partition among the three types of orbits: steady state, periodic and chaotic, and how the dynamical properties change with network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivity, fraction of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedding motifs of small sizes, and therefore can in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.

  10. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC, is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases posses an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  11. Hybrid stochastic simplifications for multiscale gene networks

    Directory of Open Access Journals (Sweden)

    Debussche Arnaud

    2009-09-01

    Full Text Available Abstract Background Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. Results We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion 123 which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Conclusion Hybrid simplifications can be used for onion-like (multi-layered approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach.

  12. A powerful score-based test statistic for detecting gene-gene co-association.

    Science.gov (United States)

    Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

    2016-01-29

    The genetic variants identified by Genome-wide association study (GWAS) can only account for a small proportion of the total heritability for complex disease. The existence of gene-gene joint effects which contains the main effects and their co-association is one of the possible explanations for the "missing heritability" problems. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, not only due to the traditional interaction under nearly independent condition but the correlation between genes. Generally, genes tend to work collaboratively within specific pathway or network contributing to the disease and the specific disease-associated locus will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS has the better performance than other existed methods including single SNP-based and principle component analysis (PCA)-based logistic regression model, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and delta-square (δ (2)) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

  13. Gene regulatory networks elucidating huanglongbing disease mechanisms.

    Directory of Open Access Journals (Sweden)

    Federico Martinelli

    Full Text Available Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus liberibacter asiaticus (CaLas, especially the immune disregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein - protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation, sucrose metabolism (upregulation, and starch biosynthesis (upregulation. In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70 was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur.

  14. Construction of coffee transcriptome networks based on gene annotation semantics

    Directory of Open Access Journals (Sweden)

    Castillo Luis F.

    2012-12-01

    Full Text Available Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST and Interproscan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained along the Wordnet dictionary in order to construct a Resource Description Framework (RDF using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.

  15. Decoupling Linear and Nonlinear Associations of Gene Expression

    KAUST Repository

    Itakura, Alan

    2013-01-01

    The FANTOM consortium has generated a large gene expression dataset of different cell lines and tissue cultures using the single-molecule sequencing technology of HeliscopeCAGE. This provides a unique opportunity to investigate novel associations between gene expression over time and different cell types. Here, we create a MatLab wrapper for a powerful and computationally intensive set of statistics known as Maximal Information Coefficient, and then calculate this statistic for a large, comprehensive dataset containing gene expression of a variety of differentiating tissues. We then distinguish between linear and nonlinear associations, and then create gene association networks. Following this analysis, we are then able to identify clusters of linear gene associations that then associate nonlinearly with other clusters of linearity, providing insight to much more complex connections between gene expression patterns than previously anticipated.

  16. Decoupling Linear and Nonlinear Associations of Gene Expression

    KAUST Repository

    Itakura, Alan

    2013-05-01

    The FANTOM consortium has generated a large gene expression dataset of different cell lines and tissue cultures using the single-molecule sequencing technology of HeliscopeCAGE. This provides a unique opportunity to investigate novel associations between gene expression over time and different cell types. Here, we create a MatLab wrapper for a powerful and computationally intensive set of statistics known as Maximal Information Coefficient, and then calculate this statistic for a large, comprehensive dataset containing gene expression of a variety of differentiating tissues. We then distinguish between linear and nonlinear associations, and then create gene association networks. Following this analysis, we are then able to identify clusters of linear gene associations that then associate nonlinearly with other clusters of linearity, providing insight to much more complex connections between gene expression patterns than previously anticipated.

  17. Genome-wide identification of key modulators of gene-gene interaction networks in breast cancer.

    Science.gov (United States)

    Chiu, Yu-Chiao; Wang, Li-Ju; Hsiao, Tzu-Hung; Chuang, Eric Y; Chen, Yidong

    2017-10-03

    With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulate gene interaction, is composed of gene pairs of which interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed the functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the encoding gene of ER, appeared in the list, and abundant novel modulators were illuminated. For instance, a prognostic predictor of breast cancer, SFRP1, was found the second modulator. Functional annotation analysis of the 973 modulators revealed involvements in ER-related cellular processes as well as immune- and tumor-associated functions. Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of filtering parameters as well as the conservativity of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.

  18. Cell cycle- and cancer-associated gene networks activated by Dsg2: evidence of cystatin A deregulation and a potential role in cell-cell adhesion.

    Directory of Open Access Journals (Sweden)

    Abhilasha Gupta

    Full Text Available Cell-cell adhesion is paramount in providing and maintaining multicellular structure and signal transmission between cells. In the skin, disruption to desmosomal regulated intercellular connectivity may lead to disorders of keratinization and hyperproliferative disease including cancer. Recently we showed transgenic mice overexpressing desmoglein 2 (Dsg2 in the epidermis develop hyperplasia. Following microarray and gene network analysis, we demonstrate that Dsg2 caused a profound change in the transcriptome of keratinocytes in vivo and altered a number of genes important in epithelial dysplasia including: calcium-binding proteins (S100A8 and S100A9, members of the cyclin protein family, and the cysteine protease inhibitor cystatin A (CSTA. CSTA is deregulated in several skin cancers, including squamous cell carcinomas (SCC and loss of function mutations lead to recessive skin fragility disorders. The microarray results were confirmed by qPCR, immunoblotting, and immunohistochemistry. CSTA was detected at high level throughout the newborn mouse epidermis but dramatically decreased with development and was detected predominantly in the differentiated layers. In human keratinocytes, knockdown of Dsg2 by siRNA or shRNA reduced CSTA expression. Furthermore, siRNA knockdown of CSTA resulted in cytoplasmic localization of Dsg2, perturbed cytokeratin 14 staining and reduced levels of desmoplakin in response to mechanical stretching. Both knockdown of either Dsg2 or CSTA induced loss of cell adhesion in a dispase-based assay and the effect was synergistic. Our findings here offer a novel pathway of CSTA regulation involving Dsg2 and a potential crosstalk between Dsg2 and CSTA that modulates cell adhesion. These results further support the recent human genetic findings that loss of function mutations in the CSTA gene result in skin fragility due to impaired cell-cell adhesion: autosomal-recessive exfoliative ichthyosis or acral peeling skin syndrome.

  19. Statistical indicators of collective behavior and functional clusters in gene networks of yeast

    Science.gov (United States)

    Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

    2006-03-01

    We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.

  20. Portrait of Candida Species Biofilm Regulatory Network Genes.

    Science.gov (United States)

    Araújo, Daniela; Henriques, Mariana; Silva, Sónia

    2017-01-01

    Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Protein Annotation from Protein Interaction Networks and Gene Ontology

    OpenAIRE

    Nguyen, Cao D.; Gardiner, Katheleen J.; Cios, Krzysztof J.

    2011-01-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precis...

  2. Grouping by association: using associative networks for document categorization

    NARCIS (Netherlands)

    Bloom, Niels

    2015-01-01

    In this thesis we describe a method of using associative networks for automatic doc- ument grouping. Associative networks are networks of ideas or concepts in which each concept is linked to concepts that are semantically similar to it. By activating concepts in the network based on the text of a

  3. Paper-based synthetic gene networks.

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

    2014-11-06

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.

  4. Paper-based Synthetic Gene Networks

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  5. Neural Inductive Matrix Completion for Predicting Disease-Gene Associations

    KAUST Repository

    Hou, Siqing

    2018-05-21

    In silico prioritization of undiscovered associations can help find causal genes of newly discovered diseases. Some existing methods are based on known associations, and side information of diseases and genes. We exploit the possibility of using a neural network model, Neural inductive matrix completion (NIMC), in disease-gene prediction. Comparing to the state-of-the-art inductive matrix completion method, using neural networks allows us to learn latent features from non-linear functions of input features. Previous methods use disease features only from mining text. Comparing to text mining, disease ontology is a more informative way of discovering correlation of dis- eases, from which we can calculate the similarities between diseases and help increase the performance of predicting disease-gene associations. We compare the proposed method with other state-of-the-art methods for pre- dicting associated genes for diseases from the Online Mendelian Inheritance in Man (OMIM) database. Results show that both new features and the proposed NIMC model can improve the chance of recovering an unknown associated gene in the top 100 predicted genes. Best results are obtained by using both the new features and the new model. Results also show the proposed method does better in predicting associated genes for newly discovered diseases.

  6. Identification of the IGF1/PI3K/NF κB/ERK gene signalling networks associated with chemotherapy resistance and treatment response in high-grade serous epithelial ovarian cancer

    International Nuclear Information System (INIS)

    Koti, Madhuri; Evans, Kenneth; Feilotter, Harriet E; Park, Paul C; Squire, Jeremy A; Gooding, Robert J; Nuin, Paulo; Haslehurst, Alexandria; Crane, Colleen; Weberpals, Johanne; Childs, Timothy; Bryson, Peter; Dharsee, Moyez

    2013-01-01

    Resistance to platinum-based chemotherapy remains a major impediment in the treatment of serous epithelial ovarian cancer. The objective of this study was to use gene expression profiling to delineate major deregulated pathways and biomarkers associated with the development of intrinsic chemotherapy resistance upon exposure to standard first-line therapy for ovarian cancer. The study cohort comprised 28 patients divided into two groups based on their varying sensitivity to first-line chemotherapy using progression free survival (PFS) as a surrogate of response. All 28 patients had advanced stage, high-grade serous ovarian cancer, and were treated with standard platinum-based chemotherapy. Twelve patient tumours demonstrating relative resistance to platinum chemotherapy corresponding to shorter PFS (< eight months) were compared to sixteen tumours from platinum-sensitive patients (PFS > eighteen months). Whole transcriptome profiling was performed using an Affymetrix high-resolution microarray platform to permit global comparisons of gene expression profiles between tumours from the resistant group and the sensitive group. Microarray data analysis revealed a set of 204 discriminating genes possessing expression levels which could influence differential chemotherapy response between the two groups. Robust statistical testing was then performed which eliminated a dependence on the normalization algorithm employed, producing a restricted list of differentially regulated genes, and which found IGF1 to be the most strongly differentially expressed gene. Pathway analysis, based on the list of 204 genes, revealed enrichment in genes primarily involved in the IGF1/PI3K/NF κB/ERK gene signalling networks. This study has identified pathway specific prognostic biomarkers possibly underlying a differential chemotherapy response in patients undergoing standard platinum-based treatment of serous epithelial ovarian cancer. In addition, our results provide a pathway context for

  7. Integration of biological networks and gene expression data using Cytoscape

    DEFF Research Database (Denmark)

    Cline, M.S.; Smoot, M.; Cerami, E.

    2007-01-01

    of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules......Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context...... and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape....

  8. Biomarker Gene Signature Discovery Integrating Network Knowledge

    Directory of Open Access Journals (Sweden)

    Holger Fröhlich

    2012-02-01

    Full Text Available Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards a better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle for making gene signatures a standard tool in clinical diagnosis is the typical low reproducibility of these signatures combined with the difficulty to achieve a clear biological interpretation. For that purpose in the last years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview about so-far proposed approaches.

  9. Prioritization of epilepsy associated candidate genes by convergent analysis.

    Science.gov (United States)

    Jia, Peilin; Ewers, Jeffrey M; Zhao, Zhongming

    2011-02-24

    Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be

  10. Prioritization of epilepsy associated candidate genes by convergent analysis.

    Directory of Open Access Journals (Sweden)

    Peilin Jia

    2011-02-01

    Full Text Available Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs that are more likely to be associated with epilepsy. The responsible gene(s within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research.In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways.The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The

  11. The impact of gene expression variation on the robustness and evolvability of a developmental gene regulatory network.

    Directory of Open Access Journals (Sweden)

    David A Garfield

    2013-10-01

    Full Text Available Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear, allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.

  12. Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon

    Directory of Open Access Journals (Sweden)

    Satoru Koda

    2017-11-01

    Full Text Available We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX model with a group smoothly clipped absolute deviation (SCAD method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factors encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.

  13. Population genomics of the Arabidopsis thaliana flowering time gene network.

    Science.gov (United States)

    Flowers, Jonathan M; Hanzawa, Yoshie; Hall, Megan C; Moore, Richard C; Purugganan, Michael D

    2009-11-01

    The time to flowering is a key component of the life-history strategy of the model plant Arabidopsis thaliana that varies quantitatively among genotypes. A significant problem for evolutionary and ecological genetics is to understand how natural selection may operate on this ecologically significant trait. Here, we conduct a population genomic study of resequencing data from 52 genes in the flowering time network. McDonald-Kreitman tests of neutrality suggested a strong excess of amino acid polymorphism when pooling across loci. This excess of replacement polymorphism across the flowering time network and a skewed derived frequency spectrum toward rare alleles for both replacement and noncoding polymorphisms relative to synonymous changes is consistent with a large class of deleterious polymorphisms segregating in these genes. Assuming selective neutrality of synonymous changes, we estimate that approximately 30% of amino acid polymorphisms are deleterious. Evidence of adaptive substitution is less prominent in our analysis. The photoperiod regulatory gene, CO, and a gibberellic acid transcription factor, AtMYB33, show evidence of adaptive fixation of amino acid mutations. A test for extended haplotypes revealed no examples of flowering time alleles with haplotypes comparable in length to those associated with the null fri(Col) allele reported previously. This suggests that the FRI gene likely has a uniquely intense or recent history of selection among the flowering time genes considered here. Although there is some evidence for adaptive evolution in these life-history genes, it appears that slightly deleterious polymorphisms are a major component of natural molecular variation in the flowering time network of A. thaliana.

  14. Associative memory in phasing neuron networks

    Energy Technology Data Exchange (ETDEWEB)

    Nair, Niketh S [ORNL; Bochove, Erik J. [United States Air Force Research Laboratory, Kirtland Air Force Base; Braiman, Yehuda [ORNL

    2014-01-01

    We studied pattern formation in a network of coupled Hindmarsh-Rose model neurons and introduced a new model for associative memory retrieval using networks of Kuramoto oscillators. Hindmarsh-Rose Neural Networks can exhibit a rich set of collective dynamics that can be controlled by their connectivity. Specifically, we showed an instance of Hebb's rule where spiking was correlated with network topology. Based on this, we presented a simple model of associative memory in coupled phase oscillators.

  15. Gene regulation is governed by a core network in hepatocellular carcinoma.

    Science.gov (United States)

    Gu, Zuguang; Zhang, Chenyu; Wang, Jin

    2012-05-01

    Hepatocellular carcinoma (HCC) is one of the most lethal cancers worldwide, and the mechanisms that lead to the disease are still relatively unclear. However, with the development of high-throughput technologies it is possible to gain a systematic view of biological systems to enhance the understanding of the roles of genes associated with HCC. Thus, analysis of the mechanism of molecule interactions in the context of gene regulatory networks can reveal specific sub-networks that lead to the development of HCC. In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our method for constructing gene regulatory network is based on predicted target interactions, experimentally-supported interactions, and co-expression model. Regulators in the network included both transcription factors and microRNAs to provide a complete view of gene regulation. Analysis of gene regulatory network revealed that gene regulation in HCC is highly modular, in which different sets of regulators take charge of specific biological processes. We found that microRNAs mainly control biological functions related to mitochondria and oxidative reduction, while transcription factors control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core network that organizes regulations between different modules and maintains the robustness of the whole network. There is direct experimental evidence for most of the regulators in the core gene regulatory network relating to HCC. We infer it is the central controller of gene regulation. Finally, we explored the influence of the core gene regulatory network on biological pathways. Our analysis provides insights into the mechanism of transcriptional and post-transcriptional control in HCC. In particular, we highlight the importance of the core gene regulatory network; we propose that it is highly related to HCC and we believe further

  16. Robust gene network analysis reveals alteration of the STAT5a network as a hallmark of prostate cancer.

    Science.gov (United States)

    Reddy, Anupama; Huang, C Chris; Liu, Huiqing; Delisi, Charles; Nevalainen, Marja T; Szalma, Sandor; Bhanot, Gyan

    2010-01-01

    We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low low (cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-lα is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and marks disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer related genes, known to be involved in functional pathways associated with tumorigenesis. Our method is general and can easily

  17. The DOPA decarboxylase (DDC) gene is associated with alerting attention.

    Science.gov (United States)

    Zhu, Bi; Chen, Chuansheng; Moyzis, Robert K; Dong, Qi; Chen, Chunhui; He, Qinghua; Li, Jin; Li, Jun; Lei, Xuemei; Lin, Chongde

    2013-06-03

    DOPA decarboxylase (DDC) is involved in the synthesis of dopamine, norepinephrine and serotonin. It has been suggested that genes involved in the dopamine, norepinephrine, and cholinergic systems play an essential role in the efficiency of human attention networks. Attention refers to the cognitive process of obtaining and maintaining the alert state, orienting to sensory events, and regulating the conflicts of thoughts and behavior. The present study tested seven single nucleotide polymorphisms (SNPs) within the DDC gene for association with attention, which was assessed by the Attention Network Test to detect three networks of attention, including alerting, orienting, and executive attention, in a healthy Han Chinese sample (N=451). Association analysis for individual SNPs indicated that four of the seven SNPs (rs3887825, rs7786398, rs10499695, and rs6969081) were significantly associated with alerting attention. Haplotype-based association analysis revealed that alerting was associated with the haplotype G-A-T for SNPs rs7786398-rs10499695-rs6969081. These associations remained significant after correcting for multiple testing by max(T) permutation. No association was found for orienting and executive attention. This study provides the first evidence for the involvement of the DDC gene in alerting attention. A better understanding of the genetic basis of distinct attention networks would allow us to develop more effective diagnosis, treatment, and prevention of deficient or underdeveloped alerting attention as well as its related prevalent neuropsychiatric disorders. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. Co-regulation of metabolic genes is better explained by flux coupling than by network distance.

    Directory of Open Access Journals (Sweden)

    Richard A Notebaart

    2008-01-01

    Full Text Available To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools.

  19. Association and Centrality in Criminal Networks

    DEFF Research Database (Denmark)

    Petersen, Rasmus Rosenqvist

    Network-based techniques are widely used in criminal investigations because patterns of association are actionable and understandable. Existing network models with nodes as first class entities and their related measures (e.g., social networks and centrality measures) are unable to capture...

  20. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  1. Evolving chromosomes and gene regulatory networks

    Indian Academy of Sciences (India)

    Aswin

    Genes under H NS control can be. (a) regulated by H NS. (b) regulated by H NS and StpA. Because backup by StpA is partial. Page 19. Gene expression level. H NS regulated xenogenes. Other genes. Page 20 ... recollect: H&NS silences highl transcribable genes. Gene expression level unilateral. Other genes epistatic ...

  2. Unveiling network-based functional features through integration of gene expression into protein networks.

    Science.gov (United States)

    Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

    2018-06-01

    Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Network-based prediction and knowledge mining of disease genes.

    Science.gov (United States)

    Carson, Matthew B; Lu, Hui

    2015-01-01

    In recent years, high-throughput protein interaction identification methods have generated a large amount of data. When combined with the results from other in vivo and in vitro experiments, a complex set of relationships between biological molecules emerges. The growing popularity of network analysis and data mining has allowed researchers to recognize indirect connections between these molecules. Due to the interdependent nature of network entities, evaluating proteins in this context can reveal relationships that may not otherwise be evident. We examined the human protein interaction network as it relates to human illness using the Disease Ontology. After calculating several topological metrics, we trained an alternating decision tree (ADTree) classifier to identify disease-associated proteins. Using a bootstrapping method, we created a tree to highlight conserved characteristics shared by many of these proteins. Subsequently, we reviewed a set of non-disease-associated proteins that were misclassified by the algorithm with high confidence and searched for evidence of a disease relationship. Our classifier was able to predict disease-related genes with 79% area under the receiver operating characteristic (ROC) curve (AUC), which indicates the tradeoff between sensitivity and specificity and is a good predictor of how a classifier will perform on future data sets. We found that a combination of several network characteristics including degree centrality, disease neighbor ratio, eccentricity, and neighborhood connectivity help to distinguish between disease- and non-disease-related proteins. Furthermore, the ADTree allowed us to understand which combinations of strongly predictive attributes contributed most to protein-disease classification. In our post-processing evaluation, we found several examples of potential novel disease-related proteins and corresponding literature evidence. In addition, we showed that first- and second-order neighbors in the PPI network

  4. Protein annotation from protein interaction networks and Gene Ontology.

    Science.gov (United States)

    Nguyen, Cao D; Gardiner, Katheleen J; Cios, Krzysztof J

    2011-10-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precision and 60% recall versus 45% and 26% for Majority and 24% and 61% for χ²-statistics, respectively. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. Uncovering co-expression gene network modules regulating fruit acidity in diverse apples.

    Science.gov (United States)

    Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong

    2015-08-16

    Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, underpinning which Ma1 that putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inferring using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P acidity. Overall, this study provides important insight into the Ma1-mediated gene network controlling acidity in mature apple fruit of diverse genetic background.

  6. Signal Network Analysis of Plant Genes Responding to Ionizing Radiation

    International Nuclear Information System (INIS)

    Kim, Dong Sub; Kim, Jinbaek; Kim, Sang Hoon

    2012-12-01

    In this project, we irradiated Arabidopsis plants with various doses of gamma-rays at the vegetative and reproductive stages to assess their radiation sensitivity. After the gene expression profiles and an analysis of the antioxidant response, we selected several Arabidopsis genes for uses of 'Radio marker genes (RMG)' and conducted over-expression and knock-down experiments to confirm the radio sensitivity. Based on these results, we applied two patents for the detection of two RMG (At3g28210 and At4g37990) and development of transgenic plants. Also, we developed a Genechip for use of high-throughput screening of Arabidopsis genes responding only to ionizing radiation and identified RMG to detect radiation leaks. Based on these results, we applied two patents associated with the use of Genechip for different types of radiation and different growth stages. Also, we conducted co-expression network study of specific expressed probes against gamma-ray stress and identified expressed patterns of duplicated genes formed by whole/500kb segmental genome duplication

  7. Hierarchical document categorization using associative networks

    NARCIS (Netherlands)

    Bloom, Niels; Theune, Mariet; de Jong, Franciska M.G.; Klement, E.P.; Borutzky, W.; Fahringer, T.; Hamza, M.H.; Uskov, V.

    Associative networks are a connectionist language model with the ability to handle dynamic data. We used two associative networks to categorize random sets of related Wikipedia articles with only their raw text as input. We then compared the resulting categorization to a gold standard: the manual

  8. Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

    Science.gov (United States)

    Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

    2012-01-01

    Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to

  9. Inference of Cancer-specific Gene Regulatory Networks Using Soft Computing Rules

    Directory of Open Access Journals (Sweden)

    Xiaosheng Wang

    2010-03-01

    Full Text Available Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  10. Inference of cancer-specific gene regulatory networks using soft computing rules.

    Science.gov (United States)

    Wang, Xiaosheng; Gotoh, Osamu

    2010-03-24

    Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  11. A Regulatory Network Analysis of Orphan Genes in Arabidopsis Thaliana

    Science.gov (United States)

    Singh, Pramesh; Chen, Tianlong; Arendsee, Zebulun; Wurtele, Eve S.; Bassler, Kevin E.

    Orphan genes, which are genes unique to each particular species, have recently drawn significant attention for their potential usefulness for organismal robustness. Their origin and regulatory interaction patterns remain largely undiscovered. Recently, methods that use the context likelihood of relatedness to infer a network followed by modularity maximizing community detection algorithms on the inferred network to find the functional structure of regulatory networks were shown to be effective. We apply improved versions of these methods to gene expression data from Arabidopsis thaliana, identify groups (clusters) of interacting genes with related patterns of expression and analyze the structure within those groups. Focusing on clusters that contain orphan genes, we compare the identified clusters to gene ontology (GO) terms, regulons, and pathway designations and analyze their hierarchical structure. We predict new regulatory interactions and unravel the structure of the regulatory interaction patterns of orphan genes. Work supported by the NSF through Grants DMR-1507371 and IOS-1546858.

  12. Multistability in bidirectional associative memory neural networks

    International Nuclear Information System (INIS)

    Huang Gan; Cao Jinde

    2008-01-01

    In this Letter, the multistability issue is studied for Bidirectional Associative Memory (BAM) neural networks. Based on the existence and stability analysis of the neural networks with or without delay, it is found that the 2n-dimensional networks can have 3 n equilibria and 2 n equilibria of them are locally exponentially stable, where each layer of the BAM network has n neurons. Furthermore, the results has been extended to (n+m)-dimensional BAM neural networks, where there are n and m neurons on the two layers respectively. Finally, two numerical examples are presented to illustrate the validity of our results

  13. Multistability in bidirectional associative memory neural networks

    Science.gov (United States)

    Huang, Gan; Cao, Jinde

    2008-04-01

    In this Letter, the multistability issue is studied for Bidirectional Associative Memory (BAM) neural networks. Based on the existence and stability analysis of the neural networks with or without delay, it is found that the 2 n-dimensional networks can have 3 equilibria and 2 equilibria of them are locally exponentially stable, where each layer of the BAM network has n neurons. Furthermore, the results has been extended to (n+m)-dimensional BAM neural networks, where there are n and m neurons on the two layers respectively. Finally, two numerical examples are presented to illustrate the validity of our results.

  14. Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

    Science.gov (United States)

    Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

    2015-06-01

    Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  15. Coordinations between gene modules control the operation of plant amino acid metabolic networks

    Directory of Open Access Journals (Sweden)

    Galili Gad

    2009-01-01

    Full Text Available Abstract Background Being sessile organisms, plants should adjust their metabolism to dynamic changes in their environment. Such adjustments need particular coordination in branched metabolic networks in which a given metabolite can be converted into multiple other metabolites via different enzymatic chains. In the present report, we developed a novel "Gene Coordination" bioinformatics approach and use it to elucidate adjustable transcriptional interactions of two branched amino acid metabolic networks in plants in response to environmental stresses, using publicly available microarray results. Results Using our "Gene Coordination" approach, we have identified in Arabidopsis plants two oppositely regulated groups of "highly coordinated" genes within the branched Asp-family network of Arabidopsis plants, which metabolizes the amino acids Lys, Met, Thr, Ile and Gly, as well as a single group of "highly coordinated" genes within the branched aromatic amino acid metabolic network, which metabolizes the amino acids Trp, Phe and Tyr. These genes possess highly coordinated adjustable negative and positive expression responses to various stress cues, which apparently regulate adjustable metabolic shifts between competing branches of these networks. We also provide evidence implying that these highly coordinated genes are central to impose intra- and inter-network interactions between the Asp-family and aromatic amino acid metabolic networks as well as differential system interactions with other growth promoting and stress-associated genome-wide genes. Conclusion Our novel Gene Coordination elucidates that branched amino acid metabolic networks in plants are regulated by specific groups of highly coordinated genes that possess adjustable intra-network, inter-network and genome-wide transcriptional interactions. We also hypothesize that such transcriptional interactions enable regulatory metabolic adjustments needed for adaptation to the stresses.

  16. Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Song, Mingzhou (Joe) [New Mexico State University, Las Cruces; Lewis, Chris K. [New Mexico State University, Las Cruces; Lance, Eric [New Mexico State University, Las Cruces; Chesler, Elissa J [ORNL; Kirova, Roumyana [Bristol-Myers Squibb Pharmaceutical Research & Development, NJ; Langston, Michael A [University of Tennessee, Knoxville (UTK); Bergeson, Susan [Texas Tech University, Lubbock

    2009-01-01

    The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from high-throughput transcriptomic data is addressed. A network reconstruction algorithm was developed that uses the statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. Using temporal gene expression data collected from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol, this algorithm identified genes from a major neuronal pathway as putative components of the alcohol response mechanism. Three of these genes have known associations with alcohol in the literature. Several other potentially relevant genes, highlighted and agreeing with independent results from literature mining, may play a role in the response to alcohol. Additional, previously-unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.

  17. Inferring time-varying network topologies from gene expression data.

    Science.gov (United States)

    Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas

    2007-01-01

    Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.

  18. Correlated measurement error hampers association network inference

    NARCIS (Netherlands)

    Kaduk, M.; Hoefsloot, H.C.J.; Vis, D.J.; Reijmers, T.; Greef, J. van der; Smilde, A.K.; Hendriks, M.M.W.B.

    2014-01-01

    Modern chromatography-based metabolomics measurements generate large amounts of data in the form of abundances of metabolites. An increasingly popular way of representing and analyzing such data is by means of association networks. Ideally, such a network can be interpreted in terms of the

  19. Transcriptomic network analysis of micronuclei-related genes: a case study

    DEFF Research Database (Denmark)

    van Leeuwen, D. M.; Pedersen, Marie; Knudsen, Lisbeth E.

    2011-01-01

    checkpoint and aneuploidy. The MN-related gene network was tested against a transcriptomics case study associated with MN measurements. In this case study, transcriptomic data from children and adults differentially exposed to ambient air pollution in the Czech Republic were analysed and visualised......Mechanistically relevant information on responses of humans to xenobiotic exposure in relation to chemically induced biological effects, such as micronuclei (MN) formation can be obtained through large-scale transcriptomics studies. Network analysis may enhance the analysis and visualisation...... of such data. Therefore, this study aimed to develop a 'MN formation' network based on a priori knowledge, by using the pathway tool MetaCore. The gene network contained 27 genes and three gene complexes that are related to processes involved in MN formation, e.g. spindle assembly checkpoint, cell cycle...

  20. A flood-based information flow analysis and network minimization method for gene regulatory networks.

    Science.gov (United States)

    Pavlogiannis, Andreas; Mozhayskiy, Vadim; Tagkopoulos, Ilias

    2013-04-24

    Biological networks tend to have high interconnectivity, complex topologies and multiple types of interactions. This renders difficult the identification of sub-networks that are involved in condition- specific responses. In addition, we generally lack scalable methods that can reveal the information flow in gene regulatory and biochemical pathways. Doing so will help us to identify key participants and paths under specific environmental and cellular context. This paper introduces the theory of network flooding, which aims to address the problem of network minimization and regulatory information flow in gene regulatory networks. Given a regulatory biological network, a set of source (input) nodes and optionally a set of sink (output) nodes, our task is to find (a) the minimal sub-network that encodes the regulatory program involving all input and output nodes and (b) the information flow from the source to the sink nodes of the network. Here, we describe a novel, scalable, network traversal algorithm and we assess its potential to achieve significant network size reduction in both synthetic and E. coli networks. Scalability and sensitivity analysis show that the proposed method scales well with the size of the network, and is robust to noise and missing data. The method of network flooding proves to be a useful, practical approach towards information flow analysis in gene regulatory networks. Further extension of the proposed theory has the potential to lead in a unifying framework for the simultaneous network minimization and information flow analysis across various "omics" levels.

  1. Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.

    Science.gov (United States)

    Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

    2004-03-01

    By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.

  2. Causal structure of oscillations in gene regulatory networks: Boolean analysis of ordinary differential equation attractors.

    Science.gov (United States)

    Sun, Mengyang; Cheng, Xianrui; Socolar, Joshua E S

    2013-06-01

    A common approach to the modeling of gene regulatory networks is to represent activating or repressing interactions using ordinary differential equations for target gene concentrations that include Hill function dependences on regulator gene concentrations. An alternative formulation represents the same interactions using Boolean logic with time delays associated with each network link. We consider the attractors that emerge from the two types of models in the case of a simple but nontrivial network: a figure-8 network with one positive and one negative feedback loop. We show that the different modeling approaches give rise to the same qualitative set of attractors with the exception of a possible fixed point in the ordinary differential equation model in which concentrations sit at intermediate values. The properties of the attractors are most easily understood from the Boolean perspective, suggesting that time-delay Boolean modeling is a useful tool for understanding the logic of regulatory networks.

  3. Cell cycle gene expression networks discovered using systems biology: Significance in carcinogenesis

    Science.gov (United States)

    Scott, RE; Ghule, PN; Stein, JL; Stein, GS

    2015-01-01

    The early stages of carcinogenesis are linked to defects in the cell cycle. A series of cell cycle checkpoints are involved in this process. The G1/S checkpoint that serves to integrate the control of cell proliferation and differentiation is linked to carcinogenesis and the mitotic spindle checkpoint with the development of chromosomal instability. This paper presents the outcome of systems biology studies designed to evaluate if networks of covariate cell cycle gene transcripts exist in proliferative mammalian tissues including mice, rats and humans. The GeneNetwork website that contains numerous gene expression datasets from different species, sexes and tissues represents the foundational resource for these studies (www.genenetwork.org). In addition, WebGestalt, a gene ontology tool, facilitated the identification of expression networks of genes that co-vary with key cell cycle targets, especially Cdc20 and Plk1 (www.bioinfo.vanderbilt.edu/webgestalt). Cell cycle expression networks of such covariate mRNAs exist in multiple proliferative tissues including liver, lung, pituitary, adipose and lymphoid tissues among others but not in brain or retina that have low proliferative potential. Sixty-three covariate cell cycle gene transcripts (mRNAs) compose the average cell cycle network with p = e−13 to e−36. Cell cycle expression networks show species, sex and tissue variability and they are enriched in mRNA transcripts associated with mitosis many of which are associated with chromosomal instability. PMID:25808367

  4. Evolutionary conservation and network structure characterize genes of phenotypic relevance for mitosis in human.

    Directory of Open Access Journals (Sweden)

    Marek Ostaszewski

    Full Text Available The impact of gene silencing on cellular phenotypes is difficult to establish due to the complexity of interactions in the associated biological processes and pathways. A recent genome-wide RNA knock-down study both identified and phenotypically characterized a set of important genes for the cell cycle in HeLa cells. Here, we combine a molecular interaction network analysis, based on physical and functional protein interactions, in conjunction with evolutionary information, to elucidate the common biological and topological properties of these key genes. Our results show that these genes tend to be conserved with their corresponding protein interactions across several species and are key constituents of the evolutionary conserved molecular interaction network. Moreover, a group of bistable network motifs is found to be conserved within this network, which are likely to influence the network stability and therefore the robustness of cellular functioning. They form a cluster, which displays functional homogeneity and is significantly enriched in genes phenotypically relevant for mitosis. Additional results reveal a relationship between specific cellular processes and the phenotypic outcomes induced by gene silencing. This study introduces new ideas regarding the relationship between genotype and phenotype in the context of the cell cycle. We show that the analysis of molecular interaction networks can result in the identification of genes relevant to cellular processes, which is a promising avenue for future research.

  5. Disease candidate gene identification and prioritization using protein interaction networks

    Directory of Open Access Journals (Sweden)

    Aronow Bruce J

    2009-02-01

    Full Text Available Abstract Background Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-protein interaction network (PPIN analyses. Results For the first time, extended versions of the PageRank and HITS algorithms, and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds", and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and also evaluate and compare the performance of our approach. Under appropriate settings – for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and step size 6 for K-Step Markov method – the three methods achieved a comparable AUC value, suggesting a similar performance. Conclusion Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.

  6. Convergent evolution of gene networks by single-gene duplications in higher eukaryotes

    OpenAIRE

    Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

    2004-01-01

    By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix–loop–helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks e...

  7. Genes and Gene Networks Involved in Sodium Fluoride-Elicited Cell Death Accompanying Endoplasmic Reticulum Stress in Oral Epithelial Cells

    Directory of Open Access Journals (Sweden)

    Yoshiaki Tabuchi

    2014-05-01

    Full Text Available Here, to understand the molecular mechanisms underlying cell death induced by sodium fluoride (NaF, we analyzed gene expression patterns in rat oral epithelial ROE2 cells exposed to NaF using global-scale microarrays and bioinformatics tools. A relatively high concentration of NaF (2 mM induced cell death concomitant with decreases in mitochondrial membrane potential, chromatin condensation and caspase-3 activation. Using 980 probe sets, we identified 432 up-regulated and 548 down-regulated genes, that were differentially expressed by >2.5-fold in the cells treated with 2 mM of NaF and categorized them into 4 groups by K-means clustering. Ingenuity® pathway analysis revealed several gene networks from gene clusters. The gene networks Up-I and Up-II included many up-regulated genes that were mainly associated with the biological function of induction or prevention of cell death, respectively, such as Atf3, Ddit3 and Fos (for Up-I and Atf4 and Hspa5 (for Up-II. Interestingly, knockdown of Ddit3 and Hspa5 significantly increased and decreased the number of viable cells, respectively. Moreover, several endoplasmic reticulum (ER stress-related genes including, Ddit3, Atf4 and Hapa5, were observed in these gene networks. These findings will provide further insight into the molecular mechanisms of NaF-induced cell death accompanying ER stress in oral epithelial cells.

  8. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    Science.gov (United States)

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  9. Tissue-specific functional networks for prioritizing phenotype and disease genes.

    Directory of Open Access Journals (Sweden)

    Yuanfang Guan

    Full Text Available Integrated analyses of functional genomics data have enormous potential for identifying phenotype-associated genes. Tissue-specificity is an important aspect of many genetic diseases, reflecting the potentially different roles of proteins and pathways in diverse cell lineages. Accounting for tissue specificity in global integration of functional genomics data is challenging, as "functionality" and "functional relationships" are often not resolved for specific tissue types. We address this challenge by generating tissue-specific functional networks, which can effectively represent the diversity of protein function for more accurate identification of phenotype-associated genes in the laboratory mouse. Specifically, we created 107 tissue-specific functional relationship networks through integration of genomic data utilizing knowledge of tissue-specific gene expression patterns. Cross-network comparison revealed significantly changed genes enriched for functions related to specific tissue development. We then utilized these tissue-specific networks to predict genes associated with different phenotypes. Our results demonstrate that prediction performance is significantly improved through using the tissue-specific networks as compared to the global functional network. We used a testis-specific functional relationship network to predict genes associated with male fertility and spermatogenesis phenotypes, and experimentally confirmed one top prediction, Mbyl1. We then focused on a less-common genetic disease, ataxia, and identified candidates uniquely predicted by the cerebellum network, which are supported by both literature and experimental evidence. Our systems-level, tissue-specific scheme advances over traditional global integration and analyses and establishes a prototype to address the tissue-specific effects of genetic perturbations, diseases and drugs.

  10. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.

  11. A network approach to predict pathogenic genes for Fusarium graminearum.

    Science.gov (United States)

    Liu, Xiaoping; Tang, Wei-Hua; Zhao, Xing-Ming; Chen, Luonan

    2010-10-04

    Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other pathogenic fungi, which

  12. Medusa structure of the gene regulatory network: dominance of transcription factors in cancer subtype classification.

    Science.gov (United States)

    Guo, Yuchun; Feng, Ying; Trivedi, Niraj S; Huang, Sui

    2011-05-01

    Gene expression profiles consisting of ten thousands of transcripts are used for clustering of tissue, such as tumors, into subtypes, often without considering the underlying reason that the distinct patterns of expression arise because of constraints in the realization of gene expression profiles imposed by the gene regulatory network. The topology of this network has been suggested to consist of a regulatory core of genes represented most prominently by transcription factors (TFs) and microRNAs, that influence the expression of other genes, and of a periphery of 'enslaved' effector genes that are regulated but not regulating. This 'medusa' architecture implies that the core genes are much stronger determinants of the realized gene expression profiles. To test this hypothesis, we examined the clustering of gene expression profiles into known tumor types to quantitatively demonstrate that TFs, and even more pronounced, microRNAs, are much stronger discriminators of tumor type specific gene expression patterns than a same number of randomly selected or metabolic genes. These findings lend support to the hypothesis of a medusa architecture and of the canalizing nature of regulation by microRNAs. They also reveal the degree of freedom for the expression of peripheral genes that are less stringently associated with a tissue type specific global gene expression profile.

  13. Association of rs1801157 single nucleotide polymorphism of CXCL12 gene in breast cancer in Pakistan and in-silico expression analysis of CXCL12–CXCR4 associated biological regulatory network

    Directory of Open Access Journals (Sweden)

    Samra Khalid

    2017-09-01

    Full Text Available Background C-X-C chemokine ligand 12 (CXCL12 has important implications in breast cancer (BC pathogenesis. It is selectively expressed on B and T lymphocytes and is involved in hematopoiesis, thymocyte trafficking, stem cell motility, neovascularization, and tumorigenesis. The single nucleotide polymorphism (SNP rs1801157 of CXCL12 gene has been found to be associated with higher risk of BC. Methods Our study focuses on the genotypic and allelic distribution of SNP (rs1801157; G/A in Pakistani population as well as its association with the clinico-pathological features. The association between rs1801157 genotypes (G/A and BC risks was assessed by a multivariate logistic regression (MLR analysis. Genotyping was performed in both healthy individuals and patients of BC using PCR-restriction fragment length polymorphism (PCR-RFLP method. Furthermore, in-silico approaches were adapted to investigate the association of CXCL12 and its receptor CXCR4 with genes/proteins involved in BC signalling. Results Significant differences in allelic and genotypic distribution between BC patients and healthy individuals of genotype (G/G and (A/G (p  0.05 was assessed. In a MLR analysis, a number of variables including age, weight of an individual, affected lymph nodes, hormonal status (estrogen and progesterone receptor, alcohol consumption and family history associated with the GG genotype (GG:AA, odds ratio (OR = 1.30, 95% CI [1.06–1.60] were found to be independent risk factors for BC. Our in-vitro results suggest that genotype GG is possibly increasing the risk of BC in Pakistani cohorts. in-silico analysis finds that CXCL12–CXCR4 is associated with an increased expression of PDZK1, PI3k and Akt which lead the breast tumor towards metastasis. Conclusion Multiple targets such as CXCL12, CXCR4, PDZK1, PI3k and Akt can be inhibited in combined strategies to treat BC metastasis.

  14. Semi-supervised prediction of gene regulatory networks using ...

    Indian Academy of Sciences (India)

    2015-09-28

    Sep 28, 2015 ... Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging ... two types of methods differ primarily based on whether ..... negligible, allowing us to draw the qualitative conclusions .... research will be conducted to develop additional biologically.

  15. Functional modules by relating protein interaction networks and gene expression.

    Science.gov (United States)

    Tornow, Sabine; Mewes, H W

    2003-11-01

    Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships.

  16. On the dynamics of a gene regulatory network

    International Nuclear Information System (INIS)

    Grammaticos, B; Carstea, A S; Ramani, A

    2006-01-01

    We examine the dynamics of a network of genes focusing on a periodic chain of genes, of arbitrary length. We show that within a given class of sigmoids representing the equilibrium probability of the binding of the RNA polymerase to the core promoter, the system possesses a single stable fixed point. By slightly modifying the sigmoid, introducing 'stiffer' forms, we show that it is possible to find network configurations exhibiting bistable behaviour. Our results do not depend crucially on the length of the chain considered: calculations with finite chains lead to similar results. However, a realistic study of regulatory genetic networks would require the consideration of more complex topologies and interactions

  17. Identifying key genes associated with acute myocardial infarction.

    Science.gov (United States)

    Cheng, Ming; An, Shoukuan; Li, Junquan

    2017-10-01

    This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs.

  18. Linear control theory for gene network modeling.

    Science.gov (United States)

    Shin, Yong-Jun; Bleris, Leonidas

    2010-09-16

    Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain) and linear state-space (time domain) can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.

  19. SELANSI: a toolbox for simulation of stochastic gene regulatory networks.

    Science.gov (United States)

    Pájaro, Manuel; Otero-Muras, Irene; Vázquez, Carlos; Alonso, Antonio A

    2018-03-01

    Gene regulation is inherently stochastic. In many applications concerning Systems and Synthetic Biology such as the reverse engineering and the de novo design of genetic circuits, stochastic effects (yet potentially crucial) are often neglected due to the high computational cost of stochastic simulations. With advances in these fields there is an increasing need of tools providing accurate approximations of the stochastic dynamics of gene regulatory networks (GRNs) with reduced computational effort. This work presents SELANSI (SEmi-LAgrangian SImulation of GRNs), a software toolbox for the simulation of stochastic multidimensional gene regulatory networks. SELANSI exploits intrinsic structural properties of gene regulatory networks to accurately approximate the corresponding Chemical Master Equation with a partial integral differential equation that is solved by a semi-lagrangian method with high efficiency. Networks under consideration might involve multiple genes with self and cross regulations, in which genes can be regulated by different transcription factors. Moreover, the validity of the method is not restricted to a particular type of kinetics. The tool offers total flexibility regarding network topology, kinetics and parameterization, as well as simulation options. SELANSI runs under the MATLAB environment, and is available under GPLv3 license at https://sites.google.com/view/selansi. antonio@iim.csic.es. © The Author(s) 2017. Published by Oxford University Press.

  20. Predictive networks: a flexible, open source, web application for integration and analysis of human gene networks.

    Science.gov (United States)

    Haibe-Kains, Benjamin; Olsen, Catharina; Djebbari, Amira; Bontempi, Gianluca; Correll, Mick; Bouton, Christopher; Quackenbush, John

    2012-01-01

    Genomics provided us with an unprecedented quantity of data on the genes that are activated or repressed in a wide range of phenotypes. We have increasingly come to recognize that defining the networks and pathways underlying these phenotypes requires both the integration of multiple data types and the development of advanced computational methods to infer relationships between the genes and to estimate the predictive power of the networks through which they interact. To address these issues we have developed Predictive Networks (PN), a flexible, open-source, web-based application and data services framework that enables the integration, navigation, visualization and analysis of gene interaction networks. The primary goal of PN is to allow biomedical researchers to evaluate experimentally derived gene lists in the context of large-scale gene interaction networks. The PN analytical pipeline involves two key steps. The first is the collection of a comprehensive set of known gene interactions derived from a variety of publicly available sources. The second is to use these 'known' interactions together with gene expression data to infer robust gene networks. The PN web application is accessible from http://predictivenetworks.org. The PN code base is freely available at https://sourceforge.net/projects/predictivenets/.

  1. Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

    Directory of Open Access Journals (Sweden)

    Xing Li

    2014-01-01

    Full Text Available Background. Symptoms and signs (symptoms in brief are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM. To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prediction of the prioritizing genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method gets reliable gene rank list with AUC (area under curve 0.616 in classification. Some novel genes like CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms, which are not recorded in the benchmark data set, but have been reported in recent published literatures. Conclusions. Our study demonstrated that by integrating phenotype-genotype relationships into a complex network framework it provides an effective approach to identify candidate genes of symptoms.

  2. FocusHeuristics - expression-data-driven network optimization and disease gene prediction.

    Science.gov (United States)

    Ernst, Mathias; Du, Yang; Warsow, Gregor; Hamed, Mohamed; Endlich, Nicole; Endlich, Karlhans; Murua Escobar, Hugo; Sklarz, Lisa-Madeleine; Sender, Sina; Junghanß, Christian; Möller, Steffen; Fuellen, Georg; Struckmann, Stephan

    2017-02-16

    To identify genes contributing to disease phenotypes remains a challenge for bioinformatics. Static knowledge on biological networks is often combined with the dynamics observed in gene expression levels over disease development, to find markers for diagnostics and therapy, and also putative disease-modulatory drug targets and drugs. The basis of current methods ranges from a focus on expression-levels (Limma) to concentrating on network characteristics (PageRank, HITS/Authority Score), and both (DeMAND, Local Radiality). We present an integrative approach (the FocusHeuristics) that is thoroughly evaluated based on public expression data and molecular disease characteristics provided by DisGeNet. The FocusHeuristics combines three scores, i.e. the log fold change and another two, based on the sum and difference of log fold changes of genes/proteins linked in a network. A gene is kept when one of the scores to which it contributes is above a threshold. Our FocusHeuristics is both, a predictor for gene-disease-association and a bioinformatics method to reduce biological networks to their disease-relevant parts, by highlighting the dynamics observed in expression data. The FocusHeuristics is slightly, but significantly better than other methods by its more successful identification of disease-associated genes measured by AUC, and it delivers mechanistic explanations for its choice of genes.

  3. A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

    Science.gov (United States)

    Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

    2015-07-01

    In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.

  4. Network-Guided Key Gene Discovery for a Given Cellular Process

    DEFF Research Database (Denmark)

    He, Feng Q; Ollert, Markus

    2018-01-01

    Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis...... have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data...

  5. Inductive matrix completion for predicting gene-disease associations.

    Science.gov (United States)

    Natarajan, Nagarajan; Dhillon, Inderjit S

    2014-06-15

    Most existing methods for predicting causal disease genes rely on specific type of evidence, and are therefore limited in terms of applicability. More often than not, the type of evidence available for diseases varies-for example, we may know linked genes, keywords associated with the disease obtained by mining text, or co-occurrence of disease symptoms in patients. Similarly, the type of evidence available for genes varies-for example, specific microarray probes convey information only for certain sets of genes. In this article, we apply a novel matrix-completion method called Inductive Matrix Completion to the problem of predicting gene-disease associations; it combines multiple types of evidence (features) for diseases and genes to learn latent factors that explain the observed gene-disease associations. We construct features from different biological sources such as microarray expression data and disease-related textual data. A crucial advantage of the method is that it is inductive; it can be applied to diseases not seen at training time, unlike traditional matrix-completion approaches and network-based inference methods that are transductive. Comparison with state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database shows that the proposed approach is substantially better-it has close to one-in-four chance of recovering a true association in the top 100 predictions, compared to the recently proposed Catapult method (second best) that has bigdata.ices.utexas.edu/project/gene-disease. © The Author 2014. Published by Oxford University Press.

  6. Combining many interaction networks to predict gene function and analyze gene lists.

    Science.gov (United States)

    Mostafavi, Sara; Morris, Quaid

    2012-05-01

    In this article, we review how interaction networks can be used alone or in combination in an automated fashion to provide insight into gene and protein function. We describe the concept of a "gene-recommender system" that can be applied to any large collection of interaction networks to make predictions about gene or protein function based on a query list of proteins that share a function of interest. We discuss these systems in general and focus on one specific system, GeneMANIA, that has unique features and uses different algorithms from the majority of other systems. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Multi-Valued Associative Memory Neural Network

    Institute of Scientific and Technical Information of China (English)

    修春波; 刘向东; 张宇河

    2003-01-01

    A novel learning method for multi-valued associative memory network is introduced, which is based on Hebb rule, but utilizes more information. According to the current probe vector, the connection weights matrix could be chosen dynamically. Double-valued and multi-valued associative memory are all realized in our simulation experiment. The experimental results show that the method could enhance the associative success rate.

  8. Chromosome Gene Orientation Inversion Networks (GOINs) of Plasmodium Proteome.

    Science.gov (United States)

    Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert

    2018-03-02

    The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network coined as the gene orientation inversion network (GOIN). GOIN's complex network encodes short- and long-range patterns of inversion of the orientation of pairs of gene in the chromosome. We selected Plasmodium falciparum as a case of study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of gene with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdös-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding for RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and

  9. Spatiotemporal network motif reveals the biological traits of developmental gene regulatory networks in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Kim Man-Sun

    2012-05-01

    Full Text Available Abstract Background Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.

  10. Linear control theory for gene network modeling.

    Directory of Open Access Journals (Sweden)

    Yong-Jun Shin

    Full Text Available Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain and linear state-space (time domain can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.

  11. The Reconstruction and Analysis of Gene Regulatory Networks.

    Science.gov (United States)

    Zheng, Guangyong; Huang, Tao

    2018-01-01

    In post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. For this end, gene regulatory networks (GRNs) are constructed to show relationship between biological molecules, in which the vertices of network denote biological molecules and the edges of network present connection between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). Biologists can understand not only the function of biological molecules but also the organization of components of living cells through interpreting the GRNs, since a gene regulatory network is a comprehensively physiological map of living cells and reflects influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we will review the inference methods of GRN reconstruction and analysis approaches of network structure. As a powerful tool for studying complex diseases and biological processes, the applications of the network method in pathway analysis and disease gene identification will be introduced.

  12. Inferring the gene network underlying the branching of tomato inflorescence.

    Directory of Open Access Journals (Sweden)

    Laura Astola

    Full Text Available The architecture of tomato inflorescence strongly affects flower production and subsequent crop yield. To understand the genetic activities involved, insight into the underlying network of genes that initiate and control the sympodial growth in the tomato is essential. In this paper, we show how the structure of this network can be derived from available data of the expressions of the involved genes. Our approach starts from employing biological expert knowledge to select the most probable gene candidates behind branching behavior. To find how these genes interact, we develop a stepwise procedure for computational inference of the network structure. Our data consists of expression levels from primary shoot meristems, measured at different developmental stages on three different genotypes of tomato. With the network inferred by our algorithm, we can explain the dynamics corresponding to all three genotypes simultaneously, despite their apparent dissimilarities. We also correctly predict the chronological order of expression peaks for the main hubs in the network. Based on the inferred network, using optimal experimental design criteria, we are able to suggest an informative set of experiments for further investigation of the mechanisms underlying branching behavior.

  13. ENEN - European Nuclear Educational Network Association

    International Nuclear Information System (INIS)

    De Regge, P.

    2006-01-01

    After the pioneering initiative of BNEN, the Belgian Nuclear higher Education Network, other countries, e.g. Italy, United Kingdom, Germany, Switzerland, etc., created their own pool of education. At the European level the ENEN Association (European Nuclear Education Network) is a sustainable product generated by an FP5 project. The main objective of the ENEN Association is the preservation and the further development of higher nuclear education and expertise. This objective is realized through the co-operation between European universities, involved in education and research in the nuclear engineering field, nuclear research centres and nuclear industry

  14. Singular Perturbation Analysis and Gene Regulatory Networks with Delay

    Science.gov (United States)

    Shlykova, Irina; Ponosov, Arcady

    2009-09-01

    There are different ways of how to model gene regulatory networks. Differential equations allow for a detailed description of the network's dynamics and provide an explicit model of the gene concentration changes over time. Production and relative degradation rate functions used in such models depend on the vector of steeply sloped threshold functions which characterize the activity of genes. The most popular example of the threshold functions comes from the Boolean network approach, where the threshold functions are given by step functions. The system of differential equations becomes then piecewise linear. The dynamics of this system can be described very easily between the thresholds, but not in the switching domains. For instance this approach fails to analyze stationary points of the system and to define continuous solutions in the switching domains. These problems were studied in [2], [3], but the proposed model did not take into account a time delay in cellular systems. However, analysis of real gene expression data shows a considerable number of time-delayed interactions suggesting that time delay is essential in gene regulation. Therefore, delays may have a great effect on the dynamics of the system presenting one of the critical factors that should be considered in reconstruction of gene regulatory networks. The goal of this work is to apply the singular perturbation analysis to certain systems with delay and to obtain an analog of Tikhonov's theorem, which provides sufficient conditions for constracting the limit system in the delay case.

  15. Learning gene regulatory networks from only positive and unlabeled data

    Directory of Open Access Journals (Sweden)

    Elkan Charles

    2010-05-01

    Full Text Available Abstract Background Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge is typically not available that a given pair of genes do not interact. Results A recent advance in research on data mining is a method capable of learning a classifier from only positive and unlabeled examples, that does not need labeled negative examples. Applied to the reconstruction of gene regulatory networks, we show that this method significantly outperforms the current state of the art of machine learning methods. We assess the new method using both simulated and experimental data, and obtain major performance improvement. Conclusions Compared to unsupervised methods for gene network inference, supervised methods are potentially more accurate, but for training they need a complete set of known regulatory connections. A supervised method that can be trained using only positive and unlabeled data, as presented in this paper, is especially beneficial for the task of inferring gene regulatory networks, because only an incomplete set of known regulatory connections is available in public databases such as RegulonDB, TRRD, KEGG, Transfac, and IPA.

  16. Characterization of differentially expressed genes involved in pathways associated with gastric cancer.

    Directory of Open Access Journals (Sweden)

    Hao Li

    Full Text Available To explore the patterns of gene expression in gastric cancer, a total of 26 paired gastric cancer and noncancerous tissues from patients were enrolled for gene expression microarray analyses. Limma methods were applied to analyze the data, and genes were considered to be significantly differentially expressed if the False Discovery Rate (FDR value was 2. Subsequently, Gene Ontology (GO categories were used to analyze the main functions of the differentially expressed genes. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG database, we found pathways significantly associated with the differential genes. Gene-Act network and co-expression network were built respectively based on the relationships among the genes, proteins and compounds in the database. 2371 mRNAs and 350 lncRNAs considered as significantly differentially expressed genes were selected for the further analysis. The GO categories, pathway analyses and the Gene-Act network showed a consistent result that up-regulated genes were responsible for tumorigenesis, migration, angiogenesis and microenvironment formation, while down-regulated genes were involved in metabolism. These results of this study provide some novel findings on coding RNAs, lncRNAs, pathways and the co-expression network in gastric cancer which will be useful to guide further investigation and target therapy for this disease.

  17. Relaxation rates of gene expression kinetics reveal the feedback signs of autoregulatory gene networks

    Science.gov (United States)

    Jia, Chen; Qian, Hong; Chen, Min; Zhang, Michael Q.

    2018-03-01

    The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.

  18. Gene therapy for Stargardt disease associated with ABCA4 gene.

    Science.gov (United States)

    Han, Zongchao; Conley, Shannon M; Naash, Muna I

    2014-01-01

    Mutations in the photoreceptor-specific flippase ABCA4 lead to accumulation of the toxic bisretinoid A2E, resulting in atrophy of the retinal pigment epithelium (RPE) and death of the photoreceptor cells. Many blinding diseases are associated with these mutations including Stargardt's disease (STGD1), cone-rod dystrophy, retinitis pigmentosa (RP), and increased susceptibility to age-related macular degeneration. There are no curative treatments for any of these dsystrophies. While the monogenic nature of many of these conditions makes them amenable to treatment with gene therapy, the ABCA4 cDNA is 6.8 kb and is thus too large for the AAV vectors which have been most successful for other ocular genes. Here we review approaches to ABCA4 gene therapy including treatment with novel AAV vectors, lentiviral vectors, and non-viral compacted DNA nanoparticles. Lentiviral and compacted DNA nanoparticles in particular have a large capacity and have been successful in improving disease phenotypes in the Abca4 (-/-) murine model. Excitingly, two Phase I/IIa clinical trials are underway to treat patients with ABCA4-associated Startgardt's disease (STGD1). As a result of the development of these novel technologies, effective therapies for ABCA4-associated diseases may finally be within reach.

  19. Machine Learning-Assisted Network Inference Approach to Identify a New Class of Genes that Coordinate the Functionality of Cancer Networks.

    Science.gov (United States)

    Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu

    2017-08-01

    Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escape detection because they are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10 8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes in coordinating functionality in cancer networks and will illuminate our understanding of how genes are modulated in a tissue-specific network contribute to tumorigenesis and therapy development.

  20. Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.

    Science.gov (United States)

    Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju

    2017-04-27

    Our understanding of the molecular mechanisms underlying Alzheimer's disease (AD) remains incomplete. Previous studies have revealed that genetic factors provide a significant contribution to the pathogenesis and development of AD. In the past years, numerous genes implicated in this disease have been identified via genetic association studies on candidate genes or at the genome-wide level. However, in many cases, the roles of these genes and their interactions in AD are still unclear. A comprehensive and systematic analysis focusing on the biological function and interactions of these genes in the context of AD will therefore provide valuable insights to understand the molecular features of the disease. In this study, we collected genes potentially associated with AD by screening publications on genetic association studies deposited in PubMed. The major biological themes linked with these genes were then revealed by function and biochemical pathway enrichment analysis, and the relation between the pathways was explored by pathway crosstalk analysis. Furthermore, the network features of these AD-related genes were analyzed in the context of human interactome and an AD-specific network was inferred using the Steiner minimal tree algorithm. We compiled 430 human genes reported to be associated with AD from 823 publications. Biological theme analysis indicated that the biological processes and biochemical pathways related to neurodevelopment, metabolism, cell growth and/or survival, and immunology were enriched in these genes. Pathway crosstalk analysis then revealed that the significantly enriched pathways could be grouped into three interlinked modules-neuronal and metabolic module, cell growth/survival and neuroendocrine pathway module, and immune response-related module-indicating an AD-specific immune-endocrine-neuronal regulatory network. Furthermore, an AD-specific protein network was inferred and novel genes potentially associated with AD were identified. By

  1. Learning Gene Regulatory Networks Computationally from Gene Expression Data Using Weighted Consensus

    KAUST Repository

    Fujii, Chisato

    2015-04-16

    Gene regulatory networks analyze the relationships between genes allowing us to un- derstand the gene regulatory interactions in systems biology. Gene expression data from the microarray experiments is used to obtain the gene regulatory networks. How- ever, the microarray data is discrete, noisy and non-linear which makes learning the networks a challenging problem and existing gene network inference methods do not give consistent results. Current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However each individual method has an equal contribution to the consen- sus prediction. We have developed a linear programming-based consensus approach which uses learned weights from linear programming among individual methods such that the methods have di↵erent weights depending on their performance. Our result reveals that assigning di↵erent weights to individual methods rather than giving them equal weights improves the performance of the consensus. The linear programming- based consensus method is evaluated and it had the best performance on in silico and Saccharomyces cerevisiae networks, and the second best on the Escherichia coli network outperformed by Inferelator Pipeline method which gives inconsistent results across a wide range of microarray data sets.

  2. Differential Gene Expression in Colon Tissue Associated With Diet, Lifestyle, and Related Oxidative Stress.

    Directory of Open Access Journals (Sweden)

    Martha L Slattery

    Full Text Available Several diet and lifestyle factors may impact health by influencing oxidative stress levels. We hypothesize that level of cigarette smoking, alcohol, anti-inflammatory drugs, and diet alter gene expression. We analyzed RNA-seq data from 144 colon cancer patients who had information on recent cigarette smoking, recent alcohol consumption, diet, and recent aspirin/non-steroidal anti-inflammatory use. Using a false discovery rate of 0.1, we evaluated gene differential expression between high and low levels of exposure using DESeq2. Ingenuity Pathway Analysis (IPA was used to determine networks associated with de-regulated genes in our data. We identified 46 deregulated genes associated with recent cigarette use; these genes enriched causal networks regulated by TEK and MAP2K3. Different differentially expressed genes were associated with type of alcohol intake; five genes were associated with total alcohol, six were associated with beer intake, six were associated with wine intake, and four were associated with liquor consumption. Recent use of aspirin and/or ibuprofen was associated with differential expression of TMC06, ST8SIA4, and STEAP3 while a summary oxidative balance score (OBS was associated with SYCP3, HDX, and NRG4 (all up-regulated with greater oxidative balance. Of the dietary antioxidants and carotenoids evaluated only intake of beta carotene (1 gene, Lutein/Zeaxanthine (5 genes, and Vitamin E (4 genes were associated with differential gene expression. There were similarities in biological function of de-regulated genes associated with various dietary and lifestyle factors. Our data support the hypothesis that diet and lifestyle factors associated with oxidative stress can alter gene expression. However genes altered were unique to type of alcohol and type of antioxidant. Because of potential differences in associations observed between platforms these findings need replication in other populations.

  3. An integer optimization algorithm for robust identification of non-linear gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Chemmangattuvalappil Nishanth

    2012-09-01

    Full Text Available Abstract Background Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, evidence of robust algorithms directly exploiting basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formulism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subjected to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters

  4. The Global Alzheimer's Association Interactive Network.

    Science.gov (United States)

    Toga, Arthur W; Neu, Scott C; Bhatt, Priya; Crawford, Karen L; Ashish, Naveen

    2016-01-01

    The Global Alzheimer's Association Interactive Network (GAAIN) is consolidating the efforts of independent Alzheimer's disease data repositories around the world with the goals of revealing more insights into the causes of Alzheimer's disease, improving treatments, and designing preventative measures that delay the onset of physical symptoms. We developed a system for federating these repositories that is reliant on the tenets that (1) its participants require incentives to join, (2) joining the network is not disruptive to existing repository systems, and (3) the data ownership rights of its members are protected. We are currently in various phases of recruitment with over 55 data repositories in North America, Europe, Asia, and Australia and can presently query >250,000 subjects using GAAIN's search interfaces. GAAIN's data sharing philosophy, which guided our architectural choices, is conducive to motivating membership in a voluntary data sharing network. Copyright © 2016 The Alzheimer's Association. Published by Elsevier Inc. All rights reserved.

  5. [Weighted gene co-expression network analysis in biomedicine research].

    Science.gov (United States)

    Liu, Wei; Li, Li; Ye, Hua; Tu, Wei

    2017-11-25

    High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-system and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. Weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.

  6. Endothelial nitric oxide synthase gene polymorphisms associated ...

    African Journals Online (AJOL)

    Endothelial nitric oxide synthase (NOS3) is involved in key steps of immune response. Genetic factors predispose individuals to periodontal disease. This study's aim was to explore the association between NOS3 gene polymorphisms and clinical parameters in patients with periodontal disease. Genomic DNA was obtained ...

  7. Transcriptional control in the segmentation gene network of Drosophila.

    Directory of Open Access Journals (Sweden)

    Mark D Schroeder

    2004-09-01

    Full Text Available The segmentation gene network of Drosophila consists of maternal and zygotic factors that generate, by transcriptional (cross- regulation, expression patterns of increasing complexity along the anterior-posterior axis of the embryo. Using known binding site information for maternal and zygotic gap transcription factors, the computer algorithm Ahab recovers known segmentation control elements (modules with excellent success and predicts many novel modules within the network and genome-wide. We show that novel module predictions are highly enriched in the network and typically clustered proximal to the promoter, not only upstream, but also in intronic space and downstream. When placed upstream of a reporter gene, they consistently drive patterned blastoderm expression, in most cases faithfully producing one or more pattern elements of the endogenous gene. Moreover, we demonstrate for the entire set of known and newly validated modules that Ahab's prediction of binding sites correlates well with the expression patterns produced by the modules, revealing basic rules governing their composition. Specifically, we show that maternal factors consistently act as activators and that gap factors act as repressors, except for the bimodal factor Hunchback. Our data suggest a simple context-dependent rule for its switch from repressive to activating function. Overall, the composition of modules appears well fitted to the spatiotemporal distribution of their positive and negative input factors. Finally, by comparing Ahab predictions with different categories of transcription factor input, we confirm the global regulatory structure of the segmentation gene network, but find odd skipped behaving like a primary pair-rule gene. The study expands our knowledge of the segmentation gene network by increasing the number of experimentally tested modules by 50%. For the first time, the entire set of validated modules is analyzed for binding site composition under a

  8. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

    Directory of Open Access Journals (Sweden)

    Fei Xiao

    Full Text Available Combining path consistency (PC algorithms with conditional mutual information (CMI are widely used in reconstruction of gene regulatory networks. CMI has many advantages over Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate the direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computation complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference, to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.

  9. Network analysis of inflammatory genes and their transcriptional regulators in coronary artery disease.

    Directory of Open Access Journals (Sweden)

    Jiny Nair

    Full Text Available Network analysis is a novel method to understand the complex pathogenesis of inflammation-driven atherosclerosis. Using this approach, we attempted to identify key inflammatory genes and their core transcriptional regulators in coronary artery disease (CAD. Initially, we obtained 124 candidate genes associated with inflammation and CAD using Polysearch and CADgene database for which protein-protein interaction network was generated using STRING 9.0 (Search Tool for the Retrieval of Interacting Genes and visualized using Cytoscape v 2.8.3. Based on betweenness centrality (BC and node degree as key topological parameters, we identified interleukin-6 (IL-6, vascular endothelial growth factor A (VEGFA, interleukin-1 beta (IL-1B, tumor necrosis factor (TNF and prostaglandin-endoperoxide synthase 2 (PTGS2 as hub nodes. The backbone network constructed with these five hub genes showed 111 nodes connected via 348 edges, with IL-6 having the largest degree and highest BC. Nuclear factor kappa B1 (NFKB1, signal transducer and activator of transcription 3 (STAT3 and JUN were identified as the three core transcription factors from the regulatory network derived using MatInspector. For the purpose of validation of the hub genes, 97 test networks were constructed, which revealed the accuracy of the backbone network to be 0.7763 while the frequency of the hub nodes remained largely unaltered. Pathway enrichment analysis with ClueGO, KEGG and REACTOME showed significant enrichment of six validated CAD pathways - smooth muscle cell proliferation, acute-phase response, calcidiol 1-monooxygenase activity, toll-like receptor signaling, NOD-like receptor signaling and adipocytokine signaling pathways. Experimental verification of the above findings in 64 cases and 64 controls showed increased expression of the five candidate genes and the three transcription factors in the cases relative to the controls (p<0.05. Thus, analysis of complex networks aid in the

  10. Meta-Analysis of Transcriptome Data Related to Hippocampus Biopsies and iPSC-Derived Neuronal Cells from Alzheimer's Disease Patients Reveals an Association with FOXA1 and FOXA2 Gene Regulatory Networks.

    Science.gov (United States)

    Wruck, Wasco; Schröter, Friederike; Adjaye, James

    2016-01-01

    Although the incidence of Alzheimer's disease (AD) is continuously increasing in the aging population worldwide, effective therapies are not available. The interplay between causative genetic and environmental factors is partially understood. Meta-analyses have been performed on aspects such as polymorphisms, cytokines, and cognitive training. Here, we propose a meta-analysis approach based on hierarchical clustering analysis of a reliable training set of hippocampus biopsies, which is condensed to a gene expression signature. This gene expression signature was applied to various test sets of brain biopsies and iPSC-derived neuronal cell models to demonstrate its ability to distinguish AD samples from control. Thus, our identified AD-gene signature may form the basis for determination of biomarkers that are urgently needed to overcome current diagnostic shortfalls. Intriguingly, the well-described AD-related genes APP and APOE are not within the signature because their gene expression profiles show a lower correlation to the disease phenotype than genes from the signature. This is in line with the differing characteristics of the disease as early-/late-onset or with/without genetic predisposition. To investigate the gene signature's systemic role(s), signaling pathways, gene ontologies, and transcription factors were analyzed which revealed over-representation of response to stress, regulation of cellular metabolic processes, and reactive oxygen species. Additionally, our results clearly point to an important role of FOXA1 and FOXA2 gene regulatory networks in the etiology of AD. This finding is in corroboration with the recently reported major role of the dopaminergic system in the development of AD and its regulation by FOXA1 and FOXA2.

  11. Finding gene regulatory network candidates using the gene expression knowledge base.

    Science.gov (United States)

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  12. Inferring the conservative causal core of gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2010-09-01

    Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  13. Inferring the conservative causal core of gene regulatory networks.

    Science.gov (United States)

    Altay, Gökmen; Emmert-Streib, Frank

    2010-09-28

    Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  14. Network Analysis of Human Genes Influencing Susceptibility to Mycobacterial Infections

    Science.gov (United States)

    Lipner, Ettie M.; Garcia, Benjamin J.; Strong, Michael

    2016-01-01

    Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks, involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects. PMID:26751573

  15. Inferring transcriptional gene regulation network of starch metabolism in Arabidopsis thaliana leaves using graphical Gaussian model

    Directory of Open Access Journals (Sweden)

    Ingkasuwan Papapit

    2012-08-01

    Full Text Available Abstract Background Starch serves as a temporal storage of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM. Results Time-series significant analysis was applied for Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF. A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090, which participates in starch degradation in chloroplast, is the most frequently connected gene (a hub gene. The robustness of gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070 and constans-like (COL: At2g21320, were identified as positive regulators of starch synthase 4 (SS4: At4g18240. The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplast and its contained starch granules. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. Conclusions In this study, we utilized a systematic approach of microarray

  16. Changes in the topology of gene expression networks by human immunodeficiency virus type 1 (HIV-1) integration in macrophages.

    Science.gov (United States)

    Soto-Girón, María Juliana; García-Vallejo, Felipe

    2012-01-01

    One key step of human immunodeficiency virus type 1 (HIV-1) infection is the integration of its viral cDNA. This process is mediated through complex networks of host-virus interactions that alter several normal cell functions of the host. To study the complexity of disturbances in cell gene expression networks by HIV-1 integration, we constructed a network of human macrophage genes located close to chromatin regions rich in proviruses. To perform the network analysis, we selected 28 genes previously identified as the target of cDNA integration and their transcriptional profiles were obtained from GEO Profiles (NCBI). A total of 2770 interactions among the 28 genes located around the HIV-1 proviruses in human macrophages formed a highly dense main network connected to five sub-networks. The overall network was significantly enriched by genes associated with signal transduction, cellular communication and regulatory processes. To simulate the effects of HIV-1 integration in infected macrophages, five genes with the most number of interaction in the normal network were turned off by putting in zero the correspondent expression values. The HIV-1 infected network showed changes in its topology and alteration in the macrophage functions reflected in a re-programming of biosynthetic and general metabolic process. Understanding the complex virus-host interactions that occur during HIV-1 integration, may provided valuable genomic information to develop new antiviral treatments focusing on the management of some specific gene expression networks associated with viral integration. This is the first gene network which describes the human macrophages genes interactions related with HIV-1 integration. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

    Science.gov (United States)

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-10-06

    Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.

  18. Estimating immunoregulatory gene networks in human herpesvirus type 6-infected T cells

    International Nuclear Information System (INIS)

    Takaku, Tomoiku; Ohyashiki, Junko H.; Zhang, Yu; Ohyashiki, Kazuma

    2005-01-01

    The immune response to viral infection involves complex network of dynamic gene and protein interactions. We present here the dynamic gene network of the host immune response during human herpesvirus type 6 (HHV-6) infection in an adult T-cell leukemia cell line. Using a pathway-focused oligonucleotide DNA microarray, we found a possible association between chemokine genes regulating Th1/Th2 balance and genes regulating T-cell proliferation during HHV-6B infection. Gene network analysis using an integrated comprehensive workbench, VoyaGene, revealed that a gene encoding a TEC-family kinase, ITK, might be a putative modulator in the host immune response against HHV-6B infection. We conclude that Th2-dominated inflammatory reaction in host cells may play an important role in HHV-6B-infected T cells, thereby suggesting the possibility that ITK might be a therapeutic target in diseases related to dysregulation of Th1/Th2 balance. This study describes a novel approach to find genes related with the complex host-virus interaction using microarray data employing the Bayesian statistical framework

  19. Synchronous versus asynchronous modeling of gene regulatory networks.

    Science.gov (United States)

    Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni

    2008-09-01

    In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.

  20. A network approach to predict pathogenic genes for Fusarium graminearum.

    Directory of Open Access Journals (Sweden)

    Xiaoping Liu

    Full Text Available Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB, which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other

  1. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network.

    Science.gov (United States)

    Kordmahalleh, Mina Moradi; Sefidmazgi, Mohammad Gorji; Harrison, Scott H; Homaifar, Abdollah

    2017-01-01

    The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network

  2. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Science.gov (United States)

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory network, most of them, when applied in the sample-based data, can not reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while former researches either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely gene dependency network. By this way, we have constructed gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network scale free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specially related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for

  3. A systems approach identifies networks and genes linking sleep and stress: implications for neuropsychiatric disorders.

    Science.gov (United States)

    Jiang, Peng; Scarpa, Joseph R; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D; Hao, Ke; Summa, Keith C; Yang, He S; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew

    2015-05-05

    Sleep dysfunction and stress susceptibility are comorbid complex traits that often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multilevel organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J × A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type-specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay among sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework for interrogating the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis.

    Science.gov (United States)

    Brown, William M

    2015-12-01

    Epigenetics is the study of processes--beyond DNA sequence alteration--producing heritable characteristics. For example, DNA methylation modifies gene expression without altering the nucleotide sequence. A well-studied DNA methylation-based phenomenon is genomic imprinting (ie, genotype-independent parent-of-origin effects). We aimed to elucidate: (1) the effect of exercise on DNA methylation and (2) the role of imprinted genes in skeletal muscle gene networks (ie, gene group functional profiling analyses). Gene ontology (ie, gene product elucidation)/meta-analysis. 26 skeletal muscle and 86 imprinted genes were subjected to g:Profiler ontology analysis. Meta-analysis assessed exercise-associated DNA methylation change. g:Profiler found four muscle gene networks with imprinted loci. Meta-analysis identified 16 articles (387 genes/1580 individuals) associated with exercise. Age, method, sample size, sex and tissue variation could elevate effect size bias. Only skeletal muscle gene networks including imprinted genes were reported. Exercise-associated effect sizes were calculated by gene. Age, method, sample size, sex and tissue variation were moderators. Six imprinted loci (RB1, MEG3, UBE3A, PLAGL1, SGCE, INS) were important for muscle gene networks, while meta-analysis uncovered five exercise-associated imprinted loci (KCNQ1, MEG3, GRB10, L3MBTL1, PLAGL1). DNA methylation decreased with exercise (60% of loci). Exercise-associated DNA methylation change was stronger among older people (ie, age accounted for 30% of the variation). Among older people, genes exhibiting DNA methylation decreases were part of a microRNA-regulated gene network functioning to suppress cancer. Imprinted genes were identified in skeletal muscle gene networks and exercise-associated DNA methylation change. Exercise-associated DNA methylation modification could rewind the 'epigenetic clock' as we age. CRD42014009800. Published by the BMJ Publishing Group Limited. For permission to use (where

  5. Local and global responses in complex gene regulation networks

    Science.gov (United States)

    Tsuchiya, Masa; Selvarajoo, Kumar; Piras, Vincent; Tomita, Masaru; Giuliani, Alessandro

    2009-04-01

    An exacerbated sensitivity to apparently minor stimuli and a general resilience of the entire system stay together side-by-side in biological systems. This apparent paradox can be explained by the consideration of biological systems as very strongly interconnected network systems. Some nodes of these networks, thanks to their peculiar location in the network architecture, are responsible for the sensitivity aspects, while the large degree of interconnection is at the basis of the resilience properties of the system. One relevant feature of the high degree of connectivity of gene regulation networks is the emergence of collective ordered phenomena influencing the entire genome and not only a specific portion of transcripts. The great majority of existing gene regulation models give the impression of purely local ‘hard-wired’ mechanisms disregarding the emergence of global ordered behavior encompassing thousands of genes while the general, genome wide, aspects are less known. Here we address, on a data analysis perspective, the discrimination between local and global scale regulations, this goal was achieved by means of the examination of two biological systems: innate immune response in macrophages and oscillating growth dynamics in yeast. Our aim was to reconcile the ‘hard-wired’ local view of gene regulation with a global continuous and scalable one borrowed from statistical physics. This reconciliation is based on the network paradigm in which the local ‘hard-wired’ activities correspond to the activation of specific crucial nodes in the regulation network, while the scalable continuous responses can be equated to the collective oscillations of the network after a perturbation.

  6. Biophysical Constraints Arising from Compositional Context in Synthetic Gene Networks.

    Science.gov (United States)

    Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M

    2017-07-26

    Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

    2016-01-01

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.

  8. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato

    2016-08-25

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.

  9. MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers.

    Science.gov (United States)

    Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier; Lecompte, Odile

    2017-06-16

    The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user's specific interests and provides an efficient way to share information with collaborators. Furthermore, the user's behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. ©Alexis Allot, Kirsley Chennen, Yannis

  10. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the

  11. Rodent Zic Genes in Neural Network Wiring.

    Science.gov (United States)

    Herrera, Eloísa

    2018-01-01

    The formation of the nervous system is a multistep process that yields a mature brain. Failure in any of the steps of this process may cause brain malfunction. In the early stages of embryonic development, neural progenitors quickly proliferate and then, at a specific moment, differentiate into neurons or glia. Once they become postmitotic neurons, they migrate to their final destinations and begin to extend their axons to connect with other neurons, sometimes located in quite distant regions, to establish different neural circuits. During the last decade, it has become evident that Zic genes, in addition to playing important roles in early development (e.g., gastrulation and neural tube closure), are involved in different processes of late brain development, such as neuronal migration, axon guidance, and refinement of axon terminals. ZIC proteins are therefore essential for the proper wiring and connectivity of the brain. In this chapter, we review our current knowledge of the role of Zic genes in the late stages of neural circuit formation.

  12. The capacity for multistability in small gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Grotewold Erich

    2009-09-01

    Full Text Available Abstract Background Recent years have seen a dramatic increase in the use of mathematical modeling to gain insight into gene regulatory network behavior across many different organisms. In particular, there has been considerable interest in using mathematical tools to understand how multistable regulatory networks may contribute to developmental processes such as cell fate determination. Indeed, such a network may subserve the formation of unicellular leaf hairs (trichomes in the model plant Arabidopsis thaliana. Results In order to investigate the capacity of small gene regulatory networks to generate multiple equilibria, we present a chemical reaction network (CRN-based modeling formalism and describe a number of methods for CRN analysis in a parameter-free context. These methods are compared and applied to a full set of one-component subnetworks, as well as a large random sample from 40,680 similarly constructed two-component subnetworks. We find that positive feedback and cooperativity mediated by transcription factor (TF dimerization is a requirement for one-component subnetwork bistability. For subnetworks with two components, the presence of these processes increases the probability that a randomly sampled subnetwork will exhibit multiple equilibria, although we find several examples of bistable two-component subnetworks that do not involve cooperative TF-promoter binding. In the specific case of epidermal differentiation in Arabidopsis, dimerization of the GL3-GL1 complex and cooperative sequential binding of GL3-GL1 to the CPC promoter are each independently sufficient for bistability. Conclusion Computational methods utilizing CRN-specific theorems to rule out bistability in small gene regulatory networks are far superior to techniques generally applicable to deterministic ODE systems. Using these methods to conduct an unbiased survey of parameter-free deterministic models of small networks, and the Arabidopsis epidermal cell

  13. Functional network analysis of genes differentially expressed during xylogenesis in soc1ful woody Arabidopsis plants.

    Science.gov (United States)

    Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic

    2016-06-01

    Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

  14. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

    Directory of Open Access Journals (Sweden)

    Vipin Narang

    Full Text Available Human gene regulatory networks (GRN can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs. Among these algorithms, we identified K-core decomposition, pagerank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network evaluated based on previously known roles of these genes in MCF-7 biology as well as in their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data accompanying this manuscript.

  15. Robustness and accuracy in sea urchin developmental gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Smadar eBen-Tabou De-Leon

    2016-02-01

    Full Text Available Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-βcatenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis controls directly the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations to their resistance to evolutionary change.

  16. Network analysis of differential expression for the identification of disease-causing genes.

    Directory of Open Access Journals (Sweden)

    Daniela Nitsch

    Full Text Available Genetic studies (in particular linkage and association studies identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed-up for biological validation. Recently, computational methods to identify (prioritize the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved. We seek to overcome this limitation by replacing knowledge about the biological process by experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by considering the level of differential expression in its neighborhood under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood where each gene is given a contributing weight, which decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease causing genes.

  17. Systems genetics identifies a convergent gene network for cognition and neurodevelopmental disease.

    Science.gov (United States)

    Johnson, Michael R; Shkura, Kirill; Langley, Sarah R; Delahaye-Duriez, Andree; Srivastava, Prashant; Hill, W David; Rackham, Owen J L; Davies, Gail; Harris, Sarah E; Moreno-Moral, Aida; Rotival, Maxime; Speed, Doug; Petrovski, Slavé; Katz, Anaïs; Hayward, Caroline; Porteous, David J; Smith, Blair H; Padmanabhan, Sandosh; Hocking, Lynne J; Starr, John M; Liewald, David C; Visconti, Alessia; Falchi, Mario; Bottolo, Leonardo; Rossetti, Tiziana; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Grote, Alexander; Helmstaedter, Christoph; Becker, Albert J; Kaminski, Rafal M; Deary, Ian J; Petretto, Enrico

    2016-02-01

    Genetic determinants of cognition are poorly characterized, and their relationship to genes that confer risk for neurodevelopmental disease is unclear. Here we performed a systems-level analysis of genome-wide gene expression data to infer gene-regulatory networks conserved across species and brain regions. Two of these networks, M1 and M3, showed replicable enrichment for common genetic variants underlying healthy human cognitive abilities, including memory. Using exome sequence data from 6,871 trios, we found that M3 genes were also enriched for mutations ascertained from patients with neurodevelopmental disease generally, and intellectual disability and epileptic encephalopathy in particular. M3 consists of 150 genes whose expression is tightly developmentally regulated, but which are collectively poorly annotated for known functional pathways. These results illustrate how systems-level analyses can reveal previously unappreciated relationships between neurodevelopmental disease-associated genes in the developed human brain, and provide empirical support for a convergent gene-regulatory network influencing cognition and neurodevelopmental disease.

  18. Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes

    Directory of Open Access Journals (Sweden)

    Reverter Antonio

    2008-09-01

    Full Text Available Abstract Background The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. Methods We adopted an a priori approach making as few assumptions as possible to analyse the interplay among gene-gene interactions with tissue specificity and its subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. Results Amongst the myriad of complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. Conclusion We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing

  19. A gene network simulator to assess reverse engineering algorithms.

    Science.gov (United States)

    Di Camillo, Barbara; Toffolo, Gianna; Cobelli, Claudio

    2009-03-01

    In the context of reverse engineering of biological networks, simulators are helpful to test and compare the accuracy of different reverse-engineering approaches in a variety of experimental conditions. A novel gene-network simulator is presented that resembles some of the main features of transcriptional regulatory networks related to topology, interaction among regulators of transcription, and expression dynamics. The simulator generates network topology according to the current knowledge of biological network organization, including scale-free distribution of the connectivity and clustering coefficient independent of the number of nodes in the network. It uses fuzzy logic to represent interactions among the regulators of each gene, integrated with differential equations to generate continuous data, comparable to real data for variety and dynamic complexity. Finally, the simulator accounts for saturation in the response to regulation and transcription activation thresholds and shows robustness to perturbations. It therefore provides a reliable and versatile test bed for reverse engineering algorithms applied to microarray data. Since the simulator describes regulatory interactions and expression dynamics as two distinct, although interconnected aspects of regulation, it can also be used to test reverse engineering approaches that use both microarray and protein-protein interaction data in the process of learning. A first software release is available at http://www.dei.unipd.it/~dicamill/software/netsim as an R programming language package.

  20. Developing integrated crop knowledge networks to advance candidate gene discovery.

    Science.gov (United States)

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.

  1. A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

    Science.gov (United States)

    Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.

    2013-01-01

    The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474

  2. A gene regulatory network armature for T-lymphocyte specification

    Energy Technology Data Exchange (ETDEWEB)

    Fung, Elizabeth-sharon [Los Alamos National Laboratory

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notchdependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.l and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.l. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfil against Egr-2 and of TCF-l against PU.l as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.l activity based on Notch activity; the lack of direct opposition between PU.l and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  3. A network approach to analyzing highly recombinant malaria parasite genes.

    Science.gov (United States)

    Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O

    2013-01-01

    The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.

  4. A network approach to analyzing highly recombinant malaria parasite genes.

    Directory of Open Access Journals (Sweden)

    Daniel B Larremore

    Full Text Available The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs, and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.

  5. The European Nuclear Education Network Association - ENEN

    International Nuclear Information System (INIS)

    De Regge, P.P.

    2005-01-01

    The temporary network, established through the European 5 th Framework Programme project ENEN, was given a more permanent character by the foundation of the European Nuclear Education Network Association, a non-profit-making association according to the French law of 1901, pursuing a pedagogic and scientific aim. Its main objective is the preservation and the further development of higher nuclear education and expertise. This objective is realized through the co-operation between the European universities, involved in education and research in the nuclear engineering field, the nuclear research centres and the nuclear industry. The membership of the ENEN Association now consists of 35 universities members and 6 research centres. The paper briefly describes the history and structure of the ENEN Association and elaborates on the objectives and activities of its five committees during its first two years of operation. Supported by the 5 th and 6 th Framework Programme of the European Community, the ENEN Association established the delivery of the European Master of Science in Nuclear Engineering certificate. In particular, education and training courses have been developed and offered to materialise the core curricula and optional fields of study in a European exchange structure. Pilot editions of those courses and try-outs of training programmes have been successfully organised with a satisfying interest, attendance and performance by the students and the support of nuclear industries and international organisations. The involvement of ENEN in the 6 th EC Framework project EUROTRANS will further enlarge its field of activities into a realm of nuclear disciplines. The ENEN Association further contributes to the management of nuclear knowledge within the European Union as well as on a world-wide level, through contacts with its sister Network ANENT in Asia, and by its participation to activities of the World Nuclear University. (author)

  6. The European Nuclear Education Network Association - ENEN

    International Nuclear Information System (INIS)

    Gentile, D.

    2006-01-01

    The temporary network, established through the European 5. Framework Programme project ENEN, was given a more permanent character by the foundation of the European Nuclear Education Network Association, a non-profit-making association according to the French law of 1901, pursuing a pedagogic and scientific aim. Its main objective is the preservation and the further development of higher nuclear education and expertise. This objective is realized through the co-operation between the European universities, involved in education and research in the nuclear engineering field, the nuclear research centres and the nuclear industry. The membership of the ENEN Association now consists of 35 universities members and 6 research centres. The paper briefly describes the history and structure of the ENEN Association and elaborates on the objectives and activities of its five committees during its first two years of operation. Supported by the 5. and 6. Framework Programme of the European Community, the ENEN Association established the delivery of the European Master of Science in Nuclear Engineering certificate. In particular, education and training courses have been developed and offered to materialize the core curricula and optional fields of study in a European exchange structure. Pilot editions of those courses and try-outs of training programmes have been successfully organised with a satisfying interest, attendance and performance by the students and the support of nuclear industries and international organisations. The involvement of ENEN in the 6. EC Framework project EUROTRANS will further enlarge its field of activities into a realm of nuclear disciplines. The ENEN Association further contributes to the management of nuclear knowledge within the European Union as well as on a world-wide level, through contacts with its sister Network ANENT in Asia, and by its participation to activities of the World Nuclear University. (author)

  7. The European Nuclear Education Network Association - ENEN

    Energy Technology Data Exchange (ETDEWEB)

    Gentile, D. [Institut des Sciences et Techniques Nucleaires, CEA - Centre de Saclay, Bat. 395, F-91191 Gif-sur-Yvette (France)

    2006-07-01

    The temporary network, established through the European 5. Framework Programme project ENEN, was given a more permanent character by the foundation of the European Nuclear Education Network Association, a non-profit-making association according to the French law of 1901, pursuing a pedagogic and scientific aim. Its main objective is the preservation and the further development of higher nuclear education and expertise. This objective is realized through the co-operation between the European universities, involved in education and research in the nuclear engineering field, the nuclear research centres and the nuclear industry. The membership of the ENEN Association now consists of 35 universities members and 6 research centres. The paper briefly describes the history and structure of the ENEN Association and elaborates on the objectives and activities of its five committees during its first two years of operation. Supported by the 5. and 6. Framework Programme of the European Community, the ENEN Association established the delivery of the European Master of Science in Nuclear Engineering certificate. In particular, education and training courses have been developed and offered to materialize the core curricula and optional fields of study in a European exchange structure. Pilot editions of those courses and try-outs of training programmes have been successfully organised with a satisfying interest, attendance and performance by the students and the support of nuclear industries and international organisations. The involvement of ENEN in the 6. EC Framework project EUROTRANS will further enlarge its field of activities into a realm of nuclear disciplines. The ENEN Association further contributes to the management of nuclear knowledge within the European Union as well as on a world-wide level, through contacts with its sister Network ANENT in Asia, and by its participation to activities of the World Nuclear University. (author)

  8. Toxic Diatom Aldehydes Affect Defence Gene Networks in Sea Urchins.

    Directory of Open Access Journals (Sweden)

    Stefano Varrella

    Full Text Available Marine organisms possess a series of cellular strategies to counteract the negative effects of toxic compounds, including the massive reorganization of gene expression networks. Here we report the modulated dose-dependent response of activated genes by diatom polyunsaturated aldehydes (PUAs in the sea urchin Paracentrotus lividus. PUAs are secondary metabolites deriving from the oxidation of fatty acids, inducing deleterious effects on the reproduction and development of planktonic and benthic organisms that feed on these unicellular algae and with anti-cancer activity. Our previous results showed that PUAs target several genes, implicated in different functional processes in this sea urchin. Using interactomic Ingenuity Pathway Analysis we now show that the genes targeted by PUAs are correlated with four HUB genes, NF-κB, p53, δ-2-catenin and HIF1A, which have not been previously reported for P. lividus. We propose a working model describing hypothetical pathways potentially involved in toxic aldehyde stress response in sea urchins. This represents the first report on gene networks affected by PUAs, opening new perspectives in understanding the cellular mechanisms underlying the response of benthic organisms to diatom exposure.

  9. Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.

    Science.gov (United States)

    Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M

    2017-01-01

    Improvements in eating satisfaction will benefit consumers and should increase beef demand which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits making interpretation of genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms

  10. Deregulation of an imprinted gene network in prostate cancer.

    Science.gov (United States)

    Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A

    2014-05-01

    Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes.

  11. Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile.

    Science.gov (United States)

    Lin, Zhe; Lin, Yongsheng

    2017-09-05

    The aim of this study was to explore potential crucial genes associated with the steroid-induced necrosis of femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. Gene expression profile of GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples were downloaded from Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and cytoscape. In total, 59 up-regulated DEGs and 156 downregulated DEGs were identified. The up-regulated DEGs were mainly involved in functions about immunity (e.g. Fcer1A and Il7R), and the downregulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 outstanding genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 downregulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions like muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R), and DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progress of SINFH, which is still needed to be confirmed by experiments. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

    Directory of Open Access Journals (Sweden)

    Emre Guney

    Full Text Available Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, parkinson disease and obesity to a greater extend compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO analysis highlighted the role of functional diversity for such diseases.

  13. Memory functions reveal structural properties of gene regulatory networks

    Science.gov (United States)

    Perez-Carrasco, Ruben

    2018-01-01

    Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs. PMID:29470492

  14. Gene variants associated with antisocial behaviour: a latent variable approach.

    Science.gov (United States)

    Bentley, Mary Jane; Lin, Haiqun; Fernandez, Thomas V; Lee, Maria; Yrigollen, Carolyn M; Pakstis, Andrew J; Katsovich, Liliya; Olds, David L; Grigorenko, Elena L; Leckman, James F

    2013-10-01

    The aim of this study was to determine if a latent variable approach might be useful in identifying shared variance across genetic risk alleles that is associated with antisocial behaviour at age 15 years. Using a conventional latent variable approach, we derived an antisocial phenotype in 328 adolescents utilizing data from a 15-year follow-up of a randomized trial of a prenatal and infancy nurse-home visitation programme in Elmira, New York. We then investigated, via a novel latent variable approach, 450 informative genetic polymorphisms in 71 genes previously associated with antisocial behaviour, drug use, affiliative behaviours and stress response in 241 consenting individuals for whom DNA was available. Haplotype and Pathway analyses were also performed. Eight single-nucleotide polymorphisms (SNPs) from eight genes contributed to the latent genetic variable that in turn accounted for 16.0% of the variance within the latent antisocial phenotype. The number of risk alleles was linearly related to the latent antisocial variable scores. Haplotypes that included the putative risk alleles for all eight genes were also associated with higher latent antisocial variable scores. In addition, 33 SNPs from 63 of the remaining genes were also significant when added to the final model. Many of these genes interact on a molecular level, forming molecular networks. The results support a role for genes related to dopamine, norepinephrine, serotonin, glutamate, opioid and cholinergic signalling as well as stress response pathways in mediating susceptibility to antisocial behaviour. This preliminary study supports use of relevant behavioural indicators and latent variable approaches to study the potential 'co-action' of gene variants associated with antisocial behaviour. It also underscores the cumulative relevance of common genetic variants for understanding the aetiology of complex behaviour. If replicated in future studies, this approach may allow the identification of a

  15. Estimation of the proteomic cancer co-expression sub networks by using association estimators.

    Directory of Open Access Journals (Sweden)

    Cihat Erdoğan

    Full Text Available In this study, the association estimators, which have significant influences on the gene network inference methods and used for determining the molecular interactions, were examined within the co-expression network inference concept. By using the proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types is collected from The Cancer Proteome Atlas (TCPA. Correlation and mutual information (MI based nine association estimators that are commonly used in the literature, were compared in this study. As the gold standard to measure the association estimators' performance, a multi-layer data integration platform on gene-disease associations (DisGeNET and the Molecular Signatures Database (MSigDB was used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success ratio was 71% for James-Stein Shrinkage (Shrink and 64% for Schurmann-Grassberger (SG association estimator, respectively. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists.

  16. Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia

    Directory of Open Access Journals (Sweden)

    Borlawsky Tara B

    2010-10-01

    Full Text Available Abstract Background Chronic lymphocytic leukemia (CLL is the most common adult leukemia. It is a highly heterogeneous disease, and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobin heavy chain variable region (IgVH mutational status was found to be associated with patient survival outcome, and biomarkers linked to the IgVH status has been a focus in the CLL prognosis research field. However, biomarkers highly correlated with IgVH mutational status which can accurately predict the survival outcome are yet to be discovered. Results In this paper, we investigate the use of gene co-expression network analysis to identify potential biomarkers for CLL. Specifically we focused on the co-expression network involving ZAP70, a well characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO and used the frequent network mining algorithm CODENSE to identify highly connected gene co-expression networks spanning the entire genome, then evaluated the genes in the co-expression network in which ZAP70 is involved. We then applied a set of feature selection methods to further select genes which are capable of predicting IgVH mutation status from the ZAP70 co-expression network. Conclusions We have identified a set of genes that are potential CLL prognostic biomarkers IL2RB, CD8A, CD247, LAG3 and KLRK1, which can predict CLL patient IgVH mutational status with high accuracies. Their prognostic capabilities were cross-validated by applying these biomarker candidates to classify patients into different outcome groups using a CLL microarray datasets with clinical information.

  17. Modeling stochasticity and robustness in gene regulatory networks.

    Science.gov (United States)

    Garg, Abhishek; Mohanram, Kartik; Di Cara, Alessandro; De Micheli, Giovanni; Xenarios, Ioannis

    2009-06-15

    Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to over representation of noise in GRNs and hence non-correspondence with biological observations. In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. Algorithms are made available under our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/ approximately garg/genysis.html.

  18. Identification of Plagl1/Zac1 binding sites and target genes establishes its role in the regulation of extracellular matrix genes and the imprinted gene network.

    Science.gov (United States)

    Varrault, Annie; Dantec, Christelle; Le Digarcher, Anne; Chotard, Laëtitia; Bilanges, Benoit; Parrinello, Hugues; Dubois, Emeric; Rialle, Stéphanie; Severac, Dany; Bouschet, Tristan; Journot, Laurent

    2017-10-13

    PLAGL1/ZAC1 undergoes parental genomic imprinting, is paternally expressed, and is a member of the imprinted gene network (IGN). It encodes a zinc finger transcription factor with anti-proliferative activity and is a candidate tumor suppressor gene on 6q24 whose expression is frequently lost in various neoplasms. Conversely, gain of PLAGL1 function is responsible for transient neonatal diabetes mellitus, a rare genetic disease that results from defective pancreas development. In the present work, we showed that Plagl1 up-regulation was not associated with DNA damage-induced cell cycle arrest. It was rather associated with physiological cell cycle exit that occurred with contact inhibition, growth factor withdrawal, or cell differentiation. To gain insights into Plagl1 mechanism of action, we identified Plagl1 target genes by combining chromatin immunoprecipitation and genome-wide transcriptomics in transfected cell lines. Plagl1-elicited gene regulation correlated with multiple binding to the proximal promoter region through a GC-rich motif. Plagl1 target genes included numerous genes involved in signaling, cell adhesion, and extracellular matrix composition, including collagens. Plagl1 targets also included 22% of the 409 genes that make up the IGN. Altogether, this work identified Plagl1 as a transcription factor that coordinated the regulation of a subset of IGN genes and controlled extracellular matrix composition. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. An improved, bias-reduced probabilistic functional gene network of baker's yeast, Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Insuk Lee

    2007-10-01

    Full Text Available Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations.We report a significantly improved version (v. 2 of a probabilistic functional gene network of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis.YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome. YeastNet is available from http://www.yeastnet.org.

  20. Listening to the Noise: Random Fluctuations Reveal Gene Network Parameters

    Science.gov (United States)

    Munsky, Brian; Trinh, Brooke; Khammash, Mustafa

    2010-03-01

    The cellular environment is abuzz with noise originating from the inherent random motion of reacting molecules in the living cell. In this noisy environment, clonal cell populations exhibit cell-to-cell variability that can manifest significant prototypical differences. Noise induced stochastic fluctuations in cellular constituents can be measured and their statistics quantified using flow cytometry, single molecule fluorescence in situ hybridization, time lapse fluorescence microscopy and other single cell and single molecule measurement techniques. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. We use theoretical investigations to establish experimental guidelines for the identification of gene regulatory networks, and we apply these guideline to experimentally identify predictive models for different regulatory mechanisms in bacteria and yeast.

  1. Analysis of metastasis associated signal regulatory network in colorectal cancer.

    Science.gov (United States)

    Qi, Lu; Ding, Yanqing

    2018-06-18

    Metastasis is a key factor that affects the survival and prognosis of colorectal cancer patients. To elucidate molecular mechanism associated with the metastasis of colorectal cancer, genes related to the metastasis time of colorectal cancer were screened. Then, a network was constructed with this genes. Data was obtained from colorectal cancer expression profile. Molecular mechanism elucidated the time of tumor metastasis and the expression of genes related to colorectal cancer. We found that metastasis-promoting and metastasis-inhibiting networks included protein hubs of high connectivity. These protein hubs were components of organelles. Some ribosomal proteins promoted the metastasis of colorectal cancer. In some components of organelles, such as proteasomes, mitochondrial ribosome, ATP synthase, and splicing factors, the metastasis of colorectal cancer was inhibited by some sections of these organelles. After performing survival analysis of proteins in organelles, joint survival curve of proteins was constructed in ribosomal network. This joint survival curve showed metastasis was promoted in patients with colorectal cancer (P = 0.0022939). Joint survival curve of proteins was plotted against proteasomes (P = 7 e-07), mitochondrial ribosome (P = 0.0001157), ATP synthase (P = 0.0001936), and splicing factors (P = 1.35e-05). These curves indicate that metastasis of colorectal cancer can be inhibited. After analyzing proteins that bind with organelle components, we also found that some proteins were associated with the time of colorectal cancer metastasis. Hence, different cellular components play different roles in the metastasis of colorectal cancer. Copyright © 2018 Elsevier Inc. All rights reserved.

  2. A network-based gene expression signature informs prognosis and treatment for colorectal cancer patients.

    Directory of Open Access Journals (Sweden)

    Mingguang Shi

    Full Text Available Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes.We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002. Similar results were obtained with reversed training and testing datasets (p = 0.007. Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006, but not beneficial to the low-risk patients (p = 0.491.The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network

  3. Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

    OpenAIRE

    Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

    2009-01-01

    Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...

  4. Integrative analysis of survival-associated gene sets in breast cancer.

    Science.gov (United States)

    Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao

    2015-03-12

    Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We inputted our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used

  5. Learning a Markov Logic network for supervised gene regulatory network inference.

    Science.gov (United States)

    Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence

    2013-09-12

    Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a

  6. Comparison of evolutionary algorithms in gene regulatory network model inference.

    LENUS (Irish Health Repository)

    2010-01-01

    ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

  7. Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Liang Jinghang

    2012-08-01

    Full Text Available Abstract Background Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs. As a logical model, probabilistic Boolean networks (PBNs consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is O(nN22n or O(nN2n for a sparse matrix. Results This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN. An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of O(nL2n, where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a

  8. Identifying the Gene Signatures from Gene-Pathway Bipartite Network Guarantees the Robust Model Performance on Predicting the Cancer Prognosis

    Directory of Open Access Journals (Sweden)

    Li He

    2014-01-01

    Full Text Available For the purpose of improving the prediction of cancer prognosis in the clinical researches, various algorithms have been developed to construct the predictive models with the gene signatures detected by DNA microarrays. Due to the heterogeneity of the clinical samples, the list of differentially expressed genes (DEGs generated by the statistical methods or the machine learning algorithms often involves a number of false positive genes, which are not associated with the phenotypic differences between the compared clinical conditions, and subsequently impacts the reliability of the predictive models. In this study, we proposed a strategy, which combined the statistical algorithm with the gene-pathway bipartite networks, to generate the reliable lists of cancer-related DEGs and constructed the models by using support vector machine for predicting the prognosis of three types of cancers, namely, breast cancer, acute myeloma leukemia, and glioblastoma. Our results demonstrated that, combined with the gene-pathway bipartite networks, our proposed strategy can efficiently generate the reliable cancer-related DEG lists for constructing the predictive models. In addition, the model performance in the swap analysis was similar to that in the original analysis, indicating the robustness of the models in predicting the cancer outcomes.

  9. Integration of heterogeneous molecular networks to unravel gene-regulation in Mycobacterium tuberculosis.

    Science.gov (United States)

    van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María

    2014-09-26

    Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a show case, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As show cases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers. Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different

  10. Sequence-based model of gap gene regulatory network.

    Science.gov (United States)

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researches studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used four-fold cross validation test and fitting to random dataset to validate the model and proof its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3

  11. Effects of Gene Dose, Chromatin, and Network Topology on Expression in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Hangnoh Lee

    2016-09-01

    Full Text Available Deletions, commonly referred to as deficiencies by Drosophila geneticists, are valuable tools for mapping genes and for genetic pathway discovery via dose-dependent suppressor and enhancer screens. More recently, it has become clear that deviations from normal gene dosage are associated with multiple disorders in a range of species including humans. While we are beginning to understand some of the transcriptional effects brought about by gene dosage changes and the chromosome rearrangement breakpoints associated with them, much of this work relies on isolated examples. We have systematically examined deficiencies of the left arm of chromosome 2 and characterize gene-by-gene dosage responses that vary from collapsed expression through modest partial dosage compensation to full or even over compensation. We found negligible long-range effects of creating novel chromosome domains at deletion breakpoints, suggesting that cases of gene regulation due to altered nuclear architecture are rare. These rare cases include trans de-repression when deficiencies delete chromatin characterized as repressive in other studies. Generally, effects of breakpoints on expression are promoter proximal (~100bp or in the gene body. Effects of deficiencies genome-wide are in genes with regulatory relationships to genes within the deleted segments, highlighting the subtle expression network defects in these sensitized genetic backgrounds.

  12. Effects of Gene Dose, Chromatin, and Network Topology on Expression in Drosophila melanogaster.

    Science.gov (United States)

    Lee, Hangnoh; Cho, Dong-Yeon; Whitworth, Cale; Eisman, Robert; Phelps, Melissa; Roote, John; Kaufman, Thomas; Cook, Kevin; Russell, Steven; Przytycka, Teresa; Oliver, Brian

    2016-09-01

    Deletions, commonly referred to as deficiencies by Drosophila geneticists, are valuable tools for mapping genes and for genetic pathway discovery via dose-dependent suppressor and enhancer screens. More recently, it has become clear that deviations from normal gene dosage are associated with multiple disorders in a range of species including humans. While we are beginning to understand some of the transcriptional effects brought about by gene dosage changes and the chromosome rearrangement breakpoints associated with them, much of this work relies on isolated examples. We have systematically examined deficiencies of the left arm of chromosome 2 and characterize gene-by-gene dosage responses that vary from collapsed expression through modest partial dosage compensation to full or even over compensation. We found negligible long-range effects of creating novel chromosome domains at deletion breakpoints, suggesting that cases of gene regulation due to altered nuclear architecture are rare. These rare cases include trans de-repression when deficiencies delete chromatin characterized as repressive in other studies. Generally, effects of breakpoints on expression are promoter proximal (~100bp) or in the gene body. Effects of deficiencies genome-wide are in genes with regulatory relationships to genes within the deleted segments, highlighting the subtle expression network defects in these sensitized genetic backgrounds.

  13. European Nuclear Education Network (ENEN) Association Initiative

    International Nuclear Information System (INIS)

    Comsa, Olivia; Meglea, Claudia; Banutoiu, Marina; Paraschiva, M. V.; Meglea, S.

    2003-01-01

    The main objective of the ENEN Association is the preservation and further development of a higher nuclear education and expertise. This objective should be achieved through the co-operation between European universities involved in education and research in the nuclear engineering field, research centers and the nuclear industry. To reach this objective, the ENEN Association has to: Promote and develop the collaboration in nuclear engineering education of engineers and researchers required by the nuclear industry and the regulatory bodies; Ensure the quality of nuclear academic engineering education and training; Increase the attractiveness for engagement in the nuclear field for students and young academics. The basic objectives of the ENEN Association shall be to: Deliver an European Master of Science Degree in Nuclear Engineering and promote PhD studies; Promote exchange of students and teachers participating in the frame of this network; Increase the number of students by providing incentives; Establish a framework for mutual recognition; Foster and strengthen the relationship with research laboratories and networks, industry and regulatory bodies, by involving them in (or association them with) nuclear academic education and by offering continuous training. The aims of the ENEN Association shall be achieved by: Discussion on educational objectives, methods and course contents among the members and with external partners, particularly national European industries; Organization of internal audits on the quality of nuclear engineering curricula; Awarding the label of 'European Master degree of Science in Nuclear Engineering' to the curricula satisfying the criteria set up by the ENEN Association; Cooperation between the members, and with the research centers and the nuclear industry for enhancement of mobility of teachers and students, organization of training and advanced courses, use of large research and teaching facilities or infrastructures; Cooperation

  14. Identification of T1D susceptibility genes within the MHC region by combining protein interaction networks and SNP genotyping data

    DEFF Research Database (Denmark)

    Brorsson, C.; Hansen, Niclas Tue; Hansen, Kasper Lage

    2009-01-01

    genes. We have developed a novel method that combines single nucleotide polymorphism (SNP) genotyping data with protein-protein interaction (ppi) networks to identify disease-associated network modules enriched for proteins encoded from the MHC region. Approximately 2500 SNPs located in the 4 Mb MHC......To develop novel methods for identifying new genes that contribute to the risk of developing type 1 diabetes within the Major Histocompatibility Complex (MHC) region on chromosome 6, independently of the known linkage disequilibrium (LD) between human leucocyte antigen (HLA)-DRB1, -DQA1, -DQB1...... region were analysed in 1000 affected offspring trios generated by the Type 1 Diabetes Genetics Consortium (T1DGC). The most associated SNP in each gene was chosen and genes were mapped to ppi networks for identification of interaction partners. The association testing and resulting interacting protein...

  15. Recurrent neural network based hybrid model for reconstructing gene regulatory network.

    Science.gov (United States)

    Raza, Khalid; Alam, Mansaf

    2016-10-01

    One of the exciting problems in systems biology research is to decipher how genome controls the development of complex biological system. The gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer fruitful information related to functional role of individual gene in a cellular system. Discovering GRNs lead to a wide range of applications, including identification of disease related pathways providing novel tentative drug targets, helps to predict disease response, and also assists in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with generalized extended Kalman filter for weight update in backpropagation through time training algorithm. The RNN is a complex neural network that gives a better settlement between biological closeness and mathematical flexibility to model GRN; and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy and Kalman filter performs well for estimation problem even in noisy data. Hence, we applied non-linear version of Kalman filter, known as generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks such as DNA SOS repair network, IRMA network, and two synthetic networks from DREAM Challenge. We performed a comparison of our results with other state-of-the-art techniques which shows superiority of our proposed model. Further, 5% Gaussian noise has been induced in the dataset and result of the proposed model shows negligible effect of noise on results, demonstrating the noise tolerance capability of the model. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes. Results We developed Nearest Neighbor Networks (NNN, a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the

  17. A systematic study on drug-response associated genes using baseline gene expressions of the Cancer Cell Line Encyclopedia

    Science.gov (United States)

    Liu, Xiaoming; Yang, Jiasheng; Zhang, Yi; Fang, Yun; Wang, Fayou; Wang, Jun; Zheng, Xiaoqi; Yang, Jialiang

    2016-03-01

    We have studied drug-response associated (DRA) gene expressions by applying a systems biology framework to the Cancer Cell Line Encyclopedia data. More than 4,000 genes are inferred to be DRA for at least one drug, while the number of DRA genes for each drug varies dramatically from almost 0 to 1,226. Functional enrichment analysis shows that the DRA genes are significantly enriched in genes associated with cell cycle and plasma membrane. Moreover, there might be two patterns of DRA genes between genders. There are significantly shared DRA genes between male and female for most drugs, while very little DRA genes tend to be shared between the two genders for a few drugs targeting sex-specific cancers (e.g., PD-0332991 for breast cancer and ovarian cancer). Our analyses also show substantial difference for DRA genes between young and old samples, suggesting the necessity of considering the age effects for personalized medicine in cancers. Lastly, differential module and key driver analyses confirm cell cycle related modules as top differential ones for drug sensitivity. The analyses also reveal the role of TSPO, TP53, and many other immune or cell cycle related genes as important key drivers for DRA network modules. These key drivers provide new drug targets to improve the sensitivity of cancer therapy.

  18. Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

    NARCIS (Netherlands)

    Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

    2009-01-01

    Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori

  19. Radioresistance related genes screened by protein-protein interaction network analysis in nasopharyngeal carcinoma

    International Nuclear Information System (INIS)

    Zhu Xiaodong; Guo Ya; Qu Song; Li Ling; Huang Shiting; Li Danrong; Zhang Wei

    2012-01-01

    Objective: To discover radioresistance associated molecular biomarkers and its mechanism in nasopharyngeal carcinoma by protein-protein interaction network analysis. Methods: Whole genome expression microarray was applied to screen out differentially expressed genes in two cell lines CNE-2R and CNE-2 with different radiosensitivity. Four differentially expressed genes were randomly selected for further verification by the semi-quantitative RT-PCR analysis with self-designed primers. The common differentially expressed genes from two experiments were analyzed with the SNOW online database in order to find out the central node related to the biomarkers of nasopharyngeal carcinoma radioresistance. The expression of STAT1 in CNE-2R and CNE-2 cells was measured by Western blot. Results: Compared with CNE-2 cells, 374 genes in CNE-2R cells were differentially expressed while 197 genes showed significant differences. Four randomly selected differentially expressed genes were verified by RT-PCR and had same change trend in consistent with the results of chip assay. Analysis with the SNOW database demonstrated that those 197 genes could form a complicated interaction network where STAT1 and JUN might be two key nodes. Indeed, the STAT1-α expression in CNE-2R was higher than that in CNE-2 (t=4.96, P<0.05). Conclusions: The key nodes of STAT1 and JUN may be the molecular biomarkers leading to radioresistance in nasopharyngeal carcinoma, and STAT1-α might have close relationship with radioresistance. (authors)

  20. Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

    Science.gov (United States)

    Park, Chihyun; Ahn, Jaegyoon; Kim, Hyunjin; Park, Sanghyun

    2014-01-01

    The prognosis of cancer recurrence is an important research area in bioinformatics and is challenging due to the small sample sizes compared to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only a few labeled samples. Semi-supervised learning can be a great alternative to solve this problem. There have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence. In order to predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally-related gene pairs. Then, we predicted the recurrence of cancer by applying a regularization approach to the constructed graph containing both labeled and unlabeled nodes. The average improvement rate of accuracy for three different cancer datasets was 24.9% compared to existing supervised and semi-supervised methods. We performed functional enrichment on the gene networks used for learning. We identified that those gene networks are significantly associated with cancer-recurrence-related biological functions. Our algorithm was developed with standard C++ and is available in Linux and MS Windows formats in the STL library. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.

  1. Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

    Directory of Open Access Journals (Sweden)

    Chihyun Park

    Full Text Available BACKGROUND: The prognosis of cancer recurrence is an important research area in bioinformatics and is challenging due to the small sample sizes compared to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only a few labeled samples. Semi-supervised learning can be a great alternative to solve this problem. There have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence. RESULTS: In order to predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally-related gene pairs. Then, we predicted the recurrence of cancer by applying a regularization approach to the constructed graph containing both labeled and unlabeled nodes. CONCLUSIONS: The average improvement rate of accuracy for three different cancer datasets was 24.9% compared to existing supervised and semi-supervised methods. We performed functional enrichment on the gene networks used for learning. We identified that those gene networks are significantly associated with cancer-recurrence-related biological functions. Our algorithm was developed with standard C++ and is available in Linux and MS Windows formats in the STL library. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.

  2. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    Science.gov (United States)

    Huang, Zhong

    2014-01-01

    Disease associated gene discovery is a critical step to realize the future of personalized medicine. However empirical and clinical validation of disease associated genes are time consuming and expensive. In silico discovery of disease associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  3. Circuit-wide Transcriptional Profiling Reveals Brain Region-Specific Gene Networks Regulating Depression Susceptibility.

    Science.gov (United States)

    Bagot, Rosemary C; Cates, Hannah M; Purushothaman, Immanuel; Lorsch, Zachary S; Walker, Deena M; Wang, Junshi; Huang, Xiaojie; Schlüter, Oliver M; Maze, Ian; Peña, Catherine J; Heller, Elizabeth A; Issler, Orna; Wang, Minghui; Song, Won-Min; Stein, Jason L; Liu, Xiaochuan; Doyle, Marie A; Scobie, Kimberly N; Sun, Hao Sheng; Neve, Rachael L; Geschwind, Daniel; Dong, Yan; Shen, Li; Zhang, Bin; Nestler, Eric J

    2016-06-01

    Depression is a complex, heterogeneous disorder and a leading contributor to the global burden of disease. Most previous research has focused on individual brain regions and genes contributing to depression. However, emerging evidence in humans and animal models suggests that dysregulated circuit function and gene expression across multiple brain regions drive depressive phenotypes. Here, we performed RNA sequencing on four brain regions from control animals and those susceptible or resilient to chronic social defeat stress at multiple time points. We employed an integrative network biology approach to identify transcriptional networks and key driver genes that regulate susceptibility to depressive-like symptoms. Further, we validated in vivo several key drivers and their associated transcriptional networks that regulate depression susceptibility and confirmed their functional significance at the levels of gene transcription, synaptic regulation, and behavior. Our study reveals novel transcriptional networks that control stress susceptibility and offers fundamentally new leads for antidepressant drug discovery. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Network Security via Biometric Recognition of Patterns of Gene Expression

    Science.gov (United States)

    Shaw, Harry C.

    2016-01-01

    Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time expression and assay of gene expression products.

  5. Network Security via Biometric Recognition of Patterns of Gene Expression

    Science.gov (United States)

    Shaw, Harry C.

    2016-01-01

    Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT (Information Technology) organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time assays of gene expression products.

  6. Language networks associated with computerized semantic indices.

    Science.gov (United States)

    Pakhomov, Serguei V S; Jones, David T; Knopman, David S

    2015-01-01

    Tests of generative semantic verbal fluency are widely used to study organization and representation of concepts in the human brain. Previous studies demonstrated that clustering and switching behavior during verbal fluency tasks is supported by multiple brain mechanisms associated with semantic memory and executive control. Previous work relied on manual assessments of semantic relatedness between words and grouping of words into semantic clusters. We investigated a computational linguistic approach to measuring the strength of semantic relatedness between words based on latent semantic analysis of word co-occurrences in a subset of a large online encyclopedia. We computed semantic clustering indices and compared them to brain network connectivity measures obtained with task-free fMRI in a sample consisting of healthy participants and those differentially affected by cognitive impairment. We found that semantic clustering indices were associated with brain network connectivity in distinct areas including fronto-temporal, fronto-parietal and fusiform gyrus regions. This study shows that computerized semantic indices complement traditional assessments of verbal fluency to provide a more complete account of the relationship between brain and verbal behavior involved organization and retrieval of lexical information from memory. Copyright © 2014 Elsevier Inc. All rights reserved.

  7. Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Takeshi Hase

    Full Text Available Elucidating gene regulatory network (GRN from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii TopkNet integrating only high-performance algorithms provide significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is a key to reconstruct an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression-data associated with known regulatory network is similar to that with unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for known dataset perform well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.

  8. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract Background The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN. Methods In this study, we proposed a method to identify CRGs based on Gene Ontology (GO and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  9. How to construct the statistic network? An association network of herbaceous

    Directory of Open Access Journals (Sweden)

    WenJun Zhang

    2012-06-01

    Full Text Available In present study I defined a new type of network, the statistic network. The statistic network is a weighted and non-deterministic network. In the statistic network, a connection value, i.e., connection weight, represents connection strength and connection likelihood between two nodes and its absolute value falls in the interval (0,1]. The connection value is expressed as a statistical measure such as correlation coefficient, association coefficient, or Jaccard coefficient, etc. In addition, all connections of the statistic network can be statistically tested for their validity. A connection is true if the connection value is statistically significant. If all connection values of a node are not statistically significant, it is an isolated node. An isolated node has not any connection to other nodes in the statistic network. Positive and negative connection values denote distinct connectiontypes (positive or negative association or interaction. In the statistic network, two nodes with the greater connection value will show more similar trend in the change of their states. At any time we can obtain a sample network of the statistic network. A sample network is a non-weighted and deterministic network. Thestatistic network, in particular the plant association network that constructed from field sampling, is mostly an information network. Most of the interspecific relationships in plant community are competition and cooperation. Therefore in comparison to animal networks, the methodology of statistic network is moresuitable to construct plant association networks. Some conclusions were drawn from this study: (1 in the plant association network, most connections are weak and positive interactions. The association network constructed from Spearman rank correlation has most connections and isolated taxa are fewer. From net linear correlation,linear correlation, to Spearman rank correlation, the practical number of connections and connectance in the

  10. Pseudogenes regulate parental gene expression via ceRNA network.

    Science.gov (United States)

    An, Yang; Furber, Kendra L; Ji, Shaoping

    2017-01-01

    The concept of competitive endogenous RNA (ceRNA) was first proposed by Salmena and colleagues. Evidence suggests that pseudogene RNAs can act as a 'sponge' through competitive binding of common miRNA, releasing or attenuating repression through sequestering miRNAs away from parental mRNA. In theory, ceRNAs refer to all transcripts such as mRNA, tRNA, rRNA, long non-coding RNA, pseudogene RNA and circular RNA, because all of them may become the targets of miRNA depending on spatiotemporal situation. As binding of miRNA to the target RNA is not 100% complementary, it is possible that one miRNA can bind to multiple target RNAs and vice versa. All RNAs crosstalk through competitively binding to miRNAvia miRNA response elements (MREs) contained within the RNA sequences, thus forming a complex regulatory network. The ratio of a subset of miRNAs to the corresponding number of MREs determines repression strength on a given mRNA translation or stability. An increase in pseudogene RNA level can sequester miRNA and release repression on the parental gene, leading to an increase in parental gene expression. A massive number of transcripts constitute a complicated network that regulates each other through this proposed mechanism, though some regulatory significance may be mild or even undetectable. It is possible that the regulation of gene and pseudogene expression occurring in this manor involves all RNAs bearing common MREs. In this review, we will primarily discuss how pseudogene transcripts regulate expression of parental genes via ceRNA network and biological significance of regulation. © 2016 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  11. Gene Network for Identifying the Entropy Changes of Different Modules in Pediatric Sepsis

    Directory of Open Access Journals (Sweden)

    Jing Yang

    2016-12-01

    Full Text Available Background/Aims: Pediatric sepsis is a disease that threatens life of children. The incidence of pediatric sepsis is higher in developing countries due to various reasons, such as insufficient immunization and nutrition, water and air pollution, etc. Exploring the potential genes via different methods is of significance for the prevention and treatment of pediatric sepsis. This study aimed to identify potential genes associated with pediatric sepsis utilizing analysis of gene network and entropy. Methods: The mRNA expression in the blood samples collected from 20 septic children and 30 healthy controls was quantified by using Affymetrix HG-U133A microarray. Two condition-specific protein-protein interaction networks (PINs, one for the healthy control and the other one for the children with sepsis, were deduced by combining the fundamental human PINs with gene expression profiles in the two phenotypes. Subsequently, distinct modules from the two conditional networks were extracted by adopting a maximal clique-merging approach. Delta entropy (ΔS was calculated between sepsis and control modules. Results: Then, key genes displaying changes in gene composition were identified by matching the control and sepsis modules. Two objective modules were obtained, in which ribosomal protein RPL4 and RPL9 as well as TOP2A were probably considered as the key genes differentiating sepsis from healthy controls. Conclusion: According to previous reports and this work, TOP2A is the potential gene therapy target for pediatric sepsis. The relationship between pediatric sepsis and RPL4 and RPL9 needs further investigation.

  12. In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer

    Science.gov (United States)

    Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-01

    Like other cancer diseases, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drives malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that also microRNAs have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer poorly overlap. The identification of co-expressed genes, that are functionally related, can identify a core network of genes associated with PC with a better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723

  13. Ecological transition predictably associated with gene degeneration.

    Science.gov (United States)

    Wessinger, Carolyn A; Rausher, Mark D

    2015-02-01

    Gene degeneration or loss can significantly contribute to phenotypic diversification, but may generate genetic constraints on future evolutionary trajectories, potentially restricting phenotypic reversal. Such constraints may manifest as directional evolutionary trends when parallel phenotypic shifts consistently involve gene degeneration or loss. Here, we demonstrate that widespread parallel evolution in Penstemon from blue to red flowers predictably involves the functional inactivation and degeneration of the enzyme flavonoid 3',5'-hydroxylase (F3'5'H), an anthocyanin pathway enzyme required for the production of blue floral pigments. Other types of genetic mutations do not consistently accompany this phenotypic shift. This pattern may be driven by the relatively large mutational target size of degenerative mutations to this locus and the apparent lack of associated pleiotropic effects. The consistent degeneration of F3'5'H may provide a mechanistic explanation for the observed asymmetry in the direction of flower color evolution in Penstemon: Blue to red transitions are common, but reverse transitions have not been observed. Although phenotypic shifts in this system are likely driven by natural selection, internal constraints may generate predictable genetic outcomes and may restrict future evolutionary trajectories. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  14. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasing available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network to address the underlying regulations of genes that can span any unit(s of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex

  15. From gene networks to drugs: systems pharmacology approaches for AUD.

    Science.gov (United States)

    Ferguson, Laura B; Harris, R Adron; Mayfield, Roy Dayne

    2018-06-01

    The alcohol research field has amassed an impressive number of gene expression datasets spanning key brain areas for addiction, species (humans as well as multiple animal models), and stages in the addiction cycle (binge/intoxication, withdrawal/negative effect, and preoccupation/anticipation). These data have improved our understanding of the molecular adaptations that eventually lead to dysregulation of brain function and the chronic, relapsing disorder of addiction. Identification of new medications to treat alcohol use disorder (AUD) will likely benefit from the integration of genetic, genomic, and behavioral information included in these important datasets. Systems pharmacology considers drug effects as the outcome of the complex network of interactions a drug has rather than a single drug-molecule interaction. Computational strategies based on this principle that integrate gene expression signatures of pharmaceuticals and disease states have shown promise for identifying treatments that ameliorate disease symptoms (called in silico gene mapping or connectivity mapping). In this review, we suggest that gene expression profiling for in silico mapping is critical to improve drug repurposing and discovery for AUD and other psychiatric illnesses. We highlight studies that successfully apply gene mapping computational approaches to identify or repurpose pharmaceutical treatments for psychiatric illnesses. Furthermore, we address important challenges that must be overcome to maximize the potential of these strategies to translate to the clinic and improve healthcare outcomes.

  16. Heart morphogenesis gene regulatory networks revealed by temporal expression analysis.

    Science.gov (United States)

    Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph

    2017-10-01

    During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5 , and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.

  17. Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data.

    Science.gov (United States)

    Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J

    2015-03-07

    Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP

  18. Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

    Science.gov (United States)

    Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

    2017-08-01

    Correct inference of genetic regulations inside a cell from the biological database like time series microarray data is one of the greatest challenges in post genomic era for biologists and researchers. Recurrent Neural Network (RNN) is one of the most popular and simple approach to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic namely Elephant Swarm Water Search Algorithm (ESWSA) to infer Gene Regulatory Network (GRN). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing the different types of communication techniques. Initially, the algorithm is tested against benchmark small and medium scale artificial genetic networks without and with presence of different noise levels and the efficiency was observed in term of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of Escherichia Coli SOS Network and results were also compared with others state of the art optimization methods. The experimental results suggest that ESWSA is very efficient for GRN inference problem and performs better than other methods in many ways.

  19. A core filamentation response network in Candida albicans is restricted to eight genes.

    Directory of Open Access Journals (Sweden)

    Ronny Martin

    Full Text Available Although morphological plasticity is a central virulence trait of Candida albicans, the number of filament-associated genes and the interplay of mechanisms regulating their expression remain unknown. By correlation-based network modeling of the transcriptional response to different defined external stimuli for morphogenesis we identified a set of eight genes with highly correlated expression patterns, forming a core filamentation response. This group of genes included ALS3, ECE1, HGT2, HWP1, IHD1 and RBT1 which are known or supposed to encode for cell- wall associated proteins as well as the Rac1 guanine nucleotide exchange factor encoding gene DCK1 and the unknown function open reading frame orf19.2457. The validity of network modeling was confirmed using a dataset of advanced complexity that describes the transcriptional response of C. albicans during epithelial invasion as well as comparing our results with other previously published transcriptome studies. Although the set of core filamentation response genes was quite small, several transcriptional regulators are involved in the control of their expression, depending on the environmental condition.

  20. Using Morpholinos to Probe Gene Networks in Sea Urchin.

    Science.gov (United States)

    Materna, Stefan C

    2017-01-01

    The control processes that underlie the progression of development can be summarized in maps of gene regulatory networks (GRNs). A critical step in their assembly is the systematic perturbation of network candidates. In sea urchins the most important method for interfering with expression in a gene-specific way is application of morpholino antisense oligonucleotides (MOs). MOs act by binding to their sequence complement in transcripts resulting in a block in translation or a change in splicing and thus result in a loss of function. Despite the tremendous success of this technology, recent comparisons to mutants generated by genome editing have led to renewed criticism and challenged its reliability. As with all methods based on sequence recognition, MOs are prone to off-target binding that may result in phenotypes that are erroneously ascribed to the loss of the intended target. However, the slow progression of development in sea urchins has enabled extremely detailed studies of gene activity in the embryo. This wealth of knowledge paired with the simplicity of the sea urchin embryo enables careful analysis of MO phenotypes through a variety of methods that do not rely on terminal phenotypes. This article summarizes the use of MOs in probing GRNs and the steps that should be taken to assure their specificity.

  1. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Alina Sîrbu

    2015-05-01

    Full Text Available Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions. Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  2. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks.

    Science.gov (United States)

    Sîrbu, Alina; Crane, Martin; Ruskin, Heather J

    2015-05-14

    Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  3. Pluripotency gene network dynamics: System views from parametric analysis.

    Science.gov (United States)

    Akberdin, Ilya R; Omelyanchuk, Nadezda A; Fadeev, Stanislav I; Leskova, Natalya E; Oschepkova, Evgeniya A; Kazantsev, Fedor V; Matushkin, Yury G; Afonnikov, Dmitry A; Kolchanov, Nikolay A

    2018-01-01

    Multiple experimental data demonstrated that the core gene network orchestrating self-renewal and differentiation of mouse embryonic stem cells involves activity of Oct4, Sox2 and Nanog genes by means of a number of positive feedback loops among them. However, recent studies indicated that the architecture of the core gene network should also incorporate negative Nanog autoregulation and might not include positive feedbacks from Nanog to Oct4 and Sox2. Thorough parametric analysis of the mathematical model based on this revisited core regulatory circuit identified that there are substantial changes in model dynamics occurred depending on the strength of Oct4 and Sox2 activation and molecular complexity of Nanog autorepression. The analysis showed the existence of four dynamical domains with different numbers of stable and unstable steady states. We hypothesize that these domains can constitute the checkpoints in a developmental progression from naïve to primed pluripotency and vice versa. During this transition, parametric conditions exist, which generate an oscillatory behavior of the system explaining heterogeneity in expression of pluripotent and differentiation factors in serum ESC cultures. Eventually, simulations showed that addition of positive feedbacks from Nanog to Oct4 and Sox2 leads mainly to increase of the parametric space for the naïve ESC state, in which pluripotency factors are strongly expressed while differentiation ones are repressed.

  4. Construction of Gene Regulatory Networks Using Recurrent Neural Networks and Swarm Intelligence.

    Science.gov (United States)

    Khan, Abhinandan; Mandal, Sudip; Pal, Rajat Kumar; Saha, Goutam

    2016-01-01

    We have proposed a methodology for the reverse engineering of biologically plausible gene regulatory networks from temporal genetic expression data. We have used established information and the fundamental mathematical theory for this purpose. We have employed the Recurrent Neural Network formalism to extract the underlying dynamics present in the time series expression data accurately. We have introduced a new hybrid swarm intelligence framework for the accurate training of the model parameters. The proposed methodology has been first applied to a small artificial network, and the results obtained suggest that it can produce the best results available in the contemporary literature, to the best of our knowledge. Subsequently, we have implemented our proposed framework on experimental (in vivo) datasets. Finally, we have investigated two medium sized genetic networks (in silico) extracted from GeneNetWeaver, to understand how the proposed algorithm scales up with network size. Additionally, we have implemented our proposed algorithm with half the number of time points. The results indicate that a reduction of 50% in the number of time points does not have an effect on the accuracy of the proposed methodology significantly, with a maximum of just over 15% deterioration in the worst case.

  5. RegnANN: Reverse Engineering Gene Networks using Artificial Neural Networks.

    Directory of Open Access Journals (Sweden)

    Marco Grimaldi

    Full Text Available RegnANN is a novel method for reverse engineering gene networks based on an ensemble of multilayer perceptrons. The algorithm builds a regressor for each gene in the network, estimating its neighborhood independently. The overall network is obtained by joining all the neighborhoods. RegnANN makes no assumptions about the nature of the relationships between the variables, potentially capturing high-order and non linear dependencies between expression patterns. The evaluation focuses on synthetic data mimicking plausible submodules of larger networks and on biological data consisting of submodules of Escherichia coli. We consider Barabasi and Erdös-Rényi topologies together with two methods for data generation. We verify the effect of factors such as network size and amount of data to the accuracy of the inference algorithm. The accuracy scores obtained with RegnANN is methodically compared with the performance of three reference algorithms: ARACNE, CLR and KELLER. Our evaluation indicates that RegnANN compares favorably with the inference methods tested. The robustness of RegnANN, its ability to discover second order correlations and the agreement between results obtained with this new methods on both synthetic and biological data are promising and they stimulate its application to a wider range of problems.

  6. Prediction of the Ebola Virus Infection Related Human Genes Using Protein-Protein Interaction Network.

    Science.gov (United States)

    Cao, HuanHuan; Zhang, YuHang; Zhao, Jia; Zhu, Liucun; Wang, Yi; Li, JiaRui; Feng, Yuan-Ming; Zhang, Ning

    2017-01-01

    Ebola hemorrhagic fever (EHF) is caused by Ebola virus (EBOV). It is reported that human could be infected by EBOV with a high fatality rate. However, association factors between EBOV and host still tend to be ambiguous. According to the "guilt by association" (GBA) principle, proteins interacting with each other are very likely to function similarly or the same. Based on this assumption, we tried to obtain EBOV infection-related human genes in a protein-protein interaction network using Dijkstra algorithm. We hope it could contribute to the discovery of novel effective treatments. Finally, 15 genes were selected as potential EBOV infection-related human genes. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  7. RNAi-Based Identification of Gene-Specific Nuclear Cofactor Networks Regulating Interleukin-1 Target Genes

    Directory of Open Access Journals (Sweden)

    Johanna Meier-Soelch

    2018-04-01

    Full Text Available The potent proinflammatory cytokine interleukin (IL-1 triggers gene expression through the NF-κB signaling pathway. Here, we investigated the cofactor requirements of strongly regulated IL-1 target genes whose expression is impaired in p65 NF-κB-deficient murine embryonic fibroblasts. By two independent small-hairpin (shRNA screens, we examined 170 genes annotated to encode nuclear cofactors for their role in Cxcl2 mRNA expression and identified 22 factors that modulated basal or IL-1-inducible Cxcl2 levels. The functions of 16 of these factors were validated for Cxcl2 and further analyzed for their role in regulation of 10 additional IL-1 target genes by RT-qPCR. These data reveal that each inducible gene has its own (quantitative requirement of cofactors to maintain basal levels and to respond to IL-1. Twelve factors (Epc1, H2afz, Kdm2b, Kdm6a, Mbd3, Mta2, Phf21a, Ruvbl1, Sin3b, Suv420h1, Taf1, and Ube3a have not been previously implicated in inflammatory cytokine functions. Bioinformatics analysis indicates that they are components of complex nuclear protein networks that regulate chromatin functions and gene transcription. Collectively, these data suggest that downstream from the essential NF-κB signal each cytokine-inducible target gene has further subtle requirements for individual sets of nuclear cofactors that shape its transcriptional activation profile.

  8. Algebraic model checking for Boolean gene regulatory networks.

    Science.gov (United States)

    Tran, Quoc-Nam

    2011-01-01

    We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method will never grow resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.

  9. Transcriptional Analysis of the MrpJ Network: Modulation of Diverse Virulence-Associated Genes and Direct Regulation of mrp Fimbrial and flhDC Flagellar Operons in Proteus mirabilis

    Science.gov (United States)

    Bode, Nadine J.; Debnath, Irina; Kuan, Lisa; Schulfer, Anjelique; Ty, Maureen

    2015-01-01

    The enteric bacterium Proteus mirabilis is associated with a significant number of catheter-associated urinary tract infections (UTIs). Strict regulation of the antagonistic processes of adhesion and motility, mediated by fimbriae and flagella, respectively, is essential for disease progression. Previously, the transcriptional regulator MrpJ, which is encoded by the mrp fimbrial operon, has been shown to repress both swimming and swarming motility. Here we show that MrpJ affects an array of cellular processes beyond adherence and motility. Microarray analysis found that expression of mrpJ mimicking levels observed during UTIs leads to differential expression of 217 genes related to, among other functions, bacterial virulence, type VI secretion, and metabolism. We probed the molecular mechanism of transcriptional regulation by MrpJ using transcriptional reporters and chromatin immunoprecipitation (ChIP). Binding of MrpJ to two virulence-associated target gene promoters, the promoters of the flagellar master regulator flhDC and mrp itself, appears to be affected by the condensation state of the native chromosome, although both targets share a direct MrpJ binding site proximal to the transcriptional start. Furthermore, an mrpJ deletion mutant colonized the bladders of mice at significantly lower levels in a transurethral model of infection. Additionally, we observed that mrpJ is widely conserved in a collection of recent clinical isolates. Altogether, these findings support a role of MrpJ as a global regulator of P. mirabilis virulence. PMID:25847961

  10. GSEH: A Novel Approach to Select Prostate Cancer-Associated Genes Using Gene Expression Heterogeneity.

    Science.gov (United States)

    Kim, Hyunjin; Choi, Sang-Min; Park, Sanghyun

    2018-01-01

    When a gene shows varying levels of expression among normal people but similar levels in disease patients or shows similar levels of expression among normal people but different levels in disease patients, we can assume that the gene is associated with the disease. By utilizing this gene expression heterogeneity, we can obtain additional information that abets discovery of disease-associated genes. In this study, we used collaborative filtering to calculate the degree of gene expression heterogeneity between classes and then scored the genes on the basis of the degree of gene expression heterogeneity to find "differentially predicted" genes. Through the proposed method, we discovered more prostate cancer-associated genes than 10 comparable methods. The genes prioritized by the proposed method are potentially significant to biological processes of a disease and can provide insight into them.

  11. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    Science.gov (United States)

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as times series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the subjacent relations among genes. The results obtained on artificial and real datasets confirm the method effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  12. CRISPR loci reveal networks of gene exchange in archaea

    Directory of Open Access Journals (Sweden)

    Brodt Avital

    2011-12-01

    Full Text Available Abstract Background CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Results Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. Conclusions CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. Open peer review This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten

  13. CRISPR loci reveal networks of gene exchange in archaea.

    Science.gov (United States)

    Brodt, Avital; Lurie-Weinberger, Mor N; Gophna, Uri

    2011-12-21

    CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten).

  14. Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

    Science.gov (United States)

    Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

    2018-04-01

    The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.

  15. Robust variable selection method for nonparametric differential equation models with application to nonlinear dynamic gene regulatory network analysis.

    Science.gov (United States)

    Lu, Tao

    2016-01-01

    The gene regulation network (GRN) evaluates the interactions between genes and look for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRN. In this article, we propose a nonparametric differential equation (ODE) to model the nonlinear dynamic GRN. Specially, we address following questions simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application examples.

  16. A network-based approach to prioritize results from genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Nirmala Akula

    Full Text Available Genome-wide association studies (GWAS are a valuable approach to understanding the genetic basis of complex traits. One of the challenges of GWAS is the translation of genetic association results into biological hypotheses suitable for further investigation in the laboratory. To address this challenge, we introduce Network Interface Miner for Multigenic Interactions (NIMMI, a network-based method that combines GWAS data with human protein-protein interaction data (PPI. NIMMI builds biological networks weighted by connectivity, which is estimated by use of a modification of the Google PageRank algorithm. These weights are then combined with genetic association p-values derived from GWAS, producing what we call 'trait prioritized sub-networks.' As a proof of principle, NIMMI was tested on three GWAS datasets previously analyzed for height, a classical polygenic trait. Despite differences in sample size and ancestry, NIMMI captured 95% of the known height associated genes within the top 20% of ranked sub-networks, far better than what could be achieved by a single-locus approach. The top 2% of NIMMI height-prioritized sub-networks were significantly enriched for genes involved in transcription, signal transduction, transport, and gene expression, as well as nucleic acid, phosphate, protein, and zinc metabolism. All of these sub-networks were ranked near the top across all three height GWAS datasets we tested. We also tested NIMMI on a categorical phenotype, Crohn's disease. NIMMI prioritized sub-networks involved in B- and T-cell receptor, chemokine, interleukin, and other pathways consistent with the known autoimmune nature of Crohn's disease. NIMMI is a simple, user-friendly, open-source software tool that efficiently combines genetic association data with biological networks, translating GWAS findings into biological hypotheses.

  17. A Network-Based Approach to Prioritize Results from Genome-Wide Association Studies

    Science.gov (United States)

    Akula, Nirmala; Baranova, Ancha; Seto, Donald; Solka, Jeffrey; Nalls, Michael A.; Singleton, Andrew; Ferrucci, Luigi; Tanaka, Toshiko; Bandinelli, Stefania; Cho, Yoon Shin; Kim, Young Jin; Lee, Jong-Young; Han, Bok-Ghee; McMahon, Francis J.

    2011-01-01

    Genome-wide association studies (GWAS) are a valuable approach to understanding the genetic basis of complex traits. One of the challenges of GWAS is the translation of genetic association results into biological hypotheses suitable for further investigation in the laboratory. To address this challenge, we introduce Network Interface Miner for Multigenic Interactions (NIMMI), a network-based method that combines GWAS data with human protein-protein interaction data (PPI). NIMMI builds biological networks weighted by connectivity, which is estimated by use of a modification of the Google PageRank algorithm. These weights are then combined with genetic association p-values derived from GWAS, producing what we call ‘trait prioritized sub-networks.’ As a proof of principle, NIMMI was tested on three GWAS datasets previously analyzed for height, a classical polygenic trait. Despite differences in sample size and ancestry, NIMMI captured 95% of the known height associated genes within the top 20% of ranked sub-networks, far better than what could be achieved by a single-locus approach. The top 2% of NIMMI height-prioritized sub-networks were significantly enriched for genes involved in transcription, signal transduction, transport, and gene expression, as well as nucleic acid, phosphate, protein, and zinc metabolism. All of these sub-networks were ranked near the top across all three height GWAS datasets we tested. We also tested NIMMI on a categorical phenotype, Crohn’s disease. NIMMI prioritized sub-networks involved in B- and T-cell receptor, chemokine, interleukin, and other pathways consistent with the known autoimmune nature of Crohn’s disease. NIMMI is a simple, user-friendly, open-source software tool that efficiently combines genetic association data with biological networks, translating GWAS findings into biological hypotheses. PMID:21915301

  18. Systematic identification of latent disease-gene associations from PubMed articles.

    Science.gov (United States)

    Zhang, Yuji; Shen, Feichen; Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang

    2018-01-01

    Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to large data volume and complicated associations with noises, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million of PubMed indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noises for large volume data respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.

  19. Dose response relationship in anti-stress gene regulatory networks.

    Science.gov (United States)

    Zhang, Qiang; Andersen, Melvin E

    2007-03-02

    To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia, etc. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misfolded proteins, and O2, is first detected by specialized sensor molecules, then the signal is transduced to specific transcription factors. Transcription factors can regulate the expression of a suite of anti-stress genes, many of which encode enzymes functioning to counteract the perturbed variables. The objective of this study was to explore, using control theory and computational approaches, the theoretical basis that underlies the steady-state dose response relationship between cellular stressors and intracellular biochemical species (controlled variables, transcription factors, and gene products) in these gene regulatory networks. Our work indicated that the shape of dose response curves (linear, superlinear, or sublinear) depends on changes in the specific values of local response coefficients (gains) distributed in the feedback loop. Multimerization of anti-stress enzymes and transcription factors into homodimers, homotrimers, or even higher-order multimers, play a significant role in maintaining robust homeostasis. Moreover, our simulation noted that dose response curves for the controlled variables can transition sequentially through four distinct phases as stressor level increases: initial superlinear with lesser control, superlinear more highly controlled, linear uncontrolled, and sublinear catastrophic. Each phase relies on specific gain-changing events that come into play as stressor level increases. The low-dose region is intrinsically nonlinear, and depending on

  20. Dose response relationship in anti-stress gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    2007-03-01

    Full Text Available To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia, etc. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misfolded proteins, and O2, is first detected by specialized sensor molecules, then the signal is transduced to specific transcription factors. Transcription factors can regulate the expression of a suite of anti-stress genes, many of which encode enzymes functioning to counteract the perturbed variables. The objective of this study was to explore, using control theory and computational approaches, the theoretical basis that underlies the steady-state dose response relationship between cellular stressors and intracellular biochemical species (controlled variables, transcription factors, and gene products in these gene regulatory networks. Our work indicated that the shape of dose response curves (linear, superlinear, or sublinear depends on changes in the specific values of local response coefficients (gains distributed in the feedback loop. Multimerization of anti-stress enzymes and transcription factors into homodimers, homotrimers, or even higher-order multimers, play a significant role in maintaining robust homeostasis. Moreover, our simulation noted that dose response curves for the controlled variables can transition sequentially through four distinct phases as stressor level increases: initial superlinear with lesser control, superlinear more highly controlled, linear uncontrolled, and sublinear catastrophic. Each phase relies on specific gain-changing events that come into play as stressor level increases. The low-dose region is intrinsically nonlinear

  1. Ethylene-Related Gene Expression Networks in Wood Formation

    Directory of Open Access Journals (Sweden)

    Carolin Seyfferth

    2018-03-01

    Full Text Available Thickening of tree stems is the result of secondary growth, accomplished by the meristematic activity of the vascular cambium. Secondary growth of the stem entails developmental cascades resulting in the formation of secondary phloem outwards and secondary xylem (i.e., wood inwards of the stem. Signaling and transcriptional reprogramming by the phytohormone ethylene modifies cambial growth and cell differentiation, but the molecular link between ethylene and secondary growth remains unknown. We addressed this shortcoming by analyzing expression profiles and co-expression networks of ethylene pathway genes using the AspWood transcriptome database which covers all stages of secondary growth in aspen (Populus tremula stems. ACC synthase expression suggests that the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC is synthesized during xylem expansion and xylem cell maturation. Ethylene-mediated transcriptional reprogramming occurs during all stages of secondary growth, as deduced from AspWood expression profiles of ethylene-responsive genes. A network centrality analysis of the AspWood dataset identified EIN3D and 11 ERFs as hubs. No overlap was found between the co-expressed genes of the EIN3 and ERF hubs, suggesting target diversification and hence independent roles for these transcription factor families during normal wood formation. The EIN3D hub was part of a large co-expression gene module, which contained 16 transcription factors, among them several new candidates that have not been earlier connected to wood formation and a VND-INTERACTING 2 (VNI2 homolog. We experimentally demonstrated Populus EIN3D function in ethylene signaling in Arabidopsis thaliana. The ERF hubs ERF118 and ERF119 were connected on the basis of their expression pattern and gene co-expression module composition to xylem cell expansion and secondary cell wall formation, respectively. We hereby establish data resources for ethylene-responsive genes and

  2. Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks.

    Science.gov (United States)

    Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

    2017-11-28

    Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed base on COPD disease genes and functional information. Candidate genes in these weighted COPD-related networks were prioritized by making use of a gene prioritization method, respectively. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested the correlation of COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network after assessing using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote the better understanding about the molecular mechanism of this disease from different perspectives. The top 100 genes in COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.

  3. Fractional Hopfield Neural Networks: Fractional Dynamic Associative Recurrent Neural Networks.

    Science.gov (United States)

    Pu, Yi-Fei; Yi, Zhang; Zhou, Ji-Liu

    2017-10-01

    This paper mainly discusses a novel conceptual framework: fractional Hopfield neural networks (FHNN). As is commonly known, fractional calculus has been incorporated into artificial neural networks, mainly because of its long-term memory and nonlocality. Some researchers have made interesting attempts at fractional neural networks and gained competitive advantages over integer-order neural networks. Therefore, it is naturally makes one ponder how to generalize the first-order Hopfield neural networks to the fractional-order ones, and how to implement FHNN by means of fractional calculus. We propose to introduce a novel mathematical method: fractional calculus to implement FHNN. First, we implement fractor in the form of an analog circuit. Second, we implement FHNN by utilizing fractor and the fractional steepest descent approach, construct its Lyapunov function, and further analyze its attractors. Third, we perform experiments to analyze the stability and convergence of FHNN, and further discuss its applications to the defense against chip cloning attacks for anticounterfeiting. The main contribution of our work is to propose FHNN in the form of an analog circuit by utilizing a fractor and the fractional steepest descent approach, construct its Lyapunov function, prove its Lyapunov stability, analyze its attractors, and apply FHNN to the defense against chip cloning attacks for anticounterfeiting. A significant advantage of FHNN is that its attractors essentially relate to the neuron's fractional order. FHNN possesses the fractional-order-stability and fractional-order-sensitivity characteristics.

  4. Intervention in gene regulatory networks with maximal phenotype alteration.

    Science.gov (United States)

    Yousefi, Mohammadmahdi R; Dougherty, Edward R

    2013-07-15

    A basic issue for translational genomics is to model gene interaction via gene regulatory networks (GRNs) and thereby provide an informatics environment to study the effects of intervention (say, via drugs) and to derive effective intervention strategies. Taking the view that the phenotype is characterized by the long-run behavior (steady-state distribution) of the network, we desire interventions to optimally move the probability mass from undesirable to desirable states Heretofore, two external control approaches have been taken to shift the steady-state mass of a GRN: (i) use a user-defined cost function for which desirable shift of the steady-state mass is a by-product and (ii) use heuristics to design a greedy algorithm. Neither approach provides an optimal control policy relative to long-run behavior. We use a linear programming approach to optimally shift the steady-state mass from undesirable to desirable states, i.e. optimization is directly based on the amount of shift and therefore must outperform previously proposed methods. Moreover, the same basic linear programming structure is used for both unconstrained and constrained optimization, where in the latter case, constraints on the optimization limit the amount of mass that may be shifted to 'ambiguous' states, these being states that are not directly undesirable relative to the pathology of interest but which bear some perceived risk. We apply the method to probabilistic Boolean networks, but the theory applies to any Markovian GRN. Supplementary materials, including the simulation results, MATLAB source code and description of suboptimal methods are available at http://gsp.tamu.edu/Publications/supplementary/yousefi13b. edward@ece.tamu.edu Supplementary data are available at Bioinformatics online.

  5. Data identification for improving gene network inference using computational algebra.

    Science.gov (United States)

    Dimitrova, Elena; Stigler, Brandilyn

    2014-11-01

    Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.

  6. Gene co-expression analysis identifies gene clusters associated with isotropic and polarized growth in Aspergillus fumigatus conidia.

    Science.gov (United States)

    Baltussen, Tim J H; Coolen, Jordy P M; Zoll, Jan; Verweij, Paul E; Melchers, Willem J G

    2018-04-26

    Aspergillus fumigatus is a saprophytic fungus that extensively produces conidia. These microscopic asexually reproductive structures are small enough to reach the lungs. Germination of conidia followed by hyphal growth inside human lungs is a key step in the establishment of infection in immunocompromised patients. RNA-Seq was used to analyze the transcriptome of dormant and germinating A. fumigatus conidia. Construction of a gene co-expression network revealed four gene clusters (modules) correlated with a growth phase (dormant, isotropic growth, polarized growth). Transcripts levels of genes encoding for secondary metabolites were high in dormant conidia. During isotropic growth, transcript levels of genes involved in cell wall modifications increased. Two modules encoding for growth and cell cycle/DNA processing were associated with polarized growth. In addition, the co-expression network was used to identify highly connected intermodular hub genes. These genes may have a pivotal role in the respective module and could therefore be compelling therapeutic targets. Generally, cell wall remodeling is an important process during isotropic and polarized growth, characterized by an increase of transcripts coding for hyphal growth and cell cycle/DNA processing when polarized growth is initiated. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Transcriptional analysis of the MrpJ network: modulation of diverse virulence-associated genes and direct regulation of mrp fimbrial and flhDC flagellar operons in Proteus mirabilis.

    Science.gov (United States)

    Bode, Nadine J; Debnath, Irina; Kuan, Lisa; Schulfer, Anjelique; Ty, Maureen; Pearson, Melanie M

    2015-06-01

    The enteric bacterium Proteus mirabilis is associated with a significant number of catheter-associated urinary tract infections (UTIs). Strict regulation of the antagonistic processes of adhesion and motility, mediated by fimbriae and flagella, respectively, is essential for disease progression. Previously, the transcriptional regulator MrpJ, which is encoded by the mrp fimbrial operon, has been shown to repress both swimming and swarming motility. Here we show that MrpJ affects an array of cellular processes beyond adherence and motility. Microarray analysis found that expression of mrpJ mimicking levels observed during UTIs leads to differential expression of 217 genes related to, among other functions, bacterial virulence, type VI secretion, and metabolism. We probed the molecular mechanism of transcriptional regulation by MrpJ using transcriptional reporters and chromatin immunoprecipitation (ChIP). Binding of MrpJ to two virulence-associated target gene promoters, the promoters of the flagellar master regulator flhDC and mrp itself, appears to be affected by the condensation state of the native chromosome, although both targets share a direct MrpJ binding site proximal to the transcriptional start. Furthermore, an mrpJ deletion mutant colonized the bladders of mice at significantly lower levels in a transurethral model of infection. Additionally, we observed that mrpJ is widely conserved in a collection of recent clinical isolates. Altogether, these findings support a role of MrpJ as a global regulator of P. mirabilis virulence. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  8. Evolutionary Analysis of DELLA-Associated Transcriptional Networks

    Directory of Open Access Journals (Sweden)

    Miguel A. Blázquez

    2017-04-01

    Full Text Available DELLA proteins are transcriptional regulators present in all land plants which have been shown to modulate the activity of over 100 transcription factors in Arabidopsis, involved in multiple physiological and developmental processes. It has been proposed that DELLAs transduce environmental information to pre-wired transcriptional circuits because their stability is regulated by gibberellins (GAs, whose homeostasis largely depends on environmental signals. The ability of GAs to promote DELLA degradation coincides with the origin of vascular plants, but the presence of DELLAs in other land plants poses at least two questions: what regulatory properties have DELLAs provided to the behavior of transcriptional networks in land plants, and how has the recruitment of DELLAs by GA signaling affected this regulation. To address these issues, we have constructed gene co-expression networks of four different organisms within the green lineage with different properties regarding DELLAs: Arabidopsis thaliana and Solanum lycopersicum (both with GA-regulated DELLA proteins, Physcomitrella patens (with GA-independent DELLA proteins and Chlamydomonas reinhardtii (a green alga without DELLA, and we have examined the relative evolution of the subnetworks containing the potential DELLA-dependent transcriptomes. Network analysis indicates a relative increase in parameters associated with the degree of interconnectivity in the DELLA-associated subnetworks of land plants, with a stronger effect in species with GA-regulated DELLA proteins. These results suggest that DELLAs may have played a role in the coordination of multiple transcriptional programs along evolution, and the function of DELLAs as regulatory ‘hubs’ became further consolidated after their recruitment by GA signaling in higher plants.

  9. CCNA Cisco Certified Network Associate Study Guide

    CERN Document Server

    Lammle, Todd

    2011-01-01

    Learn from the Best - Cisco Networking Authority Todd LammleWritten by Cisco networking authority Todd Lammle, this comprehensive guide has been completely updated to reflect the latest CCNA 640-802 exam. Todd's straightforward style provides lively examples, hands on and written labs, easy-to-understand analogies, and real-world scenarios that will not only help you prepare for the exam, but also give you a solid foundation as a Cisco networking professional.This Study Guide teaches you how toDescribe how a network worksConfigure, verify and troubleshoot a switch with VLANs and interswitch co

  10. Explorative data analysis of MCL reveals gene expression networks implicated in survival and prognosis supported by explorative CGH analysis

    International Nuclear Information System (INIS)

    Blenk, Steffen; Engelmann, Julia C; Pinkert, Stefan; Weniger, Markus; Schultz, Jörg; Rosenwald, Andreas; Müller-Hermelink, Hans K; Müller, Tobias; Dandekar, Thomas

    2008-01-01

    Mantle cell lymphoma (MCL) is an incurable B cell lymphoma and accounts for 6% of all non-Hodgkin's lymphomas. On the genetic level, MCL is characterized by the hallmark translocation t(11;14) that is present in most cases with few exceptions. Both gene expression and comparative genomic hybridization (CGH) data vary considerably between patients with implications for their prognosis. We compare patients over and below the median of survival. Exploratory principal component analysis of gene expression data showed that the second principal component correlates well with patient survival. Explorative analysis of CGH data shows the same correlation. On chromosome 7 and 9 specific genes and bands are delineated which improve prognosis prediction independent of the previously described proliferation signature. We identify a compact survival predictor of seven genes for MCL patients. After extensive re-annotation using GEPAT, we established protein networks correlating with prognosis. Well known genes (CDC2, CCND1) and further proliferation markers (WEE1, CDC25, aurora kinases, BUB1, PCNA, E2F1) form a tight interaction network, but also non-proliferative genes (SOCS1, TUBA1B CEBPB) are shown to be associated with prognosis. Furthermore we show that aggressive MCL implicates a gene network shift to higher expressed genes in late cell cycle states and refine the set of non-proliferative genes implicated with bad prognosis in MCL. The results from explorative data analysis of gene expression and CGH data are complementary to each other. Including further tests such as Wilcoxon rank test we point both to proliferative and non-proliferative gene networks implicated in inferior prognosis of MCL and identify suitable markers both in gene expression and CGH data

  11. The Medical Library Association Benchmarking Network: results.

    Science.gov (United States)

    Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C; Smith, Bernie Todd

    2006-04-01

    This article presents some limited results from the Medical Library Association (MLA) Benchmarking Network survey conducted in 2002. Other uses of the data are also presented. After several years of development and testing, a Web-based survey opened for data input in December 2001. Three hundred eighty-five MLA members entered data on the size of their institutions and the activities of their libraries. The data from 344 hospital libraries were edited and selected for reporting in aggregate tables and on an interactive site in the Members-Only area of MLANET. The data represent a 16% to 23% return rate and have a 95% confidence level. Specific questions can be answered using the reports. The data can be used to review internal processes, perform outcomes benchmarking, retest a hypothesis, refute a previous survey findings, or develop library standards. The data can be used to compare to current surveys or look for trends by comparing the data to past surveys. The impact of this project on MLA will reach into areas of research and advocacy. The data will be useful in the everyday working of small health sciences libraries as well as provide concrete data on the current practices of health sciences libraries.

  12. The Medical Library Association Benchmarking Network: results*

    Science.gov (United States)

    Dudden, Rosalind Farnam; Corcoran, Kate; Kaplan, Janice; Magouirk, Jeff; Rand, Debra C.; Smith, Bernie Todd

    2006-01-01

    Objective: This article presents some limited results from the Medical Library Association (MLA) Benchmarking Network survey conducted in 2002. Other uses of the data are also presented. Methods: After several years of development and testing, a Web-based survey opened for data input in December 2001. Three hundred eighty-five MLA members entered data on the size of their institutions and the activities of their libraries. The data from 344 hospital libraries were edited and selected for reporting in aggregate tables and on an interactive site in the Members-Only area of MLANET. The data represent a 16% to 23% return rate and have a 95% confidence level. Results: Specific questions can be answered using the reports. The data can be used to review internal processes, perform outcomes benchmarking, retest a hypothesis, refute a previous survey findings, or develop library standards. The data can be used to compare to current surveys or look for trends by comparing the data to past surveys. Conclusions: The impact of this project on MLA will reach into areas of research and advocacy. The data will be useful in the everyday working of small health sciences libraries as well as provide concrete data on the current practices of health sciences libraries. PMID:16636703

  13. Characterizing genes with distinct methylation patterns in the context of protein-protein interaction network: application to human brain tissues.

    Science.gov (United States)

    Li, Yongsheng; Xu, Juan; Chen, Hong; Zhao, Zheng; Li, Shengli; Bai, Jing; Wu, Aiwei; Jiang, Chunjie; Wang, Yuan; Su, Bin; Li, Xia

    2013-01-01

    DNA methylation is an essential epigenetic mechanism involved in transcriptional control. However, how genes with different methylation patterns are assembled in the protein-protein interaction network (PPIN) remains a mystery. In the present study, we systematically dissected the characterization of genes with different methylation patterns in the PPIN. A negative association was detected between the methylation levels in the brain tissues and topological centralities. By focusing on two classes of genes with considerably different methylation levels in the brain tissues, namely the low methylated genes (LMGs) and high methylated genes (HMGs), we found that their organizing principles in the PPIN are distinct. The LMGs tend to be the center of the PPIN, and attacking them causes a more deleterious effect on the network integrity. Furthermore, the LMGs express their functions in a modular pattern and substantial differences in functions are observed between the two types of genes. The LMGs are enriched in the basic biological functions, such as binding activity and regulation of transcription. More importantly, cancer genes, especially recessive cancer genes, essential genes, and aging-related genes were all found more often in the LMGs. Additionally, our analysis presented that the intra-classes communications are enhanced, but inter-classes communications are repressed. Finally, a functional complementation was revealed between methylation and miRNA regulation in the human genome. We have elucidated the assembling principles of genes with different methylation levels in the context of the PPIN, providing key insights into the complex epigenetic regulation mechanisms.

  14. Gene, protein and network of male sterility in rice

    Directory of Open Access Journals (Sweden)

    Wang eKun

    2013-04-01

    Full Text Available Rice is one of the most important model crop plants whose heterosis has been well exploited in commercial hybrid seed production via a variety of types of male sterile lines. Hybrid rice cultivation area is steadily expanding around the world, especially in Southern Asia. Characterization of genes and proteins related to male sterility aims to understand how and why the male sterility occurs, and which proteins are the key players for microspores abortion. Recently, a series of genes and proteins related to cytoplasmic male sterility, photoperiod sensitive male sterility, self-incompatibility and other types of microspores deterioration have been characterized through genetics or proteomics. Especially the latter, offers us a powerful and high throughput approach to discern the novel proteins involving in male-sterile pathways which may help us to breed artificial male-sterile system. This represents an alternative tool to meet the critical challenge of further development of hybrid rice. In this paper, we reviewed the recent developments in our understanding of male sterility in rice hybrid production across gene, protein and integrated network levels, and also, present a perspective on the engineering of male sterile lines for hybrid rice production.

  15. Learning gene networks under SNP perturbations using eQTL datasets.

    Directory of Open Access Journals (Sweden)

    Lingxue Zhang

    2014-02-01

    Full Text Available The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulations of the differentially-expressed genes. As an alternative, genetical genomics study has been proposed to treat naturally-occurring genetic variants as potential perturbants of gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite many advantages of genetical genomics data analysis, the computational challenge that the effects of multifactorial genetic perturbations should be decoded simultaneously from data has prevented a widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs that perturb this network. While our statistical model captures direct genetic perturbations of gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network

  16. Gene-based Association Approach Identify Genes Across Stress Traits in Fruit Flies

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Edwards, Stefan McKinnon; Sarup, Pernille Merete

    Identification of genes explaining variation in quantitative traits or genetic risk factors of human diseases requires both good phenotypic- and genotypic data, but also efficient statistical methods. Genome-wide association studies may reveal association between phenotypic variation and variation...... approach grouping variants accordingly to gene position, thus lowering the number of statistical tests performed and increasing the probability of identifying genes with small to moderate effects. Using this approach we identify numerous genes associated with different types of stresses in Drosophila...... melanogaster, but also identify common genes that affects the stress traits....

  17. Building gene co-expression networks using transcriptomics data for systems biology investigations

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Watson-Haigh, Nathan S.

    2012-01-01

    Gene co-expression networks (GCN), built using high-throughput gene expression data are fundamental aspects of systems biology. The main aims of this study were to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four......) is connected within a network. The two GCN construction methods used were, Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT) methods. Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT...... (with > 20000 genes) access to large computer clusters, particularly those with larger amounts of shared memory is recommended....

  18. Chemical-gene interaction networks and causal reasoning for ...

    Science.gov (United States)

    Evaluating the potential human health and ecological risks associated with exposures to complex chemical mixtures in the environment is one of the main challenges of chemical safety assessment and environmental protection. There is a need for approaches that can help to integrate chemical monitoring and biological effects data to evaluate risks associated with chemicals present in the environment. Here, we used prior knowledge about chemical-gene interactions to develop a knowledge assembly model for detected chemicals at five locations near the North Branch and Chisago wastewater treatment plants (WWTP) in the St. Croix River Basin, MN and WI. The assembly model was used to generate hypotheses about the biological impacts of the chemicals at each location. The hypotheses were tested using empirical hepatic gene expression data from fathead minnows exposed for 12 d at each location. Empirical gene expression data were also mapped to the assembly models to evaluate the likelihood of a chemical contributing to the observed biological responses using richness and concordance statistics. The prior knowledge approach was able predict the observed biological pathways impacted at one site but not the other. Atrazine was identified as a potential contributor to the observed gene expression responses at a location upstream of the North Branch WTTP. Four chemicals were identified as contributors to the observed biological responses at the effluent and downstream o

  19. Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

    Directory of Open Access Journals (Sweden)

    Xiaobo Guo

    Full Text Available Nonlinear dependence is general in regulation mechanism of gene regulatory networks (GRNs. It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC has been shown powerful and computationally effective in nonlinear dependence for many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underling distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI-based algorithms by analyzing two simulated data: benchmark GRNs from the DREAM challenge and GRNs generated by SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC curve and the precision-recall (PR curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference.

  20. Weighted gene co-expression network analysis of the peripheral blood from Amyotrophic Lateral Sclerosis patients

    Directory of Open Access Journals (Sweden)

    DeYoung Joseph

    2009-08-01

    Full Text Available Abstract Background Amyotrophic Lateral Sclerosis (ALS is a lethal disorder characterized by progressive degeneration of motor neurons in the brain and spinal cord. Diagnosis is mainly based on clinical symptoms, and there is currently no therapy to stop the disease or slow its progression. Since access to spinal cord tissue is not possible at disease onset, we investigated changes in gene expression profiles in whole blood of ALS patients. Results Our transcriptional study showed dramatic changes in blood of ALS patients; 2,300 probes (9.4% showed significant differential expression in a discovery dataset consisting of 30 ALS patients and 30 healthy controls. Weighted gene co-expression network analysis (WGCNA was used to find disease-related networks (modules and disease related hub genes. Two large co-expression modules were found to be associated with ALS. Our findings were replicated in a second (30 patients and 30 controls and third dataset (63 patients and 63 controls, thereby demonstrating a highly significant and consistent association of two large co-expression modules with ALS disease status. Ingenuity Pathway Analysis of the ALS related module genes implicates enrichment of functional categories related to genetic disorders, neurodegeneration of the nervous system and inflammatory disease. The ALS related modules contain a number of candidate genes possibly involved in pathogenesis of ALS. Conclusion This first large-scale blood gene expression study in ALS observed distinct patterns between cases and controls which may provide opportunities for biomarker development as well as new insights into the molecular mechanisms of the disease.

  1. MINER: exploratory analysis of gene interaction networks by machine learning from expression data

    Directory of Open Access Journals (Sweden)

    Sivieng Jane

    2009-12-01

    Full Text Available Abstract Background The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. Results We have developed MINER (Microarray Interactive Network Exploration and Representation, an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Conclusion Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.

  2. Endothelial nitric oxide synthase gene polymorphisms associated ...

    African Journals Online (AJOL)

    STORAGESEVER

    2010-05-24

    May 24, 2010 ... chronic periodontitis (CP), 31 with gingivitis (G) and 50 healthy controls. Probing depth ..... Periodontal disease in pregnancy I. Prevalence and severity. ... endothelial nitric oxide synthase gene in premenopausal women with.

  3. Sparse encoding of automatic visual association in hippocampal networks

    DEFF Research Database (Denmark)

    Hulme, Oliver J; Skov, Martin; Chadwick, Martin J

    2014-01-01

    Intelligent action entails exploiting predictions about associations between elements of ones environment. The hippocampus and mediotemporal cortex are endowed with the network topology, physiology, and neurochemistry to automatically and sparsely code sensori-cognitive associations that can...

  4. Reliable Uplink Communication through Double Association in Wireless Heterogeneous Networks

    DEFF Research Database (Denmark)

    Kim, Dong Min; Popovski, Petar

    2016-01-01

    We investigate methods for network association that improve the reliability of uplink transmissions in dense wireless heterogeneous networks. The stochastic geometry analysis shows that the double association, in which an uplink transmission is transmitted to a macro Base Station (BS) and small BS...

  5. Network-based ranking methods for prediction of novel disease associated microRNAs.

    Science.gov (United States)

    Le, Duc-Hau

    2015-10-01

    Many studies have shown roles of microRNAs on human disease and a number of computational methods have been proposed to predict such associations by ranking candidate microRNAs according to their relevance to a disease. Among them, machine learning-based methods usually have a limitation in specifying non-disease microRNAs as negative training samples. Meanwhile, network-based methods are becoming dominant since they well exploit a "disease module" principle in microRNA functional similarity networks. Of which, random walk with restart (RWR) algorithm-based method is currently state-of-the-art. The use of this algorithm was inspired from its success in predicting disease gene because the "disease module" principle also exists in protein interaction networks. Besides, many algorithms designed for webpage ranking have been successfully applied in ranking disease candidate genes because web networks share topological properties with protein interaction networks. However, these algorithms have not yet been utilized for disease microRNA prediction. We constructed microRNA functional similarity networks based on shared targets of microRNAs, and then we integrated them with a microRNA functional synergistic network, which was recently identified. After analyzing topological properties of these networks, in addition to RWR, we assessed the performance of (i) PRINCE (PRIoritizatioN and Complex Elucidation), which was proposed for disease gene prediction; (ii) PageRank with Priors (PRP) and K-Step Markov (KSM), which were used for studying web networks; and (iii) a neighborhood-based algorithm. Analyses on topological properties showed that all microRNA functional similarity networks are small-worldness and scale-free. The performance of each algorithm was assessed based on average AUC values on 35 disease phenotypes and average rankings of newly discovered disease microRNAs. As a result, the performance on the integrated network was better than that on individual ones. In

  6. Horizontal Acquisition and Transcriptional Integration of Novel Genes in Mosquito-Associated Spiroplasma.

    Science.gov (United States)

    Lo, Wen-Sui; Kuo, Chih-Horng

    2017-12-01

    Genetic differentiation among symbiotic bacteria is important in shaping biodiversity. The genus Spiroplasma contains species occupying diverse niches and is a model system for symbiont evolution. Previous studies have established that two mosquito-associated species have diverged extensively in their carbohydrate metabolism genes despite having a close phylogenetic relationship. Notably, although the commensal Spiroplasma diminutum lacks identifiable pathogenicity factors, the pathogenic Spiroplasma taiwanense was found to have acquired a virulence factor glpO and its associated genes through horizontal transfer. However, it is unclear if these acquired genes have been integrated into the regulatory network. In this study, we inferred the gene content evolution in these bacteria, as well as examined their transcriptomes in response to glucose availability. The results indicated that both species have many more gene acquisitions from the Mycoides-Entomoplasmataceae clade, which contains several important pathogens of ruminants, than previously thought. Moreover, several acquired genes have higher expression levels than the vertically inherited homologs, indicating possible functional replacement. Finally, the virulence factor and its functionally linked genes in S. taiwanense were up-regulated in response to glucose starvation, suggesting that these acquired genes are under expression regulation and the pathogenicity may be a stress response. In summary, although differential gene losses are a major process for symbiont divergence, gene gains are critical in counteracting genome degradation and driving diversification among facultative symbionts. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Analysis of deterministic cyclic gene regulatory network models with delays

    CERN Document Server

    Ahsen, Mehmet Eren; Niculescu, Silviu-Iulian

    2015-01-01

    This brief examines a deterministic, ODE-based model for gene regulatory networks (GRN) that incorporates nonlinearities and time-delayed feedback. An introductory chapter provides some insights into molecular biology and GRNs. The mathematical tools necessary for studying the GRN model are then reviewed, in particular Hill functions and Schwarzian derivatives. One chapter is devoted to the analysis of GRNs under negative feedback with time delays and a special case of a homogenous GRN is considered. Asymptotic stability analysis of GRNs under positive feedback is then considered in a separate chapter, in which conditions leading to bi-stability are derived. Graduate and advanced undergraduate students and researchers in control engineering, applied mathematics, systems biology and synthetic biology will find this brief to be a clear and concise introduction to the modeling and analysis of GRNs.

  8. Small RNA-Controlled Gene Regulatory Networks in Pseudomonas putida

    DEFF Research Database (Denmark)

    Bojanovic, Klara

    evolved numerous mechanisms to controlgene expression in response to specific environmental signals. In addition to two-component systems, small regulatory RNAs (sRNAs) have emerged as major regulators of gene expression. The majority of sRNAs bind to mRNA and regulate their expression. They often have...... multiple targets and are incorporated into large regulatory networks and the RNA chaper one Hfq in many cases facilitates interactions between sRNAs and their targets. Some sRNAs also act by binding to protein targets and sequestering their function. In this PhD thesis we investigated the transcriptional....... Detailed insights into the mechanisms through which P. putida responds to different stress conditions and increased understanding of bacterial adaptation in natural and industrial settings were gained. Additionally, we identified genome-wide transcription start sites, andmany regulatory RNA elements...

  9. The transcriptional and gene regulatory network of Lactococcus lactis MG1363 during growth in milk.

    Directory of Open Access Journals (Sweden)

    Anne de Jong

    Full Text Available In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks.

  10. No Evidence That Schizophrenia Candidate Genes Are More Associated With Schizophrenia Than Noncandidate Genes.

    Science.gov (United States)

    Johnson, Emma C; Border, Richard; Melroy-Greif, Whitney E; de Leeuw, Christiaan A; Ehringer, Marissa A; Keller, Matthew C

    2017-11-15

    A recent analysis of 25 historical candidate gene polymorphisms for schizophrenia in the largest genome-wide association study conducted to date suggested that these commonly studied variants were no more associated with the disorder than would be expected by chance. However, the same study identified other variants within those candidate genes that demonstrated genome-wide significant associations with schizophrenia. As such, it is possible that variants within historic schizophrenia candidate genes are associated with schizophrenia at levels above those expected by chance, even if the most-studied specific polymorphisms are not. The present study used association statistics from the largest schizophrenia genome-wide association study conducted to date as input to a gene set analysis to investigate whether variants within schizophrenia candidate genes are enriched for association with schizophrenia. As a group, variants in the most-studied candidate genes were no more associated with schizophrenia than were variants in control sets of noncandidate genes. While a small subset of candidate genes did appear to be significantly associated with schizophrenia, these genes were not particularly noteworthy given the large number of more strongly associated noncandidate genes. The history of schizophrenia research should serve as a cautionary tale to candidate gene investigators examining other phenotypes: our findings indicate that the most investigated candidate gene hypotheses of schizophrenia are not well supported by genome-wide association studies, and it is likely that this will be the case for other complex traits as well. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  11. Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks.

    Science.gov (United States)

    Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J

    2013-01-01

    We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.

  12. An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series.

    Science.gov (United States)

    Wang, Zidong; Liu, Xiaohui; Liu, Yurong; Liang, Jinling; Vinciotti, Veronica

    2009-01-01

    In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels. It is shown that the EKF algorithm is an online estimation algorithm that can identify a large number of parameters (including parameters of nonlinear functions) through iterative procedure by using a small number of observations. Four real-world gene expression data sets are employed to demonstrate the effectiveness of the EKF algorithm, and the obtained models are evaluated from the viewpoint of bioinformatics.

  13. Protein-Protein Interaction Network and Gene Ontology

    Science.gov (United States)

    Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah

    Evolution of computer technologies makes it possible to access a large amount and various kinds of biological data via internet such as DNA sequences, proteomics data and information discovered about them. It is expected that the combination of various data could help researchers find further knowledge about them. Roles of a visualization system are to invoke human abilities to integrate information and to recognize certain patterns in the data. Thus, when the various kinds of data are examined and analyzed manually, an effective visualization system is an essential part. One instance of these integrated visualizations can be combination of protein-protein interaction (PPI) data and Gene Ontology (GO) which could help enhance the analysis of PPI network. We introduce a simple but comprehensive visualization system that integrates GO and PPI data where GO and PPI graphs are visualized side-by-side and supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and GO directedacyclic- graph such as context-based browsing and common ancestors finding.

  14. Influence of the experimental design of gene expression studies on the inference of gene regulatory networks: environmental factors

    Directory of Open Access Journals (Sweden)

    Frank Emmert-Streib

    2013-02-01

    Full Text Available The inference of gene regulatory networks gained within recent years a considerable interest in the biology and biomedical community. The purpose of this paper is to investigate the influence that environmental conditions can exhibit on the inference performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I observational gene expression data: normal environmental condition, (II interventional gene expression data: growth in rich media, (III interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.

  15. Integrating network, sequence and functional features using machine learning approaches towards identification of novel Alzheimer genes.

    Science.gov (United States)

    Jamal, Salma; Goyal, Sukriti; Shanker, Asheesh; Grover, Abhinav

    2016-10-18

    Alzheimer's disease (AD) is a complex progressive neurodegenerative disorder commonly characterized by short term memory loss. Presently no effective therapeutic treatments exist that can completely cure this disease. The cause of Alzheimer's is still unclear, however one of the other major factors involved in AD pathogenesis are the genetic factors and around 70 % risk of the disease is assumed to be due to the large number of genes involved. Although genetic association studies have revealed a number of potential AD susceptibility genes, there still exists a need for identification of unidentified AD-associated genes and therapeutic targets to have better understanding of the disease-causing mechanisms of Alzheimer's towards development of effective AD therapeutics. In the present study, we have used machine learning approach to identify candidate AD associated genes by integrating topological properties of the genes from the protein-protein interaction networks, sequence features and functional annotations. We also used molecular docking approach and screened already known anti-Alzheimer drugs against the novel predicted probable targets of AD and observed that an investigational drug, AL-108, had high affinity for majority of the possible therapeutic targets. Furthermore, we performed molecular dynamics simulations and MM/GBSA calculations on the docked complexes to validate our preliminary findings. To the best of our knowledge, this is the first comprehensive study of its kind for identification of putative Alzheimer-associated genes using machine learning approaches and we propose that such computational studies can improve our understanding on the core etiology of AD which could lead to the development of effective anti-Alzheimer drugs.

  16. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    Science.gov (United States)

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Associations mapping identified 229-292 genes per wood trait using a statistical significance level of P wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  17. Global gene expression profiling displays a network of dysregulated genes in non-atherosclerotic arterial tissue from patients with type 2 diabetes

    Directory of Open Access Journals (Sweden)

    Skov Vibe

    2012-02-01

    Full Text Available Abstract Background Generalized arterial alterations, such as endothelial dysfunction, medial matrix accumulations, and calcifications are associated with type 2 diabetes (T2D. These changes may render the vessel wall more susceptible to injury; however, the molecular characteristics of such diffuse pre-atherosclerotic changes in diabetes are only superficially known. Methods To identify the molecular alterations of the generalized arterial disease in T2D, DNA microarrays were applied to examine gene expression changes in normal-appearing, non-atherosclerotic arterial tissue from 10 diabetic and 11 age-matched non-diabetic men scheduled for a coronary by-pass operation. Gene expression changes were integrated with GO-Elite, GSEA, and Cytoscape to identify significant biological pathways and networks. Results Global pathway analysis revealed differential expression of gene-sets representing matrix metabolism, triglyceride synthesis, inflammation, insulin signaling, and apoptosis. The network analysis showed a significant cluster of dysregulated genes coding for both intra- and extra-cellular proteins associated with vascular cell functions together with genes related to insulin signaling and matrix remodeling. Conclusions Our results identify pathways and networks involved in the diffuse vasculopathy present in non-atherosclerotic arterial tissue in patients with T2D and confirmed previously observed mRNA-alterations. These abnormalities may play a role for the arterial response to injury and putatively for the accelerated atherogenesis among patients with diabetes.

  18. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.

  19. Genome-wide profiling of 24 hr diel rhythmicity in the water flea, Daphnia pulex: network analysis reveals rhythmic gene expression and enhances functional gene annotation.

    Science.gov (United States)

    Rund, Samuel S C; Yoo, Boyoung; Alam, Camille; Green, Taryn; Stephens, Melissa T; Zeng, Erliang; George, Gary F; Sheppard, Aaron D; Duffield, Giles E; Milenković, Tijana; Pfrender, Michael E

    2016-08-18

    Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel

  20. Association of MITF gene with hearing and pigmentation phenotype ...

    Indian Academy of Sciences (India)

    2014-07-24

    Jul 24, 2014 ... 1Department of Biology, Plant Growth and Development, University of Antwerp, .... network of many developmental pathways (reviewed by ... The MITF gene in many studied species has over 3 kb of untranslated region (UTR) at the 3 end, extending from ... Thus, as in the case of other mammalian species.

  1. Association between plasma metabolites and gene expression profiles in five porcine endocrine tissues

    Directory of Open Access Journals (Sweden)

    Bassols Anna

    2011-07-01

    Full Text Available Abstract Background Endocrine tissues play a fundamental role in maintaining homeostasis of plasma metabolites such as non-esterified fatty acids and glucose, the levels of which reflect the energy balance or the health status of animals. However, the relationship between the transcriptome of endocrine tissues and plasma metabolites has been poorly studied. Methods We determined the blood levels of 12 plasma metabolites in 27 pigs belonging to five breeds, each breed consisting of both females and males. The transcriptome of five endocrine tissues i.e. hypothalamus, adenohypophysis, thyroid gland, gonads and backfat tissues from 16 out of the 27 pigs was also determined. Sex and breed effects on the 12 plasma metabolites were investigated and associations between genes expressed in the five endocrine tissues and the 12 plasma metabolites measured were analyzed. A probeset was defined as a quantitative trait transcript (QTT when its association with a particular metabolic trait achieved a nominal P value Results A larger than expected number of QTT was found for non-esterified fatty acids and alanine aminotransferase in at least two tissues. The associations were highly tissue-specific. The QTT within the tissues were divided into co-expression network modules enriched for genes in Kyoto Encyclopedia of Genes and Genomes or gene ontology categories that are related to the physiological functions of the corresponding tissues. We also explored a multi-tissue co-expression network using QTT for non-esterified fatty acids from the five tissues and found that a module, enriched in hypothalamus QTT, was positioned at the centre of the entire multi-tissue network. Conclusions These results emphasize the relationships between endocrine tissues and plasma metabolites in terms of gene expression. Highly tissue-specific association patterns suggest that candidate genes or gene pathways should be investigated in the context of specific tissues.

  2. Community Structure Analysis of Gene Interaction Networks in Duchenne Muscular Dystrophy.

    Directory of Open Access Journals (Sweden)

    Tejaswini Narayanan

    Full Text Available Duchenne Muscular Dystrophy (DMD is an important pathology associated with the human skeletal muscle and has been studied extensively. Gene expression measurements on skeletal muscle of patients afflicted with DMD provides the opportunity to understand the underlying mechanisms that lead to the pathology. Community structure analysis is a useful computational technique for understanding and modeling genetic interaction networks. In this paper, we leverage this technique in combination with gene expression measurements from normal and DMD patient skeletal muscle tissue to study the structure of genetic interactions in the context of DMD. We define a novel framework for transforming a raw dataset of gene expression measurements into an interaction network, and subsequently apply algorithms for community structure analysis for the extraction of topological communities. The emergent communities are analyzed from a biological standpoint in terms of their constituent biological pathways, and an interpretation that draws correlations between functional and structural organization of the genetic interactions is presented. We also compare these communities and associated functions in pathology against those in normal human skeletal muscle. In particular, differential enhancements are observed in the following pathways between pathological and normal cases: Metabolic, Focal adhesion, Regulation of actin cytoskeleton and Cell adhesion, and implication of these mechanisms are supported by prior work. Furthermore, our study also includes a gene-level analysis to identify genes that are involved in the coupling between the pathways of interest. We believe that our results serve to highlight important distinguishing features in the structural/functional organization of constituent biological pathways, as it relates to normal and DMD cases, and provide the mechanistic basis for further biological investigations into specific pathways differently regulated

  3. Critical dynamics in associative memory networks

    Directory of Open Access Journals (Sweden)

    Maximilian eUhlig

    2013-07-01

    Full Text Available Critical behavior in neural networks is characterized by scale-free avalanche size distributions and can be explained by self-regulatory mechanisms. Theoretical and experimental evidence indicates that information storage capacity reaches its maximum in the critical regime. We study the effect of structural connectivity formed by Hebbian learning on the criticality of network dynamics. The network endowed with Hebbian learning only does not allow for simultaneous information storage and criticality. However, the critical regime is can be stabilized by short-term synaptic dynamics in the form of synaptic depression and facilitation or, alternatively, by homeostatic adaptation of the synaptic weights. We show that a heterogeneous distribution of maximal synaptic strengths does not preclude criticality if the Hebbian learning is alternated with periods of critical dynamics recovery. We discuss the relevance of these findings for the flexibility of memory in aging and with respect to the recent theory of synaptic plasticity.

  4. GWATCH: a web platform for automated gene association discovery analysis

    Science.gov (United States)

    2014-01-01

    Background As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661

  5. Topology influences performance in the associative memory neural networks

    International Nuclear Information System (INIS)

    Lu Jianquan; He Juan; Cao Jinde; Gao Zhiqiang

    2006-01-01

    To explore how topology affects performance within Hopfield-type associative memory neural networks (AMNNs), we studied the computational performance of the neural networks with regular lattice, random, small-world, and scale-free structures. In this Letter, we found that the memory performance of neural networks obtained through asynchronous updating from 'larger' nodes to 'smaller' nodes are better than asynchronous updating in random order, especially for the scale-free topology. The computational performance of associative memory neural networks linked by the above-mentioned network topologies with the same amounts of nodes (neurons) and edges (synapses) were studied respectively. Along with topologies becoming more random and less locally disordered, we will see that the performance of associative memory neural network is quite improved. By comparing, we show that the regular lattice and random network form two extremes in terms of patterns stability and retrievability. For a network, its patterns stability and retrievability can be largely enhanced by adding a random component or some shortcuts to its structured component. According to the conclusions of this Letter, we can design the associative memory neural networks with high performance and minimal interconnect requirements

  6. Dissemination Patterns and Associated Network Effects of Sentiments in Social Networks

    DEFF Research Database (Denmark)

    Hillmann, Robert; Trier, Matthias

    2012-01-01

    . The dissemination patterns analyzed in this study consist of network motifs based on triples of actors and the ties among them. These motifs are associated with common social network effects to derive meaningful insights about dissemination activities. The data basis includes several thousand social networks...... with textual messages classified according to embedded positive and negative sentiments. Based on this data, sub-networks are extracted and analyzed with a dynamic network motif analysis to determine dissemination patterns and associated network effects. Results indicate that the emergence of digital social...... networks exhibits a strong tendency towards reciprocity, followed by the dominance of hierarchy as an intermediate step leading to social clustering with hubs and transitivity effects for both positive and negative sentiments to the same extend. Sentiments embedded in exchanged textual messages do only...

  7. Gene networks specific for innate immunity define post-traumatic stress disorder.

    Science.gov (United States)

    Breen, M S; Maihofer, A X; Glatt, S J; Tylee, D S; Chandler, S D; Tsuang, M T; Risbrough, V B; Baker, D G; O'Connor, D T; Nievergelt, C M; Woelk, C H

    2015-12-01

    The molecular factors involved in the development of Post-Traumatic Stress Disorder (PTSD) remain poorly understood. Previous transcriptomic studies investigating the mechanisms of PTSD apply targeted approaches to identify individual genes under a cross-sectional framework lack a holistic view of the behaviours and properties of these genes at the system-level. Here we sought to apply an unsupervised gene-network based approach to a prospective experimental design using whole-transcriptome RNA-Seq gene expression from peripheral blood leukocytes of U.S. Marines (N=188), obtained both pre- and post-deployment to conflict zones. We identified discrete groups of co-regulated genes (i.e., co-expression modules) and tested them for association to PTSD. We identified one module at both pre- and post-deployment containing putative causal signatures for PTSD development displaying an over-expression of genes enriched for functions of innate-immune response and interferon signalling (Type-I and Type-II). Importantly, these results were replicated in a second non-overlapping independent dataset of U.S. Marines (N=96), further outlining the role of innate immune and interferon signalling genes within co-expression modules to explain at least part of the causal pathophysiology for PTSD development. A second module, consequential of trauma exposure, contained PTSD resiliency signatures and an over-expression of genes involved in hemostasis and wound responsiveness suggesting that chronic levels of stress impair proper wound healing during/after exposure to the battlefield while highlighting the role of the hemostatic system as a clinical indicator of chronic-based stress. These findings provide novel insights for early preventative measures and advanced PTSD detection, which may lead to interventions that delay or perhaps abrogate the development of PTSD.

  8. Integration of metabolic and gene regulatory networks modulates the C. elegans dietary response.

    Science.gov (United States)

    Watson, Emma; MacNeil, Lesley T; Arda, H Efsun; Zhu, Lihua Julie; Walhout, Albertha J M

    2013-03-28

    Expression profiles are tailored according to dietary input. However, the networks that control dietary responses remain largely uncharacterized. Here, we combine forward and reverse genetic screens to delineate a network of 184 genes that affect the C. elegans dietary response to Comamonas DA1877 bacteria. We find that perturbation of a mitochondrial network composed of enzymes involved in amino acid metabolism and the TCA cycle affects the dietary response. In humans, mutations in the corresponding genes cause inborn diseases of amino acid metabolism, most of which are treated by dietary intervention. We identify several transcription factors (TFs) that mediate the changes in gene expression upon metabolic network perturbations. Altogether, our findings unveil a transcriptional response system that is poised to sense dietary cues and metabolic imbalances, illustrating extensive communication between metabolic networks in the mitochondria and gene regulatory networks in the nucleus. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Association of childhood abuse with homeless women's social networks.

    Science.gov (United States)

    Green, Harold D; Tucker, Joan S; Wenzel, Suzanne L; Golinelli, Daniela; Kennedy, David P; Ryan, Gery W; Zhou, Annie J

    2012-01-01

    Childhood abuse has been linked to negative sequelae for women later in life including drug and alcohol use and violence as victim or perpetrator and may also affect the development of women's social networks. Childhood abuse is prevalent among at-risk populations of women (such as the homeless) and thus may have a stronger impact on their social networks. We conducted a study to: (a) develop a typology of sheltered homeless women's social networks; (b) determine whether childhood abuse was associated with the social networks of sheltered homeless women; and (c) determine whether those associations remained after accounting for past-year substance abuse and recent intimate partner abuse. A probability sample of 428 homeless women from temporary shelter settings in Los Angeles County completed a personal network survey that provided respondent information as well as information about their network members' demographics and level of interaction with each other. Cluster analyses identified groups of women who shared specific social network characteristics. Multinomial logistic regressions revealed variables associated with group membership. We identified three groups of women with differing social network characteristics: low-risk networks, densely connected risky networks (dense, risky), and sparsely connected risky networks (sparse, risky). Multinomial logistic regressions indicated that membership in the sparse, risky network group, when compared to the low-risk group, was associated with history of childhood physical abuse (but not sexual or emotional abuse). Recent drug abuse was associated with membership in both risky network groups; however, the association of childhood physical abuse with sparse, risky network group membership remained. Although these findings support theories proposing that the experience of childhood abuse can shape women's social networks, they suggest that it may be childhood physical abuse that has the most impact among homeless women

  10. Association of -330 interleukin-2 gene polymorphism with oral cancer.

    Science.gov (United States)

    Singh, Prithvi Kumar; Kumar, Vijay; Ahmad, Mohammad Kaleem; Gupta, Rajni; Mahdi, Abbas Ali; Jain, Amita; Bogra, Jaishri; Chandra, Girish

    2017-12-01

    Cytokines play an important role in the development of cancer. Several single-nucleotide polymorphisms (SNPs) of cytokine genes have been reported to be associated with the development and severity of inflammatory diseases and cancer predisposition. This study was undertaken to evaluate a possible association of interleukin 2 (IL-2) (- 330A>C) gene polymorphisms with the susceptibility to oral cancer. The SNP in IL-2 (-330A>C) gene was genotyped in 300 oral cancer patients and in similar number of healthy volunteers by polymerase chain reaction (PCR)-restriction fragment length polymorphism and the association of the gene with the disease was evaluated. IL-2 (-330A>C) gene polymorphism was significantly associated with oral cancer whereas it was neither associated with clinicopathological status nor with cancer pain. The AC heterozygous genotype was significantly associated with oral cancer patients as compared to controls [odds ratio (OR): 3.0; confidence interval (CI): 2.14-4.20; Poral cancer (OR: 1.80; CI: 1.39-2.33; PC) gene polymorphism was also associated with oral cancer in tobacco smokers and chewers. Our results showed that oral cancer patients had significantly higher frequency of AA genotype but significantly lower frequency of AC genotype and C allele compared to controls. The IL-2 AC genotype and C allele of IL-2 (-330A>C) gene polymorphisms could be potential protective factors and might reduce the risk of oral cancer in Indian population.

  11. Dynamic association rules for gene expression data analysis.

    Science.gov (United States)

    Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung

    2015-10-14

    The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes correspond to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm) which helps ones to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis. Four gene expression datasets showed that the proposed

  12. Social network characteristics associated with health promoting behaviors among Latinos.

    Science.gov (United States)

    Marquez, Becky; Elder, John P; Arredondo, Elva M; Madanat, Hala; Ji, Ming; Ayala, Guadalupe X

    2014-06-01

    This study examined the relationship between social network characteristics and health promoting behaviors (having a routine medical check-up, consuming no alcohol, consuming no fast food, and meeting recommendations for leisure-time physical activity and sleep duration) among Latinos to identify potential targets for behavioral interventions. Personal network characteristics and health behavior data were collected from a community sample of 393 adult Latinos (73% women) in San Diego County, California. Network characteristics consisted of size and composition. Network size was calculated by the number of alters listed on a name generator questionnaire eliciting people with whom respondents discussed personal issues. Network composition variables were the proportion of Latinos, Spanish-speakers, females, family, and friends listed in the name generator. Additional network composition variables included marital status and the number of adults or children in the household. Network members were predominately Latinos (95%), Spanish-speakers (80%), females (64%), and family (55%). In multivariate logistic regression analyses, gender moderated the relationship between network composition, but not size, and a health behavior. Married women were more likely to have had a routine medical check-up than married men. For both men and women, having a larger network was associated with meeting the recommendation for leisure-time physical activity. Few social network characteristics were significantly associated with health promoting behaviors, suggesting a need to examine other aspects of social relationships that may influence health behaviors. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  13. In-silico gene co-expression network analysis in Paracoccidioides brasiliensis with reference to haloacid dehalogenase superfamily hydrolase gene

    Directory of Open Access Journals (Sweden)

    Raghunath Satpathy

    2015-01-01

    Full Text Available Context: Paracoccidioides brasiliensis, a dimorphic fungus is the causative agent of paracoccidioidomycosis, a disease globally affecting millions of people. The haloacid dehalogenase (HAD superfamily hydrolases enzyme in the fungi, in particular, is known to be responsible in the pathogenesis by adhering to the tissue. Hence, identification of novel drug targets is essential. Aims: In-silico based identification of co-expressed genes along with HAD superfamily hydrolase in P. brasiliensis during the morphogenesis from mycelium to yeast to identify possible genes as drug targets. Materials and Methods: In total, four datasets were retrieved from the NCBI-gene expression omnibus (GEO database, each containing 4340 genes, followed by gene filtration expression of the data set. Further co-expression (CE study was performed individually and then a combination these genes were visualized in the Cytoscape 2. 8.3. Statistical Analysis Used: Mean and standard deviation value of the HAD superfamily hydrolase gene was obtained from the expression data and this value was subsequently used for the CE calculation purpose by selecting specific correlation power and filtering threshold. Results: The 23 genes that were thus obtained are common with respect to the HAD superfamily hydrolase gene. A significant network was selected from the Cytoscape network visualization that contains total 7 genes out of which 5 genes, which do not have significant protein hits, obtained from gene annotation of the expressed sequence tags by BLAST X. For all the protein PSI-BLAST was performed against human genome to find the homology. Conclusions: The gene co-expression network was obtained with respect to HAD superfamily dehalogenase gene in P. Brasiliensis.

  14. Model checking optimal finite-horizon control for probabilistic gene regulatory networks.

    Science.gov (United States)

    Wei, Ou; Guo, Zonghao; Niu, Yun; Liao, Wenyuan

    2017-12-14

    Probabilistic Boolean networks (PBNs) have been proposed for analyzing external control in gene regulatory networks with incorporation of uncertainty. A context-sensitive PBN with perturbation (CS-PBNp), extending a PBN with context-sensitivity to reflect the inherent biological stability and random perturbations to express the impact of external stimuli, is considered to be more suitable for modeling small biological systems intervened by conditions from the outside. In this paper, we apply probabilistic model checking, a formal verification technique, to optimal control for a CS-PBNp that minimizes the expected cost over a finite control horizon. We first describe a procedure of modeling a CS-PBNp using the language provided by a widely used probabilistic model checker PRISM. We then analyze the reward-based temporal properties and the computation in probabilistic model checking; based on the analysis, we provide a method to formulate the optimal control problem as minimum reachability reward properties. Furthermore, we incorporate control and state cost information into the PRISM code of a CS-PBNp such that automated model checking a minimum reachability reward property on the code gives the solution to the optimal control problem. We conduct experiments on two examples, an apoptosis network and a WNT5A network. Preliminary experiment results show the feasibility and effectiveness of our approach. The approach based on probabilistic model checking for optimal control avoids explicit computation of large-size state transition relations associated with PBNs. It enables a natural depiction of the dynamics of gene regulatory networks, and provides a canonical form to formulate optimal control problems using temporal properties that can be automated solved by leveraging the analysis power of underlying model checking engines. This work will be helpful for further utilization of the advances in formal verification techniques in system biology.

  15. Global similarity and local divergence in human and mouse gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Koonin Eugene V

    2006-09-01

    Full Text Available Abstract Background A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. Results At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction ( Conclusion The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.

  16. Meta-analysis of inter-species liver co-expression networks elucidates traits associated with common human diseases.

    Directory of Open Access Journals (Sweden)

    Kai Wang

    2009-12-01

    Full Text Available Co-expression networks are routinely used to study human diseases like obesity and diabetes. Systematic comparison of these networks between species has the potential to elucidate common mechanisms that are conserved between human and rodent species, as well as those that are species-specific characterizing evolutionary plasticity. We developed a semi-parametric meta-analysis approach for combining gene-gene co-expression relationships across expression profile datasets from multiple species. The simulation results showed that the semi-parametric method is robust against noise. When applied to human, mouse, and rat liver co-expression networks, our method out-performed existing methods in identifying gene pairs with coherent biological functions. We identified a network conserved across species that highlighted cell-cell signaling, cell-adhesion and sterol biosynthesis as main biological processes represented in genome-wide association study candidate gene sets for blood lipid levels. We further developed a heterogeneity statistic to test for network differences among multiple datasets, and demonstrated that genes with species-specific interactions tend to be under positive selection throughout evolution. Finally, we identified a human-specific sub-network regulated by RXRG, which has been validated to play a different role in hyperlipidemia and Type 2 diabetes between human and mouse. Taken together, our approach represents a novel step forward in integrating gene co-expression networks from multiple large scale datasets to leverage not only common information but also differences that are dataset-specific.

  17. A gene co-expression network in whole blood of schizophrenia patients is independent of antipsychotic-use and enriched for brain-expressed genes.

    Directory of Open Access Journals (Sweden)

    Simone de Jong

    Full Text Available Despite large-scale genome-wide association studies (GWAS, the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1, is located in, and regulated by the major histocompatibility (MHC complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network.

  18. Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules.

    Science.gov (United States)

    Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P

    2013-03-21

    Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group

  19. Extracting Association Patterns in Network Communications

    Science.gov (United States)

    Portela, Javier; Villalba, Luis Javier García; Trujillo, Alejandra Guadalupe Silva; Orozco, Ana Lucila Sandoval; Kim, Tai-hoon

    2015-01-01

    In network communications, mixes provide protection against observers hiding the appearance of messages, patterns, length and links between senders and receivers. Statistical disclosure attacks aim to reveal the identity of senders and receivers in a communication network setting when it is protected by standard techniques based on mixes. This work aims to develop a global statistical disclosure attack to detect relationships between users. The only information used by the attacker is the number of messages sent and received by each user for each round, the batch of messages grouped by the anonymity system. A new modeling framework based on contingency tables is used. The assumptions are more flexible than those used in the literature, allowing to apply the method to multiple situations automatically, such as email data or social networks data. A classification scheme based on combinatoric solutions of the space of rounds retrieved is developed. Solutions about relationships between users are provided for all pairs of users simultaneously, since the dependence of the data retrieved needs to be addressed in a global sense. PMID:25679311

  20. Extracting Association Patterns in Network Communications

    Directory of Open Access Journals (Sweden)

    Javier Portela

    2015-02-01

    Full Text Available In network communications, mixes provide protection against observers hiding the appearance of messages, patterns, length and links between senders and receivers. Statistical disclosure attacks aim to reveal the identity of senders and receivers in a communication network setting when it is protected by standard techniques based on mixes. This work aims to develop a global statistical disclosure attack to detect relationships between users. The only information used by the attacker is the number of messages sent and received by each user for each round, the batch of messages grouped by the anonymity system. A new modeling framework based on contingency tables is used. The assumptions are more flexible than those used in the literature, allowing to apply the method to multiple situations automatically, such as email data or social networks data. A classification scheme based on combinatoric solutions of the space of rounds retrieved is developed. Solutions about relationships between users are provided for all pairs of users simultaneously, since the dependence of the data retrieved needs to be addressed in a global sense.

  1. Interleukin 18 receptor 1 gene polymorphisms are associated with asthma

    DEFF Research Database (Denmark)

    Zhu, Guohua; Whyte, Moira K B; Vestbo, Jørgen

    2008-01-01

    The interleukin 18 receptor (IL18R1) gene is a strong candidate gene for asthma. It has been implicated in the pathophysiology of asthma and maps to an asthma susceptibility locus on chromosome 2q12. The possibility of association between polymorphisms in IL18R1 and asthma was examined by genotyp...

  2. Polymorphism in leptin receptor gene was associated with obesity in ...

    African Journals Online (AJOL)

    The mutation in leptin receptor (LEPR) gene causes splicing abnormality that resulted in truncated receptor, aberrant signal transduction, leptin resistance, and obesity. This study aims to determine the association of LEPR gene polymorphisms, rs1137100 and rs1137101, on phenotype and leptin level between obese and ...

  3. Distribution of genes associated with yield potential and water ...

    Indian Academy of Sciences (India)

    Supplementary data: Distribution of genes associated with yield potential and water-saving in. Chinese Zone II wheat detected by developed functional markers. Zhenxian Gao, Zhanliang Shi, Aimin Zhang and Jinkao Guo. J. Genet. 94, 35–42. Table 1. Functional markers for high-yield or water-saving genes in wheat and ...

  4. Disruption of the neurexin 1 gene is associated with schizophrenia.

    NARCIS (Netherlands)

    Rujescu, D.; Ingason, A.; Cichon, S.; Pietilainen, O.P.H.; Barnes, M.R.; Toulopoulou, T.; Picchioni, M.; Vassos, E.; Ettinger, U.; Bramon, E.; Murray, R.; Ruggeri, M.; Tosato, S.; Bonetto, C.; Steinberg, S.; Sigurdsson, E.; Sigmundsson, T.; Petursson, H.; Gylfason, A; Olason, P.; Hardarsson, G.; Jonsdottir, G.A.; Gustafsson, O.; Fossdal, R.; Giegling, I.; Moller, H.J.; Hartmann, A.M.; Hoffmann, P.; Crombie, C.; Fraser, G.; Walker, N.; Lonnqvist, J.; Suvisaari, J.; Tuulio-Henriksson, A.; Djurovic, S.; Melle, I.; Andreassen, O.A.; Hansen, T.; Werge, T.; Kiemeney, L.A.L.M.; Franke, B.; Veltman, J.A.; Buizer-Voskamp, J.E.; Sabatti, C.; Ophoff, R.A.; Rietschel, M.; Nothen, Markus; Stefansson, K.; Peltonen, L.; St Clair, D.; Stefansson, H.; Collier, D.A.

    2009-01-01

    Deletions within the neurexin 1 gene (NRXN1; 2p16.3) are associated with autism and have also been reported in two families with schizophrenia. We examined NRXN1, and the closely related NRXN2 and NRXN3 genes, for copy number variants (CNVs) in 2977 schizophrenia patients and 33 746 controls from

  5. Disruption of the neurexin 1 gene is associated with schizophrenia

    NARCIS (Netherlands)

    Rujescu, Dan; Ingason, Andres; Cichon, Sven; Pietilainen, Olli P. H.; Barnes, Michael R.; Toulopoulou, Timothea; Picchioni, Marco; Vassos, Evangelos; Ettinger, Ulrich; Bramon, Elvira; Murray, Robin; Ruggeri, Mirella; Tosato, Sarah; Bonetto, Chiara; Steinberg, Stacy; Sigurdsson, Engilbert; Sigmundsson, Thordur; Petursson, Hannes; Gylfason, Arnaldur; Olason, Pall I.; Hardarsson, Gudmundur; Jonsdottir, Gudrun A.; Gustafsson, Omar; Fossdal, Ragnheidur; Giegling, Ina; Moeller, Hans-Jurgen; Hartmann, Annette M.; Hoffmann, Per; Crombie, Caroline; Fraser, Gillian; Walker, Nicholas; Lonnqvist, Jouko; Suvisaari, Jaana; Tuulio-Henriksson, Annamari; Djurovic, Srdjan; Melle, Ingrid; Andreassen, Ole A.; Hansen, Thomas; Werge, Thomas; Kiemeney, Lambertus A.; Franke, Barbara; Veltman, Joris; Buizer-Voskamp, Jacobine E.; Sabatti, Chiara; Ophoff, Roel A.; Rietschel, Marcella; Noehen, Markus M.; Stefansson, Kari; Peltonen, Leena; St Clair, David

    2009-01-01

    Deletions within the neurexin 1 gene (NRXN1; 2p16.3) are associated with autism and have also been reported in two families with schizophrenia. We examined NRXN1, and the closely related NRXN2 and NRXN3 genes, for copy number variants (CNVs) in 2977 schizophrenia patients and 33 746 controls from

  6. Association of transforming growth factor-ß3 gene polymorphism ...

    African Journals Online (AJOL)

    Genotyping for the TGF-β3 gene using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method and BslI restriction endonuclease showed a mutation in 294-bp fragment located on the fourth intron of chromosome 5. Polymorphism in TGF-β3 gene was significantly (P < 0.1) associated with ...

  7. Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

    Science.gov (United States)

    Obayashi, Takeshi; Kinoshita, Kengo

    2010-05-01

    Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as guide gene approach and narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.

  8. The Schizophrenia-Associated BRD1 Gene Regulates Behavior, Neurotransmission, and Expression of Schizophrenia Risk Enriched Gene Sets in Mice.

    Science.gov (United States)

    Qvist, Per; Christensen, Jane Hvarregaard; Vardya, Irina; Rajkumar, Anto Praveen; Mørk, Arne; Paternoster, Veerle; Füchtbauer, Ernst-Martin; Pallesen, Jonatan; Fryland, Tue; Dyrvig, Mads; Hauberg, Mads Engel; Lundsberg, Birgitte; Fejgin, Kim; Nyegaard, Mette; Jensen, Kimmo; Nyengaard, Jens Randel; Mors, Ole; Didriksen, Michael; Børglum, Anders Dupont

    2017-07-01

    The schizophrenia-associated BRD1 gene encodes a transcriptional regulator whose comprehensive chromatin interactome is enriched with schizophrenia risk genes. However, the biology underlying the disease association of BRD1 remains speculative. This study assessed the transcriptional drive of a schizophrenia-associated BRD1 risk variant in vitro. Accordingly, to examine the effects of reduced Brd1 expression, we generated a genetically modified Brd1 +/- mouse and subjected it to behavioral, electrophysiological, molecular, and integrative genomic analyses with focus on schizophrenia-relevant parameters. Brd1 +/- mice displayed cerebral histone H3K14 hypoacetylation and a broad range of behavioral changes with translational relevance to schizophrenia. These behaviors were accompanied by striatal dopamine/serotonin abnormalities and cortical excitation-inhibition imbalances involving loss of parvalbumin immunoreactive interneurons. RNA-sequencing analyses of cortical and striatal micropunches from Brd1 +/- and wild-type mice revealed differential expression of genes enriched for schizophrenia risk, including several schizophrenia genome-wide association study risk genes (e.g., calcium channel subunits [Cacna1c and Cacnb2], cholinergic muscarinic receptor 4 [Chrm4)], dopamine receptor D 2 [Drd2], and transcription factor 4 [Tcf4]). Integrative analyses further found differentially expressed genes to cluster in functional networks and canonical pathways associated with mental illness and molecular signaling processes (e.g., glutamatergic, monoaminergic, calcium, cyclic adenosine monophosphate [cAMP], dopamine- and cAMP-regulated neuronal phosphoprotein 32 kDa [DARPP-32], and cAMP responsive element binding protein signaling [CREB]). Our study bridges the gap between genetic association and pathogenic effects and yields novel insights into the unfolding molecular changes in the brain of a new schizophrenia model that incorporates genetic risk at three levels: allelic

  9. Integration of multiple networks and pathways identifies cancer driver genes in pan-cancer analysis.

    Science.gov (United States)

    Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Olsen, Catharina; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-06

    Modern high-throughput genomic technologies represent a comprehensive hallmark of molecular changes in pan-cancer studies. Although different cancer gene signatures have been revealed, the mechanism of tumourigenesis has yet to be completely understood. Pathways and networks are important tools to explain the role of genes in functional genomic studies. However, few methods consider the functional non-equal roles of genes in pathways and the complex gene-gene interactions in a network. We present a novel method in pan-cancer analysis that identifies de-regulated genes with a functional role by integrating pathway and network data. A pan-cancer analysis of 7158 tumour/normal samples from 16 cancer types identified 895 genes with a central role in pathways and de-regulated in cancer. Comparing our approach with 15 current tools that identify cancer driver genes, we found that 35.6% of the 895 genes identified by our method have been found as cancer driver genes with at least 2/15 tools. Finally, we applied a machine learning algorithm on 16 independent GEO cancer datasets to validate the diagnostic role of cancer driver genes for each cancer. We obtained a list of the top-ten cancer driver genes for each cancer considered in this study. Our analysis 1) confirmed that there are several known cancer driver genes in common among different types of cancer, 2) highlighted that cancer driver genes are able to regulate crucial pathways.

  10. 5C analysis of the Epidermal Differentiation Complex locus reveals distinct chromatin interaction networks between gene-rich and gene-poor TADs in skin epithelial cells.

    Directory of Open Access Journals (Sweden)

    Krzysztof Poterlowicz

    2017-09-01

    Full Text Available Mammalian genomes contain several dozens of large (>0.5 Mbp lineage-specific gene loci harbouring functionally related genes. However, spatial chromatin folding, organization of the enhancer-promoter networks and their relevance to Topologically Associating Domains (TADs in these loci remain poorly understood. TADs are principle units of the genome folding and represents the DNA regions within which DNA interacts more frequently and less frequently across the TAD boundary. Here, we used Chromatin Conformation Capture Carbon Copy (5C technology to characterize spatial chromatin interaction network in the 3.1 Mb Epidermal Differentiation Complex (EDC locus harbouring 61 functionally related genes that show lineage-specific activation during terminal keratinocyte differentiation in the epidermis. 5C data validated by 3D-FISH demonstrate that the EDC locus is organized into several TADs showing distinct lineage-specific chromatin interaction networks based on their transcription activity and the gene-rich or gene-poor status. Correlation of the 5C results with genome-wide studies for enhancer-specific histone modifications (H3K4me1 and H3K27ac revealed that the majority of spatial chromatin interactions that involves the gene-rich TADs at the EDC locus in keratinocytes include both intra- and inter-TAD interaction networks, connecting gene promoters and enhancers. Compared to thymocytes in which the EDC locus is mostly transcriptionally inactive, these interactions were found to be keratinocyte-specific. In keratinocytes, the promoter-enhancer anchoring regions in the gene-rich transcriptionally active TADs are enriched for the binding of chromatin architectural proteins CTCF, Rad21 and chromatin remodeler Brg1. In contrast to gene-rich TADs, gene-poor TADs show preferential spatial contacts with each other, do not contain active enhancers and show decreased binding of CTCF, Rad21 and Brg1 in keratinocytes. Thus, spatial interactions between gene

  11. Artificial neural network inference (ANNI: a study on gene-gene interaction for biomarkers in childhood sarcomas.

    Directory of Open Access Journals (Sweden)

    Dong Ling Tong

    Full Text Available OBJECTIVE: To model the potential interaction between previously identified biomarkers in children sarcomas using artificial neural network inference (ANNI. METHOD: To concisely demonstrate the biological interactions between correlated genes in an interaction network map, only 2 types of sarcomas in the children small round blue cell tumors (SRBCTs dataset are discussed in this paper. A backpropagation neural network was used to model the potential interaction between genes. The prediction weights and signal directions were used to model the strengths of the interaction signals and the direction of the interaction link between genes. The ANN model was validated using Monte Carlo cross-validation to minimize the risk of over-fitting and to optimize generalization ability of the model. RESULTS: Strong connection links on certain genes (TNNT1 and FNDC5 in rhabdomyosarcoma (RMS; FCGRT and OLFM1 in Ewing's sarcoma (EWS suggested their potency as central hubs in the interconnection of genes with different functionalities. The results showed that the RMS patients in this dataset are likely to be congenital and at low risk of cardiomyopathy development. The EWS patients are likely to be complicated by EWS-FLI fusion and deficiency in various signaling pathways, including Wnt, Fas/Rho and intracellular oxygen. CONCLUSIONS: The ANN network inference approach and the examination of identified genes in the published literature within the context of the disease highlights the substantial influence of certain genes in sarcomas.

  12. Genetic dissection of acute ethanol responsive gene networks in prefrontal cortex: functional and mechanistic implications.

    Directory of Open Access Journals (Sweden)

    Aaron R Wolen

    Full Text Available Individual differences in initial sensitivity to ethanol are strongly related to the heritable risk of alcoholism in humans. To elucidate key molecular networks that modulate ethanol sensitivity we performed the first systems genetics analysis of ethanol-responsive gene expression in brain regions of the mesocorticolimbic reward circuit (prefrontal cortex, nucleus accumbens, and ventral midbrain across a highly diverse family of 27 isogenic mouse strains (BXD panel before and after treatment with ethanol.Acute ethanol altered the expression of ~2,750 genes in one or more regions and 400 transcripts were jointly modulated in all three. Ethanol-responsive gene networks were extracted with a powerful graph theoretical method that efficiently summarized ethanol's effects. These networks correlated with acute behavioral responses to ethanol and other drugs of abuse. As predicted, networks were heavily populated by genes controlling synaptic transmission and neuroplasticity. Several of the most densely interconnected network hubs, including Kcnma1 and Gsk3β, are known to influence behavioral or physiological responses to ethanol, validating our overall approach. Other major hub genes like Grm3, Pten and Nrg3 represent novel targets of ethanol effects. Networks were under strong genetic control by variants that we mapped to a small number of chromosomal loci. Using a novel combination of genetic, bioinformatic and network-based approaches, we identified high priority cis-regulatory candidate genes, including Scn1b, Gria1, Sncb and Nell2.The ethanol-responsive gene networks identified here represent a previously uncharacterized intermediate phenotype between DNA variation and ethanol sensitivity in mice. Networks involved in synaptic transmission were strongly regulated by ethanol and could contribute to behavioral plasticity seen with chronic ethanol. Our novel finding that hub genes and a small number of loci exert major influence over the ethanol

  13. Reverse-engineering of gene networks for regulating early blood development from single-cell measurements.

    Science.gov (United States)

    Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai

    2017-12-28

    Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks for regulating early blood development based on singel-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene express data in order to obtain pseudo-trajectories with much less fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggests that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations and numerical results show that more regulations may exist. We test the possibility of auto-regulation but numerical simulations do not support the positive auto-regulation. In addition, robustness is used as an importantly additional criterion to select candidate

  14. An interaction map of circulating metabolites, immune gene networks, and their genetic regulation.

    Science.gov (United States)

    Nath, Artika P; Ritchie, Scott C; Byars, Sean G; Fearnley, Liam G; Havulinna, Aki S; Joensuu, Anni; Kangas, Antti J; Soininen, Pasi; Wennerström, Annika; Milani, Lili; Metspalu, Andres; Männistö, Satu; Würtz, Peter; Kettunen, Johannes; Raitoharju, Emma; Kähönen, Mika; Juonala, Markus; Palotie, Aarno; Ala-Korpela, Mika; Ripatti, Samuli; Lehtimäki, Terho; Abraham, Gad; Raitakari, Olli; Salomaa, Veikko; Perola, Markus; Inouye, Michael

    2017-08-01

    Immunometabolism plays a central role in many cardiometabolic diseases. However, a robust map of immune-related gene networks in circulating human cells, their interactions with metabolites, and their genetic control is still lacking. Here, we integrate blood transcriptomic, metabolomic, and genomic profiles from two population-based cohorts (total N = 2168), including a subset of individuals with matched multi-omic data at 7-year follow-up. We identify topologically replicable gene networks enriched for diverse immune functions including cytotoxicity, viral response, B cell, platelet, neutrophil, and mast cell/basophil activity. These immune gene modules show complex patterns of association with 158 circulating metabolites, including lipoprotein subclasses, lipids, fatty acids, amino acids, small molecules, and CRP. Genome-wide scans for module expression quantitative trait loci (mQTLs) reveal five modules with mQTLs that have both cis and trans effects. The strongest mQTL is in ARHGEF3 (rs1354034) and affects a module enriched for platelet function, independent of platelet counts. Modules of mast cell/basophil and neutrophil function show temporally stable metabolite associations over 7-year follow-up, providing evidence that these modules and their constituent gene products may play central roles in metabolic inflammation. Furthermore, the strongest mQTL in ARHGEF3 also displays clear temporal stability, supporting widespread trans effects at this locus. This study provides a detailed map of natural variation at the blood immunometabolic interface and its genetic basis, and may facilitate subsequent studies to explain inter-individual variation in cardiometabolic disease.

  15. Polymorphisms of two neuroendocrine–correlated genes associated ...

    African Journals Online (AJOL)

    Administrator

    2011-09-19

    Sep 19, 2011 ... genes associated with body weight and reproductive traits in Jinghai ... 3Jiangsu Jinghai Poultry Group Co., Ltd., Nantong, 226103, China. Accepted 18 .... Birds had access to feed [commercial corn-soybean diets meeting.

  16. DDPC: Dragon database of genes associated with prostate cancer

    KAUST Repository

    Maqungo, Monique; Kaur, Mandeep; Kwofie, Samuel K.; Radovanovic, Aleksandar; Schaefer, Ulf; Schmeier, Sebastian; Oppon, Ekow; Christoffels, Alan; Bajic, Vladimir B.

    2010-01-01

    associated with Prostate Cancer (DDPC) as an integrated knowledgebase of genes experimentally verified as implicated in PC. DDPC is distinctive from other databases in that (i) it provides pre-compiled biomedical text-mining information on PC, which otherwise

  17. Network Candidate Genes in Breeding for Drought Tolerant Crops

    Directory of Open Access Journals (Sweden)

    Christoph Tim Krannich

    2015-07-01

    Full Text Available Climate change leading to increased periods of low water availability as well as increasing demands for food in the coming years makes breeding for drought tolerant crops a high priority. Plants have developed diverse strategies and mechanisms to survive drought stress. However, most of these represent drought escape or avoidance strategies like early flowering or low stomatal conductance that are not applicable in breeding for crops with high yields under drought conditions. Even though a great deal of research is ongoing, especially in cereals, in this regard, not all mechanisms involved in drought tolerance are yet understood. The identification of candidate genes for drought tolerance that have a high potential to be used for breeding drought tolerant crops represents a challenge. Breeding for drought tolerant crops has to focus on acceptable yields under water-limited conditions and not on survival. However, as more and more knowledge about the complex networks and the cross talk during drought is available, more options are revealed. In addition, it has to be considered that conditioning a crop for drought tolerance might require the production of metabolites and might cost the plants energy and resources that cannot be used in terms of yield. Recent research indicates that yield penalty exists and efficient breeding for drought tolerant crops with acceptable yields under well-watered and drought conditions might require uncoupling yield penalty from drought tolerance.

  18. A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response.

    Directory of Open Access Journals (Sweden)

    Michael J McCarthy

    Full Text Available Circadian rhythm abnormalities in bipolar disorder (BD have led to a search for genetic abnormalities in circadian "clock genes" associated with BD. However, no significant clock gene findings have emerged from genome-wide association studies (GWAS. At least three factors could account for this discrepancy: complex traits are polygenic, the organization of the clock is more complex than previously recognized, and/or genetic risk for BD may be shared across multiple illnesses. To investigate these issues, we considered the clock gene network at three levels: essential "core" clock genes, upstream circadian clock modulators, and downstream clock controlled genes. Using relaxed thresholds for GWAS statistical significance, we determined the rates of clock vs. control genetic associations with BD, and four additional illnesses that share clinical features and/or genetic risk with BD (major depression, schizophrenia, attention deficit/hyperactivity. Then we compared the results to a set of lithium-responsive genes. Associations with BD-spectrum illnesses and lithium-responsiveness were both enriched among core clock genes but not among upstream clock modulators. Associations with BD-spectrum illnesses and lithium-responsiveness were also enriched among pervasively rhythmic clock-controlled genes but not among genes that were less pervasively rhythmic or non-rhythmic. Our analysis reveals previously unrecognized associations between clock genes and BD-spectrum illnesses, partly reconciling previously discordant results from past GWAS and candidate gene studies.

  19. ABOUT HYBRID BIDIRECTIONAL ASSOCIATIVE MEMORY NEURAL NETWORKS WITH DISCRETE DELAYS

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    In this paper, hybrid bidirectional associative memory neural networks with discrete delays is considered. By ingeniously importing real parameters di > 0(i = 1,2,···,n) which can be adjusted, we establish some new sufficient conditions for the dynamical characteristics of hybrid bidirectional associative memory neural networks with discrete delays by the method of variation of parameters and some analysis techniques. Our results generalize and improve the related results in [10,11]. Our work is significant...

  20. Study of obesity associated proopiomelanocortin gene polymorphism

    African Journals Online (AJOL)

    Farida El-Baz Mohamed

    2016-03-10

    Mar 10, 2016 ... Blood Pressure Education Program [13]. (2). Auxological ..... factorial management plans involving nutrition, environ- mental control .... Association of glycosylated hemoglobin (HbA1c) levels with · insulin resistance in obese ...

  1. Identification of human disease genes from interactome network using graphlet interaction.

    Directory of Open Access Journals (Sweden)

    Xiao-Dong Wang

    Full Text Available Identifying genes related to human diseases, such as cancer and cardiovascular disease, etc., is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, had been used to disease genes identification based on the hypothesis that strong candidate genes tend to closely relate to each other in some kinds of measure on the network. We proposed a new measure to analyze the relationship between network nodes which was called graphlet interaction. The graphlet interaction contained 28 different isomers. The results showed that the numbers of the graphlet interaction isomers between disease genes in interactome networks were significantly larger than random picked genes, while graphlet signatures were not. Then, we designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. The genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. Then the approach was evaluated by leave-one-out cross-validation. The precision of the current approach achieved 90% at about 10% recall, which was apparently higher than the previous three predominant algorithms, random walk, Endeavour and neighborhood based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental researches. In conclusion, we demonstrate that the graphlet interaction is an effective tool to analyze the network properties of disease genes, and the scores calculated by graphlet interaction is more precise in identifying disease genes.

  2. Identification of Human Disease Genes from Interactome Network Using Graphlet Interaction

    Science.gov (United States)

    Yang, Lun; Wei, Dong-Qing; Qi, Ying-Xin; Jiang, Zong-Lai

    2014-01-01

    Identifying genes related to human diseases, such as cancer and cardiovascular disease, etc., is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, had been used to disease genes identification based on the hypothesis that strong candidate genes tend to closely relate to each other in some kinds of measure on the network. We proposed a new measure to analyze the relationship between network nodes which was called graphlet interaction. The graphlet interaction contained 28 different isomers. The results showed that the numbers of the graphlet interaction isomers between disease genes in interactome networks were significantly larger than random picked genes, while graphlet signatures were not. Then, we designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. The genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. Then the approach was evaluated by leave-one-out cross-validation. The precision of the current approach achieved 90% at about 10% recall, which was apparently higher than the previous three predominant algorithms, random walk, Endeavour and neighborhood based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental researches. In conclusion, we demonstrate that the graphlet interaction is an effective tool to analyze the network properties of disease genes, and the scores calculated by graphlet interaction is more precise in identifying disease genes. PMID:24465923

  3. Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Xiaodong Cai

    Full Text Available Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the desiderata of efficient and powerful algorithms still remains. In this paper, sparse structural equation models (SEMs are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL, for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL based scheme, and the QTL-directed dependency graph (QDG method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.

  4. A Small-Molecule Inducible Synthetic Circuit for Control of the SOS Gene Network without DNA Damage.

    Science.gov (United States)

    Kubiak, Jeffrey M; Culyba, Matthew J; Liu, Monica Yun; Mo, Charlie Y; Goulian, Mark; Kohli, Rahul M

    2017-11-17

    The bacterial SOS stress-response pathway is a pro-mutagenic DNA repair system that mediates bacterial survival and adaptation to genotoxic stressors, including antibiotics and UV light. The SOS pathway is composed of a network of genes under the control of the transcriptional repressor, LexA. Activation of the pathway involves linked but distinct events: an initial DNA damage event leads to activation of RecA, which promotes autoproteolysis of LexA, abrogating its repressor function and leading to induction of the SOS gene network. These linked events can each independently contribute to DNA repair and mutagenesis, making it difficult to separate the contributions of the different events to observed phenotypes. We therefore devised a novel synthetic circuit to unlink these events and permit induction of the SOS gene network in the absence of DNA damage or RecA activation via orthogonal cleavage of LexA. Strains engineered with the synthetic SOS circuit demonstrate small-molecule inducible expression of SOS genes as well as the associated resistance to UV light. Exploiting our ability to activate SOS genes independently of upstream events, we further demonstrate that the majority of SOS-mediated mutagenesis on the chromosome does not readily occur with orthogonal pathway induction alone, but instead requires DNA damage. More generally, our approach provides an exemplar for using synthetic circuit design to separate an environmental stressor from its associated stress-response pathway.

  5. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    : The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.

  6. Characterization of differentially expressed genes using high-dimensional co-expression networks

    DEFF Research Database (Denmark)

    Coelho Goncalves de Abreu, Gabriel; Labouriau, Rodrigo S.

    2010-01-01

    We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The set-up of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation...... that allow to make effective inference in problems with high degree of complexity (e.g. several thousands of genes) and small number of observations (e.g. 10-100) as typically occurs in high throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we...... construct a compact representation of the co-expression network that allows to identify the regions with high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than...

  7. Shared molecular pathways and gene networks for cardiovascular disease and type 2 diabetes mellitus in women across diverse ethnicities.

    Science.gov (United States)

    Chan, Kei Hang K; Huang, Yen-Tsung; Meng, Qingying; Wu, Chunyuan; Reiner, Alexander; Sobel, Eric M; Tinker, Lesley; Lusis, Aldons J; Yang, Xia; Liu, Simin

    2014-12-01

    Although cardiovascular disease (CVD) and type 2 diabetes mellitus (T2D) share many common risk factors, potential molecular mechanisms that may also be shared for these 2 disorders remain unknown. Using an integrative pathway and network analysis, we performed genome-wide association studies in 8155 blacks, 3494 Hispanic American, and 3697 Caucasian American women who participated in the national Women's Health Initiative single-nucleotide polymorphism (SNP) Health Association Resource and the Genomics and Randomized Trials Network. Eight top pathways and gene networks related to cardiomyopathy, calcium signaling, axon guidance, cell adhesion, and extracellular matrix seemed to be commonly shared between CVD and T2D across all 3 ethnic groups. We also identified ethnicity-specific pathways, such as cell cycle (specific for Hispanic American and Caucasian American) and tight junction (CVD and combined CVD and T2D in Hispanic American). In network analysis of gene-gene or protein-protein interactions, we identified key drivers that included COL1A1, COL3A1, and ELN in the shared pathways for both CVD and T2D. These key driver genes were cross-validated in multiple mouse models of diabetes mellitus and atherosclerosis. Our integrative analysis of American women of 3 ethnicities identified multiple shared biological pathways and key regulatory genes for the development of CVD and T2D. These prospective findings also support the notion that ethnicity-specific susceptibility genes and process are involved in the pathogenesis of CVD and T2D. © 2014 American Heart Association, Inc.

  8. Evidence for a Functional Hierarchy of Association Networks.

    Science.gov (United States)

    Choi, Eun Young; Drayna, Garrett K; Badre, David

    2018-05-01

    Patient lesion and neuroimaging studies have identified a rostral-to-caudal functional gradient in the lateral frontal cortex (LFC) corresponding to higher-order (complex or abstract) to lower-order (simple or concrete) cognitive control. At the same time, monkey anatomical and human functional connectivity studies show that frontal regions are reciprocally connected with parietal and temporal regions, forming parallel and distributed association networks. Here, we investigated the link between the functional gradient of LFC regions observed during control tasks and the parallel, distributed organization of association networks. Whole-brain fMRI task activity corresponding to four orders of hierarchical control [Badre, D., & D'Esposito, M. Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex. Journal of Cognitive Neuroscience, 19, 2082-2099, 2007] was compared with a resting-state functional connectivity MRI estimate of cortical networks [Yeo, B. T., Krienen, F. M., Sepulcre, J., Sabuncu, M. R., Lashkari, D., Hollinshead, M., et al. The organization of the human cerebral cortex estimated by intrinsic functional connectivity. Journal of Neurophysiology, 106, 1125-1165, 2011]. Critically, at each order of control, activity in the LFC and parietal cortex overlapped onto a common association network that differed between orders. These results are consistent with a functional organization based on separable association networks that are recruited during hierarchical control. Furthermore, corticostriatal functional connectivity MRI showed that, consistent with their participation in functional networks, rostral-to-caudal LFC and caudal-to-rostral parietal regions had similar, order-specific corticostriatal connectivity that agreed with a striatal gating model of hierarchical rule use. Our results indicate that hierarchical cognitive control is subserved by parallel and distributed association networks, together forming

  9. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures.

    Science.gov (United States)

    Kentzoglanakis, Kyriakos; Poole, Matthew

    2012-01-01

    In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.

  10. Gene expression network reconstruction by convex feature selection when incorporating genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Benjamin A Logsdon

    Full Text Available Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL, which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1, and genes involved in endocytosis (RCY1, the spindle checkpoint (BUB2, sulfonate catabolism (JLP1, and cell-cell communication (PRM7. Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data.

  11. Cooperative adaptive responses in gene regulatory networks with many degrees of freedom.

    Science.gov (United States)

    Inoue, Masayo; Kaneko, Kunihiko

    2013-04-01

    Cells generally adapt to environmental changes by first exhibiting an immediate response and then gradually returning to their original state to achieve homeostasis. Although simple network motifs consisting of a few genes have been shown to exhibit such adaptive dynamics, they do not reflect the complexity of real cells, where the expression of a large number of genes activates or represses other genes, permitting adaptive behaviors. Here, we investigated the responses of gene regulatory networks containing many genes that have undergone numerical evolution to achieve high fitness due to the adaptive response of only a single target gene; this single target gene responds to changes in external inputs and later returns to basal levels. Despite setting a single target, most genes showed adaptive responses after evolution. Such adaptive dynamics were not due to common motifs within a few genes; even without such motifs, almost all genes showed adaptation, albeit sometimes partial adaptation, in the sense that expression levels did not always return to original levels. The genes split into two groups: genes in the first group exhibited an initial increase in expression and then returned to basal levels, while genes in the second group exhibited the opposite changes in expression. From this model, genes in the first group received positive input from other genes within the first group, but negative input from genes in the second group, and vice versa. Thus, the adaptation dynamics of genes from both groups were consolidated. This cooperative adaptive behavior was commonly observed if the number of genes involved was larger than the order of ten. These results have implications in the collective responses of gene expression networks in microarray measurements of yeast Saccharomyces cerevisiae and the significance to the biological homeostasis of systems with many components.

  12. An approach for reduction of false predictions in reverse engineering of gene regulatory networks.

    Science.gov (United States)

    Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar

    2018-05-14

    A gene regulatory network discloses the regulatory interactions amongst genes, at a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct predictions, which is unwanted. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have implemented the same using a dataset ensemble approach (i.e. combining multiple datasets) also. We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IMRA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging enough as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated here, the results improve further w.r.t. the prediction of true positives

  13. Gene duplications in prokaryotes can be associated with environmental adaptation.

    Science.gov (United States)

    Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn

    2010-10-20

    Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate

  14. Gene duplications in prokaryotes can be associated with environmental adaptation

    Directory of Open Access Journals (Sweden)

    Lempicki Richard A

    2010-10-01

    Full Text Available Abstract Background Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with for example energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive

  15. Candidate gene approach for parasite resistance in sheep--variation in immune pathway genes and association with fecal egg count.

    Directory of Open Access Journals (Sweden)

    Kathiravan Periasamy

    Full Text Available Sheep chromosome 3 (Oar3 has the largest number of QTLs reported to be significantly associated with resistance to gastro-intestinal nematodes. This study aimed to identify single nucleotide polymorphisms (SNPs within candidate genes located in sheep chromosome 3 as well as genes involved in major immune pathways. A total of 41 SNPs were identified across 38 candidate genes in a panel of unrelated sheep and genotyped in 713 animals belonging to 22 breeds across Asia, Europe and South America. The variations and evolution of immune pathway genes were assessed in sheep populations across these macro-environmental regions that significantly differ in the diversity and load of pathogens. The mean minor allele frequency (MAF did not vary between Asian and European sheep reflecting the absence of ascertainment bias. Phylogenetic analysis revealed two major clusters with most of South Asian, South East Asian and South West Asian breeds clustering together while European and South American sheep breeds clustered together distinctly. Analysis of molecular variance revealed strong phylogeographic structure at loci located in immune pathway genes, unlike microsatellite and genome wide SNP markers. To understand the influence of natural selection processes, SNP loci located in chromosome 3 were utilized to reconstruct haplotypes, the diversity of which showed significant deviations from selective neutrality. Reduced Median network of reconstructed haplotypes showed balancing selection in force at these loci. Preliminary association of SNP genotypes with phenotypes recorded 42 days post challenge revealed significant differences (P<0.05 in fecal egg count, body weight change and packed cell volume at two, four and six SNP loci respectively. In conclusion, the present study reports strong phylogeographic structure and balancing selection operating at SNP loci located within immune pathway genes. Further, SNP loci identified in the study were found to have

  16. On the role of sparseness in the evolution of modularity in gene regulatory networks.

    Science.gov (United States)

    Espinosa-Soto, Carlos

    2018-05-01

    Modularity is a widespread property in biological systems. It implies that interactions occur mainly within groups of system elements. A modular arrangement facilitates adjustment of one module without perturbing the rest of the system. Therefore, modularity of developmental mechanisms is a major factor for evolvability, the potential to produce beneficial variation from random genetic change. Understanding how modularity evolves in gene regulatory networks, that create the distinct gene activity patterns that characterize different parts of an organism, is key to developmental and evolutionary biology. One hypothesis for the evolution of modules suggests that interactions between some sets of genes become maladaptive when selection favours additional gene activity patterns. The removal of such interactions by selection would result in the formation of modules. A second hypothesis suggests that modularity evolves in response to sparseness, the scarcity of interactions within a system. Here I simulate the evolution of gene regulatory networks and analyse diverse experimentally sustained networks to study the relationship between sparseness and modularity. My results suggest that sparseness alone is neither sufficient nor necessary to explain modularity in gene regulatory networks. However, sparseness amplifies the effects of forms of selection that, like selection for additional gene activity patterns, already produce an increase in modularity. That evolution of new gene activity patterns is frequent across evolution also supports that it is a major factor in the evolution of modularity. That sparseness is widespread across gene regulatory networks indicates that it may have facilitated the evolution of modules in a wide variety of cases.

  17. On the Interplay between Entropy and Robustness of Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Bor-Sen Chen

    2010-05-01

    Full Text Available The interplay between entropy and robustness of gene network is a core mechanism of systems biology. The entropy is a measure of randomness or disorder of a physical system due to random parameter fluctuation and environmental noises in gene regulatory networks. The robustness of a gene regulatory network, which can be measured as the ability to tolerate the random parameter fluctuation and to attenuate the effect of environmental noise, will be discussed from the robust H∞ stabilization and filtering perspective. In this review, we will also discuss their balancing roles in evolution and potential applications in systems and synthetic biology.

  18. Digital Signal Processing and Control for the Study of Gene Networks

    Science.gov (United States)

    Shin, Yong-Jun

    2016-04-01

    Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.

  19. A fuzzy network module extraction technique for gene expression data

    Indian Academy of Sciences (India)

    2014-05-01

    expression network from the distance matrix. The distance matrix is .... mental process, cellular component assembly involved in ..... the molecules are present in the network. User can ... hsa05213:Endometrial cancer. 24. 0.07.

  20. Genes associated with Type 2 Diabetes and vascular complications.

    Science.gov (United States)

    Montesanto, Alberto; Bonfigli, Anna Rita; Crocco, Paolina; Garagnani, Paolo; De Luca, Maria; Boemi, Massimo; Marasco, Elena; Pirazzini, Chiara; Giuliani, Cristina; Franceschi, Claudio; Passarino, Giuseppe; Testa, Roberto; Olivieri, Fabiola; Rose, Giuseppina

    2018-02-04

    Type 2 Diabetes (T2D) is a chronic disease associated with a number of micro- and macrovascular complications that increase the morbidity and mortality of patients. The risk of diabetic complications has a strong genetic component. To this end, we sought to evaluate the association of 40 single nucleotide polymorphisms (SNPs) in 21 candidate genes with T2D and its vascular complications in 503 T2D patients and 580 healthy controls. The genes were chosen because previously reported to be associated with T2D complications and/or with the aging process. We replicated the association of T2D risk with I GF2BP rs4402960 and detected novel associations with TERT rs2735940 and rs2736098. The addition of these SNPs to a model including traditional risk factors slightly improved risk prediction. After stratification of patients according to the presence/absence of vascular complications, we found significant associations of variants in the CAT , FTO , and UCP1 genes with diabetic retinopathy and nephropathy. Additionally, a variant in the ADIPOQ gene was found associated with macrovascular complications. Notably, these genes are involved in some way in mitochondrial biology and reactive oxygen species regulation. Hence, our findings strongly suggest a potential link between mitochondrial oxidative homeostasis and individual predisposition to diabetic vascular complications.

  1. Integrative analysis for finding genes and networks involved in diabetes and other complex diseases

    DEFF Research Database (Denmark)

    Bergholdt, R.; Størling, Zenia, Marian; Hansen, Kasper Lage

    2007-01-01

    We have developed an integrative analysis method combining genetic interactions, identified using type 1 diabetes genome scan data, and a high-confidence human protein interaction network. Resulting networks were ranked by the significance of the enrichment of proteins from interacting regions. We...... identified a number of new protein network modules and novel candidate genes/proteins for type 1 diabetes. We propose this type of integrative analysis as a general method for the elucidation of genes and networks involved in diabetes and other complex diseases....

  2. Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives

    International Nuclear Information System (INIS)

    Warmflash, Aryeh; Siggia, Eric D; Francois, Paul

    2012-01-01

    The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input–output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria. (paper)

  3. Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives.

    Science.gov (United States)

    Warmflash, Aryeh; Francois, Paul; Siggia, Eric D

    2012-10-01

    The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input-output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.

  4. Identification of astrocytoma associated genes including cell surface markers

    International Nuclear Information System (INIS)

    Boon, Kathy; Edwards, Jennifer B; Eberhart, Charles G; Riggins, Gregory J

    2004-01-01

    Despite intense effort the treatment options for the invasive astrocytic tumors are still limited to surgery and radiation therapy, with chemotherapy showing little or no increase in survival. The generation of Serial Analysis of Gene Expression (SAGE) profiles is expected to aid in the identification of astrocytoma-associated genes and highly expressed cell surface genes as molecular therapeutic targets. SAGE tag counts can be easily added to public expression databases and quickly disseminated to research efforts worldwide. We generated and analyzed the SAGE transcription profiles of 25 primary grade II, III and IV astrocytomas [1]. These profiles were produced as part of the Cancer Genome Anatomy Project's SAGE Genie [2], and were used in an in silico search for candidate therapeutic targets by comparing astrocytoma to normal brain transcription. Real-time PCR and immunohistochemistry were used for the validation of selected candidate target genes in 2 independent sets of primary tumors. A restricted set of tumor-associated genes was identified for each grade that included genes not previously associated with astrocytomas (e.g. VCAM1, SMOC1, and thymidylate synthetase), with a high percentage of cell surface genes. Two genes with available antibodies, Aquaporin 1 and Topoisomerase 2A, showed protein expression consistent with transcript level predictions. This survey of transcription in malignant and normal brain tissues reveals a small subset of human genes that are activated in malignant astrocytomas. In addition to providing insights into pathway biology, we have revealed and quantified expression for a significant portion of cell surface and extra-cellular astrocytoma genes

  5. Identifying essential genes in bacterial metabolic networks with machine learning methods

    Science.gov (United States)

    2010-01-01

    Background Identifying essential genes in bacteria supports to identify potential drug targets and an understanding of minimal requirements for a synthetic cell. However, experimentally assaying the essentiality of their coding genes is resource intensive and not feasible for all bacterial organisms, in particular if they are infective. Results We developed a machine learning technique to identify essential genes using the experimental data of genome-wide knock-out screens from one bacterial organism to infer essential genes of another related bacterial organism. We used a broad variety of topological features, sequence characteristics and co-expression properties potentially associated with essentiality, such as flux deviations, centrality, codon frequencies of the sequences, co-regulation and phyletic retention. An organism-wise cross-validation on bacterial species yielded reliable results with good accuracies (area under the receiver-operator-curve of 75% - 81%). Finally, it was applied to drug target predictions for Salmonella typhimurium. We compared our predictions to the viability of experimental knock-outs of S. typhimurium and identified 35 enzymes, which are highly relevant to be considered as potential drug targets. Specifically, we detected promising drug targets in the non-mevalonate pathway. Conclusions Using elaborated features characterizing network topology, sequence information and microarray data enables to predict essential genes from a bacterial reference organism to a related query organism without any knowledge about the essentiality of genes of the query organism. In general, such a method is beneficial for inferring drug targets when experimental data about genome-wide knockout screens is not available for the investigated organism. PMID:20438628

  6. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

    Science.gov (United States)

    Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

    2014-01-16

    To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high

  7. Network statistics of genetically-driven gene co-expression modules in mouse crosses

    Directory of Open Access Journals (Sweden)

    Marie-Pier eScott-Boyer

    2013-12-01

    Full Text Available In biology, networks are used in different contexts as ways to represent relationships between entities, such as for instance interactions between genes, proteins or metabolites. Despite progress in the analysis of such networks and their potential to better understand the collective impact of genes on complex traits, one remaining challenge is to establish the biologic validity of gene co-expression networks and to determine what governs their organization. We used WGCNA to construct and analyze seven gene expression datasets from several tissues of mouse recombinant inbred strains (RIS. For six out of the 7 networks, we found that linkage to module QTLs (mQTLs could be established for 29.3% of gene co-expression modules detected in the several mouse RIS. For about 74.6% of such genetically-linked modules, the mQTL was on the same chromosome as the one contributing most genes to the module, with genes originating from that chromosome showing higher connectivity than other genes in the modules. Such modules (that we considered as genetically-driven had network statistic properties (density, centralization and heterogeneity that set them apart from other modules in the network. Altogether, a sizeable portion of gene co-expression modules detected in mouse RIS panels had genetic determinants as their main organizing principle. In addition to providing a biologic interpretation validation for these modules, these genetic determinants imparted on them particular properties that set them apart from other modules in the network, to the point that they can be predicted to a large extent on the basis of their network statistics.

  8. Integrated in silico Analyses of Regulatory and Metabolic Networks of Synechococcus sp. PCC 7002 Reveal Relationships between Gene Centrality and Essentiality

    Directory of Open Access Journals (Sweden)

    Hyun-Seob Song

    2015-03-01

    Full Text Available Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as “topologically important.” Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as “functionally important” genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.

  9. A gene expression signature associated with survival in metastatic melanoma

    Science.gov (United States)

    Mandruzzato, Susanna; Callegaro, Andrea; Turcatel, Gianluca; Francescato, Samuela; Montesco, Maria C; Chiarion-Sileni, Vanna; Mocellin, Simone; Rossi, Carlo R; Bicciato, Silvio; Wang, Ena; Marincola, Francesco M; Zanovello, Paola

    2006-01-01

    Background Current clinical and histopathological criteria used to define the prognosis of melanoma patients are inadequate for accurate prediction of clinical outcome. We investigated whether genome screening by means of high-throughput gene microarray might provide clinically useful information on patient survival. Methods Forty-three tumor tissues from 38 patients with stage III and stage IV melanoma were profiled with a 17,500 element cDNA microarray. Expression data were analyzed using significance analysis of microarrays (SAM) to identify genes associated with patient survival, and supervised principal components (SPC) to determine survival prediction. Results SAM analysis revealed a set of 80 probes, corresponding to 70 genes, associated with survival, i.e. 45 probes characterizing longer and 35 shorter survival times, respectively. These transcripts were included in a survival prediction model designed using SPC and cross-validation which allowed identifying 30 predicting probes out of the 80 associated with survival. Conclusion The longer-survival group of genes included those expressed in immune cells, both innate and acquired, confirming the interplay between immunological mechanisms and the natural history of melanoma. Genes linked to immune cells were totally lacking in the poor-survival group, which was instead associated with a number of genes related to highly proliferative and invasive tumor cells. PMID:17129373

  10. A gene expression signature associated with survival in metastatic melanoma

    Directory of Open Access Journals (Sweden)

    Rossi Carlo R

    2006-11-01

    Full Text Available Abstract Background Current clinical and histopathological criteria used to define the prognosis of melanoma patients are inadequate for accurate prediction of clinical outcome. We investigated whether genome screening by means of high-throughput gene microarray might provide clinically useful information on patient survival. Methods Forty-three tumor tissues from 38 patients with stage III and stage IV melanoma were profiled with a 17,500 element cDNA microarray. Expression data were analyzed using significance analysis of microarrays (SAM to identify genes associated with patient survival, and supervised principal components (SPC to determine survival prediction. Results SAM analysis revealed a set of 80 probes, corresponding to 70 genes, associated with survival, i.e. 45 probes characterizing longer and 35 shorter survival times, respectively. These transcripts were included in a survival prediction model designed using SPC and cross-validation which allowed identifying 30 predicting probes out of the 80 associated with survival. Conclusion The longer-survival group of genes included those expressed in immune cells, both innate and acquired, confirming the interplay between immunological mechanisms and the natural history of melanoma. Genes linked to immune cells were totally lacking in the poor-survival group, which was instead associated with a number of genes related to highly proliferative and invasive tumor cells.

  11. Gastric Cancer Associated Genes Identified by an Integrative Analysis of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Bing Jiang

    2017-01-01

    Full Text Available Gastric cancer is one of the most severe complex diseases with high morbidity and mortality in the world. The molecular mechanisms and risk factors for this disease are still not clear since the cancer heterogeneity caused by different genetic and environmental factors. With more and more expression data accumulated nowadays, we can perform integrative analysis for these data to understand the complexity of gastric cancer and to identify consensus players for the heterogeneous cancer. In the present work, we screened the published gene expression data and analyzed them with integrative tool, combined with pathway and gene ontology enrichment investigation. We identified several consensus differentially expressed genes and these genes were further confirmed with literature mining; at last, two genes, that is, immunoglobulin J chain and C-X-C motif chemokine ligand 17, were screened as novel gastric cancer associated genes. Experimental validation is proposed to further confirm this finding.

  12. Identification of unstable network modules reveals disease modules associated with the progression of Alzheimer's disease.

    Directory of Open Access Journals (Sweden)

    Masataka Kikuchi

    Full Text Available Alzheimer's disease (AD, the most common cause of dementia, is associated with aging, and it leads to neuron death. Deposits of amyloid β and aberrantly phosphorylated tau protein are known as pathological hallmarks of AD, but the underlying mechanisms have not yet been revealed. A high-throughput gene expression analysis previously showed that differentially expressed genes accompanying the progression of AD were more down-regulated than up-regulated in the later stages of AD. This suggested that the molecular networks and their constituent modules collapsed along with AD progression. In this study, by using gene expression profiles and protein interaction networks (PINs, we identified the PINs expressed in three brain regions: the entorhinal cortex (EC, hippocampus (HIP and superior frontal gyrus (SFG. Dividing the expressed PINs into modules, we examined the stability of the modules with AD progression and with normal aging. We found that in the AD modules, the constituent proteins, interactions and cellular functions were not maintained between consecutive stages through all brain regions. Interestingly, the modules were collapsed with AD progression, specifically in the EC region. By identifying the modules that were affected by AD pathology, we found the transcriptional regulation-associated modules that interact with the proteasome-associated module via UCHL5 hub protein, which is a deubiquitinating enzyme. Considering PINs as a system made of network modules, we found that the modules relevant to the transcriptional regulation are disrupted in the EC region, which affects the ubiquitin-proteasome system.

  13. Relevance Epistasis Network of Gastritis for Intra-chromosomes in the Korea Associated Resource (KARE Cohort Study

    Directory of Open Access Journals (Sweden)

    Hyun-hwan Jeong

    2014-12-01

    Full Text Available Gastritis is a common but a serious disease with a potential risk of developing carcinoma. Helicobacter pylori infection is reported as the most common cause of gastritis, but other genetic and genomic factors exist, especially single-nucleotide polymorphisms (SNPs. Association studies between SNPs and gastritis disease are important, but results on epistatic interactions from multiple SNPs are rarely found in previous genome-wide association (GWA studies. In this study, we performed computational GWA case-control studies for gastritis in Korea Associated Resource (KARE data. By transforming the resulting SNP epistasis network into a gene-gene epistasis network, we also identified potential gene-gene interaction factors that affect the susceptibility to gastritis.

  14. Disruption of the neurexin 1 gene is associated with schizophrenia

    DEFF Research Database (Denmark)

    Rujescu, Dan; Ingason, Andres; Cichon, Sven

    2009-01-01

    may vary according to the level of functional impact on the gene, we next restricted the association analysis to CNVs that disrupt exons (0.24% of cases and 0.015% of controls). These were significantly associated with a high odds ratio (P = 0.0027; OR 8.97, 95% CI 1.8-51.9). We conclude that NRXN1...

  15. Whole brain and brain regional coexpression network interactions associated with predisposition to alcohol consumption.

    Directory of Open Access Journals (Sweden)

    Lauren A Vanderlinden

    Full Text Available To identify brain transcriptional networks that may predispose an animal to consume alcohol, we used weighted gene coexpression network analysis (WGCNA. Candidate coexpression modules are those with an eigengene expression level that correlates significantly with the level of alcohol consumption across a panel of BXD recombinant inbred mouse strains, and that share a genomic region that regulates the module transcript expression levels (mQTL with a genomic region that regulates alcohol consumption (bQTL. To address a controversy regarding utility of gene expression profiles from whole brain, vs specific brain regions, as indicators of the relationship of gene expression to phenotype, we compared candidate coexpression modules from whole brain gene expression data (gathered with Affymetrix 430 v2 arrays in the Colorado laboratories and from gene expression data from 6 brain regions (nucleus accumbens (NA; prefrontal cortex (PFC; ventral tegmental area (VTA; striatum (ST; hippocampus (HP; cerebellum (CB available from GeneNetwork. The candidate modules were used to construct candidate eigengene networks across brain regions, resulting in three "meta-modules", composed of candidate modules from two or more brain regions (NA, PFC, ST, VTA and whole brain. To mitigate the potential influence of chromosomal location of transcripts and cis-eQTLs in linkage disequilibrium, we calculated a semi-partial correlation of the transcripts in the meta-modules with alcohol consumption conditional on the transcripts' cis-eQTLs. The function of transcripts that retained the correlation with the phenotype after correction for the strong genetic influence, implicates processes of protein metabolism in the ER and Golgi as influencing susceptibility to variation in alcohol consumption. Integration of these data with human GWAS provides further information on the function of polymorphisms associated with alcohol-related traits.

  16. Convergent functional genomics in addiction research - a translational approach to study candidate genes and gene networks.

    Science.gov (United States)

    Spanagel, Rainer

    2013-01-01

    Convergent functional genomics (CFG) is a translational methodology that integrates in a Bayesian fashion multiple lines of evidence from studies in human and animal models to get a better understanding of the genetics of a disease or pathological behavior. Here the integration of data sets that derive from forward genetics in animals and genetic association studies including genome wide association studies (GWAS) in humans is described for addictive behavior. The aim of forward genetics in animals and association studies in humans is to identify mutations (e.g. SNPs) that produce a certain phenotype; i.e. "from phenotype to genotype". Most powerful in terms of forward genetics is combined quantitative trait loci (QTL) analysis and gene expression profiling in recombinant inbreed rodent lines or genetically selected animals for a specific phenotype, e.g. high vs. low drug consumption. By Bayesian scoring genomic information from forward genetics in animals is then combined with human GWAS data on a similar addiction-relevant phenotype. This integrative approach generates a robust candidate gene list that has to be functionally validated by means of reverse genetics in animals; i.e. "from genotype to phenotype". It is proposed that studying addiction relevant phenotypes and endophenotypes by this CFG approach will allow a better determination of the genetics of addictive behavior.

  17. Chronic obstructive pulmonary disease candidate gene prioritization based on metabolic networks and functional information.

    Directory of Open Access Journals (Sweden)

    Xinyan Wang

    Full Text Available Chronic obstructive pulmonary disease (COPD is a multi-factor disease, in which metabolic disturbances played important roles. In this paper, functional information was integrated into a COPD-related metabolic network to assess similarity between genes. Then a gene prioritization method was applied to the COPD-related metabolic network to prioritize COPD candidate genes. The gene prioritization method was superior to ToppGene and ToppNet in both literature validation and functional enrichment analysis. Top-ranked genes prioritized from the metabolic perspective with functional information could promote the better understanding about the molecular mechanism of this disease. Top 100 genes might be potential markers for diagnostic and effective therapies.

  18. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed; Alroobi, Rami; Banitaan, Shadi; Seridi, Loqmane; Aljarah, Ibrahim; Brewer, James

    2013-01-01

    networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive to well established functional

  19. Application of Weighted Gene Co-expression Network Analysis for Data from Paired Design.

    Science.gov (United States)

    Li, Jianqiang; Zhou, Doudou; Qiu, Weiliang; Shi, Yuliang; Yang, Ji-Jiang; Chen, Shi; Wang, Qing; Pan, Hui

    2018-01-12

    Investigating how genes jointly affect complex human diseases is important, yet challenging. The network approach (e.g., weighted gene co-expression network analysis (WGCNA)) is a powerful tool. However, genomic data usually contain substantial batch effects, which could mask true genomic signals. Paired design is a powerful tool that can reduce batch effects. However, it is currently unclear how to appropriately apply WGCNA to genomic data from paired design. In this paper, we modified the current WGCNA pipeline to analyse high-throughput genomic data from paired design. We illustrated the modified WGCNA pipeline by analysing the miRNA dataset provided by Shiah et al. (2014), which contains forty oral squamous cell carcinoma (OSCC) specimens and their matched non-tumourous epithelial counterparts. OSCC is the sixth most common cancer worldwide. The modified WGCNA pipeline identified two sets of novel miRNAs associated with OSCC, in addition to the existing miRNAs reported by Shiah et al. (2014). Thus, this work will be of great interest to readers of various scientific disciplines, in particular, genetic and genomic scientists as well as medical scientists working on cancer.

  20. ASSOCIATION OF HFE GENE MUTATION IN THALASSEMIA MAJOR PATIENTS

    Directory of Open Access Journals (Sweden)

    Amit Kumar Tiwari

    2016-11-01

    Full Text Available BACKGROUND Thalassemia major patients are dependent on frequent blood transfusion and consequently develop iron overload. HFE gene mutations (C282Y, H63D and S65C in hereditary haemochromatosis has been shown to be associated with iron overload. The study aims at finding the association of HFE gene mutations in β-thalassemia major patients. MATERIALS AND METHODS A descriptive observational pilot study was conducted including fifty diagnosed -thalassemia major cases. DNA analysis by PCR-RFLP method for HFE gene mutations was performed. RESULTS Only H63D mutation (out of three HFE gene mutations was detected in 8 out of 50 cases. Observed frequency of H63D mutation was 16%. While frequency of C282Y and S65C were 0% each. CONCLUSION The frequency of HFE mutation in -thalassemia major is not very common.