This thesis, entitled "Function analysis of unknown genes", presents the use of proteome analysis for the characterisation of yeast (Saccharomyces cerevisiae) genes and their products (proteins, especially those of unknown function). This study illustrates that proteome analysis can be used...... to describe different aspects of molecular biology of the cell, to study changes that occur in the cell due to overexpression or deletion of a gene and to identify various protein modifications. The biological questions and the results of the described studies show the diversity of the information that can...... genes and proteins. It reports the first global proteome database collecting 36 yeast single gene deletion mutants and selecting over 650 differences between analysed mutants and the wild type strain. The obtained results show that two-dimensional gel electrophoresis and mass spectrometry based proteome...
Debrabant, Birgit; Soerensen, Mette
We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. In particular, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the co...
Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes
The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378
Huang, Yen-Tsung; Lin, Xihong
Gene set analyses have become increasingly important in genomic research, as many complex diseases arise jointly from alterations in numerous genes. Genes often act together as a functional repertoire, e.g., a biological pathway or network, and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to account for this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analysis method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and the global test, in both simulations and a diabetes microarray dataset.
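The permutation step mentioned in this abstract can be illustrated with a much simpler statistic than TEGS itself. The sketch below is an assumption-laden toy, not the authors' implementation: it uses the sum of squared covariances between the exposure and each gene (which ignores correlation among genes, i.e. no working covariance matrix), and it omits the scaled chi-square approximation. The function names `score_statistic` and `permutation_p` and the toy data are hypothetical.

```python
import random

def score_statistic(x, expr):
    """Sum over genes of the squared covariance-like score between x and the gene.

    x    : list of exposure values, one per sample
    expr : list of genes, each a list of expression values (one per sample)
    Correlation among genes is ignored here (an assumption of this sketch).
    """
    n = len(x)
    x_bar = sum(x) / n
    x_centered = [xi - x_bar for xi in x]
    q = 0.0
    for gene_values in expr:
        g_bar = sum(gene_values) / n
        score = sum(a * (b - g_bar) for a, b in zip(x_centered, gene_values))
        q += score * score
    return q

def permutation_p(x, expr, n_perm=1000, seed=0):
    """Permutation p-value obtained by shuffling the exposure labels across samples."""
    rng = random.Random(seed)
    observed = score_statistic(x, expr)
    labels = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if score_statistic(labels, expr) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: 12 samples (6 unexposed, 6 exposed), 4 genes that all shift by +2
# in exposed samples, plus small deterministic "noise" in [-0.5, 0.5].
x = [0] * 6 + [1] * 6
expr = [
    [2.0 * x[i] + ((i * 7 + g * 3) % 11 - 5) / 10.0 for i in range(12)]
    for g in range(4)
]
p_value = permutation_p(x, expr)
print(p_value)
```

Because sample labels are permuted as a whole, the dependence among genes within each sample is preserved under the null, which is one common reason to prefer sample permutation over gene sampling.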
Jung, Chol-Hee; Wong, Chui E.; Singh, Mohan B.; Bhalla, Prem L.
Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant, Arabidopsis. PMID:22679494
van Ruissen, Fred; Baas, Frank
In 1995, serial analysis of gene expression (SAGE) was developed as a versatile tool for gene expression studies. SAGE technology does not require pre-existing knowledge of the genome that is being examined and therefore SAGE can be applied to many different model systems. In this chapter, the SAGE
Mocellin, Simone; Rossi, Carlo Riccardo
The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.
Wang, Xiran; Jiang, Leiyu; Tang, Haoru
GSTF12 has long been known as a key factor in the accumulation of proanthocyanins in the plant testa. Bioinformatic analysis of the nucleotide and encoded protein sequences of GSTF12 benefits the study of genes in the anthocyanin biosynthesis and accumulation pathway. We therefore chose the GSTF12 genes of 11 species as the research object, downloaded their nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene by bioinformatic analysis, and constructed a phylogenetic tree. We also analysed the physical and chemical properties of the strawberry GSTF12 gene and the structure of its protein. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction indicated that the protein has one signal peptide and no obvious transmembrane regions.
Pers, Tune H
Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...
Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S
The PHB (prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, in development, senescence, defence, and other processes. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. To better guide gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion; however, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms. The highly diverse biological function indicated that more research needs to be carried out
de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
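The "regression structure" of the gene-set layer described above can be pictured with a toy competitive test: regress per-gene association Z-scores on an indicator of set membership and examine the slope. This is only a minimal sketch under stated assumptions. It is not MAGMA's implementation, which per the abstract is built around a multiple regression gene analysis and supports further gene properties; the gene names, Z-scores, and the helper `simple_regression_t` are hypothetical.

```python
from math import sqrt

def simple_regression_t(x, y):
    """Ordinary least squares of y on a single predictor x.

    Returns (slope, t_statistic) for the slope, using n - 2 residual
    degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    standard_error = sqrt(rss / (n - 2) / sxx)
    return slope, slope / standard_error

# Hypothetical per-gene Z-scores from some gene-level analysis.
gene_z = {"A": 2.1, "B": 1.8, "C": 2.5, "D": 0.1,
          "E": -0.3, "F": 0.4, "G": -0.2, "H": 0.0}
gene_set = {"A", "B", "C"}  # hypothetical gene set

genes = sorted(gene_z)
indicator = [1.0 if g in gene_set else 0.0 for g in genes]
z_scores = [gene_z[g] for g in genes]

slope, t_stat = simple_regression_t(indicator, z_scores)
print(slope, t_stat)
```

With a binary indicator the slope equals the difference in mean Z-score between genes inside and outside the set. Replacing the indicator with a continuous gene property (for example, a hypothetical expression level per gene) reuses the same function unchanged, which is the kind of generalization the abstract alludes to.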
Background: Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results: We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages), seed coats (globular and torpedo stages) and endosperm (pooled globular to torpedo stages), and genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (EST) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development. Conclusions: We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise
The use of gene expression profiling to predict chemical mode of action would be enhanced by better characterization of variance due to individual, environmental, and technical factors. Meta-analysis of microarray data from untreated or vehicle-treated animals within the control arm of toxicogenomics studies has yielded useful information on baseline fluctuations in gene expression. A dataset of control animal microarray expression data was assembled by a working group of the Health and Environmental Sciences Institute's Technical Committee on the Application of Genomics in Mechanism Based Risk Assessment in order to provide a public resource for assessments of variability in baseline gene expression. Data from over 500 Affymetrix microarrays from control rat liver and kidney were collected from 16 different institutions. Thirty-five biological and technical factors were obtained for each animal, describing a wide range of study characteristics, and a subset were evaluated in detail for their contribution to total variability using multivariate statistical and graphical techniques. The study factors that emerged as key sources of variability included gender, organ section, strain, and fasting state. These and other study factors were identified as key descriptors that should be included in the minimal information about a toxicogenomics study needed for interpretation of results by an independent source. Genes that are the most and least variable, gender-selectiv
Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng
Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
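The contrast between the two strategies can be made concrete with an over-representation test. The sketch below assumes a one-sided hypergeometric test, a common choice that the abstract does not name, and runs it on an entirely made-up pathway in which only upregulated genes are over-represented. The helper names `hypergeom_sf` and `enrichment_p` and the gene identifiers are hypothetical.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def enrichment_p(pathway, selected, background):
    """One-sided over-representation p-value of `pathway` among `selected` genes."""
    universe = set(background)
    pathway_genes = set(pathway) & universe
    selected_genes = set(selected) & universe
    overlap = len(pathway_genes & selected_genes)
    return hypergeom_sf(overlap, len(universe), len(pathway_genes), len(selected_genes))

# Toy data: 100 background genes, a 10-gene pathway,
# 10 upregulated genes (6 in the pathway) and 30 downregulated genes (none in it).
background = [f"g{i}" for i in range(100)]
pathway = background[:10]
up = background[:6] + background[50:54]
down = background[60:90]

p_all = enrichment_p(pathway, up + down, background)   # all DE genes together
p_up = enrichment_p(pathway, up, background)           # upregulated genes only
p_down = enrichment_p(pathway, down, background)       # downregulated genes only
print(p_all, p_up, p_down)
```

In this constructed case the pooled list dilutes the six pathway hits among 40 DE genes, so the pooled p-value is larger than the p-value for the upregulated genes alone. Real data need not behave this way, and testing two lists per pathway doubles the number of tests that must be corrected for.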
Kevin L Childs
With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional
Large numbers of quantitative trait loci (QTL) affecting complex diseases and other quantitative traits have been reported in humans and model animals. However, the genetic architecture of these traits remains elusive due to the difficulty in identifying causal quantitative trait genes (QTGs) for common QTL with relatively small phenotypic effects. A traditional strategy based on techniques such as positional cloning does not always enable identification of a single candidate gene for a QTL of interest because it is difficult to narrow down a target genomic interval of the QTL to a very small interval harboring only one gene. A combination of gene expression analysis and statistical causal analysis can greatly reduce the number of candidate genes. This integrated approach provides causal evidence that one of the candidate genes is a putative QTG for the QTL. Using this approach, I have recently succeeded in identifying a single putative QTG for resistance to obesity in mice. Here, I outline the integration approach and discuss its usefulness using my studies as an example.
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of the absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating characteristic curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
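The mechanics of a gene-sampling test with and without the absolute gene statistic can be sketched as follows. This is a minimal illustration, assuming the set score is the mean gene statistic and the null distribution comes from random gene sets of the same size; the paper's exact methods are not reproduced here. The function `gene_sampling_p` and the toy statistics are hypothetical, and the example shows only the bidirectional-change behaviour, not the reported reduction in false positives.

```python
import random

def gene_sampling_p(stats, gene_set, n_samples=2000, use_abs=True, seed=0):
    """One-sided gene-sampling p-value for the mean gene statistic of a set.

    stats    : dict mapping gene name to a signed statistic (e.g. a t-statistic)
    gene_set : list of gene names
    use_abs  : if True, score genes by the absolute value of their statistic
    """
    rng = random.Random(seed)
    transform = abs if use_abs else (lambda v: v)
    values = {g: transform(s) for g, s in stats.items()}
    members = [g for g in gene_set if g in values]
    observed = sum(values[g] for g in members) / len(members)
    all_genes = list(values)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_genes, len(members))
        if sum(values[g] for g in sample) / len(members) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_samples + 1)

# Toy statistics: 190 background genes with small values in [-1, 1] and a
# 10-gene set whose members change strongly but in opposite directions.
stats = {f"g{i}": ((i * 37) % 21 - 10) / 10.0 for i in range(10, 200)}
gene_set = [f"g{i}" for i in range(10)]
for i, g in enumerate(gene_set):
    stats[g] = 3.0 if i % 2 == 0 else -3.0

p_abs = gene_sampling_p(stats, gene_set, use_abs=True)
p_signed = gene_sampling_p(stats, gene_set, use_abs=False)
print(p_abs, p_signed)
```

In this toy set the signed statistics cancel to a mean of zero, so the signed test sees nothing unusual, while the absolute statistic flags the set.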
Tintle Nathan L
Background: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
Köhler, S; Belkin, S; Schmid, R D
In parallel to the continuous development of increasingly more sophisticated physical and chemical analytical technologies for the detection of environmental pollutants, there is a progressively more urgent need also for bioassays which report not only on the presence of a chemical but also on its bioavailability and its biological effects. As a partial fulfillment of that need, there has been a rapid development of biosensors based on genetically engineered bacteria. Such microorganisms typically combine a promoter-operator, which acts as the sensing element, with reporter gene(s) coding for easily detectable proteins. These sensors have the ability to detect global parameters such as stress conditions, toxicity or DNA-damaging agents as well as specific organic and inorganic compounds. The systems described in this review, designed to detect different groups of target chemicals, vary greatly in their detection limits, specificity, response times and more. These variations reflect on their potential applicability which, for most of the constructs described, is presently rather limited. Nevertheless, present trends promise that additional improvements will make microbial biosensors an important tool for future environmental analysis.
Fehrmann, Rudolf S. N.; Karjalainen, Juha M.; Krajewska, Malgorzata
Many cancer-associated somatic copy number alterations (SCNAs) are known. Currently, one of the challenges is to identify the molecular downstream effects of these variants. Although several SCNAs are known to change gene expression levels, it is not clear whether each individual SCNA affects gen...
The study of clinical samples is often limited by the amount of material available to study. While proteins cannot be multiplied in their natural form, DNA and RNA can be amplified from small specimens and used for high-throughput analyses. Therefore, genetic studies offer the best opportunity to screen for novel insights of human pathology when little material is available. Precise estimates of DNA copy numbers in a given specimen are necessary. However, most studies investigate static variables such as the genetic background of patients or mutations within pathological specimens without a need to assess proportionality of expression among different genes throughout the genome. Comparative genomic hybridization of DNA samples represents a crude exception to this rule since genomic amplification or deletion is compared among different specimens directly. For gene expression analysis, however, it is critical to accurately estimate the proportional expression of distinct RNA transcripts since such proportions directly govern cell function by modulating protein expression. Furthermore, comparative estimates of relative RNA expression at different time points portray the response of cells to environmental stimuli, indirectly informing about broader biological events affecting a particular tissue in physiological or pathological conditions. This cognitive reaction of cells is similar to the detection of electroencephalographic patterns which inform about the status of the brain in response to external stimuli. As our need to understand human pathophysiology at the global level increases, the development and refinement of technologies for high fidelity messenger RNA amplification have become the focus of increasing interest during the past decade. The need to increase the abundance of RNA has been met not only for gene specific amplification, but, most importantly for global transcriptome wide, unbiased amplification. Now gene
Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun
The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many human tissue and cell transcriptome datasets generated by RNA-Seq are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell and tissue types, respectively. To characterize the gene expression pattern in terms of expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes specific to the cancer group but not to the normal group) are expressed at higher levels and more variably in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand how patterns of gene expression differ among cell types, particularly in cancer. PMID:23382867
Larsen, Lesli H; Gjesing, Anette P; Sørensen, Thorkild I A
To investigate the preproghrelin gene for variants and their association with obesity and type 2 diabetes....
Fridley, Brooke L; Batzler, Anthony; Li, Liang; Li, Fang; Matimba, Alice; Jenkins, Gregory D; Ji, Yuan; Wang, Liewei; Weinshilboum, Richard M
Responses to therapies, either with regard to toxicities or efficacy, are expected to involve complex relationships of gene products within the same molecular pathway or functional gene set. Therefore, pathways or gene sets, as opposed to single genes, may better reflect the true underlying biology and may be more appropriate units for analysis of pharmacogenomic studies. Application of such methods to pharmacogenomic studies may enable the detection of more subtle effects of multiple genes in the same pathway that may be missed by assessing each gene individually. A gene set analysis of 3821 gene sets is presented assessing the association between basal messenger RNA expression and drug cytotoxicity using ethnically defined human lymphoblastoid cell lines for two classes of drugs: pyrimidines [gemcitabine (dFdC) and arabinoside] and purines [6-thioguanine and 6-mercaptopurine]. The gene set nucleoside-diphosphatase activity was found to be significantly associated with both dFdC and arabinoside, whereas gene set γ-aminobutyric acid catabolic process was associated with dFdC and 6-thioguanine. These gene sets were significantly associated with the phenotype even after adjusting for multiple testing. In addition, five associated gene sets were found in common between the pyrimidines and two gene sets for the purines (3',5'-cyclic-AMP phosphodiesterase activity and γ-aminobutyric acid catabolic process) with a P value of less than 0.0001. Functional validation was attempted with four genes each in gene sets for thiopurine and pyrimidine antimetabolites. All four genes selected from the pyrimidine gene sets (PSME3, CANT1, ENTPD6, ADRM1) were validated, but only one (PDE4D) was validated for the thiopurine gene sets. In summary, results from the gene set analysis of pyrimidine and purine therapies, used often in the treatment of various cancers, provide novel insight into the relationship between genomic variation and drug response.
Boris P Hejblum
Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates if the time patterns of gene sets significantly differ according to these conditions. The interest of the method is illustrated by its application to two real life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets finally found to be linked with the influenza vaccine too, although they were found to be associated with the pneumococcal vaccine only in previous analyses. In our simulation study TcGSA exhibits good statistical properties, and an increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available for the community through an R package.
Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin
This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls) were downloaded from Gene Expression Omnibus database. Characteristic genes were identified using metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs sanguinarine and papaverine were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
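The definition of module significance as the average gene significance can be sketched as follows. The sketch assumes that gene significance is the absolute Pearson correlation between a gene's expression and a 0/1 disease-status vector; the abstract does not state which measure was used, and the function names and toy numbers are hypothetical. It is not the WGCNA R package itself.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def module_significance(module_expr, status):
    """Average gene significance over all genes in one module.

    Gene significance is ASSUMED to be |corr(expression, status)|.
    module_expr maps gene name -> expression values, one per sample.
    """
    significances = [abs(pearson(values, status))
                     for values in module_expr.values()]
    return sum(significances) / len(significances)


# Toy module: two genes, four samples (two controls, two cases).
toy_module = {"g1": [1, 2, 3, 4], "g2": [4, 3, 2, 1]}
toy_status = [0, 0, 1, 1]
ms = module_significance(toy_module, toy_status)
```

Taking the absolute value means that genes correlated negatively with disease status contribute as much as positively correlated ones.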
Background: Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene expression associated with different developmental stages to each other to study patterns of long-term developmental gene regulation. We use tools from functional data analysis to study dynamic changes by relating temporal gene expression profiles of different developmental stages to each other. Results: We demonstrate that functional regression methodology can pinpoint relationships that exist between temporal gene expression profiles for different life cycle phases and incorporates dimension reduction as needed for these high-dimensional data. By applying these tools, gene expression profiles for pupa and adult phases are found to be strongly related to the profiles of the same genes obtained during the embryo phase. Moreover, one can distinguish between gene groups that exhibit relationships with positive and others with negative associations between later life and embryonal expression profiles. Specifically, we find a positive relationship in expression for muscle development related genes, and a negative relationship for strictly maternal genes for Drosophila, using temporal gene expression profiles. Conclusion: Our findings point to specific reactivation patterns of gene expression during the Drosophila life cycle which differ in characteristic ways between various gene groups. Functional regression emerges as a useful tool for relating gene expression patterns from different developmental stages, and avoids the problems with large numbers of parameters and multiple testing that affect alternative approaches.
Mukherjee, Krishanu; Brocchieri, Luciano; Bürglin, Thomas R.
The full complement of homeobox transcription factor sequences, including genes and pseudogenes, was determined from the analysis of 10 complete genomes from flowering plants, moss, Selaginella, unicellular green algae, and red algae. Our exhaustive genome-wide searches resulted in the discovery in each class of a greater number of homeobox genes than previously reported. All homeobox genes can be unambiguously classified by sequence evolutionary analysis into 14 distinct classes also charact...
An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios
Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. The experimental
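The query step described above (ranking genes by the Euclidean distance between their feature vectors) can be sketched briefly. The wavelet feature extraction itself is not reproduced here; the short vectors below are made-up stand-ins for such features, and the function name is hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def most_similar_genes(query, features, k=2):
    """Return the k genes whose feature vectors lie closest to the query gene.

    features maps gene name -> feature vector (all of equal length).
    """
    query_vector = features[query]
    others = [gene for gene in features if gene != query]
    others.sort(key=lambda gene: dist(query_vector, features[gene]))
    return others[:k]


# Toy feature vectors standing in for wavelet features of expression maps.
toy_features = {
    "A": [0.0, 0.0],
    "B": [1.0, 0.0],
    "C": [5.0, 5.0],
    "D": [0.0, 2.0],
}
neighbours = most_similar_genes("A", toy_features, k=2)
```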
Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo
In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
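The weighted number of links (WNL) and the notion of orphan genes can be illustrated with a short sketch. It assumes that a gene's WNL is the sum of the interaction scores of all its links and that an orphan is a gene with no links at all; the edge scores below are invented for illustration and are not STRING output.

```python
from collections import defaultdict


def weighted_number_of_links(genes, edges):
    """Sum the scores of all links touching each gene (assumed WNL definition).

    genes is the full gene list; edges is a list of (gene_a, gene_b, score).
    Genes without any link keep a WNL of 0.
    """
    wnl = defaultdict(float)
    for gene_a, gene_b, score in edges:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return {gene: wnl[gene] for gene in genes}


def orphan_genes(wnl):
    """Genes with no interactions, i.e. a WNL of zero."""
    return [gene for gene, value in wnl.items() if value == 0]


# Hypothetical scores; gene names are reused from the abstract only as labels.
toy_genes = ["IL6", "TNF", "IFNG", "geneX"]
toy_edges = [("IL6", "TNF", 0.9), ("IL6", "IFNG", 0.8)]
toy_wnl = weighted_number_of_links(toy_genes, toy_edges)
toy_orphans = orphan_genes(toy_wnl)
```

Ranking genes into classes (leader, B, C, and so on) would then be a clustering step on these WNL values.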
Quinclorac is a highly selective auxin-type herbicide, and is widely used in the effective control of barnyard grass in paddy rice fields, improving the world’s rice yield. The herbicide mode of action of quinclorac has been proposed and hormone interactions affect quinclorac signaling. Because of widespread use, quinclorac may be transported outside rice fields with the drainage waters, leading to soil and water pollution and environmental health problems. In this study, we used the 57K Affymetrix rice whole-genome array to identify quinclorac signaling response genes to study the molecular mechanisms of action and detoxification of quinclorac in rice plants. Overall, 637 probe sets were identified with differential expression levels under either 6 or 24 h of quinclorac treatment. Auxin-related genes such as GH3 and OsIAAs responded to quinclorac treatment. Gene Ontology analysis showed that genes of detoxification-related families were significantly enriched, including cytochrome P450, GST, UGT, and ABC and drug transporter genes. Moreover, real-time RT-PCR analysis showed that top candidate P450 families such as CYP81, CYP709C and CYP72A genes were universally induced by different herbicides. Some Arabidopsis genes for the same P450 family were up-regulated under quinclorac treatment. We conducted rice whole-genome GeneChip analysis and performed the first global identification of quinclorac response genes. This work may provide potential markers for detoxification of quinclorac and biomonitors of environmental chemical pollution.
Baseler Michael W
Background: Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases in order to integrate comprehensive annotation information for their genes, which becomes one of the bottlenecks, particularly for the analytic task associated with a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high throughput gene functional analysis. Description: The DAVID Knowledgebase is built around the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of gene/protein identifiers from a variety of public genomic resources into DAVID gene clusters. The grouping of such identifiers improves the cross-reference capability, particularly across NCBI and UniProt systems, enabling more than 40 publicly available functional annotation sources to be comprehensively integrated and centralized by the DAVID gene clusters. The simple, pair-wise, text format files which make up the DAVID Knowledgebase are freely downloadable for various data analysis uses. In addition, a well organized web interface allows users to query different types of heterogeneous annotations in a high-throughput manner. Conclusion: The DAVID Knowledgebase is designed to facilitate high throughput gene functional analysis. For a given gene list, it not only provides the quick accessibility to a wide range of heterogeneous annotation data in a centralized location, but also enriches the level of biological information for an individual gene. Moreover, the entire DAVID Knowledgebase is freely downloadable or searchable at http://david.abcc.ncifcrf.gov/knowledgebase/.
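Single-linkage agglomeration of identifiers, as described above, amounts to merging any two identifiers that share a cross-reference, directly or through a chain of cross-references. A minimal sketch using a union-find structure follows; it illustrates the general technique only, and the identifier strings and function name are hypothetical, not DAVID's actual implementation or data.

```python
def cluster_identifiers(cross_references):
    """Group identifiers connected by any chain of pairwise cross-references.

    cross_references is a list of (id_a, id_b) pairs. Returns a list of sets,
    one per connected group (single-linkage cluster).
    """
    parent = {}

    def find(identifier):
        # Register unseen identifiers as their own root, then walk to the root
        # while shortening the path (path halving).
        parent.setdefault(identifier, identifier)
        while parent[identifier] != identifier:
            parent[identifier] = parent[parent[identifier]]
            identifier = parent[identifier]
        return identifier

    for id_a, id_b in cross_references:
        root_a, root_b = find(id_a), find(id_b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    clusters = {}
    for identifier in list(parent):
        clusters.setdefault(find(identifier), set()).add(identifier)
    return list(clusters.values())


# Made-up identifiers from three imaginary source databases.
toy_pairs = [
    ("NCBI:1", "UP:P1"),
    ("UP:P1", "ENS:E1"),
    ("NCBI:2", "UP:P2"),
]
gene_clusters = cluster_identifiers(toy_pairs)
```

Here "NCBI:1" and "ENS:E1" end up in the same cluster even though no pair links them directly, which is the defining property of single linkage.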
Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand
The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important, and it is currently the focus of many research efforts. Besides its scientific interest in the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques in gene prediction. First, the problem of gene prediction and its challenges are described. These are followed by different soft computing techniques along with their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally, some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.
Cimica, Velasco; Batusic, Danko; Haralanova-Ilieva, Borislava; Chen, Yonglong; Hollemann, Thomas; Pieler, Tomas; Ramadori, Giuliano
We have applied serial analysis of gene expression (SAGE) to study the molecular mechanism of rat liver regeneration in the model of 70% partial hepatectomy (PH). We generated three SAGE libraries from a normal control liver (NL library: 52,343 tags), from a sham control operated liver (Sham library: 51,028 tags), and from a regenerating liver (PH library: 53,061 tags). By SAGE bioinformatics analysis we identified 40 induced genes and 20 repressed genes during liver regeneration. We verified the temporal expression of these genes by real time PCR during the regeneration process and we characterized 13 induced genes and 3 repressed genes. We found that the connective tissue growth factor (CTGF) transcript and protein were induced very early, at 4 h after the PH operation, before hepatocyte proliferation is triggered. Our study suggests CTGF as a growth factor signaling mediator that could be involved directly in the mechanism of liver regeneration induction.
Jiang, Jinjin; Wang, Yue; Zhu, Bao; Fang, Tingting; Fang, Yujie; Wang, Youping
Brassica includes many successfully cultivated crop species of polyploid origin, either by ancestral genome triplication or by hybridization between two diploid progenitors, displaying complex repetitive sequences and transposons. The U's triangle, which consists of three diploids and three amphidiploids, is optimal for the analysis of complicated genomes after polyploidization. Next-generation sequencing enables the transcriptome profiling of polyploids on a global scale. We examined the gene expression patterns of three diploids (Brassica rapa, B. nigra, and B. oleracea) and three amphidiploids (B. napus, B. juncea, and B. carinata) via digital gene expression analysis. In total, the libraries generated between 5.7 and 6.1 million raw reads, and the clean tags of each library were mapped to 18547-21995 genes of B. rapa genome. The unambiguous tag-mapped genes in the libraries were compared. Moreover, the majority of differentially expressed genes (DEGs) were explored among diploids as well as between diploids and amphidiploids. Gene ontological analysis was performed to functionally categorize these DEGs into different classes. The Kyoto Encyclopedia of Genes and Genomes analysis was performed to assign these DEGs into approximately 120 pathways, among which the metabolic pathway, biosynthesis of secondary metabolites, and peroxisomal pathway were enriched. The non-additive genes in Brassica amphidiploids were analyzed, and the results indicated that orthologous genes in polyploids are frequently expressed in a non-additive pattern. Methyltransferase genes showed differential expression pattern in Brassica species. Our results provided an understanding of the transcriptome complexity of natural Brassica species. The gene expression changes in diploids and allopolyploids may help elucidate the morphological and physiological differences among Brassica species.
Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat
In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify genes encoding gelatinase. L. sphaericus was isolated from soil, and its gelatinase is species-specific to porcine and bovine gelatin. This bacterium offers the possibility of producing enzymes that are specific to each of the two meat species. The main focus of this research is to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of a partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size with a sequence of 520 amino acids; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size with a sequence of 591 amino acids; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size with a sequence of 566 amino acids. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short cDNA sequences by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Thus, the bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
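The gene sizes and amino acid counts quoted above are consistent with each other if each reported length is an open reading frame that includes one stop codon, which is an assumption here rather than something the abstract states. A small arithmetic check, with a hypothetical helper name:

```python
def protein_length_from_orf(orf_length_bp):
    """Number of amino acids encoded by an ORF of the given length.

    ASSUMES the length counts a single terminal stop codon, so the protein
    has one residue fewer than the number of codons.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1


# Lengths (bp) reported for candidate genes P1, P2 and P3.
candidate_lengths = {"P1": 1563, "P2": 1776, "P3": 1701}
protein_lengths = {name: protein_length_from_orf(bp)
                   for name, bp in candidate_lengths.items()}
```

Under that assumption, 1563 bp gives 521 codons and 520 residues, 1776 bp gives 591 residues, and 1701 bp gives 566 residues, matching the reported values.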
Gastric cancer is one of the most severe complex diseases, with high morbidity and mortality worldwide. The molecular mechanisms and risk factors for this disease are still not clear because of the cancer heterogeneity caused by different genetic and environmental factors. With more and more expression data accumulated nowadays, we can perform integrative analysis of these data to understand the complexity of gastric cancer and to identify consensus players for the heterogeneous cancer. In the present work, we screened the published gene expression data and analyzed them with an integrative tool, combined with pathway and gene ontology enrichment investigation. We identified several consensus differentially expressed genes, and these genes were further confirmed with literature mining; finally, two genes, immunoglobulin J chain and C-X-C motif chemokine ligand 17, were screened as novel gastric cancer associated genes. Experimental validation is proposed to further confirm this finding.
Choe, Jae Young; Han, Hyung Soo; Lee, Seon Duk; Lee, Hanna; Lee, Dong Eun; Ahn, Jae Yun; Ryoo, Hyun Wook; Seo, Kang Suk; Kim, Jong Kun
TNF-α regulates immune cells and acts as an endogenous pyrogen. Reverse transcription polymerase chain reaction (RT-PCR) is one of the most commonly used methods for gene expression analysis. Among the alternatives to PCR, loop-mediated isothermal amplification (LAMP) shows good potential in terms of specificity and sensitivity. However, few studies have compared RT-PCR and LAMP for human gene expression analysis. Therefore, in the present study, we compared one-step RT-PCR, two-step RT-LAMP and one-step RT-LAMP for human gene expression analysis, using the human TNF-α gene as a biomarker from peripheral blood cells. Total RNA from three selected febrile patients was subjected to the three different methods of gene expression analysis. The detection limits of one-step RT-PCR and one-step RT-LAMP were the same, while that of two-step RT-LAMP was inferior. One-step RT-LAMP takes less time, and the experimental result is easy to determine. One-step RT-LAMP is a potentially useful and complementary tool that is fast and reasonably sensitive. In addition, one-step RT-LAMP could be useful in environments lacking specialized equipment or expertise.
Haman, Jiří; Valenta, Zdeněk; Kalina, Jan
Vol. 1, No. 1 (2013), p. 65. ISSN 1805-8698. [EFMI 2013 Special Topic Conference. 17.04.2013-19.04.2013, Prague] Institutional support: RVO:67985807 Keywords: shrinkage estimation * covariance matrix * high dimensional data * gene expression Subject RIV: IN - Informatics, Computer Science
Knudsen, Steen; Workman, Christopher; Sicheritz-Ponten, T.
GenePublisher, a system for automatic analysis of data from DNA microarray experiments, has been implemented with a web interface at http://www.cbs.dtu.dk/services/GenePublisher. Raw data are uploaded to the server together with a specification of the data. The server performs normalization...
Figure 1. Phylogenetic relation of apple ARF genes. The phylogenetic tree was constructed based on a complete protein sequence alignment of MdARFs by the neighbour-joining method with bootstrapping analysis (1000 replicates). The scale bar represents 0.05 amino acid substitutions per site. Paralogous gene pairs ...
Microarray analysis of the gene expression profile in triethylene glycol dimethacrylate-treated human dental pulp cells. ... Conclusions: Our results suggest that TEGDMA can change the many functions of hDPCs through large changes in gene expression levels and complex interactions with different signaling pathways.
Reverse transcription-qPCR (RT-qPCR) has become a popular method for gene expression studies. Its results require data normalization by housekeeping genes. No single gene has been proven to be stably expressed under all experimental conditions. Therefore, systematic evaluation of reference genes is necessary. With the aim of identifying optimum reference genes for RT-qPCR analysis of gene expression in different tissues of Panax ginseng and in seedlings grown under heat stress, we investigated the expression stability of eight candidate reference genes, including elongation factor 1-beta (EF1-β), elongation factor 1-gamma (EF1-γ), eukaryotic translation initiation factor 3G (IF3G), eukaryotic translation initiation factor 3B (IF3B), actin (ACT), actin11 (ACT11), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and cyclophilin ABH-like protein (CYC), using four widely used computational programs: geNorm, Normfinder, BestKeeper, and the comparative ΔCt method. The results were then integrated using the web-based tool RefFinder. As a result, EF1-γ, IF3G and EF1-β were the three most stable genes in different tissues of P. ginseng, while IF3G, ACT11 and GAPDH were the top three-ranked genes in seedlings treated with heat. Using the three better reference genes alone or in combination as internal controls, we examined the expression profiles of MAR, a multiple function-associated mRNA-like non-coding RNA (mlncRNA) in P. ginseng. Taken together, we recommend EF1-γ/IF3G and IF3G/ACT11 as the suitable pairs of reference genes for RT-qPCR analysis of gene expression in different tissues of P. ginseng and in seedlings grown under heat stress, respectively. The results serve as a foundation for future studies on P. ginseng functional genomics.
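Of the four stability measures named above, the comparative ΔCt method is simple enough to sketch. In the usual formulation, each candidate gene is compared with every other candidate: the per-sample Ct differences are computed, their standard deviation is taken, and the gene's stability score is the mean of those standard deviations (lower means more stable). The sketch below follows that formulation as an assumption; the Ct values and gene labels are invented and are not data from the study.

```python
from statistics import mean, stdev


def delta_ct_stability(ct_values):
    """Mean SD of pairwise delta-Ct per candidate gene (lower = more stable).

    ct_values maps gene name -> list of Ct values, one per sample, with the
    samples in the same order for every gene.
    """
    scores = {}
    for gene, gene_ct in ct_values.items():
        pairwise_sds = []
        for other, other_ct in ct_values.items():
            if other == gene:
                continue
            deltas = [a - b for a, b in zip(gene_ct, other_ct)]
            pairwise_sds.append(stdev(deltas))
        scores[gene] = mean(pairwise_sds)
    return scores


# Invented Ct values for three hypothetical candidates across three samples.
toy_ct = {
    "refA": [20.0, 21.0, 22.0],
    "refB": [25.0, 26.0, 27.0],  # tracks refA exactly, offset by 5 cycles
    "refC": [18.0, 22.0, 19.0],  # varies independently
}
stability = delta_ct_stability(toy_ct)
ranking = sorted(stability, key=stability.get)  # most stable first
```

A constant offset between two genes does not hurt their scores, since only the variability of the difference matters.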
Kamaraj, Balu; Gopalakrishnan, Chandrasekhar; Purohit, Rituraj
Albinism is an autosomal recessive genetic disorder due to low secretion of melanin. The oculocutaneous albinism (OCA) and ocular albinism (OA) genes are responsible for melanin production and also act as potential targets for miRNAs. The role of miRNA is to inhibit protein synthesis partially or completely by binding to the 3'UTR of the mRNA, thus regulating gene expression. In this analysis, we predicted the genetic variation in the 3'UTR of the transcript that can be a reason for low melanin production, thus causing albinism. Single nucleotide polymorphisms (SNPs) in the 3'UTR can create new binding sites for miRNA, which binds to the mRNA and inhibits the translation process either partially or completely; this may control gene expression and lead to hypopigmentation. We have developed a computational procedure to determine the SNPs in the 3'UTR region of the mRNA of OCA (TYR, OCA2, TYRP1 and SLC45A2) and OA (GPR143) genes that are a potential cause of albinism. We identified 37 SNPs in five genes that are predicted to create 87 new binding sites on mRNA, which may lead to abrogation of the translation process. Expression analysis confirms that these genes are highly expressed in skin and eye regions. Enrichment analysis supports that these genes are mainly involved in eye pigmentation and the melanin biosynthesis process. The network analysis also shows how the genes interact and are expressed in a complex network. This insight provides clues for wet-lab researchers to understand the expression pattern of OCA and OA genes and the binding of mRNA and miRNA upon mutation, which is responsible for inhibition of the translation process at the genomic level.
Skarman, Axel; Jiang, Li; Hornshøj, Henrik
Abstract Background: Gene set analysis is considered to be a way of improving our biological interpretation of the observed expression patterns. This paper describes different methods applied to analyse expression data from a chicken DNA microarray dataset. Results: Applying different gene set...... analyses to the chicken expression data led to different ranking of the Gene Ontology terms tested. A method for prediction of possible annotations was applied. Conclusion: Biological interpretation based on gene set analyses dependent on the statistical method used. Methods for predicting the possible...
Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung
The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes corresponds to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve the gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm), which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis. Four gene expression datasets showed that the proposed
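The two quantities that association-rule mining thresholds on, support and confidence, can be shown in a few lines. The sketch covers only these standard market-basket definitions, not the DAR algorithm's confidence-interval procedure for choosing the thresholds; the item labels and samples are made up.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= transaction for transaction in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Each sample is treated as a 'basket' of discretised gene states plus a
# phenotype label. All labels and values here are invented.
toy_samples = [
    {"g1_up", "tumor"},
    {"g1_up", "tumor"},
    {"g1_up", "normal"},
    {"g2_up", "normal"},
]
rule_support = support(toy_samples, {"g1_up", "tumor"})
rule_confidence = confidence(toy_samples, {"g1_up"}, {"tumor"})
```

A rule is normally kept only if both values exceed chosen minimums; the abstract's contribution is a statistical way of setting those minimums.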
Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien
Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA) that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0.
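For orientation, the single-category procedure that this abstract contrasts MGSA with can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. This is a generic illustration in Python, not the mgsa package or its API; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, in_category: int,
                       study: int, overlap: int) -> float:
    """One-sided Fisher's exact test for a single category.

    Returns P(X >= overlap), where X is the number of category members
    among `study` genes drawn without replacement from `population` genes,
    of which `in_category` belong to the category.
    """
    upper = min(in_category, study)
    total = comb(population, study)
    tail = sum(
        comb(in_category, k) * comb(population - in_category, study - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 genes in total, 5 annotated to the category,
# 5 genes in the study set, 3 of which carry the annotation.
p = enrichment_p_value(population=20, in_category=5, study=5, overlap=3)
print(f"{p:.4f}")
```

Because every category is tested on its own, overlapping categories that share the same few genes can all come out significant, which is the redundancy the abstract describes.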
Long SAGE analysis of genes differentially expressed in the midgut and silk gland between the sexes of the silkworm Bombyx mori. Liping Gan, Ying Wang, Jian Xi, Yanshan Niu, Hongyou Qin, Yanghu Sima, Shiqing Xu ...
Pritykin, Yuri; Ghersi, Dario; Singh, Mona
Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
Hanagata, Nobutaka; Takemura, Taro; Minowa, Takashi
Comprehensive gene expression analysis using DNA microarrays has become a widespread technique in molecular biological research. In the biomaterials field, it is used to evaluate the biocompatibility or cellular toxicity of metals, polymers and ceramics. Studies in this field have extracted differentially expressed genes in the context of differences in cellular responses among multiple materials. Based on these genes, the effects of materials on cells at the molecular level have been examined. Expression data ranging from several to tens of thousands of genes can be obtained from DNA microarrays. For this reason, several tens or hundreds of differentially expressed genes are often present in different materials. In this review, we outline the principles of DNA microarrays, and provide an introduction to methods of extracting information which is useful for evaluating and designing biomaterials from comprehensive gene expression data. (topical review)
Beauchamp, Nicholas J.; van Achterberg, Tanja A. E.; Engelse, Marten A.; Pannekoek, Hans; de Vries, Carlie J. M.
Migration and proliferation of vascular smooth muscle cells (SMCs) are key events in atherosclerosis. However, little is known about alterations in gene expression upon transition of the quiescent, contractile SMC to the proliferative SMC. We performed serial analysis of gene expression (SAGE) of
Background Renal cell carcinoma (RCC) is the most common cancer in adult kidney. The accuracy of current diagnosis and prognosis of the disease and the effectiveness of the treatment for the disease are limited by the poor understanding of the disease at the molecular level. To better understand the genetics and biology of RCC, we profiled the expression of 7,129 genes in both clear cell RCC tissue and cell lines using oligonucleotide arrays. Methods Total RNAs isolated from renal cell tumors, adjacent normal tissue and metastatic RCC cell lines were hybridized to Affymetrix HuFL oligonucleotide arrays. Genes were categorized into different functional groups based on the description of the Gene Ontology Consortium and analyzed based on the gene expression levels. Gene expression profiles of the tissue and cell line samples were visualized and classified by singular value decomposition. Reverse transcription polymerase chain reaction was performed to confirm the expression alterations of selected genes in RCC. Results Selected genes were annotated based on biological processes and clustered into functional groups. The expression levels of genes in each group were also analyzed. Seventy-four commonly differentially expressed genes with more than five-fold changes in RCC tissues were identified. The expression alterations of selected genes from these seventy-four genes were further verified using reverse transcription polymerase chain reaction (RT-PCR). Detailed comparison of gene expression patterns in RCC tissue and RCC cell lines shows significant differences between the two types of samples, but many important expression patterns were preserved. Conclusions This is one of the initial studies that examine the functional ontology of a large number of genes in RCC. Extensive annotation, clustering and analysis of a large number of genes based on the gene functional ontology revealed many interesting gene expression patterns in RCC. Most
Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. The assignment of functional information to these complex patterns remains a challenging task in effectively interpreting data and correlating results from across experiments, projects and laboratories. Methods which allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to data mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA) as a useful method for the rapid testing of group-wise up- or downregulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets related by known biological function or as designated solely by the end-user against large numbers of datasets simultaneously. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.
The FMR1 gene, a member of the fragile X-related gene family, is responsible for fragile X syndrome (FXS). Missense single-nucleotide polymorphisms (SNPs) are responsible for many complex diseases. The effect of FMR1 gene missense SNPs is unknown. The aim of this study, using in silico techniques, was to analyze all known missense mutations that can affect the functionality of the FMR1 gene, leading to mental retardation (MR) and FXS. Data on the human FMR1 gene were collected from the Ensembl database (release 81), the National Center for Biotechnology Information dbSNP Short Genetic Variations database, the 1000 Genomes Browser, and the NHLBI Exome Sequencing Project Exome Variant Server. In silico analysis was then performed. One hundred and twenty different missense SNPs of the FMR1 gene were determined. Of these, 11.66 % of the FMR1 gene missense SNPs were in highly conserved domains, and 83.33 % were in domains with high variety. The results of the in silico prediction analysis showed that 31.66 % of the FMR1 gene SNPs were disease related and that 50 % of SNPs had a pathogenic effect. The results of the structural and functional analysis revealed that although the R138Q mutation did not seem to have a damaging effect on the protein, the G266E and I304N SNPs appeared to disturb the interaction between the domains and affect the function of the protein. This is the first study to analyze all missense SNPs of the FMR1 gene. The results indicate the applicability of a bioinformatics approach to FXS and other FMR1-related diseases. The analysis of FMR1 gene missense SNPs using bioinformatics methods may therefore help in the diagnosis of FXS and other FMR1-related diseases.
Background Osteoblast differentiation requires the coordinated stepwise expression of multiple genes. Histone deacetylase inhibitors (HDIs) accelerate the osteoblast differentiation process by blocking the activity of histone deacetylases (HDACs), which alter gene expression by modifying chromatin structure. We previously demonstrated that HDIs and HDAC3 shRNAs accelerate matrix mineralization and the expression of osteoblast maturation genes (e.g. alkaline phosphatase, osteocalcin). Identifying other genes that are differentially regulated by HDIs might identify new pathways that contribute to osteoblast differentiation. Results To identify other osteoblast genes that are altered early by HDIs, we incubated MC3T3-E1 preosteoblasts with HDIs (trichostatin A, MS-275, or valproic acid) for 18 hours in osteogenic conditions. The promotion of osteoblast differentiation by HDIs in this experiment was confirmed by osteogenic assays. Gene expression profiles relative to vehicle-treated cells were assessed by microarray analysis with Affymetrix GeneChip 430 2.0 arrays. The regulation of several genes by HDIs in MC3T3-E1 cells and primary osteoblasts was verified by quantitative real-time PCR. Nine genes were differentially regulated by at least two-fold after exposure to each of the three HDIs and six were verified by PCR in osteoblasts. Four of the verified genes (solute carrier family 9 isoform 3 regulator 1 (Slc9a3r1), sorbitol dehydrogenase 1, a kinase anchor protein, and glutathione S-transferase alpha 4) were induced. Two genes (proteasome subunit, beta type 10 and adaptor-related protein complex AP-4 sigma 1) were suppressed. We also identified eight growth factors and growth factor receptor genes that are significantly altered by each of the HDIs, including Frizzled related proteins 1 and 4, which modulate the Wnt signaling pathway. Conclusion This study identifies osteoblast genes that are regulated early by HDIs and indicates pathways that
Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan
Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 (www.comparativego.com). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro
Phenotypic changes at different organization levels, from cell to entire organism, are associated with changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed light on unexpected behaviour of the gene regulation system while maintaining a more naturalistic view of the studied system. In this chapter we applied an unsupervised method to discriminate DMD patients and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung
Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.
Chang Jeffrey T
Background The biological phenotype of a cell, such as a characteristic visual image or behavior, reflects activities derived from the expression of collections of genes. As such, an ability to measure the expression of these genes provides an opportunity to develop more precise and varied sets of phenotypes. However, to use this approach requires computational methods that are difficult to implement and apply, and thus there is a critical need for intelligent software tools that can reduce the technical burden of the analysis. Tools for gene expression analyses are unusually difficult to implement in a user-friendly way because their application requires a combination of biological data curation, statistical computational methods, and database expertise. Results We have developed SIGNATURE, a web-based resource that simplifies gene expression signature analysis by providing software, data, and protocols to perform the analysis successfully. This resource uses Bayesian methods for processing gene expression data coupled with a curated database of gene expression signatures, all carried out within a GenePattern web interface for easy use and access. Conclusions SIGNATURE is available for public use at http://genepattern.genome.duke.edu/signature/.
Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin
Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in such outcomes as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed for the pan-genome and core genome by identifying the composition and function of the Brucella genomes. The results showed that the total of 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44 % of the core genes were devoted to metabolism and were mainly responsible for energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35 % of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the process of growth and reproduction in Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of intracellular survival in Brucella spp.
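The three categories reported in this abstract (1710 core, 1182 strain-specific and 2477 dispensable clusters, which together account for all 5369 clusters) follow from how many genomes each cluster occurs in. A minimal sketch of that partition is given below, assuming the usual convention that core clusters occur in every genome and strain-specific clusters in exactly one; the abstract does not state the exact thresholds the authors used, and the cluster and genome names are hypothetical.

```python
from typing import Dict, Set


def partition_pan_genome(clusters: Dict[str, Set[str]],
                         n_genomes: int) -> Dict[str, Set[str]]:
    """Split gene clusters into core, dispensable and strain-specific sets.

    `clusters` maps a cluster id to the (non-empty) set of genomes in which
    at least one member gene of that cluster was found.
    """
    parts: Dict[str, Set[str]] = {
        "core": set(), "dispensable": set(), "strain_specific": set()
    }
    for cluster_id, genomes in clusters.items():
        if len(genomes) == n_genomes:
            parts["core"].add(cluster_id)             # present in every genome
        elif len(genomes) == 1:
            parts["strain_specific"].add(cluster_id)  # present in one genome only
        else:
            parts["dispensable"].add(cluster_id)      # present in some genomes
    return parts


# Hypothetical toy input with three genomes.
toy_clusters = {
    "c1": {"genomeA", "genomeB", "genomeC"},
    "c2": {"genomeA", "genomeB"},
    "c3": {"genomeC"},
}
toy_parts = partition_pan_genome(toy_clusters, n_genomes=3)
print(toy_parts)
```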
Ambiguity in texts is a well-known problem: words can carry several meanings, and hence can be read and interpreted differently. This is also true in the biological literature; names of biological concepts, such as genes and proteins, might be ambiguous, referring in some cases to more than one gene or one protein, or in others, to both genes and proteins at the same time. Public biological databases give very useful insight into gene and protein information, including their names. In this study, we made a thorough analysis of the nomenclatures of genes and proteins in two data sources and for six different species. We developed an automated process that parses, extracts, processes and stores information available in two major biological databases: Entrez Gene and UniProtKB. We analysed gene and protein synonyms, their types, frequencies, and the ambiguities within a species, between data sources and across species. We found that at least 40% of the cross-species ambiguities are caused by names that are already ambiguous within the species. Our study shows that of the six species we analysed (Homo sapiens, Mus musculus, Arabidopsis thaliana, Oryza sativa, Bacillus subtilis and Pseudomonas fluorescens), rice (Oryza sativa) has the best naming model in the Entrez Gene database, with low ambiguities between data sources and across species.
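The within-species ambiguity discussed in this abstract can be illustrated by grouping identifiers under each name and keeping the names that map to more than one identifier. This is a sketch under stated assumptions, not the authors' pipeline: the `(gene_id, name)` record layout is hypothetical, and the case-insensitive comparison is a choice made here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def ambiguous_names(records: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Return the names that refer to more than one gene identifier.

    `records` yields (gene_id, name) pairs, where `name` is an official
    symbol or a synonym. Names are compared case-insensitively (an
    assumption of this sketch).
    """
    ids_by_name: Dict[str, Set[str]] = defaultdict(set)
    for gene_id, name in records:
        ids_by_name[name.lower()].add(gene_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical records for a single species.
toy_records = [
    ("G1", "abc1"),
    ("G2", "ABC1"),   # same name as G1 once case is ignored
    ("G3", "xyz"),
    ("G1", "xyz-like"),
]
toy_ambiguous = ambiguous_names(toy_records)
print(toy_ambiguous)
```

Running the same grouping on records pooled from two data sources or several species would give the between-source and cross-species counts in the same way.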
to protein: through epigenetic modifications, transcription regulators or post-transcriptional controls. The following papers concern several layers of gene regulation with questions answered by different HTS approaches. Genome-wide screening of epigenetic changes by ChIP-seq allowed us to study both spatial...... and temporal alterations of histone modifications (Papers I and II). Coupling the data with machine learning approaches, we established a prediction framework to assess the most informative histone marks as well as their most influential nucleosome positions in predicting the promoter usages. (Papers I...... they regulated or if the sites had global elevated usage rates by multiple TFs. Using RNA-seq, 5’end-seq in combination with depletion of 5’exonuclease as well as nonsense-mediated decay (NMD) factors, we systematically analyzed NMD substrates as well as their degradation intermediates in human cells (Paper V...
Background Microarray compendia profile the expression of genes in a number of experimental conditions. Such data compendia are useful not only to group genes and conditions based on their similarity in overall expression over profiles but also to gain information on more subtle relations between genes and conditions. Getting a clear visual overview of all these patterns in a single easy-to-grasp representation is a useful preliminary analysis step: We propose to use for this purpose an advanced exploratory method, called multidimensional unfolding. Results We present a novel algorithm for multidimensional unfolding that overcomes both general problems and problems that are specific for the analysis of gene expression data sets. Applying the algorithm to two publicly available microarray compendia illustrates its power as a tool for exploratory data analysis: The unfolding analysis of a first data set resulted in a two-dimensional representation which clearly reveals temporal regulation patterns for the genes and a meaningful structure for the time points, while the analysis of a second data set showed the algorithm's ability to go beyond a mere identification of those genes that discriminate between different patient or tissue types. Conclusion Multidimensional unfolding offers a useful tool for preliminary explorations of microarray data: By relying on an easy-to-grasp low-dimensional geometric framework, relations among genes, among conditions and between genes and conditions are simultaneously represented in an accessible way which may reveal interesting patterns in the data. An additional advantage of the method is that it can be applied to the raw data without necessitating the choice of suitable genewise transformations of the data.
Jia, Peilin; Ewers, Jeffrey M; Zhao, Zhongming
Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be
Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of the HCV core gene from the Pakistani population. Our sequences and sequences from Japan are grouped into the same cluster in the phylogenetic tree. Sequence comparison and ...
Jun 17, 2009 ... Molecular responses and expression analysis of genes in a xerophytic desert shrub Haloxylon ammodendron .... physiological determination and cDNA-AFLP analysis, three groups of seeds were sowed in pots with sand and .... HaDR27. U. 234. PDR-like ABC transporter. AT1G59870. HaDR28. U. 135.
Background As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
Kashin-Beck Disease (KBD) is an endemic osteochondropathy with an unknown pathogenesis. Diagnosis of KBD is effective only in advanced cases, which eliminates the possibility of early treatment and leads to an inevitable exacerbation of symptoms. Therefore, we aim to identify an accurate blood-based gene signature for the detection of KBD. Previously published gene expression profile data on cartilage and peripheral blood mononuclear cells (PBMCs) from adults with KBD were compared to select potential target genes. Microarray analysis was conducted to evaluate the expression of the target genes in a cohort of 100 KBD patients and 100 healthy controls. A gene expression signature was identified using a training set, which was subsequently validated using an independent test set with a minimum redundancy maximum relevance (mRMR) algorithm and support vector machine (SVM) algorithm. Fifty unique genes were differentially expressed between KBD patients and healthy controls. A 20-gene signature was identified that distinguished between KBD patients and controls with 90% accuracy, 85% sensitivity, and 95% specificity. This study identified a 20-gene signature that accurately distinguishes between patients with KBD and controls using peripheral blood samples. These results promote the further development of blood-based genetic biomarkers for detection of KBD.
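The three performance figures quoted in this abstract are related through the confusion matrix: with equally sized patient and control groups, accuracy is the mean of sensitivity and specificity, so 85% and 95% give 90%. The sketch below reproduces that arithmetic; the counts (20 patients, 20 controls) are hypothetical and chosen only to match the reported percentages, since the abstract does not give the size of the test set.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate among patients
        "specificity": tn / (tn + fp),   # true negative rate among controls
    }


# Hypothetical test set: 20 patients (17 detected) and 20 controls (19 cleared).
metrics = classification_metrics(tp=17, fn=3, tn=19, fp=1)
print(metrics)
```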
Adelson David L
Background A key open question in biology is whether genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL), where a QTL region could contain a number of genes that contribute to the trait being measured. Results We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO), with relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over-represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over-represented in QTL regions. Conclusion Our analysis provides evidence of over-represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that, from a biological standpoint, makes sense with respect to QTL category type.
Wang, Yunli; Pan, Youlian
Background: Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis, but they are unable to deal with the noise and high dimensionality associated with microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in the clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...
Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia
To explore the molecular mechanisms of bladder cancer (BC), a network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using the empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed from differentially co-expressed genes and links. The regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), and clusters were obtained through the molecular complex detection (MCODE) algorithm. Centrality analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were the top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
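Ranking hub genes "according to analysis of degree" amounts to counting how many interaction partners each node has. The sketch below illustrates this on a toy edge list; the gene labels and edges are placeholders invented for the example and do not represent STRING interactions or the study's network.

```python
from collections import Counter


def degree_ranking(edges):
    """Return (node, degree) pairs sorted from highest to lowest degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1  # each undirected edge adds one to both endpoints
        degree[b] += 1
    return degree.most_common()


# Placeholder network: geneA touches three partners, so it is the hub.
toy_edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]
for gene, deg in degree_ranking(toy_edges):
    print(gene, deg)
```

Stress and betweenness, the other two centralities named in the abstract, depend on shortest paths rather than direct neighbours and would need a graph library or a breadth-first search on top of this.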
Lim, Dajeong; Lee, Seung-Hwan; Kim, Nam-Kuk; Cho, Yong-Min; Chai, Han-Ha; Seong, Hwan-Hoo; Kim, Heebal
Marbling (intramuscular fat) is an important trait that affects meat quality and is a causal factor determining the price of beef in the Korean beef market. It is a complex trait and has many biological pathways related to muscle and fat. There is a need to identify functional modules or genes related to marbling traits and investigate their relationships through a weighted gene co-expression network analysis based on the system level. Therefore, we investigated the co-expression relationships of genes related to the 'marbling score' trait and systemically analyzed the network topology in Hanwoo (Korean cattle). As a result, we determined 3 modules (gene groups) that showed statistically significant results for marbling score. In particular, one module (denoted as red) has a statistically significant result for marbling score (p = 0.008), intramuscular fat (p = 0.02) and water capacity (p = 0.006). From functional enrichment and relationship analysis of the red module, the pathway hub genes (IL6, CHRNE, RB1, INHBA and NPPA) have a direct interaction relationship and share the biological functions related to fat or muscle, such as adipogenesis or muscle growth. This is the first gene network study with m. longissimus in Hanwoo to observe co-expression patterns in divergent marbling phenotypes. It may provide insights into the functional mechanisms of the marbling trait.
Weber, Kristoffer; Bartsch, Udo; Stocking, Carol; Fehse, Boris
Functional gene analysis requires the possibility of overexpression, as well as downregulation of one, or ideally several, potentially interacting genes. Lentiviral vectors are well suited for this purpose as they ensure stable expression of complementary DNAs (cDNAs), as well as short-hairpin RNAs (shRNAs), and can efficiently transduce a wide spectrum of cell targets when packaged within the coat proteins of other viruses. Here we introduce a multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors designed according to the "building blocks" principle. Using a wide spectrum of different fluorescent markers, including drug-selectable enhanced green fluorescent protein (eGFP)- and dTomato-blasticidin-S resistance fusion proteins, LeGO vectors allow simultaneous analysis of multiple genes and shRNAs of interest within single, easily identifiable cells. Furthermore, each functional module is flanked by unique cloning sites, ensuring flexibility and individual optimization. The efficacy of these vectors for analyzing multiple genes in a single cell was demonstrated in several different cell types, including hematopoietic, endothelial, and neural stem and progenitor cells, as well as hepatocytes. LeGO vectors thus represent a valuable tool for investigating gene networks using conditional ectopic expression and knock-down approaches simultaneously.
Background: Soil salinity can significantly reduce crop production, but the molecular mechanism of salinity tolerance in peanut is poorly understood. A mutant (S1) with higher salinity resistance than its mutagenic parent HY22 (S3) was obtained. Transcriptome sequencing and digital gene expression (DGE) analysis were performed with leaves of S1 and S3 before and after plants were irrigated with 250 mM NaCl. Results: A total of 107,725 comprehensive transcripts were assembled into 67,738 unigenes using TIGR Gene Indices clustering tools (TGICL). All unigenes were searched against the euKaryotic Ortholog Groups (KOG), gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, and these unigenes were assigned to 26 functional KOG categories, 56 GO terms and 32 KEGG groups, respectively. In total, 112 differentially expressed genes (DEGs) between S1 and S3 after salinity stress were screened; among them, 86 were responsive to salinity stress in S1 and/or S3. These 86 DEGs included genes that encoded the following kinds of proteins that are known to be involved in resistance to salinity stress: late embryogenesis abundant proteins (LEAs), major intrinsic proteins (MIPs) or aquaporins, metallothioneins (MTs), lipid transfer protein (LTP), calcineurin B-like protein-interacting protein kinases (CIPKs), 9-cis-epoxycarotenoid dioxygenase (NCED) and oleosins, etc. Of these 86 DEGs, 18 could not be matched with known proteins. Conclusion: The results from this study will be useful for further research on the mechanism of salinity resistance and will provide a useful gene resource for the variety breeding of salinity resistance in peanut. Keywords: Digital gene expression, Gene, Mutant, NaCl, Peanut (Arachis hypogaea L.), RNA-seq, Salinity stress, Salinity tolerance, Soil salinity, Transcripts, Unigenes
Erin M Siegel
Aberrant DNA methylation has been observed in cervical cancer; however, most studies have used non-quantitative approaches to measure DNA methylation. The objective of this study was to quantify methylation within a select panel of genes previously identified as targets for epigenetic silencing in cervical cancer and to identify genes with elevated methylation that can distinguish cancer from normal cervical tissues. We identified 49 women with invasive squamous cell cancer of the cervix and 22 women with normal cytology specimens. Bisulfite-modified genomic DNA was amplified and quantitative pyrosequencing completed for 10 genes (APC, CCNA, CDH1, CDH13, WIF1, TIMP3, DAPK1, RARB, FHIT, and SLIT2). A Methylation Index was calculated as the mean percent methylation across all CpG sites analyzed per gene (~4-9 CpG sites per sequence). A binary cut-point was defined at >15% methylation. Sensitivity, specificity and area under ROC curve (AUC) of methylation in individual genes or a panel were examined. The median methylation index was significantly higher in cases compared to controls in 8 genes, whereas there was no difference in median methylation for 2 genes. Compared to HPV and age, the combination of DNA methylation level of DAPK1, SLIT2, WIF1 and RARB with HPV and age significantly improved the AUC from 0.79 to 0.99 (95% CI: 0.97-1.00, p-value = 0.003). Pyrosequencing analysis confirmed that several genes are common targets for aberrant methylation in cervical cancer and DNA methylation level of four genes appears to increase specificity to identify cancer compared to HPV detection alone. Alterations in DNA methylation of specific genes in cervical cancers, such as DAPK1, RARB, WIF1, and SLIT2, may also occur early in cervical carcinogenesis and should be evaluated.
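The Methylation Index and the binary cut-point defined above translate directly into a few lines of code. This is a minimal sketch; the per-site percentages are hypothetical values, not measurements from the study.

```python
from statistics import mean

CUT_POINT = 15.0  # percent methylation, as stated in the abstract


def methylation_index(cpg_percentages):
    """Mean percent methylation across the CpG sites analysed for one gene."""
    return mean(cpg_percentages)


def is_hypermethylated(cpg_percentages, cut_point=CUT_POINT):
    """Binary call: True when the Methylation Index exceeds the cut-point."""
    return methylation_index(cpg_percentages) > cut_point


# Hypothetical pyrosequencing read-outs for one gene in two specimens.
tumour_sites = [10.0, 20.0, 30.0, 40.0]
normal_sites = [2.0, 4.0, 6.0, 8.0]
print(methylation_index(tumour_sites), is_hypermethylated(tumour_sites))
print(methylation_index(normal_sites), is_hypermethylated(normal_sites))
```

Sensitivity and specificity for a gene then follow from counting these binary calls in cases and controls.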
Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing
Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that lets the samples gather into sets according to the progression of disease from mild to severe. We assessed the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering with different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.
To investigate the association of polymorphisms within candidate genes which we hypothesized may contribute to stress fracture predisposition, a case-control, cross-sectional study design was employed. We genotyped 268 single nucleotide polymorphisms (SNPs) within 17 genes in 385 young Israeli male and female recruits (182 with and 203 without stress fractures). Twenty-five polymorphisms within 9 genes (NR3C1, ANKH, VDR, ROR2, CALCR, IL6, COL1A2, CBG, and LRP4) showed statistically significant differences (p < 0.05) in the distribution between stress fracture cases and controls without stress fractures. Seventeen genetic variants were associated with an increased stress fracture risk, and eight variants with a decreased stress fracture risk. None of the SNP associations remained significant after correcting for multiple comparisons (false discovery rate, FDR). Our findings suggest that genes may be involved in stress fracture pathogenesis. Specifically, the CALCR and the VDR genes are intriguing candidates. The putative involvement of these genes in stress fracture predisposition requires analysis of more cases and controls and sequencing of the relevant genomic regions, in order to define the specific gene mutations.
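The loss of significance after FDR correction is easy to reproduce numerically. The abstract does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment, and the p-values are hypothetical stand-ins (25 nominal hits at p = 0.04 among 268 tests), not the study's data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical scenario: 25 SNPs with nominal p = 0.04, the other 243 with p = 0.5.
p_values = [0.04] * 25 + [0.5] * 243
adjusted = benjamini_hochberg(p_values)
print(round(adjusted[0], 4))                 # adjusted value for a nominal hit
print(sum(q < 0.05 for q in adjusted))       # hits surviving at FDR 0.05
```

In this toy setting each nominal hit is inflated by the factor 268/25, which pushes it well above 0.05. Results that are only nominally significant can disappear in the same way when many SNPs are tested at once.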
Shlykova, Irina; Ponosov, Arcady
There are different ways to model gene regulatory networks. Differential equations allow for a detailed description of the network's dynamics and provide an explicit model of the gene concentration changes over time. Production and relative degradation rate functions used in such models depend on a vector of steeply sloped threshold functions which characterize the activity of genes. The most popular example of threshold functions comes from the Boolean network approach, where the threshold functions are given by step functions. The system of differential equations then becomes piecewise linear. The dynamics of this system can be described very easily between the thresholds, but not in the switching domains. For instance, this approach fails to analyze stationary points of the system and to define continuous solutions in the switching domains. These problems were studied in earlier work, but the proposed model did not take into account time delays in cellular systems. However, analysis of real gene expression data shows a considerable number of time-delayed interactions, suggesting that time delay is essential in gene regulation. Delays may therefore have a great effect on the dynamics of the system and are one of the critical factors that should be considered in the reconstruction of gene regulatory networks. The goal of this work is to apply singular perturbation analysis to certain systems with delay and to obtain an analog of Tikhonov's theorem, which provides sufficient conditions for constructing the limit system in the delay case.
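The relation between a steeply sloped threshold function and its Boolean limit can be made concrete. The sketch below uses a Hill-type sigmoid as the steep function; that choice, the parameter names theta (threshold) and q (steepness), and the sample values are assumptions for illustration and are not taken from the paper.

```python
def hill(x, theta, q):
    """Hill-type sigmoid with threshold theta; a smaller q gives a steeper switch.

    Both branches raise a number in (0, 1] to the power 1/q, which avoids
    floating-point overflow for very small q.
    """
    if x <= 0.0:
        return 0.0
    if x >= theta:
        return 1.0 / (1.0 + (theta / x) ** (1.0 / q))
    ratio = (x / theta) ** (1.0 / q)
    return ratio / (1.0 + ratio)


def step(x, theta):
    """Boolean limit of the sigmoid: the gene is either off (0) or on (1)."""
    return 1.0 if x > theta else 0.0


theta = 1.0
for q in (0.5, 0.1, 0.01):
    values = [round(hill(x, theta, q), 4) for x in (0.8, 1.0, 1.2)]
    print(f"q={q}: {values}")
print("step:", [step(x, theta) for x in (0.8, 1.0, 1.2)])
```

Away from the threshold the sigmoid converges to the step function as q shrinks, while at the threshold itself it stays at 0.5. That narrow region around theta is the switching domain where the piecewise linear description breaks down.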
Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna
There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner
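As an illustration of what a ranking metric does, the sketch below implements one of the metrics named above, the absolute signal-to-noise ratio, in its usual form |mean_A - mean_B| / (sd_A + sd_B). It is a simplified stand-in rather than code from the MrGSEA repository, and the gene names and expression values are hypothetical.

```python
from statistics import mean, stdev


def abs_signal_to_noise(group_a, group_b):
    """Absolute signal-to-noise ratio of one gene between two sample groups."""
    return abs(mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


def rank_genes(expression):
    """Order gene names from the strongest to the weakest group difference.

    expression maps a gene name to a (group_a_values, group_b_values) pair.
    """
    scores = {gene: abs_signal_to_noise(a, b) for gene, (a, b) in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values for two genes in two conditions.
toy_expression = {
    "gene1": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),
    "gene2": ([4.0, 5.0, 6.0], [5.0, 6.0, 7.0]),
}
print(rank_genes(toy_expression))
```

The ranked list is what the enrichment step consumes, so swapping this function for another metric changes which gene sets end up near the top. Production implementations typically also guard against near-zero standard deviations, which this sketch omits.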
Inoue, Tohru; Hirabayashi, Yoko
The authors explain that the effect of radiation on biological systems is stochastic, following the laws of physics, and differs from chemical effects, using the examples of Cs-137 gamma-ray (GR) and benzene (BZ) exposure of mice and the resulting comprehensive analyses of gene expression. Single GR irradiation is performed with a Gamma Cell 40 (CSR) on C57BL/6 or C3H/He mice at 0, 0.6 and 3 Gy. BZ is given orally at 150 mg/kg/day for 5 days x 2 weeks. Bone marrow cells are sampled 1 month after the exposure. Comprehensive gene expression is analyzed with the GeneChip Mouse Genome 430 2.0 Array (Affymetrix), and data are processed by programs for case normalization, statistics, network generation, functional analysis, etc. GR irradiation brings about changes in gene expression, which can be classified into common genes, which vary consistently with the dose change, and stochastic genes, which vary stochastically within each dose: e.g., with the Welch t-test, significant differences are found between 0/3 Gy (dose-specific difference, 455 pbs (probe sets), in stochastic 2113 pbs), 0/0.6 Gy (267 in 1284 pbs) and 0.6/3 Gy (532 pbs); and with one-way analysis of variance (ANOVA) and hierarchical/dendrographic analyses, 520 pbs are shown to comprise 226 dose-dependent and 294 dose-specific pbs. It is also shown that at 3 Gy, expression of common genes is rather suppressed, including those related to the proliferation/apoptosis of B/T cells, as is that of stochastic genes related to cell division/signaling. A Venn diagram of the common genes of the above 520 pbs, the stochastic 2113 pbs at 3 Gy and the 1284 pbs at 0.6 Gy shows 29, 2 and 4 overlapping genes, respectively, indicating that only 35 pbs overlap in total. Network analysis of changes caused by GR shows rather high expression of genes around the cAMP response element binding protein (CREB) hub at 0.6 Gy, and rather variable expression around the CREB hub/suppressed expression around the kinesin hub at 3 Gy; in the network by BZ exposure, unchanged or low expression around p53 hub and suppression
M. Ananda Chitra
Background: Staphylococcus pseudintermedius (SP) is the major pathogenic species of dogs involved in a wide variety of skin and soft tissue infections. The accessory gene regulator (agr) locus of Staphylococcus aureus has been extensively studied, and it influences the expression of many virulence genes. It encodes a two-component signal transduction system that leads to down-regulation of surface proteins and up-regulation of secreted proteins during in vitro growth of S. aureus. The objective of this study was to detect and sequence-analyze the AgrA, B, and D of SP isolated from canine skin infections. Materials and Methods: In this study, we isolated SP from canine pyoderma and otitis cases, identified it by polymerase chain reaction (PCR) and confirmed it by PCR-restriction fragment length polymorphism. Primers for SP agrA and agrBD genes were designed using online primer designing software and BLAST searched for their specificity. Amplification of the agr genes was carried out for 53 isolates of SP by PCR, and sequencing of agrA, B, and D was carried out for five isolates and analyzed using DNAstar and Mega5.2 software. Results: A total of 53 (59%) SP isolates were obtained from 90 samples. 15 isolates (28%) were confirmed to be methicillin-resistant SP (MRSP) with the detection of the mecA gene. Accessory gene regulator A, B, and D genes were detected in all the SP isolates. Complete nucleotide sequences of the above three genes for five isolates were submitted to GenBank, and their accession numbers are from KJ133557 to KJ133571. AgrA amino acid sequence analysis showed that it is mainly made of alpha-helices and is hydrophilic in nature. AgrB is a transmembrane protein, and AgrD encodes the precursor of the autoinducing peptide (AIP). Sequencing of the agrD gene revealed that the 5 canine SP strains tested could be divided into three Agr specificity groups (RIPTSTGFF, KIPTSTGFF, and RIPISTGFF) based on the putative AIP produced by each strain.
In the present study we set out to study self/nonself recognition in hydra and its relation to the immune response. Moreover, performing phylogenetic analysis to look for annotated immune genes in hydra allowed us to analyze the expression of minor histocompatibility genes, which have been shown to play a major role in grafting and transplantation in mammals. Here we obtained a cDNA library that shows expression of minor histocompatibility genes and confirmed that the annotated sequences in databases are actually present. In addition, grafting experiments suggested, although the results are still preliminary, that homografts showed a weaker rejection response than heterografts. Involvement of possible minor histocompatibility gene orthologues in the immune response was examined by qPCR.
Fitzpatrick, David A
Background: Candida species are the most common cause of opportunistic fungal infection worldwide. Recent sequencing efforts have provided a wealth of Candida genomic data. We have developed the Candida Gene Order Browser (CGOB), an online tool that aids comparative syntenic analyses of Candida species. CGOB incorporates all available Candida clade genome sequences including two Candida albicans isolates (SC5314 and WO-1) and 8 closely related species (Candida dubliniensis, Candida tropicalis, Candida parapsilosis, Lodderomyces elongisporus, Debaryomyces hansenii, Pichia stipitis, Candida guilliermondii and Candida lusitaniae). Saccharomyces cerevisiae is also included as a reference genome. Results: CGOB assignments of homology were manually curated based on sequence similarity and synteny. In total CGOB includes 65617 genes arranged into 13625 homology columns. We have also generated improved Candida gene sets by merging/removing partial genes in each genome. Interrogation of CGOB revealed that the majority of tandemly duplicated genes are under strong purifying selection in all Candida species. We identified clusters of adjacent genes involved in the same metabolic pathways (such as catabolism of biotin, galactose and N-acetyl glucosamine) and we showed that some clusters are species or lineage-specific. We also identified one example of intron gain in C. albicans. Conclusions: Our analysis provides an important resource that is now available for the Candida community. CGOB is available at http://cgob.ucd.ie.
McMillan, Mary; Pereg, Lily
Azospirillum brasilense is a nitrogen fixing bacterium that has been shown to have various beneficial effects on plant growth and yield. Under normal conditions A. brasilense exists in a motile flagellated form, which, under starvation or stress conditions, can undergo differentiation into an encapsulated, cyst-like form. Quantitative RT-PCR can be used to analyse changes in gene expression during this differentiation process. The accuracy of quantification of mRNA levels by qRT-PCR relies on the normalisation of data against stably expressed reference genes. No suitable set of reference genes has yet been described for A. brasilense. Here we evaluated the expression of ten candidate reference genes (16S rRNA, gapB, glyA, gyrA, proC, pykA, recA, recF, rpoD, and tpiA) in wild-type and mutant A. brasilense strains under different culture conditions, including conditions that induce differentiation. Analysis with the software programs BestKeeper, NormFinder and GeNorm indicated that gyrA, glyA and recA are the most stably expressed reference genes in A. brasilense. The results also suggested that the use of two reference genes (gyrA and glyA) is sufficient for effective normalisation of qRT-PCR data.
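Normalising qRT-PCR data against two reference genes can be sketched with the common 2^(-ΔCt) calculation. This is not the procedure implemented in BestKeeper, NormFinder or GeNorm; it assumes 100% amplification efficiency, and the Ct values are hypothetical.

```python
from statistics import mean


def normalised_expression(ct_target, ct_references):
    """Relative quantity of a target gene, 2^(-delta Ct), against reference genes.

    Averaging the reference Ct values is equivalent to taking the geometric
    mean of the reference quantities when amplification efficiency is 100%.
    """
    delta_ct = ct_target - mean(ct_references)
    return 2.0 ** (-delta_ct)


def fold_change(ct_target_test, ct_refs_test, ct_target_control, ct_refs_control):
    """Expression in the test condition relative to the control condition."""
    return (normalised_expression(ct_target_test, ct_refs_test)
            / normalised_expression(ct_target_control, ct_refs_control))


# Hypothetical Ct values; the two reference entries stand for gyrA and glyA.
control = {"target": 25.0, "refs": [20.0, 22.0]}
cyst_inducing = {"target": 23.0, "refs": [20.0, 22.0]}
print(fold_change(cyst_inducing["target"], cyst_inducing["refs"],
                  control["target"], control["refs"]))
```

The calculation is only as reliable as the assumption that the reference Ct values do not shift between conditions, which is the property the stability analysis in the abstract is meant to establish.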
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings: We have analyzed the simulated gene expression ...
Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is to perform GxG interaction analysis using MDR. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.
Michael G. Surette
The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy that adopts the false discovery rate (FDR) and estimated motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. SAS code for the simulation and analysis of gene expression data is available from the first author upon request.
Paruzynski, Anna; Glimm, Hanno; Schmidt, Manfred; Kalle, Christof von
Gene therapy-based clinical phase I/II studies using integrating retroviral vectors have successfully treated different monogenetic inherited diseases. However, with increased efficiency of this therapy, severe side effects occurred in various gene therapy trials. In all cases, integration of the vector close to or within a proto-oncogene contributed substantially to the development of the malignancies. Thus, the in-depth analysis of integration site patterns is of high importance to uncover potential clonal outgrowth and to assess the safety of gene transfer vectors and gene therapy protocols. Standard and nonrestrictive linear amplification-mediated PCR (nrLAM-PCR), in combination with high-throughput sequencing, are technologies that allow comprehensive analysis of the clonal repertoire of gene-corrected cells and assessment of the safety of the vector system used at an early stage on the molecular level. They help clarify the biological consequences of the vector system for the fate of the transduced cell. Furthermore, the downstream performance of real-time PCR allows a quantitative estimation of the clonality of individual cells and their clonal progeny. Here, we present a guideline that should allow researchers to perform comprehensive integration site analysis in preclinical and clinical studies. Copyright © 2012 Elsevier Inc. All rights reserved.
Background: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. Results: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFβ, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFβ. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. Conclusion: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease
DNA microarray technologies are used extensively to profile the expression levels of thousands of genes under various conditions, yielding extremely large data matrices. Thus, analyzing this information and extracting biologically relevant knowledge becomes a considerable challenge. A classical approach for tackling this challenge is to use clustering (also known as one-way clustering) methods, where genes (or, respectively, samples) are grouped together based on the similarity of their expression profiles across the set of all samples (or, respectively, genes). An alternative approach is to develop biclustering methods to identify local patterns in the data. These methods extract subgroups of genes that are co-expressed across only a subset of samples and may feature important biological or medical implications. In this study we evaluate 13 biclustering and 2 clustering (k-means and hierarchical) methods. We use several approaches to compare their performance on two real gene expression data sets. For this purpose we apply four evaluation measures in our analysis: (1) we examine how well the considered (bi)clustering methods differentiate various sample types; (2) we evaluate how well the groups of genes discovered by the (bi)clustering methods are annotated with similar Gene Ontology categories; (3) we evaluate the capability of the methods to differentiate genes that are known to be specific to the particular sample types we study; and (4) we compare the running time of the algorithms. In the end, we conclude that as long as the samples are well defined and annotated, the contamination of the samples is limited, and the samples are well replicated, biclustering methods such as Plaid and SAMBA are useful for discovering relevant subsets of genes and samples.
Bijnens, Luc J.M.; Lewi, Paul J.; Göhlmann, Hinrich W.; Molenberghs, Geert; Wouters, Luc
bioinformatics; biplot; correspondence factor analysis; data mining; data visualization; gene expression data; microarray data; multivariate exploratory data analysis; principal component analysis; Spectral map analysis
Background: The Caenorhabditis elegans genome encodes ten proteins that share sequence similarity with the Hedgehog signaling molecule through their C-terminal autoprocessing Hint/Hog domain. These proteins contain novel N-terminal domains, and C. elegans encodes dozens of additional proteins containing only these N-terminal domains. These gene families are called warthog, groundhog, ground-like and quahog, collectively called hedgehog (hh)-related genes. Previously, the expression pattern of seventeen genes was examined, which showed that they are primarily expressed in the ectoderm. Results: With the completion of the C. elegans genome sequence in November 2002, we reexamined and identified 61 hh-related ORFs. Further, we identified 49 hh-related ORFs in C. briggsae. ORF analysis revealed that 30% of the genes still had errors in their predictions and we improved these predictions here. We performed a comprehensive expression analysis using GFP fusions of the putative intergenic regulatory sequence with one or two transgenic lines for most genes. The hh-related genes are expressed in one or a few of the following tissues: hypodermis, seam cells, excretory duct and pore cells, vulval epithelial cells, rectal epithelial cells, pharyngeal muscle or marginal cells, arcade cells, support cells of sensory organs, and neuronal cells. Using time-lapse recordings, we discovered that some hh-related genes are expressed in a cyclical fashion in phase with molting during larval development. We also generated several translational GFP fusions, but they did not show any subcellular localization. In addition, we also studied the expression patterns of two genes with similarity to Drosophila frizzled, T23D8.1 and F27E11.3A, and the ortholog of the Drosophila gene dally-like, gpn-1, which is a heparan sulfate proteoglycan. The two frizzled homologs are expressed in a few neurons in the head, and gpn-1 is expressed in the pharynx. Finally, we compare the
Background: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to complex human diseases. Individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), however, makes it difficult to set an overall p-value because of the complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. The number of SNP-SNP pairs is much larger than the sample size, the so-called large p small n problem, which precludes simultaneous analysis using multiple regression. A method that overcomes these issues is thus needed. Results: We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) to handle the large number of SNP-SNP interactions appropriately by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods, followed by a variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using the cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium) data. Conclusions: Based on the machine-learning principle, the proposed method gives a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
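The screening idea in the abstract above can be illustrated with a minimal sketch. It is not the EPISIS implementation: it assumes a plain product coding for each SNP-SNP pair and ranks pairs by absolute marginal Pearson correlation with the case-control label, which is the basic form of sure independence screening. The abstract's dummy coding schemes, logistic regression step and GPGPU parallelism are not reproduced, and the names `pearson` and `screen_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_pairs(genotypes, phenotype, keep):
    """Rank SNP-SNP product terms by |marginal correlation|, keep the top `keep`.

    genotypes: dict mapping SNP name -> list of 0/1/2 minor-allele counts
    phenotype: list of 0/1 case-control labels (same sample order)
    """
    scored = []
    for a, b in combinations(sorted(genotypes), 2):
        # Assumed coding: the interaction term is the product of allele counts.
        interaction = [x * y for x, y in zip(genotypes[a], genotypes[b])]
        scored.append((abs(pearson(interaction, phenotype)), (a, b)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:keep]]


# Toy data (invented for illustration): the phenotype is driven by snp1 x snp2.
toy_genotypes = {
    "snp1": [0, 1, 2, 0, 1, 2, 0, 1],
    "snp2": [0, 0, 1, 1, 2, 2, 1, 0],
    "snp3": [1, 1, 1, 0, 0, 2, 2, 0],
}
toy_phenotype = [0, 0, 1, 0, 1, 1, 0, 0]
print(screen_pairs(toy_genotypes, toy_phenotype, keep=1))
```

Only the pairs that survive screening would then enter a joint model, which is what makes the large p small n setting tractable.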
Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E.; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder; Murphy, Denis J.
Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops. PMID:29672525
Dec 9, 2014 ... study of a genomewide analysis of apple TCP gene family. These results provide .... synthesize the first-strand cDNA using the PrimeScript First. Strand cDNA ..... only detected in the stem, leaf and fruit (figure 8). When.
Home; Journals; Journal of Genetics; Volume 93; Issue 3. Genomewide ... Teosinte branched1/cycloidea/proliferating cell factor1 (TCP) proteins are a large family of transcriptional regulators in angiosperms. They are ... To the best of our knowledge, this is the first study of a genomewide analysis of apple TCP gene family.
Oct 19, 2011 ... conducted a molecular cloning and functional analysis to study a specific silkworm gene BmICAD related to apoptosis. .... blocking with 5% non-fat milk for 1 h at room temperature, the .... requirements for all next experiments.
Supplementary data: Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species. Guang-Rong Li, Tao Lang, En-Nian Yang, Cheng Liu ... The MITE insertion at the 3′ UTR is boxed. Figure 2. The secondary structure of MITE insertion in HM452949.
Piasecka, Barbara; Kutalik, Zoltán; Roux, Julien; Bergmann, Sven; Robinson-Rechavi, Marc
The degree of conservation of gene expression between homologous organs largely remains an open question. Several recent studies reported some evidence in favor of such conservation. Most studies compute organs' similarity across all orthologous genes, whereas the expression levels of many genes are not informative about organ specificity. Here, we use a modularization algorithm to overcome this limitation through the identification of inter-species co-modules of organs and genes. We identify such co-modules using mouse and human microarray expression data. They are functionally coherent both in terms of genes and of organs from both organisms. We show that a large proportion of genes belonging to the same co-module are orthologous between mouse and human. Moreover, their zebrafish orthologs also tend to be expressed in the corresponding homologous organs. Notable exceptions to the general pattern of conservation are the testis and the olfactory bulb. Interestingly, some co-modules consist of single organs, while others combine several functionally related organs. For instance, amygdala, cerebral cortex, hypothalamus and spinal cord form a clearly discernible unit of expression, both in mouse and human. Our study provides a new framework for comparative analysis which will also be applicable to other sets of large-scale phenotypic data collected across different species.
Regiane F. Travensolo
Xylella fastidiosa genome sequencing has generated valuable data by identifying genes acting either on metabolic pathways or in associated pathogenicity and virulence. Based on available information on these genes, new strategies for studying their expression patterns, such as microarray technology, were employed. A total of 2,600 primer pairs were synthesized and then used to generate fragments using the PCR technique. The arrays were hybridized against cDNAs that were labeled during reverse transcription reactions and obtained from bacteria grown under two different conditions (liquid XDM2 and liquid BCYE). All data were statistically analyzed to verify which genes were differentially expressed. In addition to exploring conditions for X. fastidiosa genome-wide transcriptome analysis, the present work observed the differential expression of several classes of genes (energy, protein, amino acid and nucleotide metabolism, transport, degradation of substances, toxins and hypothetical proteins, among others). The understanding of expressed genes in these two different media will be useful in comprehending the metabolic characteristics of X. fastidiosa, and in evaluating how important certain genes are for the functioning and survival of these bacteria in plants.
Chu, Y X; Chen, H R; Wu, A Z; Cai, R; Pan, J S
Dihydroflavonol 4-reductase (DFR) genes from Rosa chinensis (Asn type) and Calibrachoa hybrida (Asp type), driven by a CaMV 35S promoter, were integrated into the petunia (Petunia hybrida) cultivar 9702. Exogenous DFR gene expression characteristics were similar to flower-color changes, and effects on anthocyanin concentration were observed in both types of DFR gene transformants. Expression analysis showed that exogenous DFR genes were expressed in all of the tissues, but the expression levels were significantly different. However, both of them exhibited a high expression level in petals that were starting to open. The introgression of DFR genes may significantly change DFR enzyme activity. Anthocyanin ultra-performance liquid chromatography results showed that anthocyanin concentrations changed according to DFR enzyme activity. Therefore, the change in flower color was probably the result of a DFR enzyme change. Pelargonidin 3-O-glucoside was found in two different transgenic petunias, indicating that both CaDFR and RoDFR could catalyze dihydrokaempferol. Our results also suggest that transgenic petunias with DFR gene of Asp type could biosynthesize pelargonidin 3-O-glucoside.
Hydra is a simple freshwater solitary polyp used as a model system to study evolutionary aspects. The immune response of this organism has not been studied extensively, and its immune response genes have not been identified and characterized. On the other hand, the immune response has been investigated and genetic analysis has been initiated in other lower invertebrates. In the present study we took the initiative to study self/nonself recognition in hydra and its relation to the immune response. Moreover, phylogenetic analysis to look for annotated immune genes in hydra allowed us to analyze the expression of minor histocompatibility genes, which have been shown to play a major role in grafting and transplantation in mammals. Here we obtained a cDNA library that shows expression of minor histocompatibility genes and confirmed that the annotated sequences in databases are actually present. In addition, grafting experiments suggested, although still preliminarily, that homografts showed a weaker rejection response than heterografts. The involvement of possible minor histocompatibility gene orthologs in the immune response was examined by qPCR.
Lipner, Ettie M.; Garcia, Benjamin J.; Strong, Michael
Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks, involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects. PMID:26751573
Huang, Jianhua; Miao, Xuexia; Jin, Weirong; Couble, Pierre; Mita, Kasuei; Zhang, Yong; Liu, Wenbin; Zhuang, Leijun; Shen, Yan; Keime, Celine; Gandrillon, Olivier; Brouilly, Patrick; Briolay, Jerome; Zhao, Guoping; Huang, Yongping
The silkworm Bombyx mori is one of the most economically important insects and serves as a model for Lepidoptera insects. We used serial analysis of gene expression (SAGE) to derive profiles of expressed genes during the developmental life cycle of the silkworm and to create a reference for understanding silkworm metamorphosis. We generated four SAGE libraries, one from each of the four developmental stages of the silkworm. In total we obtained 257,964 SAGE tags, of which 39,485 were unique tags. Sorted by copy number, 14.1% of the unique tags were detected at a median to high level (five or more copies), 24.2% at lower levels (two to four copies), and 61.7% as single copies. Using a basic local alignment search tool on the EST database, 35% of the tags matched known silkworm expressed sequence tags. SAGE demonstrated that a number of the genes were up- or down-regulated during the four developmental phases of the egg, larva, pupa, and adult. Furthermore, we found that the generation of longer cDNA fragments from SAGE tags constituted the most efficient method of gene identification, which facilitated the analysis of a large number of unknown genes.
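The abundance classes reported above (single copies, two to four copies, five or more copies) amount to binning unique tags by copy number. The following is a minimal sketch of that bookkeeping, not the authors' pipeline; the function name `abundance_classes` and the toy tags are hypothetical.

```python
from collections import Counter


def abundance_classes(tags):
    """Return the percentage of unique tags in each copy-number class.

    tags: iterable of tag sequences, one entry per sequenced tag.
    """
    copies = Counter(tags)  # unique tag -> copy number
    classes = {"single copy": 0, "2-4 copies": 0, "5+ copies": 0}
    for n in copies.values():
        if n == 1:
            classes["single copy"] += 1
        elif n <= 4:
            classes["2-4 copies"] += 1
        else:
            classes["5+ copies"] += 1
    unique = len(copies)
    return {name: 100.0 * count / unique for name, count in classes.items()}


# Toy input (invented): 4 unique tags among 9 sequenced tags.
toy_tags = ["AAAC"] * 5 + ["GGTA"] * 2 + ["CCAT", "TTGA"]
print(abundance_classes(toy_tags))
```

The three percentages always sum to 100 because every unique tag falls into exactly one class, which matches the reported 14.1% + 24.2% + 61.7%.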
Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
Jansen, Anne M L; Geilenkirchen, Marije A; van Wezel, Tom; Jagmohan-Changur, Shantie C; Ruano, Dina; van der Klift, Heleen M; van den Akker, Brendy E W M; Laros, Jeroen F J; van Galen, Michiel; Wagner, Anja; Letteboer, Tom G W; Gómez-García, Encarna B; Tops, Carli M J; Vasen, Hans F; Devilee, Peter; Hes, Frederik J; Morreau, Hans; Wijnen, Juul T
Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history. Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants. Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1). This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.
Song, Won-Min; Zhang, Bin
Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computational complexity O(|V|^3), the presence of false positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori
Jeffrey T Leek
It has unambiguously been shown that genetic, environmental, demographic, and technical factors may have substantial effects on gene expression levels. In addition to the measured variable(s) of interest, there will tend to be sources of signal due to factors that are unknown, unmeasured, or too complicated to capture through simple models. We show that failing to incorporate these sources of heterogeneity into an analysis can have widespread and detrimental effects on the study. Not only can this reduce power or induce unwanted dependence across genes, but it can also introduce sources of spurious signal to many genes. This phenomenon is true even for well-designed, randomized studies. We introduce "surrogate variable analysis" (SVA) to overcome the problems caused by heterogeneity in expression studies. SVA can be applied in conjunction with standard analysis techniques to accurately capture the relationship between expression and any modeled variables of interest. We apply SVA to disease class, time course, and genetics of gene expression studies. We show that SVA increases the biological accuracy and reproducibility of analyses in genome-wide expression studies.
Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S
Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.
Pui Shan Wong
Fistulifera sp. strain JPCC DA0580 is a newly sequenced pennate diatom that is capable of simultaneously growing and accumulating lipids. This is a unique trait, not found in other related microalgae so far. It is able to accumulate between 40 and 60% of its cell weight in lipids, making it a strong candidate for the production of biofuel. To investigate this characteristic, we used RNA-Seq data gathered at four different times while Fistulifera sp. strain JPCC DA0580 was grown in oil-accumulating and non-oil-accumulating conditions. We then adapted gene set enrichment analysis (GSEA) to investigate the relationship between the difference in gene expression of 7,822 genes and metabolic functions in our data. We utilized information in the KEGG pathway database to create the gene sets and changed GSEA to use re-sampling so that data from the different time points could be included in the analysis. Our GSEA method identified photosynthesis, lipid synthesis and amino acid synthesis related pathways as processes that play a significant role in oil production and growth in Fistulifera sp. strain JPCC DA0580. In addition to GSEA, we visualized the results by creating a network of compounds and reactions, and plotted the expression data on top of the network. This made existing graph algorithms available to us, which we then used to calculate a path that metabolizes glucose into triacylglycerol (TAG) in the smallest number of steps. By visualizing the data this way, we observed a separate up-regulation of genes at different times instead of a concerted response. We also identified two metabolic paths that used fewer reactions than the one shown in KEGG and showed that the reactions were up-regulated during the experiment. The combination of analysis and visualization methods successfully analyzed time-course data, identified important metabolic pathways and provided new hypotheses for further research.
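Finding a path from glucose to TAG in the smallest number of steps is a shortest-path problem on an unweighted directed graph, for which breadth-first search is a standard choice. The sketch below assumes that formulation; the abstract does not say which graph algorithm the authors used. The function `fewest_reaction_path` and the intermediate compounds A, B and C are placeholders, not KEGG entries.

```python
from collections import deque


def fewest_reaction_path(network, source, target):
    """Breadth-first search for a path with the fewest reaction steps.

    network: dict mapping a compound to the compounds reachable in one reaction.
    Returns the list of compounds from source to target, or None if unreachable.
    """
    previous = {source: None}  # compound -> compound it was reached from
    queue = deque([source])
    while queue:
        compound = queue.popleft()
        if compound == target:
            path = []
            while compound is not None:
                path.append(compound)
                compound = previous[compound]
            return path[::-1]
        for product in network.get(compound, ()):
            if product not in previous:
                previous[product] = compound
                queue.append(product)
    return None


# Toy network (invented): two routes from glucose to TAG of different length.
toy_network = {
    "glucose": ["A", "B"],
    "A": ["C"],
    "C": ["TAG"],
    "B": ["TAG"],
}
print(fewest_reaction_path(toy_network, "glucose", "TAG"))
```

Because every edge counts as one reaction, the first time the search reaches the target it has used the minimum number of steps; weighting edges by expression change would require a different algorithm such as Dijkstra's.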
Background: Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community. Results: Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a database, a computational engine implementing ~50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph), in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (~100 users) and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data. Conclusions: The Gecko system is being made publicly available as free software http://sourceforge.net/projects/geckoe. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.
Chen, Fei; Peng, Guang-Jie; Zhang, Kejian; Hu, Qun; Zhang, Liu-Qing; Liu, Ai-Guo
To screen the FANCA gene mutation and explore the FANCA protein function in Fanconi anemia (FA) patients. FANCA protein expression and its interaction with FANCF were analyzed using Western blot and immunoprecipitation in 3 cases of FA-A. Genomic DNA was used for MLPA analysis followed by sequencing. FANCA protein was undetectable and FANCA and FANCF protein interaction was impaired in these 3 cases of FA-A. Each case of FA-A contained biallelic pathogenic mutations in FANCA gene. No functional FANCA protein was found in these 3 cases of FA-A, and intragenic deletion, frame shift and splice site mutation were the major pathogenic mutations found in FANCA gene.
Drosophila segmentation is one of the most highly studied model systems. Among the many maternal segmentation coordinate genes, the bicoid protein pattern plays a significant role during Drosophila embryogenesis, since this gradient determines most aspects of head and thorax development. Although several models have been proposed to describe the bicoid gradient, the considerable error associated with it means that each can only partially explain bicoid characteristics. In this paper, a modified version of singular spectrum analysis is examined for filtering and extracting the bicoid gene expression signal. The results provide strong evidence that the proposed technique is able to remove noise more effectively and can be considered a promising method for filtering gene expression measurements for other applications.
Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge
In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.
Nahid H Hajrah
Rhazya stricta is an evergreen shrub that is widely distributed across Western and South Asia, and like many other members of the Apocynaceae produces monoterpene indole alkaloids that have anti-cancer properties. This species is adapted to very harsh desert conditions, making it an excellent system for studying tolerance to high temperatures and salinity. RNA-Seq analysis was performed on R. stricta exposed to severe salt stress (500 mM NaCl) across four time intervals (0, 2, 12 and 24 h) to examine mechanisms of salt tolerance. A large number of transcripts, including genes encoding tetrapyrroles and pentatricopeptide repeat (PPR) proteins, were regulated only after 12 h of stress in seedlings grown in controlled greenhouse conditions. Mechanisms of salt tolerance in R. stricta may involve the upregulation of genes encoding chaperone protein Dnaj6, UDP-glucosyl transferase 85a2, protein transparent testa 12 and respiratory burst oxidase homolog protein b. Many of the highly expressed genes act on protecting protein folding during salt stress and on the production of flavonoids, key secondary metabolites in stress tolerance. Other regulated genes encode enzymes in the porphyrin and chlorophyll metabolic pathway with important roles during plant growth, photosynthesis, hormone signaling and abiotic responses. Heme biosynthesis in R. stricta leaves might add to the level of salt stress tolerance by maintaining appropriate levels of photosynthesis and normal plant growth as well as by participating in reactive oxygen species (ROS) production under stress. We speculate that the high expression levels of PPR genes may be dependent on expression levels of their targeted editing genes. Although the results for the PPR gene family indicated regulation of a large number of transcripts under salt stress, PPR actions were independent of the salt stress because their RNA editing patterns were unchanged.
Koia, Jonni H; Moyle, Richard L; Botella, Jose R
Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit
Kim, Dong Sub; Kim, Jinbaek; Kim, Sang Hoon
In this project, we irradiated Arabidopsis plants with various doses of gamma-rays at the vegetative and reproductive stages to assess their radiation sensitivity. After examining the gene expression profiles and analysing the antioxidant response, we selected several Arabidopsis genes for use as 'Radio marker genes (RMG)' and conducted over-expression and knock-down experiments to confirm the radio sensitivity. Based on these results, we applied for two patents for the detection of two RMG (At3g28210 and At4g37990) and the development of transgenic plants. Also, we developed a Genechip for high-throughput screening of Arabidopsis genes responding only to ionizing radiation and identified RMG to detect radiation leaks. Based on these results, we applied for two patents associated with the use of the Genechip for different types of radiation and different growth stages. Also, we conducted a co-expression network study of specifically expressed probes against gamma-ray stress and identified expression patterns of duplicated genes formed by whole/500kb segmental genome duplication
Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for genomic research into the molecular basis of disease. However, use of FFPE samples in gene expression studies has been limited by technical challenges resulting from degradation of nucleic acids. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues using two DNA microarray protocols and two whole transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other three methods by having the highest correlations of differentially expressed genes (DEGs) and best overlap of pathways between FRO and FFPE groups. We next tested the effect of sample time in formalin (18 hours or 3 weeks) on gene expression profiles. Hierarchical clustering of the datasets indicated that test article treatment, and not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18-hour and 3-week FFPE samples compared to FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of age in FFPE block on genomic profiles. RNA-seq analysis of 8-, 19-, and 26-year-old control blocks using the ribo-depletion protocol resulted in comparable quality metrics, inc
Chunmei Yu; Xinyan Liu; Qian Zhang; Xinyu He; Wan Huai; Baohua Wang; Yunying Cao; Rong Zhou
In higher plants, phosphomannomutase (PMM) is essential for synthesizing the antioxidant ascorbic acid through the Smirnoff–Wheeler pathway. Previously, we characterized six PMM genes (TaPMM-A1, A2, B1, B2, D1 and D2) in common wheat (Triticum aestivum, AABBDD). Here, we report a molecular genetic analysis of PMM genes in Triticum monococcum (AmAm), a diploid wheat species whose Am genome is closely related to the A genome of common wheat. Two distinct PMM genes, TmPMM-1 and TmPMM-2, were found in T. monococcum. The coding region of TmPMM-1 was intact and highly conserved. In contrast, two main TmPMM-2 alleles were identified, with TmPMM-2a possessing an intact coding sequence and TmPMM-2b being a pseudogene. The transcript level of TmPMM-2a was much higher than that of TmPMM-2b, and a bacterially expressed TmPMM-2a recombinant protein displayed relatively high PMM activity. In general, the total transcript level of PMM was substantially higher in accessions carrying TmPMM-1 and TmPMM-2a than in those harboring TmPMM-1 and TmPMM-2b. However, total PMM protein and activity levels did not differ drastically between the two genotypes. This work provides new information on PMM genes in T. monococcum and expands our understanding of Triticeae PMM genes, which may aid further functional and applied studies of PMM in crop plants.
Serial analysis of gene expression (SAGE) is a powerful tool that provides a quantitative and comprehensive expression profile of genes in a given cell population. It works by isolating short fragments of genetic information from the expressed genes that are present in the cell being studied. These short sequences, called SAGE tags, are linked together for efficient sequencing. The frequency of each SAGE tag in the cloned multimers directly reflects the transcript abundance. Therefore, SAGE results in an accurate picture of gene expression at both the qualitative and the quantitative levels. It does not require a hybridization probe for each transcript and allows new genes to be discovered. This technique has been applied widely in human studies and various SAGE tags/SAGE libraries have been generated from different cells/tissues such as dendritic cells, lung fibroblast cells, oocytes, thyroid tissue, B-cell lymphoma, cultured keratinocytes, muscles, brain tissues, sciatic nerve, cultured Schwann cells, cord blood-derived mast cells, retina, macula, retinal pigment epithelial cells, skin cells, and so forth. In this review we present updated information on the applications of SAGE technology, mainly to human studies.
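The statement that tag frequency reflects transcript abundance can be made concrete with a small counting sketch. It assumes the tags have already been extracted from the sequenced multimers; the tag strings below are made-up placeholders, and real SAGE pipelines also handle linker removal and sequencing errors, which are not shown.

```python
from collections import Counter

def tag_frequencies(tags):
    """Return each tag's share of all observed tags (relative abundance)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical tags (assumption, for illustration only).
observed = ["AAAC", "AAAC", "GGTT", "CCGA"]
freqs = tag_frequencies(observed)
print(freqs)
```

A tag seen twice among four observations gets a relative abundance of one half, which is the sense in which SAGE is quantitative without needing a probe per transcript.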
Kwon, Minseok; Leem, Sangseob; Yoon, Joon; Park, Taesung
With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants. Here, we propose GxGrare, a new gene-gene interaction method for rare variants in the framework of multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps: 1) collapsing the rare variants, 2) MDR analysis of the collapsed rare variants, and 3) detection of top candidate interaction pairs. GxGrare can be used for the detection of not only gene-gene interactions, but also interactions within a single gene. The proposed method is illustrated with 1080 whole exome sequencing data of the Korean population in order to identify causal gene-gene interactions for rare variants in type 2 diabetes. The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.
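The first of the three steps, collapsing, can be sketched generically as follows. This is not the GxGrare implementation: the abstract does not specify the collapsing rule, so the sketch assumes a simple carrier indicator (1 if an individual carries any rare variant in a gene) and a hypothetical minor allele frequency cutoff of 0.01.

```python
def collapse_rare_variants(genotypes, maf, threshold=0.01):
    """Collapse rare variants in one gene into a per-individual carrier flag.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2),
               with one entry per variant in the gene.
    maf:       minor allele frequency of each variant, in the same order.
    threshold: assumed cutoff below which a variant counts as rare.
    """
    rare = [i for i, freq in enumerate(maf) if freq < threshold]
    return [int(any(person[i] > 0 for i in rare)) for person in genotypes]

# Hypothetical data: 4 individuals, 3 variants (the first one is common).
genotypes = [[0, 0, 1], [1, 0, 0], [0, 2, 0], [0, 0, 0]]
maf = [0.2, 0.005, 0.001]
print(collapse_rare_variants(genotypes, maf))
```

The collapsed indicator turns many sparse variant columns into one denser variable per gene, which is what makes a subsequent pairwise analysis of genes feasible.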
Motivation: Detecting differentially expressed (DE) genes between disease and normal control groups is one of the most common analyses in genome-wide transcriptomic data. Since most studies don’t have a lot of samples, researchers have used meta-analysis to group different datasets for the same disease. Even then, in many cases the statistical power is still not enough. Taking into account the fact that many diseases share the same disease genes, it is desirable to design a statistical framework that can identify diseases’ common and specific DE genes simultaneously to improve the identification power. Results: We developed a novel empirical Bayes based mixture model to identify DE genes in a specific study by leveraging the shared information across multiple different disease expression data sets. The effectiveness of joint analysis was demonstrated through comprehensive simulation studies and two real data applications. The simulation results showed that our method consistently outperformed single data set analysis and two other meta-analysis methods in identification power. In real data analysis, overall our method demonstrated better identification power in detecting DE genes and prioritized more disease related genes and disease related pathways than single data set analysis. Over 150% more disease related genes are identified by our method in application to Huntington’s disease. We expect that our method would provide researchers a new way of utilizing available data sets from different diseases when the sample size of the focused disease is limited.
Stein, Wilfred D; Litman, Thomas; Fojo, Tito
are their corresponding solid tumors. We used the Serial Analysis of Gene Expression (SAGE) database to identify differences between solid tumors and cell lines, hoping to detect genes that could potentially explain differences in drug sensitivity. SAGE libraries were available for both solid tumors and cell lines from...
Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
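For reference, the baseline that LEGO is compared against, a one-sided Fisher's exact test on a gene list, reduces to a hypergeometric upper tail. The sketch below shows only that unweighted baseline, not LEGO's network-based weighting; the parameter names are illustrative assumptions.

```python
from math import comb

def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random
    from n_universe genes, n_pathway of which belong to the pathway."""
    denom = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom

# Toy numbers (assumption): 10 genes in total, a 3-gene pathway,
# 3 selected genes, all of which fall in the pathway.
print(ora_p_value(10, 3, 3, 3))
```

Every gene in the pathway contributes equally to this count, which is exactly the simplification the abstract criticises.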
Zheng, Guangyong; Huang, Tao
In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review the inference methods for GRN reconstruction and the approaches for analysing network structure. Because the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.
Mitogen‐activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.
Abdulkarim Yasin Karim
Purpose: Gastric cancer has a high incidence and mortality rate in several countries and is still one of the most frequent and lethal diseases. In this study, we aimed to determine diagnostic markers in gastric cancer by molecular techniques, including mRNA expression analysis of the FABP4 gene. The fatty acid binding protein 4 (FABP4) gene encodes the fatty acid binding protein found in adipocytes. The protein encoded by FABP4 belongs to a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that the roles of FABPs include fatty acid uptake, transport, and metabolism. Material and Methods: Total RNA was extracted from paired tumor and normal tissues of 47 gastric cancer cases. The mRNA expression level of FABP4 was measured using semi-quantitative reverse transcription-polymerase chain reaction (RT-PCR). Results: The mRNA expression level of FABP4 was significantly decreased (down-regulated). Conclusion: Down-regulation of the FABP4 gene seems to occur at the initial steps of gastric cancer development. In order to confirm the relationship between gastric tumors and the FABP4 gene, further analyses such as immunohistochemistry and epigenetic techniques are necessary. [Cukurova Med J 2016; 41(2): 248-252]
Plant type III polyketide synthase (PKS) can catalyse the formation of a series of secondary metabolites with different structures and different biological functions; the enzyme plays an important role in plant growth, development and resistance to stress. At present, the PKS gene has been identified and studied in a variety of plants. Here, we identified 11 PKS genes from upland cotton (Gossypium hirsutum) and compared them with 41 PKS genes in Populus tremula, Vitis vinifera, Malus domestica and Arabidopsis thaliana. According to the phylogenetic tree, a total of 52 PKS genes can be divided into four subfamilies (I–IV). The analysis of gene structures and conserved motifs revealed that most of the PKS genes were composed of two exons and one intron and that there are two characteristic conserved domains (Chal_sti_synt_N and Chal_sti_synt_C) of the PKS gene family. In our study of the five species, gene duplication was found in addition to Arabidopsis thaliana and we determined that purifying selection has been of great significance in maintaining the function of the PKS gene family. From qRT-PCR analysis, combined with the role of the accumulation of proanthocyanidins (PAs) in brown cotton fibers, we concluded that five PKS genes are candidate genes involved in brown cotton fiber pigment synthesis. These results are important for the further study of brown cotton PKS genes. They not only reveal the relationship between the PKS gene family and pigment in brown cotton, but also create conditions for improving the quality of brown cotton fiber.
To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members of a miRNA group always had various expression levels, and some even showed larger expression divergence. Despite the dynamic expression as well as individual differences, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.
Background: The ferlin gene family possesses a rare and identifying feature consisting of multiple tandem C2 domains and a C-terminal transmembrane domain. Much currently remains unknown about the fundamental function of this gene family; however, mutations in its two most well-characterised members, dysferlin and otoferlin, have been implicated in human disease. The availability of genome sequences from a wide range of species makes it possible to explore the evolution of the ferlin family, providing contextual insight into characteristic features that define the ferlin gene family in its present form in humans. Results: Ferlin genes were detected from all species of representative phyla, with two ferlin subgroups partitioned within the ferlin phylogenetic tree based on the presence or absence of a DysF domain. Invertebrates generally possessed two ferlin genes (one with DysF and one without), with six ferlin genes in most vertebrates (three DysF, three non-DysF). Expansion of the ferlin gene family is evident between the divergence of lamprey (jawless vertebrates) and shark (cartilaginous fish). Common to almost all ferlins is an N-terminal C2-FerI-C2 sandwich, a FerB motif, and two C-terminal C2 domains (C2E and C2F) adjacent to the transmembrane domain. Preservation of these structural elements throughout eukaryotic evolution suggests a fundamental role of these motifs for ferlin function. In contrast, DysF, C2DE, and FerA are optional, giving rise to subtle differences in domain topologies of ferlin genes. Despite conservation of multiple C2 domains in all ferlins, the C-terminal C2 domains (C2E and C2F) displayed higher sequence conservation and greater conservation of putative calcium binding residues across paralogs and orthologs. Interestingly, the two most studied non-mammalian ferlins (Fer-1 and Misfire) in model organisms C. elegans and D. melanogaster, present as outgroups in the phylogenetic analysis, with results suggesting
Fierro, Ana C; Vandenbussche, Filip; Engelen, Kristof; Van de Peer, Yves; Marchal, Kathleen
Since the second half of the 1990s, a large number of genome-wide analyses have been described that study gene expression at the transcript level. To this end, two major strategies have been adopted, a first one relying on hybridization techniques such as microarrays, and a second one based on sequencing techniques such as serial analysis of gene expression (SAGE), cDNA-AFLP, and analysis based on expressed sequence tags (ESTs). Despite both types of profiling experiments becoming routine techniques in many research groups, their application remains costly and laborious. As a result, the number of conditions profiled in individual studies is still relatively small and usually varies from only two to a few hundred samples for the largest experiments. More and more, scientific journals require the deposit of these high throughput experiments in public databases upon publication. Mining the information present in these databases offers molecular biologists the possibility to view their own small-scale analysis in the light of what is already available. However, so far, the richness of the public information remains largely unexploited. Several obstacles, such as the correct association of ESTs and microarray probes with the corresponding gene transcript, the incompleteness and inconsistency in the annotation of experimental conditions, and the lack of standardized experimental protocols to generate gene expression data, all impede the successful mining of these data. Here, we review the potential and difficulties of combining publicly available expression data from EST analyses and microarray experiments, respectively. With examples from the literature, we show how meta-analysis of expression profiling experiments can be used to study expression behavior in a single organism or between organisms, across a wide range of experimental conditions. We also provide an overview of the methods and tools that can aid molecular biologists in exploiting these public data.
Rischewski, J; Schneppenheim, R
Patients with Fanconi anemia (Fanc) are at risk of developing leukemia. Mutations of the group A gene (FancA) are most common. A multitude of polymorphisms and mutations within the 43 exons of the gene are described. To examine the role of heterozygosity as a risk factor for malignancies, a partially automatized screening method to identify aberrations was needed. We report on our experience with DHPLC (WAVE (Transgenomic)). PCR amplification of all 43 exons from one individual was performed on one microtiter plate on a gradient thermocycler. DHPLC analysis conditions were established via melting curves, prediction software, and test runs with aberrant samples. PCR products were analyzed twice: native, and after adding a WT-PCR product. Retention patterns were compared with previously identified polymorphic PCR products or mutants. We have defined the mutation screening conditions for all 43 exons of FancA using DHPLC. So far, 40 different sequence variations have been detected in more than 100 individuals. The native analysis identifies heterozygous individuals, and the second run detects homozygous aberrations. Retention patterns are specific for the underlying sequence aberration, thus reducing sequencing demand and costs. DHPLC is a valuable tool for reproducible recognition of known sequence aberrations and screening for unknown mutations in the highly polymorphic FancA gene.
Non-small cell lung cancer (NSCLC) represents a genomically unstable cancer type with extensive copy number aberrations. The relationship of gene copy number alterations and subsequent mRNA levels has only fragmentarily been described. The aim of this study was to conduct a genome-wide analysis of gene copy number gains and corresponding gene expression levels in a clinically well annotated NSCLC patient cohort (n = 190) and their association with survival. While more than half of all analyzed gene copy number-gene expression pairs showed statistically significant correlations (10,296 of 18,756 genes), high correlations, with a correlation coefficient >0.7, were obtained only in a subset of 301 genes (1.6%), including KRAS, EGFR and MDM2. Higher correlation coefficients were associated with higher copy number and expression levels. Strong correlations were frequently based on few tumors with high copy number gains and correspondingly increased mRNA expression. Among the highly correlating genes, GO groups associated with posttranslational protein modifications were particularly frequent, including ubiquitination and neddylation. In a meta-analysis including 1,779 patients we found that survival associated genes were overrepresented among highly correlating genes (61 of the 301 highly correlating genes, FDR adjusted p<0.05). Among them are the chaperone CCT2, the core complex protein NUP107 and the ubiquitination and neddylation associated protein CAND1. In conclusion, in a comprehensive analysis we described a distinct set of highly correlating genes. These genes were found to be overrepresented among survival-associated genes based on gene expression in a large collection of publicly available datasets.
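The per-gene filtering step (correlation coefficient >0.7) can be sketched as below. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the gene names and values are invented placeholders rather than data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def highly_correlated(data, cutoff=0.7):
    """Keep genes whose copy number and expression correlate above cutoff.

    data maps a gene name to a (copy_numbers, expression_levels) pair,
    with one value per tumor in each sequence.
    """
    return [gene for gene, (cn, expr) in data.items() if pearson(cn, expr) > cutoff]

# Hypothetical values for four tumors (assumption, for illustration only).
toy = {
    "geneA": ([2, 2, 4, 8], [1.0, 1.1, 2.0, 4.2]),
    "geneB": ([2, 3, 2, 3], [5.0, 1.0, 4.0, 2.0]),
}
print(highly_correlated(toy))
```

As the abstract notes, a high coefficient can be driven by a few tumors with large gains, so inspecting the underlying points is advisable before interpreting such a list.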
Zhang, Lei; Ma, Shiyun; Wang, Huailiang; Su, Hang; Su, Ke; Li, Longjie
The purpose of our study was to identify new pathogenic genes used for exploring the pathogenesis of rheumatoid arthritis (RA). To screen pathogenic genes of RA, an integrated analysis was performed by using the microarray datasets in RA derived from the Gene Expression Omnibus (GEO) database. The functional annotation and potential pathways of differentially expressed genes (DEGs) were further discovered by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Afterwards, the integrated analysis of DNA methylation and gene expression profiling was used to screen crucial genes. In addition, we used RT-PCR and MSP to verify the expression levels and methylation status of these crucial genes in 20 synovial biopsy samples obtained from 10 RA model mice and 10 normal mice. BCL11B, CCDC88C, FCRLA and APOL6 were both up-regulated and hypomethylated in RA according to integrated analysis, RT-PCR and MSP verification. Four crucial genes (BCL11B, CCDC88C, FCRLA and APOL6) identified and analyzed in this study might be closely connected with the pathogenesis of RA. Copyright © 2017. Published by Elsevier B.V.
Garcia-Fernàndez, J; Baguñà, J; Saló, E
Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to date. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appears slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) of regeneration. PMID:1714599
Allison M Krill
BACKGROUND: Aluminum (Al) toxicity is a major worldwide constraint to crop productivity on acidic soils. Al becomes soluble at low pH, inhibiting root growth and severely reducing yields. Maize is an important staple food and commodity crop in acidic soil regions, especially in South America and Africa where these soils are very common. Al exclusion and intracellular tolerance have been suggested as two important mechanisms for Al tolerance in maize, but little is known about the underlying genetics. METHODOLOGY: An association panel of 282 diverse maize inbred lines and three F2 linkage populations with approximately 200 individuals each were used to study genetic variation in this complex trait. Al tolerance was measured as net root growth in nutrient solution under Al stress, which exhibited a wide range of variation between lines. Comparative and physiological genomics-based approaches were used to select 21 candidate genes for evaluation by association analysis. CONCLUSIONS: Six candidate genes had significant results from association analysis, but only four were confirmed by linkage analysis as putatively contributing to Al tolerance: Zea mays AltSB like (ZmASL), Zea mays aluminum-activated malate transporter2 (ALMT2), S-adenosyl-L-homocysteinase (SAHH), and Malic Enzyme (ME). These four candidate genes are high priority subjects for follow-up biochemical and physiological studies on the mechanisms of Al tolerance in maize. In the near term, elite haplotype-specific molecular markers can be developed for these four genes and used for efficient marker-assisted selection of superior alleles in Al tolerance maize breeding programs.
The severity and prevalence of many diseases are known to differ between the sexes. Organ specific sex-biased gene expression may underpin these and other sexually dimorphic traits. To further our understanding of sex differences in transcriptional regulation, we performed meta-analyses of sex biased gene expression in multiple human tissues. We analysed 22 publicly available human gene expression microarray data sets including over 2500 samples from 15 different tissues and 9 different organs. Briefly, by using an inverse-variance method we determined the effect size difference of gene expression between males and females. We found the greatest sex differences in gene expression in the brain, specifically in the anterior cingulate cortex (1818 genes), followed by the heart (375 genes), kidney (224 genes), colon (218 genes) and thyroid (163 genes). More interestingly, we found different parts of the brain with varying numbers and identity of sex-biased genes, indicating that specific cortical regions may influence sexually dimorphic traits. The majority of sex-biased genes in other tissues such as the bladder, liver, lungs and pancreas were on the sex chromosomes or involved in sex hormone production. On average in each tissue, 32% of autosomal genes that were expressed in a sex-biased fashion contained androgen or estrogen hormone response elements. Interestingly, across all tissues, we found approximately two-thirds of autosomal genes that were sex-biased were not under direct influence of sex hormones. To our knowledge this is the largest analysis of sex-biased gene expression in human tissues to date. We identified many sex-biased genes that were not under the direct influence of sex chromosome genes or sex hormones. These may provide targets for future development of sex-specific treatments for diseases.
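The abstract above names an inverse-variance method without giving details. As a minimal sketch of the general technique (a fixed-effect pooled estimate, which may differ from the authors' exact model), each data set's effect size is weighted by the reciprocal of its variance. The function name and the example numbers are illustrative assumptions, not values from the study.

```python
import math


def inverse_variance_pool(effects, variances):
    """Fixed-effect inverse-variance pooling.

    effects   -- per-study effect sizes (e.g. male vs. female expression difference)
    variances -- per-study variances of those effect sizes (all > 0)
    Returns (pooled_effect, standard_error).
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equally long")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Precise studies (small variance) get large weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # The variance of the pooled estimate is 1 / sum(weights).
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


if __name__ == "__main__":
    # Hypothetical effect sizes for one gene in two data sets.
    print(inverse_variance_pool([0.2, 0.6], [0.01, 0.04]))
```

With the hypothetical inputs shown, the more precise data set (variance 0.01) pulls the pooled estimate toward its own effect size.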
Zhou, Qian; Yu, Yong-ming
Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of essential genes, and selecting such features is fundamental to predicting essential genes with computational techniques. We use fractal theory to make a comparative analysis of essential and nonessential genes in bacteria. The information dimensions of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations (CGRs). A weak positive linear correlation is found between information dimension and gene length. Moreover, for genes of similar length, the average information dimension of essential genes is larger than that of nonessential genes. This indicates that essential genes show less regularity and higher complexity than nonessential genes. Our results show that for bacteria with similar numbers of essential and nonessential genes, the CGR information dimension is helpful for the classification of essential genes and nonessential genes. Therefore, the gene CGR information dimension is very probably a useful gene feature for a genetic algorithm predicting essential genes.
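To make the chaos game representation and its information dimension concrete, here is a minimal sketch. It assumes one common corner assignment (A, C, G, T at the corners of the unit square) and estimates the dimension as the least-squares slope of box entropy against log(1/box size). The abstract does not specify the grid sizes, corner layout, or fitting procedure, so those choices are assumptions.

```python
import math
from collections import Counter

# Assumed corner layout; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each point is halfway to the base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def box_entropy(points, k):
    """Shannon entropy (nats) of the point distribution on a 2^k x 2^k grid."""
    n = 2 ** k
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = len(points)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def information_dimension(points, ks=(1, 2, 3, 4)):
    """Least-squares slope of entropy versus log(1/epsilon), epsilon = 2^-k."""
    xs = [k * math.log(2.0) for k in ks]
    ys = [box_entropy(points, k) for k in ks]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den
```

Under this sketch a highly regular sequence (for example a long run of one base) collapses toward a single corner and gives a dimension near 0, while a sequence with no preferred k-mers fills the square and approaches 2, which matches the abstract's reading of a larger dimension as lower regularity.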
An, X-K; Fang, J; Yu, Z-Z; Lin, Q; Lu, C-X; Qu, H-L; Ma, Q-L
Several genome-wide association studies (GWASs) in Caucasian populations have identified 12 loci that are significantly associated with migraine. More evidence suggests that serotonin receptors are also involved in migraine pathophysiology. In the present study, a case-control study was conducted in a cohort of 581 migraine cases and 533 ethnically matched controls among a Chinese population. Eighteen polymorphisms from serotonin receptors and GWASs were selected, and genotyping was performed using a Sequenom MALDI-TOF mass spectrometry iPLEX platform. The genotypic and allelic distributions of MEF2D rs2274316 and ASTN2 rs6478241 were significantly different between migraine patients and controls. Univariate and multivariate analysis revealed significant associations of polymorphisms in the MEF2D and ASTN2 genes with migraine susceptibility. MEF2D, PRDM16 and ASTN2 were also found to be associated with migraine without aura (MO) and migraine with family history. And, MEF2D and ASTN2 also served as genetic risk factors for the migraine without family history. The generalized multifactor dimensionality reduction analysis identified that MEF2D and HTR2E constituted the two-factor interaction model. Our study suggests that the MEF2D, PRDM16 and ASTN2 genes from GWAS are associated with migraine susceptibility, especially MO, among Chinese patients. It appears that there is no association with serotonin receptor related genes. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Akberdin, Ilya R; Omelyanchuk, Nadezda A; Fadeev, Stanislav I; Leskova, Natalya E; Oschepkova, Evgeniya A; Kazantsev, Fedor V; Matushkin, Yury G; Afonnikov, Dmitry A; Kolchanov, Nikolay A
Multiple experimental data sets have demonstrated that the core gene network orchestrating self-renewal and differentiation of mouse embryonic stem cells involves the activity of the Oct4, Sox2 and Nanog genes by means of a number of positive feedback loops among them. However, recent studies indicated that the architecture of the core gene network should also incorporate negative Nanog autoregulation and might not include positive feedbacks from Nanog to Oct4 and Sox2. Thorough parametric analysis of the mathematical model based on this revisited core regulatory circuit showed that the model dynamics change substantially depending on the strength of Oct4 and Sox2 activation and the molecular complexity of Nanog autorepression. The analysis showed the existence of four dynamical domains with different numbers of stable and unstable steady states. We hypothesize that these domains can constitute the checkpoints in a developmental progression from naïve to primed pluripotency and vice versa. During this transition, parametric conditions exist that generate an oscillatory behavior of the system, explaining heterogeneity in expression of pluripotent and differentiation factors in serum ESC cultures. Finally, simulations showed that addition of positive feedbacks from Nanog to Oct4 and Sox2 mainly enlarges the parametric space for the naïve ESC state, in which pluripotency factors are strongly expressed while differentiation ones are repressed.
The wild silkworm Bombyx mandarina is widely believed to be an ancestor of the domesticated silkworm, Bombyx mori. Silkworms are often used as a model for studying the mechanism of species domestication. Here, we performed transcriptome sequencing of the wild silkworm using an Illumina HiSeq2000 platform. We produced 100,004,078 high-quality reads and assembled them into 50,773 contigs with an N50 length of 1764 bp and a mean length of 941.62 bp. A total of 33,759 unigenes were identified, with 12,805 annotated in the Nr database, 8273 in the Pfam database, and 9093 in the Swiss-Prot database. Expression profile analysis found significant differential expression of 1308 unigenes between the middle silk gland (MSG) and posterior silk gland (PSG). Three sericin genes (sericin 1, sericin 2, and sericin 3) were expressed specifically in the MSG and three fibroin genes (fibroin-H, fibroin-L, and fibroin/P25) were expressed specifically in the PSG. In addition, 32,297 single-nucleotide polymorphisms (SNPs) and 361 insertion-deletions (INDELs) were detected. Comparison with the domesticated silkworm p50/Dazao identified 5,295 orthologous genes, among which 400 might have experienced or be experiencing positive selection according to Ka/Ks analysis. The data and analyses presented here provide insights into silkworm domestication and an invaluable resource for wild silkworm genomics research.
The objective of this study was to determine the molecular characteristics of the horse vascular endothelial growth factor alpha gene (VEGFα) by constructing a phylogenetic tree, and to investigate gene expression profiles in tissues and blood leukocytes after exercise for development of suitable biomarkers. Using published amino acid sequences of other vertebrate species (human, chimpanzee, mouse, rat, cow, pig, chicken and dog), we constructed a phylogenetic tree which showed that equine VEGFα belonged to the same clade as the pig VEGFα. Analysis of synonymous (Ks) and non-synonymous (Ka) substitution ratios revealed that the horse VEGFα underwent positive selection. RNA was extracted from blood samples before and after exercise and from different tissue samples of three horses. Expression analyses using reverse transcription-polymerase chain reaction (RT-PCR) and quantitative-polymerase chain reaction (qPCR) showed ubiquitous expression of VEGFα mRNA in skeletal muscle, kidney, thyroid, lung, appendix, colon, spinal cord, and heart tissues. Analysis of differential expression of the VEGFα gene in blood leukocytes after exercise indicated a unimodal pattern. These results will be useful in developing biomarkers that can predict the recovery capacity of racing horses.
Bjerregaard, Henriette; Pedersen, Shona; Kristensen, Søren Risom; Marcussen, Niels
Differentiation between malignant renal cell carcinoma and benign oncocytoma is of great importance for choosing the optimal treatment. Accurate preoperative diagnosis of renal tumor is therefore crucial; however, existing imaging techniques and histologic examinations are incapable of providing an optimal differentiation profile. Analysis of gene expression of molecular markers is a new possibility but relies on appropriate standardization to compare different samples. The aim of this study was to identify stably expressed reference genes suitable for the normalization of results extracted from gene expression analysis of renal tumors. Expression levels of 8 potential reference genes (ATP5J, HMBS, HPRT1, PPIA, TBP, 18S, GAPDH, and POLR2A) were examined by real-time reverse transcription polymerase chain reaction in tumor and normal tissue from removed kidneys from 13 patients with renal cell carcinoma and 5 patients with oncocytoma. The expression levels of genes were compared by gene stability value M, average gene stability M, pairwise variation V, and coefficient of variation CV. Several candidates were not suitable for the purpose, but a combination of HMBS, PPIA, ATP5J, and TBP was found to be the best combination with an average gene stability value M of 0.9 and a CV of 0.4 in the 18 tumors and normal tissues. A combination of 4 genes (HMBS, PPIA, ATP5J, and TBP), stably expressed in tissue from renal cell carcinoma, is a possible reference set for renal tumor gene expression analysis by reverse transcription polymerase chain reaction.
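The gene stability value M mentioned above can be illustrated with a short sketch of the geNorm-style definition: for each candidate gene, M is the mean, over all other candidates, of the standard deviation across samples of the log2 expression ratio. The input format (linear-scale relative quantities per sample) and the gene names in the example are assumptions for illustration; the study's own pipeline is not described in the abstract.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Compute a geNorm-style stability value M per gene.

    expression -- dict mapping gene name to a list of relative expression
                  quantities (linear scale, > 0), one value per sample.
    Lower M means the gene's ratio to the other candidates varies less.
    """
    genes = list(expression)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    n_samples = len(expression[genes[0]])
    if n_samples < 2 or any(len(expression[g]) != n_samples for g in genes):
        raise ValueError("every gene needs the same number (>= 2) of samples")
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            pairwise_sd.append(stdev(log_ratios))  # sample standard deviation
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


if __name__ == "__main__":
    # Hypothetical quantities: GENE_A and GENE_B keep a constant ratio, GENE_C does not.
    demo = {"GENE_A": [1, 2, 4], "GENE_B": [2, 4, 8], "GENE_C": [1, 4, 1]}
    print(genorm_m(demo))
```

In the hypothetical example, the two genes with a constant ratio share the lowest M, and the gene whose expression fluctuates independently gets the highest.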
Rice (Oryza sativa L.) is a model organism for the functional genomics of monocotyledonous plants since the genome size is considerably smaller than those of other monocotyledonous plants. Although highly accurate genome sequences of indica and japonica rice are available, additional resources such as full-length complementary DNA (FL-cDNA) sequences are also indispensable for comprehensive analyses of gene structure and function. We cross-referenced 28.5K individual loci in the rice genome defined by mapping of 578K FL-cDNA clones with the 56K loci predicted in the TIGR genome assembly. Based on the annotation status and the presence of corresponding cDNA clones, genes were classified into 23K annotated expressed (AE) genes, 33K annotated non-expressed (ANE) genes, and 5.5K non-annotated expressed (NAE) genes. We developed a 60mer oligo-array for analysis of gene expression from each locus. Analysis of gene structures and expression levels revealed that the general features of gene structure and expression of NAE and ANE genes were considerably different from those of AE genes. The results also suggested that the cloning efficiency of rice FL-cDNA is associated with the transcription activity of the corresponding genetic locus, although other factors may also have an effect. Comparison of the coverage of FL-cDNA among gene families suggested that FL-cDNA from genes encoding rice- or eukaryote-specific domains, and those involved in regulatory functions were difficult to produce in bacterial cells. Collectively, these results indicate that rice genes can be divided into distinct groups based on transcription activity and gene structure, and that the coverage bias of FL-cDNA clones exists due to the incompatibility of certain eukaryotic genes in bacteria.
Eren, Kemal; Deveci, Mehmet; Küçüktunç, Onur; Çatalyürek, Ümit V.
The need to analyze high-dimension biological data is driving the development of new data mining methods. Biclustering algorithms have been successfully applied to gene expression data to discover local patterns, in which a subset of genes exhibit similar expression levels over a subset of conditions. However, it is not clear which algorithms are best suited for this task. Many algorithms have been published in the past decade, most of which have been compared only to a small number of algorithms. Surveys and comparisons exist in the literature, but because of the large number and variety of biclustering algorithms, they are quickly outdated. In this article we partially address this problem of evaluating the strengths and weaknesses of existing biclustering methods. We used the BiBench package to compare 12 algorithms, many of which were recently published or have not been extensively studied. The algorithms were tested on a suite of synthetic data sets to measure their performance on data with varying conditions, such as different bicluster models, varying noise, varying numbers of biclusters and overlapping biclusters. The algorithms were also tested on eight large gene expression data sets obtained from the Gene Expression Omnibus. Gene Ontology enrichment analysis was performed on the resulting biclusters, and the best enrichment terms are reported. Our analyses show that the biclustering method and its parameters should be selected based on the desired model, whether that model allows overlapping biclusters, and its robustness to noise. In addition, we observe that the biclustering algorithms capable of finding more than one model are more successful at capturing biologically relevant clusters. PMID:22772837
de Bernabe, D. B.-V.; Peterson, P.; Luopajarvi, K.; Matintalo, P.; Alho, A.; Konttinen, Y.; Krohn, K.; de Cordoba, S. R.; Ranki, A.
Alkaptonuria (AKU), the prototypic inborn error of metabolism, has recently been shown to be caused by loss of function mutations in the homogentisate-1,2-dioxygenase gene (HGO). So far 17 mutations have been characterised in AKU patients of different ethnic origin. We describe three novel mutations (R58fs, R330S, and H371R) and one common AKU mutation (M368V), detected by mutational and polymorphism analysis of the HGO gene in five Finnish AKU pedigrees. The three novel AKU mutations are most likely specific for the Finnish population and have originated recently. Keywords: alkaptonuria; homogentisate-1,2-dioxygenase; Finland PMID:10594001
Liu, Wei; Li, Li; Ye, Hua; Tu, Wei
High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-system and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. Weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
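The core step that lets WGCNA detect clusters of highly correlated genes is the soft-threshold adjacency: the absolute Pearson correlation between two expression profiles is raised to a power beta, which suppresses weak correlations while keeping strong ones. The sketch below shows only that step for an unsigned network in plain Python; it is not the WGCNA R package, and the gene names and the choice beta = 6 are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta.

    profiles -- dict mapping gene name to its expression values across samples.
    Returns a dict keyed by (gene_i, gene_j) for every pair with i != j.
    """
    genes = list(profiles)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(profiles[g], profiles[h])) ** beta
            adjacency[(g, h)] = a
            adjacency[(h, g)] = a  # the adjacency matrix is symmetric
    return adjacency


if __name__ == "__main__":
    # Hypothetical expression profiles over four samples.
    demo = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
    print(soft_threshold_adjacency(demo))
```

In the hypothetical data a correlation of 0.8 drops to roughly 0.26 after raising to the sixth power, while a perfect correlation stays at 1, which is the sharpening effect the soft threshold is meant to provide.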
Common microarray and next-generation sequencing data analyses concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN) based approaches have been employed in many large studies in order to scrutinize dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, the modulation effect, and the preliminary findings from these networks. The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these cases, the concept of "modulation" can be adapted to various biological mechanisms to discover novel gene regulation mechanisms.
Base sequence analysis of the CCKAR gene (the gene for the A-type cholecystokinin receptor) from the OLETF rat, a model rat for insulin-independent diabetes, was performed based on the base sequence of the wild-type CCKAR gene, which had been clarified in the previous year. DNA was extracted from the pancreas of the OLETF rat, fragmented and transduced into λ phage to construct the OLETF gene library. Then, the λ phage DNA clone that bound labelled cDNA of the CCKAR gene was analyzed and the gene structure was compared with that of the wild-type gene. It was demonstrated that the CCKAR gene of OLETF had a deletion (6800 b.p.) ranging from the promoter region to Exon 2, suggesting that the CCKAR gene is not functional in the OLETF rat. The whole sequence of this mutant gene was registered in the Japan DNA Bank (D 50610). Then, F2 offspring rats were obtained by crossing OLETF (female) and F344 (male), and the time-course changes in the blood glucose level after glucose loading were compared among them. The blood glucose level after glucose loading was significantly higher in the homo-mutant F2 (CCKAR,-/-) as well as the parent OLETF rat than in the hetero-mutant F2 (CCKAR,-/+) or the wild-type rat (CCKAR,+/+). This suggests that the CCKAR gene might be involved in the control of blood glucose level and that an alteration of the expression level or the functions of the CCKAR gene might affect the blood glucose level. (M.N.)
Ahsen, Mehmet Eren; Niculescu, Silviu-Iulian
This brief examines a deterministic, ODE-based model for gene regulatory networks (GRN) that incorporates nonlinearities and time-delayed feedback. An introductory chapter provides some insights into molecular biology and GRNs. The mathematical tools necessary for studying the GRN model are then reviewed, in particular Hill functions and Schwarzian derivatives. One chapter is devoted to the analysis of GRNs under negative feedback with time delays and a special case of a homogenous GRN is considered. Asymptotic stability analysis of GRNs under positive feedback is then considered in a separate chapter, in which conditions leading to bi-stability are derived. Graduate and advanced undergraduate students and researchers in control engineering, applied mathematics, systems biology and synthetic biology will find this brief to be a clear and concise introduction to the modeling and analysis of GRNs.
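Hill functions are the standard nonlinearity in ODE-based GRN models of the kind described above. As a minimal sketch, the activating and repressing forms can be written as follows; the parameter names (threshold `k`, Hill coefficient `n`) are conventional but chosen here for illustration, and the brief itself may use a different parameterization.

```python
def hill_activation(x, k, n):
    """Activating Hill function: rises from 0 to 1 as x grows past k.

    x -- regulator concentration (>= 0)
    k -- concentration giving half-maximal response (> 0)
    n -- Hill coefficient controlling steepness (> 0)
    """
    if x < 0 or k <= 0 or n <= 0:
        raise ValueError("require x >= 0, k > 0 and n > 0")
    return x ** n / (k ** n + x ** n)


def hill_repression(x, k, n):
    """Repressing Hill function: falls from 1 to 0 as x grows past k."""
    if x < 0 or k <= 0 or n <= 0:
        raise ValueError("require x >= 0, k > 0 and n > 0")
    return k ** n / (k ** n + x ** n)


if __name__ == "__main__":
    # A larger Hill coefficient gives a sharper, more switch-like response.
    for n in (1, 2, 4):
        print(n, hill_activation(2.0, 1.0, n), hill_repression(2.0, 1.0, n))
```

The repressing form is the one that would appear in a negative feedback loop, and the activating form in a positive one; the two always sum to 1 for the same arguments.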
Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit
Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards-the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics
Nueda, Maria José; Carbonell, José; Medina, Ignacio; Dopazo, Joaquín; Conesa, Ana
Serial transcriptomics experiments investigate the dynamics of gene expression changes associated with a quantitative variable such as time or dosage. The statistical analysis of these data implies the study of global and gene-specific expression trends, the identification of significant serial changes, the comparison of expression profiles and the assessment of transcriptional changes in terms of cellular processes. We have created the SEA (Serial Expression Analysis) suite to provide a complete web-based resource for the analysis of serial transcriptomics data. SEA offers five different algorithms based on univariate, multivariate and functional profiling strategies framed within a user-friendly interface and a project-oriented architecture to facilitate the analysis of serial gene expression data sets from different perspectives. SEA is available at sea.bioinfo.cipf.es. PMID:20525784
A novel yellow-green leaf mutant, yellow-green leaf-1 (ygl-1), was isolated in self-pollinated progenies from the cross of maize inbred lines Ye478 and Yuanwu02. The mutant spontaneously showed the yellow-green character throughout its lifespan. Compared with the wild-type Lx7226, the mutant had reduced contents of chlorophyll and Car, arrested chloroplast development and a lower capacity for photosynthesis. Genetic analysis revealed that the mutant phenotype was controlled by a recessive nuclear gene. The ygl-1 locus was initially mapped to an interval of about 0.86 Mb in bin 1.01 on the short arm of chromosome 1 using 231 yellow-green leaf individuals of an F2 segregating population from ygl-1/Lx7226. Utilizing four new polymorphic SSR markers, the ygl-1 locus was narrowed down to a region of about 48 kb using 2930 and 2247 individuals of F2 and F3 mapping populations, respectively. Among the three predicted genes annotated within this 48 kb region, GRMZM2G007441, which was predicted to encode a cpSRP43 protein, had a 1-bp nucleotide deletion in the coding region of ygl-1 resulting in a frame shift mutation. Semi-quantitative RT-PCR analysis revealed that YGL-1 was constitutively expressed in all tested tissues and its expression level was not significantly affected in the ygl-1 mutant from early to mature stages, while light intensity regulated its expression in both the ygl-1 mutant and wild type seedlings. Furthermore, the mRNA levels of some genes involved in chloroplast development were affected in the six-week old ygl-1 plants. These findings suggested that YGL-1 plays an important role in chloroplast development of maize.
Li, Meng-Yao; Song, Xiong; Wang, Feng; Xiong, Ai-Sheng
Parsley, one of the most important vegetables in the Apiaceae family, is widely used in the food, medicinal, and cosmetic industries. Recent studies on parsley mainly focus on its chemical composition, and further research involving the analysis of the plant's gene functions and expressions is required. qPCR is a powerful method for detecting very low quantities of target transcript levels and is widely used to study gene expression. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, four software, namely geNorm, NormFinder, BestKeeper, and RefFinder were used to evaluate the expression stabilities of eight candidate reference genes of parsley ( GAPDH, ACTIN, eIF-4 α, SAND, UBC, TIP41, EF-1 α, and TUB ) under various conditions, including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). Results showed that EF-1 α and TUB were the most stable genes for abiotic stresses, whereas EF-1 α, GAPDH , and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1 α and TUB were the most stable reference genes among all tested samples, and UBC was the least stable one. Expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study can guide the selection of suitable reference genes in gene expression in parsley.
Parsley is one of the most important vegetables in the Apiaceae family and is widely used in the food, medicinal and cosmetic industries. Recent studies on parsley mainly focus on its chemical composition, and further research involving the analysis of gene functions and expression will be required. qPCR is a powerful method for detecting very low quantities of target transcripts and is widely used for gene expression studies. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, three software tools (geNorm, NormFinder, and BestKeeper) were used to evaluate the expression stabilities of eight candidate reference genes (GAPDH, ACTIN, eIF-4α, SAND, UBC, TIP41, EF-1α, and TUB) under various conditions including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). The results showed that EF-1α and TUB were the most stable genes for abiotic stresses, while EF-1α, GAPDH, and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1α and TUB were the most stable reference genes across all the tested samples, while UBC was the least stable one. The expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study provides a guideline for selecting suitable reference genes for gene expression studies in parsley.
Napier, Maria L; Durga, Dash; Wolsley, Clive J; Chamney, Sarah; Alexander, Sharon; Brennan, Rosie; Simpson, David A; Silvestri, Giuliana; Willoughby, Colin E
To determine the role of rhodopsin (RHO) gene mutations in patients with sector retinitis pigmentosa (RP) from Northern Ireland. A case series of sector RP in a tertiary ocular genetics clinic. Four patients with sector RP were recruited from the Royal Victoria Hospital (Belfast, Northern Ireland) and Altnagelvin Hospital (Londonderry, Northern Ireland) following informed consent. The diagnosis of sector RP was based on clinical examination, International Society for Clinical Electrophysiology of Vision (ISCEV) standard electrophysiology, and visual field analysis. DNA was extracted from peripheral blood leucocytes, and the coding regions and adjacent flanking intronic sequences of the RHO gene were polymerase chain reaction (PCR) amplified and cycle sequenced. The main outcome measure was rhodopsin mutational status. A heterozygous missense mutation in RHO (c.173C>T) resulting in a non-conservative substitution of threonine to methionine (p.Thr58Met) was identified in one patient and was absent from 360 control individuals. This non-conservative substitution (p.Thr58Met) replaces a highly evolutionarily conserved polar hydrophilic threonine residue with a non-polar hydrophobic methionine residue at position 58 near the cytoplasmic border of helix A of RHO. The study identified a RHO gene mutation (p.Thr58Met) not previously reported in RP in a patient with sector RP. These findings outline the phenotypic variability associated with RHO mutations. It has been proposed that the regional effects of RHO mutations are likely to result from interplay between mutant alleles and other genetic, epigenetic and environmental factors.
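The correspondence between the coding-DNA change c.173C>T and the protein change p.Thr58Met reported above can be checked with simple arithmetic. The sketch below assumes standard coding-DNA numbering (position 1 is the first base of the start codon) and the standard genetic code; the function names are hypothetical, and the reference codon is inferred from the genetic code rather than taken from the RHO sequence.

```python
def codon_of(cdna_pos):
    """Map a 1-based coding-DNA position to (codon number, position in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


# Standard genetic code, DNA alphabet.
THR_CODONS = {"ACA", "ACC", "ACG", "ACT"}
MET_CODONS = {"ATG"}


def compatible_codons(ref_codons, alt_codons, pos_in_codon, ref_base, alt_base):
    """List (reference, mutated) codon pairs consistent with a single-base change."""
    pairs = []
    for codon in sorted(ref_codons):
        if codon[pos_in_codon - 1] != ref_base:
            continue
        mutated = codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]
        if mutated in alt_codons:
            pairs.append((codon, mutated))
    return pairs


codon_number, pos_in_codon = codon_of(173)   # codon 58, second base
candidates = compatible_codons(THR_CODONS, MET_CODONS, pos_in_codon, "C", "T")
```

Position 173 falls on the second base of codon 58, and the only threonine codon that a C>T change at that base turns into methionine is ACG (becoming ATG), which is consistent with the reported substitution.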
Trebušak Podkrajšek, Katarina; Stirn Kranjc, Branka; Hovnik, Tinka; Kovač, Jernej; Battelino, Tadej
X-linked ocular albinism type 1 is difficult to differentiate clinically from other forms of albinism in young patients. It is caused by mutations in the GPR143 gene, which encodes a melanosome-specific G-protein coupled receptor. Patients typically present with moderately to severely reduced visual acuity, nystagmus, strabismus, photophobia, iris translucency, hypopigmentation of the retina, foveal hypoplasia and misrouting of optic nerve fibers at the chiasm. Following clinical ophthalmological evaluation, GPR143 gene mutational analyses were performed in a cohort of 15 pediatric male patients with clinical signs of albinism. Three different mutations in the GPR143 gene were identified in four patients, including a novel c.886G>A (p.Gly296Arg) mutation occurring "de novo" and a novel intronic c.360+5G>A mutation, identified in two related boys. Four patients with X-linked ocular albinism type 1 were thus identified from a cohort of 15 boys with clinical signs of albinism using mutation detection methods. Genetic analysis offers the possibility of early definitive diagnosis of ocular albinism type 1 in a significant portion of boys with clinical signs of albinism.
Mary Qu Yang
Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed to improve the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatically mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes, and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatically mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways. Keywords: ccRCC, Causative mutation, Pathways, Protein-protein interaction, Gene module, eQTL
Jose A Seoane
Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and in testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants, respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and of several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea, and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.
BACKGROUND: Trichinellosis is a typical food-borne zoonotic disease which is epidemic worldwide, and the nematode Trichinella spiralis is the main pathogen. The life cycle of T. spiralis contains three developmental stages, i.e. adult worms, newborn larvae (newborn L1 larvae) and muscle larvae (infective L1 larvae). Stage-specific gene expression in the parasite has been investigated with various immunological and cDNA cloning approaches, whereas the genome-wide transcriptome and expression features of the parasite have been largely unknown. The availability of the genome sequence information of T. spiralis has made it possible to dissect parasite biology in depth in association with global gene expression and pathogenesis. METHODOLOGY AND PRINCIPAL FINDINGS: In this study, we analyzed the global gene expression patterns in the three developmental stages of T. spiralis using digital gene expression (DGE) analysis. Almost 15 million sequence tags were generated with the Illumina RNA-seq technology, producing expression data for more than 9,000 genes, covering 65% of the genome. The transcriptome analysis revealed thousands of differentially expressed genes within the genome, and importantly, a panel of genes encoding functional proteins associated with parasite invasion and immuno-modulation was identified. More than 45% of the genes were found to be transcribed from both strands, indicating the importance of RNA-mediated gene regulation in the development of the parasite. Further, based on gene ontology analysis, over 3000 genes were functionally categorized and biological pathways in the three life cycle stages were elucidated. CONCLUSIONS AND SIGNIFICANCE: The global transcriptome of T. spiralis in three developmental stages has been profiled, and most gene activity in the genome was found to be developmentally regulated. Many metabolic and biological pathways have been revealed. The findings of the differential expression of several protein
Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K
Networks have been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Modularity, a fundamental property of many biological networks, has been widely studied, and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large number of biological networks makes network integration studies possible. By integrating such networks, more informative modules for a specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network with the alignment of the modules from different networks. An approximation algorithm based on eigenvector computation is proposed. In simulation studies, our method outperforms the existing methods, especially when the underlying modules in multiple networks are different. We also applied our method to two groups of gene coexpression networks for humans, one for three different cancers and one for three tissues from morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.
Wan, Yongqing; Mao, Mingzhu; Wan, Dongli; Yang, Qi; Yang, Feiyun; Mandlaa; Li, Guojing; Wang, Ruigang
WRKY transcription factors, one of the largest families of transcriptional regulators in plants, play important roles in plant development and various stress responses. The WRKYs of Caragana intermedia are still not well characterized, although many WRKYs have been identified in various plant species. We identified 53 CiWRKY genes from C. intermedia transcriptome data, 28 of which exhibited complete open reading frames (ORFs). These CiWRKYs were divided into three groups via phylogenetic analysis according to their WRKY domains and zinc finger motifs. Conserved domain analysis showed that the CiWRKY proteins contain a highly conserved WRKYGQK motif and two variant motifs (WRKYGKK and WKKYEEK). The subcellular localization of CiWRKY26 and CiWRKY28-1 indicated that these two proteins localized exclusively to nuclei, supporting their role as transcription factors. The expression patterns of the 28 CiWRKYs with complete ORFs were examined through quantitative real-time PCR (qRT-PCR) in various tissues and under different abiotic stresses (drought, cold, salt, high-pH and abscisic acid (ABA)). The results showed that each CiWRKY responded to at least one stress treatment. Furthermore, overexpression of CiWRKY75-1 and CiWRKY40-4 in Arabidopsis thaliana suppressed the drought stress tolerance of the plants and delayed leaf senescence, respectively. Fifty-three CiWRKY genes from the C. intermedia transcriptome were identified and divided into three groups via phylogenetic analysis. The expression patterns of the 28 CiWRKYs under different abiotic stresses suggested that each CiWRKY responded to at least one stress treatment. Overexpression of CiWRKY75-1 and CiWRKY40-4 suppressed the drought stress tolerance of Arabidopsis and delayed leaf senescence, respectively. These results provide a basis for the molecular mechanism through which CiWRKYs mediate stress tolerance.
Context: Paracoccidioides brasiliensis, a dimorphic fungus, is the causative agent of paracoccidioidomycosis, a disease globally affecting millions of people. The haloacid dehalogenase (HAD) superfamily hydrolase enzyme in the fungus, in particular, is known to contribute to pathogenesis by adhering to the tissue. Hence, identification of novel drug targets is essential. Aims: In silico identification of genes co-expressed with the HAD superfamily hydrolase in P. brasiliensis during morphogenesis from mycelium to yeast, in order to identify possible genes as drug targets. Materials and Methods: In total, four datasets were retrieved from the NCBI Gene Expression Omnibus (GEO) database, each containing 4340 genes, followed by expression-based gene filtration of the data sets. A co-expression (CE) study was then performed on each dataset individually, and the combination of these genes was visualized in Cytoscape 2.8.3. Statistical Analysis Used: The mean and standard deviation of the HAD superfamily hydrolase gene were obtained from the expression data, and these values were subsequently used for the CE calculation by selecting a specific correlation power and filtering threshold. Results: The 23 genes thus obtained are common with respect to the HAD superfamily hydrolase gene. A significant network containing a total of 7 genes was selected from the Cytoscape network visualization; 5 of these genes had no significant protein hits in the gene annotation of the expressed sequence tags by BLASTX. For all the proteins, PSI-BLAST was performed against the human genome to find homology. Conclusions: The gene co-expression network was obtained with respect to the HAD superfamily dehalogenase gene in P. brasiliensis.
Koul, A.M.; Nadeem, A.; Baryalai, P.
Abstract: The Inpp5k gene encodes a protein that plays a vital role in a number of metabolic pathways. It is particularly significant in glucose metabolism, where it regulates signalling in the insulin pathway. However, the full molecular details of the pathways regulated by the Inpp5k-encoded protein are not known. It is speculated that Inpp5k gene expression is altered in endometrial adenocarcinoma. The Myo1c gene encodes a protein called Myosin-1c, which acts as an actin-based molecular motor in cells. It has been reported that this gene is down-regulated in endometrial adenocarcinoma and colorectal cancers. In this study the expression analysis of these two genes was carried out using multiplex PCR. The ACTS gene served as the endogenous control for this PCR because it is a housekeeping gene and thus shows universal expression in all cells. The expression of the Inpp5k and Myo1c genes was therefore analysed relative to the ACTS gene. The results showed an over-expression of the Inpp5k gene and a down-regulation of the Myo1c gene with respect to the ACTS gene in cancer cell lines, as was indicated by previous studies of these genes. Expression of both genes, i.e. Inpp5k and Myo1c, was statistically compared between normal and cancerous cell lines, and the difference was found to be statistically significant at P < 0.01 in most of the cases. (author)
Here we report the isolation and characterization of porcine NURR1 cDNA. The NURR1 cDNA was RT-PCR cloned using NURR1-specific oligonucleotide primers derived from in silico sequences. The porcine NURR1 cDNA encodes a polypeptide of 598 amino acids, displaying a very high similarity with bovine, human and mouse (99%) NURR1 protein. Expression analysis revealed a differential NURR1 mRNA expression in various organs and tissues. NURR1 transcripts could be detected as early as at 60 days of embryo development in different brain tissues. A significant increase in NURR1 transcript in the cerebellum and a decrease in NURR1 transcript in the basal ganglia was observed during embryo development. The porcine NURR1 gene was mapped to chromosome 15. Two missense mutations were found in exon 3, the first coding exon of NURR1. Methylation analysis of the porcine NURR1 gene body revealed a high methylation degree in brain tissue, whereas methylation of the promoter was very low. A decrease in DNA methylation in a discrete region of the NURR1 promoter was observed in pig frontal cortex during pig embryo development. This observation correlated with an increase in NURR1 transcripts. Therefore, methylation might be a determinant of NURR1 expression at certain time points in embryo development.
Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia
Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
Jacob, Tiago R; Peres, Nalu T A; Persinoti, Gabriela F; Silva, Larissa G; Mazucato, Mendelson; Rossi, Antonio; Martinez-Rossi, Nilce M
The selection of reference genes used for data normalization to quantify gene expression by real-time PCR amplifications (qRT-PCR) is crucial for the accuracy of this technique. In spite of this, little information regarding such genes for qRT-PCR is available for gene expression analyses in pathogenic fungi. Thus, we investigated the suitability of eight candidate reference genes in isolates of the human dermatophyte Trichophyton rubrum subjected to several environmental challenges, such as drug exposure, interaction with human nail and skin, and heat stress. The stability of these genes was determined by geNorm, NormFinder and Best-Keeper programs. The gene with the most stable expression in the majority of the conditions tested was rpb2 (DNA-dependent RNA polymerase II), which was validated in three T. rubrum strains. Moreover, the combination of rpb2 and chs1 (chitin synthase) genes provided for the most reliable qRT-PCR data normalization in T. rubrum under a broad range of biological conditions. To the best of our knowledge this is the first report on the selection of reference genes for qRT-PCR data normalization in dermatophytes and the results of these studies should permit further analysis of gene expression under several experimental conditions, with improved accuracy and reliability.
Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen
Sea cucumbers, one main class of Echinoderms, have a very fast and drastic metamorphosis process during their development. However, the molecular basis under this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.
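The first step of a weighted co-expression network analysis such as the one named above can be sketched as follows. This is a minimal illustration, assuming an unsigned network in which the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-thresholding power `beta`; the value `beta=6`, the gene labels and the expression values are assumptions made for the example, not parameters or data from the study. Module detection and module eigengenes are not covered here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |cor(i, j)| ** beta for every gene pair."""
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


def connectivity(adjacency, gene):
    """Sum of a gene's adjacencies to all other genes."""
    return sum(w for (g, _), w in adjacency.items() if g == gene)


# Hypothetical expression profiles across five time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],   # rises together with g1
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],    # weakly related to g1
}
adj = soft_adjacency(profiles)
```

Raising the correlation to a power keeps strong co-expression close to 1 and pushes weak correlations towards 0 without a hard cut-off, which is the design idea behind the weighted network.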
Integrative analysis of gene dosage, expression, and ontology (GO) data was performed to discover driver genes in the carcinogenesis and chemoradioresistance of cervical cancers. Gene dosage and expression profiles of 102 locally advanced cervical cancers were generated by microarray techniques. Fifty-two of these patients were also analyzed with the Illumina expression method to confirm the gene expression results. An independent cohort of 41 patients was used for validation of gene expressions associated with clinical outcome. Statistical analysis identified 29 recurrent gains and losses and 3 losses (on 3p, 13q, 21q) associated with poor outcome after chemoradiotherapy. The intratumor heterogeneity, assessed from the gene dosage profiles, was low for these alterations, showing that they had emerged prior to many other alterations and probably were early events in carcinogenesis. Integration of the alterations with gene expression and GO data identified genes that were regulated by the alterations and revealed five biological processes that were significantly overrepresented among the affected genes: apoptosis, metabolism, macromolecule localization, translation, and transcription. Four genes on 3p (RYBP, GBE1) and 13q (FAM48A, MED4) correlated with outcome at both the gene dosage and expression level and were satisfactorily validated in the independent cohort. These integrated analyses yielded 57 candidate drivers of 24 genetic events, including novel loci responsible for chemoradioresistance. Further mapping of the connections among genetic events, drivers, and biological processes suggested that each individual event stimulates specific processes in carcinogenesis through the coordinated control of multiple genes. The present results may provide novel therapeutic opportunities for both early and advanced stage cervical cancers.
Dylan T Jones
Angiogenesis is essential for solid tumour growth, whilst the molecular profiles of tumour blood vessels have been reported to be different between cancer types. Although presently available anti-angiogenic strategies are providing some promise for the treatment of some cancers, it is perhaps not surprising that none of the anti-angiogenic agents available work on all tumours. Thus, the discovery of novel anti-angiogenic targets, relevant to individual cancer types, is required. Using Affymetrix microarray analysis of laser-captured, CD31-positive blood vessels we have identified 63 genes that are upregulated significantly (5-72 fold) in angiogenic blood vessels associated with human invasive ductal carcinoma (IDC) of the breast as compared with blood vessels in normal human breast. We tested the angiogenic capacity of a subset of these genes. Genes were selected based on either their known cellular functions, their enriched expression in endothelial cells and/or their sensitivity to anti-VEGF treatment; all features implicating their involvement in angiogenesis. For example, RRM2, a ribonucleotide reductase involved in DNA synthesis, was upregulated 32-fold in IDC-associated blood vessels; ATF1, a nuclear activating transcription factor involved in cellular growth and survival, was upregulated 23-fold in IDC-associated blood vessels; and HEX-B, a hexosaminidase involved in the breakdown of GM2 gangliosides, was upregulated 8-fold in IDC-associated blood vessels. Furthermore, in silico analysis confirmed that ATF1 and HEX-B also were enriched in endothelial cells when compared with non-endothelial cells. None of these genes have been reported previously to be involved in neovascularisation. However, our data establish that siRNA depletion of Rrm2, Atf1 or Hex-B had significant anti-angiogenic effects in VEGF-stimulated ex vivo mouse aortic ring assays. Overall, our results provide proof-of-principle that our approach can identify a cohort of
Background: Serial Analysis of Gene Expression (SAGE) is a new technique that provides detailed quantitative and qualitative knowledge of a gene expression profile without prior knowledge of the sequences of the analyzed genes. We carried out a modification of the SAGE methodology (microSAGE), useful for the analysis of limited quantities of tissue samples, on normal human cervical tissue obtained from a donor without histopathological lesions. Cervical epithelium is constituted mainly by cervical keratinocytes, which are the targets of human papilloma virus (HPV); persistent HPV infection of cervical epithelium is associated with an increased risk of developing cervical carcinomas (CC). Results: We report here a transcriptome analysis of cervical tissue by SAGE, derived from 30,418 sequenced tags that provide a wealth of information about the gene products involved in normal cervical epithelium physiology, as well as genes not previously found in uterine cervix tissue that are involved in the process of epidermal differentiation. Conclusion: This first comprehensive analysis of the uterine cervix transcriptome should be useful for the identification of genes involved in normal uterine cervix function and of candidate genes associated with cervical carcinoma.
Cohn Zachary A
Background: Cartilage plays a fundamental role in the development of the human skeleton. Early in embryogenesis, mesenchymal cells condense and differentiate into chondrocytes to shape the early skeleton. Subsequently, the cartilage anlagen differentiate to form the growth plates, which are responsible for linear bone growth, and the articular chondrocytes, which facilitate joint function. However, despite the multiplicity of roles of cartilage during human fetal life, surprisingly little is known about its transcriptome. To address this, a whole genome microarray expression profile was generated using RNA isolated from 18–22 week human distal femur fetal cartilage and compared with a database of control normal human tissues aggregated at UCLA, termed Celsius. Results: 161 cartilage-selective genes were identified, defined as genes significantly expressed in cartilage with low expression and little variation across a panel of 34 non-cartilage tissues. Among these 161 genes were cartilage-specific genes such as cartilage collagen genes and 25 genes which have been associated with skeletal phenotypes in humans and/or mice. Many of the other cartilage-selective genes do not have established roles in cartilage or are novel, unannotated genes. Quantitative RT-PCR confirmed the unique pattern of gene expression observed by microarray analysis. Conclusion: Defining the gene expression pattern for cartilage has identified new genes that may contribute to human skeletogenesis as well as provided further candidate genes for skeletal dysplasias. The data suggest that fetal cartilage is a complex and transcriptionally active tissue and demonstrate that the set of genes selectively expressed in the tissue has been greatly underestimated.
Roy, Janine; Winter, Christof; Schroeder, Michael
The simultaneous measurement of thousands of genes gives the opportunity to personalize and improve cancer therapy. In addition, the integration of meta-data such as protein-protein interaction (PPI) information into the analyses helps in the identification and prioritization of genes from these screens. Here, we describe a computational approach that identifies genes prognostic for outcome by combining gene profiling data from any source with a network of known relationships between genes.
Waaijenborg, S.; Zwinderman, A.H.
ABSTRACT: BACKGROUND: We generalized penalized canonical correlation analysis for analyzing microarray gene-expression measurements for checking completeness of known metabolic pathways and identifying candidate genes for incorporation in the pathway. We used Wold's method for calculation of the
Costes, Sylvain V.
Goals to achieve for GeneLab AWG - GL vision - Review of GeneLab AWG charter Timeline and milestones for 2018 Logistics - Monthly Meeting - Workshop - Internship - ASGSR Introduction of team leads and goals of each group Introduction of all members Q/A Three-tier Client Strategy to Democratize Data Physiological changes, pathway enrichment, differential expression, normalization, processing metadata, reproducibility, Data federation/integration with heterogeneous bioinformatics external databases The GLDS currently serves over 100 omics investigations to the biomedical community via open access. In order to expand the scope of metadata record searches via the GLDS, we designed a metadata warehouse that collects and updates metadata records from external systems housing similar data. To demonstrate the capabilities of federated search and retrieval of these data, we imported metadata records from three open-access data systems into the GLDS metadata warehouse: NCBI's Gene Expression Omnibus (GEO), EBI's PRoteomics IDEntifications (PRIDE) repository, and the Metagenomics Analysis server (MG-RAST). Each of these systems defines metadata for omics data sets differently. One solution to bridge such differences is to employ a common object model (COM) to which each systems' representation of metadata can be mapped. Warehoused metadata records are then transformed at ETL to this single, common representation. Queries generated via the GLDS are then executed against the warehouse, and matching records are shown in the COM representation (Fig. 1). While this approach is relatively straightforward to implement, the volume of the data in the omics domain presents challenges in dealing with latency and currency of records. Furthermore, the lack of a coordinated has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta
Background: The tumor suppressor gene p53 is involved in multiple cellular pathways including apoptosis, transcriptional control, and cell cycle regulation. In the last decade it has been demonstrated that the single nucleotide polymorphism (SNP) at codon 72 of the p53 gene is associated with the risk for development of various neoplasms. MDM2 SNP309 is a single nucleotide T to G polymorphism located in the MDM2 gene promoter. From the time that this well-characterized functional polymorphism was identified, a variety of case-control studies have been published that investigate the possible association between MDM2 SNP309 and cancer risk. However, the results of the published studies, as well as the subsequent meta-analyses, remain contradictory. Methods: To investigate whether currently published epidemiological studies can clarify the potential interaction between MDM2 SNP309 and the functional genetic variant in p53 codon 72 (Arg72Pro) and p53 mutation status, we performed a meta-analysis of the risk estimate on 27,813 cases with various tumor types and 30,295 controls. Results: The data we reviewed indicated that variant homozygote 309GG and heterozygote 309TG were associated with a significantly increased risk of all tumor types (homozygote comparison: odds ratio (OR) = 1.25, 95% confidence interval (CI) = 1.13-1.37; heterozygote comparison: OR = 1.10, 95% CI = 1.03-1.17). We also found that the combination of GG and Pro/Pro, TG and Pro/Pro, GG and Arg/Arg significantly increased the risk of cancer (OR = 3.38, 95% CI = 1.77-6.47; OR = 1.88, 95% CI = 1.26-2.81; OR = 1.96, 95% CI = 1.01-3.78, respectively). In a stratified analysis by tumor location, we also found a significantly increased risk in brain, liver, stomach and uterus cancer (OR = 1.47, 95% CI = 1.06-2.03; OR = 2.24, 95% CI = 1.57-3.18; OR = 1.54, 95% CI = 1.04-2.29; OR = 1.34, 95% CI = 1.07-1.29, respectively). However, no association was seen between MDM2 SNP309 and tumor susceptibility
Donald R. Love
The role of gene deletion and duplication in the aetiology of disease has become increasingly evident over the last decade. In addition to the classical deletion/duplication disorders diagnosed using molecular techniques, such as Duchenne Muscular Dystrophy and Charcot-Marie-Tooth Neuropathy Type 1A, the significance of partial or whole gene deletions in the pathogenesis of a large number of single-gene disorders is becoming more apparent. A variety of dosage analysis methods are available to the diagnostic laboratory, but the widespread application of many of these techniques is limited by the expense of the kits/reagents and restrictive targeting to a particular gene or portion of a gene. These limitations are particularly important in the context of a small diagnostic laboratory with modest sample throughput. We have developed a gene-targeted, custom-designed comparative genomic hybridisation (CGH) array that allows twelve clinical samples to be interrogated simultaneously for exonic deletions/duplications within any gene (or panel of genes) on the array. We report here on the use of the array in the analysis of a series of clinical samples processed by our laboratory over a twelve-month period. The array has proven itself to be robust, flexible and highly suited to the diagnostic environment.
Background: Bifurcation analysis has proven to be a powerful method for understanding the qualitative behavior of gene regulatory networks. In addition to the more traditional forward problem of determining the mapping from parameter space to the space of model behavior, the inverse problem of determining model parameters to result in certain desired properties of the bifurcation diagram provides an attractive methodology for addressing important biological problems. These include understanding how the robustness of qualitative behavior arises from system design as well as providing a way to engineer biological networks with qualitative properties. Results: We demonstrate that certain inverse bifurcation problems of biological interest may be cast as optimization problems involving minimal distances of reference parameter sets to bifurcation manifolds. This formulation allows for an iterative solution procedure based on performing a sequence of eigen-system computations and one-parameter continuations of solutions, the latter being a standard capability in existing numerical bifurcation software. As applications of the proposed method, we show that the problem of maximizing regions of a given qualitative behavior as well as the reverse engineering of bistable gene switches can be modelled and efficiently solved.
be ambiguous, referring in some cases to more than one gene or one protein, or in others, to both genes and proteins at the same time. Public biological databases give a very useful insight about genes and proteins information, including their names
Zhao, Ye; Chen, Muyan; Wang, Tianming; Sun, Lina; Xu, Dongxue; Yang, Hongsheng
Quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR) is a technique that is widely used for gene expression analysis, and its accuracy depends on the expression stability of the internal reference genes used as normalization factors. However, many applications of qRT-PCR used housekeeping genes as internal controls without validation. In this study, the expression stability of eight candidate reference genes in three tissues (intestine, respiratory tree, and muscle) of the sea cucumber Apostichopus japonicus was assessed during normal growth and aestivation using the geNorm, NormFinder, delta CT, and RefFinder algorithms. The results indicate that the reference genes exhibited significantly different expression patterns among the three tissues during aestivation. In general, the β-tubulin (TUBB) gene was relatively stable in the intestine and respiratory tree tissues. The optimal reference gene combination for intestine was 40S ribosomal protein S18 (RPS18), TUBB, and NADH dehydrogenase (NADH); for respiratory tree, it was β-actin (ACTB), TUBB, and succinate dehydrogenase cytochrome B small subunit (SDHC); and for muscle it was α-tubulin (TUBA) and NADH dehydrogenase [ubiquinone] 1 α subcomplex subunit 13 (NDUFA13). These combinations of internal control genes should be considered for use in further studies of gene expression in A. japonicus during aestivation.
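As an illustration of what an expression-stability score looks like, the sketch below computes a geNorm-style M value: for each candidate gene, the average (over all other candidates) of the standard deviation of the log2 expression ratio across samples. This is a simplified reading of the geNorm measure, not the authors' pipeline; the function name, the use of the sample standard deviation, and the toy input are assumptions.

```python
import math
from statistics import stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for every candidate gene.

    expr maps gene name -> list of relative expression quantities on a
    linear scale, with samples in the same order for every gene.
    A lower M means a more stably expressed candidate reference gene.
    """
    genes = list(expr)
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # Log2 ratio of the two genes in each sample.
            log_ratios = [math.log2(x / y)
                          for x, y in zip(expr[gene_j], expr[gene_k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


def rank_by_stability(expr):
    """Genes ordered from most stable (lowest M) to least stable."""
    m_values = genorm_m(expr)
    return sorted(m_values, key=m_values.get)
```

Two genes whose ratio is constant across samples contribute a pairwise standard deviation of zero, which is why co-varying candidates score well; the full geNorm procedure additionally removes the worst gene stepwise and recomputes M, which is omitted here.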
Background: Published data regarding the associations between genetic variants and asthma risk in the Chinese population were inconclusive. The aim of this study was to investigate asthma susceptibility genes in the Chinese population. Methods: The authors conducted 18 meta-analyses for 18 polymorphisms in 13 genes from eighty-two publications. Results: Seven polymorphisms were found to be associated with risk of asthma, namely: A Disintegrin and Metalloprotease 33 (ADAM33) T1-C/T (odds ratio [OR] = 6.07, 95% confidence interval [CI]: 2.69-13.73), Angiotensin-Converting Enzyme (ACE) D/I (OR = 3.85, 95% CI: 2.49-5.94), High-affinity IgE receptor β chain (FcεRIβ) -6843G/A (OR = 1.49, 95% CI: 1.01-2.22), Interleukin 13 (IL-13) -1923C/T (OR = 2.99, 95% CI: 2.12-4.24), IL-13 -2044A/G (OR = 1.49, 95% CI: 1.07-2.08), Regulated upon Activation, Normal T cell Expressed and Secreted (RANTES) -28C/G (OR = 1.64, 95% CI: 1.09-2.46), and Tumor Necrosis Factor-α (TNF-α) -308G/A (OR = 1.42, 95% CI: 1.09-1.85). After subgroup analysis by age, the ACE D/I, β2-Adrenergic Receptor (β2-AR) -79G/C, TNF-α -308G/A, Interleukin 4 receptor (IL-4R) -1902G/A and IL-13 -1923C/T polymorphisms were found to be significantly associated with asthma risk in Chinese children. In addition, the ACE D/I, FcεRIβ -6843G/A, TNF-α -308G/A, IL-13 -1923C/T and IL-13 -2044A/G polymorphisms were associated with asthma risk in Chinese adults. Conclusion: ADAM33, FcεRIβ, RANTES, TNF-α, ACE, β2-AR, IL-4R and IL-13 genes could be proposed as asthma susceptibility genes in the Chinese population. Given the limited number of studies, more data are required to validate these associations.
Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo
In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo
Objective In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further
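The random walk with restart mentioned above can be sketched on a small unweighted network. This is a generic textbook formulation, not the authors' implementation: the function names, the restart probability of 0.3, and the convergence tolerance are assumptions, and the kernelized score functions and network integration steps from the abstract are not reproduced.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10,
                             max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts.

    adj maps each gene to the set of its neighbours (undirected network,
    so every edge must appear in both directions). seeds is the set of
    genes already associated with the disease. At every step the walker
    jumps back to a seed with probability `restart`, otherwise it moves
    to a uniformly chosen neighbour. Nodes without neighbours simply
    drop the probability mass that reaches them.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if not adj[u]:
                continue
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


def prioritize(adj, seeds, **kwargs):
    """Rank non-seed genes by decreasing random-walk score."""
    scores = random_walk_with_restart(adj, seeds, **kwargs)
    candidates = [n for n in scores if n not in seeds]
    return sorted(candidates, key=scores.get, reverse=True)
```

Genes close to the seed set in the network accumulate more probability mass, which is the guilt-by-association intuition behind the ranking; integrating several networks, as the abstract describes, would amount to running the same procedure on a combined adjacency structure.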
Naveen S. Khanzada
Bipolar disorder (BPD) and schizophrenia (SCH) show similar neuropsychiatric behavioral disturbances, including impaired social interaction and communication, seen in autism spectrum disorder (ASD), with multiple overlapping genetic and environmental influences implicated in risk and course of illness. GeneAnalytics software was used for pathway analysis and genetic profiling to characterize common susceptibility genes obtained from published lists for ASD (792 genes), BPD (290 genes) and SCH (560 genes). Rank scores were derived from the number and nature of overlapping genes, gene-disease association, tissue specificity and gene functions subdivided into categories (e.g., diseases, tissues or functional pathways). Twenty-three genes were common to all three disorders and mapped to nine biological Superpathways including Circadian entrainment (10 genes, score = 37.0), Amphetamine addiction (five genes, score = 24.2), and Sudden infant death syndrome (six genes, score = 24.1). Brain tissues included the medulla oblongata (11 genes, score = 2.1), thalamus (10 genes, score = 2.0) and hypothalamus (nine genes, score = 2.0) with six common genes (BDNF, DRD2, CHRNA7, HTR2A, SLC6A3, and TPH2). Overlapping genes impacted dopamine and serotonin homeostasis and signal transduction pathways, impacting mood, behavior and physical activity level. Converging effects on pathways governing circadian rhythms support a core etiological relationship between neuropsychiatric illnesses and sleep disruption with hypoxia and central brain stem dysfunction.
Gao, Wu-Jun; Li, Shu-Fen; Zhang, Guo-Jun; Wang, Ning-Na; Deng, Chuan-Liang; Lu, Long-Dou
To identify rapidly a number of genes probably involved in sex determination and differentiation of the dioecious plant Asparagus officinalis, gene expression profiles in early flower development for male and female plants were investigated by microarray assay with 8,665 probes. In total, 638 male-biased and 543 female-biased genes were identified. These genes with biased-expression for male and female were involved in a variety of processes associated with molecular functions, cellular components, and biological processes, suggesting that a complex mechanism underlies the sex development of asparagus. Among the differentially expressed genes involved in the reproductive process, a number of genes associated with floral development were identified. Reverse transcription-PCR was performed for validation, and the results were largely consistent with those obtained by microarray analysis. The findings of this study might contribute to understanding of the molecular mechanisms of sex determination and differentiation in dioecious asparagus and provide a foundation for further studies of this plant.
Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Olsen, Catharina; Bontempi, Gianluca; Castiglioni, Isabella
Modern high-throughput genomic technologies represent a comprehensive hallmark of molecular changes in pan-cancer studies. Although different cancer gene signatures have been revealed, the mechanism of tumourigenesis has yet to be completely understood. Pathways and networks are important tools to explain the role of genes in functional genomic studies. However, few methods consider the functional non-equal roles of genes in pathways and the complex gene-gene interactions in a network. We present a novel method in pan-cancer analysis that identifies de-regulated genes with a functional role by integrating pathway and network data. A pan-cancer analysis of 7158 tumour/normal samples from 16 cancer types identified 895 genes with a central role in pathways and de-regulated in cancer. Comparing our approach with 15 current tools that identify cancer driver genes, we found that 35.6% of the 895 genes identified by our method have been found as cancer driver genes with at least 2/15 tools. Finally, we applied a machine learning algorithm on 16 independent GEO cancer datasets to validate the diagnostic role of cancer driver genes for each cancer. We obtained a list of the top-ten cancer driver genes for each cancer considered in this study. Our analysis 1) confirmed that there are several known cancer driver genes in common among different types of cancer, 2) highlighted that cancer driver genes are able to regulate crucial pathways.
identification of genes related to sexual disparity in silk protein production efficiency. ... Ysh, a yellow cocoon color sex-limited strain of the silkworm B. mori, ...... alternative splicing of human genes. ... Structure, function and evolution of.
Chuen Yang Chua
Sep 5, 2017 ... The presence of purifying selection and low nucleotide diversity ... (2000) studied the gene substitution of ama1 ... in the gene coding for AMA-1 protein in Plasmodium ... Health Malaysia. ...... X. Asembo Bay Cohort Project.
Genes involved in myopathies: 82 genes, based on the disease groups ... 605517 Muscular dystrophy-dystroglycanopathy (congenital with brain and eye ..... Epilepsy, X-linked, with variable learning disabilities and behavior disorders. 300491.
CCL5 Chemokine (C-C motif) ligand 5 /RANTES. IFNγ Interferon gamma TNFα Tumor necrosis factor alpha HMGB1 High mobility group box 1 protein /high...aim of this study was to analyze gene expression levels of human host factors in melioidosis patients and establish useful correlation with disease...PBMC’s) of study subjects. Gene expression profiles of 25 gene targets including 19 immune response genes and 6 epigenetic factors were analyzed by
Clark Taane G
Background: Imprinted genes show expression from one parental allele only and are important for development and behaviour. This extreme mode of allelic imbalance has been described for approximately 56 human genes. Imprinting status is often disrupted in cancer and dysmorphic syndromes. More subtle variation of gene expression that is not parent-of-origin specific, termed 'allele-specific gene expression' (ASE), is more common and may give rise to milder phenotypic differences. Using two allele-specific high-throughput technologies alongside bioinformatics predictions, normal term human placenta was screened to find new imprinted genes and to ascertain the extent of ASE in this tissue. Results: Twenty-three family trios of placental cDNA, placental genomic DNA (gDNA) and gDNA from both parents were tested for 130 candidate genes with the Sequenom MassArray system. Six genes were found differentially expressed but none imprinted. The Illumina ASE BeadArray platform was then used to test 1536 SNPs in 932 genes. The array was enriched for the human orthologues of 124 mouse candidate genes from bioinformatics predictions and 10 human candidate imprinted genes from EST database mining. After quality control pruning, a total of 261 informative SNPs (214 genes) remained for analysis. Imprinting with maternal expression was demonstrated for the lymphocyte imprinted gene ZNF331 in human placenta. Two potential differentially methylated regions (DMRs) were found in the vicinity of ZNF331. None of the bioinformatically predicted candidates tested showed imprinting except for a skewed allelic expression in a parent-specific manner observed for PHACTR2, a neighbour of the imprinted PLAGL1 gene. ASE was detected for two or more individuals in 39 candidate genes (18%). Conclusions: Both Sequenom and Illumina assays were sensitive enough to study imprinting and strong allelic bias. Previous bioinformatics approaches were not predictive of new imprinted genes
Stipa grandis P. Smirn. is a dominant plant species in the typical steppe of the Xilingole Plateau of Inner Mongolia. Selection of suitable reference genes for the quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) is important for gene expression analysis and research into the molecular mechanisms underlying the stress responses of S. grandis. In the present study, 15 candidate reference genes (EF1beta, ACT, GAPDH, SamDC, CUL4, CAP, SNF2, SKIP1, SKIP5, SKIP11, UBC2, UBC15, UBC17, UCH, and HERC2) were evaluated for their stability as potential reference genes for qRT-PCR under different stresses. Four algorithms were used: GeNorm, NormFinder, BestKeeper, and RefFinder. The results showed that the most stable reference genes were different under different stress conditions: EF1beta and UBC15 during drought and salt stresses; ACT and GAPDH under heat stress; SKIP5 and UBC17 under cold stress; UBC15 and HERC2 under high pH stress; UBC2 and UBC15 under wounding stress; EF1beta and UBC17 under jasmonic acid treatment; UBC15 and CUL4 under abscisic acid treatment; and HERC2 and UBC17 under salicylic acid treatment. EF1beta and HERC2 were the most suitable genes for the global analysis of all samples. Furthermore, six target genes, SgPOD, SgPAL, SgLEA, SgLOX, SgHSP90 and SgPR1, were selected to validate the most and least stable reference genes under different treatments. Our results provide guidelines for reference gene selection for more accurate qRT-PCR quantification and will promote studies of gene expression in S. grandis subjected to environmental stress.
Martin, Guiomar; Soy, Judit; Monte, Elena
Members of the PIF quartet (PIFq; PIF1, PIF3, PIF4, and PIF5) collectively contribute to induce growth in Arabidopsis seedlings under short day (SD) conditions, specifically promoting elongation at dawn. Their action involves the direct regulation of growth-related and hormone-associated genes. However, a comprehensive definition of the PIFq-regulated transcriptome under SD is still lacking. We have recently shown that SD and free-running (LL) conditions correspond to "growth" and "no growth" conditions, respectively, correlating with greater abundance of PIF protein in SD. Here, we present a genomic analysis whereby we first define SD-regulated genes at dawn compared to LL in the wild type, followed by identification of those SD-regulated genes whose expression depends on the presence of PIFq. By using this sequential strategy, we have identified 349 PIF/SD-regulated genes, approximately 55% induced and 42% repressed by both SD and PIFq. Comparison with available databases indicates that PIF/SD-induced and PIF/SD-repressed sets are differently phased at dawn and mid-morning, respectively. In addition, we found that whereas rhythmicity of the PIF/SD-induced gene set is lost in LL, most PIF/SD-repressed genes keep their rhythmicity in LL, suggesting differential regulation of both gene sets by the circadian clock. Moreover, we also uncovered distinct overrepresented functions in the induced and repressed gene sets, in accord with previous studies in other examined PIF-regulated processes. Interestingly, promoter analyses showed that, whereas PIF/SD-induced genes are enriched in direct PIF targets, PIF/SD-repressed genes are mostly indirectly regulated by the PIFs and might be more enriched in ABA-regulated genes.
Cumbie, Jason S; Kimbrel, Jeffrey A; Di, Yanming; Schafer, Daniel W; Wilhelm, Larry J; Fox, Samuel E; Sullivan, Christopher M; Curzon, Aron D; Carrington, James C; Mockler, Todd C; Chang, Jeff H
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
Liu, Wei; Pan, Lei; Zhang, Minlong; Bo, Liyan; Li, Congcong; Liu, Qingqing; Wang, Li; Jin, Faguang
Seawater aspiration-induced acute lung injury (ALI) is a syndrome associated with a high mortality rate, which is characterized by severe hypoxemia, pulmonary edema and inflammation. The present study is the first, to the best of our knowledge, to analyze gene expression profiles from a rat model of seawater aspiration-induced ALI. Adult male Sprague-Dawley rats were instilled with seawater (4 ml/kg) in the seawater aspiration-induced ALI group (S group) or with distilled water (4 ml/kg) in the distilled water negative control group (D group). In the blank control group (C group) the rats' tracheae were exposed without instillation. Subsequently, lung samples were examined by histopathology; total protein concentration was detected in bronchoalveolar lavage fluid (BALF); lung wet/dry weight ratios were determined; and transcript expression was detected by gene sequencing analysis. The results demonstrated that histopathological alterations, pulmonary edema and total protein concentrations in BALF were increased in the S group compared with in the D group. Analysis of differential gene expression identified up and downregulated genes in the S group compared with in the D and C groups. A gene ontology analysis of the differential gene expression revealed enrichment of genes in the functional pathways associated with neutrophil chemotaxis, immune and defense responses, and cytokine activity. Kyoto Encyclopedia of Genes and Genomes analysis revealed that the cytokine-cytokine receptor interaction pathway was one of the most important pathways involved in seawater aspiration-induced ALI. In conclusion, activation of the cytokine-cytokine receptor interaction pathway may have an essential role in the progression of seawater aspiration-induced ALI, and the downregulation of tumor necrosis factor superfamily member 10 may enhance inflammation. Furthermore, IL-6 may be considered a biomarker in seawater aspiration-induced ALI. PMID:27509884
Tan, Jie; Huyck, Matthew; Hu, Dongbo; Zelaya, René A; Hogan, Deborah A; Greene, Casey S
Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data. Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server ( http://adage.greenelab.com ) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr. We designed
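The overrepresentation analysis that the abstract contrasts with ADAGE signatures is commonly implemented as a one-sided hypergeometric test. The sketch below shows that conventional test only; it is not the ADAGEpath method, the function names are hypothetical, and multiple-testing correction across gene sets is left out.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric random variable X.

    population: number of genes in the background
    successes:  number of background genes that belong to the gene set
    draws:      number of differentially expressed (DE) genes
    k:          number of DE genes that belong to the gene set
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(comb(successes, i) * comb(population - successes,
                                               draws - i)
                     for i in range(k, upper + 1))
    return favourable / total


def overrepresentation_p(de_genes, gene_set, background):
    """Enrichment p-value of one gene set among the DE genes."""
    background = set(background)
    de = set(de_genes) & background
    members = set(gene_set) & background
    overlap = len(de & members)
    return hypergeom_upper_tail(overlap, len(background), len(members),
                                len(de))
```

The test depends entirely on having a curated `gene_set` to begin with, which is the limitation the abstract raises for organisms without extensive annotation.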
Masami, NAKAJIMA; Junko, SUZUKI; Takehiko, HOSAKA; Tadaaki, HIBI; Katsumi, AKUTSU; School of Agriculture, Ibaraki University; School of Agriculture, Ibaraki University; School of Agriculture, Ibaraki University; Department of Agriculture and Environmental Biology, The University of Tokyo; School of Agriculture, Ibaraki University
The BMR1 gene encoding an ABC transporter was cloned from Botrytis cinerea. To examine the function of BMR1 in B. cinerea, we isolated BMR1-deficient mutants after gene disruption. Disruption vector pBcDF4 was constructed by replacing the BMR1-coding region with a hygromycin B phosphotransferase gene (hph) cassette. The BMR1 disruptants had an increased sensitivity to polyoxin and iprobenfos. Polyoxin and iprobenfos, structurally unrelated compounds, may therefore be substrates of BMR1.
Escott-Price, Valentina; Bellenguez, Céline; Wang, Li-San; Choi, Seung-Hoan; Harold, Denise; Jones, Lesley; Holmans, Peter Alan; Gerrish, Amy; Vedernikov, Alexey; Richards, Alexander; DeStefano, Anita L.; Lambert, Jean-Charles; Ibrahim-Verbaas, Carla A.; Naj, Adam C.; Sims, Rebecca
BACKGROUND: Alzheimer's disease is a common debilitating dementia with known heritability, for which 20 late onset susceptibility loci have been identified, but more remain to be discovered. This study sought to identify new susceptibility genes, using an alternative gene-wide analytical approach which tests for patterns of association within genes, in the powerful genome-wide association dataset of the International Genomics of Alzheimer's Project Consortium, comprising over...
Singh, Pramesh; Chen, Tianlong; Arendsee, Zebulun; Wurtele, Eve S.; Bassler, Kevin E.
Orphan genes, which are genes unique to each particular species, have recently drawn significant attention for their potential usefulness for organismal robustness. Their origin and regulatory interaction patterns remain largely undiscovered. Recently, methods that use the context likelihood of relatedness to infer a network followed by modularity maximizing community detection algorithms on the inferred network to find the functional structure of regulatory networks were shown to be effective. We apply improved versions of these methods to gene expression data from Arabidopsis thaliana, identify groups (clusters) of interacting genes with related patterns of expression and analyze the structure within those groups. Focusing on clusters that contain orphan genes, we compare the identified clusters to gene ontology (GO) terms, regulons, and pathway designations and analyze their hierarchical structure. We predict new regulatory interactions and unravel the structure of the regulatory interaction patterns of orphan genes. Work supported by the NSF through Grants DMR-1507371 and IOS-1546858.
Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu
The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, known as the TCP domain, which is implicated in DNA binding and protein-protein interactions. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) has been conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. Moreover, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in the developmental stages of plant growth. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
Pessina, Stefano; Pavan, Stefano; Catalano, Domenico; Gallotta, Alessandra; Visser, Richard G F; Bai, Yuling; Malnoy, Mickael; Schouten, Henk J
Powdery mildew (PM) is a major fungal disease of thousands of plant species, including many cultivated Rosaceae. PM pathogenesis is associated with up-regulation of MLO genes during early stages of infection, causing down-regulation of plant defense pathways. Specific members of the MLO gene family act as PM-susceptibility genes, as their loss-of-function mutations grant durable and broad-spectrum resistance. We carried out a genome-wide characterization of the MLO gene family in apple, peach and strawberry, and we isolated apricot MLO homologs through a PCR approach. Evolutionary relationships between MLO homologs were studied and syntenic blocks constructed. Homologs that are candidates for being PM susceptibility genes were inferred by phylogenetic relationships with functionally characterized MLO genes and, in apple, by monitoring their expression following inoculation with the PM causal pathogen Podosphaera leucotricha. Genomic tools available for Rosaceae were exploited in order to characterize the MLO gene family. Candidate MLO susceptibility genes were identified. Follow-up studies can investigate whether silencing of, or loss-of-function mutations in, one or more of these candidate genes leads to PM resistance.
Background: Global regulatory mechanisms involving chromatin assembly and remodelling in the promoter regions of genes are implicated in eukaryotic transcription control, especially for genes subjected to spatial and temporal regulation. The potential to utilise global regulatory mechanisms for controlling gene expression might depend upon the architecture of the chromatin in and around the gene. In-silico analysis can yield important insights into this aspect, facilitating comparison of two or more classes of genes, each comprising a large number of genes. Results: In the present study, we carried out a comparative analysis of chromatin characteristics in terms of the scaffold/matrix attachment regions, nucleosome formation potential and the occurrence of repetitive sequences, in the upstream regulatory regions of housekeeping and tissue specific genes. Our data show that putative scaffold/matrix attachment regions are more abundant and nucleosome formation potential is higher in the 5' regions of tissue specific genes as compared to the housekeeping genes. Conclusion: The differences in the chromatin features between the two groups of genes indicate the involvement of chromatin organisation in the control of gene expression. The presence of global regulatory mechanisms mediated through chromatin organisation can decrease the burden of invoking gene specific regulators for maintenance of the active/silenced state of gene expression. This could partially explain the lower number of genes estimated in the human genome.
Dong, Yang; Li, Ming; Liu, Puzhao; Song, Haiyan; Zhao, Yuping; Shi, Jianrong
Genes involved in immunity and apoptosis were associated with human presbycusis. CCR3 and GILZ played an important role in the pathogenesis of presbycusis, probably through regulating chemokine receptor, T-cell apoptosis, or T-cell activation pathways. To identify genes associated with human presbycusis and explore the molecular mechanism of presbycusis. Hearing function was tested by pure-tone audiometry. Microarray analysis was performed to identify presbycusis-correlated genes by Illumina Human-6 BeadChip using the peripheral blood samples of subjects. To identify biological process categories and pathways associated with presbycusis-correlated genes, bioinformatics analysis was carried out by Gene Ontology Tree Machine (GOTM) and database for annotation, visualization, and integrated discovery (DAVID). Quantitative RT-PCR (qRT-PCR) was used to validate the microarray data. Microarray analysis identified 469 up-regulated genes and 323 down-regulated genes. Both the dominant biological processes by Gene Ontology (GO) analysis and the enriched pathways by Kyoto encyclopedia of genes and genomes (KEGG) and BIOCARTA showed that genes involved in immunity and apoptosis were associated with presbycusis. In addition, CCR3, GILZ, CXCL10, and CX3CR1 genes showed consistent difference between groups for both the gene chip and qRT-PCR data. The differences of CCR3 and GILZ between presbycusis patients and controls were statistically significant (p < 0.05).
The expression and regulation of genes in different tissues are fundamental questions to be answered in biology. Knowledge enrichment analysis for tissue specific (TS) and housekeeping (HK) genes may help identify their roles in biological processes or diseases and yield new biological insights. In this paper, we performed knowledge enrichment analysis for 17,343 genes in 84 human tissues using Gene Set Enrichment Analysis (GSEA) and Hypergeometric Analysis (HA) against three biological ontologies: Gene Ontology (GO), KEGG pathways and Disease Ontology (DO). The analysis results demonstrated that the functions of most gene groups are consistent with their tissue origins. Meanwhile, three interesting new associations for HK genes and the skeletal muscle tissue genes were found. Firstly, hypergeometric analysis against the KEGG database for HK genes disclosed that three disease terms (Parkinson’s disease, Huntington’s disease, Alzheimer’s disease) are strongly enriched. Secondly, hypergeometric analysis against the KEGG database for skeletal muscle tissue genes shows that two cardiac diseases, “Hypertrophic cardiomyopathy (HCM)” and “Arrhythmogenic right ventricular cardiomyopathy (ARVC)”, are heavily enriched, although they are considered to have no relationship with skeletal functions. Thirdly, “Prostate cancer” is strongly enriched in hypergeometric analysis against the Disease Ontology (DO) for the skeletal muscle tissue genes, which is a highly unexpected phenomenon.
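The hypergeometric enrichment test mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so the function name and all counts other than the 17,343-gene background are hypothetical and chosen only for illustration. The function returns the upper-tail probability of seeing at least `overlap` annotated genes in a gene group drawn without replacement from the background.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of background genes
    annotated:  background genes carrying the term (e.g. a KEGG pathway)
    sample:     size of the gene group being tested
    overlap:    genes in the group that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 17,343 background genes (from the abstract),
# 120 of them annotated with a term, a tissue gene group of 300 genes,
# 9 of which carry the term. Only the background size comes from the text.
p_value = hypergeom_enrichment_p(17343, 120, 300, 9)
print(f"enrichment p-value: {p_value:.3g}")
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for small tails; for many terms a multiple-testing correction would normally follow, which this sketch leaves out.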
Flavonoids, the compounds that impart color to fruits, flowers, and seeds, are the most widespread secondary metabolites in plants. However, a systematic analysis of these loci has not been performed in Brassicaceae. In this study, we isolated 649 nucleotide sequences related to flavonoid biosynthesis, i.e., the Transparent Testa (TT) genes, and their associated amino acid sequences in 17 Brassicaceae species, grouped into Arabidopsis or Brassicaceae subgroups. Moreover, 36 copies of 21 genes of the flavonoid biosynthesis pathway were identified in A. thaliana, 53 in B. rapa, 50 in B. oleracea, and 95 in B. napus, followed by analysis of their genomic distribution, collinearity and gene triplication among Brassicaceae species. The results showed that extensive gene loss, whole genome triplication, and diploidization occurred after divergence from the common ancestor. Using qRT-PCR, we analyzed the expression of eighteen flavonoid biosynthesis genes in 6 yellow- and black-seeded B. napus inbred lines with different genetic backgrounds and found that 12 of them were preferentially expressed during seed development, whereas the remaining genes were expressed in all B. napus tissues examined. Moreover, fourteen of these genes showed significant differences in expression level during seed development, and all but four of these (i.e., BnTT5, BnTT7, BnTT10, and BnTTG1) had similar expression patterns among the yellow- and black-seeded B. napus. The results showed that the structural genes (BnTT3, BnTT18 and BnBAN), regulatory genes (BnTTG2 and BnTT16) and three genes encoding transfer proteins (BnTT12, BnTT19, and BnAHA10) might play crucial roles in the formation of different seed coat colors in B. napus. These data will be helpful for illustrating the molecular mechanisms of flavonoid biosynthesis in Brassicaceae species.
Lee Bernett TK
Background: Genes are not randomly distributed on a chromosome, as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and was recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
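One generic way to ask whether the number of clusters is "higher than expected by chance" is a permutation test on gene order. The abstract does not describe the authors' procedure, so the sketch below is an assumption throughout: genes are reduced to their rank along a chromosome, a cluster is a run of at least two adjacent tissue-specific genes, and the function names and example indices are hypothetical.

```python
import random


def count_clusters(positions, min_size=2):
    """Count runs of consecutive gene indices with at least `min_size` members."""
    clusters = 0
    run = 0
    previous = None
    for p in sorted(positions):
        if previous is not None and p == previous + 1:
            run += 1  # extends the current run of adjacent genes
        else:
            if run >= min_size:
                clusters += 1  # close the previous run
            run = 1
        previous = p
    if run >= min_size:
        clusters += 1  # close the final run
    return clusters


def permutation_p_value(n_genes, specific, n_perm=1000, seed=0):
    """Compare the observed cluster count with randomly placed gene sets."""
    rng = random.Random(seed)
    observed = count_clusters(specific)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.sample(range(n_genes), len(specific))
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical chromosome with 1,000 ordered genes, six of them tissue-specific.
observed, p_value = permutation_p_value(1000, [1, 2, 3, 10, 20, 21], n_perm=200)
print(observed, p_value)
```

Tandem duplicates would have to be collapsed before building the index list, as the abstract indicates; that preprocessing step is not shown here.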
Deng, Li-Ting; Wu, Yu-Ling; Li, Jun-Cheng; OuYang, Kun-Xi; Ding, Mei-Mei; Zhang, Jun-Jie; Li, Shu-Qi; Lin, Meng-Fei; Chen, Han-Bin; Hu, Xin-Sheng; Chen, Xiao-Yang
Moringa oleifera is a promising plant species for oil and forage, but its genetic improvement is limited. Our current breeding program in this species focuses on exploiting the functional genes associated with important agronomical traits. Here, we screened reliable reference genes for accurately quantifying the expression of target genes using the technique of real-time quantitative polymerase chain reaction (RT-qPCR) in M. oleifera. Eighteen candidate reference genes were selected from a transcriptome database, and their expression stabilities were examined in 90 samples collected from the pods in different developmental stages, various tissues, and the roots and leaves under different conditions (low or high temperature, sodium chloride (NaCl)- or polyethyleneglycol (PEG)- simulated water stress). Analyses with geNorm, NormFinder and BestKeeper algorithms revealed that the reliable reference genes differed across sample designs and that ribosomal protein L1 (RPL1) and acyl carrier protein 2 (ACP2) were the most suitable reference genes in all tested samples. The experiment results demonstrated the significance of using the properly validated reference genes and suggested the use of more than one reference gene to achieve reliable expression profiles. In addition, we applied three isotypes of the superoxide dismutase (SOD) gene that are associated with plant adaptation to abiotic stress to confirm the efficacy of the validated reference genes under NaCl and PEG water stresses. Our results provide a valuable reference for future studies on identifying important functional genes from their transcriptional expressions via RT-qPCR technique in M. oleifera.
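As an illustration of how a reference-gene stability ranking can be computed, the sketch below follows the commonly described geNorm idea: for each candidate gene, take the standard deviation of its log2 expression ratio against every other candidate across samples, and average those values into a stability measure M (lower means more stable). This is a simplified reading of the published method rather than the geNorm software itself, and the gene names and relative quantities are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def genorm_m_values(expression):
    """Return a geNorm-style stability measure M for each candidate gene.

    expression: dict mapping gene name -> list of relative quantities,
                one value per sample, in the same sample order for all genes.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_variation = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                log2(a / b) for a, b in zip(expression[gene], expression[other])
            ]
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_variation)
    return m_values


# Hypothetical relative quantities for three candidates in four samples.
# ref_A and ref_B keep a constant ratio; ref_C fluctuates against both.
quantities = {
    "ref_A": [1.0, 2.0, 4.0, 8.0],
    "ref_B": [2.0, 4.0, 8.0, 16.0],
    "ref_C": [1.0, 4.0, 2.0, 9.0],
}
stability = genorm_m_values(quantities)
for gene, m in sorted(stability.items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

Because M is built from pairwise ratios, two co-regulated genes can look stable together; this is one reason studies such as the one above combine several algorithms before choosing reference genes.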
Background: The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets. Results: Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies were significant with respect to the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in studies where toxicity induced by CNTs and CB was investigated, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles. Conclusion: The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several
Clive H Glover
Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.
Van Assche, Evelien; Moons, Tim; Cinar, Ozan; Viechtbauer, Wolfgang; Oldehinkel, Albertine J.; Van Leeuwen, Karla; Verschueren, Karine; Colpin, Hilde; Lambrechts, Diether; Van den Noortgate, Wim; Goossens, Luc; Claes, Stephan; van Winkel, Ruud
BACKGROUND: Most gene-environment interaction studies (G × E) have focused on single candidate genes. This approach is criticized for its expectations of large effect sizes and occurrence of spurious results. We describe an approach that accounts for the polygenic nature of most psychiatric
Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao
Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We inputted our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used
An, L; Xie, H; Chin, MH; Obradovic, Z; Smith, DJ; Megalooikonomou, V
Abstract Background Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we presen...
Gene set analysis is widely used to facilitate biological interpretations in the analyses of differential expression from high throughput profiling data. The Wilcoxon Rank-Sum (WRS) test is one of the commonly used methods in gene set enrichment analysis. It compares the ranks of genes in a gene set against those of genes outside the gene set. This method is easy to implement and it eliminates the dichotomization of genes into significant and non-significant in a competitive hypothesis testing. Due to the large number of genes being examined, it is impractical to calculate the exact null distribution for the WRS test. Therefore, the normal distribution is commonly used as an approximation. However, as we demonstrate in this paper, the normal approximation is problematic when a gene set with a relatively small number of genes is tested against the large number of genes in the complementary set. In this situation, a uniform approximation is substantially more powerful, more accurate, and less intensive in computation. We demonstrate the advantage of the uniform approximations in Gene Ontology (GO) term analysis using simulations and real data sets.
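For orientation, the normal approximation that the abstract above critiques can be sketched as follows. This is the textbook rank-sum z-score with midranks for ties, without continuity or tie-variance corrections; it is not the uniform approximation the authors propose, whose details the abstract does not give. The function names and gene-level scores are hypothetical.

```python
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_normal_approx(in_set, out_set):
    """Wilcoxon rank-sum z-score and two-sided p-value (normal approximation)."""
    n1, n2 = len(in_set), len(out_set)
    ranks = midranks(list(in_set) + list(out_set))
    w = sum(ranks[:n1])  # rank sum of the gene set
    expected = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - expected) / sqrt(variance)
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided


# Hypothetical gene-level scores: a small gene set versus its complement.
z, p = rank_sum_normal_approx([1, 2, 3], [4, 5, 6, 7])
print(round(z, 3), round(p, 4))
```

With a gene set this small the normal curve is only a rough stand-in for the exact null distribution, which is the regime the abstract identifies as problematic.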
have employed a large mRNA-seq data set to improve and validate ab initio predicted gene models. This direct experimental evidence also provides reliable determinations of UTR regions and polyadenylation sites, which are not easily predicted in plants. Furthermore, once an annotated genome sequence...... is available, gene expression by mRNA-Seq enables acquisition of a more complete overview of gene isoform usage in complex enzymatic pathways enabling the identification of key genes. Metabolism in potatoes This information is useful e.g. for crop improvement based on manipulation of agronomically important...
The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2) of five species: horse (89%), human (90%), chimpanzee (89%), rhesus monkey (89%) and mouse (85%); thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.
U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...
Wheat seed development is an important physiological process of seed maturation and directly affects wheat yield and quality. In this study, we performed dynamic transcriptome microarray analysis of an elite Chinese bread wheat cultivar (Jimai 20) during grain development using the GeneChip Wheat Genome Array. Grain morphology and scanning electron microscope observations showed that the period of 11–15 days post-anthesis (DPA) was a key stage for the synthesis and accumulation of seed starch. Genome-wide transcriptional profiling and significance analysis of microarrays revealed that the period from 11 to 15 DPA was more important than the 15–20 DPA stage for the synthesis and accumulation of nutritive reserves. Series test of cluster analysis of differential genes revealed five statistically significant gene expression profiles. Gene ontology annotation and enrichment analysis gave further information about differentially expressed genes, and MapMan analysis revealed expression changes within functional groups during seed development. Metabolic pathway network analysis showed that major and minor metabolic pathways regulate one another to ensure regular seed development and nutritive reserve accumulation. We performed gene co-expression network analysis to identify genes that play vital roles in seed development and identified several key genes involved in important metabolic pathways. The transcriptional expression of eight key genes involved in starch and protein synthesis and stress defense was further validated by qRT-PCR. Our results provide new insight into the molecular mechanisms of wheat seed development and the determinants of yield and quality.
Being a sister species of Saccharomyces cerevisiae, Saccharomyces uvarum shows great potential regarding the future of the wine industry. The sulfite tolerance of most S. uvarum strains is poor, however. This is a major flaw that limits its utility in the wine industry. In S. cerevisiae, FZF1 plays a positive role in the transcription of SSU1, which encodes a sulfite efflux transport protein that is critical for sulfite tolerance. Although FZF1 has previously been shown to play a role in sulfite tolerance in S. uvarum, there is little information about its mechanism of action. To assess the function of FZF1, two over-expression vectors that contained different FZF1 genes, and one FZF1 silencing vector, were constructed and introduced into a sulfite-tolerant S. uvarum strain using electroporation. In addition, an FZF1-deletion strain was constructed. Both of the FZF1-over-expressing strains showed an elevated tolerance to sulfite, and the FZF1-deletion strain showed the opposite effect. Repression of FZF1 transcription failed, however, presumably due to the lack of alleles of DCR1 and AGO. qRT-PCR analysis was used to examine changes in transcription in the strains. Surprisingly, neither over-expressing strain promoted SSU1 transcription, although MET4 and HAL4 transcripts significantly increased in both strains with increased sulfite tolerance. We conclude that FZF1 plays a different role in the sulfite tolerance of S. uvarum compared to its role in S. cerevisiae.
Hermsen, Sanne A.B., E-mail: Sanne.Hermsen@rivm.nl [Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven (Netherlands); Department of Toxicogenomics, Maastricht University, P.O. Box 616, 6200 MD, Maastricht (Netherlands); Institute for Risk Assessment Sciences (IRAS), Utrecht University, P.O. Box 80.178, 3508 TD, Utrecht (Netherlands); Pronk, Tessa E. [Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven (Netherlands); Department of Toxicogenomics, Maastricht University, P.O. Box 616, 6200 MD, Maastricht (Netherlands); Brandhof, Evert-Jan van den [Centre for Environmental Quality, National Institute for Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven (Netherlands); Ven, Leo T.M. van der [Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven (Netherlands); Piersma, Aldert H. [Centre for Health Protection, National Institute for Public Health and the Environment (RIVM), P.O. Box 1, 3720 BA Bilthoven (Netherlands); Institute for Risk Assessment Sciences (IRAS), Utrecht University, P.O. Box 80.178, 3508 TD, Utrecht (Netherlands)
The zebrafish embryotoxicity test is a promising alternative assay for developmental toxicity. Classically, morphological assessment of the embryos is applied to evaluate the effects of compound exposure. However, by applying differential gene expression analysis the sensitivity and predictability of the test may be increased. For defining gene expression signatures of developmental toxicity, we explored the possibility of using gene expression signatures of compound exposures based on commonly expressed individual genes as well as based on regulated gene pathways. Four developmental toxic compounds were tested in concentration-response design, caffeine, carbamazepine, retinoic acid and valproic acid, and two non-embryotoxic compounds, D-mannitol and saccharin, were included. With transcriptomic analyses we were able to identify commonly expressed genes, which were mostly development related, after exposure to the embryotoxicants. We also identified gene pathways regulated by the embryotoxicants, suggestive of their modes of action. Furthermore, whereas pathways may be regulated by all compounds, individual gene expression within these pathways can differ for each compound. Overall, the present study suggests that the use of individual gene expression signatures as well as pathway regulation may be useful starting points for defining gene biomarkers for predicting embryotoxicity. - Highlights: • The zebrafish embryotoxicity test in combination with transcriptomics was used. • We explored two approaches of defining gene biomarkers for developmental toxicity. • Four compounds in concentration-response design were tested. • We identified commonly expressed individual genes as well as regulated gene pathways. • Both approaches seem suitable starting points for defining gene biomarkers.
Susanta K Behura
Genome sequencing projects have presented the opportunity for analysis of developmental genes in three vector mosquito species: Aedes aegypti, Culex quinquefasciatus, and Anopheles gambiae. A comparative genomic analysis of developmental genes in Drosophila melanogaster and these three important vectors of human disease was performed in this investigation. While the study was comprehensive, special emphasis centered on genes that (1) are components of developmental signaling pathways, (2) regulate fundamental developmental processes, (3) are critical for the development of tissues of vector importance, (4) function in developmental processes known to have diverged within insects, and (5) encode microRNAs (miRNAs) that regulate developmental transcripts in Drosophila. While most fruit fly developmental genes are conserved in the three vector mosquito species, several genes known to be critical for Drosophila development were not identified in one or more mosquito genomes. In other cases, mosquito lineage-specific gene gains with respect to D. melanogaster were noted. Sequence analyses also revealed that numerous repetitive sequences are a common structural feature of Drosophila and mosquito developmental genes. Finally, analysis of predicted miRNA binding sites in fruit fly and mosquito developmental genes suggests that the repertoire of developmental genes targeted by miRNAs is species-specific. The results of this study provide insight into the evolution of developmental genes and processes in dipterans and other arthropods, serve as a resource for those pursuing analysis of mosquito development, and will promote the design and refinement of functional analysis experiments.
Wang, Y H; Garvin, D F; Kochian, L V
A subtractive tomato (Lycopersicon esculentum) root cDNA library enriched in genes up-regulated by changes in plant mineral status was screened with labeled mRNA from roots of both nitrate-induced and mineral nutrient-deficient (-nitrogen [N], -phosphorus, -potassium [K], -sulfur, -magnesium, -calcium, -iron, -zinc, and -copper) tomato plants. A subset of cDNAs was selected from this library based on mineral nutrient-related changes in expression. Additional cDNAs were selected from a second mineral-deficient tomato root library based on sequence homology to known genes. These selection processes yielded a set of 1,280 mineral nutrition-related cDNAs that were arrayed on nylon membranes for further analysis. These high-density arrays were hybridized with mRNA from tomato plants exposed to nitrate at different time points after N was withheld for 48 h, for plants that were grown on nitrate/ammonium for 5 weeks prior to the withholding of N. One hundred-fifteen genes were found to be up-regulated by nitrate resupply. Among these genes were several previously identified as nitrate responsive, including nitrate transporters, nitrate and nitrite reductase, and metabolic enzymes such as transaldolase, transketolase, malate dehydrogenase, asparagine synthetase, and histidine decarboxylase. We also identified 14 novel nitrate-inducible genes, including: (a) water channels, (b) root phosphate and K(+) transporters, (c) genes potentially involved in transcriptional regulation, (d) stress response genes, and (e) ribosomal protein genes. In addition, both families of nitrate transporters were also found to be inducible by phosphate, K, and iron deficiencies. The identification of these novel nitrate-inducible genes is providing avenues of research that will yield new insights into the molecular basis of plant N nutrition, as well as possible networking between the regulation of N, phosphorus, and K nutrition.
Tejera, Eduardo; Cruz-Monteagudo, Maykel; Burgos, Germán; Sánchez, María-Eugenia; Sánchez-Rodríguez, Aminael; Pérez-Castillo, Yunierkis; Borges, Fernanda; Cordeiro, Maria Natália Dias Soeiro; Paz-Y-Miño, César; Rebelo, Irene
Preeclampsia is a multifactorial disease with unknown pathogenesis. Although recent studies have explored this disease using several bioinformatics tools, their main objective was not directed at pathogenesis. Additionally, consensus prioritization has proved to be highly efficient in the recognition of gene-disease associations. However, no information is available about the ability of consensus prioritization to recognize early the genes directly involved in pathogenesis. Therefore, our aim in this study is to apply several theoretical approaches to explore preeclampsia, specifically those genes directly involved in the pathogenesis. We first evaluated the consensus between 12 prioritization strategies for early recognition of pathogenic genes related to preeclampsia. A communality analysis in the protein-protein interaction network of previously selected genes was done, including further enrichment analysis. The enrichment analysis includes metabolic pathways as well as gene ontology. Microarray data were also collected and used in order to confirm our results or as a strategy to weight the previously enriched pathways. The consensus prioritized gene list was rationally filtered to 476 genes using several criteria. The communality analysis showed an enrichment of communities connected with the VEGF-signaling pathway. This pathway is also enriched considering the microarray data. Our results point to VEGF, FLT1 and KDR as relevant pathogenic genes, as well as those connected with NO metabolism. Our results revealed that the consensus strategy improves the detection and initial enrichment of pathogenic genes, at least in preeclampsia. Moreover, the combination of the first percent of the prioritized genes with the protein-protein interaction network, followed by communality analysis, reduces the gene space. This approach identifies well-known genes related to pathogenesis. However, genes like HSP90, PAK2, CD247 and others included in the first 1% of the prioritized list need to be further
Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI Genbank database.
Pollen staining test with 1% I2KI solution showed segregation ratio of 15:1 (fertile: sterile), representing two nuclear independent dominant genes controlling the trait carried by fertile parent DN-33-18. Segregation for spikelet fertility in F2 confirmed the results of pollen fertility test. Molecular tagging of fertility restorer genes ...
Journal of Genetics, Volume 93, Issue 3, December 2014, pp 725-731 ... Although the unique properties of wheat -gliadin gene family are well characterized, little is known about the evolution and genomic divergence of -gliadin gene family within the Triticeae.
Several clustering and biclustering methods have been introduced to analyze the gene expression data by identifying the similar patterns and grouping genes into subsets that share biological significance. However, it is not clear how the different methods compare with each other with respect to the biological relevance of ...
Stougaard, J; Sandal, N N; Grøn, A
The soybean leghaemoglobin lbc(3) gene promoter was analysed in transgenic Lotus corniculatus plants. Hybrid-promoter constructions and 5' deletions were studied using chimeric genes composed of the various promoters, the chloramphenicol acetyltransferase (CAT) coding sequence and the lbc(3) 3...
The Tp73 gene encoding p73 protein belongs to the Tp53 gene family. It functions in the initiation of cell-cycle arrest or apoptosis and is also involved in regulating a series of pathways including breast cancer, neuroblastoma and colorectal cancer. New discoveries about the control and function of p73 are still in progress ...
Ronander, Elena; Bengtsson, Dominique C; Joergensen, Louise
Adhesion of Plasmodium falciparum infected erythrocytes (IE) to human endothelial receptors during malaria infections is mediated by expression of PfEMP1 protein variants encoded by the var genes. The haploid P. falciparum genome harbors approximately 60 different var genes of which only one has...... been believed to be transcribed per cell at a time during the blood stage of the infection. How such mutually exclusive regulation of var gene transcription is achieved is unclear, as is the identification of individual var genes or sub-groups of var genes associated with different receptors...... fluorescent in situ hybridization (FISH) analysis of var gene transcription by the parasite in individual nuclei of P. falciparum IE(1). Here, we present a detailed protocol for carrying out the RNA-FISH methodology for analysis of var gene transcription in single-nuclei of P. falciparum infected human...
Fu, Wei; Xie, Wen; Zhang, Zhuo; Wang, Shaoli; Wu, Qingjun; Liu, Yong; Zhou, Xiaomao; Zhou, Xuguo; Zhang, Youjun
Abstract: Quantitative real-time PCR (qRT-PCR), a primary tool in gene expression analysis, requires an appropriate normalization strategy to control for variation among samples. The best option is to compare the mRNA level of a target gene with that of reference gene(s) whose expression level is stable across various experimental conditions. In this study, expression profiles of eight candidate reference genes from the diamondback moth, Plutella xylostella, were evaluated under diverse experimental conditions. RefFinder, a web-based analysis tool, integrates four major computational programs including geNorm, Normfinder, BestKeeper, and the comparative ΔCt method to comprehensively rank the tested candidate genes. Elongation factor 1 (EF1) was the most suited reference gene for the biotic factors (development stage, tissue, and strain). In contrast, although appropriate reference gene(s) do exist for several abiotic factors (temperature, photoperiod, insecticide, and mechanical injury), we were not able to identify a single universal reference gene. Nevertheless, a suite of candidate reference genes were specifically recommended for selected experimental conditions. Our finding is the first step toward establishing a standardized qRT-PCR analysis of this agriculturally important insect pest. PMID:23983612
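One of the four methods that RefFinder integrates, the comparative ΔCt method, can be sketched in a few lines. The sketch below assumes the usual formulation: for every pair of candidate genes, take the per-sample Ct difference, compute its standard deviation across samples, and score each gene by its mean standard deviation over all pairs (lower means more stable). The function names, gene labels other than EF1, and all Ct values are hypothetical and are not taken from the study.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Score candidate reference genes with a comparative delta-Ct approach.

    ct maps gene name -> list of Ct values, one per sample, in the same
    sample order for every gene. Returns gene -> mean SD of pairwise
    delta-Ct values (lower = more stable).
    """
    pair_sd = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        # Per-sample Ct difference between the two genes.
        diffs = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(diffs)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    return {gene: mean(sds) for gene, sds in pair_sd.items()}


def rank_reference_genes(ct):
    """Return gene names ordered from most to least stable."""
    scores = delta_ct_stability(ct)
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical Ct values for four samples (illustration only).
    example = {
        "EF1": [18.0, 18.1, 17.9, 18.0],
        "candidate_B": [20.0, 20.2, 19.9, 20.1],
        "candidate_C": [22.0, 23.5, 21.0, 24.0],
    }
    print(rank_reference_genes(example))
```

A ranking like this only reflects the samples supplied, which is consistent with the abstract's finding that the best reference gene depends on the experimental condition.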
Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição
The pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. The gene list was expanded, and the gene interaction network was mapped using the STRING software. The "weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by K-means method and Kruskal-Wallis test was used (Preview data was used to corroborate the bioinformatics data. CDK1 was identified as leader gene for AM. In the KCOT group, results show PCNA and TP53. Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyses were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.
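The "weighted number of links" idea can be illustrated with a small sketch. The abstract does not define WNL, so the code assumes it is the sum of interaction confidence scores over all links a gene has in the network. It also replaces the K-means ranking step with a simple "take the maximum" rule. Gene names and scores are hypothetical placeholders, not values from STRING or from the study.

```python
from collections import defaultdict


def weighted_number_of_links(edges):
    """Sum interaction scores per gene.

    edges is an iterable of (gene_a, gene_b, score) tuples describing an
    undirected, weighted interaction network.
    """
    wnl = defaultdict(float)
    for gene_a, gene_b, score in edges:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return dict(wnl)


def leader_genes(edges):
    """Return the gene(s) with the highest WNL (simplified leader rule)."""
    wnl = weighted_number_of_links(edges)
    top = max(wnl.values())
    return sorted(gene for gene, value in wnl.items() if value == top)


if __name__ == "__main__":
    # Hypothetical toy network (illustration only).
    toy_edges = [
        ("geneA", "geneB", 0.9),
        ("geneA", "geneC", 0.8),
        ("geneB", "geneC", 0.4),
        ("geneC", "geneD", 0.2),
    ]
    print(weighted_number_of_links(toy_edges))
    print(leader_genes(toy_edges))
```

A clustering step such as K-means over the WNL values would group genes into classes instead of picking a single maximum; the simpler rule is used here only to keep the example short.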
The RNA helicases, which help to unwind stable RNA duplexes, and have important roles in RNA metabolism, belong to a class of motor proteins that play important roles in plant development and responses to stress. Although this family of genes has been the subject of systematic investigation in Arabidopsis, rice, and tomato, it has not yet been characterized in cotton. In this study, we identified 161 putative RNA helicase genes in the genome of the diploid cotton species Gossypium raimondii. We classified these genes into three subfamilies, based on the presence of either a DEAD-box (51 genes), DEAH-box (52 genes), or DExD/H-box (58 genes) in their coding regions. Chromosome location analysis showed that the genes that encode RNA helicases are distributed across all 13 chromosomes of G. raimondii. Syntenic analysis revealed that 62 of the 161 G. raimondii helicase genes (38.5%) are within the identified syntenic blocks. Sixty-six (40.99%) helicase genes from G. raimondii have one or several putative orthologs in tomato. Additionally, GrDEADs have more conserved gene structures and more simple domains than GrDEAHs and GrDExD/Hs. Transcriptome sequencing data demonstrated that many of these helicases, especially GrDEADs, are highly expressed at the fiber initiation stage and in mature leaves. To our knowledge, this is the first report of a genome-wide analysis of the RNA helicase gene family in cotton.
Background: Maturation of spermatozoa, including development of motility and the ability to fertilize the oocyte, occurs during transit through the microenvironment of the epididymis. Comprehensive understanding of sperm maturation requires identification and characterization of unique genes expressed in the epididymis. Results: We systematically identified 32 novel genes with epididymis-specific or -predominant expression in the mouse epididymis UniGene library, containing 1505 gene-oriented transcript clusters, by in silico and in vitro analyses. The Northern blot analysis revealed various characteristics of the genes at the transcript level, such as expression level, size and the presence of isoforms. We found that expression of half of the genes is regulated by androgens. Further expression analyses demonstrated that the novel genes are region-specific and developmentally regulated. Computational analysis showed that 15 of the genes lack human orthologues, suggesting their implication in male reproduction unique to the mouse. A number of the novel genes are putative epididymal protease inhibitors or β-defensins. We also found that six of the genes have secretory activity, indicating that they may interact with sperm and have functional roles in sperm maturation. Conclusion: We identified and characterized 32 novel epididymis-specific or -predominant genes by an integrative approach. Our study is unique in its systematic identification of novel epididymal genes and should be a firm basis for future investigation into molecular mechanisms underlying sperm maturation in the epididymis.
Song, Jae-Jun; Kwon, Jee Young; Park, Moo Kyun; Seo, Young Rok
The primary aim of this study is to reveal the effect of particulate matter (PM) on the human middle ear epithelial cell (HMEEC). The HMEEC was treated with PM (300 μg/ml) for 24 h. Total RNA was extracted and used for microarray analysis. Molecular pathways among differentially expressed genes were further analyzed by using Pathway Studio 9.0 software. For selected genes, the changes in gene expression were confirmed by real-time PCR. A total of 611 genes were regulated by PM. Among them, 366 genes were up-regulated, whereas 245 genes were down-regulated. Up-regulated genes were mainly involved in cellular processes, including reactive oxygen species generation, cell proliferation, apoptosis, cell differentiation, inflammatory response and immune response. Down-regulated genes affected several cellular processes, including cell differentiation, cell cycle, proliferation, apoptosis and cell migration. A total of 21 genes were discovered as crucial components in potential signaling networks containing 2-fold up regulated genes. Four genes, VEGFA, IL1B, CSF2 and HMOX1 were revealed as key mediator genes among the up-regulated genes. A total of 25 genes were revealed as key modulators in the signaling pathway associated with 2-fold down regulated genes. Four genes, including IGF1R, TIMP1, IL6 and FN1, were identified as the main modulator genes. We identified the differentially expressed genes in PM-treated HMEEC, whose expression profile may provide a useful clue for the understanding of environmental pathophysiology of otitis media. Our work indicates that air pollution, like PM, plays an important role in the pathogenesis of otitis media. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
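The split into up- and down-regulated genes at a 2-fold cutoff can be sketched as follows. This is a minimal illustration that assumes one control and one treated intensity per gene and uses fold change alone; the abstract does not describe the statistical filtering that was applied, and a real microarray analysis would normally add one. Gene labels and intensities are hypothetical.

```python
import math


def classify_by_fold_change(control, treated, threshold=2.0):
    """Split genes into up- and down-regulated lists by fold change.

    control and treated map gene name -> positive expression intensity.
    A gene is 'up' if treated/control >= threshold and 'down' if
    treated/control <= 1/threshold.
    """
    cutoff = math.log2(threshold)
    up, down = [], []
    for gene in control:
        log_fc = math.log2(treated[gene] / control[gene])
        if log_fc >= cutoff:
            up.append(gene)
        elif log_fc <= -cutoff:
            down.append(gene)
    return up, down


if __name__ == "__main__":
    # Hypothetical intensities (illustration only).
    control = {"g1": 100.0, "g2": 100.0, "g3": 100.0}
    treated = {"g1": 250.0, "g2": 40.0, "g3": 120.0}
    up, down = classify_by_fold_change(control, treated)
    print("up:", up, "down:", down)
```

Working on the log2 scale keeps the cutoff symmetric, so a 2-fold increase and a 2-fold decrease are treated the same way.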
In the post genome era, a major goal of biology is the identification of specific roles for individual genes. We report a new genomic tool for gene characterization, the UCLA Gene Expression Tool (UGET). Celsius, the largest co-normalized microarray dataset of Affymetrix based gene expression, was used to calculate the correlation between all possible gene pairs on all platforms, and generate stored indexes in a web searchable format. The size of Celsius makes UGET a powerful gene characterization tool. Using a small seed list of known cartilage-selective genes, UGET extended the list of known genes by identifying 32 new highly cartilage-selective genes. Of these, 7 of 10 tested were validated by qPCR including the novel cartilage-specific genes SDK2 and FLJ41170. In addition, we retrospectively tested UGET and other gene expression based prioritization tools to identify disease-causing genes within known linkage intervals. We first demonstrated this utility with UGET using genetically heterogeneous disorders such as Joubert syndrome, microcephaly, neuropsychiatric disorders and type 2 limb girdle muscular dystrophy (LGMD2) and then compared UGET to other gene expression based prioritization programs which use small but discrete and well annotated datasets. Finally, we observed a significantly higher gene correlation shared between genes in disease networks associated with similar complex or Mendelian disorders. UGET is an invaluable resource for a geneticist that permits the rapid inclusion of expression criteria from one to hundreds of genes in genomic intervals linked to disease. By using thousands of arrays UGET annotates and prioritizes genes better than other tools especially with rare tissue disorders or complex multi-tissue biological processes. This information can be critical in prioritization of candidate genes for sequence analysis.
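The seed-list idea described above, ranking genes by how well their expression correlates with a set of known genes, can be sketched as below. This is not UGET's implementation: it assumes Pearson correlation across arrays and scores each candidate by its mean correlation with the seeds. All gene labels and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is 0).
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_by_seed_correlation(expr, seeds):
    """Rank non-seed genes by mean correlation with the seed genes.

    expr maps gene name -> expression profile across the same arrays.
    Returns a list of (gene, score) sorted from highest to lowest score.
    """
    scores = {}
    for gene, profile in expr.items():
        if gene in seeds:
            continue
        scores[gene] = sum(pearson(profile, expr[s]) for s in seeds) / len(seeds)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical profiles over five arrays (illustration only).
    expr = {
        "seed1": [1.0, 2.0, 3.0, 4.0, 8.0],
        "seed2": [2.0, 3.0, 4.0, 5.0, 9.0],
        "cand_hi": [1.5, 2.5, 3.5, 4.5, 8.5],
        "cand_lo": [8.0, 4.0, 3.0, 2.0, 1.0],
    }
    for gene, score in rank_by_seed_correlation(expr, {"seed1", "seed2"}):
        print(gene, round(score, 3))
```

With thousands of arrays, precomputing and indexing the pairwise correlations, as the abstract describes, avoids recomputing them for every query.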
Zhou, Xiaobo; Qiu, Weiliang; Sathirapongsasuti, J. Fah.; Cho, Michael H.; Mancini, John D.; Lao, Taotao; Thibault, Derek M.; Litonjua, Gus; Bakke, Per S.; Gulsvik, Amund; Lomas, David A.; Beaty, Terri H.; Hersh, Craig P.; Anderson, Christopher; Geigenmuller, Ute; Raby, Benjamin A.; Rennard, Stephen I.; Perrella, Mark A.; Choi, Augustine M.K.; Quackenbush, John; Silverman, Edwin K.
Hedgehog Interacting Protein (HHIP) was implicated in chronic obstructive pulmonary disease (COPD) by genome-wide association studies (GWAS). However, it remains unclear how HHIP contributes to COPD pathogenesis. To identify genes regulated by HHIP, we performed gene expression microarray analysis in a human bronchial epithelial cell line (Beas-2B) stably infected with HHIP shRNAs. HHIP silencing led to differential expression of 296 genes; enrichment for variants nominally associated with COPD was found. Eighteen of the differentially expressed genes were validated by real-time PCR in Beas-2B cells. Seven of 11 validated genes tested in human COPD and control lung tissues demonstrated significant gene expression differences. Functional annotation indicated enrichment for extracellular matrix and cell growth genes. Network modeling demonstrated that the extracellular matrix and cell proliferation genes influenced by HHIP tended to be interconnected. Thus, we identified potential HHIP targets in human bronchial epithelial cells that may contribute to COPD pathogenesis. PMID:23459001
Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5-4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included 'response to heat', 'response to reactive oxygen species (ROS)', 'response to temperature stimulus', 'response to abiotic stimulus', and 'MAPKKK cascade'. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT) gene. CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data could also provide a
Biedler, James K; Tu, Zhijian
The maternal zygotic transition marks the time at which transcription from the zygotic genome is initiated and a subset of maternal RNAs are progressively degraded in the developing embryo. A number of early zygotic genes have been identified in Drosophila melanogaster and comparisons to sequenced mosquito genomes suggest that some of these early zygotic genes such as bottleneck are fast-evolving or subject to turnover in dipteran insects. One objective of this study is to identify early zygotic genes from the yellow fever mosquito Aedes aegypti to study their evolution. We are also interested in obtaining early zygotic promoters that will direct transgene expression in the early embryo as part of a Medea gene drive system. Two novel early zygotic kinesin light chain genes we call AaKLC2.1 and AaKLC2.2 were identified by transcriptome sequencing of Aedes aegypti embryos at various time points. These two genes have 98% nucleotide and amino acid identity in their coding regions and show transcription confined to the early zygotic stage according to gene-specific RT-PCR analysis. These AaKLC2 genes have a paralogous gene (AaKLC1) in Ae. aegypti. Phylogenetic inference shows that an ortholog to the AaKLC2 genes is only found in the sequenced genome of Culex quinquefasciatus. In contrast, AaKLC1 gene orthologs are found in all three sequenced mosquito species including Anopheles gambiae. There is only one KLC gene in D. melanogaster and other sequenced holometabolous insects that appears to be similar to AaKLC1. Unlike AaKLC2, AaKLC1 is expressed in all life stages and tissues tested, which is consistent with the expression pattern of the An. gambiae and D. melanogaster KLC genes. Phylogenetic inference also suggests that AaKLC2 genes and their likely C. quinquefasciatus ortholog are fast-evolving genes relative to the highly conserved AaKLC1-like paralogs. Embryonic injection of a luciferase reporter under the control of a 1 kb fragment upstream of the AaKLC2.1 start
Background: Apple fruit develop over a period of 150 days from anthesis to fully ripe. An array representing approximately 13000 genes (15726 oligonucleotides of 45–55 bases) designed from apple ESTs has been used to study gene expression over eight time points during fruit development. This analysis of gene expression lays the groundwork for a molecular understanding of fruit growth and development in apple. Results: Using ANOVA analysis of the microarray data, 1955 genes showed significant changes in expression over this time course. Expression of genes is coordinated with four major patterns of expression observed: high in floral buds; high during cell division; high when starch levels and cell expansion rates peak; and high during ripening. Functional analysis associated cell cycle genes with early fruit development and three core cell cycle genes are significantly up-regulated in the early stages of fruit development. Starch metabolic genes were associated with changes in starch levels during fruit development. Comparison with microarrays of ethylene-treated apple fruit identified a group of ethylene induced genes also induced in normal fruit ripening. Comparison with fruit development microarrays in tomato has been used to identify 16 genes for which expression patterns are similar in apple and tomato and these genes may play fundamental roles in fruit development. The early phase of cell division and tissue specification that occurs in the first 35 days after pollination has been associated with up-regulation of a cluster of genes that includes core cell cycle genes. Conclusion: Gene expression in apple fruit is coordinated with specific developmental stages. The array results are reproducible and comparisons with experiments in other species have been used to identify genes that may play a fundamental role in fruit development.
BACKGROUND AND OBJECTIVES: Analysis of positively-selected genes can help us understand how humans evolved, especially the evolution of highly developed cognitive functions. However, previous works have reached conflicting conclusions regarding whether human neuronal genes are over-represented among genes under positive selection. METHODS AND RESULTS: We divided positively-selected genes into four groups according to the identification approaches, compiling a comprehensive list from 27 previous studies. We showed that genes that are highly expressed in the central nervous system are enriched in recent positive selection events in human history identified by intra-species genomic scan, especially in brain regions related to cognitive functions. This pattern holds when different datasets, parameters and analysis pipelines were used. Functional category enrichment analysis supported these findings, showing that synapse-related functions are enriched in genes under recent positive selection. In contrast, immune-related functions, for instance, are enriched in genes under ancient positive selection revealed by inter-species coding region comparison. We further demonstrated that most of these patterns still hold even after controlling for genomic characteristics that might bias genome-wide identification of positively-selected genes including gene length, gene density, GC composition, and intensity of negative selection. CONCLUSION: Our rigorous analysis resolved previous conflicting conclusions and revealed recent adaptation of human brain functions.
Background: Recent circadian clock studies using gene expression microarray in two different tissues of mouse have revealed that not all circadian-related genes are synchronized in phase or peak expression times across tissues in vivo. Instead, some circadian-related genes may be delayed by 4–8 hrs in peak expression in one tissue relative to the other. These interesting biological observations prompt a statistical question regarding how to distinguish the synchronized genes from genes that are systematically lagged in phase/peak expression time across two tissues. Results: We propose a set of techniques from circular statistics to analyze phase angles of circadian-related genes in two tissues. We first estimate the phases of a cycling gene separately in each tissue, which are then used to estimate the paired angular difference of the phase angles of the gene in the two tissues. These differences are modeled as a mixture of two von Mises distributions which enables us to cluster genes into two groups; one group having synchronized transcripts with the same phase in the two tissues, the other containing transcripts with a discrepancy in phase between the two tissues. For each cluster of genes we assess the association of phases across the tissue types using circular-circular regression. We also develop a bootstrap methodology based on a circular-circular regression model to evaluate the improvement in fit provided by allowing two components versus a one-component von Mises model. Conclusion: We applied our proposed methodologies to the circadian-related genes common to heart and liver tissues in Storch et al. 2, and found that an estimated 80% of circadian-related transcripts common to heart and liver tissues were synchronized in phase, and the other 20% of transcripts were lagged about 8 hours in liver relative to heart. The bootstrap p-value for being one cluster is 0.063, which suggests the possibility of two clusters. Our methodologies can
Kim, Myeong Hee; Kang, So Young; Lee, Woo In
The aim of this study is to investigate the molecular characteristics of occult hepatitis B virus (HBV) infection in 'anti-HBc alone' subjects. Twenty-four patients with 'anti-HBc alone' and 20 control patients diagnosed with HBV were analyzed regarding S and pre-S gene mutations. All specimens were analyzed for HBsAg, anti-HBc, and anti-HBs. For specimens with anti-HBc alone, quantitative analysis of HBV DNA, as well as sequencing and mutation analysis of S and pre-S genes, were performed. A total of 24 were analyzed for the S gene, and 14 were analyzed for the pre-S gene through sequencing. A total of 20 control patients were analyzed for S and pre-S gene simultaneously. Nineteen point mutations of the major hydrophilic region were found in six of 24 patients. Among them, three mutations, S114T, P127S/T, M133T, were detected in common. Only one mutation was found in five subjects of the control group; this mutation was not found in the occult HBV infection group, however. Pre-S mutations were detected in 10 patients, and mutations of site aa58-aa100 were detected in 9 patients. A mutation on D114E was simultaneously detected. Although five mutations from the control group were found at the same location (aa58-aa100), no mutations of occult HBV infection were detected. The prevalence of occult HBV infection is not low among 'anti-HBc alone' subjects. Variable mutations in the S gene and pre-S gene were associated with the occurrence of occult HBV infection. Further larger scale studies are required to determine the significance of newly detected mutations. © Copyright: Yonsei University College of Medicine 2017
Background: In the nematode Caenorhabditis elegans the conserved Ins/IGF-1 signaling pathway regulates many biological processes including life span, stress response, dauer diapause and metabolism. Detection of differentially expressed genes may contribute to a better understanding of the mechanism by which the Ins/IGF-1 signaling pathway regulates these processes. Appropriate normalization is an essential prerequisite for obtaining accurate and reproducible quantification of gene expression levels. The aim of this study was to establish a reliable set of reference genes for gene expression analysis in C. elegans. Results: Real-time quantitative PCR was used to evaluate the expression stability of 12 candidate reference genes (act-1, ama-1, cdc-42, csq-1, eif-3.C, mdh-1, gpd-2, pmp-3, tba-1, Y45F10D.4, rgs-6 and unc-16) in wild-type, three Ins/IGF-1 pathway mutants, dauers and L3 stage larvae. After geNorm analysis, cdc-42, pmp-3 and Y45F10D.4 showed the most stable expression pattern and were used to normalize 5 sod expression levels. Significant differences in mRNA levels were observed for sod-1 and sod-3 in daf-2 relative to wild-type animals, whereas in dauers sod-1, sod-3, sod-4 and sod-5 are differentially expressed relative to third stage larvae. Conclusion: Our findings emphasize the importance of accurate normalization using stably expressed reference genes. The methodology used in this study is generally applicable to reliably quantify gene expression levels in the nematode C. elegans using quantitative PCR.
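Normalizing a target gene against several reference genes can be sketched as follows. The code assumes 100% amplification efficiency, so a relative quantity is 2^(-Ct) and the geometric mean of the reference quantities equals 2^(-mean reference Ct). The function names and all Ct values are hypothetical; the abstract does not give the study's own calculation.

```python
from statistics import mean


def normalized_expression(target_ct, reference_cts):
    """Target quantity divided by the geometric mean of reference quantities.

    With quantities taken as 2**(-Ct), this ratio simplifies to
    2**(mean(reference_cts) - target_ct).
    """
    return 2.0 ** (mean(reference_cts) - target_ct)


def fold_change(target_ct_test, ref_cts_test, target_ct_ctrl, ref_cts_ctrl):
    """Normalized expression in the test sample relative to the control."""
    test = normalized_expression(target_ct_test, ref_cts_test)
    ctrl = normalized_expression(target_ct_ctrl, ref_cts_ctrl)
    return test / ctrl


if __name__ == "__main__":
    # Hypothetical Ct values (illustration only, not from the study):
    # one target gene and three reference genes per sample.
    refs_ctrl = [20.0, 21.0, 22.0]
    refs_test = [20.0, 21.0, 22.0]
    print(fold_change(23.0, refs_test, 25.0, refs_ctrl))
```

Averaging over several stable reference genes reduces the influence of any single reference gene that happens to vary between conditions.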
Escott-Price, Valentina; Bellenguez, Céline; Wang, Li-San; Choi, Seung-Hoan; Harold, Denise; Jones, Lesley; Holmans, Peter; Gerrish, Amy; Vedernikov, Alexey; Richards, Alexander; DeStefano, Anita L; Lambert, Jean-Charles; Ibrahim-Verbaas, Carla A; Naj, Adam C; Sims, Rebecca; Jun, Gyungah; Bis, Joshua C; Beecham, Gary W; Grenier-Boley, Benjamin; Russo, Giancarlo; Thornton-Wells, Tricia A; Denning, Nicola; Smith, Albert V; Chouraki, Vincent; Thomas, Charlene; Ikram, M Arfan; Zelenika, Diana; Vardarajan, Badri N; Kamatani, Yoichiro; Lin, Chiao-Feng; Schmidt, Helena; Kunkle, Brian; Dunstan, Melanie L; Vronskaya, Maria; Johnson, Andrew D; Ruiz, Agustin; Bihoreau, Marie-Thérèse; Reitz, Christiane; Pasquier, Florence; Hollingworth, Paul; Hanon, Olivier; Fitzpatrick, Annette L; Buxbaum, Joseph D; Campion, Dominique; Crane, Paul K; Baldwin, Clinton; Becker, Tim; Gudnason, Vilmundur; Cruchaga, Carlos; Craig, David; Amin, Najaf; Berr, Claudine; Lopez, Oscar L; De Jager, Philip L; Deramecourt, Vincent; Johnston, Janet A; Evans, Denis; Lovestone, Simon; Letenneur, Luc; Hernández, Isabel; Rubinsztein, David C; Eiriksdottir, Gudny; Sleegers, Kristel; Goate, Alison M; Fiévet, Nathalie; Huentelman, Matthew J; Gill, Michael; Brown, Kristelle; Kamboh, M Ilyas; Keller, Lina; Barberger-Gateau, Pascale; McGuinness, Bernadette; Larson, Eric B; Myers, Amanda J; Dufouil, Carole; Todd, Stephen; Wallon, David; Love, Seth; Rogaeva, Ekaterina; Gallacher, John; George-Hyslop, Peter St; Clarimon, Jordi; Lleo, Alberto; Bayer, Anthony; Tsuang, Debby W; Yu, Lei; Tsolaki, Magda; Bossù, Paola; Spalletta, Gianfranco; Proitsi, Petra; Collinge, John; Sorbi, Sandro; Garcia, Florentino Sanchez; Fox, Nick C; Hardy, John; Naranjo, Maria Candida Deniz; Bosco, Paolo; Clarke, Robert; Brayne, Carol; Galimberti, Daniela; Scarpini, Elio; Bonuccelli, Ubaldo; Mancuso, Michelangelo; Siciliano, Gabriele; Moebus, Susanne; Mecocci, Patrizia; Zompo, Maria Del; Maier, Wolfgang; Hampel, Harald; Pilotto, Alberto; 
Frank-García, Ana; Panza, Francesco; Solfrizzi, Vincenzo; Caffarra, Paolo; Nacmias, Benedetta; Perry, William; Mayhaus, Manuel; Lannfelt, Lars; Hakonarson, Hakon; Pichler, Sabrina; Carrasquillo, Minerva M; Ingelsson, Martin; Beekly, Duane; Alvarez, Victoria; Zou, Fanggeng; Valladares, Otto; Younkin, Steven G; Coto, Eliecer; Hamilton-Nelson, Kara L; Gu, Wei; Razquin, Cristina; Pastor, Pau; Mateo, Ignacio; Owen, Michael J; Faber, Kelley M; Jonsson, Palmi V; Combarros, Onofre; O'Donovan, Michael C; Cantwell, Laura B; Soininen, Hilkka; Blacker, Deborah; Mead, Simon; Mosley, Thomas H; Bennett, David A; Harris, Tamara B; Fratiglioni, Laura; Holmes, Clive; de Bruijn, Renee F A G; Passmore, Peter; Montine, Thomas J; Bettens, Karolien; Rotter, Jerome I; Brice, Alexis; Morgan, Kevin; Foroud, Tatiana M; Kukull, Walter A; Hannequin, Didier; Powell, John F; Nalls, Michael A; Ritchie, Karen; Lunetta, Kathryn L; Kauwe, John S K; Boerwinkle, Eric; Riemenschneider, Matthias; Boada, Mercè; Hiltunen, Mikko; Martin, Eden R; Schmidt, Reinhold; Rujescu, Dan; Dartigues, Jean-François; Mayeux, Richard; Tzourio, Christophe; Hofman, Albert; Nöthen, Markus M; Graff, Caroline; Psaty, Bruce M; Haines, Jonathan L; Lathrop, Mark; Pericak-Vance, Margaret A; Launer, Lenore J; Van Broeckhoven, Christine; Farrer, Lindsay A; van Duijn, Cornelia M; Ramirez, Alfredo; Seshadri, Sudha; Schellenberg, Gerard D; Amouyel, Philippe; Williams, Julie
Alzheimer's disease is a common debilitating dementia with known heritability, for which 20 late onset susceptibility loci have been identified, but more remain to be discovered. This study sought to identify new susceptibility genes, using an alternative gene-wide analytical approach which tests for patterns of association within genes, in the powerful genome-wide association dataset of the International Genomics of Alzheimer's Project Consortium, comprising over 7 m genotypes from 25,580 Alzheimer's cases and 48,466 controls. In addition to earlier reported genes, we detected genome-wide significant loci on chromosomes 8 (TP53INP1, p = 1.4×10⁻⁶) and 14 (IGHV1-67, p = 7.9×10⁻⁸) which indexed novel susceptibility loci. The additional genes identified in this study have an array of functions previously implicated in Alzheimer's disease, including aspects of energy metabolism, protein degradation and the immune system, and add further weight to these pathways as potential therapeutic targets in Alzheimer's disease.
Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (histone genes for short), an important class of genes that participate in various critical cellular processes; ii) use the model so developed to identify regions across the human genome that have a similar structure to promoters of histone genes; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes; and iii) identify in this way genes that have a high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on a leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Jointly with promoters of the above-mentioned genes, we found a large number of intergenic regions with a similar structure to histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with structures similar to promoters of histone genes. These regions may be promoters of yet unidentified genes, or may represent remote control regions that
Deokar, Amit A; Tar'an, Bunyamin
Aquaporins (AQPs) are essential membrane proteins that play a critical role in the transport of water and many other solutes across cell membranes. In this study, a comprehensive genome-wide analysis identified 40 AQP genes in chickpea (Cicer arietinum L.). A complete overview of the chickpea AQP (CaAQP) gene family is presented, including their chromosomal locations, gene structure, phylogeny, gene duplication, conserved functional motifs, gene expression, and conserved promoter motifs. To understand the evolution of AQPs, a comparative analysis of chickpea AQPs with AQP orthologs from soybean, Medicago, common bean, and Arabidopsis was performed. The chickpea AQP genes were found on all of the chickpea chromosomes, except chromosome 7, with a maximum of six genes on chromosome 6, and a minimum of one gene on chromosome 5. Gene duplication analysis indicated that the expansion of the chickpea AQP gene family might have been due to segmental and tandem duplications. CaAQPs were grouped into four subfamilies including 15 NOD26-like intrinsic proteins (NIPs), 13 tonoplast intrinsic proteins (TIPs), eight plasma membrane intrinsic proteins (PIPs), and four small basic intrinsic proteins (SIPs) based on sequence similarities and phylogenetic position. Gene structure analysis revealed a highly conserved exon-intron pattern within CaAQP subfamilies supporting the CaAQP family classification. Functional prediction based on conserved Ar/R selectivity filters, Froger's residues, and specificity-determining positions suggested wide differences in substrate specificity among the subfamilies of CaAQPs. Expression analysis of the AQP genes indicated that some of the genes are tissue-specific, whereas a few other AQP genes showed differential expression in response to biotic and abiotic stresses. Promoter profiling of CaAQP genes for conserved cis-acting regulatory elements revealed enrichment of cis-elements involved in circadian control, light response, defense and stress responsiveness
Abstract. Background: Human cytomegalovirus (HCMV) is a virus which has the potential to alter cellular gene expression through .... and (reverse: 5'-CAG CAC CAT CCT CCT CTT. CCT CT ..... acute respiratory syndrome (SARS) coronavirus.
Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. According to the relative conservation of homologous gene, a bioinformatics strategy was applied to clone Fusarium ...
of plants such as Arabidopsis, Oryza sativa, Zea mays, poplar, apple and tomato. However ... found to share a similar intron/exon structure and gene length within the same class. .... ches against the proteome and genome files downloaded.
Wermter, Anne-Kathrin; Reichwald, Kathrin; Büch, Thomas
The importance of the melanin-concentrating hormone (MCH) system for regulation of energy homeostasis and body weight has been demonstrated in rodents. We analysed the human MCH receptor 1 gene (MCHR1) with respect to human obesity....
isoforms of cytochrome P450, genes for polyamine biosynthesis (putrescine and proline) ..... CAB97048 mitochondrial half-ABC transporter [Arabidopsis thaliana] up .... AAC72194 pyruvate dehydrogenase E1 beta subunit isoform 3 [Zea mays].
Gao, Na; Ma, Bin-Guang; Zhang, Yu-Sheng; Song, Qin; Chen, Ling-Ling; Zhang, Hong-Yu
To investigate the general radiation-resistant mechanisms of bacteria, bioinformatic method was employed to predict highly expressed genes for four radiation-resistant bacteria, i.e. Deinococcus geothermalis (D. geo), Deinococcus radiodurans (D. rad), Kineococcus radiotolerans (K. rad) and Rubrobacter xylanophilus (R. xyl). It is revealed that most of the three reference gene sets, i.e. ribosomal proteins, transcription factors and major chaperones, are generally highly expressed in the four ...
Travensolo,Regiane F.; Carareto-Alves,Lucia M.; Costa,Maria V.C.G.; Lopes,Tiago J.S.; Carrilho,Emanuel; Lemos,Eliana G.M.
Xylella fastidiosa genome sequencing has generated valuable data by identifying genes acting either on metabolic pathways or in associated pathogenicity and virulence. Based on available information on these genes, new strategies for studying their expression patterns, such as microarray technology, were employed. A total of 2,600 primer pairs were synthesized and then used to generate fragments using the PCR technique. The arrays were hybridized against cDNAs labeled during reverse transcrip...
A full-length cDNA encoding the immunoglobulin (IgM) heavy chain gene of Nile tilapia was successfully cloned using the 5' and 3' RACE techniques. The complete cDNA of the Nile tilapia IgM heavy chain gene is 1,921 bp in length and has an open reading frame (ORF) of 1,740 bp, which corresponds to 580 amino acid ...
Hansen, Kasper Lage; Hansen, Niclas Tue; Karlberg, Erik, Olof, Linnart
to be overexpressed in the normal tissues where defects cause pathology. In contrast, cancer genes and complexes were not overexpressed in the tissues from which the tumors emanate. We specifically identified a complex involved in XY sex reversal that is testis-specific and down-regulated in ovaries. We also......Heritable diseases are caused by germ-line mutations that, despite tissuewide presence, often lead to tissue-specific pathology. Here, we make a systematic analysis of the link between tissue-specific gene expression and pathological manifestations in many human diseases and cancers. Diseases were...
Background: A large number of gene expression profiling (GEP) studies on colorectal carcinogenesis have been performed but no reliable gene signature has been identified so far due to the lack of reproducibility in the reported genes. There is growing evidence that functionally related genes, rather than individual genes, contribute to the etiology of complex traits. We used, as a novel approach, pathway enrichment tools to define functionally related genes that are consistently up- or down-regulated in colorectal carcinogenesis. Materials and Methods: We started the analysis with 242 unique annotated genes that had been reported by any of three recent meta-analyses covering GEP studies on genes differentially expressed in carcinoma vs normal mucosa. Most of these genes (218, 91.9%) had been reported in at least three GEP studies. These 242 genes were submitted to bioinformatic analysis using a total of nine tools to detect enrichment of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. As a final consistency criterion the pathway categories had to be enriched by several tools to be taken into consideration. Results: Our pathway-based enrichment analysis identified the categories of ribosomal protein constituents, extracellular matrix receptor interaction, carbonic anhydrase isozymes, and a general category related to inflammation and cellular response as significantly and consistently overrepresented entities. Conclusions: We triaged the genes covered by the published GEP literature on colorectal carcinogenesis and subjected them to multiple enrichment tools in order to identify the consistently enriched gene categories. These turned out to have known functional relationships to cancer development and thus deserve further investigation.
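As a rough illustration of the kind of test that pathway enrichment tools commonly apply, the sketch below computes a one-sided hypergeometric p-value for over-representation and then applies a "reported by several tools" consistency filter. It is not the procedure of any of the nine tools used in the study: the function names, the `min_tools` threshold, and the tool and category labels in the usage note are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, category_size, sample_size, overlap):
    """One-sided P(X >= overlap) when sample_size genes are drawn without
    replacement from `population` genes, `category_size` of which belong
    to the category being tested."""
    upper = min(category_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def consistent_categories(hits_by_tool, min_tools=2):
    """Keep only categories reported as enriched by at least `min_tools` tools.

    `hits_by_tool` maps a tool name to the categories it flagged; the
    threshold is an assumption, since the abstract only says "several tools".
    """
    counts = {}
    for categories in hits_by_tool.values():
        for category in set(categories):
            counts[category] = counts.get(category, 0) + 1
    return sorted(c for c, n in counts.items() if n >= min_tools)
```

For example, `consistent_categories({"toolA": ["ribosome", "ECM"], "toolB": ["ribosome"]})` keeps only `"ribosome"`, because it is the only category flagged by two tools.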
Santos, Jansen Rodrigo Pereira; Ndeve, Arsenio Daniel; Huynh, Bao-Lam; Matthews, William Charles; Roberts, Philip Alan
Cowpea is one of the most important food and forage legumes in drier regions of the tropics and subtropics. However, cowpea yield worldwide is markedly below the known potential due to abiotic and biotic stresses, including parasitism by root-knot nematodes (Meloidogyne spp., RKN). Two resistance genes with dominant effect, Rk and Rk2, have been reported to provide resistance against RKN in cowpea. Despite their description and use in breeding for resistance to RKN and particularly genetic mapping of the Rk locus, the exact genes conferring resistance to RKN remain unknown. In the present work, QTL mapping using recombinant inbred line (RIL) population 524B x IT84S-2049 segregating for a newly mapped locus and analysis of the transcriptome changes in two cowpea near-isogenic lines (NIL) were used to identify candidate genes for Rk and the newly mapped locus. A major QTL, designated QRk-vu9.1, associated with resistance to Meloidogyne javanica reproduction, was detected and mapped on linkage group LG9 at position 13.37 cM using egg production data. Transcriptome analysis on resistant and susceptible NILs 3 and 9 days after inoculation revealed up-regulation of 109 and 98 genes and down-regulation of 110 and 89 genes, respectively, out of 19,922 unique genes mapped to the common bean reference genome. Among the differentially expressed genes, four and nine genes were found within the QRk-vu9.1 and QRk-vu11.1 QTL intervals, respectively. Six of these genes belong to the TIR-NBS-LRR family of resistance genes and three were upregulated at one or more time-points. Quantitative RT-PCR validated gene expression to be positively correlated with RNA-seq expression pattern for eight genes. Future functional analysis of these cowpea genes will enhance our understanding of Rk-mediated resistance and identify the specific gene responsible for the resistance.
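The agreement between qRT-PCR and RNA-seq described above is typically summarised with a correlation coefficient. The abstract does not state which coefficient was used, so the sketch below assumes Pearson's r; `pearson_r` and its inputs are hypothetical names, and no values from the study are reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences, e.g. fold
    changes for the same genes measured by two different methods."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

A value near +1 means the two methods rank and scale the expression changes in the same way; a positive value on its own only indicates that they agree in direction.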
Langfelder, Peter; Mischel, Paul S; Horvath, Steve
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques applied to
Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui
In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed in the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on the domain, these genes are divided into two groups and distributed in all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.
Background: Aspartic proteases (APs) are a large family of proteolytic enzymes found in almost all organisms. In plants, they are involved in many biological processes, such as senescence, stress responses, programmed cell death, and reproduction. Prior to the present study, no grape AP gene(s) had been reported, and research on APs in woody species was very limited. Results: In this study, a total of 50 AP genes (VvAP) were identified in the grape genome, among which 30 contained the complete ASP domain. Synteny analysis within grape indicated that segmental and tandem duplication events contributed to the expansion of the grape AP family. Additional analysis between grape and Arabidopsis demonstrated that several grape AP genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grape and Arabidopsis. Phylogenetic relationships of the 30 VvAPs with the complete ASP domain and their Arabidopsis orthologs, as well as their gene and protein features, were analyzed and their cellular localization was predicted. Moreover, expression profiles of VvAP genes in six different tissues were determined, and their transcript abundance under various stresses and hormone treatments was measured. Twenty-seven VvAP genes were expressed in at least one of the six tissues examined; nineteen VvAPs responded to at least one abiotic stress, 12 VvAPs responded to powdery mildew infection, and most of the VvAPs responded to SA and ABA treatments. Furthermore, integrated synteny and phylogenetic analysis identified orthologous AP genes between grape and Arabidopsis, providing a unique starting point for investigating the function of grape AP genes. Conclusions: The genome-wide identification, evolutionary and expression analyses of grape AP genes provide a framework for future analysis of AP genes in defining their roles during stress response. Integrated synteny and phylogenetic analyses provide novel insight into the
Xiong, Wangdan; Xu, Xueqin; Zhang, Lin; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang
The WRKY proteins, which contain highly conserved WRKYGQK amino acid sequences and zinc-finger-like motifs, constitute a large family of transcription factors in plants. They participate in diverse physiological and developmental processes. WRKY genes have been identified and characterized in a number of plant species. We identified a total of 58 WRKY genes (JcWRKY) in the genome of the physic nut (Jatropha curcas L.). On the basis of their conserved WRKY domain sequences, all of the JcWRKY proteins could be assigned to one of the previously defined groups, I-III. Phylogenetic analysis of JcWRKY genes with Arabidopsis and rice WRKY genes, and separately with castor bean WRKY genes, revealed no evidence of recent gene duplication in the JcWRKY gene family. Transcript abundance of JcWRKY genes was analyzed in different tissues under normal growth conditions. In addition, 47 WRKY genes responded to at least one abiotic stress (drought, salinity, phosphate starvation and nitrogen starvation) in individual tissues (leaf, root and/or shoot cortex). Our study provides a useful reference data set as the basis for cloning and functional analysis of physic nut WRKY genes.
Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B
A bioinformatic search of sequences encoding cellulose synthase genes in the flax genome, and their comparison to dicot orthologs, was carried out. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, and the remaining 16, cellulose synthase-like proteins (Csl). Phylogenetic analysis of gene products of cellulose synthase genes allowed distinguishing 6 groups of cellulose synthase genes of different classes: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9, which are associated with primary cell wall formation, are characterized by a greater similarity within these classes than orthologous sequences. In contrast, the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of the 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in the flax genome, which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes have been identified in the flax genome, but their correct classification requires additional research.
Song, Xiaoming; Duan, Weike; Huang, Zhinan; Liu, Gaofeng; Wu, Peng; Liu, Tongkun; Li, Ying; Hou, Xilin
In plants, flowering is the most important transition from vegetative to reproductive growth. The flowering patterns of monocots and eudicots are distinctly different, but few studies have described the evolutionary patterns of the flowering genes in them. In this study, we analysed the evolutionary pattern, duplication and expression level of these genes. The main results were as follows: (i) characterization of flowering genes in monocots and eudicots, including the identification of family-specific, orthologous and collinear genes; (ii) full characterization of CONSTANS-like genes in Brassica rapa (BraCOL genes), the key flowering genes; (iii) exploration of the evolution of COL genes in plant kingdom and construction of the evolutionary pattern of COL genes; (iv) comparative analysis of CO and FT genes between Brassicaceae and Grass, which identified several family-specific amino acids, and revealed that CO and FT protein structures were similar in B. rapa and Arabidopsis but different in rice; and (v) expression analysis of photoperiod pathway-related genes in B. rapa under different photoperiod treatments by RT-qPCR. This analysis will provide resources for understanding the flowering mechanisms and evolutionary pattern of COL genes. In addition, this genome-wide comparative study of COL genes may also provide clues for evolution of other flowering genes.
He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai
The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, which were all comprised of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were predominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF.
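The thresholding step described above (a log2 fold-change cutoff combined with a false discovery rate cutoff) can be sketched as follows. This is not the R/limma workflow used in the study: it assumes that per-gene p-values and log2 fold-changes are already available, that the FDR is controlled with the Benjamini-Hochberg procedure, and that the fold-change cutoff applies to the absolute value. The function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fc_cutoff=2.0, fdr_cutoff=0.01):
    """Keep genes with |log2 fold-change| > fc_cutoff and adjusted p < fdr_cutoff.

    The default cutoffs mirror the thresholds quoted in the abstract; using
    the absolute fold-change is an assumption made for this sketch.
    """
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, fc, q in zip(genes, log2_fc, fdr)
        if abs(fc) > fc_cutoff and q < fdr_cutoff
    ]
```

Calling `select_degs` with parallel lists of gene identifiers, log2 fold-changes and raw p-values returns the identifiers passing both filters; splitting the result by the sign of the fold-change would give the upregulated and downregulated sets.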
Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi
Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare them with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after the split from A. thaliana. Here we present a genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in the B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among the 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in the Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after the divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among the 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. For the TNL type, however, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. Through tandem duplication and whole genome
Dou, Lingling; Zhang, Xiaohong; Pang, Chaoyou; Song, Meizhen; Wei, Hengling; Fan, Shuli; Yu, Shuxun
WRKY proteins are major transcription factors involved in regulating plant growth and development. Although many studies have focused on the functional identification of WRKY genes, our knowledge concerning many areas of WRKY gene biology is limited. For example, in cotton, the phylogenetic characteristics, global expression patterns, molecular mechanisms regulating expression, and target genes/pathways of WRKY genes are poorly characterized. Therefore, in this study, we present a genome-wide analysis of the WRKY gene family in cotton (Gossypium raimondii and Gossypium hirsutum). We identified 116 WRKY genes in G. raimondii from the completed genome sequence, and we cloned 102 WRKY genes in G. hirsutum. Chromosomal location analysis indicated that WRKY genes in G. raimondii evolved mainly from segmental duplication followed by tandem amplifications. Phylogenetic analysis of alga, bryophyte, lycophyta, monocot and eudicot WRKY domains revealed family member expansion with increasing complexity of the plant body. Microarray, expression profiling and qRT-PCR data revealed that WRKY genes in G. hirsutum may regulate the development of fibers, anthers, tissues (roots, stems, leaves and embryos), and are involved in the response to stresses. Expression analysis showed that most group II and III GhWRKY genes are highly expressed under diverse stresses. Group I members, representing the ancestral form, seem to be insensitive to abiotic stress, with low expression divergence. Our results indicate that cotton WRKY genes might have evolved by adaptive duplication, leading to sensitivity to diverse stresses. This study provides fundamental information to inform further analysis and understanding of WRKY gene functions in cotton species.
Petersen, Gitte; Seberg, Ole; Baden, Claus
A phylogenetic analysis of the small, Central Asian genus Psathyrostachys Nevski is presented. The analysis is based on morphological characters and nucleotide sequence data from one nuclear gene, DMC1, and three plastid genes, rbcL, rpoA, and rpoC2. Separate analyses of the three data partitions...... (morphology, nuclear sequences, and plastid sequences) result in mostly congruent trees. The plastid and nuclear sequences produce completely congruent trees, and only the trees based on plastid sequences and morphological characters are incongruent. Combined analysis of all data results in a fairly well......-resolved strict consensus tree: Ps. rupestris is the sister to the remaining species, which are divided into two clades: one including Ps. fragilis and Ps. caduca, the other including Ps. juncea, Ps. huashanica, Ps. lanuginosa, Ps. stoloniformis, and Ps. kronenburgii. Pubescent culms and more than 20 mm long...
Zhou, Wuhua; Gong, Li; Li, Xuefeng; Wan, Yunyan; Wang, Xiangfei; Li, Huili; Jiang, Bin
Insulinoma is a rare type of tumor and its genetic features remain largely unknown. This study aimed to search for potential key genes and relevant enriched pathways of insulinoma. The gene expression data from GSE73338 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified between insulinoma tissues and normal pancreas tissues, followed by pathway enrichment analysis, protein-protein interaction (PPI) network construction, and module analysis. The expression of candidate key genes was validated by quantitative real-time polymerase chain reaction (RT-PCR) in insulinoma tissues. A total of 1632 DEGs were obtained, including 1117 upregulated genes and 514 downregulated genes. Pathway enrichment results showed that upregulated DEGs were significantly implicated in insulin secretion, and downregulated DEGs were mainly enriched in pancreatic secretion. PPI network analysis revealed 7 hub genes with degrees greater than 10, including GCG (glucagon), GCGR (glucagon receptor), PLCB1 (phospholipase C, beta 1), CASR (calcium sensing receptor), F2R (coagulation factor II thrombin receptor), GRM1 (glutamate metabotropic receptor 1), and GRM5 (glutamate metabotropic receptor 5). DEGs involved in the significant modules were enriched in the calcium signaling pathway, protein ubiquitination, and platelet degranulation. Quantitative RT-PCR data confirmed that the expression trends of these hub genes were similar to the results of the bioinformatic analysis. The present study demonstrated that candidate DEGs and enriched pathways were potential critical molecular events involved in the development of insulinoma, and these findings are useful for a better understanding of insulinoma genesis.
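The hub-gene step in the abstract above (selecting nodes whose degree exceeds a threshold in a PPI network) can be illustrated with a short sketch. This is only an assumed, minimal reading of that step: the edge list, the function names `node_degrees` and `hub_genes`, and the toy threshold are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners for each node of an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def hub_genes(edges, min_degree):
    """Return (gene, degree) pairs with degree > min_degree, highest degree first."""
    degrees = node_degrees(edges)
    hubs = [(gene, deg) for gene, deg in degrees.items() if deg > min_degree]
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


# Toy edge list with placeholder names (hypothetical, not the study's network).
toy_edges = [("HUB", f"P{i}") for i in range(4)] + [("P0", "P1")]
print(hub_genes(toy_edges, min_degree=3))
```

With a real PPI edge list, the abstract's criterion would correspond to `min_degree=10`.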
Rodenburg, Sander Y.A.; Terhem, Razak B.; Veloso, Javier; Stassen, Joost H.M.; Kan, van Jan A.L.
Botrytis cinerea is a plant-pathogenic fungus producing apothecia as sexual fruiting bodies. To study the function of mating type (MAT) genes, single-gene deletion mutants were generated in both genes of the MAT1-1 locus and both genes of the MAT1-2 locus. Deletion mutants in two MAT genes were
Pessina, S.; Pavan, S.N.C.; Catalano, D.; Gallotta, A.; Visser, R.G.F.; Bai, Y.; Malnoy, M.; Schouten, H.J.
Background: Powdery mildew (PM) is a major fungal disease of thousands of plant species, including many cultivated Rosaceae. PM pathogenesis is associated with up-regulation of MLO genes during early stages of infection, causing down-regulation of plant defense pathways. Specific members of the MLO
Edwards Jeremy S
Background: Genome sequencing and bioinformatics are producing detailed lists of the molecular components contained in many prokaryotic organisms. From this 'parts catalogue' of a microbial cell, in silico representations of integrated metabolic functions can be constructed and analyzed using flux balance analysis (FBA). FBA is particularly well-suited to study metabolic networks based on genomic, biochemical, and strain specific information. Results: Herein, we have utilized FBA to interpret and analyze the metabolic capabilities of Escherichia coli. We have computationally mapped the metabolic capabilities of E. coli using FBA and examined the optimal utilization of the E. coli metabolic pathways as a function of environmental variables. We have used an in silico analysis to identify seven gene products of central metabolism (glycolysis, pentose phosphate pathway, TCA cycle, electron transport system) essential for aerobic growth of E. coli on glucose minimal media, and 15 gene products essential for anaerobic growth on glucose minimal media. The in silico tpi-, zwf-, and pta- mutant strains were examined in more detail by mapping the capabilities of these in silico isogenic strains. Conclusions: We found that computational models of E. coli metabolism based on physicochemical constraints can be used to interpret mutant behavior. These in silico results lead to a further understanding of the complex genotype-phenotype relation. Supplementary information: http://gcrg.ucsd.edu/supplementary_data/DeletionAnalysis/main.htm
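For orientation, flux balance analysis is conventionally posed as a linear program. The block below sketches that standard textbook formulation, not the specific E. coli model of the abstract above. The symbols are notation assumed here: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (for example, a growth reaction), and v_min, v_max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}.
\end{aligned}
```

Under this formulation, an in silico gene deletion is typically simulated by fixing the bounds of every reaction that depends on the deleted gene product to zero and re-solving. A gene product would then be called essential for a given medium if the optimal growth objective drops to zero.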
Rivera-Vega, M Refugio; Chiñas-Lopez, Silvet; Vaca, Ana Luisa Jimenez; Arenas-Sordo, M Luz; Kofman-Alfaro, Susana; Messina-Baas, Olga; Cuevas-Covarrubias, Sergio Alberto
To describe the molecular defects in the Norrie disease protein (NDP) gene in two families with Norrie disease (ND). We analysed two families with ND at the molecular level through polymerase chain reaction, DNA sequence analysis and GeneScan. Two molecular defects were found in the NDP gene: a missense mutation (265C > G) within codon 97 that resulted in the substitution of arginine by proline, and a partial deletion in the untranslated 3' region of exon 3 of the NDP gene. Clinical findings were more severe in the family that presented the partial deletion. We also diagnosed the carrier status of one daughter through GeneScan; this method proved to be a useful tool for establishing female carriers of ND. Here we report two novel mutations in the NDP gene in Mexican patients and propose that GeneScan is a viable means of establishing ND carrier status.
Damotte, V; Guillot-Noel, L; Patsopoulos, N A
...... in interaction with other genes as a group. Pathway analysis is an alternative way to highlight such groups of genes. Using SNP association P-values from eight multiple sclerosis (MS) GWAS data sets, we performed a candidate pathway analysis for MS susceptibility by considering genes interacting in the cell adhesion molecule (CAMs) biological pathway using Cytoscape software. This network is a strong candidate, as it is involved in the crossing of the blood-brain barrier by the T cells, an early event in MS pathophysiology, and is used as an efficient therapeutic target. We drew up a list of 76 genes belonging to the CAM network. We highlighted 64 networks enriched with CAM genes with low P-values. Filtering by a percentage of CAM genes up to 50% and rejecting enriched signals mainly driven by transcription factors, we highlighted five networks associated with MS susceptibility. One of them, constituted...
Guerriero, Gea; Giorno, Filomena; Ciccotti, Anna Maria; Schmidt, Silvia; Baric, Sanja
Apple proliferation (AP) represents a serious threat to several fruit-growing areas and is responsible for great economic losses. Several studies have highlighted the key role played by the cell wall in response to pathogen attack. The existence of a cell wall integrity signaling pathway which senses perturbations in the cell wall architecture upon abiotic/biotic stresses and activates specific defence responses has been widely demonstrated in plants. More recently a role played by cell wall-related genes has also been reported in plants infected by phytoplasmas. With the aim of shedding light on the cell wall response to AP disease in the economically relevant fruit-tree Malus × domestica Borkh., we investigated the expression of the cellulose (CesA) and callose synthase (CalS) genes in different organs (i.e., leaves, roots and branch phloem) of healthy and infected symptomatic outdoor-grown trees, sampled over the course of two time points (i.e., spring and autumn 2011), as well as in in vitro micropropagated control and infected plantlets. A strong up-regulation in the expression of cell wall biosynthetic genes was recorded in roots from infected trees. Secondary cell wall CesAs showed up-regulation in the phloem tissue from branches of infected plants, while either a down-regulation of some genes or no major changes were observed in the leaves. Micropropagated plantlets also showed an increase in cell wall-related genes and constitute a useful system for a general assessment of gene expression analysis upon phytoplasma infection. Finally, we also report the presence of several ‘knot’-like structures along the roots of infected apple trees and discuss the occurrence of this interesting phenotype in relation to the gene expression results and the modalities of phytoplasma diffusion. PMID:23086810
Many spliceosomal introns exist in the eukaryotic nuclear genome. Despite much research, the evolution of spliceosomal introns remains poorly understood. In this paper, we tried to gain insights into intron evolution from a novel perspective by comparing the gene structures of cytoplasmic ribosomal proteins (CRPs) and mitochondrial ribosomal proteins (MRPs), which are held to be of archaeal and bacterial origin, respectively. We analyzed 25 homologous pairs of CRP and MRP genes that together had a total of 527 intron positions. We found that all 12 of the intron positions shared by CRP and MRP genes resulted from parallel intron gains and none could be considered to be "conserved," i.e., descendants of the same ancestor. This was supported further by the high frequency of proto-splice sites at these shared positions; proto-splice sites are proposed to be sites for intron insertion. Although we could not definitively disprove that spliceosomal introns were already present in the last universal common ancestor, our results lend more support to the idea that introns were gained late. At the least, our results show that MRP genes were intronless at the time of endosymbiosis. The parallel intron gains between CRP and MRP genes accounted for 2.3% of total intron positions, which should provide a reliable estimate for future inferences of intron evolution.
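The 2.3% figure quoted above follows directly from the two counts given in the abstract (12 shared positions out of 527 total intron positions); the arithmetic is written out here only as a check:

```latex
\frac{12}{527} \approx 0.0228 \approx 2.3\%
```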
Yang, Zhi; Wang, Chunyan; Wang, Tao; Bai, Jianhui; Zhao, Yu; Liu, Xuhan; Ma, Qingwei; Wu, Xiaobing; Guo, Ying; Zhao, Yaofeng; Ren, Liming
CD1, the third family of antigen-presenting molecules, was previously found only in mammals and chickens, which suggests that chicken and mammalian CD1 share a common ancestral gene that emerged at least 310 million years ago. Here, we describe CD1 genes in the green anole lizard and Crocodylia, demonstrating that CD1 is ubiquitous in mammals, birds, and reptiles. Although the reptilian CD1 protein structures are predicted to be similar to human CD1d and chicken CD1.1, CD1 isotypes are not found to be orthologous between mammals, birds, and reptiles according to phylogenetic analyses, suggesting an independent diversification of CD1 isotypes during the speciation of mammals, birds, and reptiles. In the green anole lizard, although the single CD1 locus and MHC I gene are located on the same chromosome, there is an approximately 10-Mb-long sequence in between, and interestingly, several genes flanking the CD1 locus belong to the MHC paralogous region on human chromosome 19. The CD1 genes in Crocodylia are located in two loci, respectively linked to the MHC region and MHC paralogous region (corresponding to the MHC paralogous region on chromosome 19). These results provide new insights for studying the origin and evolution of CD1.
Maria Victoria Fernández
Alzheimer disease (AD), Frontotemporal lobar degeneration (FTD), Amyotrophic lateral sclerosis (ALS) and Parkinson disease (PD) have a certain degree of clinical, pathological and molecular overlap. Previous studies indicate that causative mutations in AD and FTD/ALS genes can be found in clinical familial AD. We examined the presence of causative and low frequency coding variants in the AD, FTD, ALS and PD Mendelian genes, in over 450 families with a clinical history of AD and over 11,710 sporadic cases and cognitively normal participants from North America. Known pathogenic mutations were found in 1.05% of the sporadic cases, in 0.69% of the cognitively normal participants and in 4.22% of the families. A trend towards enrichment, albeit non-significant, was observed for most AD, FTD and PD genes. Only PSEN1 and PINK1 showed consistent association with AD cases when we used ExAC as the control population. These results suggest that current study designs may contain heterogeneity and contamination of the control population, and that current statistical methods for the discovery of novel genes with real pathogenic variants in complex late onset diseases may be inadequate or underpowered to identify genes carrying pathogenic mutations.
Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph
During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5, and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.
Shen, Po-Chih; Hour, Ai-Ling; Liu, Li-Yu Daisy
Abiotic stresses are the major limiting factors that affect plant growth, development, yield and final quality. Deciphering the mechanisms underlying plants' adaptations to stresses from only a few datasets might overlook the different aspects of stress tolerance in plants, which might operate simultaneously and consequently in the system. Fortunately, the accumulated microarray expression data offer an opportunity to infer abiotic stress-specific gene expression patterns through meta-analysis. In this study, we combined microarray gene expression data under control, cold, drought, heat, and salt conditions and determined modules (gene sets) of genes highly associated with each other according to the observed expression data. By analyzing the expression variations of the eigengenes across conditions, we identified two, three, and five gene modules as cold-, heat-, and salt-specific modules, respectively. Most of the cold- or heat-specific modules were differentially expressed to a particular degree in shoot samples, while most of the salt-specific modules were differentially expressed to a particular degree in root samples. A gene ontology (GO) analysis of the stress-specific modules suggested that the gene modules were exclusively enriched in stress-related GO terms and that different genes under the same GO terms may be alternatively disturbed in different conditions. The gene regulatory events for two genes, DREB1A and DEAR1, in the cold-specific gene module were also validated through a literature search. Our protocol examines the specificity of gene modules that are activated under a particular type of abiotic stress. The biplot can also help to visualize the stress-specific gene modules. In conclusion, our approach has the potential to further elucidate mechanisms in plants and to benefit the design of future experiments under different abiotic stresses.
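The abstract above does not say how its module eigengenes were computed. A common convention in co-expression analysis is to take the first principal component of a module's expression matrix, and the sketch below assumes that definition. The function name `module_eigengene`, the power-iteration approach, and the toy matrix are all hypothetical illustrations, not the authors' procedure.

```python
import math


def module_eigengene(expr, iterations=200):
    """First principal component across samples of a genes x samples matrix."""
    n_samples = len(expr[0])
    # Centre each gene's profile so the component reflects co-variation, not level.
    centred = []
    for row in expr:
        mean = sum(row) / n_samples
        centred.append([x - mean for x in row])
    # Sample-by-sample cross-product matrix C = X^T X.
    cross = [[sum(g[i] * g[j] for g in centred) for j in range(n_samples)]
             for i in range(n_samples)]
    # Power iteration for the dominant eigenvector of C. The start vector is
    # non-constant because a constant vector lies in the null space of C.
    vec = [float(i + 1) for i in range(n_samples)]
    for _ in range(iterations):
        nxt = [sum(cross[i][j] * vec[j] for j in range(n_samples))
               for i in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            return [0.0] * n_samples  # no variation in the module
        vec = [x / norm for x in nxt]
    # Resolve the arbitrary sign so the eigengene follows the average profile.
    avg = [sum(g[j] for g in centred) / len(centred) for j in range(n_samples)]
    if sum(a * v for a, v in zip(avg, vec)) < 0:
        vec = [-v for v in vec]
    return vec


# Toy module: three genes rising together over four samples (made-up values).
toy_module = [[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 2, 3]]
eigengene = module_eigengene(toy_module)
print([round(x, 3) for x in eigengene])
```

Comparing such an eigengene between control and stress samples is one way a module could be flagged as stress-specific. In practice a linear-algebra library would replace the hand-written power iteration.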
Lu, Jiuxing; Wang, Tao; Xu, Zongda; Sun, Lidan; Zhang, Qixiang
Prunus mume is an ornamental flower and fruit tree in Rosaceae. We investigated the GRAS gene family to improve the breeding and cultivation of P. mume and other Rosaceae fruit trees. The GRAS gene family encodes transcriptional regulators that have diverse functions in plant growth and development, such as gibberellin and phytochrome A signal transduction, root radial patterning, and axillary meristem formation and gametogenesis in the P. mume genome. Despite the important roles of these genes in plant growth regulation, no findings on the GRAS genes of P. mume have been reported. In this study, we discerned phylogenetic relationships of P. mume GRAS genes, and their locations, structures in the genome and expression levels of different tissues. Out of 46 identified GRAS genes, 45 were located on the 8 P. mume chromosomes. Phylogenetic results showed that these genes could be classified into 11 groups. We found that Group X was P. mume-specific, and three genes of Group IX clustered with the rice-specific gene Os4. We speculated that these genes existed before the divergence of dicotyledons and monocotyledons and were lost in Arabidopsis. Tissue expression analysis indicated that 13 genes showed high expression levels in roots, stems, leaves, flowers and fruits, and were related to plant growth and development. Functional analysis of 24 GRAS genes and an orthologous relationship analysis indicated that many functioned during plant growth and flower and fruit development. Our bioinformatics analysis provides valuable information to improve the economic, agronomic and ecological benefits of P. mume and other Rosaceae fruit trees.
Mao, Guangzhi; Ma, Qiang; Wei, Hengling; Su, Junji; Wang, Hantao; Ma, Qifeng; Fan, Shuli; Song, Meizhen; Zhang, Xianlong; Yu, Shuxun
The young leaves of virescent mutants are yellowish and gradually turn green as the plants reach maturity. Understanding the genetic basis of virescent mutants can aid research of the regulatory mechanisms underlying chloroplast development and chlorophyll biosynthesis, as well as contribute to the application of virescent traits in crop breeding. In this study, fine mapping was employed, and a recessive gene (v1) from a virescent mutant of Upland cotton was narrowed to an 84.1-kb region containing ten candidate genes. The GhChlI gene encodes the cotton Mg-chelatase I subunit (CHLI) and was identified as the candidate gene for the virescent mutation using gene annotation. BLAST analysis showed that the GhChlI gene has two copies, Gh_A10G0282 and Gh_D10G0283. Sequence analysis indicated that the coding region (CDS) of GhChlI is 1269 bp in length, with three predicted exons and one non-synonymous nucleotide mutation (G1082A) in the third exon of Gh_D10G0283, with an amino acid (AA) substitution of arginine (R) to lysine (K). GhChlI-silenced TM-1 plants exhibited a lower GhChlI expression level, a lower chlorophyll content, and the virescent phenotype. Analysis of upstream regulatory elements and expression levels of GhChlI showed that the expression quantity of GhChlI may be normal, and with the development of the true leaf, the increase in the Gh_A10G0282 dosage may partially make up for the deficiency of Gh_D10G0283 in the v1 mutant. Phylogenetic analysis and sequence alignment revealed that the protein sequence encoded by the third exon of GhChlI is highly conserved across diverse plant species, in which AA substitutions among the completely conserved residues frequently result in changes in leaf color in various species. These results suggest that the mutation (G1082A) within the GhChlI gene may cause a functional defect of the GhCHLI subunit and thus the virescent phenotype in the v1 mutant. The GhChlI mutation not only provides a tool for understanding the
Boddu, Jayanand; Cho, Seungho; Muehlbauer, Gary J
Fusarium head blight, caused primarily by Fusarium graminearum, is a major disease problem on barley (Hordeum vulgare L.). Trichothecene mycotoxins produced by the fungus during infection increase the aggressiveness of the fungus and promote infection in wheat (Triticum aestivum L.). Loss-of-function mutations in the TRI5 gene in F. graminearum result in the inability to synthesize trichothecenes and in reduced virulence on wheat. We examined the impact of pathogen-derived trichothecenes on virulence and the transcriptional differences in barley spikes infected with a trichothecene-producing wild-type strain and a loss-of-function tri5 trichothecene nonproducing mutant. Disease severity, fungal biomass, and floret necrosis and bleaching were reduced in spikes inoculated with the tri5 mutant strain compared with the wild-type strain, indicating that the inability to synthesize trichothecenes results in reduced virulence in barley. We detected 63 transcripts that were induced during trichothecene accumulation, including genes encoding putative trichothecene detoxification and transport proteins, ubiquitination-related proteins, programmed cell death-related proteins, transcription factors, and cytochrome P450s. We also detected 414 gene transcripts that were designated as basal defense response genes largely independent of trichothecene accumulation. Our results show that barley exhibits a specific response to trichothecene accumulation that can be separated from the basal defense response. We propose that barley responds to trichothecene accumulation by inducing at least two general responses. One response is the induction of genes encoding trichothecene detoxification and transport activities that may reduce the impact of trichothecenes. The other response is to induce genes encoding proteins associated with ubiquitination and cell death which may promote successful establishment of the disease.
Walker Angela M
Background: The pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family is comprised of at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, (1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, (2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolutionary pressures operating on them and to identify putative regulatory regions, (3) we determined relative transcript abundance of selected PAGs during pregnancy and (4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. Results: From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions that harbor recognition sites for putative transcriptional factors (TFs) were found to be unique to the modern boPAG grouping, but not the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. Conclusion: PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed
Wang, Yiyi; Feng, Lin; Zhu, Yuxin; Li, Yuan; Yan, Hanwei; Xiang, Yan
WRKY III genes have significant functions in regulating plant development and resistance. The WRKY gene family has been studied in many plant species; however, a comprehensive analysis of WRKY III genes in the woody plant species poplar is still lacking. Three representative lineages of flowering plants are incorporated in most analyses: Arabidopsis (a model plant for annual herbaceous dicots), grape (a model plant for perennial dicots) and Oryza sativa (a model plant for monocots). In this study, we identified 10, 6, 13 and 28 WRKY III genes in the genomes of Populus trichocarpa, grape (Vitis vinifera), Arabidopsis thaliana and rice (Oryza sativa), respectively. Phylogenetic analysis revealed that the WRKY III proteins could be divided into four clades. By microsynteny analysis, we found that the duplicated regions were more conserved between poplar and grape than between poplar and Arabidopsis or rice. We dated their duplications by Ks analysis of Populus WRKY III genes and demonstrated that all the blocks were formed after the divergence of monocots and dicots. Strong purifying selection has played a key role in the maintenance of WRKY III genes in Populus. Tissue expression analysis of the WRKY III genes in Populus revealed that five were most highly expressed in the xylem. We also performed quantitative real-time reverse transcription PCR analysis of WRKY III genes in Populus treated with salicylic acid, abscisic acid and polyethylene glycol to explore their stress-related expression patterns. This study highlighted the duplication and diversification of the WRKY III gene family in Populus and provided a comprehensive analysis of this gene family in the Populus genome. Our results indicated that the majority of WRKY III genes of Populus were expanded by large-scale gene duplication. The expression patterns of the PtrWRKYIII genes indicated that these genes play important roles in the xylem during poplar growth and development, and may play a crucial role in defense against drought
van Hal, N L; Vorst, O; van Houwelingen, A M; Kok, E J; Peijnenburg, A; Aharoni, A; van Tunen, A J; Keijer, J
DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed. These comprise array manufacturing and design, array hybridisation, scanning, and data handling. Furthermore, it is discussed how DNA microarrays can be applied in the working fields of: safety, functionality and health of food and gene discovery and pathway engineering in plants.
Agung, Muhammad Budi; Budiarsa, I. Made; Suwastika, I. Nengah
Cocoa beans are one of Indonesia's main export commodities, but yields are still reduced by pathogen and disease attack. Developing robust cacao plants that are genetically resistant to pathogen and disease attack is an ideal solution to this problem. The aim of this study was to identify, through in silico analysis, Theobroma cacao genes in the cacao genome database that are homologous to genes involved in the response to pathogen and disease attack in other plants. The basic information survey and gene identification were performed in GenBank and The Arabidopsis Information Resource database. The in silico analysis comprised protein BLAST, a homology test of each gene's protein candidates, and identification of homologous genes in the Cacao Genome Database using the data source "Theobroma cacao cv. Matina 1-6 v1.1" genome. The analysis found that Thecc1EG011959t1 (EDS1), Thecc1EG006803t1 (EDS5), Thecc1EG013842t1 (ICS1), and Thecc1EG015614t1 (BG_PPAP) in the Cacao Genome Database are Theobroma cacao genes homologous to plant resistance genes, and they are highly likely to have functions similar to those of their homologues.
Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar
Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in-toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
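The mining steps described above (extract co-occurring gene pairs, rank them by frequency, then look at how many partners each gene has) can be sketched as follows. This is an assumed, simplified reading of the workflow: the gene mentions per document are taken as already extracted, the placeholder names and the helper functions `cooccurrence_counts`, `ranked_pairs`, and `partner_counts` are hypothetical, and the statistical measures used in the study are not reproduced.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered gene pair, the documents that mention both genes."""
    counts = Counter()
    for genes in docs:
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] += 1
    return counts


def ranked_pairs(docs):
    """Gene pairs ordered by co-occurrence frequency (ties broken alphabetically)."""
    return sorted(cooccurrence_counts(docs).items(),
                  key=lambda item: (-item[1], item[0]))


def partner_counts(docs):
    """Number of distinct genes each gene co-occurs with."""
    partners = defaultdict(set)
    for a, b in cooccurrence_counts(docs):
        partners[a].add(b)
        partners[b].add(a)
    return {gene: len(others) for gene, others in partners.items()}


# Toy input: gene mentions per abstract, with placeholder names (hypothetical).
toy_docs = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
print(ranked_pairs(toy_docs))
print(partner_counts(toy_docs))
```

In this reading, the abstract's "co-occur with 20 or more genes" criterion corresponds to filtering `partner_counts` at a threshold of 20.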
Dieterich, Christine; Puey, Angela; Lin, Sylvia; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C; Ng, Hanna H
Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription-polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury.
Blenk, Steffen; Engelmann, Julia C; Pinkert, Stefan; Weniger, Markus; Schultz, Jörg; Rosenwald, Andreas; Müller-Hermelink, Hans K; Müller, Tobias; Dandekar, Thomas
Mantle cell lymphoma (MCL) is an incurable B cell lymphoma and accounts for 6% of all non-Hodgkin's lymphomas. On the genetic level, MCL is characterized by the hallmark translocation t(11;14) that is present in most cases with few exceptions. Both gene expression and comparative genomic hybridization (CGH) data vary considerably between patients, with implications for their prognosis. We compare patients above and below the median survival. Exploratory principal component analysis of gene expression data showed that the second principal component correlates well with patient survival. Exploratory analysis of CGH data shows the same correlation. On chromosomes 7 and 9, specific genes and bands are delineated which improve prognosis prediction independent of the previously described proliferation signature. We identify a compact survival predictor of seven genes for MCL patients. After extensive re-annotation using GEPAT, we established protein networks correlating with prognosis. Well-known genes (CDC2, CCND1) and further proliferation markers (WEE1, CDC25, aurora kinases, BUB1, PCNA, E2F1) form a tight interaction network, but non-proliferative genes (SOCS1, TUBA1B, CEBPB) are also shown to be associated with prognosis. Furthermore, we show that aggressive MCL implicates a gene network shift to higher expressed genes in late cell cycle states and refine the set of non-proliferative genes implicated with bad prognosis in MCL. The results from exploratory data analysis of gene expression and CGH data are complementary to each other. Including further tests such as the Wilcoxon rank test, we point to both proliferative and non-proliferative gene networks implicated in inferior prognosis of MCL and identify suitable markers in both gene expression and CGH data.
Few driver genes have been well established in esophageal squamous cell carcinoma (ESCC). Identification of the genomic aberrations that contribute to changes in gene expression profiles can be used to predict driver genes. We searched for driver genes in ESCC by integrative analysis of gene expression microarray profiles and copy number data. To narrow down candidate genes, we performed survival analysis on expression data and tested the genetic vulnerability of each gene using public RNAi screening data. We confirmed the results by performing RNAi experiments and evaluating the clinical relevance of candidate genes in an independent ESCC cohort. We found 10 significantly recurrent copy number alterations accompanying gene expression changes, including loci 11q13.2, 7p11.2, 3q26.33, and 17q12, which harbored CCND1, EGFR, SOX2, and ERBB2, respectively. Analysis of survival data and RNAi screening data suggested that GRB7, located on 17q12, was a driver gene in ESCC. In ESCC cell lines harboring 17q12 amplification, knockdown of GRB7 reduced the proliferation, migration, and invasion capacities of cells. Moreover, siRNA targeting GRB7 had a synergistic inhibitory effect when combined with trastuzumab, an anti-ERBB2 antibody. Survival analysis of the independent cohort also showed that high GRB7 expression was associated with poor prognosis in ESCC. Our integrative analysis provided important insights into ESCC pathogenesis. We identified GRB7 as a novel ESCC driver gene and potential new therapeutic target.
Colding, H; Hartzen, S H; Mohammadi, M
Recently, PCR-restriction fragment length polymorphism (PCR-RFLP) of the urease genes of Helicobacter pylori was evaluated in a meta-analysis; acceptable discriminatory indices of the ureAB and C genes were found. In the present investigation, we found a discriminatory index of 0.95 for 191...... is comparable to typing of other H. pylori urease genes....
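The abstract does not say how its discriminatory index was computed. A common choice for typing methods is the Hunter-Gaston index (Simpson's index of diversity), D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of isolates and n_j the number of isolates of type j. The sketch below assumes that definition; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Hunter-Gaston discriminatory index for a list of per-isolate type labels.

    It is the probability that two isolates drawn at random without
    replacement are assigned to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Toy example (invented): four isolates, two of which share RFLP pattern "A".
example = discriminatory_index(["A", "A", "B", "C"])
```

A value close to 1 means that nearly every pair of isolates is separated by the typing scheme; a value of 0 means that all isolates share a single type.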
Kumar, Kamal; Srivastava, Vikas; Purayannur, Savithri; Kaladhar, V Chandra; Cheruvu, Purnima Jaiswal; Verma, Praveen Kumar
The WRKY genes have been identified as important transcriptional modulators, predominantly during environmental stresses, but they also play critical roles at various stages of the plant life cycle. We report the identification of WRKY domain (WD)-encoding genes from the galegoid clade legumes chickpea (Cicer arietinum L.) and barrel medic (Medicago truncatula). In total, 78 and 98 WD-encoding genes were found in chickpea and barrel medic, respectively. Comparative analysis suggests the presence of both conserved and unique WRKYs, and expansion of the WRKY family in M. truncatula primarily by tandem duplication. Exclusively found in galegoid legumes, CaWRKY16 and its orthologues encode a novel protein having a transmembrane domain and a partial Exo70 domain flanking a group-III WD. The genomic region of galegoids containing CaWRKY16 is more dynamic than that of millettioids. In onion cells, fused CaWRKY16-EYFP showed punctate fluorescent signals in the cytoplasm. The chickpea WRKY group-III genes were further characterized by real-time PCR for their transcript level modulation during pathogenic stress and treatments with abscisic acid, jasmonic acid, and salicylic acid (SA). Differential regulation of genes was observed during Ascochyta rabiei infection and SA treatment. Characterization of the A. rabiei- and SA-inducible gene CaWRKY50 showed that it localizes to the plant nucleus, binds to the W-box, and has a C-terminal transactivation domain. Overexpression of CaWRKY50 in tobacco plants resulted in early flowering and senescence. The in-depth comparative account presented here for two legume WRKY genes will be of great utility in hastening functional characterization of crop legume WRKYs and will also help in characterization of Exo70Js. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Wu, Zhenying; Xu, Xueqin; Xiong, Wangdan; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Wu, Guojiang; Jiang, Huawu
The NAC proteins (NAM, ATAF1/2 and CUC2) are plant-specific transcriptional regulators that have a conserved NAM domain in the N-terminus. They are involved in various biological processes, including both biotic and abiotic stress responses. In the present study, a total of 100 NAC genes (JcNAC) were identified in physic nut (Jatropha curcas L.). Based on phylogenetic analysis and gene structures, 83 JcNAC genes were classified as members of, or proposed to be diverged from, 39 previously predicted orthologous groups (OGs) of NAC sequences. Physic nut has a single intron-containing NAC gene subfamily that has been lost in many plants. The JcNAC genes are non-randomly distributed across the 11 linkage groups of the physic nut genome, and appear to be preferentially retained duplicates that arose from both ancient and recent duplication events. Digital gene expression analysis indicates that some of the JcNAC genes have tissue-specific expression profiles (e.g. in leaves, roots, stem cortex or seeds), and 29 genes differentially respond to abiotic stresses (drought, salinity, phosphorus deficiency and nitrogen deficiency). Our results will be helpful for further functional analysis of the NAC genes in physic nut.
Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild
Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.
Sung, Chang Ohk; Choi, Chel Hun; Ko, Young-Hyeh; Ju, Hyunjeong; Choi, Yoon-La; Kim, Nyunsu; Kang, So Young; Ha, Sang Yun; Choi, Kyusam; Bae, Duk-Soo; Lee, Jeong-Won; Kim, Tae-Joong; Song, Sang Yong; Kim, Byoung-Gie
Ovarian clear cell adenocarcinoma (Ov-CCA) is a distinctive subtype of ovarian epithelial carcinoma. In this study, we performed array comparative genomic hybridization (aCGH) and paired gene expression microarray of 19 fresh-frozen samples and conducted integrative analysis. For the copy number alterations, significantly amplified regions (false discovery rate [FDR] q genes demonstrating frequent copy number alterations (>25% of samples) that correlated with gene expression (FDR genes were mainly located on 8p11.21, 8p21.2-p21.3, 8q22.1, 8q24.3, 17q23.2-q23.3, 19p13.3, and 19p13.11. Among the regions, 8q24.3 was found to contain the most genes (30 of 94 genes) including PTK2. The 8q24.3 region was indicated as the most significant region, as supported by copy number, GISTIC, and integrative analysis. Pathway analysis using differentially expressed genes on 8q24.3 revealed several major nodes, including PTK2. In conclusion, we identified a set of 94 candidate genes with frequent copy number alterations that correlated with gene expression. Specific chromosomal alterations, such as the 8q24.3 gain containing PTK2, could be a therapeutic target in a subset of Ov-CCAs. Copyright © 2013. Published by Elsevier Inc.
Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu
The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned with new primers designed from the conserved region of known orchid LEAFY gene sequences. A partial LFY homologous gene was cloned by conventional PCR, and the complete LFY homologous gene, DenLFY, was then obtained by TAIL-PCR. The complete sequence of the DenLFY gene was 3,575 bp and contained three exons and two introns. BLAST comparison of the exons of LFY homologous genes indicated that the DenLFY gene had high identity with orchid LFY homologues, including the related fragment of PhalLFY (84%) in a Phalaenopsis hybrid cultivar, the LFY homologous gene in Oncidium (90%) and those in other orchids (over 80%). Using MP analysis, Dendrobium was found to be sister to Oncidium and Phalaenopsis. Homology analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were considered separately, the exons and the amino acid sequence were good markers for functional research on the DenLFY gene. The second intron can be used in authentication research on Dendrobium, based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.
MOTIVATION: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis....... This is a major handicap to the interpretation of results obtained from a gene set analysis. RESULTS: This work presents a hierarchical statistical model based on the notion of dependence measures, which overcomes this problem. The two levels of the model naturally reflect the modular structure of many gene set...... analysis methods. We apply the model to show that the popular GSEA method, which recently has been claimed to test the self-contained null hypothesis, actually tests the competitive null if the weight parameter is zero. However, for this result to hold strictly, the choice of the dependence measures...
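To make the role of the weight parameter concrete, here is a minimal Python sketch of the GSEA running-sum enrichment score as it is usually described: genes in the set step the sum up in proportion to |score|^weight, genes outside the set step it down uniformly, and the score is the largest deviation from zero. With weight zero every hit counts equally and the statistic reduces to a Kolmogorov-Smirnov-type statistic. The function name and the toy scores are hypothetical, and this is not the hierarchical model proposed in the abstract.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score over a list sorted by decreasing score."""
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must be a proper, non-empty subset of the list")
    # Hits are weighted by |score|**weight; weight=0 gives every hit weight 1.
    hit_weights = [abs(scores[g]) ** weight if hit else 0.0
                   for g, hit in zip(ranked_genes, in_set)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all genes in the set have zero score; use weight=0")
    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        running += w / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking (invented scores) with a two-gene set.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": -1.0, "g5": -2.0, "g6": -3.0}
gene_set = {"g1", "g4"}

es_unweighted = enrichment_score(ranked, scores, gene_set, weight=0.0)
es_weighted = enrichment_score(ranked, scores, gene_set, weight=1.0)
```

In the toy example the weighted score is larger than the unweighted one because the top-ranked member of the set carries most of the total weight.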
Ma, W; Zhang, T-F; Lu, P; Lu, S H
Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-). A previous study proposed that, under trastuzumab-based neoadjuvant chemotherapy, tumor initiating cell (TIC) featured ER- tumors respond better than ER+ tumors. Exploring the molecular differences between these two groups may help in developing new therapeutic strategies, especially for ER- patients. Using gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed partial least squares (PLS) based analysis, which is more sensitive than common variance/regression analysis. We identified 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving the immune system, metabolism and genetic information processing. Network analysis identified five hub genes with degrees higher than 10: APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide new understanding of the molecular differences between TIC featured ER- and ER+ breast tumors, in the hope of supporting therapeutic studies.
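The hub-gene criterion in this abstract (degree higher than 10) amounts to counting distinct interaction partners per gene. The sketch below assumes the network is available as an undirected edge list; the function name, the partner labels and the toy edges are invented for illustration and do not reproduce the study's network.

```python
from collections import defaultdict


def hub_genes(edges, min_degree=10):
    """Return {gene: degree} for genes with strictly more than min_degree partners."""
    neighbours = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # ignore self-loops
        neighbours[gene_a].add(gene_b)
        neighbours[gene_b].add(gene_a)
    return {gene: len(partners)
            for gene, partners in neighbours.items()
            if len(partners) > min_degree}


# Toy network (invented): ESR1 has 11 partners, APP has exactly 10.
edges = [("ESR1", f"partner{i}") for i in range(11)]
edges += [("APP", f"partner{i}") for i in range(10)]

hubs = hub_genes(edges)
```

Storing neighbours in sets means duplicate edges do not inflate the degree, and the strict comparison mirrors the wording "higher than 10".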
Grauers, Anna; Wang, Jingwen; Einarsdottir, Elisabet
samples from 100 surgically treated idiopathic scoliosis patients. Novel or rare missense, nonsense, or splice site variants were selected for individual genotyping in the 1,739 cases and 1,812 controls. In addition, the 5'UTR, noncoding exon and promoter regions of LBX1, not covered by exome sequencing...... by exome sequencing after filtration and an initial genotyping validation. However, we could not verify any association to idiopathic scoliosis in the large cohort of 1,739 cases and 1,812 controls. We did not find any variants in the 5'UTR, noncoding exon and promoter regions of LBX1. CONCLUSIONS: Here...... that are significantly associated with idiopathic scoliosis in Asian and Caucasian populations, rs11190870 close to the LBX1 gene being the most replicated finding. PURPOSE: The aim of the present study was to investigate the genetics of idiopathic scoliosis in a Scandinavian cohort by performing a candidate gene study...
Background: The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. Results: We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Conclusion: Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Arun Sondur Jayappa
The Janus kinase and signal transducer and activator of transcription (JAK-STAT) pathway genes, along with the suppressors of cytokine signalling (SOCS) family genes, play a crucial role in controlling cytokine signals in the mammary gland and thus in mammary gland development. Mammary gene expression studies showed differential expression patterns for all the JAK-STAT pathway genes. Gene expression studies using qRT-PCR revealed differential expression of the SOCS2, SOCS4 and SOCS5 genes across the lactation cycle in dairy cows. Using genotypes from 1,546 Australian Holstein-Friesian bulls, a statistical analysis based on SNPs within 500 kb of JAK-STAT pathway genes, and of SOCS genes alone, was carried out. The analysis suggested that these genes and pathways make a significant contribution to Australian milk production traits. A selection of 24 SNPs close to the SOCS1, SOCS3, SOCS5, SOCS7 and CISH genes was significantly associated with Australian Profit Ranking (APR), Australian Selection Index (ASI) and protein yield (PY). This study supports the view that there may be some merit in choosing SNPs around functionally relevant genes for selection and genetic improvement schemes for dairy production traits.
Wang, Haifang; Du, Jiulin; Yan, Jun
In the study of circadian rhythms, it has been a puzzle how a limited number of circadian clock genes can control diverse aspects of physiology. Here we investigate circadian gene expression genome-wide using larval zebrafish as a model system. We made use of a spatial gene expression atlas to investigate the expression of circadian genes in various tissues and cell types. Comparison of genome-wide circadian gene expression data between zebrafish and mouse revealed a nearly anti-phase relationship and allowed us to detect novel evolutionarily conserved circadian genes in vertebrates. We identified three groups of zebrafish genes with distinct responses to light entrainment: fast light-induced genes, slow light-induced genes, and dark-induced genes. Our computational analysis of the circadian gene regulatory network revealed several transcription factors (TFs) involved in diverse aspects of circadian physiology through transcriptional cascade. Of these, microphthalmia-associated transcription factor a (mitfa), a dark-induced TF, mediates a circadian rhythm of melanin synthesis, which may be involved in zebrafish's adaptation to daily light cycling. Our study describes a systematic method to discover previously unidentified TFs involved in circadian physiology in complex organisms. PMID:23468616
Zai, W S; Miao, L X; Xiong, Z L; Zhang, H L; Ma, Y R; Li, Y L; Chen, Y B; Ye, S G
Heat shock protein 90 (Hsp90) is a protein produced by plants in response to adverse environmental stresses. In this study, we identified and analyzed Hsp90 gene family members using a bioinformatic method based on genomic data from tomato (Solanum lycopersicum L.). The results illustrated that tomato contains at least 7 Hsp90 genes distributed on 6 chromosomes; protein lengths ranged from 267-794 amino acids. Intron numbers ranged from 2-19 in the genes. The phylogenetic tree revealed that Hsp90 genes in tomato (Solanum lycopersicum L.), rice (Oryza sativa L.), and Arabidopsis (Arabidopsis thaliana L.) could be divided into 5 groups, which included 3 pairs of orthologous genes and 4 pairs of paralogous genes. Expression analysis of RNA-sequence data showed that the Hsp90-1 gene was specifically expressed in mature fruits, while Hsp90-5 and Hsp90-6 showed opposite expression patterns in various tissues of cultivated and wild tomatoes. The expression levels of the Hsp90-1, Hsp90-2, and Hsp90-3 genes in various tissues of cultivated tomatoes were high, while both the expression levels of genes Hsp90-3 and Hsp90-4 were low. Additionally, quantitative real-time polymerase chain reaction showed that these genes were involved in the responses to yellow leaf curl virus in tomato plant leaves. Our results provide a foundation for identifying the function of the Hsp90 gene in tomato.
Background and objective: Non-small cell lung cancer (NSCLC) is one of the most common malignant tumors; however, its causes are still not completely understood. This study was designed to screen the key genes and pathways related to NSCLC occurrence and development and to establish the scientific foundation for the genetic mechanisms and targeted therapy of NSCLC. Methods: Both gene set enrichment analysis (GSEA) and meta-analysis (meta) were used to screen the critical pathways and genes that might be correlated with the development and progression of lung cancer at the transcription level. Results: Using the GSEA and meta methods, focal adhesion and regulation of actin cytoskeleton were determined to be the more prominent overlapping significant pathways. In the focal adhesion pathway, 31 genes were statistically significant (P<0.05), whereas in the regulation of actin cytoskeleton pathway, 32 genes were statistically significant (P<0.05). Conclusion: The focal adhesion and the regulation of actin cytoskeleton pathways might play important roles in the occurrence and development of NSCLC. Further studies are needed to determine the biological function of the positive genes.
Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Pląder, Wojciech; Przybecki, Zbigniew
Real-time quantitative polymerase chain reaction (qPCR) is considered the most reliable method for gene expression studies. However, the expression of a target gene can be misinterpreted because of improper normalization. A crucial step in the analysis of qPCR data is therefore the selection of suitable reference genes, which should be validated experimentally. To choose genes with stable expression in the designed experiment, we performed a reference gene expression analysis. This study used genes described in the literature as well as novel genes predicted to be suitable controls by in silico analysis of transcriptome data. Analysis with the geNorm and NormFinder algorithms made it possible to rank the candidate genes and to indicate the best references for the study of flower morphogenesis. According to the results, CACS and CYCL showed the most stable expression, whereas TUA and EF were the least suitable genes.
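The abstract names geNorm but does not spell out its stability measure. As commonly described, geNorm assigns each candidate gene a value M: the average, over all other candidates, of the standard deviation across samples of the log2 expression ratio between the two genes; lower M means more stable expression. The sketch below assumes that definition, linear-scale relative quantities, and the sample standard deviation. The gene labels `refA`, `refB`, `refC` and all numbers are invented.

```python
import math
from statistics import stdev


def genorm_m_values(quantities):
    """geNorm-style stability value M for each candidate reference gene.

    `quantities` maps a gene name to its relative quantities, one per sample,
    in the same sample order for every gene.
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(quantities[gene], quantities[other])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Toy data (invented): refA and refB vary in parallel across samples, refC does not.
quantities = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],
    "refC": [1.0, 1.0, 4.0, 2.0],
}

m_values = genorm_m_values(quantities)
ranking = sorted(m_values, key=m_values.get)  # most stable candidate first
```

Two genes whose ratio is constant across samples contribute a pairwise standard deviation of zero, which is why `refA` and `refB` rank ahead of `refC` here.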
Microarray Data Analysis of Space Grown Arabidopsis Leaves for Genes Important in Vascular Patterning. Analysis of Space Grown Arabidopsis with Microarray Data from GeneLab: Identification of Genes Important in Vascular Patterning
Weitzel, A. J.; Wyatt, S. E.; Parsons-Wingerter, P.
Venation patterning in leaves is a major determinant of photosynthesis efficiency because of its dependency on vascular transport of photo-assimilates, water, and minerals. Arabidopsis thaliana grown in microgravity show delayed growth and leaf maturation. Gene expression data from the roots, hypocotyl, and leaves of A. thaliana grown during spaceflight vs. ground control analyzed by Affymetrix microarray are available through NASA's GeneLab (GLDS-7). We analyzed the data for differential expression of genes in leaves resulting from the effects of spaceflight on vascular patterning. Two genes were found by preliminary analysis to be up-regulated during spaceflight that may be related to vascular formation. The genes are responsible for coding an ARGOS (Auxin-Regulated Gene Involved in Organ Size)-like protein (potentially affecting cell elongation in the leaves), and an F-box/kelch-repeat protein (possibly contributing to protoxylem specification). Further analysis that will focus on raw data quality assessment and a moderated t-test may further confirm up-regulation of the two genes and/or identify other gene candidates. Plants defective in these genes will then be assessed for phenotype by the mapping and quantification of leaf vascular patterning by NASA's VESsel GENeration (VESGEN) software to model specific vascular differences of plants grown in spaceflight.
Lu, Bin; Yang, Weizhao; Dai, Qiang; Fu, Jinzhong
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a “genes as characters” approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog) and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara) but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis), similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. Finally, treating genes as characters
Jun 17, 2009 ... traditional breeding as well as gene-engineering approaches (Ottow et al., ... tolerance, H. ammodendron is one of the main tree species used for ... labeled with DIG-11-dUTP by reverse transcription using SuperScript III ...
Dec 9, 2013 ... early diagnosis of complex diseases or cancer without obvious symptoms. [Gong J., Diao B., Yao G. J., ... expression levels of thousands of genes in a specific cell or tissue. Previous ..... base of the brain. It mainly controls the ...
with 16 members, accounting for 32% of all the 50 Arabidopsis LEA genes. Aligning ... suggest that they play roles different from those of other LEA proteins. .... [Glycine max]; AAD53078, water stress-induced ER5 protein [Capsicum annuum]; ...
Background: The application of complexity information on DNA sequence and protein in biological processes is well established in this study. Available sequences for the DNMT1 gene, a maintenance methyltransferase responsible for copying DNA methylation patterns to the daughter strands durin...
Aug 26, 2016 ... Late embryogenesis abundant (LEA) protein family is a large protein family that includes proteins accumulated at late stages of seed development or in vegetative tissues in response to drought, salinity, cold stress and exogenous application of abscisic acid. In order to isolate peanut genes, an expressed ...
Troponin I is one of the myofibrillar proteins required for the calcium regulation of skeletal muscle contraction. The expression of both troponin I genes, TNNI1 and TNNI2, is muscle fibre specific and may affect meat quality traits. In this study, the PCR-RFLP method was applied to genotype 120 Mongcai pigs at three ...
Dec 26, 2014 ... In this study, 31 putative apple ARF genes have been identified and located within the apple genome. ... including growth and development of the root and stem, for- ..... Script 1st Strand cDNA Synthesis Kit (Takara, Dalian,.
Using DNA markers and genome organization, several important disease resistance genes have been analyzed in mungbean (Vigna radiata), cowpea (Vigna unguiculata), common bean (Phaseolus vulgaris), and soybean (Glycine max). In the process, medium-density linkage maps consisting of restriction fragment length polymorphism (RFLP) markers were constructed for both mungbean and cowpea. Comparisons between these maps, as well as the maps of soybean and common bean, indicate that there is significant conservation of DNA marker order, though the conserved blocks in soybean are much shorter than in the others. DNA mapping results also indicate that a gene for seed weight may be conserved between mungbean and cowpea. Using the linkage maps, genes that control bruchid (genus Callosobruchus) and powdery mildew (Erysiphe polygoni) resistance in mungbean, aphid (Aphis craccivora) resistance in cowpea, and cyst nematode (Heterodera glycines) resistance in soybean have all been mapped and characterized. For some of these traits, resistance was found to be oligogenic, and DNA mapping uncovered multiple genes involved in the phenotype.
We evaluated the expression of several genes involved in tissue remodelling and bone development in patients with calcific tendinopathy of the rotator cuff. Biopsies from calcified and non-calcified areas were obtained from 10 patients (8 women and 2 men; average age: 55 years; range: 40-68) with calcific tendinopathy of the rotator cuff. To evaluate the expression of selected genes, RNA extraction, cDNA synthesis and quantitative polymerase chain reaction (PCR) were performed. A significantly increased expression of tissue transglutaminase (tTG2) and its substrate, osteopontin, was detected in the calcific areas compared with the levels observed in the normal tissue from the same subject, whereas a modest increase was observed for cathepsin K. There was also a significant decrease in mRNA expression of bone morphogenetic protein 4 (BMP4) and BMP6 in the calcific area. BMP-2, collagen V and vascular endothelial growth factor (VEGF) did not show significant differences. Collagen X and matrix metalloproteinase 9 (MMP-9) were not detectable. A variation in expression of these genes could be characteristic of this form of tendinopathy, since an increased level of these genes has not been detected in other forms of tendon lesions.
This dissertation describes the development of molecular tools to identify genes that are involved in production and health traits in poultry. To unravel the chicken genome, fluorescent molecular markers (microsatellite markers) were developed and optimized to perform high throughput
... were more in the high artemisinin producer species, A. annua, than the other species. We have reported that the light-responsive elements, W-box, CAAT-box, 5′-UTR py-rich stretch, TATA-box sequence and tandem repeat sequences have been identified as important factors in the increased expression of ADS gene.
Haloxylon ammodendron (C.A Mey.) Bunge is a xero-halophytic desert shrub with excellent drought resistance and salt tolerance. To decipher the molecular responses involved in its drought resistance, the cDNA-AFLP (amplified fragment length polymorphism) technique was employed to identify genes expressed ...
The TP53 gene encoding p53 protein is involved in regulating a series of pathways. New discoveries about the function and control of p53 are still in progress and it is hoped to develop better therapeutics and diagnostics by exploiting this system. Evolutionary studies are of prime importance in the field of biological ...
Hal, van N.L.W.; Vorst, O.; Houwelingen, van A.M.M.L.; Kok, E.J.; Peijnenburg, A.A.C.M.; Aharoni, A.; Tunen, van A.J.; Keijer, J.
DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed.
Plon, Sharon E.; Wheeler, David A.; Strong, Louise C.; Tomlinson, Gail E.; Pirics, Michael; Meng, Qingchang; Cheung, Hannah C.; Begin, Phyllis R.; Muzny, Donna M.; Lewis, Lora; Biegel, Jaclyn A.; Gibbs, Richard A.
Clinical cancer genetic susceptibility analysis typically proceeds sequentially beginning with the most likely causative gene. The process is time consuming and the yield is low particularly for families with unusual patterns of cancer. We determined the results of in parallel mutation analysis of a large cancer-associated gene panel. We performed deletion analysis and sequenced the coding regions of 45 genes (8 oncogenes and 37 tumor suppressor or DNA repair genes) in 48 childhood cancer patients who also (1) were diagnosed with a second malignancy under age 30, (2) have a sibling diagnosed with cancer under age 30 and/or (3) have a major congenital anomaly or developmental delay. Deleterious mutations were identified in 6 of 48 (13%) families, 4 of which met the sibling criteria. Mutations were identified in genes previously implicated in both dominant and recessive childhood syndromes including SMARCB1, PMS2, and TP53. No pathogenic deletions were identified. This approach has provided efficient identification of childhood cancer susceptibility mutations and will have greater utility as additional cancer susceptibility genes are identified. Integrating parallel analysis of large gene panels into clinical testing will speed results and increase diagnostic yield. The failure to detect mutations in 87% of families highlights that a number of childhood cancer susceptibility genes remain to be discovered. PMID:21356188
Oct 24, 2011 ... G), and the major structural protein of inner capsid particles (ICP), and also specific antigen of mucosa immunization that mediate specific immunological reaction. In this report, sequence analysis of VP6 gene of giant panda rotavirus was carried out. Full-length VP6 gene encoding for ICP of giant panda.
Kaczkowski, Bogumil; Tanaka, Yuji; Kawaji, Hideya
Genes that are commonly deregulated in cancer are clinically attractive as candidate pan-diagnostic markers and therapeutic targets. To globally identify such targets, we compared Cap Analysis of Gene Expression (CAGE) profiles from 225 different cancer cell lines and 339 corresponding primary cell...
GENE ARRAY ANALYSIS OF THE VENTRAL PROSTATE IN RATS EXPOSED TO EITHER VINCLOZOLIN OR PROCYMIDONE. MB Rosen, VS Wilson, JE Schmid, and LE Gray Jr. US EPA, ORD, NHEERL, RTP, NC. Vinclozolin (Vi) and procymidone (Pr) are antiandrogenic fungicides. While changes in gene expr...
Cloning and homologic analysis of Tpn I gene in silkworm Bombyx mori. Y Zhao, Yao Q, X Tang, Q Wang, H Yin, Z Hu, J Lu, K Chen. Abstract. The troponin complex is composed of three subunits, Troponin C (the calcium sensor component) and Troponin T and I (structural proteins). Tpn C is encoded by multiple genes in ...
Jul 17, 2012 ... These nucleotide and protein sequence analysis of the putative swrW gene provides vital information on the versatility .... chain reaction (PCR) products were stored at 4°C. Presence of ... identical to the same gene with an E-value of 0.0. .... The Prokaryotes-A Handbook on the Biol. of Bacteria:Ecophysiol.
Riess, O; Weber, B; Nørremølle, Anne
as to whether mutations in the human PDEB gene might cause LCA. We have previously cloned and characterized the human homologue of the mouse Pdeb gene and have mapped it to chromosome 4p16.3. In this study, a total of 23 LCA families of various ethnic backgrounds have been investigated. Linkage analysis using...
Ram, Chet; Koramutla, Murali Krishna; Bhattacharya, Ramcharan
Brassica juncea is a chief oil yielding crop in many parts of the world including India. With advancement of molecular techniques, RT-qPCR based study of gene-expression has become an integral part of experimentations in crop breeding. In RT-qPCR, use of appropriate reference gene(s) is pivotal. The virtue of the reference genes, being constant in expression throughout the experimental treatments, needs to be validated case by case. Appropriate reference gene(s) for normalization of gene-expression data in B. juncea during the biotic stress of aphid infestation is not known. In the present investigation, 11 reference genes identified from microarray database of Arabidopsis-aphid interaction at a cut off FDR ≤0.1, along with two known reference genes of B. juncea, were analyzed for their expression stability upon aphid infestation. These included 6 frequently used and 5 newly identified reference genes. Ranking orders of the reference genes in terms of expression stability were calculated using advanced statistical approaches such as geNorm, NormFinder, delta Ct and BestKeeper. The analysis suggested CAC, TUA and DUF179 as the most suitable reference genes. Further, normalization of the gene-expression data of STP4 and PR1 by the most and the least stable reference gene, respectively has demonstrated importance and applicability of the recommended reference genes in aphid infested samples of B. juncea. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
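The ranking of reference genes by expression stability can be illustrated with a minimal sketch in the spirit of the comparative delta-Ct approach, one of the four methods named in the abstract. This is not the authors' code and it does not reproduce geNorm, NormFinder or BestKeeper; the function name and the Ct values are hypothetical, chosen only to show the calculation.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct_values):
    """Rank candidate reference genes, most stable first.

    ct_values maps a gene name to a list of Ct values, one per sample,
    with the same sample order for every gene. For each pair of genes the
    per-sample Ct difference is taken, and the standard deviation of that
    difference measures how much the two genes drift relative to each
    other. A gene's score is the mean of those SDs over all of its
    pairings; a lower score suggests more stable expression.
    """
    genes = list(ct_values)
    pair_sd = {}
    for a, b in combinations(genes, 2):
        diffs = [x - y for x, y in zip(ct_values[a], ct_values[b])]
        pair_sd[(a, b)] = pair_sd[(b, a)] = stdev(diffs)
    scores = {g: mean(pair_sd[(g, h)] for h in genes if h != g) for g in genes}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical Ct values for four samples; NOT data from the study.
example_ct = {
    "REF_A": [22.1, 22.3, 22.0, 22.2],
    "REF_B": [20.5, 20.6, 20.4, 20.7],
    "REF_C": [18.0, 19.5, 17.2, 20.1],
}
ranking = delta_ct_stability(example_ct)
for gene, score in ranking:
    print(f"{gene}: mean SD of delta Ct = {score:.3f}")
```

In this toy input REF_C varies strongly against both other candidates, so it ends up last in the ranking. Real analyses usually combine several such rankings, as the abstract describes.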
Taft, A S; Vermeire, J J; Bernier, J; Birkeland, S R; Cipriano, M J; Papa, A R; McArthur, A G; Yoshino, T P
Infection of the snail, Biomphalaria glabrata, by the free-swimming miracidial stage of the human blood fluke, Schistosoma mansoni, and its subsequent development to the parasitic sporocyst stage is critical to establishment of viable infections and continued human transmission. We performed a genome-wide expression analysis of the S. mansoni miracidia and developing sporocyst using Long Serial Analysis of Gene Expression (LongSAGE). Five cDNA libraries were constructed from miracidia and in vitro cultured 6- and 20-day-old sporocysts maintained in sporocyst medium (SM) or in SM conditioned by previous cultivation with cells of the B. glabrata embryonic (Bge) cell line. We generated 21 440 SAGE tags and mapped 13 381 to the S. mansoni gene predictions (v4.0e) either by estimating theoretical 3' UTR lengths or using existing 3' EST sequence data. Overall, 432 transcripts were found to be differentially expressed amongst all 5 libraries. In total, 172 tags were differentially expressed between miracidia and 6-day conditioned sporocysts and 152 were differentially expressed between miracidia and 6-day unconditioned sporocysts. In addition, 53 and 45 tags, respectively, were differentially expressed in 6-day and 20-day cultured sporocysts, due to the effects of exposure to Bge cell-conditioned medium.
Norton James H
Background: Menisci play a vital role in load transmission, shock absorption and joint stability. There is increasing evidence suggesting that OA menisci may not merely be bystanders in the disease process of OA. This study sought: (1) to determine the prevalence of meniscal degeneration in OA patients, and (2) to examine gene expression in OA meniscal cells compared to normal meniscal cells. Methods: Studies were approved by our human subjects Institutional Review Board. Menisci and articular cartilage were collected during joint replacement surgery for OA patients and lower limb amputation surgery for osteosarcoma patients (normal control specimens), and graded. Meniscal cells were prepared from these meniscal tissues and expanded in monolayer culture. Differential gene expression in OA meniscal cells and normal meniscal cells was examined using Affymetrix microarray and real time RT-PCR. Results: The grades of meniscal degeneration correlated with the grades of articular cartilage degeneration (r = 0.672; P ... HLA-DPA1, integrin, beta 2 (ITGB2), ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1), ankylosis, progressive homolog (ANKH) and fibroblast growth factor 7 (FGF7), were expressed at significantly higher levels in OA meniscal cells compared to normal meniscal cells. Importantly, many of the genes that have been shown to be differentially expressed in other OA cell types/tissues, including ADAM metallopeptidase with thrombospondin type 1 motif 5 (ADAMTS5) and prostaglandin E synthase (PTGES), were found to be expressed at significantly higher levels in OA meniscal cells. This consistency suggests that many of the genes detected in our study are disease-specific. Conclusion: Our findings suggest that OA is a whole joint disease. Meniscal cells may play an active role in the development of OA. Investigation of the gene expression profiles of OA meniscal cells may reveal new therapeutic targets for OA therapy and also may uncover novel
Background: This paper addresses key biological problems and statistical issues in the analysis of large gene expression data sets that describe systemic temporal response cascades to therapeutic doses in multiple tissues such as liver, skeletal muscle, and kidney from the same animals. Affymetrix U34A time-course gene expression data were obtained from three tissues: kidney, liver and muscle. Our goal is not only to find the concordance of genes in different tissues and to identify the common differentially expressed genes over time, but also to examine the reproducibility of the findings by integrating the results from multiple tissues through meta-analysis, in order to gain a significant increase in the power of detecting differentially expressed genes over time and to find the differences among the three tissues in their response to the drug. Results and conclusion: A Bayesian categorical model for estimating the proportion of the 'call' is used for pre-screening genes. A hierarchical Bayesian mixture model is further developed for the identification of differentially expressed genes across time and dynamic clusters. The deviance information criterion is applied to determine the number of components for model comparison and selection. The Bayesian mixture model produces the gene-specific posterior probability of differential/non-differential expression and the 95% credible interval, which is the basis for our further Bayesian meta-inference. Meta-analysis is performed in order to identify commonly expressed genes from multiple tissues that may serve as ideal targets for novel treatment strategies and to integrate the results across separate studies. We found genes commonly expressed in the three tissues. However, the up-, down- or non-regulation of these common genes differs across time points. Moreover, the largest number of differentially expressed genes was found in liver, followed by kidney and then muscle.
Xi, W-D; Liu, Y-J; Sun, X-B; Shan, J; Yi, L; Zhang, T-T
RNA-seq data of colon adenocarcinoma (COAD) were analyzed with bioinformatics tools to discover critical genes in the disease. Relevant small molecule drugs, transcription factors (TFs) and microRNAs (miRNAs) were also investigated. RNA-seq data of COAD were downloaded from The Cancer Genome Atlas (TCGA). Differential analysis was performed with package edgeR. False discovery rate (FDR) ... 1 were set as the cut-offs to screen out differentially expressed genes (DEGs). A gene coexpression network was constructed with package EBcoexpress. GO enrichment analysis was performed for the DEGs in the gene coexpression network with DAVID. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was also performed for the genes with KOBAS 2.0. Modules were identified with MCODE of Cytoscape. Relevant small molecule drugs were predicted by Connectivity Map. Relevant miRNAs and TFs were searched by WebGestalt. A total of 457 DEGs, including 255 up-regulated and 202 down-regulated genes, were identified from 437 COAD and 39 control samples. A gene coexpression network was constructed containing 40 DEGs and 101 edges. The genes were mainly associated with collagen fibril organization, extracellular matrix organization and translation. Two modules were identified from the gene coexpression network, which were implicated in muscle contraction and extracellular matrix organization, respectively. Several critical genes were disclosed, such as MYH11, COL5A2 and ribosomal proteins. Nine relevant small molecule drugs were identified, such as scriptaid and STOCK1N-35874. Accordingly, a total of 17 TFs and 10 miRNAs related to COAD were acquired, such as ETS2, NFAT, AP4, miR-124A, miR-9, miR-96 and let-7. Several critical genes and relevant drugs, TFs and miRNAs were revealed in COAD. These findings could advance the understanding of the disease and benefit therapy development.
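Screening DEGs by an FDR cut-off combined with a fold-change cut-off can be sketched as follows. The exact thresholds are not recoverable from the abstract as reproduced here, so the values 0.05 and 1 used below are assumptions, as are the function names and the toy results. The Benjamini-Hochberg adjustment is shown as one common way to obtain FDR values; the abstract does not state which procedure the authors applied.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that each adjusted value is
    # the minimum of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(results, fdr_cutoff, abs_log_fc_cutoff):
    """Keep genes passing both cut-offs.

    results is a list of (gene, log2_fold_change, p_value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in results])
    return [
        (gene, lfc, q)
        for (gene, lfc, _), q in zip(results, fdr)
        if q < fdr_cutoff and abs(lfc) > abs_log_fc_cutoff
    ]


# Hypothetical per-gene results; NOT values from the study.
toy_results = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.8, 0.004),
    ("geneC", 0.4, 0.002),
    ("geneD", 1.5, 0.6),
]
degs = select_degs(toy_results, fdr_cutoff=0.05, abs_log_fc_cutoff=1.0)
print([gene for gene, _, _ in degs])
```

With these toy numbers geneC is dropped for its small fold change and geneD for its large adjusted p-value.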
Background: Domain or gene fusion analysis is a bioinformatics method for detecting gene fusions in one organism by comparing its genome to that of other organisms. The occurrence of gene fusions suggests that the two original genes that participated in the fusion are functionally linked, i.e. their gene products interact either as part of a multi-subunit protein complex, or in a metabolic pathway. Gene fusion analysis has been used to identify protein functional links in prokaryotes as well as in eukaryotic model organisms, such as yeast and Drosophila. Results: In this study we have extended this approach to include a number of recently sequenced protists, four of which are pathogenic, to identify fusion linked proteins in Trypanosoma brucei, the causative agent of African sleeping sickness. We have also examined the evolution of the gene fusion events identified, to determine whether they can be attributed to fusion or fission, by looking at the conservation of the fused genes and of the individual component genes across the major eukaryotic and prokaryotic lineages. We find relatively limited occurrence of gene fusions/fissions within the protist lineages examined. Our results point to two trypanosome-specific gene fissions, which have recently been experimentally confirmed, one fusion involving proteins involved in the same metabolic pathway, as well as two novel putative functional links between fusion-linked protein pairs. Conclusions: This is the first study of protein functional links in T. brucei identified by gene fusion analysis. We have used strict thresholds and only discuss results which are highly likely to be genuine and which either have already been or can be experimentally verified. We discuss the possible impact of the identification of these novel putative protein-protein interactions on the development of new trypanosome therapeutic drugs.
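The core idea of gene fusion analysis can be sketched under simple assumptions: two separate proteins in the query genome are treated as candidate functional partners when both align to non-overlapping regions of one composite protein in another genome. The input format, function name and protein identifiers below are hypothetical, and the sketch ignores the strict similarity thresholds the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def find_fusion_links(hits, max_overlap=0):
    """Infer candidate functional links from composite (fused) proteins.

    hits is an iterable of (component, composite, start, end) tuples,
    meaning that the query-genome protein `component` aligns to residues
    start..end (inclusive) of the reference-genome protein `composite`.
    Two components are linked when they align to the same composite over
    regions that overlap by at most `max_overlap` residues.
    """
    by_composite = defaultdict(list)
    for component, composite, start, end in hits:
        by_composite[composite].append((component, start, end))

    links = set()
    for composite, regions in by_composite.items():
        for (a, a_start, a_end), (b, b_start, b_end) in combinations(regions, 2):
            if a == b:
                continue  # two hits of the same protein are not a fusion
            overlap = min(a_end, b_end) - max(a_start, b_start) + 1
            if overlap <= max_overlap:
                links.add((tuple(sorted((a, b))), composite))
    return sorted(links)


# Hypothetical alignment hits; the identifiers are placeholders.
toy_hits = [
    ("query_p1", "fused_X", 1, 200),
    ("query_p2", "fused_X", 210, 450),
    ("query_p3", "fused_Y", 1, 300),
    ("query_p4", "fused_Y", 150, 400),
]
fusion_links = find_fusion_links(toy_hits)
print(fusion_links)
```

Here query_p1 and query_p2 cover distinct halves of fused_X and are reported as a link, while query_p3 and query_p4 overlap heavily on fused_Y and more plausibly represent homologs of the same region.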
Kjaerulff, S; Davey, William John; Nielsen, O
We previously identified two genes, mfm1 and mfm2, with the potential to encode the M-factor mating pheromone of the fission yeast Schizosaccharomyces pombe (J. Davey, EMBO J. 11:951-960, 1992), but further analysis revealed that a mutant strain lacking both genes still produced active M-factor. ... that is not rescued by addition of exogenous M-factor. A mutational analysis reveals that all three mfm genes contribute to the production of M-factor. Their transcription is limited to M cells and requires the mat1-Mc and ste11 gene products. Each gene is induced when the cells are starved of nitrogen and further...
Joshua C Kwekel
Age is a predisposing condition for susceptibility to chronic kidney disease and progression as well as acute kidney injury that may arise due to the adverse effects of some drugs. Age-related differences in kidney biology, therefore, are a key concern in understanding drug safety and disease progression. We hypothesize that the underlying suite of genes expressed in the kidney at various life cycle stages will impact susceptibility to adverse drug reactions. Therefore, establishing changes in baseline expression data between these life stages is the first and necessary step in evaluating this hypothesis. Untreated male F344 rats were sacrificed at 2, 5, 6, 8, 15, 21, 78, and 104 weeks of age. Kidneys were collected for histology and gene expression analysis. Agilent whole-genome rat microarrays were used to query global expression profiles. An ANOVA (p ... 1.5 in relative mRNA expression) was used to identify 3,724 unique differentially expressed genes (DEGs). Principal component analyses of these DEGs revealed three major divisions in life-cycle renal gene expression. K-means cluster analysis identified several groups of genes that shared age-specific patterns of expression. Pathway analysis of these gene groups revealed age-specific gene networks and functions related to renal function and aging, including extracellular matrix turnover, immune cell response, and renal tubular injury. Large age-related changes in expression were also demonstrated for the genes that code for qualified renal injury biomarkers KIM-1, Clu, and Tff3. These results suggest specific groups of genes that may underlie age-specific susceptibilities to adverse drug reactions and disease. This analysis of the basal gene expression patterns of renal genes throughout the life cycle of the rat will improve the use of current and future renal biomarkers and inform our assessments of kidney injury and disease.
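The abstract describes selecting DEGs with an ANOVA combined with a relative expression criterion. The exact thresholds are not recoverable from the text as reproduced here, so the sketch below only shows how the two per-gene quantities could be computed: a one-way ANOVA F statistic across age groups and the ratio between the highest and lowest group mean. The function names and input values are hypothetical, and converting F to a p-value (which needs the F distribution) is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements.

    Assumes at least two groups and some within-group variation
    (otherwise the denominator is zero).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def max_fold_change(groups):
    """Ratio of the largest to the smallest group mean.

    Assumes positive, linear-scale expression values.
    """
    means = [mean(g) for g in groups]
    return max(means) / min(means)


# Hypothetical expression values for one gene at three ages,
# three animals per age; NOT data from the study.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(toy_groups)
fold = max_fold_change(toy_groups)
print(f"F = {f_stat:.2f}, max fold change = {fold:.2f}")
```

A gene would then be kept only if both its ANOVA p-value and its fold change pass whatever cut-offs the study defines.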
Bendjilali, Nasrine; MacLeon, Samuel; Kalra, Gurmannat; Willis, Stephen D; Hossian, A K M Nawshad; Avery, Erica; Wojtowicz, Olivia; Hickman, Mark J
Many cells experience hypoxia, or low oxygen, and respond by dramatically altering gene expression. In the yeast Saccharomyces cerevisiae, genes that respond are required for many oxygen-dependent cellular processes, such as respiration, biosynthesis, and redox regulation. To more fully characterize the global response to hypoxia, we exposed yeast to hypoxic conditions, extracted RNA at different times, and performed RNA sequencing (RNA-seq) analysis. Time-course statistical analysis revealed hundreds of genes that changed expression by up to 550-fold. The genes responded with varying kinetics suggesting that multiple regulatory pathways are involved. We identified most known oxygen-regulated genes and also uncovered new regulated genes. Reverse transcription-quantitative PCR (RT-qPCR) analysis confirmed that the lysine methyltransferase EFM6 and the recombinase DMC1, both conserved in humans, are indeed oxygen-responsive. Looking more broadly, oxygen-regulated genes participate in expected processes like respiration and lipid metabolism, but also in unexpected processes like amino acid and vitamin metabolism. Using principal component analysis, we discovered that the hypoxic response largely occurs during the first 2 hr and then a new steady-state expression state is achieved. Moreover, we show that the oxygen-dependent genes are not part of the previously described environmental stress response (ESR) consisting of genes that respond to diverse types of stress. While hypoxia appears to cause a transient stress, the hypoxic response is mostly characterized by a transition to a new state of gene expression. In summary, our results reveal that hypoxia causes widespread and complex changes in gene expression to prepare the cell to function with little or no oxygen. Copyright © 2017 Bendjilali et al.
Robert T Gaeta
. Furthermore, our microarray analysis did not provide strong evidence that homoeologous rearrangements were a determinant of genome-wide nonadditive gene expression. In light of the inherent limitations of the Arabidopsis microarray to measure gene expression in polyploid Brassicas, further studies are warranted.
Blomstrøm, Monica Marie
several growth modulators and invasion modulators were identified and independently validated. These candidates revealed a group of genes with metastasis-related functions in vitro that are involved in RNA-related processes, such as RNA-processing. Moreover, a general feature was that proliferation......) and non-CSCs. The main goal of this project was to functionally characterize a set of candidate genes recovered from next-generation sequencing analysis for their role in breast cancer metastasis formation. The starting gene set comprised 104 gene variants; i.e. 57 wildtype and 47 mutated variants. During...
Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J
Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.
Cao, Yunpeng; Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping
The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice (Oryza sativa), maize (Zea mays), and Arabidopsis (Arabidopsis thaliana). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis, respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogs of VQ motif-containing genes were found in P. bretschneideri and P. communis, respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis. A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplicated gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis, respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus, and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis.
Bao, Yanyan; Gao, Yingjie; Shi, Yujing; Cui, Xiaolan
H1N1, a major pathogenic subtype of influenza A virus, causes a respiratory infection in humans and livestock that can range from a mild infection to more severe pneumonia associated with acute respiratory distress syndrome. Understanding the dynamic changes in the genome and the related functional changes induced by H1N1 influenza virus infection is essential to elucidating the pathogenesis of this virus and thereby determining strategies to prevent future outbreaks. In this study, we filtered the significantly expressed genes in mouse pneumonia using mRNA microarray analysis. Using STC analysis, seven significant gene clusters were revealed, and using STC-GO analysis, we explored the significant functions of these seven gene clusters. The results revealed GOs related to H1N1 virus-induced inflammatory and immune functions, including innate immune response, inflammatory response, specific immune response, and cellular response to interferon-beta. Furthermore, the dynamic regulation relationships of the key genes in mouse pneumonia were revealed by dynamic gene network analysis, and the most important genes were filtered, including Dhx58, Cxcl10, Cxcl11, Zbp1, Ifit1, Ifih1, Trim25, Mx2, Oas2, Cd274, Irgm1, and Irf7. These results suggested that during mouse pneumonia, changes in the expression of gene clusters and the complex interactions among genes lead to significant changes in function. Dynamic gene expression analysis revealed key genes that performed important functions. These results are a prelude to advancements in mouse H1N1 influenza virus infection biology, as well as the use of mice as a model organism for human H1N1 influenza virus infection studies.
Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing
The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.
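The abstract does not say by which criterion key genes were picked from the PPI network. One common heuristic is to rank nodes by degree, that is, by their number of distinct interaction partners. The sketch below illustrates only that assumption, with a hypothetical function name and placeholder node labels rather than interactions from the STRING database.

```python
from collections import Counter


def rank_hubs(edges):
    """Rank network nodes by degree, highest first.

    edges is an iterable of (node_a, node_b) pairs. Duplicate edges,
    including the same pair in reversed order, are counted once, and
    self-loops are ignored. Ties are broken alphabetically.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique_edges:
        degree.update(pair)
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical interaction list; node names are placeholders.
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P1"),  # duplicate of ("P1", "P3") in reverse order
    ("P5", "P5"),  # self-loop, ignored
]
hubs = rank_hubs(toy_edges)
print(hubs)
```

In practice, tools such as Cytoscape offer several centrality measures, and the choice of measure can change which genes appear as hubs.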
Yazun Bashir Jarrar
Nov 26, 2017 ... Sequence analysis of the N-acetyltransferase 2 gene (NAT2) among Jordanian volunteers, Libyan. Journal of Medicine .... For molecular modeling of NAT2 protein, visualized ..... cal clustering. .... cular dynamics simulation.
[Solc R., Hirschfeldova K., Kebrdlova V. and Baxova A. 2014 Analysis of common SHOX gene sequence variants ... based on a Gibbs sampling strategy were done using .... SHOX (short stature homeobox) are an important cause of growth.
Crampton, Michael C
This presentation focused on the transcriptional analysis of heterologous gene expression using the endogenous sD promoter from Bacillus halodurans. It concludes with a successful implementation of a high throughput mRNA sandwich hybridisation...
sequencing gene differential expression analysis (Chen et al. ... DNase digestion (Takara, Shiga, Japan), so that any remain- ..... ing the early moments of pollen germination (Guyon et al. 2000). The steady-state transcript level of PGPS/D3 ...
Chen, Chun; Xie, Tingna; Ye, Sudan; Jensen, Annette Bruun; Eilenberg, Jørgen
The selection of suitable reference genes is crucial for accurate quantification of gene expression and can add to our understanding of host-pathogen interactions. To identify suitable reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of three traditional candidate genes including 18S rRNA(18S), 28S rRNA(28S) and elongation factor 1 alpha-like protein (EF1), were measured by quantitative polymerase chain reaction at different developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae), and under different nutritional conditions. We calculated the expression stability of candidate reference genes using four algorithms including geNorm, NormFinder, BestKeeper and Delta Ct. The analysis results revealed that the comprehensive ranking of candidate reference genes from the most stable to the least stable was 18S (1.189), 28S (1.414) and EF1 (3). The 18S was, therefore, the most suitable reference gene for real-time RT-PCR analysis of gene expression under all conditions. These results will support further studies on gene expression in P. neoaphidis. Copyright © 2015 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
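The comprehensive ranking values quoted above (1.189, 1.414 and 3) are consistent with a geometric mean of per-algorithm ranks. The abstract does not state the formula, so this is an assumption, and the per-algorithm ranks below are hypothetical values chosen only because they reproduce the reported scores (ties are allowed).

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical ranks from four stability algorithms (1 = most stable).
HYPOTHETICAL_RANKS = {
    "18S": [1, 1, 1, 2],
    "28S": [1, 2, 2, 1],
    "EF1": [3, 3, 3, 3],
}

scores = {gene: geometric_mean(ranks) for gene, ranks in HYPOTHETICAL_RANKS.items()}
ordering = sorted(scores, key=scores.get)  # most stable first

for gene in ordering:
    print(f"{gene}: {scores[gene]:.3f}")
```

A lower score means a more stable candidate under this aggregation, so the ordering matches the one reported in the abstract.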
Zhang, Tianyu; Pabst, Breana; Klapper, Isaac; Stewart, Philip S
A theory for analysis and prediction of spatial and temporal patterns of gene and protein expression within microbial biofilms is derived. The theory integrates phenomena of solute reaction and diffusion, microbial growth, mRNA or protein synthesis, biomass advection, and gene transcript or protein turnover. Case studies illustrate the capacity of the theory to simulate heterogeneous spatial patterns and predict microbial activities in biofilms that are qualitatively different from those of planktonic cells. Specific scenarios analyzed include an inducible GFP or fluorescent protein reporter, a denitrification gene repressed by oxygen, an acid stress response gene, and a quorum sensing circuit. It is shown that the patterns of activity revealed by inducible stable fluorescent proteins or reporter unstable proteins overestimate the region of activity. This is due to advective spreading and finite protein turnover rates. In the cases of a gene induced by either limitation for a metabolic substrate or accumulation of a metabolic product, maximal expression is predicted in an internal stratum of the biofilm. A quorum sensing system that includes an oxygen-responsive negative regulator exhibits behavior that is distinct from any stage of a batch planktonic culture. Though here the analyses have been limited to simultaneous interactions of up to two substrates and two genes, the framework applies to arbitrarily large networks of genes and metabolites. Extension of reaction-diffusion modeling in biofilms to the analysis of individual genes and gene networks is an important advance that dovetails with the growing toolkit of molecular and genetic experimental techniques.
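The reaction-diffusion ingredient of such a theory can be illustrated with a much simpler case than the one the abstract describes: a single substrate consumed with first-order kinetics in a flat biofilm at steady state. This is a textbook simplification and not the authors' model; the function name and all parameter values are hypothetical. With D C'' = k C, a fixed concentration at the biofilm surface and no flux at the substratum, the profile is C(z) = C_bulk cosh(φ z / L) / cosh(φ), where φ = L sqrt(k / D).

```python
import math


def substrate_profile(z, thickness, diffusivity, rate_constant, bulk_conc):
    """Steady-state substrate concentration at height z above the substratum.

    Solves D * C'' = k * C with C(thickness) = bulk_conc and C'(0) = 0.
    """
    phi = thickness * math.sqrt(rate_constant / diffusivity)  # Thiele modulus
    return bulk_conc * math.cosh(phi * z / thickness) / math.cosh(phi)


# Hypothetical parameters in SI units, for illustration only.
THICKNESS = 100e-6    # biofilm thickness, m
DIFFUSIVITY = 2e-9    # effective diffusivity, m^2/s
RATE_CONSTANT = 1.0   # first-order consumption rate, 1/s
BULK_CONC = 0.25      # concentration at the biofilm surface, mol/m^3

for z_um in (0, 25, 50, 75, 100):
    conc = substrate_profile(
        z_um * 1e-6, THICKNESS, DIFFUSIVITY, RATE_CONSTANT, BULK_CONC
    )
    print(f"z = {z_um:3d} um  C = {conc:.4f} mol/m^3")
```

The concentration falls from the surface toward the substratum, which is the kind of gradient that makes expression of substrate-responsive genes depend on depth.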
Kim, Jin-Ae; Bhatnagar, Nikita; Kwon, Soon Jae; Min, Myung Ki; Moon, Seok-Jun; Yoon, In Sun; Kwon, Taek-Ryoun; Kim, Sun Tae; Kim, Beom-Gi
The phytohormone abscisic acid (ABA) enables plants to adapt to adverse environmental conditions through the modulation of metabolic pathways and of growth and developmental programs. We used comparative microarray analysis to identify genes exhibiting ABA-dependent expression, and among them genes that also respond to other hormones, in Oryza sativa shoot and root. We identified 854 genes as significantly up- or down-regulated in root or shoot under ABA treatment. Most of these genes had similar expression profiles in root and shoot under ABA treatment, whereas 86 genes displayed opposite expression responses in root and shoot. To examine the crosstalk between ABA and other hormones, we compared the expression profiles of the ABA-dependently regulated genes under several different hormone treatment conditions. Interestingly, around half of the ABA-dependently expressed genes were also regulated by jasmonic acid (JA) based on microarray data analysis. We searched the promoter regions of these genes for cis-elements that could be responsible for their responsiveness to both hormones, and found that ABRE and MYC2 elements, among others, were common to the promoters of genes that were regulated by both ABA and JA. These results show that ABA and JA might share a common gene expression regulation system, which might explain why JA can function in both abiotic and biotic stress tolerance.
The formation of granulomas is associated with the resolution of Q fever, a zoonosis due to Coxiella burnetii; however, the molecular mechanisms of granuloma formation remain poorly understood. We generated human granulomas with peripheral blood mononuclear cells and beads coated with C. burnetii, using BCG extracts as controls. A microarray analysis showed dramatic changes in gene expression in granuloma cells, of which more than 50% were commonly modulated genes in response to C. burnetii and BCG. They included M1-related genes and genes related to chemotaxis. The inhibition of the chemokines CCL2 and CCL5 directly interfered with granuloma formation. C. burnetii granulomas also expressed a specific transcriptional profile that was essentially enriched in genes associated with type I interferon response. Our results showed that granuloma formation is associated with a core of transcriptional response based on inflammatory genes. The specific granulomatous response to C. burnetii is characterized by the activation of the type I interferon pathway.
Salagacka-Kubiak, Aleksandra; Żebrowska, Marta; Wosiak, Agnieszka; Balcerczak, Mariusz; Mirowski, Marek; Balcerczak, Ewa
The aim of this study was to evaluate the participation of polymorphism at position C421A and mRNA expression of the ABCG2 gene in the development of peptic ulcers, a very common and severe disease. ABCG2, encoded by the ABCG2 gene, has been found inter alia in the gastrointestinal tract, where it plays a protective role by eliminating xenobiotics from cells into the extracellular environment. The materials for the study were biopsies of gastric mucosa taken during a routine endoscopy. For genotyping by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) at position C421A, DNA was isolated from 201 samples, while for the mRNA expression level by real-time PCR, RNA was isolated from 60 patients. The control group of healthy individuals consisted of 97 blood donors. The dominant genotype in the group of peptic ulcer patients and healthy individuals was homozygous CC. No statistically significant differences were found between healthy individuals and the whole group of peptic ulcer patients or between the subgroups of peptic ulcer patients (infected and uninfected with Helicobacter pylori). ABCG2 expression relative to GAPDH expression was found in 38 of the 60 gastric mucosa samples. The expression level of the gene varied greatly among cases. A statistically significant association (p = 0.0375) between the intensity of H. pylori infection and ABCG2 gene expression was shown. It was observed that the more intense the infection, the higher the level of ABCG2 expression.
To compare the sequences and tissue expressions of the two aces between the two species, cDNAs encoding two ace genes were cloned and designated as Bmm-ace1 and Bmm-ace2 from the larvae of the B. mandarina. The amino acid sequence of Bmm-ace1 shared 99.71% homology with its homolog, Bm-ace1, in B.
Background. Coronary artery atherosclerosis is a chronic inflammatory disease. This study aimed to identify the key changes of gene expression between early and advanced carotid atherosclerotic plaque in humans. Methods. Gene expression dataset GSE28829 was downloaded from Gene Expression Omnibus (GEO), including 16 advanced and 13 early stage atherosclerotic plaque samples from human carotid. Differentially expressed genes (DEGs) were analyzed. Results. 42,450 genes were obtained from the dataset. Top 100 up- and downregulated DEGs were listed. Functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) identification were performed. The result of functional and pathway enrichment analysis indicated that the immune system process played a critical role in the progression of carotid atherosclerotic plaque. Protein-protein interaction (PPI) networks were also constructed. Top 10 hub genes were identified from the PPI network and top 6 modules were inferred. These genes were mainly involved in chemokine signaling pathway, cell cycle, B cell receptor signaling pathway, focal adhesion, and regulation of actin cytoskeleton. Conclusion. The present study indicated that analysis of DEGs would give a deeper understanding of the molecular mechanisms of atherosclerosis development and that they might be used as molecular targets and diagnostic biomarkers for the treatment of atherosclerosis.
Polyunsaturated fatty acids (PUFAs), especially the n-3 polyunsaturated fatty acids docosahexaenoic acid (DHA) and eicosapentaenoic acid (EPA), are known to protect against inflammation-induced bone loss in chronic inflammatory diseases, such as rheumatoid arthritis, periodontitis and osteoporosis. We previously reported that DHA, not EPA, inhibited osteoclastogenesis induced by the receptor activator of nuclear factor-κB ligand (sRANKL) in vitro. In this study, we performed gene expression analysis using microarrays to identify genes affected by the DHA treatment during osteoclastogenesis. DHA strongly inhibited osteoclastogenesis at the late stage. Among the genes upregulated by the sRANKL treatment, 4779 genes were downregulated by DHA and upregulated by the EPA treatment. Gene ontology analysis identified sets of genes related to cell motility, cell adhesion, cell-cell signaling and cell morphogenesis. Quantitative PCR analysis confirmed that DC-STAMP, an essential gene for the cell fusion process in osteoclastogenesis, and other osteoclast-related genes, such as Siglec-15, Tspan7 and Mst1r, were inhibited by DHA.
Wu, Jian; Wang, Feiyan; Cheng, Lin; Kong, Fuling; Peng, Zhen; Liu, Songyu; Yu, Xiaolin; Lu, Gang
Auxin response factors (ARFs) encode transcriptional factors that bind specifically to the TGTCTC-containing auxin response elements found in the promoters of primary/early auxin response genes that regulate plant development. In this study, investigation of the tomato genome revealed 21 putative functional ARF genes (SlARFs), a number comparable to that found in Arabidopsis (23) and rice (25). The full cDNA sequences of 15 novel SlARFs were isolated and delineated by sequencing of PCR products. A comprehensive genome-wide analysis of this gene family is presented, including the gene structures, chromosome locations, phylogeny, and conserved motifs. In addition, a comparative analysis between ARF family genes in tomato and maize was performed. A phylogenetic tree generated from alignments of the full-length protein sequences of 21 OsARFs, 23 AtARFs, 31 ZmARFs, and 21 SlARFs revealed that these ARFs were clustered into four major groups. However, we could not find homologous genes in rice, maize, or tomato with AtARF12-15 and AtARF20-23. The expression patterns of tomato ARF genes were analyzed by quantitative real-time PCR. Our comparative analysis will help to define possible functions for many of these newly isolated ARF-family genes in plant development.
Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei
WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs were distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located on the corresponding chromosomes of G. raimondii, suggesting great chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation, with different expression patterns between species. Complex WRKY gene expression, such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events, was also observed in both diploid and tetraploid cottons during the fiber development process. In conclusion, this study provides important information on the evolution and function of the WRKY gene family in cotton species.
Gesing, Stefan; Schindler, Daniel; Nowrousian, Minou
Ascomycetes differentiate four major morphological types of fruiting bodies (apothecia, perithecia, pseudothecia and cleistothecia) that are derived from an ancestral fruiting body. Thus, fruiting body differentiation is most likely controlled by a set of common core genes. One way to identify such genes is to search for genes with evolutionary conserved expression patterns. Using suppression subtractive hybridization (SSH), we selected differentially expressed transcripts in Pyronema confluens (Pezizales) by comparing two cDNA libraries specific for sexual and for vegetative development, respectively. The expression patterns of selected genes from both libraries were verified by quantitative real time PCR. Expression of several corresponding homologous genes was found to be conserved in two members of the Sordariales (Sordaria macrospora and Neurospora crassa), a derived group of ascomycetes that is only distantly related to the Pezizales. Knockout studies with N. crassa orthologues of differentially regulated genes revealed a functional role during fruiting body development for the gene NCU05079, encoding a putative MFS peptide transporter. These data indicate conserved gene expression patterns and a functional role of the corresponding genes during fruiting body development; such genes are candidates of choice for further functional analysis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sheng, Sheng; Liao, Cheng-Wu; Zheng, Yu; Zhou, Yu; Xu, Yan; Song, Wen-Miao; He, Peng; Zhang, Jian; Wu, Fu-An
Meteorus pulchricornis is an endoparasitoid wasp which attacks the larvae of various lepidopteran pests. We present the first antennal transcriptome dataset for M. pulchricornis. A total of 48,845,072 clean reads were obtained and 34,967 unigenes were assembled. Of these, 15,458 unigenes showed a significant similarity (E-value < 10^-5) to known proteins in the NCBI non-redundant protein database. Gene ontology (GO) and cluster of orthologous groups (COG) analyses were used to classify the functions of M. pulchricornis antennae genes. We identified 16 putative odorant-binding protein (OBP) genes, eight chemosensory protein (CSP) genes, 99 olfactory receptor (OR) genes, 19 ionotropic receptor (IR) genes and one sensory neuron membrane protein (SNMP) gene. BLASTx best hit results and phylogenetic analysis both indicated that these chemosensory genes were most closely related to those found in other hymenopteran species. Real-time quantitative PCR assays showed that 14 MpulOBP genes were antennae-specific. Of these, MpulOBP6, MpulOBP9, MpulOBP10, MpulOBP12, MpulOBP15 and MpulOBP16 were found to have greater expression in the antennae than in other body parts, while MpulOBP2 and MpulOBP3 were expressed predominately in the legs and abdomens, respectively. These results might provide a foundation for future studies of olfactory genes and chemoreception in M. pulchricornis. Copyright © 2017 Elsevier Inc. All rights reserved.
Chernoff, A; Kasparcova, V; Linehan, W M; Stolle, C A
Von Hippel-Lindau (VHL) disease is an autosomal dominant disorder that predisposes the affected individual to develop characteristic tumors. These include CNS hemangioblastoma, retinal angiomas, endolymphatic sac tumors, pancreatic cysts and tumors, epididymal cystadenomas, pheochromocytomas, renal cysts, and clear-cell renal carcinoma. The VHL gene was localized to 3p25 and then isolated by Latif et al. (1). The gene contains three exons with an open reading frame of 852 nucleotides, which encode a predicted protein of 284 amino acids. The VHL protein is believed to have several functions. It is involved in transcription regulation through its inhibition of elongation by binding to the B and C subunits of elongin. Mutations of VHL allow the B and C subunits to bind with the A subunit. This complex then overcomes "pausing" of RNA polymerase during mRNA transcription (2,3). Several studies suggest that the VHL protein is also involved in regulation of hypoxia-inducible transcripts, particularly vascular endothelial growth factor (VEGF), by altering mRNA stability (4,5). Therefore, VHL gene mutations permit the overexpression of VEGF under normoxic conditions, which leads to the angiogenesis believed to be required for tumor growth. The VHL-elongin BC complex (VBC) also binds two other proteins, CUL2 and Rbx1, in a complex that has structural similarity to other E3 ubiquitin ligase complexes (6). Such complexes mediate the degradation of cell-cycle regulatory proteins.
Flynn, Elizabeth K; Kamat, Aparna; Lach, Francis P; Donovan, Frank X; Kimble, Danielle C; Narisu, Narisu; Sanborn, Erica; Boulad, Farid; Davies, Stella M; Gillio, Alfred P; Harris, Richard E; MacMillan, Margaret L; Wagner, John E; Smogorzewska, Agata; Auerbach, Arleen D; Ostrander, Elaine A; Chandrasekharappa, Settara C
Fanconi anemia (FA) is a rare recessive disease resulting from mutations in one of at least 16 different genes. Mutation types and phenotypic manifestations of FA are highly heterogeneous and influence the clinical management of the disease. We analyzed 202 FA families for large deletions, using high-resolution comparative genome hybridization arrays, single-nucleotide polymorphism arrays, and DNA sequencing. We found pathogenic deletions in 88 FANCA, seven FANCC, two FANCD2, and one FANCB families. We find 35% of FA families carry large deletions, accounting for 18% of all FA pathogenic variants. Cloning and sequencing across the deletion breakpoints revealed that 52 FANCA deletion ends, and one FANCC deletion end extended beyond the gene boundaries, potentially affecting neighboring genes with phenotypic consequences. Seventy-five percent of the FANCA deletions are Alu-Alu mediated, predominantly by AluY elements, and appear to be caused by nonallelic homologous recombination. Individual Alu hotspots were identified. Defining the haplotypes of four FANCA deletions shared by multiple families revealed that three share a common ancestry. Knowing the exact molecular changes that lead to the disease may be critical for a better understanding of the FA phenotype, and to gain insight into the mechanisms driving these pathogenic deletion variants. © 2014 WILEY PERIODICALS, INC.
Flanagan, Simon; Lee, Maggie; Li, Cheryl C. Y.; Suter, Catherine M.; Buckland, Michael E.
Mutations in isocitrate dehydrogenase (IDH)-1 or -2 are found in the majority of WHO grade II and III astrocytomas and oligodendrogliomas, and secondary glioblastomas. Almost all described mutations are heterozygous missense mutations affecting a conserved arginine residue in the substrate binding site of IDH1 (R132) or IDH2 (R172). But the exact mechanism of IDH mutations in neoplasia is not understood. It has been proposed that IDH mutations impart a “toxic gain-of-function” to the mutant protein, however a dominant-negative effect of mutant IDH has also been described, implying that IDH may function as a tumor suppressor gene. As most, if not all, tumor suppressor genes are inactivated by epigenetic silencing, in a wide variety of tumors, we asked if IDH1 or IDH2 carry the epigenetic signature of a tumor suppressor by assessing cytosine methylation at their promoters. Methylation was quantified in 68 human brain tumors, including both IDH-mutant and IDH wildtype, by bisulfite pyrosequencing. In all tumors examined, CpG methylation levels were less than 8%. Our data demonstrate that inactivation of IDH function through promoter hypermethylation is not common in human gliomas and other brain tumors. These findings do not support a tumor suppressor role for IDH genes in human gliomas.
Hodar, Christian; Zuñiga, Alejandro; Pulgar, Rodrigo; Travisany, Dante; Chacon, Carlos; Pino, Michael; Maass, Alejandro; Cambiazo, Verónica
In the early Drosophila melanogaster embryo, Dpp, a secreted molecule that belongs to the TGF-β superfamily of growth factors, activates a set of downstream genes to subdivide the dorsal region into amnioserosa and dorsal epidermis. Here, we examined the expression pattern and transcriptional regulation of Dtg, a new target gene of Dpp signaling pathway that is required for proper amnioserosa differentiation. We showed that the expression of Dtg was controlled by Dpp and characterized a 524-bp enhancer that mediated expression in the dorsal midline, as well as, in the differentiated amnioserosa in transgenic reporter embryos. This enhancer contained a highly conserved region of 48-bp in which bioinformatic predictions and in vitro assays identified three Mad binding motifs. Mutational analysis revealed that these three motifs were necessary for proper expression of a reporter gene in transgenic embryos, suggesting that short and highly conserved genomic sequences may be indicative of functional regulatory regions in D. melanogaster genes. Dtg orthologs were not detected in basal lineages of Dipterans, which unlike D. melanogaster develop two extra-embryonic membranes, amnion and serosa, nevertheless Dtg orthologs were identified in the transcriptome of Musca domestica, in which dorsal ectoderm patterning leads to the formation of a single extra-embryonic membrane. These results suggest that Dtg was recruited as a new component of the network that controls dorsal ectoderm patterning in the lineage leading to higher Cyclorrhaphan flies, such as D. melanogaster and M. domestica. Copyright © 2013 Elsevier B.V. All rights reserved.
Alvarez, José M; Bueno, Natalia; Cañas, Rafael A; Avila, Concepción; Cánovas, Francisco M; Ordás, Ricardo J
WUSCHEL-RELATED HOMEOBOX (WOX) genes are key players controlling stem cells in plants and can be divided into three clades according to the time of their appearance during plant evolution. Our knowledge of stem cell function in vascular plants other than angiosperms is limited; angiosperms separated from gymnosperms ca. 300 million years ago, and their patterning during embryogenesis differs significantly. For this reason, we have used the model gymnosperm Pinus pinaster to identify WOX genes and perform a thorough analysis of their gene expression patterns. Using transcriptomic data from a comprehensive range of tissues and stages of development, we have shown three major outcomes: that the P. pinaster genome encodes at least fourteen members of the WOX family spanning all the major clades, that the genome of gymnosperms contains a WOX gene with no homologues in angiosperms representing a transitional stage between intermediate- and WUS-clade proteins, and that we can detect discrete WUS and WOX5 transcripts for the first time in a gymnosperm. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Buffat, Christophe; Boubred, Farid; Mondon, Françoise; Chelbi, Sonia T; Feuerstein, Jean-Marc; Lelièvre-Pégorier, Martine; Vaiman, Daniel; Simeoni, Umberto
In this study, low birth weight was induced in rats by feeding the dams a low-protein diet during pregnancy. Kidneys from the fetuses at the end of gestation were collected and showed a reduction in overall and relative weight, in parallel with other tissues (heart and liver). This reduction was associated with a reduction in nephron number. To better understand the molecular basis of this observation, a transcriptome analysis contrasting kidneys from control and protein-deprived rats was performed, using a platform based upon long isothermic oligonucleotides, strengthening the robustness of the results. We could identify over 1800 transcripts changed more than twofold (772 induced and 1040 repressed). Genes of either category were automatically classified according to functional criteria, making it possible to bring to light a large cluster of genes involved in coagulation and complement cascades. The promoters of the most induced and most repressed genes were contrasted for their composition in putative transcription factor binding sites, suggesting an overrepresentation of the AP1R binding site, together with the transcription induction of factors actually binding to this site in the set of induced genes. The induction of coagulation cascades in the kidney of low-birth-weight rats provides a putative rationale for explaining thrombo-endothelial disorders also observed in intrauterine growth-restricted human newborns. These alterations in the kidneys have been reported as a probable cause for cardiovascular diseases in the adult.
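The twofold criterion used to call transcripts induced or repressed can be sketched as a simple ratio test. The function name, the transcript names and the intensity values below are hypothetical, and a real analysis would also involve normalisation and a statistical test that the abstract does not detail.

```python
from collections import Counter


def classify(control, treated, threshold=2.0):
    """Label a transcript by its treated/control intensity ratio."""
    ratio = treated / control
    if ratio > threshold:
        return "induced"
    if ratio < 1.0 / threshold:
        return "repressed"
    return "unchanged"


# Hypothetical (control, protein-deprived) intensities, for illustration only.
MEASUREMENTS = {
    "transcript_a": (100.0, 250.0),
    "transcript_b": (100.0, 40.0),
    "transcript_c": (100.0, 150.0),
    "transcript_d": (100.0, 200.0),  # exactly twofold: not "more than" twofold
}

labels = {name: classify(*pair) for name, pair in MEASUREMENTS.items()}
summary = Counter(labels.values())
print(labels)
print(summary)

# The counts reported in the abstract add up to "over 1800" transcripts.
reported_total = 772 + 1040
print(reported_total)
```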
Wu, Hong; Zheng, Xiaohong; Araki, Yoshio; Sahara, Hiroshi; Takagi, Hiroshi; Shimoi, Hitoshi
During the brewing of Japanese sake, Saccharomyces cerevisiae cells produce a high concentration of ethanol compared with other ethanol fermentation methods. We analyzed the gene expression profiles of yeast cells during sake brewing using DNA microarray analysis. This analysis revealed some characteristics of yeast gene expression during sake brewing and provided a scaffold for a molecular level understanding of the sake brewing process. PMID:16997994
Thomassen, Mads; Jochumsen, Kirsten M; Mogensen, Ole
the relation of gene expression and chromosomal position to identify chromosomal regions of importance for early recurrence of ovarian cancer. By use of *Gene Set Enrichment Analysis*, we have ranked chromosomal regions according to their association to survival. Over-representation analysis including 1...... using death (P = 0.015) and recurrence (P = 0.002) as outcome. The combined mutation score is strongly associated to upregulation of several growth factor pathways....
Background: Colorectal cancer (CRC) is one of the most frequently occurring cancers in Japan, and thus a wide range of methods have been deployed to study the molecular mechanisms of CRC. In this study, we performed a comprehensive analysis of CRC, incorporating copy number aberration (CNA) and gene expression data. For the last four years, we have been collecting data from CRC cases and organizing the information as an “omics” study by integrating many kinds of analysis into a single comprehensive investigation. In our previous studies, we had experienced difficulty in finding genes related to CRC, as we observed higher noise levels in the expression data than in the data for other cancers. Because chromosomal aberrations are often observed in CRC, here, we have performed a combination of CNA analysis and expression analysis in order to identify some new genes responsible for CRC. This study was performed as part of the Clinical Omics Database Project at Tokyo Medical and Dental University. The purpose of this study was to investigate the mechanism of genetic instability in CRC by this combination of expression analysis and CNA, and to establish a new method for the diagnosis and treatment of CRC. Materials and methods: Comprehensive gene expression analysis was performed on 79 CRC cases using an Affymetrix Gene Chip, and comprehensive CNA analysis was performed using an Affymetrix DNA Sty array. To avoid the contamination of cancer tissue with normal cells, laser micro-dissection was performed before DNA/RNA extraction. Data analysis was performed using original software written in the R language. Result: We observed a high percentage of CNA in colorectal cancer, including copy number gains at 7, 8q, 13 and 20q, and copy number losses at 8p, 17p and 18. Gene expression analysis provided many candidates for CRC-related genes, but their association with CRC did not reach the level of statistical significance. The combination of CNA and gene
Mishra, Pashupati; Törönen, Petri; Leino, Yrjö; Holm, Liisa
Gene set analysis is the analysis of a set of genes that collectively contribute to a biological process. Most popular gene set analysis methods are based on an empirical P-value that requires a large number of permutations. Despite numerous gene set analysis methods developed in the past decade, the most popular methods still suffer from serious limitations. We present a gene set analysis method (mGSZ) based on the Gene Set Z-scoring function (GSZ) and asymptotic P-values. Asymptotic P-value calculation requires fewer permutations, and thus speeds up the gene set analysis process. We compare the GSZ-scoring function with seven popular gene set scoring functions and show that GSZ stands out as the best scoring function. In addition, we show improved performance of the GSA method when the max-mean statistic is replaced by the GSZ scoring function. We demonstrate the importance of both gene and sample permutations by showing the consequences in the absence of one or the other. A comparison of asymptotic and empirical methods of P-value estimation demonstrates a clear advantage of the asymptotic P-value over the empirical P-value. We show that mGSZ outperforms the state-of-the-art methods based on two different evaluations. We compared mGSZ results with permutation and rotation tests and show that rotation does not improve our asymptotic P-values. We also propose well-known asymptotic distribution models for three of the compared methods. mGSZ is available as an R package from cran.r-project.org. © The Author 2014. Published by Oxford University Press. All rights reserved.
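The empirical P-value that the abstract contrasts with asymptotic P-values can be sketched as follows. This is a generic gene-permutation test with a mean-score statistic, not the mGSZ method or the GSZ scoring function; the function names, gene identifiers and scores are hypothetical. It shows why the resolution of an empirical P-value is limited by the number of permutations: the smallest attainable value is 1 / (n_perm + 1).

```python
import random


def gene_set_score(scores, gene_set):
    """Mean per-gene score of the set (a deliberately simple statistic)."""
    return sum(scores[gene] for gene in gene_set) / len(gene_set)


def empirical_p_value(scores, gene_set, n_perm=999, seed=0):
    """One-sided empirical P-value from random gene sets of the same size."""
    rng = random.Random(seed)
    genes = sorted(scores)
    observed = gene_set_score(scores, gene_set)
    hits = 0
    for _ in range(n_perm):
        null_set = rng.sample(genes, len(gene_set))
        if gene_set_score(scores, null_set) >= observed:
            hits += 1
    # Adding 1 to numerator and denominator avoids reporting P = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical differential-expression scores for 100 genes.
SCORES = {f"g{i}": ((i * 37) % 11 - 5) / 5 for i in range(100)}
GENE_SET = ["g3", "g17", "g42", "g58", "g91"]
for gene in GENE_SET:
    SCORES[gene] = 5.0  # make the set strongly enriched for the demo

p_value = empirical_p_value(SCORES, GENE_SET)
print(p_value)
```

Only genes are permuted here; the abstract stresses that sample permutations matter as well, which this sketch does not cover.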
Hayat, S.; Cheng, Z.; Chen, X.
The PHD-finger proteins are conserved in eukaryotic organisms and are involved in a variety of important functions in different biological processes in plants. However, the functions of PHD fingers are poorly known in tomato (Solanum lycopersicum L.). In the current study, we identified 45 putative genes coding for PHD-finger proteins in tomato, distributed on 11 chromosomes (all except chromosome 8). Some of the genes encode other conserved key domains besides the PHD finger. Phylogenetic analysis of these 45 proteins resulted in seven clusters. Most PHD-finger proteins were predicted to localize to PML bodies. These PHD-finger genes displayed differential expression in various organs, at different developmental stages and under stresses in tomato. Our study provides the first systematic analysis of PHD-finger genes and proteins in tomato. This preliminary study provides useful reference information for PHD-finger proteins in tomato and will be helpful for cloning and functional study of tomato PHD-finger genes.
Han, Y; Zheng, Q S; Wei, Y P; Chen, J; Liu, R; Wan, H J
In this study, we examined phytoene synthetase (PSY), the first key limiting enzyme in the synthesis of carotenoids and catalyzing the formation of geranylgeranyl pyrophosphate in terpenoid biosynthesis. We used known amino acid sequences of the PSY gene in tomato plants to conduct a genome-wide search and identify putative candidates in 34 sequenced plants. A total of 101 homologous genes were identified. Phylogenetic analysis revealed that PSY evolved independently in algae as well as monocotyledonous and dicotyledonous plants. Our results showed that the amino acid structures exhibited 5 motifs (motifs 1 to 5) in algae and those in higher plants were highly conserved. The PSY gene structures showed that the number of intron in algae varied widely, while the number of introns in higher plants was 4 to 5. Identification of PSY genes in plants and the analysis of the gene structure may provide a theoretical basis for studying evolutionary relationships in future analyses.
Church, Philip C; Goscinski, Andrzej; Lefèvre, Christophe
Microarrays and, more recently, RNA sequencing have led to an increase in available gene expression data. How to manage and store these data is becoming a key issue. In response, we have developed EXP-PAC, a web-based software package for storage, management and analysis of gene expression and sequence data. Unique to this package are SQL-based querying of gene expression data sets, distributed normalization of raw gene expression data, and analysis of gene expression data across experiments and species. This package has been populated with lactation data in the international milk genomic consortium web portal (http://milkgenomics.org/). Source code is also available and can be hosted on a Windows, Linux or Mac APACHE server connected to a private or public network (http://mamsap.it.deakin.edu.au/~pcc/Release/EXP_PAC.html). Copyright © 2012 Elsevier Inc. All rights reserved.
Barris Wesley C
Background: Detailed information regarding the number and organization of transfer RNA (tRNA) genes at the genome level is becoming readily available with the increase of DNA sequencing of whole genomes. However, the identification of functional tRNA genes is challenging for species that have large numbers of repetitive elements containing tRNA-derived sequences, such as Bos taurus. Reliable identification and annotation of entire sets of tRNA genes allows the evolution of tRNA genes to be understood on a genomic scale. Results: In this study, we explored the B. taurus genome using bioinformatics and comparative genomics approaches to catalogue and analyze cow tRNA genes. The initial analysis of the cow genome using tRNAscan-SE identified 31,868 putative tRNA genes and 189,183 pseudogenes, where 28,830 of the 31,868 predicted tRNA genes were classified as repetitive elements by the RepeatMasker program. We then used comparative genomics to further discriminate between functional tRNA genes and tRNA-derived sequences for the remaining set of 3,038 putative tRNA genes. For our analysis, we used the human, chimpanzee, mouse, rat, horse, dog, chicken and fugu genomes to predict that the number of active tRNA genes in cow lies in the vicinity of 439. Of this set, 150 tRNA genes were 100% identical in their sequences across all nine vertebrate genomes studied. Using clustering analyses, we identified a new tRNA-GlyCCC subfamily present in all analyzed mammalian genomes. We suggest that this subfamily originated from an ancestral tRNA-GlyGCC gene via a point mutation prior to the radiation of the mammalian lineages. Lastly, in a separate analysis we created phylogenetic profiles for each putative cow tRNA gene using a representative set of genomes to gain an overview of common evolutionary histories of tRNA genes. Conclusion: The use of a combination of bioinformatics and comparative genomics approaches has allowed the confident identification of a
Ivan G. Costa
This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by comparing the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
Berberine, a natural isoquinoline alkaloid found in many medicinal herbs, is active against a variety of microbial infections, including Microsporum canis (M. canis). However, the underlying mechanisms are poorly understood. To study the effect of berberine chloride on M. canis infection, a Digital Gene Expression (DGE) tag profile was constructed and a transcriptome analysis of the M. canis cellular responses upon berberine treatment was performed. The Illumina/HiSeq sequencing technique was used to generate the gene expression profile data, and enrichment analyses of Gene Ontology (GO) and pathway function were then conducted based on the transcriptome data. The DGE results showed that 8476945, 14256722, 7708575, 5669955, 6565513 and 9303468 tags, respectively, were obtained from M. canis incubated with berberine or control DMSO. In total, 8,783 genes were mapped, and 1,890 genes showed significant changes between the two groups: 1,030 genes were up-regulated and 860 genes were down-regulated (P<0.05) in the berberine-treated group compared to the control group. In addition, twenty-three GO terms were identified by Gene Ontology functional enrichment analysis, such as calcium-transporting ATPase activity, 2-oxoglutarate metabolic process, valine catabolic process, peroxisome and unfolded protein binding. Pathway enrichment analysis indicated 6 significant signaling pathways, including steroid biosynthesis, steroid hormone biosynthesis, Parkinson's disease, 2,4-dichlorobenzoate degradation, and tropane, piperidine and isoquinoline alkaloid biosynthesis. Among these, eleven selected genes were further verified by qRT-PCR. Our findings provide a comprehensive view of the gene expression profile of M. canis upon berberine treatment, and shed light on its complicated effects on M. canis.
Aging is closely connected with death, progressive physiological decline, and increased risk of diseases, such as cancer, arteriosclerosis, heart disease, hypertension, and neurodegenerative diseases. It is reported that moxibustion can treat more than 300 kinds of diseases, including aging-related problems, and can improve immune function and physiological functions. In this study, the digital gene expression profiles of aged mice with or without moxibustion treatment were investigated, and the mechanisms of moxibustion in aged mice were inferred by gene ontology and pathway analysis. Almost 145 million raw reads were obtained by digital gene expression analysis and about 140 million (96.55%) were clean reads. Five differentially expressed genes with an adjusted P value 1 were identified between the control and moxibustion groups. They were Gm6563, Gm8116, Rps26-ps1, Nat8f4, and Igkv3-12. Gene ontology analysis was carried out with the GOseq R package, and functional annotations of the differentially expressed genes were related to translation, mRNA export from nucleus, mRNA transport, nuclear body, acetyltransferase activity, and so on. The Kyoto Encyclopedia of Genes and Genomes database was used for pathway analysis, and ribosome was the most significantly enriched pathway term.
Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan
WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (groups 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
Luo, Xiongjian; Huang, Liang; Han, Leng; Luo, Zhenwu; Hu, Fang; Tieu, Roger; Gan, Lin
Schizophrenia is a common mental disorder with high heritability and strong genetic heterogeneity. Common disease-common variants hypothesis predicts that schizophrenia is attributable in part to common genetic variants. However, recent studies have clearly demonstrated that copy number variations (CNVs) also play pivotal roles in schizophrenia susceptibility and explain a proportion of missing heritability. Though numerous CNVs have been identified, many of the regions affected by CNVs show poor overlapping among different studies, and it is not known whether the genes disrupted by CNVs contribute to the risk of schizophrenia. By using cumulative scoring, we systematically prioritized the genes affected by CNVs in schizophrenia. We identified 8 top genes that are frequently disrupted by CNVs, including NRXN1, CHRNA7, BCL9, CYFIP1, GJA8, NDE1, SNAP29, and GJA5. Integration of genes affected by CNVs with known schizophrenia susceptibility genes (from previous genetic linkage and association studies) reveals that many genes disrupted by CNVs are also associated with schizophrenia. Further protein-protein interaction (PPI) analysis indicates that protein products of genes affected by CNVs frequently interact with known schizophrenia-associated proteins. Finally, systematic integration of CNVs prioritization data with genetic association and PPI data identifies key schizophrenia candidate genes. Our results provide a global overview of genes impacted by CNVs in schizophrenia and reveal a densely interconnected molecular network of de novo CNVs in schizophrenia. Though the prioritized top genes represent promising schizophrenia risk genes, further work with different prioritization methods and independent samples is needed to confirm these findings. Nevertheless, the identified key candidate genes may have important roles in the pathogenesis of schizophrenia, and further functional characterization of these genes may provide pivotal targets for future therapeutics and
Anna A. Igolkina
Schizophrenia (SCZ) is a psychiatric disorder of unknown etiology. There is evidence suggesting that aberrations in neurodevelopment are a significant attribute of schizophrenia pathogenesis and progression. To identify biologically relevant molecular abnormalities affecting neurodevelopment in SCZ, we used cultured neural progenitor cells derived from olfactory neuroepithelium (CNON) cells. Here, we tested the hypothesis that variance in gene expression differs between individuals from SCZ and control groups. In CNON cells, variance in gene expression was significantly higher in SCZ samples in comparison with control samples. Variance in gene expression was enriched in five molecular pathways: serine biosynthesis, PI3K-Akt, MAPK, neurotrophin and focal adhesion. More than 14% of variance in disease status was explained within the logistic regression model (C-value = 0.70) by predictors accounting for gene expression in 69 genes from these five pathways. Structural equation modeling (SEM) was applied to explore how the structure of these five pathways was altered between SCZ patients and controls. Four out of five pathways showed differences in the estimated relationships among genes: between KRAS and NF1, and KRAS and SOS1 in the MAPK pathway; between PSPH and SHMT2 in serine biosynthesis; between AKT3 and TSC2 in the PI3K-Akt signaling pathway; and between CRK and RAPGEF1 in the focal adhesion pathway. Our analysis provides evidence that variance in gene expression is an important characteristic of SCZ, and SEM is a promising method for uncovering altered relationships between specific genes thus suggesting affected gene regulation associated with the disease. We identified altered gene-gene interactions in pathways enriched for genes with increased variance in expression in SCZ. These pathways and loci were previously implicated in SCZ, providing further support for the hypothesis that gene expression variance plays an important role in the etiology
Celik Altunoglu, Yasemin; Baloglu, Mehmet Cengiz; Baloglu, Pinar; Yer, Esra Nurten; Kara, Sibel
Late embryogenesis abundant (LEA) proteins are a large and diverse group of polypeptides that were first identified during seed dehydration and then in vegetative plant tissues during different stress responses. Gene family members of LEA proteins have since been detected in various organisms. However, there was no report on this protein family in watermelon and melon until this study. A total of 73 LEA genes from watermelon (ClLEA) and 61 LEA genes from melon (CmLEA) were identified in this comprehensive study. They were classified into four and three distinct clusters in watermelon and melon, respectively. There was a correlation between gene structure and motif composition within each LEA group. Segmental duplication played an important role in LEA gene expansion in watermelon. Maximum gene ontology of LEA genes was observed with poplar LEA genes. To evaluate tissue-specific expression patterns of ClLEA and CmLEA genes, publicly available RNA-seq data were analyzed. The expression of selected LEA genes in root and leaf tissues of drought-stressed watermelon and melon was examined using qRT-PCR. Among them, the ClLEA-12, -17, and -46 genes were quickly induced after drought application. Therefore, they might be considered early response genes for water limitation conditions in watermelon. In addition, the CmLEA-42 and -43 genes were found to be up-regulated in both tissues of melon under drought stress. Our results can open up new frontiers in understanding the functions of these important family members under normal developmental stages and stress conditions through bioinformatics and transcriptomic approaches.
Liu, Jiewei; Li, Ming; Luo, Xiong-Jian; Su, Bing
Schizophrenia (SCZ) is a complex mental disorder with high heritability. Genetic studies (especially recent genome-wide association studies) have identified many risk genes for schizophrenia. However, the physical interactions among the proteins encoded by schizophrenia risk genes remain elusive and it is not known whether the identified risk genes converge on common molecular networks or pathways. Here we systematically investigated the network characteristics of schizophrenia risk genes using the high-confidence protein-protein interactions (PPI) from the human interactome. We found that schizophrenia risk genes encode a densely interconnected PPI network (P = 4.15 × 10^-31). Compared with the background genes, the schizophrenia risk genes in the interactome have significantly higher degree (P = 5.39 × 10^-11), closeness centrality (P = 7.56 × 10^-11), betweenness centrality (P = 1.29 × 10^-11), clustering coefficient (P = 2.22 × 10^-2), and shorter average shortest path length (P = 7.56 × 10^-11). Based on the densely interconnected PPI network, we identified 48 hub genes and 4 modules formed by highly interconnected schizophrenia genes. We showed that the proteins encoded by schizophrenia hub genes have significantly more direct physical interactions. Gene ontology (GO) analysis revealed that cell adhesion, cell cycle, immune system response, and GABR-receptor complex categories were enriched in the modules formed by highly interconnected schizophrenia risk genes. Our study reveals that schizophrenia risk genes encode a densely interconnected molecular network and demonstrates the modular nature of schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
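Two of the network measures named above, degree and clustering coefficient, follow standard definitions that can be sketched in a few lines. The edge list and the protein labels `P1` to `P4` are hypothetical toy data, not interactions from the study, and the helper names are illustrative only.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct interaction partners of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network: a triangle P1-P2-P3 plus P4 attached to P1.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
toy_graph = build_graph(toy_edges)
```

In this toy network `P1` has the highest degree (three partners) but a lower clustering coefficient than `P2`, whose two neighbors are linked to each other; hub status and local density are therefore separate properties.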
Cosphiadi, Irawan; Atmakusumah, Tubagus D; Siregar, Nurjati C; Muthalib, Abdul; Harahap, Alida; Mansyur, Muchtarruddin
Approximately 30% to 40% of breast cancer recurrences involve bone metastasis (BM). Certain genes have been linked to BM; however, none have been able to predict bone involvement. In this study, we analyzed gene expression profiles in advanced breast cancer patients to elucidate genes that can be used to predict BM. A total of 92 advanced breast cancer patients, including 46 patients with BM and 46 patients without BM, were identified for this study. Immunohistochemistry and gene expression analysis was performed on 81 formalin-fixed paraffin-embedded samples. Data were collected through medical records, and gene expression of 200 selected genes compiled from 6 previous studies was performed using NanoString nCounter. Genetic expression profiles showed that 22 genes were significantly differentially expressed between breast cancer patients with metastasis in bone and other organs (BM+) and non-BM, whereas subjects with only BM showed 17 significantly differentially expressed genes. The following genes were associated with an increasing incidence of BM in the BM+ group: estrogen receptor 1 (ESR1), GATA binding protein 3 (GATA3), and melanophilin with an area under the curve (AUC) of 0.804. In the BM group, the following genes were associated with an increasing incidence of BM: ESR1, progesterone receptor, B-cell lymphoma 2, Rab escort protein, N-acetyltransferase 1, GATA3, annexin A9, and chromosome 9 open reading frame 116. ESR1 and GATA3 showed an increased strength of association with an AUC of 0.928. A combination of the identified 3 genes in BM+ and 8 genes in BM showed better prediction than did each individual gene, and this combination can be used as a training set. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Systemic immunosuppression is a risk factor for melanoma, and sunburn-induced immunosuppression is thought to be causal. Genes in immunosuppression pathways are therefore candidate melanoma-susceptibility genes. If variants within these genes individually have a small effect on disease risk, the association may be undetected in genome-wide association (GWA) studies due to low power to reach a high significance level. Pathway-based approaches have been suggested as a method of incorporating a priori knowledge into the analysis of GWA studies. In this study, the association of 1113 single nucleotide polymorphisms (SNPs) in 43 genes (39 genomic regions) related to immunosuppression has been analysed using a gene-set approach in 1539 melanoma cases and 3917 controls from the GenoMEL consortium GWA study. The association between melanoma susceptibility and the whole set of tumour-immunosuppression genes, and also predefined functional subgroups of genes, was considered. The analysis was based on a measure formed by summing the evidence from the most significant SNP in each gene, and significance was evaluated empirically by case-control label permutation. An association was found between melanoma and the complete set of genes (p(emp)=0.002), as well as the subgroups related to the generation of tolerogenic dendritic cells (p(emp)=0.006) and secretion of suppressive factors (p(emp)=0.0004), thus providing preliminary evidence of involvement of tumour-immunosuppression gene polymorphisms in melanoma susceptibility. The analysis was repeated on a second phase of the GenoMEL study, which showed no evidence of an association. As one of the first attempts to replicate a pathway-level association, our results suggest that low power and heterogeneity may present challenges.
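The gene-set measure described above (strongest SNP per gene, summed over genes, with significance from case-control label permutation) can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the per-SNP evidence is taken to be the trend statistic N·r² between allele counts and case status, whereas the abstract does not say which measure of evidence was summed, and all function names are hypothetical.

```python
import random


def trend_statistic(allele_counts, labels):
    """N * r^2 between allele counts (0/1/2) and case/control labels (1/0)."""
    n = len(labels)
    mean_g = sum(allele_counts) / n
    mean_y = sum(labels) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(allele_counts, labels))
    sxx = sum((g - mean_g) ** 2 for g in allele_counts)
    syy = sum((y - mean_y) ** 2 for y in labels)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no evidence
    return n * sxy * sxy / (sxx * syy)


def gene_set_statistic(snps_by_gene, labels):
    """Sum over genes of the strongest single-SNP statistic within each gene."""
    return sum(max(trend_statistic(snp, labels) for snp in snps)
               for snps in snps_by_gene.values())


def empirical_p(snps_by_gene, labels, n_perm=200, seed=0):
    """Permute case/control labels and recompute the whole-set statistic."""
    rng = random.Random(seed)
    observed = gene_set_statistic(snps_by_gene, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_set_statistic(snps_by_gene, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the maximum over SNPs is recomputed inside every permutation, genes with many SNPs are not favoured merely for having more chances at a large statistic.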
Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step toward revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E ... Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
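The core comparison in this abstract, Pearson correlation of neighboring genes against that of randomly selected gene pairs, can be sketched as below. This is a simplified illustration: expression profiles are assumed to be supplied as a list in chromosomal order, the FDR control used in the study is omitted, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighbor_correlations(profiles):
    """Correlation of each gene with the next gene along the chromosome."""
    return [pearson(profiles[i], profiles[i + 1])
            for i in range(len(profiles) - 1)]


def random_pair_correlations(profiles, n_pairs, seed=0):
    """Null distribution: correlations of randomly chosen non-adjacent pairs."""
    rng = random.Random(seed)
    null = []
    while len(null) < n_pairs:
        i, j = rng.sample(range(len(profiles)), 2)
        if abs(i - j) > 1:
            null.append(pearson(profiles[i], profiles[j]))
    return null


def empirical_p(r, null):
    """Share of null correlations at least as large as r (with +1 correction)."""
    return (1 + sum(v >= r for v in null)) / (1 + len(null))
```

A neighboring pair whose correlation is rarely matched by random pairs is a candidate for co-expression; runs of such pairs would then be merged into clusters, and the resulting P-values adjusted for multiple testing, which this sketch leaves out.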
Davis, Charronne F; Dorak, M Tevfik
The most common mutation of the HFE gene C282Y has shown a risk association with childhood acute lymphoblastic leukemia (ALL) in Welsh and Scottish case-control studies. This finding has not been replicated outside Britain. Here, we present a thorough analysis of the HFE gene in a panel of HLA homozygous reference cell lines and in the original population sample from South Wales (117 childhood ALL cases and 414 newborn controls). The 21 of 24 variants analyzed were from the HFE gene region extending 52 kb from the histone gene HIST1H1C to HIST1H1T. We identified the single-nucleotide polymorphism (SNP) rs807212 as a tagging SNP for the most common HFE region haplotype, which contains wild-type alleles of all HFE variants examined. This intergenic SNP rs807212 yielded a strong male-specific protective association (per allele OR = 0.38, 95% CI = 0.22-0.64, P (trend) = 0.0002; P = 0.48 in females), which accounted for the original C282Y risk association. In the HapMap project data, rs807212 was in strong linkage disequilibrium with 25 other SNPs spanning 151 kb around HFE. Minor alleles of these 26 SNPs characterized the most common haplotype for the HFE region, which lacked all disease-associated HFE variants. The HapMap data suggested positive selection in this region even in populations where the HFE C282Y mutation is absent. These results have implications for the sex-specific associations observed in this region and suggest the inclusion of rs807212 in future studies of the HFE gene and the extended HLA class I region.
Chu, Jeffrey Shih Chieh
Regulatory Factor X (RFX) is a family of transcription factors (TF) that is conserved in all metazoans, in some fungi, and in only a few single-cellular organisms. Seven members are found in mammals, nine in fishes, three in fruit flies, and a single member in nematodes and fungi. RFX is involved in many different roles in humans, but a particular function that is conserved in many metazoans is its regulation of ciliogenesis. Probing over 150 genomes for the presence of RFX and ciliary genes ...
Allan Andrew C
Background: Transcription factors (TFs) co-ordinately regulate target genes that are dispersed throughout the genome. This co-ordinate regulation is achieved, in part, through the interaction of transcription factors with conserved cis-regulatory motifs that are in close proximity to the target genes. While much is known about the families of transcription factors that regulate gene expression in plants, there are few well characterised cis-regulatory motifs. In Arabidopsis, over-expression of the MYB transcription factor PAP1 (PRODUCTION OF ANTHOCYANIN PIGMENT 1) leads to transgenic plants with elevated anthocyanin levels due to the co-ordinated up-regulation of genes in the anthocyanin biosynthetic pathway. In addition to the anthocyanin biosynthetic genes, there are a number of un-associated genes that also change in expression level. This may be a direct or indirect consequence of the over-expression of PAP1. Results: Oligo array analysis of PAP1 over-expression Arabidopsis plants identified genes co-ordinately up-regulated in response to the elevated expression of this transcription factor. Transient assays on the promoter regions of 33 of these up-regulated genes identified eight promoter fragments that were transactivated by PAP1. Bioinformatic analysis on these promoters revealed a common cis-regulatory motif that we showed is required for PAP1 dependent transactivation. Conclusion: Co-ordinated gene regulation by individual transcription factors is a complex collection of both direct and indirect effects. Transient transactivation assays provide a rapid method to identify direct target genes from indirect target genes. Bioinformatic analysis of the promoters of these direct target genes is able to locate motifs that are common to this sub-set of promoters, which is impossible to identify with the larger set of direct and indirect target genes. While this type of analysis does not prove a direct interaction between protein and DNA
Angeline S Andrew
Bladder cancer is the 4th most common cancer among men in the U.S. We analyzed variant genotypes hypothesized to modify major biological processes involved in bladder carcinogenesis, including hormone regulation, apoptosis, DNA repair, immune surveillance, metabolism, proliferation, and telomere maintenance. Logistic regression was used to assess the relationship between genetic variation affecting these processes and susceptibility in 563 genotyped urothelial cell carcinoma cases and 863 controls enrolled in a case-control study of incident bladder cancer conducted in New Hampshire, U.S. We evaluated gene-gene interactions using Multifactor Dimensionality Reduction (MDR) and Statistical Epistasis Network analysis. The 3'UTR flanking variant form of the hormone regulation gene HSD3B2 was associated with increased bladder cancer risk in the New Hampshire population (adjusted OR 1.85, 95% CI 1.31-2.62). This finding was successfully replicated in the Texas Bladder Cancer Study with 957 controls, 497 cases (adjusted OR 3.66, 95% CI 1.06-12.63). The effect of this prevalent SNP was stronger among males (OR 2.13, 95% CI 1.40-3.25) than females (OR 1.56, 95% CI 0.83-2.95) (SNP-gender interaction P = 0.048). We also identified a SNP-SNP interaction between T-cell activation related genes GATA3 and CD81 (interaction P = 0.0003). The fact that bladder cancer incidence is 3-4 times higher in males suggests the involvement of hormone levels. This biologic process-based analysis suggests candidate susceptibility markers and supports the theory that disrupted hormone regulation plays a role in bladder carcinogenesis.
Nucleotide-binding site (NBS) disease resistance genes play an important role in defending plants from a variety of pathogens and insect pests. Many R-genes have been identified in various plant species. However, little is known about the NBS-encoding genes in Brachypodium distachyon. In this study, using computational analysis of the B. distachyon genome, we identified 126 regular NBS-encoding genes and characterized them on the basis of structural diversity, conserved protein motifs, chromosomal locations, gene duplications, promoter region, and phylogenetic relationships. EST hits and full-length cDNA sequences (from the Brachypodium database) of the 126 R-like candidates supported their existence. Based on the occurrence of conserved protein motifs such as coiled-coil (CC), NBS, and leucine-rich repeat (LRR), these regular NBS-LRR genes were classified into four subgroups: CC-NBS-LRR, NBS-LRR, CC-NBS, and X-NBS. Further expression analysis of the regular NBS-encoding genes in the Brachypodium database revealed that these genes are expressed in a wide range of libraries, including those constructed from various developmental stages, tissue types, and drought challenged or nonchallenged tissue.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
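The adjustment described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation: the response is taken to be membership in a Gene Ontology category, the predictors are a differential-expression significance score and a standardized transcript length, the model is fitted by plain gradient ascent, and the simulated data are built so that length alone drives both membership and significance. All names are hypothetical.

```python
import math
import random


def fit_logistic(rows, labels, n_iter=500, lr=0.5):
    """Fit logistic regression by gradient ascent; beta[0] is the intercept."""
    n = len(rows)
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
            resid = y - 1.0 / (1.0 + math.exp(-eta))
            grad[0] += resid
            for j, v in enumerate(x, start=1):
                grad[j] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def simulate(n=400, seed=1):
    """Toy data in which transcript length confounds the two other variables."""
    rng = random.Random(seed)
    z_len = [rng.gauss(0, 1) for _ in range(n)]          # standardized length
    signif = [z + rng.gauss(0, 1) for z in z_len]        # DE score driven by length
    member = [1 if rng.random() < 1 / (1 + math.exp(-1.5 * z)) else 0
              for z in z_len]                            # membership driven by length
    return z_len, signif, member


def significance_effects(z_len, signif, member):
    """Coefficient of the DE score without and with length as a covariate."""
    unadjusted = fit_logistic([[s] for s in signif], member)[1]
    adjusted = fit_logistic([[s, z] for s, z in zip(signif, z_len)], member)[1]
    return unadjusted, adjusted
```

In data simulated this way the unadjusted coefficient is clearly positive even though membership has no direct link to differential expression, and including length as a covariate shrinks it toward zero. That is the confounding pattern the abstract describes.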
Nimura, Yoshinori; Kumagai, Ken; Kouzu, Yoshinao; Higo, Morihiro; Kato, Yoshikuni; Seki, Naohiko; Yamada, Shigeru
In order to identify a set of genes related to the radiation sensitivity of squamous cell carcinoma (SCC) and to establish a predictive method, we compared expression profiles of radio-sensitive and radio-resistant SCC cell lines, using an in-house cDNA microarray consisting of 2,201 human genes derived from full-length enriched SCC cDNA libraries and the Human oligo chip 30 K (Hitachi Software Engineering). Surviving fractions (SF) after heavy-ion irradiation were calculated by colony formation assay. Three pairs (TE2-TE13, YES5-YES6, and HSC3-HSC2), sensitive (SF1 0.6), were selected for the microarray analysis. The results of cDNA microarray analysis showed that 20 genes in resistant cell lines and 5 genes in sensitive cell lines were up-regulated more than 1.5-fold compared with sensitive and resistant cell lines, respectively. The expression profiles of fourteen of the 25 genes were confirmed by real-time polymerase chain reaction (PCR). Twenty-seven genes identified by the Human oligo chip 30 K are candidate markers for distinguishing radio-sensitive from radio-resistant cell lines. These results suggest that the isolated 27 genes are candidates that might be used as specific molecular markers to predict radiation sensitivity. (author)
Song, Hui; Wang, Pengfei; Hou, Lei; Zhao, Shuzhen; Zhao, Chuanzhi; Xia, Han; Li, Pengcheng; Zhang, Ye; Bian, Xiaotong; Wang, Xingjun
WRKY proteins are plant-specific transcription factors involved in various developmental and physiological processes, especially in biotic and abiotic stress resistance. Although previous studies suggested that WRKY proteins in soybean (Glycine max var. Williams 82) are involved in both abiotic and biotic stress responses, global information on WRKY proteins in the latest version of the soybean genome (Wm82.a2v1) and their response to dehydration and salt stress has not been reported. In this study, we identified 176 GmWRKY proteins from the soybean Wm82.a2v1 genome. These proteins could be classified into three groups, namely group I (32 proteins), group II (120 proteins), and group III (24 proteins). Our results showed that most GmWRKY genes were located on chromosome 6, while chromosomes 11, 12, and 20 contained the fewest members of this gene family. More GmWRKY genes were distributed at the ends of chromosomes than in other regions. The cis-acting element analysis suggested that GmWRKY genes were transcriptionally regulated upon dehydration and salt stress. RNA-seq data analysis indicated that three GmWRKY genes responded negatively to dehydration, and 12 genes responded positively to salt stress at 1, 6, and 12 h, respectively. We confirmed by qRT-PCR that the expression of the GmWRKY47 and GmWRKY58 genes was decreased upon dehydration, and the expression of the GmWRKY92, 144 and 165 genes was increased under salt treatment.
Weitzeal, A. J.; Wyatt, S. E.; Parsons-Wingerter, P.
Venation patterning in leaves is a major determinant of photosynthesis efficiency because of its dependency on vascular transport of photoassimilates, water, and minerals. Arabidopsis thaliana plants grown in microgravity show delayed growth and leaf maturation. Gene expression data from the roots, hypocotyl, and leaves of A. thaliana grown during spaceflight vs. ground control, analyzed by Affymetrix microarray, are available through NASA's GeneLab (GLDS-7). We analyzed the data for differential expression of genes in leaves resulting from the effects of spaceflight on vascular patterning. Preliminary analysis found two genes upregulated during spaceflight that may be related to vascular formation. The genes encode an ARGOS-like protein (potentially affecting cell elongation in the leaves) and an F-box/kelch-repeat protein (possibly contributing to protoxylem specification). Further analysis, focusing on raw data quality assessment and a moderated t-test, may confirm upregulation of the two genes and/or identify other gene candidates. Plants defective in these genes will then be assessed for phenotype by mapping and quantifying leaf vascular patterning with NASA's VESsel GENeration (VESGEN) software to model specific vascular differences of plants grown in spaceflight.
Wu, Shuang; Wu, Hulin
One of the fundamental problems in time course gene expression data analysis is to identify genes associated with a biological process or a particular stimulus of interest, like a treatment or virus infection. Most of the existing methods for this problem are designed for data with longitudinal replicates. But in reality, many time course gene experiments have no replicates or only have a small number of independent replicates. We focus on the case without replicates and propose a new method for identifying differentially expressed genes by incorporating the functional principal component analysis (FPCA) into a hypothesis testing framework. The data-driven eigenfunctions allow a flexible and parsimonious representation of time course gene expression trajectories, leaving more degrees of freedom for the inference compared to that using a prespecified basis. Moreover, the information of all genes is borrowed for individual gene inferences. The proposed approach turns out to be more powerful in identifying time course differentially expressed genes compared to the existing methods. The improved performance is demonstrated through simulation studies and a real data application to the Saccharomyces cerevisiae cell cycle data.
Elvezia Maria Paraboschi
Abnormalities in RNA metabolism and alternative splicing (AS) are emerging as important players in complex disease phenotypes. In particular, accumulating evidence suggests the existence of pathogenic links between multiple sclerosis (MS) and altered AS, including functional studies showing that an imbalance in alternatively-spliced isoforms may contribute to disease etiology. Here, we tested whether the altered expression of AS-related genes represents an MS-specific signature. A comprehensive comparative analysis of gene expression profiles of publicly-available microarray datasets (190 MS cases, 182 controls), followed by gene-ontology enrichment analysis, highlighted a significant enrichment for differentially-expressed genes involved in RNA metabolism/AS. In detail, a total of 17 genes were found to be differentially expressed in MS in multiple datasets, with CELF1 being dysregulated in five out of seven studies. We confirmed CELF1 downregulation in MS (p = 0.0015) by real-time RT-PCRs on RNA extracted from blood cells of 30 cases and 30 controls. As a proof of concept, we experimentally verified the imbalance in alternatively-spliced isoforms in MS of the NFAT5 gene, a putative CELF1 target. In conclusion, for the first time we provide evidence of a consistent dysregulation of splicing-related genes in MS and we discuss its possible implications in modulating specific AS events in MS susceptibility genes.
MADS-box transcription factors, as a large gene family, play an important role in plant growth and development, and in particular act as key regulators controlling the identities of floral organs in flowering plants. They are also significant for understanding evolution. To understand MADS-box genes, more information on MADS-box genes in non-flowering plants is needed. MADS-box genes of Ceratopteris thalictroides were selected for cloning and analysis using the RACE method. Two MADS-box genes, designated CtMADS1 and CtMADS2, were cloned from C. thalictroides. Analysis indicates that CtMADS1 belongs to the MIKC*-clade, while CtMADS2 belongs to the MIKCc-clade. Phylogeny suggests that these two MADS-box genes of C. thalictroides have a close relationship with those of flowering plants; the data indicate that at least two different MADS-box genes homologous to floral homeotic genes existed in the last common ancestor of contemporary vascular plants.
Zhang, Yucheng; Gao, Min; Singer, Stacy D.; Fei, Zhangjun; Wang, Hua; Wang, Xiping
Background The TIFY gene family constitutes a plant-specific group of genes with a broad range of functions. This family encodes four subfamilies of proteins, including ZML, TIFY, PPD and JASMONATE ZIM-Domain (JAZ) proteins. JAZ proteins are targets of the SCFCOI1 complex, and function as negative regulators in the JA signaling pathway. Recently, it has been reported in both Arabidopsis and rice that TIFY genes, and especially JAZ genes, may be involved in plant defense against insect feeding, wounding, pathogens and abiotic stresses. Nonetheless, knowledge concerning the specific expression patterns and evolutionary history of plant TIFY family members is limited, especially in a woody species such as grape. Methodology/Principal Findings A total of two TIFY, four ZML, two PPD and 11 JAZ genes were identified in the Vitis vinifera genome. Phylogenetic analysis of TIFY protein sequences from grape, Arabidopsis and rice indicated that the grape TIFY proteins are more closely related to those of Arabidopsis than those of rice. Both segmental and tandem duplication events have been major contributors to the expansion of the grape TIFY family. In addition, synteny analysis between grape and Arabidopsis demonstrated that homologues of several grape TIFY genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of lineages that led to grape and Arabidopsis. Analyses of microarray and quantitative real-time RT-PCR expression data revealed that grape TIFY genes are not a major player in the defense against biotrophic pathogens or viruses. However, many of these genes were responsive to JA and ABA, but not SA or ET. Conclusion The genome-wide identification, evolutionary and expression analyses of grape TIFY genes should facilitate further research of this gene family and provide new insights regarding their evolutionary history and regulatory control. PMID:22984514
Mariot, Roberta Fogliatto; de Oliveira, Luisa Abruzzi; Voorhuijzen, Marleen M; Staats, Martijn; Hutten, Ronald C B; Van Dijk, Jeroen P; Kok, Esther; Frazzon, Jeverson
Potato (Solanum tuberosum) yield has increased dramatically over the last 50 years, and this has been achieved by a combination of improved agronomy and biotechnology efforts. Gene studies are taking place to improve quality traits and develop new cultivars. Reverse transcriptase quantitative polymerase chain reaction (RT-qPCR) is a benchmark analytical tool for gene expression analysis, but its accuracy is highly dependent on a reliable normalization strategy based on invariant reference genes. For this reason, the goal of this work was to select and validate reference genes for transcriptional analysis of edible tubers of potato. To do so, RT-qPCR primers were designed for ten genes with relatively stable expression in potato tubers as observed in RNA-Seq experiments. Primers were designed across exon boundaries to avoid amplification of contaminating genomic DNA. Differences were observed in the ranking of candidate genes identified by the geNorm, NormFinder and BestKeeper algorithms. The ranks determined by geNorm and NormFinder were very similar, and for all samples the most stable candidates were C2, exocyst complex component sec3 (SEC3) and ATCUL3/ATCUL3A/CUL3/CUL3A (CUL3A). According to BestKeeper, the importin alpha and ubiquitin-associated/ts-n genes were the most stable. Three genes were selected as reference genes for potato edible tubers in RT-qPCR studies. The first one, called C2, was selected in common by NormFinder and geNorm, the second one is SEC3, selected by NormFinder, and the third one is CUL3A, selected by geNorm. Appropriate reference genes identified in this work will help to improve the accuracy of gene expression quantification analyses by taking into account differences that may be observed in RNA quality or reverse transcription efficiency across the samples.
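As an illustration of how a stability ranking of this kind can be computed, here is a minimal Python sketch of the gene-stability measure M as it is commonly described for geNorm: for each candidate, the standard deviation of its log2 expression ratio to every other candidate is taken across samples, and M is the mean of those standard deviations (lower M means more stable). The function name `genorm_m`, the gene labels and the relative quantities are hypothetical; this is not the geNorm software and not the data of the study.

```python
import math
from statistics import mean, stdev

def genorm_m(quantities):
    """Return {gene: M} from {gene: [relative quantity per sample]}.

    M for a gene is the average, over all other candidate genes, of the
    standard deviation across samples of the pairwise log2 ratio.
    """
    genes = list(quantities)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[j], quantities[k])
            ]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values

# Invented relative quantities for three hypothetical candidates in four samples.
toy = {
    "refA": [1.0, 1.1, 0.9, 1.0],
    "refB": [2.0, 2.2, 1.8, 2.0],   # constant ratio to refA
    "refC": [1.0, 2.0, 0.5, 4.0],   # varies independently of the others
}
m = genorm_m(toy)
for gene, value in sorted(m.items(), key=lambda kv: kv[1]):
    print(gene, round(value, 3))
```

Because M is built from pairwise ratios, two candidates that vary together look stable relative to each other, which may be one reason why ratio-based and model-based algorithms can rank the same candidates differently.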
Limonoids produced by citrus are a group of highly bioactive secondary metabolites which provide health benefits for humans. Currently there is a lack of information derived from research on the genetic mechanisms controlling the biosynthesis of limonoids, which has limited the improvement of citrus for high production of limonoids. In this study, the transcriptome sequences of leaves, phloems and seeds of pummelo (Citrus grandis (L.) Osbeck) at different development stages with variances in limonoid contents were used for digital gene expression profiling analysis in order to identify the genes corresponding to the biosynthesis of limonoids. Pair-wise comparison of transcriptional profiles between different tissues identified 924 differentially expressed genes commonly shared between them. Expression pattern analysis suggested that 382 genes from three conjunctive groups of K-means clustering could be possibly related to the biosynthesis of limonoids. Correlation analysis with the samples from different genotypes and different developing tissues of the citrus revealed that the expression of 15 candidate genes was highly correlated with the contents of limonoids. Among them, the cytochrome P450s (CYP450s) and transcriptional factor MYB demonstrated significantly high correlation coefficients, which indicated the importance of those genes in the biosynthesis of limonoids. The CiOSC gene, encoding the critical enzyme oxidosqualene cyclase (OSC) for biosynthesis of the precursor of triterpene scaffolds, was found to correspond positively to the accumulation of limonoids during the development of seeds. Suppressing the expression of CiOSC with VIGS (virus-induced gene silencing) demonstrated that the level of gene silencing was significantly correlated with the reduction of limonoid contents. The results indicated that the CiOSC gene plays a pivotal role in the biosynthesis of limonoids.
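The correlation step described above can be sketched in a few lines of Python. The sketch assumes Pearson's r, which the abstract does not specify, and the helper name `pearson_r`, the gene labels and all numbers are invented for illustration; none of them come from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented limonoid contents and expression values across five samples.
limonoid_content = [0.2, 0.5, 1.1, 1.8, 2.4]
expression = {
    "geneX": [3.0, 5.5, 11.0, 17.5, 25.0],  # rises with limonoid content
    "geneY": [8.0, 7.5, 8.2, 7.9, 8.1],     # roughly flat
}

# Rank hypothetical candidates by how strongly they track limonoid content.
ranked = sorted(
    expression,
    key=lambda g: abs(pearson_r(expression[g], limonoid_content)),
    reverse=True,
)
print(ranked)
```

A ranking like this only flags candidates; the abstract's gene silencing experiment is what links CiOSC to limonoid levels more directly.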
Background In ovo electroporation is a widely used technique to study gene function in developmental biology. Despite the widespread acceptance of this technique, no genome-wide analysis of the effects of in ovo electroporation, principally the current applied across the tissue and exogenous vector DNA introduced, on endogenous gene expression has been undertaken. Here, the effects of electric current and expression of a GFP-containing construct, via electroporation into the midbrain of Hamburger-Hamilton stage 10 chicken embryos, are analysed by microarray. Results Both current alone and in combination with exogenous DNA expression have a small but reproducible effect on endogenous gene expression, changing the expression of the genes represented on the array by less than 0.1% (current) and less than 0.5% (current + DNA), respectively. The subset of genes regulated by electric current and exogenous DNA span a disparate set of cellular functions. However, no genes involved in the regional identity were affected. In sharp contrast to this, electroporation of a known transcription factor, Dmrt5, caused a much greater change in gene expression. Conclusions These findings represent the first systematic genome-wide analysis of the effects of in ovo electroporation on gene expression during embryonic development. The analysis reveals that this process has minimal impact on the genetic basis of cell fate specification. Thus, the study demonstrates the validity of the in ovo electroporation technique to study gene function and expression during development. Furthermore, the data presented here can be used as a resource to refine the set of transcriptional responders in future in ovo electroporation studies of specific gene function.
Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N
The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However, there is an absence of predefined rational algorithmic analysis workflows or standardized batch processing that incorporates all steps, from raw data import up to the derivation of significantly differentially expressed gene lists. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided depend heavily on the programming skills of the user, whereas GUI-embedded solutions do not provide direct support of various raw image analysis formats or a versatile and simultaneously flexible combination of signal processing methods. We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB-implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports 2 coloured cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration by other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime. Gene ARMADA provides a
Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools
Studies have demonstrated that nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes respond to pathogen attack in plants. Characterization of NBS-LRR genes in peanut is not well documented. The newly released whole genome sequences of Arachis duranensis and Arachis ipaënsis have allowed a global analysis of this important gene family in peanut to be conducted. In this study, we identified 393 (AdNBS) and 437 (AiNBS) NBS-LRR genes from A. duranensis and A. ipaënsis, respectively, using bioinformatics approaches. Full-length sequences of 278 AdNBS and 303 AiNBS were identified. Fifty-one orthologous, four AdNBS paralogous, and six AiNBS paralogous gene pairs were predicted. All paralogous gene pairs were located in the same chromosomes, indicating that tandem duplication was the most likely mechanism forming these paralogs. The paralogs mainly underwent purifying selection, but most LRR 8 domains underwent positive selection. More gene clusters were found in A. ipaënsis than in A. duranensis, possibly owing to tandem duplication events occurring more frequently in A. ipaënsis. The expression profile of NBS-LRR genes was different between A. duranensis and A. hypogaea after Aspergillus flavus infection. The up-regulated expression of NBS-LRR in A. duranensis was continuous, while these genes responded to the pathogen temporally in A. hypogaea.
The soil insect Bradysia odoriphaga (Diptera: Sciaridae) causes substantial damage to Chinese chive. Suitable reference genes in B. odoriphaga have yet to be identified for normalizing target gene expression among samples by quantitative real-time PCR (qRT-PCR). This study focused on identifying the expression stability of 12 candidate housekeeping genes in B. odoriphaga under various experimental conditions. The final stability ranking of the 12 housekeeping genes was obtained with RefFinder, and the most suitable number of reference genes was analyzed by GeNorm. The results revealed that the most appropriate sets of internal controls were RPS15, RPL18, and RPS18 across developmental phases; RPS15, RPL28, and GAPDH across temperatures; RPS15 and RPL18 across pesticide treatments; RSP5, RPS18, and SDHA across photoperiods; ACTb, RPS18, and RPS15 across diets; RPS13 and RPL28 across populations; and RPS15, ACTb, and RPS18 across all samples. The use of the most suitable reference genes versus an arbitrarily selected reference gene resulted in significant differences in the analysis of target gene expression. HSP23 in B. odoriphaga was found to be up-regulated under low temperatures. These results will contribute to the standardization of qRT-PCR and will also be valuable for further research on gene function in B. odoriphaga.
Darias, M J; Zambonino-Infante, J L; Hugot, K; Cahu, C L; Mazurais, D
During the larval period, marine teleosts undergo very fast growth and dramatic changes in morphology, metabolism, and behavior to accomplish their metamorphosis into juvenile fish. Regulation of gene expression is widely thought to be a key mechanism underlying the management of the biological processes required for harmonious development over this phase of life. To provide an overall analysis of gene expression in the whole body during sea bass larval development, we monitored the expression of 6,626 distinct genes at 10 different points in time between 7 and 43 days post-hatching (dph) by using heterologous hybridization of a rainbow trout cDNA microarray. The differentially expressed genes (n = 485) could be grouped into two categories: genes that were generally up-expressed early, between 7 and 23 dph, and genes up-expressed between 25 and 43 dph. Interestingly, among the genes regulated during the larval period, those related to organogenesis, energy pathways, biosynthesis, and digestion were over-represented compared with total set of analyzed genes. We discuss the quantitative regulation of whole-body contents of these specific transcripts with regard to the ontogenesis and maturation of essential functions that take place over larval development. Our study is the first utilization of a transcriptomic approach in sea bass and reveals dynamic changes in gene expression patterns in relation to marine finfish larval development.
Zhu, Yizhang; Wang, Likun; Yin, Yuxin; Yang, Ence
Postmortem mRNA degradation is considered to be the major concern in gene expression research utilizing human postmortem tissues. A key factor in this process is the postmortem interval (PMI), which is defined as the interval between death and sample collection. However, global patterns of postmortem mRNA degradation at individual gene levels across diverse human tissues remain largely unknown. In this study, we performed a systematic analysis of alteration of gene expression associated with PMI in human tissues. From the Genotype-Tissue Expression (GTEx) database, we evaluated gene expression levels of 2,016 high-quality postmortem samples from 316 donors of European descent, with PMI ranging from 1 to 27 hours. We found that PMI-related mRNA degradation is tissue-specific, gene-specific, and even genotype-dependent, thus drawing a more comprehensive picture of PMI-associated gene expression across diverse human tissues. Additionally, we also identified 266 differentially variable (DV) genes, such as DEFB4B and IFNG, whose expression is significantly dispersed between short PMI (S-PMI) and long PMI (L-PMI) groups. In summary, our analyses provide a comprehensive profile of PMI-associated gene expression, which will help interpret gene expression patterns in the evaluation of postmortem tissues.
Giallourakis, Cosmas; Benita, Yair; Molinie, Benoit; Cao, Zhifang; Despo, Orion; Pratt, Henry E.; Zukerberg, Lawrence R.; Daly, Mark J.; Rioux, John D.; Xavier, Ramnik J.
Profiling studies of mRNA and miRNA, particularly microarray-based studies, have been extensively used to create compendia of genes that are preferentially expressed in the immune system. In some instances, functional studies have been subsequently pursued. Recent efforts such as ENCODE have demonstrated the benefit of coupling RNA-Seq analysis with information from expressed sequence tags (ESTs) for transcriptomic analysis. However, the full characterization and identification of transcripts that function as modulators of human immune responses remains incomplete. In this study, we demonstrate that an integrated analysis of human ESTs provides a robust platform to identify the immune transcriptome. Beyond recovering a reference set of immune-enriched genes and providing large-scale cross-validation of previous microarray studies, we discovered hundreds of novel genes preferentially expressed in the immune system, including non-coding RNAs. As a result, we have established the Immunogene database, representing an integrated EST “road map” of gene expression in human immune cells, which can be used to further investigate the function of coding and non-coding genes in the immune system. Using this approach, we have uncovered a unique metabolic gene signature of human macrophages and identified PRDM15 as a novel overexpressed gene in human lymphomas. Thus we demonstrate the utility of EST profiling as a basis for further deconstruction of physiologic and pathologic immune processes. PMID:23616578
Saroj K. Dangi
Aim: Blackleg disease is caused by Clostridium chauvoei in ruminants. Although virulence factors such as C. chauvoei toxin A, sialidase, and flagellin are well characterized, hyaluronidases of C. chauvoei are not characterized. The present study was aimed at cloning and sequence analysis of the hyaluronoglucosaminidase (nagH) gene of C. chauvoei. Materials and Methods: C. chauvoei strain ATCC 10092 was grown in ATCC 2107 media and confirmed by polymerase chain reaction (PCR) using primers specific for the 16-23S rDNA spacer region. The nagH gene of C. chauvoei was amplified and cloned into pRham-SUMO vector and transformed into Escherichia cloni 10G cells. The construct was then transformed into E. cloni cells. Colony PCR was carried out to screen the colonies, followed by sequencing of the nagH gene in the construct. Results: PCR amplification yielded a nagH gene product of 1143 bp, which was cloned in a prokaryotic expression system. Colony PCR, as well as sequencing of the nagH gene, confirmed the presence of the insert. The sequence was then subjected to NCBI BLAST analysis, which confirmed that the sequence was indeed that of the nagH gene of C. chauvoei. Phylogenetic analysis of the sequence showed that it is closely related to Clostridium perfringens and Clostridium paraputrificum. Conclusion: The gene for virulence factor nagH was cloned into a prokaryotic expression vector and confirmed by sequencing.
Zhu Xiaodong; Guo Ya; Qu Song; Li Ling; Huang Shiting; Li Danrong; Zhang Wei
Objective: To discover radioresistance-associated molecular biomarkers and their mechanism in nasopharyngeal carcinoma by protein-protein interaction network analysis. Methods: A whole genome expression microarray was applied to screen out differentially expressed genes in two cell lines, CNE-2R and CNE-2, with different radiosensitivity. Four differentially expressed genes were randomly selected for further verification by semi-quantitative RT-PCR analysis with self-designed primers. The common differentially expressed genes from two experiments were analyzed with the SNOW online database in order to find the central nodes related to the biomarkers of nasopharyngeal carcinoma radioresistance. The expression of STAT1 in CNE-2R and CNE-2 cells was measured by Western blot. Results: Compared with CNE-2 cells, 374 genes in CNE-2R cells were differentially expressed while 197 genes showed significant differences. Four randomly selected differentially expressed genes were verified by RT-PCR and showed the same direction of change as in the chip assay. Analysis with the SNOW database demonstrated that those 197 genes could form a complicated interaction network where STAT1 and JUN might be two key nodes. Indeed, the STAT1-α expression in CNE-2R was higher than that in CNE-2 (t=4.96, P<0.05). Conclusions: The key nodes STAT1 and JUN may be the molecular biomarkers leading to radioresistance in nasopharyngeal carcinoma, and STAT1-α might have a close relationship with radioresistance. (authors)
Objective Molecular cloning and bioinformatics analysis of the annexin A2 (ANXA2) gene in sika deer antler tip were conducted. The role of the ANXA2 gene in the growth and development of the antler was analyzed initially. Methods The reverse transcriptase polymerase chain reaction (RT-PCR) was used to clone the cDNA sequence of the ANXA2 gene from antler tip of sika deer (Cervus nippon hortulorum), and bioinformatics methods were applied to analyze the amino acid sequence of the Anxa2 protein. The mRNA expression levels of the ANXA2 gene in different growth stages were examined by real time reverse transcriptase polymerase chain reaction (real time RT-PCR). Results The nucleotide sequence analysis revealed an open reading frame of 1,020 bp encoding a protein 339 amino acids long, with a calculated molecular weight of 38.6 kDa and an isoelectric point of 6.09. Homologous sequence alignment and phylogenetic analysis indicated that the Anxa2 mature protein of sika deer had the closest genetic distance to Cervus elaphus and Bos mutus. Real time RT-PCR results showed that the gene had differential expression levels in different growth stages, and the expression level of the ANXA2 gene was the highest at metaphase (rapid growing period). Conclusion The ANXA2 gene may promote cell proliferation, and the findings suggest Anxa2 as an important candidate for regulating the growth and development of deer antler.
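The reported open reading frame and protein lengths can be cross-checked with simple arithmetic: a 1,020 bp ORF holds 340 codons, and if the last codon is a stop codon (an assumption, since the abstract does not say whether the 1,020 bp include it), 339 codons remain to encode amino acids. A small Python sketch with the hypothetical helper `orf_to_protein_length`:

```python
def orf_to_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumes the ORF is in frame; one codon is subtracted when the ORF
    length is taken to include the terminal stop codon.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

print(orf_to_protein_length(1020))
```

Under that assumption the helper reproduces the 339 residues stated in the abstract.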
Screening of proteolytic and fibrinolytic bacteria from Oncom, an Indonesian soybean-based fermented food, revealed several potential isolates. Based on 16S rDNA gene analysis, the isolate with the highest proteolytic and fibrinolytic activity was identified as Stenotrophomonas sp. The protease gene was amplified to generate a 1749 bp polymerase chain reaction product, and BLAST analysis revealed 90% homology with a gene encoding a protease from Stenotrophomonas maltophilia. The putative amino acid sequence indicated a serine protease with the typical aspartate, histidine and serine residues in the catalytic triad. The gene was translated into a pre-pro-protein containing a cleavage site at its N-terminus and a pre-peptidase C-terminal domain. Cloning of the protease gene in pET22b with Escherichia coli BL21 DE3 as the host showed that the gene was expressed in the insoluble protein fraction. This is the first report of the analysis of a protease gene from a food-origin Stenotrophomonas sp.
Zhou, Changpin; Chen, Yanbo; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang
The MYB proteins comprise one of the largest transcription factor families in plants, and play key roles in regulatory networks controlling development, metabolism, and stress responses. A total of 125 MYB genes (JcMYB) have been identified in the physic nut (Jatropha curcas L.) genome, including 120 2R-type MYB, 4 3R-MYB, and 1 4R-MYB genes. Based on the exon-intron arrangement of MYBs from both lower (Physcomitrella patens) and higher (physic nut, Arabidopsis, and rice) plants, plant MYB genes can be classified into ten groups (MI-X), although MIX genes are absent from higher plants. We also observed that MVIII genes, which comprise both R2R3- and 3R-MYB genes, may be one of the most ancient MYB types. Most MYB genes (76.8% in physic nut) belong to the MI group, which can be divided into 34 subgroups. The JcMYB genes were nonrandomly distributed across the 11 linkage groups (LGs) of physic nut. The expansion of MYB genes across several subgroups was observed and resulted from genome triplication of ancient dicotyledons and from both ancient and recent tandem duplication events in the physic nut genome. The expression patterns of several MYB duplicates in the physic nut differed among four tissues (root, stem, leaf, and seed), and 34 MYB genes responded to at least one abiotic stressor (drought, salinity, phosphate starvation, and nitrogen starvation) in leaves and/or roots, based on analysis of digital gene expression tags. Overexpression of the JcMYB001 gene in Arabidopsis increased its sensitivity to drought and salinity stresses. Copyright © 2015 Elsevier B.V. All rights reserved.
Matthew J Loza
Recent identifications of associations between novel variants in inflammation-related genes and several common diseases emphasize the need for systematic evaluations of these genes in disease susceptibility. Considering that many genes are involved in the complex inflammation responses and many genetic variants in these genes have the potential to alter the functions and expression of these genes, we assembled a list of key inflammation-related genes to facilitate the identification of genetic associations of diseases with an inflammation-related etiology. We first reviewed various phases of inflammation responses, including the development of immune cells, sensing of danger, influx of cells to sites of insult, activation and functional responses of immune and non-immune cells, and resolution of the immune response. Assisted by Ingenuity Pathway Analysis, we then identified 17 functional sub-pathways that are involved in one or multiple phases. This organization would greatly increase the chance of detecting gene-gene interactions by hierarchical clustering of genes with their functional closeness in a pathway. Finally, as an example application, we have developed tagging single nucleotide polymorphism (tSNP) arrays for populations of European and African descent to capture all the common variants of these key inflammation-related genes. Assays of these tSNPs have been designed and assembled into two Affymetrix ParAllele customized chips, one each for European (12,011 SNPs) and African (21,542 SNPs) populations. These tSNPs have greater coverage for these inflammation-related genes compared to the existing genome-wide arrays, particularly in the African population. These tSNP arrays can facilitate systematic evaluation of inflammation pathways in disease susceptibility. For additional applications, other genotyping platforms could also be employed. For existing genome-wide association data, this list of key inflammation-related genes and
Guardia, Gabriela D A; Pires, Luís Ferreira; Vêncio, Ricardo Z N; Malmegrim, Kelen C R; de Farias, Cléver R G
Gene expression studies are generally performed through multi-step analysis processes, which require the integrated use of a number of analysis tools. In order to facilitate tool/data integration, an increasing number of analysis tools have been developed as or adapted to semantic web services. In recent years, some approaches have been defined for the development and semantic annotation of web services created from legacy software tools, but these approaches still present many limitations. In addition, to the best of our knowledge, no suitable approach has been defined for the functional genomics domain. Therefore, this paper aims at defining an integrated methodology for the implementation of RESTful semantic web services created from gene expression analysis tools and the semantic annotation of such services. We have applied our methodology to the development of a number of services to support the analysis of different types of gene expression data, including microarray and RNASeq. All developed services are publicly available in the Gene Expression Analysis Services (GEAS) Repository at http://dcm.ffclrp.usp.br/lssb/geas. Additionally, we have used a number of the developed services to create different integrated analysis scenarios to reproduce parts of two gene expression studies documented in the literature. The first study involves the analysis of one-color microarray data obtained from multiple sclerosis patients and healthy donors. The second study comprises the analysis of RNA-Seq data obtained from melanoma cells to investigate the role of the remodeller BRG1 in the proliferation and morphology of these cells. Our methodology provides concrete guidelines and technical details in order to facilitate the systematic development of semantic web services. Moreover, it encourages the development and reuse of these services for the creation of semantically integrated solutions for gene expression analysis.
Liu, Shikai; Li, Qi; Liu, Zhanjiang
Although a large set of full-length transcripts was recently assembled in catfish, annotation of large gene families, especially those with duplications, is still a great challenge. Most often, complexities in annotation cause misidentification and thereby much confusion in the scientific literature. As such, detailed phylogenetic analysis and/or orthology analysis are required for annotation of genes that belong to gene families. The ATP-binding cassette (ABC) transporter gene superfamily is a large gene family that encodes membrane proteins that transport a diverse set of substrates across membranes, playing important roles in protecting organisms from diverse environments. In this work, we identified a set of 50 ABC transporters in the catfish genome. Phylogenetic analysis allowed their identification and annotation into seven subfamilies, including 9 ABCA genes, 12 ABCB genes, 12 ABCC genes, 5 ABCD genes, 2 ABCE genes, 4 ABCF genes and 6 ABCG genes. Most ABC transporters are conserved among vertebrates, though cases of recent gene duplications and gene losses do exist. Gene duplications in catfish were found for ABCA1, ABCB3, ABCB6, ABCC5, ABCD3, ABCE1, ABCF2 and ABCG2. The whole set of catfish ABC transporters provides the essential genomic resources for future biochemical, toxicological and physiological studies of ABC drug efflux transporters. The establishment of orthologies should allow functional inferences with the information from model species, though the function of lineage-specific genes can be distinct because of specific living environments with different selection pressures.
Kelmansky, Diana M; Martínez, Elena J; Leiva, Víctor
In this paper, we introduce a new family of power transformations, which has the generalized logarithm as one of its members, in the same manner as the usual logarithm belongs to the family of Box-Cox power transformations. Although the new family has been developed for analyzing gene expression data, it allows a wider scope of mean-variance related data to be reached. We study the analytical properties of the new family of transformations, as well as the mean-variance relationships that are stabilized by using its members. We propose a methodology based on this new family, which includes a simple strategy for selecting the family member adequate for a data set. We evaluate the finite sample behavior of different classical and robust estimators based on this strategy by Monte Carlo simulations. We analyze real genomic data by using the proposed transformation to empirically show how the new methodology allows the variance of these data to be stabilized.
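The analogy drawn above between the two families can be illustrated numerically. The sketch below uses the standard Box-Cox definition together with one commonly used form of the generalized logarithm, log((x + sqrt(x^2 + c)) / 2). That form and the parameter name c are assumptions made for illustration; the abstract does not give the paper's own parametrization of the new family, and the sketch does not attempt to reproduce it.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transformation for x > 0.

    (x**lam - 1) / lam for lam != 0, and log(x) in the limiting case lam = 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def glog(x, c):
    """One common form of the generalized logarithm (assumed here).

    log((x + sqrt(x^2 + c)) / 2) with c >= 0. For c > 0 it is defined for
    zero and negative x as well, and it approaches log(x) for large x.
    """
    if c < 0:
        raise ValueError("c must be non-negative")
    return math.log((x + math.sqrt(x * x + c)) / 2.0)


if __name__ == "__main__":
    # Box-Cox approaches the ordinary logarithm as lam -> 0.
    for lam in (1.0, 0.1, 0.001):
        print(lam, box_cox(5.0, lam), math.log(5.0))
    # glog stays finite at low intensities and tracks log(x) at high ones.
    for x in (-1.0, 0.0, 10.0, 1e6):
        print(x, glog(x, 4.0))
```

The point of the comparison is that the ordinary logarithm sits inside the Box-Cox family as a limiting member, while the generalized logarithm remains well behaved for the small or negative background-corrected values that occur in expression data.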
Drosten, Matthias; Lechuga, Carmen G; Barbacid, Mariano
Proliferation and differentiation of epidermal keratinocytes are tightly controlled to ensure proper development and homeostasis of the epidermis. The Ras family of small GTPases has emerged as a central node in the coordination of cell proliferation in the epidermis. Recent genetic evidence from mouse models has revealed that the intensity of Ras signaling modulates the proliferative capacity of epidermal keratinocytes. Interfering with Ras signaling either by combined elimination of the 3 Ras genes from the basal layer of the epidermis or by overexpression of dominant-negative Ras isoforms caused epidermal thinning due to hypoproliferation of keratinocytes. In contrast, overexpression of oncogenic Ras mutants in different epidermal cell layers led to hyperproliferative phenotypes including the development of papillomas and squamous cell carcinomas. Here, we discuss the value of loss- and gain-of-function studies in mouse models to assess the role of Ras signaling in the control of epidermal proliferation. PMID:24150175
Background: Gene-gene interaction analysis in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) have hundreds of cores and have recently been used to implement faster scientific software. However, there are currently no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis of binary traits. Findings: Here we present a novel software package, GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions: GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
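The fragment-based partitioning described above can be sketched as follows. This is a minimal illustration of the idea, not GENIE's actual code: the function names and the fixed fragment size are hypothetical, and pairing each fragment only with later fragments (so that no SNP pair is tested twice) is an assumption, since the abstract says only "other fragments".

```python
from itertools import combinations, product


def partition(n_snps, fragment_size):
    """Split SNP indices 0..n_snps-1 into non-overlapping, contiguous fragments."""
    if n_snps < 0 or fragment_size <= 0:
        raise ValueError("n_snps must be >= 0 and fragment_size > 0")
    return [
        list(range(start, min(start + fragment_size, n_snps)))
        for start in range(0, n_snps, fragment_size)
    ]


def pair_tasks(fragments):
    """Yield independent work units, each a list of SNP index pairs.

    For every fragment: one task with all pairs inside the fragment, then
    one task per later fragment with all cross-fragment pairs. Each task
    could be handed to a separate CPU core or GPU kernel launch.
    """
    for i, frag in enumerate(fragments):
        yield list(combinations(frag, 2))       # within-fragment pairs
        for other in fragments[i + 1:]:
            yield list(product(frag, other))    # between-fragment pairs


if __name__ == "__main__":
    fragments = partition(10, 4)
    tasks = list(pair_tasks(fragments))
    n_pairs = sum(len(task) for task in tasks)
    print(len(fragments), len(tasks), n_pairs)
```

Because the fragments do not overlap, the within-fragment and between-fragment tasks together cover every unordered SNP pair exactly once, which is what lets the tasks run in parallel without coordination.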
The Argonaute protein family is a key player in gene silencing and small regulatory RNA pathways in different organisms. Argonaute proteins can bind small noncoding RNAs and control protein synthesis, affect messenger RNA stability, and even participate in the production of new forms of small RNAs. The aim of this study was to characterize and perform a bioinformatic analysis of Argonaute proteins in 32 plant species whose genomes have been sequenced. A total of 437 Argonaute genes were identified and analyzed based on length, gene structure, and protein structure. The results showed that Argonaute proteins are highly conserved across the plant kingdom. Phylogenetic analysis divided plant Argonautes into three classes. In addition to the three conserved domains, namely PAZ, MID, and PIWI, we identified a few more domains in the AGO proteins of some plant species. Expression profile analysis showed that the expression of Argonaute genes varies across most tissues, which suggests that these proteins are involved in the regulation of most pathways of the plant system. The number of alternative transcripts of Argonaute genes was highly variable among the plants. A thorough analysis of a large number of putative Argonaute genes revealed several interesting aspects of this protein family and yielded novel information that is promising for both basic and biotechnological applications.
Mingora, Christina; Ewer, Jason; Ospina-Giraldo, Manuel
We have scanned the Phytophthora infestans, P. ramorum, and P. sojae genomes for the presence of putative pectin methylesterase genes and conducted a sequence analysis of all gene models found. We also searched for potential regulatory motifs in the promoter region of the proposed P. infestans models, and investigated the gene expression levels throughout the course of P. infestans infection on potato plants, using in planta and detached leaf assays. We found that genes located on contiguous chromosomal regions contain similar motifs in the promoter region, indicating the possibility of a shared regulatory mechanism. Results of our investigations also suggest that, during the pathogenicity process, the expression levels of some of the analyzed genes vary considerably when compared to basal expression observed in in vitro cultures of non-sporulating mycelium. These results were observed both in planta and in detached leaf assays. Copyright © 2014 Elsevier B.V. All rights reserved.
Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu
Objective: To analyze breast cancer bone metastasis rel